549 points orcul | 35 comments
Animats ◴[] No.41890003[source]
This is an important result.

The actual paper [1] says that functional MRI (which is measuring which parts of the brain are active by sensing blood flow) indicates that different brain hardware is used for non-language and language functions. This has been suspected for years, but now there's an experimental result.

What this tells us for AI is that we need something else besides LLMs. It's not clear what that something else is. But, as the paper mentions, the low-end mammals and the corvids lack language but have some substantial problem-solving capability. That's seen down at squirrel and crow size, where the brains are tiny. So if someone figures out how to do this, it will probably take less hardware than an LLM.

This is the next big piece we need for AI. No idea how to do this, but it's the right question to work on.

[1] https://www.nature.com/articles/s41586-024-07522-w.epdf?shar...

replies(35): >>41890104 #>>41890470 #>>41891063 #>>41891228 #>>41891262 #>>41891383 #>>41891507 #>>41891639 #>>41891749 #>>41892068 #>>41892137 #>>41892518 #>>41892576 #>>41892603 #>>41892642 #>>41892738 #>>41893400 #>>41893534 #>>41893555 #>>41893732 #>>41893748 #>>41893960 #>>41894031 #>>41894713 #>>41895796 #>>41895908 #>>41896452 #>>41896476 #>>41896479 #>>41896512 #>>41897059 #>>41897270 #>>41897757 #>>41897835 #>>41905326 #
1. CSMastermind ◴[] No.41892068[source]
When you look at how humans play chess, they employ several different cognitive strategies: memorization, calculation, strategic thinking, heuristics, and learned experience.

When the first chess engines came out they only employed one of these: calculation. It wasn't until relatively recently that we had computer programs that could perform all of them. But it turns out that if you scale that up with enough compute you can achieve superhuman results with calculation alone.
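
To make "calculation alone" concrete, here's a minimal sketch: a toy negamax search over Nim, nothing from a real chess engine. The only knowledge it has is the rules; whatever strength it has comes purely from how deep it looks ahead, which is the part that scales with compute.

    # Toy negamax: pure "calculation". No heuristics, no learned experience;
    # playing strength comes entirely from exhaustive lookahead.
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def negamax(stones: int) -> int:
        """Nim: players remove 1-3 stones; taking the last stone wins.
        Returns +1 if the side to move wins with perfect play, else -1."""
        if stones == 0:
            return -1  # the previous player took the last stone and won
        # The opponent's best outcome is our worst, hence the negation.
        return max(-negamax(stones - take) for take in (1, 2, 3) if take <= stones)

    # Multiples of 4 are losing positions for the side to move.
    print([(n, negamax(n)) for n in range(1, 9)])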

It's not clear to me that LLMs sufficiently scaled won't achieve superhuman performance on general cognitive tasks even if there are things humans do which they can't.

The other thing I'd point out is that all language is essentially synthetic training data. Humans invented language as a way to transfer their internal thought processes to other humans. It makes sense that the process of thinking and the process of translating those thoughts into and out of language would be distinct.

replies(6): >>41892323 #>>41892362 #>>41892675 #>>41893389 #>>41893580 #>>41895058 #
2. PaulDavisThe1st ◴[] No.41892323[source]
> It's not clear to me that LLMs sufficiently scaled won't achieve superhuman performance on general cognitive tasks

If "general cognitive tasks" means "I give you a prompt in some form, and you give me an incredible response of some form " (forms may differ or be the same) then it is hard to disagree with you.

But if by "general cognitive task" you mean "all the cognitive things that humans do", then it is really hard to see why you would have any confidence that LLMs have any hope of achieving superhuman performance at these things.

replies(1): >>41893022 #
3. nox101 ◴[] No.41892362[source]
It sounds like you think this research is wrong? (It claims LLMs cannot reason.)

https://arstechnica.com/ai/2024/10/llms-cant-perform-genuine...

Or do you maybe think no logical reasoning is needed to do everything a human can do? Though humans do seem to be able to do logical reasoning.

replies(3): >>41892408 #>>41892707 #>>41892803 #
4. astrange ◴[] No.41892408[source]
It says "current" LLMs can't "genuinely" reason. Also, one of the researchers then posted an internship for someone to work on LLM reasoning.

I think the paper should've included controls, because we don't know how strong the result is. They certainly may have proven that humans can't reason either.

replies(1): >>41892660 #
5. mannykannot ◴[] No.41892660{3}[source]
If they had human controls, they might well show that some humans can’t do any better, but based on how they generated test cases, it seems unlikely to me that doing so would prove that humans cannot reason (of course, if that’s actually the case, we cannot trust ourselves to devise, execute and interpret these tests in the first place!)

Some people will use any limitation of LLMs to deny there is anything to see here, while others will call this ‘moving the goalposts’, but the most interesting questions, I believe, involve figuring out what the differences are, putting aside the question of whether LLMs are or are not AGIs.

6. threeseed ◴[] No.41892675[source]
> It's not clear to me that LLMs sufficiently scaled won't achieve superhuman performance

To some extent this is true.

To calculate A + B you could, for example, generate A, B for trillions of combinations and encode that within the network. And it would calculate this faster than any human could.
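
As a sketch of what that brute-force encoding looks like on the data side (the question/answer format here is just an arbitrary illustration, not any particular model's training recipe):

    # Sketch: generating synthetic "A + B" training text by enumeration.
    # The arithmetic is baked into the data, not derived at inference time.
    import random

    def make_addition_examples(n, max_value=10**6):
        examples = []
        for _ in range(n):
            a = random.randint(0, max_value)
            b = random.randint(0, max_value)
            examples.append(f"Q: What is {a} + {b}?\nA: {a + b}")
        return examples

    print(make_addition_examples(3, max_value=999))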

But that's not intelligence. And Apple's research showed that LLMs are simply inferring relationships based on the tokens they have access to, which you can throw off by adding useless information or by trying to abstract A + B.

replies(1): >>41893161 #
7. CSMastermind ◴[] No.41892707[source]
The latter.

While I generally do suspect that we need to invent some new technique in the realm of AI in order for software to do everything a human can do, I use analogies like chess engines to caution myself from certainty.

8. bbor ◴[] No.41892803[source]
I’ll pop in with a friendly “that research is definitely wrong”. If they want to prove that LLMs can’t reason, shouldn’t they stringently define that word somewhere in their paper? As it stands, they’re proving something small (some of today’s LLMs have XYZ weaknesses) and claiming something big (humans have an ineffable calculator-soul).

LLMs absolutely 100% can reason, if we take the dictionary definition; it’s trivial to show their ability to answer non-memorized questions, and the only way to do that is some sort of reasoning. I personally don’t think they’re the most efficient tool for deliberative derivation of concepts, but I also think any sort of categorical prohibition is anti-scientific. What is the brain other than a neural network?

Even if we accept the most fringe, anthropocentric theories like Penrose & Hameroff's quantum microtubules, that's just a neural network with fancy weights. How could we possibly hope to forbid digital recreations of our brains from "truly" or "really" mimicking them?

replies(4): >>41893179 #>>41893265 #>>41893282 #>>41893782 #
9. jhrmnn ◴[] No.41893022[source]
Even in cognitive tasks expressed via language, something like a memory feels necessary. At which point it's not an LLM as in a generic language model. It would become a language model conditioned on the memory state.
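
Roughly something like the sketch below, where `generate` is a hypothetical stand-in for any text-generation call and the way the memory is folded back into the prompt is purely illustrative:

    # Hypothetical sketch: a language model wrapped in an explicit memory state.
    # `generate` is a placeholder for an LLM call, not a real API.

    def generate(prompt: str) -> str:
        raise NotImplementedError("stand-in for a text-generation call")

    def respond(user_input: str, memory: list) -> str:
        # Condition the model on the accumulated memory, not just the latest input.
        context = "\n".join(memory)
        reply = generate(f"Memory:\n{context}\n\nUser: {user_input}\nAssistant:")
        # Updating the state makes the next call a function of this turn as well.
        memory.append(f"User: {user_input}")
        memory.append(f"Assistant: {reply}")
        return reply
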
replies(1): >>41893745 #
10. Dylan16807 ◴[] No.41893161[source]
> To calculate A + B you could for example generate A, B for trillions of combinations and encode that within the network. And it would calculate this faster than any human could.

I don't feel like this is a very meaningful argument because if you can do that generation then you must already have a superhuman machine for that task.

11. visarga ◴[] No.41893179{3}[source]
We're chasing our own tail with concepts like "reasoning". Let's move the concept a bit - "search". Can LLMs search for novel ideas and discoveries? They do under the right circumstances. You have to provide idea-testing environments, the missing ingredient. Search and learn: it's what humans do and AI can do as well.

The whole issue with "reasoning" is that it is an incompletely defined concept. Over what domain, what problem space, and what kind of experimental access do we define "reasoning"? Search is better as a concept because it comes packed with all these things, and without conceptual murkiness. Search is scientifically studied to a greater extent.

I don't think we doubt LLMs can learn given training data; we already accuse them of being mere interpolators or parrots. And we can agree to some extent that LLMs can recombine concepts correctly. So they have the learning part down.

And for the searching part, we can probably agree it's a matter of access to the search space, not AI. It's an environment problem, and even a social one. Search is usually more extended than the lifetime of any agent, so it has to be a cultural process, where language plays a central role.

When you break reasoning/progress/intelligence into "search and learn" it becomes much more tractable and useful. We can also make more grounded predictions on AI, considering the needs for search that are implied, not just the needs for learning.

How much search did AlphaZero need to beat us at Go? How much search did humans pack into our 200K-year history over 10,000 generations? What was the cost of that journey of search? Those kinds of questions. In my napkin estimation we solved 1/10,000th of the problem by learning; search is 10,000x to a million times harder.

replies(1): >>41893284 #
12. shkkmo ◴[] No.41893265{3}[source]
> LLMs absolutely 100% can reason, if we take the dictionary definition; it’s trivial to show their ability to answer non-memorized questions, and the only way to do that is some sort of reasoning.

Um... What? That is a huge leap to make.

'Reasoning' is a specific type of thought process and humans regularly make complicated decisions without doing it. We use hunches and intuition and gut feelings. We make all kinds of snap assessments that we don't have time to reason through. As such, answering novel questions doesn't necessarily show a system is capable of reasoning.

I see absolutely nothing resembling an argument for humans having an "ineffable calculator-soul"; I think that might be you projecting. There is no 'categorical prohibition', only an analysis of the current flaws of specific models.

Personally, my skepticism about imminent AGI has to do with believing we may be underestimating the complexity of the software running on our brain. We've reached the point where we can create digital "brains", or at least portions of them. We may be missing some other pieces of a digital brain, or we may just not have the right software to run on it yet. I suspect it is both, but that we'll have fully functional digital brains well before we figure out the software to run on them.

replies(1): >>41895516 #
13. tsimionescu ◴[] No.41893282{3}[source]
> Even if we accept the most fringe, anthropocentric theories like Penrose & Hameroff's quantum microtubules, that's just a neural network with fancy weights.

First, while it is a fringe idea with little evidence backing it, it's far from the most fringe.

Secondly, it is not at all known that animal brains are accurately modeled as an ANN, any more so than any other Turing-complete system can be modeled as an ANN. Biological neurons are themselves small computers, like all living cells in general, with not fully understood capabilities. The way biological neurons are connected is far more complex than a weight in an ANN. And I'm not talking about fantasy quantum effects in microtubules, I'm talking about well-established biology, with many kinds of synapses, some of which are "multicast" in a spatially distinct area instead of connected to specific neurons. And about the non-neuronal glands which are known to change neuron behavior, and so on.

How critical any of these differences are to cognition is anyone's guess at this time. But dismissing them and reducing the brain to a bigger NN is not wise.

replies(2): >>41893426 #>>41894649 #
14. shkkmo ◴[] No.41893284{4}[source]
You can't break down cognition into just "search" and "learn" without either ridiculously overloading those concepts or leaving a ton out.
15. shkkmo ◴[] No.41893389[source]
Sure, when humans use multiple skills to address a specific problem, you can sometimes outperform them by scaling a specific one of those skills.

When it comes to general intelligence, I think we are trying to run before we can walk. We can't even make a computer with a basic, animal-level understanding of the world. Yet we are trying to take a tool that was developed on top of a system that already had an understanding of the world and use it to work backwards to give computers an understanding of the world.

I'm pretty skeptical that we're going to succeed at this. I think you have to be able to teach a computer to climb a tree or hunt (subhuman AGI) before you can create superhuman AGI.

16. adrianN ◴[] No.41893426{4}[source]
It is my understanding that Penrose doesn't claim that brains are needed for cognition, just that brains are needed for a somewhat nebulous "conscious experience", which need not have any observable effects. I think that it's fairly uncontroversial that a machine can produce behavior that is indistinguishable from human intelligence over some finite observation time. The Chinese room speaks Chinese, even if it lacks understanding for some definitions of the term.
replies(1): >>41893950 #
17. senand ◴[] No.41893580[source]
This seems quite reasonable, but I recently heard a podcast (https://www.preposterousuniverse.com/podcast/2024/06/24/280-...) arguing that LLMs are likely to be very good at navigating what they have been trained on, but very poor at abstract reasoning and at discovering new areas outside of their training. As a single human, you don't notice, as the training material is greater than everything we could ever learn.

After all, that's what Artificial General Intelligence would at least in part be about: finding and proving new math theorems, creating new poetry, making new scientific discoveries, etc.

There is even a new challenge that's been proposed: https://arcprize.org/blog/launch

> It makes sense that the process of thinking and the process of translating those thoughts into and out of language would be distinct

Yes, indeed. And LLMs seem to be very good at _simulating_ the translation of thought into language. They don't actually do it, at least not like humans do.

replies(2): >>41894802 #>>41898231 #
18. ddingus ◴[] No.41893745{3}[source]
More than a memory.

Needs to be a closed loop, running on its own.

We get its attention and it responds; or frankly, if we did manage any sort of sentience, even a simulation of it, the fact is it may not respond.

To me, that is the real test.

19. ddingus ◴[] No.41893782{3}[source]
Can they reason, or is the volume of training data sufficient for them to match relationships up to appropriate expressions?

Basically, if humans have had meaningful discussions about it, the product of their reasoning is there for the LLM, right?

Seems to me the "how many R's are there in the word 'strawberry'?" problem is very suggestive of the idea that LLM systems cannot reason. If they could, the question would not be difficult.

The fact is humans may never have actually discussed that topic in any meaningful way captured in the training data.

And because of that, and how specific the question is, the LLM has no clear relationships to map into a response. It just produces a best-case response, whatever the math deems best.

Seems plausible enough to support the opinion that LLMs cannot reason.

What we do know is LLMs can work with anything expressed in terms of relationships between words.

There is a ton of reasoning templates contained in that data.

Put another way:

Maybe LLM systems are poor at deduction, save for examples contained in the data. But there are a ton of examples!

So this is hard to notice.

Maybe LLM systems are fantastic at inference! And so those many examples get mapped to the prompt at hand very well.

And we do notice that and see it as real thinking, not just some horribly complex surface containing a bazillion relationships...

replies(1): >>41894915 #
20. jstanley ◴[] No.41893950{5}[source]
But conscious experience does produce observable effects.

For that not to be the case, you'd have to take the position that humans experience consciousness and they talk about consciousness but that there is no causal link between the two! It's just a coincidence that the things you find yourself saying about consciousness line up with your internal experience?

https://www.lesswrong.com/posts/fdEWWr8St59bXLbQr/zombies-zo...

replies(1): >>41893995 #
21. adrianN ◴[] No.41893995{6}[source]
That philosophers talk about p-zombies seems like evidence to me that at least some of them don't believe that consciousness needs to have observable effects that can't be explained without consciousness. I'm not saying that I believe that myself. I don't believe that there is anything particularly special about brains.
replies(2): >>41894429 #>>41895399 #
22. GoblinSlayer ◴[] No.41894429{7}[source]
If the brain isn't more special than the Chinese room, then does the brain understand Chinese any better than the Chinese room does?
replies(1): >>41895247 #
23. Koala_ice ◴[] No.41894649{4}[source]
There's a lot of other interesting biology besides propagation of electrical signals. Examples include: 1/ Transport of mRNAs (in specialized vesicle structures!) between neurons. 2/ Activation and integration of retrotransposons during brain development (which I have long hypothesized acts as a sort of randomization function for the neural field). 3/ Transport of proteins between and within neurons. This isn't just adventitious movement, either - neurons have a specialized intracellular transport system that allows them to deliver proteins to faraway locations (think >1 meter).
24. klabb3 ◴[] No.41894802[source]
> As a single human, you don't notice, as the training material is greater than everything we could ever learn.

This bias is real. Current-gen AI works proportionally better the better known the topic is. The more training data, the better the performance. When we ask something very specific, we have the impression that it's niche. But there is tons of training data on many niche topics too, which essentially enhances the magic trick: it looks like sophisticated reasoning. Whenever you truly go "off the beaten path", you get responses that are (a) nonsensical (illogical) and (b) "pull" you back towards a "mainstream center point", so to say. Anecdotally, of course.

I’ve noticed this with software architecture discussions. I would have some pretty standard thing (like session-based auth) but with some specific and unusual requirement (like hybrid device- and user identity), and it happily spits out good-sounding but nonsensical ideas. Combining and interpolating entirely in the linguistic domain is clearly powerful, but ultimately not enough.

25. chongli ◴[] No.41894915{4}[source]
The “how many R’s are in the word strawberry?” problem can’t be solved by LLMs specifically because they do not have direct access to the text. Before the model sees the user input, it has been tokenized by a preprocessing step. So instead of the string “strawberry”, the model just sees the integer token(s) the word has been mapped to.
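
For illustration, assuming the `tiktoken` tokenizer package (exact token splits and IDs vary by tokenizer and model):

    # Why letter-counting is awkward for an LLM: the model receives token IDs,
    # not characters. Requires the `tiktoken` package.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode("strawberry")
    print(tokens)                             # a short list of integer IDs
    print([enc.decode([t]) for t in tokens])  # the chunks the model actually "sees"
    # Counting the letter "r" needs character-level access the model never gets:
    print("strawberry".count("r"))            # 3
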
replies(1): >>41899772 #
26. TheOtherHobbes ◴[] No.41895058[source]
Chess is essentially a puzzle. There's a single explicit, quantifiable goal, and a solution either achieves the goal or it doesn't.

Solving puzzles is a specific cognitive task, not a general one.

Language is a continuum, not a puzzle. The problem with LLMs is that testing has been reduced to performance on language puzzles, mostly with hard edges - like bar exams, or letter counting - and they're a small subset of general language use.

27. mannykannot ◴[] No.41895247{8}[source]
The brain is faster than the Chinese room, but other than that, yes, that's the so-called systems reply; Searle's response to it (have the person in the room memorize the instruction book) is beside the point, as you can teach people to perform all sorts of algorithms without them needing to understand the result.

As many people have pointed out, Searle's argument begs the question by tacitly assuming that if anything about the room understands Chinese, it can only be the person within it.

28. mannykannot ◴[] No.41895399{7}[source]
The p-zombie argument is the best-known of a group of conceivability arguments, which ultimately depend on the notion that if a proposition is conceivably true, then there is a metaphysically possible world in which it is true. Skeptics suppose that this is just a complicated way of equivocating over what 'conceivable' means, and even David Chalmers, the philosopher who has done the most to bring the p-zombie argument to wide attention, acknowledges that it depends on the assumption of what he calls 'perfect conceivability', which is tantamount to irrefutable knowledge.

To deal with the awkwardly apparent fact that consciousness certainly seems to have physical effects, zombiephiles challenge the notion that physics is causally closed, so that it is conceivable that something non-physical can cause physical effects. Their approach is to say that the causal closure of physics is not provable, but at this point, the argument has become a lexicographical one, about the definition of the words 'physics' and 'physical' (if one insists that 'physical' does not refer to a causally-closed concept, then we still need a word for the causal closure within which the physical is embedded - but that's just what a lot of people take 'physical' to mean in the first place.) None of the anti-physicalists have been able, so far, to shed any light on how the mind is causally effective in the physical world.

You might be interested in the late Daniel Dennett's "The Unimagined Preposterousness of Zombies": https://dl.tufts.edu/concern/pdfs/6m312182x

replies(1): >>41898241 #
29. bbor ◴[] No.41895516{4}[source]
All well said, and I agree on many of your final points! But you beautifully highlighted my issue at the top:

  'Reasoning' is a specific type of thought process 
If so, what exactly is it? I don’t need a universally justified definition, I’m just looking for an objective, scientific one. A definition that would help us say for sure that a particular cognition is or isn’t a product of reason.

I personally have lots of thoughts on the topic and look to Kant and Hegel for their definitions of reason as the final faculty of human cognition (after sensibility, understanding, and judgement), and I even think there’s good reason (heh) to think that LLMs are not a great tool for that on their own. But my point is that none of the LLM critics have a definition anywhere close to that level of specificity.

Usually, “reason” is used to mean “good cognition”, so “LLMs can’t reason” is just a variety of cope/setting up new goalposts. We all know LLMs aren’t flawless or infinite in their capabilities, but I just don’t find this kind of critique specific enough to have any sort of scientific validity. IMHO

replies(2): >>41896163 #>>41897126 #
30. mannykannot ◴[] No.41896163{5}[source]
I feel you are putting too much emphasis on the importance and primacy of having a definition of words like 'reasoning'.

As humanity has struggled to understand the world, it has frequently given names to concepts that seem to matter, well before it is capable of explaining with any sort of precision what these things are, and what makes them matter - take the word 'energy', for example.

It seems clear to me that one must have these vague concepts before one can begin to understand them, and also that it would be bizarre not to give them a name at that point - and so, at that point, we have a word without a locked-down definition. To insist that we should have the definition locked down before we begin to investigate the phenomenon or concept is precisely the wrong way to go about understanding it: we refine and rewrite the definitions as a consequence of what our investigations have discovered. Again, 'energy' provides a useful case study for how this happens.

A third point about the word 'energy' is that it has become well-defined within physics, and yet retains much of its original vagueness in everyday usage, where, in addition, it is often used metaphorically. This is not a problem, except when someone makes the lexicographical fallacy of thinking that one can freely substitute the physics definition into everyday speech (or vice-versa) without changing the meaning.

With many concepts about the mental, including 'reasoning', we are still in the learning-and-writing-the-definition stage. For example, let's take the definition you bring up: reasoning as good cognition. This just moves us on to the questions of what 'cognition' means, and what distinguishes good cognition from bad cognition (for example, is a valid logical argument predicated on what turns out to be a false assumption an example of reasoning-as-good-cognition?) We are not going to settle the matter by leafing through a dictionary, any more than Pedro Carolino could write a phrase book just from a Portuguese-English dictionary (and you are probably aware that looking up definitions-of-definitions recursively in a dictionary often ends up in a loop.)

A lot of people want to jump the gun on this, and say definitively either that LLMs have achieved reasoning (or general intelligence or a theory of mind or even consciousness, for that matter) or that they have not (or cannot.) What we should be doing, IMHO, is to put aside these questions until we have learned enough to say more precisely what these terms denote, by studying humans, other animals, and what I consider to be the surprising effectiveness of LLMs - and that is what the interviewee in the article we are nominally discussing here is doing.

You entered this thread by saying (about the paper underlying an article in Ars Technica [1]) "I’ll pop in with a friendly 'that research is definitely wrong'. If they want to prove that LLMs can’t reason...", but I do not think there is anything like that claim in the paper itself (one should not simply trust what some person on HN says about a paper. That, of course, goes as much for what I say about it as what the original poster said.) To me, this looks like the sort of careful, specific and objective work that will lead us to a better understanding of our concepts of the mental.

[1] https://arxiv.org/pdf/2410.05229

replies(1): >>41898385 #
31. shkkmo ◴[] No.41897126{5}[source]
> don’t need a universally justified definition, I’m just looking for an objective, scientific one. A definition that would help us say for sure that a particular cognition is or isn’t a product of reason.

Unfortunately, you won't get one. We simply don't know enough about cognition to create rigorous definitions of the type you are looking for.

Instead, this paper, and the community in general, are trying to perform practical capability assessments. The claim that GSM8K measures "mathematical reasoning" or "logical reasoning" didn't come from the skeptics.

Alan Turing didn't try to define intelligence; he created a practical test that he thought would be a good benchmark. These days we believe we have better ones.

> I just don’t find this kind of critique specific enough to have any sort of scientific validity. IMHO

"Good cognition" seems like dismisal of a definition, but this is exactly the definition that the people working on this care about. They are not philosphers, they are engineers who are trying to make a system "better" so "good cognition" is exactly what they want.

The paper digs into finding out more about what types of changes impact performance on established metrics. The "no-op" result is pretty interesting, since "relevancy detection" isn't something we commonly think of as key to "good cognition", but rather as a consequence of it.

32. NemoNobody ◴[] No.41898231[source]
What part of AI today leads you to believe that an AGI would be capable of self-directed creativity? Today that is impossible - no AI is truly generating "new" stuff, no poetry is constructed creatively, no images are born from a feeling, and inspiration is only part of AI generation if you consider it to be utilizing its training data, which isn't actually creativity.

I'm not sure why everyone assumes an AGI would just automatically do creativity, considering most people are not very creative; despite quite literally being capable, most people can't create anything. Why wouldn't an AGI have the same issues with being "awake" that we do? Being capable of knowing stuff - as you pointed out, far more facts than a person ever could - I think an awake AGI may even have more "issues" with the human condition than us.

Also - say an AGI comes into existence that is awake, happy and capable of truly original creativity - why tf does it write us poetry? Why solve world hunger - it doesn't hunger. Why cure cancer - what can cancer do to it?

AGI as currently envisioned is a mythos of fantasy and science fiction.

33. lanstin ◴[] No.41898241{8}[source]
It's like asking what magic is - it turns out to be the ability to go from interior thoughts to stuff happening in the shared world; physics is just the mechanism of the particular magical system we have.
34. NemoNobody ◴[] No.41898385{6}[source]
This is one of my favorite comments I've ever read on HN.

The first three paragraphs you wrote very succinctly and obviously summarize the fundamental flaw of our modern science - that it can't make leaps, at all.

There is no leap of faith in science but there is science that requires such leaps.

We are stuck because those most capable of comprehending concepts that they don't understand and that are unexplainable won't allow themselves to even develop a vague understanding of such concepts. The scientific method is their trusty hammer, and their faith in it renders all that isn't a nail unscientific.

Admitting that they don't know enough would be akin to societal suicide of their current position as the deciders of what is or isn't true, so I don't expect them to withhold their conclusions until they are more able to.

They are the "priest class" now ;)

I agree with your humble opinion - there is much more we could learn if that was our intent, and considering the potential of this, I think we absolutely ought to make certain that we do everything in our power to attain the best possible outcomes of these current and future developments.

Transparent and honest collaboration for the betterment of humanity is the only right path to an AGI god - to oversimplify a lil bit.

Very astute, well formulated position, presented in accessible language and with humility even!

Well done.

35. ddingus ◴[] No.41899772{5}[source]
I think my point stands, despite a poor example.[0]

Other examples exist.

[0] That example is due to tokenization. D'oh! I knew better too.

Ah well.