The upshot of this is that LLMs are quite good at the stuff that he thinks only humans will be able to do. What they aren't so good at (yet) is really rigorous reasoning, exactly the opposite of what 20th century people assumed.
I'd say no, human brains are "trained" on billions of years of sensory data. A very small amount of that is human-generated.
My point in bringing up that metaphor is to focus the analogy: when people say "we're just statistical models trained on sensory data", we focus far too much on the "sensory data" part, which has led, for example, to AI manufacturers investing billions of dollars into slurping up as much human intellectual output as possible to train "smarter" models.
The focus on the sensory input inherently devalues our quality of being: it implies that who we are is predominantly explicable by the world around us.
However, we should be focusing on the "statistical model" part: even if it is accurate to describe the human brain holistically as a statistical model trained on sensory data (which I have doubts about, but those are fine to leave to the side), it's very clear that the fundamental statistical model in human brains is simply so far superior that comparing it to an LLM is like comparing us to a dog.
It should also be a focal point for AI manufacturers and researchers. If you are on the hunt for something along the spectrum of human-level intelligence, and during this hunt you provide it ten thousand lifetimes of sensory data to produce something that, maybe, if you ask it right, can behave similarly to a human who has trained in the domain for only years: you're barking up the wrong tree. What you're producing isn't even on the same spectrum; that doesn't mean it isn't useful, but it's not human-like intelligence.
LLMs have access to what we generate, but not the source. So they embed how we may use words, but not why we use one word and not others.
I don't understand this point - we can obviously collect sensory data and use that for training. Many AI/LLM/robotics projects do this today...
> So they embed how we may use words, but not why we use one word and not others.
Humans learn language by observing other humans use language, not by being taught explicit rules about when to use which word and why.
Here's my broad concern: on the one hand, we have an AI thought leader (Sam Altman) who defines superintelligence as surpassing human intelligence at all measurable tasks. I don't believe it's controversial to say that we've established that the goal of LLM intelligence is something along these lines: it exists on the spectrum of human intelligence, it's trained on human intelligence, and we want it to surpass human intelligence on that spectrum.
On the other hand: we don't know how the statistical model of human intelligence works at any level that would enable reproduction or comparison, and there's really good reason to believe that the human statistical model is vastly superior to the LLM one. The argument for this lies in my previous comment: the vast majority of intelligence advances in LLMs come from increasing the volume of training data. Some likely come from statistical-modeling breakthroughs since the transformer, but by and large it's the training data. Comparatively speaking, the most intelligent humans are not more intelligent because they've been alive longer and thus had access to more sensory data. Some minor level of intelligence comes from the quality of your sensory data (studying, reading, education). But the vast majority of the intelligence difference between humans is inexplicable; Einstein was just Born Smarter; God granted him a unique and better statistical model.
This points to the undeniable reality that, at the very least, the statistical models of the human brain and of an LLM are very different, which should cause you to raise eyebrows at Sam Altman's statement that superintelligence will evolve along the spectrum of human intelligence. It might, but it's like arguing that the app you're building is going to be the highest-quality and fastest MacOS app ever built, and you're building it using WPF and compiling it for x86 to run on WINE and Rosetta. GPT isn't human intelligence; at best, it might be emulating, extremely poorly and inefficiently, some parts of human intelligence. But they didn't get the statistical model right, and without that it's like forcing a square peg into a round hole.
Sensory data is not the main issue; how we interpret it is.
In Jacob Bronowski's The Origins of Knowledge and Imagination, IIRC, there's an argument that our eyes are very coarse sensors. Instead, they do basic analysis from which the brain, combined with data from other organs, can infer the real world around us. Like Plato's cave, but with many more dimensions.
But we humans all come with the same mechanisms, which roughly interpret things the same way. So there's some commonality there in the final interpretation.
> Humans learn language by observing other humans use language, not by being taught explicit rules about when to use which word and why.
Words are symbols that refer to things and the relations between them. In the same book, there's a rough explanation of language that describes the three elements defining it: symbols or terms, the grammar (the rules for using the symbols), and a dictionary that maps the symbols to things, and the rules to interactions, in another domain that we already accept as truth.
Maybe we are not taught the rules explicitly, but there's a lot of training done with corrections when we say a sentence incorrectly. We also learn the symbols and the dictionary as we grow and explore.
So LLMs learn the symbols and the rules, but not the whole dictionary. They can use the rules to create correct sentences and relate some symbols to others, but ultimately there's no dictionary behind it.
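To make that three-part framing concrete, here is a toy formalization (my own illustrative sketch, not from Bronowski's book; all names are invented):

```python
# Toy model of the three elements of language: symbols, grammar (rules),
# and a dictionary grounding the symbols in a world.

# Symbols/terms.
symbols = {"cat", "mat", "sat_on"}

# Grammar: which combinations of symbols form a valid sentence.
def is_valid(subject, relation, obj):
    """A sentence is valid if the relation is known and subject != object."""
    return relation == "sat_on" and subject != obj

# Dictionary: maps symbols to referents -- the grounding the commenter
# argues LLMs lack. Here the "world" is just a dict of descriptions.
world = {"cat": "a small furry animal", "mat": "a flat floor covering"}

sentence = ("cat", "sat_on", "mat")
print(is_valid(*sentence))   # grammatical, per the toy rule
print(world[sentence[0]])    # what "cat" refers to
```

On this sketch, the claim is that an LLM learns `symbols` and `is_valid` from text alone, but has no independent access to `world`.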
Because we can't compare human and LLM architectural substrates, LLMs will never surpass human-level performance on _all_ tasks that require applying intelligence?
If my summary is correct, is there any hypothetical replacement for LLMs (for example, LLMs+robotics, LLMs with CoT, multi-modal LLMs, multi-modal generative AI systems, etc.) that would cause you to consider this argument invalid (i.e., the replacement could, at some point, replace humans for all tasks)?
There are two types of grammar for natural language: descriptive (how the language actually works and is used) and prescriptive (a set of rules about how a language should be used). There is no known complete and consistent rule-based grammar for any natural human language; all such grammars are based on some person or people, in a particular period of time, selecting a subset of the real descriptive grammar of the language and saying 'this is the better way'. Prescriptive, rule-based grammar is not at all how humans learn their first language, nor is prescriptive grammar generally complete or consistent. Babies can easily learn any language, even ones that have no prescriptive grammar rules, just by observing; many studies confirm this.
> there's a lot of training done with corrections when we say a sentence incorrectly.
There's a lot of the same training for LLMs.
> So LLMs learn the symbols and the rules, but not the whole dictionary. They can use the rules to create correct sentences and relate some symbols to others, but ultimately there's no dictionary behind it.
LLMs definitely learn 'the dictionary' (more accurately, a set of relations/associations between words and other types of data), and much better than humans do; it's not as though such a 'dictionary' is a discrete, identified part of the human brain either.
LLM luddites often call LLMs stochastic parrots or advanced text prediction engines. They're right, in my view, and I feel that LLM evangelists often don't understand why. Because LLMs have a vastly different statistical model, even when they showcase signs of human-like intelligence, what we're seeing cannot possibly be human-like intelligence, because human intelligence is inseparable from its statistical model.
But, it might still be intelligence. It might still be economically productive and useful and cool. It might also be scarier than most give it credit for being; we're building something that clearly has some kind of intelligence, crudely forcing a mask of human skin over it, oblivious to what's underneath.
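For readers unfamiliar with the "advanced text prediction engine" framing above, here is the simplest possible version of the idea: a bigram model that picks the next word purely from co-occurrence counts. This is a deliberately crude sketch of statistical text prediction, nothing like a transformer's actual model:

```python
import random
from collections import defaultdict

# Toy "stochastic parrot": predict the next word purely from how often
# each word followed the previous one in the training text.
corpus = "the cat sat on the mat and the cat slept".split()

counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict(word):
    """Sample a next word in proportion to how often it followed `word`."""
    followers = counts[word]
    words = list(followers)
    weights = [followers[w] for w in words]
    return random.choices(words, weights=weights)[0]

# "the" was followed by "cat" twice and "mat" once, so "cat" is sampled
# twice as often.
print(predict("the"))
```

The model reproduces plausible continuations without any representation of what the words refer to, which is exactly the property the "parrot" label points at; the open question in the thread is whether scale changes that in kind or only in degree.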
I don't buy it. I think our eyes are approximately as fine as we perceive them to be.
When you look through a pair of binoculars at a boat and some trees on the other side of a lake, the only organ getting a magnified view is the eyes, so any information you derive comes from the eyes and your imagination; it can't have been secretly inferred from other senses.
There's no reason to think an LLM (a few generations down the line, if not now) cannot do that.
And we can distort quite far (see cartoons in drawing, dubstep in music, ...).
- basic features (color, brightness and contrast, edges and shapes, motion and direction)
- depth and spatial relationships
- recognition
- location and movement
- focus and attention
- prediction and filling in gaps
“Seeing” the real world requires much more than simply seeing with one eye.