[1] Large language models may become an important component in whatever comes next, but I think we still need a component that can do proper reasoning and has proper memory not susceptible to hallucinating facts.
It seems a matter of perspective to me whether you call it "dead end" or "stepping stone".
To give some pause before dismissing the current state of the art prematurely:
I would already consider current LLM-based systems more "intelligent" than a housecat. And a pet's intelligence is enough to have ethical implications, so we have arguably reached a very important milestone already.
I would argue that the biggest limitation on current "AI" is that it is architected not to have agency; if you had GPT-3-level intelligence in an easily anthropomorphizable package (Furby-style, capable of emoting/communicating by itself), public outlook might shift drastically without any real technical progress.
I do suspect this is only achievable because the model was specifically trained for this.
But the same is true for humans; children can't really "reason themselves" into basic arithmetic-- that's a skill that requires considerable training.
I do concede that this (learning / skill acquisition) is something that humans can do "online" (within days/weeks/months) while LLMs need a separate process for it.
> in a strong version of this test I would want nothing related to long multiplication in the training data.
Is this not a bit of a double standard? I think at least 99/100 humans with minimal previous math exposure would utterly fail this test.
The models can do surprisingly large numbers correctly, but they have essentially memorized them. As you make the numbers longer and longer, the result becomes garbage. If they actually reasoned about it, this would not happen; multiplying those long numbers is not really harder than multiplying two-digit numbers, just more time consuming and annoying.
And I do not want the model to figure multiplication out on its own, I want to provide it with what teachers tell children until they get to long multiplication. The only place I want to push the AI is to do it for much longer numbers, not just the two, three, or four digits you do in primary school.
And the difference is not only online vs. offline learning; large language models have almost certainly been trained on heaps of basic mathematics, but did not learn to multiply. They can explain to you how to do it because they have seen countless explanations and examples, but they cannot actually do it themselves.
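To make the "not harder, just longer" point concrete, here is a minimal Python sketch of the schoolbook algorithm (the function name and the decimal-string representation are my own choices, purely for illustration); it uses only single-digit products and carries, exactly the procedure taught in primary school, and the same steps work for numbers of any length:

    # Schoolbook long multiplication on decimal strings, using only single-digit
    # products and carries -- the procedure taught in primary school.
    def long_multiply(a: str, b: str) -> str:
        result = [0] * (len(a) + len(b))          # room for the full product
        for i, da in enumerate(reversed(a)):      # digits of a, least significant first
            carry = 0
            for j, db in enumerate(reversed(b)):  # digits of b
                total = result[i + j] + int(da) * int(db) + carry
                result[i + j] = total % 10
                carry = total // 10
            result[i + len(b)] += carry           # carry out of this row
        digits = "".join(map(str, reversed(result))).lstrip("0")
        return digits or "0"

    # The steps for 30-digit numbers are identical to those for 2-digit numbers,
    # there are just more of them.
    print(long_multiply("123456789012345678901234567890",
                        "987654321098765432109876543210"))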
Only a very small % of the population is leveraging AI in any meaningful way. But I think today's tools are sufficient for them to do so if they wanted to start and will only get better (even if the LLMs don't, which they will).
When I wrote dead end, I meant for achieving an AI that can properly reason and knows what it knows and maybe is even able to learn. For finding stuff in heaps of text, large language models are relatively fine and can improve productivity, with the somewhat annoying fact that one has to double check what the model says.
> When early automobiles began appearing in the 1890s — first steam-powered, then electric, then gasoline — most carriage and wagon makers dismissed them. Why wouldn’t they? The first cars were loud and unreliable, expensive and hard to repair, starved for fuel in a world with no gas stations, and unsuitable for the dirt roads of rural America.
That sounds like complaints against today's LLM limitations. It will be interesting to see how your comment ages in 5-10-15 years. You might be technically right that LLMs are a dead end. But the article isn't about LLMs really, it's about the change to an "AI" world from a non-AI world and how the author believes it will be similar to the change from the non-car to the car world.
You might even say LLMs are good with text in the same way that early automobiles were good for transportation, provided you watched out for the potholes and stream crossings and didn't try to cross the river on the railroad bridge. (DeLoreans are said to be good at that, though :).)
edit (it's late, I'm just being snarky. I don't think researchers whose jobs are implicitly tied to hype are a good example of workers increasing their productivity)
An interesting experiment would be to have a robot with an LLM mind and see what things it could figure out, like would it learn to charge itself or something. But personally I don't think they have anywhere near the general intelligence of animals.
That last one isn’t useful to society, but it is for the individual.
I know plenty of people using LLMs for stuff like this, in all sorts of walks of life.
IMO "ability to communicate" is a somewhat fair proxy for intelligence (even if it does not capture all of an animals capabilities), and current LLMs are clearly superior to any animal in that regard.
“Do long arithmetic entirely in your mind” is not a test most humans can pass. Maybe a few savants. This makes me suspect it is not a reliable test of reasoning.
Humans also get a training run every night. As we sleep, our brains are integrating our experiences from the day into our existing minds, so we can learn things from day to day. Kids definitely do not learn long multiplication in just one day. LLMs don’t work like this; they get only one training run and that is when they have to learn everything all at once.
LLMs for sure cannot learn and reason the same way humans do. Does that mean they cannot reason at all? Harder question IMO. You’re right that Python did the math, but the LLM wrote the Python. Maybe that is like their version of “doing it on paper.”
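For what it's worth, the Python the model writes in that situation typically amounts to a couple of lines like the sketch below (a hypothetical example, not output from any particular model): the interpreter's arbitrary-precision integers do the digit-level work, the model only has to produce the expression.

    # Hypothetical sketch of the Python an LLM might emit when asked to multiply.
    # Python ints are arbitrary precision, so the interpreter carries out the
    # digit-by-digit work; the model only writes the expression.
    a = 123456789012345678901234567890
    b = 987654321098765432109876543210
    print(a * b)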
Also, I am not asking it to learn this in one day; you can dump everything that a child would hear and read during primary school into the context. You can even do it interactively, maybe the model has questions.