
625 points | lukebennett | 1 comment
osigurdson No.42144420
This "running out of data" thing suggests that there is something fundamentally wrong with how things are working. A new driver does not need to experience 8000 different rabbit-on-road situations from all angles to know to slow down when we see one on the road. Similarly we don't need 10,000 addition examples to learn how to add. It is as though there is no generalization in the models - just fundamentally search.
surrTurr No.42144498
I think you underestimate the amount of data a driver experiences in a single 5-minute drive.
qnleigh No.42159546
A charitable interpretation of what you're saying is that humans produce lots of original data from their experiences of the world: thinking about those experiences, imagining what they would have done differently, and perhaps even dreaming. I agree with the root comment that something is fundamentally missing, and it is probably the ability to iteratively learn from one's own experience, test one's understanding, and recursively improve.
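
As a sketch of what that loop might look like (all hypothetical stand-in code, nothing like a real training pipeline): generate candidate answers, keep only the ones an external check verifies, and "fine-tune" on your own verified outputs:

    import random

    # Toy "model": answers addition problems with noise. Its fine_tune()
    # is a stand-in for a gradient update: verified examples shrink the noise.
    class ToyModel:
        def __init__(self):
            self.noise = 10

        def answer(self, a, b):
            return a + b + random.randint(-self.noise, self.noise)

        def fine_tune(self, verified):
            self.noise = max(0, self.noise - len(verified))

    def verify(a, b, ans):
        # External ground truth: a calculator, unit test, or proof checker.
        return ans == a + b

    model = ToyModel()
    for rnd in range(10):
        verified = []
        for _ in range(40):
            a, b = random.randint(0, 9), random.randint(0, 9)
            ans = model.answer(a, b)
            if verify(a, b, ans):
                verified.append((a, b, ans))
        model.fine_tune(verified)  # learn only from outputs that passed the check
        print(rnd, "noise =", model.noise, "verified =", len(verified))

The catch is the verifier: this works where answers can be checked programmatically (math, code), much less so for open-ended text.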

There are definitely teams working on applying reinforcement learning to LLMs. Maybe that will unlock new potential from finite training data.
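
For a sense of the mechanism, here's a bare-bones REINFORCE toy: a tabular policy over a 3-token vocabulary, rewarded for emitting "a b". Purely illustrative and nothing like real RLHF infrastructure, but the point stands: the reward signal, not new human text, drives the update:

    import math, random

    vocab = ["a", "b", "c"]
    logits = {(pos, tok): 0.0 for pos in range(2) for tok in vocab}
    lr = 0.5

    def prob(pos, tok):
        # Softmax over this position's logits.
        z = sum(math.exp(logits[(pos, t)]) for t in vocab)
        return math.exp(logits[(pos, tok)]) / z

    def sample(pos):
        # Draw one token from the categorical distribution at this position.
        r = random.random()
        for t in vocab:
            r -= prob(pos, t)
            if r <= 0:
                return t
        return vocab[-1]

    for step in range(500):
        seq = [sample(0), sample(1)]
        reward = 1.0 if seq == ["a", "b"] else 0.0  # programmatic reward, no new data
        # REINFORCE: d(log pi(chosen))/d(logit_t) = (1 if t == chosen else 0) - p_t
        for pos, chosen in enumerate(seq):
            ps = {t: prob(pos, t) for t in vocab}
            for t in vocab:
                g = (1.0 if t == chosen else 0.0) - ps[t]
                logits[(pos, t)] += lr * reward * g

    print(sample(0), sample(1))  # after training: almost always "a b"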