
333 points mooreds | 6 comments
1. merizian No.44484927
The problem with the argument is that it assumes future AIs will solve problems like humans do. In this case, it’s that continuous learning is a big missing component.

In practice, continual learning has not been an important component of progress in deep learning thus far. Instead, large diverse datasets and scale have proven to work best. I believe a good argument for continual learning being necessary needs to directly address why the massive cross-task learning paradigm will stop working, and ideally make concrete bets on what skills will be hard for AIs to achieve. I think generally, anthropomorphisms lack predictive power.

I think the real crux may be the amount of acceleration you can achieve once you get very competent programming AIs spinning the RL flywheel. The author mentioned uncertainty about this, which is fair, and I share it. But it leaves the rest of the piece feeling too overconfident.

replies(3): >>44486063 #>>44486155 #>>44488036 #
2. 827a No.44486063
Continuous learning might not have been important in the history of deep learning so far, but that might just be because the deep learning folks are measuring the wrong thing. If you want to build the most intelligent AI ever, based on whatever synthetic benchmark is hot this month, then you'd do exactly what the labs are doing. If you want to build the most productive and helpful AI, intelligence isn't always the best goal. It's usually helpful, but an idiot who learns from his mistakes is often more valuable than an egotistical genius.
replies(1): >>44489676 #
3. Davidzheng No.44486155
Well, AlphaProof used test-time-training methods to generate similar problems (AlphaZero-style) for each question it encountered.
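
Roughly, the recipe looks like this (a minimal structural sketch of the idea, not AlphaProof's actual code; generate_variants, fine_tune, and the dict-based model are all placeholders of my own):

    # Test-time training (TTT): specialize a copy of the model on
    # synthetic neighbours of the current problem before answering it.
    # All names below are hypothetical stand-ins, not AlphaProof's API.

    def generate_variants(problem, n):
        # Stand-in for the variant-generation step.
        return [f"{problem} (variant {i})" for i in range(n)]

    def fine_tune(model, problems):
        # Stand-in for AlphaZero-style self-play plus gradient updates;
        # stubbed here as recording the synthetic data.
        model["seen"] = problems
        return model

    def solve_with_ttt(model, problem, n_variants=3):
        variants = generate_variants(problem, n_variants)
        specialized = fine_tune(dict(model), variants)  # train a throwaway copy
        return f"answer after specializing on {len(specialized['seen'])} variants"

    print(solve_with_ttt({"seen": []}, "problem 1"))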
4. imtringued No.44488036
>The problem with the argument is that it assumes future AIs will solve problems like humans do. In this case, it’s that continuous learning is a big missing component.

>I think generally, anthropomorphisms lack predictive power.

I didn't expect someone to get this part so wrong the way you did. Continual learning has almost nothing to do with humans or anthropomorphism. If anything, continual learning is the bitter lesson cranked up to the next level. Rather than carefully curating datasets using human labor, the system learns on its own even when presented with an unfiltered garbage data stream.

>I believe a good argument for continual learning being necessary needs to directly address why the massive cross-task learning paradigm will stop working, and ideally make concrete bets on what skills will be hard for AIs to achieve.

The reason why I in particular am so interested in continual learning has pretty much zero to do with humans. Sensors and mechanical systems change their properties over time through wear and tear. You can build a static model of the system's properties, but the static model will fail, because the real system has changed and you now have a permanent modelling error. Correcting the modelling error requires changing the model, hence continual learning has become mandatory. I think it is pretty telling that you failed to take the existence of reality (a separate entity from the model) into account. The paradigm didn't stop working, it never worked in the first place.
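
To make that concrete, here is a runnable toy of my own construction (purely illustrative): a sensor gain drifts with wear, a static calibration fitted once accrues a permanent error, and an online estimate updated from every new reading tracks the drift.

    # Toy model of sensor drift. The static calibration is fitted once
    # and accrues a permanent modelling error; the online estimate is
    # updated from each reading and tracks the drift.
    import random

    true_gain = 2.0    # actual sensor gain, changes with wear
    static_gain = 2.0  # calibrated once at t=0, never updated
    online_gain = 2.0  # continually updated from new readings
    lr = 0.05          # online learning rate

    for t in range(1000):
        true_gain += 0.001                          # slow drift from wear and tear
        x = random.uniform(1.0, 10.0)               # known excitation
        y = true_gain * x + random.gauss(0.0, 0.1)  # noisy sensor reading

        # Normalized online gradient step on the squared prediction error.
        err = online_gain * x - y
        online_gain -= lr * err * x / (x * x)

    print(f"true gain:   {true_gain:.3f}")    # ~3.0 after drifting
    print(f"static gain: {static_gain:.3f}")  # stuck at 2.0: permanent error
    print(f"online gain: {online_gain:.3f}")  # tracks the drift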

It might be difficult to understand the bitter lesson, but let me rephrase it once more: Generalist compute scaling approaches will beat approaches based around human expert knowledge. Continual learning reduces the need for human expert knowledge in curating datasets, making it the next step in the generalist compute scaling paradigm.

replies(1): >>44492286 #
5. energy123 No.44489676
The LLM does learn from its mistakes, while it's training: each epoch, it learns from them.
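
A toy sketch of that distinction (my own illustration, nothing more): during training the weights move in response to every error; at deployment they are frozen, so new mistakes teach nothing.

    w = 0.0                                      # model: y = w * x
    data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # true relation is y = 3x
    lr = 0.02

    for epoch in range(50):
        for x, y in data:
            err = w * x - y    # the model's mistake on this example
            w -= lr * err * x  # learning from it: one gradient step

    print(f"after training: w = {w:.3f}")  # converges to ~3.0

    # Deployment: w is frozen, so new mistakes no longer change the model.
    print(f"prediction for x=4: {w * 4.0:.2f}")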
6. merizian No.44492286
> The reason why I in particular am so interested in continual learning has pretty much zero to do with humans. Sensors and mechanical systems change their properties over time through wear and tear.

To be clear, this isn’t what Dwarkesh was pointing at, and I think you are using the term “continual learning” differently than he does. And he is primarily interested in it because humans do it.

The article introduces a story about how humans learn, and calls it continual learning:

> How do you teach a kid to play a saxophone? You have her try to blow into one, listen to how it sounds, and adjust. Now imagine teaching saxophone this way instead: A student takes one attempt. The moment they make a mistake, you send them away and write detailed instructions about what went wrong. The next student reads your notes and tries to play Charlie Parker cold. When they fail, you refine the instructions for the next student … This just wouldn’t work … Yes, there’s RL fine tuning. But it’s just not a deliberate, adaptive process the way human learning is.

The point I’m making is just that this is bad form: “AIs can’t do X, but humans can. Humans do task X because they have Y, but AIs don’t have Y, so AIs will find X hard.” Consider replacing X with “common sense reasoning” and Y with “embodied experience”. That would have seemed reasonable in 2020, but ultimately would have been a bad bet.

I don’t disagree with anything else in your response. I also buy into the bitter lesson (and generally: easier to measure => easier to optimize). I think it’s just different uses of the same terms. And I don’t necessarily think that what you’re referring to as continual learning won’t work.