
336 points mooreds | 2 comments
merizian ◴[] No.44484927[source]
The problem with the argument is that it assumes future AIs will solve problems like humans do. In this case, it’s that continuous learning is a big missing component.

In practice, continual learning has not been an important component of improvement in deep learning history thus far. Instead, large diverse datasets and scale have proven to work the best. I believe a good argument for continual learning being necessary needs to directly address why the massive cross-task learning paradigm will stop working, and ideally make concrete bets on what skills will be hard for AIs to achieve. I think generally, anthropomorphisms lack predictive power.

I think the real crux may be the amount of acceleration you can achieve once you get very competent programming AIs spinning the RL flywheel. The author mentioned uncertainty about this, which is fair, and I share that uncertainty. But it leaves the rest of the piece feeling overconfident.

replies(3): >>44486063 #>>44486155 #>>44488036 #
1. imtringued ◴[] No.44488036[source]
>The problem with the argument is that it assumes future AIs will solve problems like humans do. In this case, it’s that continuous learning is a big missing component.

>I think generally, anthropomorphisms lack predictive power.

I didn't expect someone to get this part as wrong as you did. Continual learning has almost nothing to do with humans or anthropomorphism. If anything, continual learning is the bitter lesson cranked up to the next level. Rather than carefully curating datasets with human labor, the system learns on its own, even when presented with an unfiltered stream of garbage data.

>I believe a good argument for continual learning being necessary needs to directly address why the massive cross-task learning paradigm will stop working, and ideally make concrete bets on what skills will be hard for AIs to achieve.

The reason why I in particular am so interested in continual learning has pretty much zero to do with humans. Sensors and mechanical systems change their properties over time through wear and tear. You can build a static model of the system's properties, but the static model will fail, because the real system has changed and you now have a permanent modelling error. Correcting the modelling error requires changing the model, hence continual learning has become mandatory. I think it is pretty telling that you failed to take the existence of reality (a separate entity from the model) into account. The paradigm didn't stop working, it never worked in the first place.
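
To make it concrete, here is a toy sketch (entirely my own illustration, every number made up): a sensor whose gain drifts with wear, one model fit once and then frozen, and the same model corrected online from each new observation.

    import random

    true_gain = 2.0     # real sensor gain, slowly drifts through wear and tear
    static_gain = 2.0   # fit once at t=0, never updated again
    online_gain = 2.0   # corrected a little from every new observation
    lr = 0.05           # step size for the online correction

    for t in range(1000):
        true_gain += 0.001                    # the real system changes
        x = random.uniform(-1.0, 1.0)         # input signal
        y = true_gain * x                     # what reality actually returns

        static_err = y - static_gain * x      # permanent, growing modelling error
        online_err = y - online_gain * x      # stays small
        online_gain += lr * online_err * x    # continual learning: update the model

    print(f"static model error on last sample: {static_err:+.3f}")
    print(f"online model error on last sample: {online_err:+.3f}")

The frozen model isn't wrong because it was fit badly; it's wrong because reality moved underneath it.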

It might be difficult to understand the bitter lesson, but let me rephrase it once more: Generalist compute scaling approaches will beat approaches based around human expert knowledge. Continual learning reduces the need for human expert knowledge in curating datasets, making it the next step in the generalist compute scaling paradigm.

replies(1): >>44492286 #
2. merizian ◴[] No.44492286[source]
> The reason why I in particular am so interested in continual learning has pretty much zero to do with humans. Sensors and mechanical systems change their properties over time through wear and tear.

To be clear, this isn’t what Dwarkesh was pointing at, and I think you are using the term “continual learning” differently to him. And he is primarily interested in it because humans do it.

The article introduces a story about how humans learn, and calls it continual learning:

> How do you teach a kid to play a saxophone? You have her try to blow into one, listen to how it sounds, and adjust. Now imagine teaching saxophone this way instead: A student takes one attempt. The moment they make a mistake, you send them away and write detailed instructions about what went wrong. The next student reads your notes and tries to play Charlie Parker cold. When they fail, you refine the instructions for the next student … This just wouldn’t work … Yes, there’s RL fine tuning. But it’s just not a deliberate, adaptive process the way human learning is.
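
Stripped of the analogy, the mechanism he’s contrasting is roughly “the parameters move after feedback” versus “the parameters are frozen and only the text passed in gets refined.” A throwaway sketch of the two loops (entirely my framing on a made-up toy task, not anything from the article):

    hidden_target = 7.3   # the "skill" to be acquired (toy stand-in)

    # Regime 1: feedback updates the internal parameter itself.
    param = 0.0
    for attempt in range(20):
        error = hidden_target - param
        param += 0.5 * error              # the "weights" actually move

    # Regime 2: the parameter is frozen; feedback only accumulates as notes.
    frozen_param = 0.0
    notes = []
    for attempt in range(20):
        error = hidden_target - frozen_param
        notes.append(f"attempt {attempt}: off by {error:+.2f}")
        # frozen_param never changes; the feedback lives only in the notes

    print(f"learning agent ends at {param:.2f}")            # converges to ~7.3
    print(f"frozen agent still at {frozen_param:.2f} after {len(notes)} notes")

In the second regime the information is all there, but nothing inside the system ever changes; that is the gap the article is pointing at.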

The point I’m making is just that this is bad form: “AIs can’t do X, but humans can. Humans can do X because they have Y, but AIs don’t have Y, so AIs will find X hard.” Suppose I replace X with “common sense reasoning” and Y with “embodied experience”. That would have seemed reasonable in 2020, but ultimately would have been a bad bet.

I don’t disagree with anything else in your response. I also buy into the bitter lesson (and generally: easier to measure => easier to optimize). I think we’re just using the same terms differently. And I don’t necessarily think that what you’re referring to as continual learning won’t work.