
577 points simonw | 6 comments
1. bob1029 ◴[] No.44725490[source]
> still think it’s noteworthy that a model running on my 2.5 year old laptop (a 64GB MacBook Pro M2) is able to produce code like this—especially code that worked first time with no further edits needed.

I believe we are vastly underestimating what our existing hardware is capable of in this space. I worry that narratives like the bitter lesson and the efficient compute frontier are pushing a lot of brilliant minds away from investigating revolutionary approaches.

It is obvious that the current models are deeply inefficient when you consider how much you can decimate the precision of the weights post-training and still have pelicans on bicycles, etc.
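A rough way to see how much precision is expendable: round-trip a weight tensor through a low-bit quantizer and measure the reconstruction error. Here is a minimal NumPy sketch of that idea (purely illustrative; real schemes such as per-channel scales, GPTQ-style calibration, or llama.cpp k-quants are more sophisticated):

    import numpy as np

    def quantize_roundtrip(w, bits):
        # Symmetric per-tensor quantization to `bits` bits, then dequantize.
        qmax = 2 ** (bits - 1) - 1           # e.g. 7 for 4-bit, 127 for 8-bit
        scale = np.max(np.abs(w)) / qmax     # one scale shared by the whole tensor
        q = np.clip(np.round(w / scale), -qmax, qmax)
        return q * scale                     # back to float to compare against the original

    rng = np.random.default_rng(0)
    w = rng.normal(0, 0.02, size=(4096, 4096)).astype(np.float32)  # toy weight matrix

    for bits in (8, 4, 3, 2):
        w_hat = quantize_roundtrip(w, bits)
        rel_err = np.linalg.norm(w - w_hat) / np.linalg.norm(w)
        print(f"{bits}-bit: relative error {rel_err:.3f}")

The single per-tensor scale here is the crudest possible choice; grouping scales per channel or per block is what makes the low-bit end usable in practice.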

replies(2): >>44725533 #>>44758837 #
2. jonas21 ◴[] No.44725533[source]
Wasn't the bitter lesson about training on large amounts of data? The model that he's using was still trained on a massive corpus (22T tokens).
replies(2): >>44725644 #>>44725671 #
3. yahoozoo ◴[] No.44725644[source]
What does that have to do with quantizing?
4. itsalotoffun ◴[] No.44725671[source]
I think GP means that if you internalize the bitter lesson (more data and more compute wins), you stop imagining how to squeeze SOTA-minus-1 performance out of constrained compute environments.
replies(1): >>44730409 #
5. reactordev ◴[] No.44730409{3}[source]
This. When we ran out of speed on the CPU, we moved to the GPU. Same thing here. The more we work with these (22T-token) models, quants, and reduced precision, the more we learn and the more novel approaches we find.
6. Breza ◴[] No.44758837[source]
Very well put. There's a lot to be gained from using smaller models and existing hardware. So many enterprise PMs skip straight to a cutting-edge LLM called via API. For many tasks, a self-hosted LLM or even a fine-tuned small language model can handle a preliminary step, or the full task, for much less money. And if a self-hosted model can do the job today, imagine what you'll be able to do in a year or five with more powerful hardware and even better models.
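To make the "preliminary step" idea concrete, here is a minimal sketch of that cascade; the model calls are stand-ins I've made up, not any particular API (swap in e.g. a llama.cpp server call and a hosted-LLM client):

    def local_model(text):
        # Placeholder for a self-hosted / fine-tuned small model
        # returning (label, confidence). An assumption, not a real API.
        return "other", 0.5

    def api_model(text):
        # Placeholder for the expensive frontier LLM, called only when needed.
        return "other"

    def classify_ticket(text):
        label, confidence = local_model(text)    # cheap preliminary pass
        if confidence >= 0.9:                    # easy cases stop here, costing almost nothing
            return label
        return api_model(text)                   # escalate only the hard minority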