1480 points sandslash | 19 comments
1. khalic ◴[] No.44317209[source]
His dismissal of smaller and local models suggests he underestimates their improvement potential. Give phi4 a run and see what I mean.
replies(5): >>44317248 #>>44317295 #>>44317350 #>>44317621 #>>44317716 #
2. TeMPOraL ◴[] No.44317248[source]
He ain't dismissing them. Comparing local/"open" models to Linux (and closed services to Windows and macOS) is high praise. It's also accurate.
replies(1): >>44317671 #
3. sriram_malhar ◴[] No.44317295[source]
Of all the things you could suggest, a lack of understanding is not one that can be pinned on Karpathy. He does know his technical stuff.
replies(1): >>44317657 #
4. diggan ◴[] No.44317350[source]
> suggests a lack of understanding of these smaller models capabilities

If anything, you're showing a lack of understanding of what he was talking about. The context is this specific time, where we're early in an ecosystem and things are expensive and likely centralized (a la mainframes), but if his analogy/prediction is correct, we'll have a "Linux" moment in the future where that equation changes (again) and local models are competitive.

And while I'm a huge fan of local models and run them for maybe 60-70% of what I do with LLMs, they're nowhere near the proprietary ones today, sadly. I want them to be, really badly, but it's important to be realistic here and recognize the difference between what a normal consumer can run and what the current "mainframes" can run.

replies(2): >>44317720 #>>44317744 #
5. mprovost ◴[] No.44317621[source]
You can disagree with his conclusions but I don't think his understanding of small models is up for debate. This is the person who created micrograd/makemore/nanoGPT and who has produced a ton of educational materials showing how to build small and local models.
replies(1): >>44317729 #
6. khalic ◴[] No.44317657[source]
We all have blind spots
replies(1): >>44317692 #
7. khalic ◴[] No.44317671[source]
This is a bad comparison
8. diggan ◴[] No.44317692{3}[source]
Sure, but maybe suggesting that the person who literally spent countless hours educating others on how to build small models locally from scratch is lacking knowledge about local small models is going a bit beyond "people have blind spots".
replies(1): >>44317749 #
9. dist-epoch ◴[] No.44317716[source]
I tried the local small models. They are slow, much less capable, and ironically much more expensive to run than the frontier cloud models.
replies(1): >>44317776 #
10. khalic ◴[] No.44317720[source]
He understands the technical part, of course; I was referring to his prediction that large models will always be necessary.

There is a point where an LLM is good enough for most tasks. I don't need a megamind AI in order to greet clients, and both large and small/medium models are getting there, with the large models hitting a computing/energy-demand barrier. The small models won't hit that barrier anytime soon.

replies(1): >>44317799 #
11. khalic ◴[] No.44317729[source]
I'm going to edit it; it was badly formulated. What I meant is that he underestimates their potential for growth.
replies(1): >>44317808 #
12. khalic ◴[] No.44317744[source]
I edited to make it clearer
13. khalic ◴[] No.44317749{4}[source]
Their potential, not how they work. It was very badly formulated; I just corrected it.
14. khalic ◴[] No.44317776[source]
Phi4-mini runs on a basic laptop CPU at 20T/s… how is that slow? Without optimization…
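
If you want to check that kind of number on your own machine, here's a minimal sketch using llama-cpp-python (the GGUF filename and thread count are placeholders; point it at whatever Phi-4-mini quant you've downloaded):

    # Minimal sketch: measure CPU generation speed for a small local model.
    # The model path below is a placeholder; substitute your own GGUF file.
    import time
    from llama_cpp import Llama

    llm = Llama(
        model_path="phi-4-mini-q4_k_m.gguf",  # placeholder filename
        n_ctx=4096,
        n_threads=8,  # match your physical cores
    )

    start = time.time()
    out = llm("Write a short greeting for a returning client.", max_tokens=256)
    elapsed = time.time() - start

    generated = out["usage"]["completion_tokens"]
    print(f"{generated} tokens in {elapsed:.1f}s -> {generated / elapsed:.1f} T/s")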
replies(1): >>44317806 #
15. vikramkr ◴[] No.44317799{3}[source]
Did he predict they'd always be necessary? He mostly seemed to predict the opposite, that we're at the early stage of a trajectory that has yet to have its Linux moment.
replies(1): >>44322878 #
16. dist-epoch ◴[] No.44317806{3}[source]
I was running Qwen3-32B locally even faster, at 70 T/s, and it was still way too slow for me. I'm generating thousands of tokens of output per request (not coding). Running locally I could get about 6 million tokens per day and pay for the electricity, or I can get more tokens per day from Google Gemini 2.5 Flash for free.
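
That 6 million figure is just the throughput multiplied out over a day; a quick back-of-the-envelope check (assuming the machine generates around the clock, which is the best case):

    # Daily output at a fixed local generation speed, assuming
    # continuous generation (best case).
    tokens_per_second = 70
    seconds_per_day = 24 * 60 * 60          # 86,400
    daily_tokens = tokens_per_second * seconds_per_day
    print(f"{daily_tokens:,} tokens/day")   # 6,048,000 -> roughly 6 million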

Running models locally is a privilege for the rich and those with too much disposable time.

replies(1): >>44423931 #
17. diggan ◴[] No.44317808{3}[source]
> underestimates their potential for growth

As far as I understood the talk and the analogies, he's saying that local models will eventually replace the current popular "mainframe" architecture. How is that underestimating them?

18. khalic ◴[] No.44322878{4}[source]
I understand, thanks for pointing that out
19. yencabulator ◴[] No.44423931{4}[source]
Try Qwen3-30B-A3B. It's MoE to the extent that its memory-bandwidth use looks more like a 3B model's, and thus it typically runs faster.
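
The rough intuition: decoding is memory-bound, so what matters is how many bytes of weights get streamed per token, i.e. the active parameters, not the total. A back-of-the-envelope sketch (the bandwidth and quantization figures here are illustrative assumptions, not measurements):

    # Rough decode-speed estimate for a memory-bound model:
    #   tokens/s ~= memory bandwidth / bytes of weights read per token
    # The bandwidth and 4-bit quantization below are illustrative assumptions.
    BYTES_PER_WEIGHT = 0.5      # ~4-bit quantization
    MEM_BANDWIDTH_GBS = 100     # e.g. a decent desktop's RAM bandwidth

    def est_tokens_per_sec(active_params_billions: float) -> float:
        bytes_per_token = active_params_billions * 1e9 * BYTES_PER_WEIGHT
        return MEM_BANDWIDTH_GBS * 1e9 / bytes_per_token

    print(f"dense 32B:      ~{est_tokens_per_sec(32):.1f} T/s")
    print(f"MoE, 3B active: ~{est_tokens_per_sec(3):.1f} T/s")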