(www.youtube.com)

1480 points sandslash | 1 comments | 19 Jun 25 00:33 UTC

khalic ◴[19 Jun 25 10:28 UTC] No.44317209[source]▶

His dismissal of smaller and local models suggests he underestimates their improvement potential. Give phi4 a run and see what I mean.

replies(5): >>44317248 #>>44317295 #>>44317350 #>>44317621 #>>44317716 #

mprovost ◴[19 Jun 25 11:30 UTC] No.44317621[source]▶

>>44317209 #

You can disagree with his conclusions but I don't think his understanding of small models is up for debate. This is the person who created micrograd/makemore/nanoGPT and who has produced a ton of educational materials showing how to build small and local models.

replies(1): >>44317729 #

khalic ◴[19 Jun 25 11:46 UTC] No.44317729[source]▶

>>44317621 #

I'm going to edit; it was badly formulated. What I meant is that he underestimates their potential for growth.

replies(1): >>44317808 #

1. diggan ◴[19 Jun 25 11:57 UTC] No.44317808[source]▶

>>44317729 #

> underestimates their potential for growth

As far as I understood the talk and the analogies, he's saying that local models will eventually replace the current popular "mainframe" architecture. How is that underestimating them?

Andrej Karpathy: Software in the era of AI [video]