1479 points sandslash | 3 comments
khalic ◴[] No.44317209[source]
His dismissal of smaller and local models suggests he underestimates their improvement potential. Give phi4 a run and see what I mean.
replies(5): >>44317248 #>>44317295 #>>44317350 #>>44317621 #>>44317716 #
1. mprovost ◴[] No.44317621[source]
You can disagree with his conclusions, but I don't think his understanding of small models is up for debate. This is the person who created micrograd/makemore/nanoGPT and who has produced a ton of educational materials showing how to build small and local models.
replies(1): >>44317729 #
2. khalic ◴[] No.44317729[source]
I’m going to edit my comment; it was badly formulated. What I meant is that he underestimates their potential for growth.
replies(1): >>44317808 #
3. diggan ◴[] No.44317808[source]
> underestimates their potential for growth

As far as I understood the talk and the analogies, he's saying that local models will eventually replace the current popular "mainframe" architecture. How is that underestimating them?