
882 points huseyinkeles | 2 comments
karimf ◴[] No.45569878[source]
I've always thought the best measure of a contribution to humanity is (number of people you help) x (how much you help them). I think what Karpathy is doing is one of the highest-leverage ways to achieve that.

Our current world is built on top of open source projects. This is possible because there are plenty of free resources for learning to code, so anyone from anywhere in the world can learn and make a great piece of software.

I just hope the same will happen with the AI/LLM wave.

replies(13): >>45571834 #>>45571836 #>>45571900 #>>45571959 #>>45571975 #>>45572208 #>>45572425 #>>45572536 #>>45572555 #>>45572584 #>>45572596 #>>45573593 #>>45575488 #
bkettle ◴[] No.45571975[source]
This free tradition is, I think, one of the things I love so much about software, but I don't see how it can continue with LLMs, given the extremely high training costs and the powerful hardware required for inference. It seems like writing software will necessarily mean paying rent to the LLM hosts to keep up. It's possible that we'll figure out a way to do local inference that is accessible to everyone, the way most other modern software tools are, but the high training costs make that seem unlikely to me.

I also worry that as we rely on LLMs more and more, we will stop producing the kind of tutorials and other content aimed at beginners that makes it so easy to pick up programming the manual way.

replies(3): >>45572063 #>>45572731 #>>45573771 #
levocardia ◴[] No.45572731[source]
There's a Stephen Boyd quote that's something like "if your optimization problem is too computationally expensive, just go on vacation to Greece for a few weeks, and by the time you get back, computers might be fast enough to solve it." With LLMs there's sort of an equivalent situation with cost: how mind-blowing would it have been to be able to train this kind of LLM at all even just 4 years ago? And today you can get a kindergartener-level chat model for about $100. Not hard to imagine the same model costing $10 of compute in a few years.
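
A back-of-envelope sketch of where a ~$100 figure can come from (the GPU count, hours, and hourly rate below are my assumptions for illustration, not published numbers):

    # Rough cost estimate for a nanochat-style training run.
    # All three inputs are assumptions, not anything Karpathy published.
    gpus = 8                   # e.g. one 8xH100 node
    hours = 4                  # a short speedrun-style training job
    usd_per_gpu_hour = 3.00    # assumed cloud rental rate per GPU
    total = gpus * hours * usd_per_gpu_hour
    print(f"~${total:.0f} of compute")   # ~$96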

There's also a reasonable way to "leapfrog" the training cost with a pre-trained model. If you were doing nanochat as a learning exercise and had no money, the idea would be to code it up, run one or two very slow gradient-descent iterations on your slow machine to make sure it works, then download a pre-trained version from someone who could spare the compute.
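
A minimal PyTorch sketch of that workflow (the toy model, batch shapes, and checkpoint path are made up for illustration; nanochat's real API differs):

    import torch
    import torch.nn as nn

    # Toy stand-in for a language model: embed 8 tokens, predict the next one.
    model = nn.Sequential(
        nn.Embedding(256, 64),   # vocab of 256, 64-dim embeddings
        nn.Flatten(1),           # (batch, 8, 64) -> (batch, 512)
        nn.Linear(64 * 8, 256),  # logits over the 256-token vocab
    )
    opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
    loss_fn = nn.CrossEntropyLoss()

    # Step 1: one or two slow iterations on random tokens, just to confirm
    # the loss is finite and gradients flow.
    for step in range(2):
        x = torch.randint(0, 256, (4, 8))  # fake batch: 4 sequences of 8 tokens
        y = torch.randint(0, 256, (4,))    # fake next-token targets
        loss = loss_fn(model(x), y)
        opt.zero_grad(); loss.backward(); opt.step()
        print(step, loss.item())

    # Step 2: skip the expensive part entirely and load weights someone
    # else trained (hypothetical path to a published checkpoint).
    model.load_state_dict(torch.load("pretrained.pt", map_location="cpu"))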

replies(1): >>45572897 #
dingnuts ◴[] No.45572897[source]
> today you can get a kindergartener level chat model for about $100. Not hard to imagine the same model costing $10 of compute in a few years.

No, it's extremely hard to imagine, since I used one of Karpathy's own models for a basic chatbot like six years ago. Yes, it spoke nonsense; so did my GPT-2 fine-tune four years ago, and so does this.

And so does ChatGPT.

Improvement is linear at best. I still think it's actually a log curve, and GPT-3 was the peak of the "fun" part of the curve. The only evidence I've seen otherwise is bullshit benchmarks, "agents" that increase performance 2x by increasing token usage 100x, and excited salesmen proclaiming the imminence of AGI.

replies(1): >>45573051 #
simonw ◴[] No.45573051[source]
Apparently 800 million weekly users are finding ChatGPT useful in its present state.
replies(1): >>45573673 #
infinitezest ◴[] No.45573673[source]
1. According to whom? OpenAI? 2. Its current state is "basically free, with no ads". I don't think this will remain true given that, as far as I know, the product is very much not making money.
replies(1): >>45573708 #
simonw ◴[] No.45573708[source]
Yes, that number is according to OpenAI; they announced the 800M figure at DevDay last week.

The most recent leaked annualized revenue rate was $12bn/year. They're spending a lot more than that, but convincing customers to hand over $12bn is still a very strong indicator of demand. https://www.theinformation.com/articles/openai-hits-12-billi...