
766 points | huseyinkeles | 1 comment
mhitza · No.45571218
Should be "that you can train for $100"

Curious to try it someday on a set of specialized documents. Though as I understand it, the cost of running this is whatever GPU you can rent with 80GB of VRAM, which kind of leaves hobbyists and students out, unless some cloud is donating GPU compute capacity.

replies(2): >>45571268 >>45571369
Onavo · No.45571369
A GPU with 80GB of VRAM costs around $1-3 USD an hour on commodity clouds (i.e., the non-Big 3 bare-metal providers; e.g., https://getdeploying.com/reference/cloud-gpu/nvidia-h100). I think that's accessible to most middle-class users in first-world countries.
replies(1): >>45571954
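For scale, a quick back-of-the-envelope in Python, using only the figures mentioned in this thread (a ~$100 training budget and $1-3/hour rental rates; actual rates vary by provider and are illustrative, not quotes):

    # Rough GPU-hour budget from the numbers in this thread:
    # ~$100 total budget, $1-3 USD/hour for a rented 80GB-VRAM GPU.
    BUDGET_USD = 100.0

    for rate_per_hour in (1.0, 2.0, 3.0):
        gpu_hours = BUDGET_USD / rate_per_hour
        print(f"${rate_per_hour:.0f}/hr -> {gpu_hours:.1f} GPU-hours for ${BUDGET_USD:.0f}")

    # $1/hr -> 100.0 GPU-hours for $100
    # $2/hr -> 50.0 GPU-hours for $100
    # $3/hr -> 33.3 GPU-hours for $100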
antinomicus · No.45571954
Isn’t the whole point to run your model locally?
replies(4): >>45572029 >>45572031 >>45572477 >>45572856
theptip · No.45572029
No, that’s clearly not a goal of this project.

This is a learning tool. If you want a local model, you are almost certainly better off using something trained on far more compute (DeepSeek, Qwen, etc.).
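To make that concrete, a minimal local-inference sketch with Hugging Face transformers; the Qwen checkpoint named below is just one plausible small, heavily-pretrained model, an assumption for illustration rather than something specified in the thread:

    # Minimal local-inference sketch: run a small pretrained
    # instruction-tuned model locally instead of training your own.
    # The checkpoint name is an assumed example, not from the thread.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # small enough for a laptop
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    prompt = "Explain what a tokenizer does, in one sentence."
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))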