S1: A $6 R1 competitor?

(timkellogg.me)
851 points | tkellogg | 1 comment
robrenaud No.42953186
> "Note that this s1 dataset is distillation. Every example is a thought trace generated by another model, Qwen2.5"

The traces are generated by Gemini Flash Thinking.

8 hours of H100 compute is probably more like $24 if you want any kind of reliability, rather than $6.
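
A rough back-of-the-envelope of that gap, assuming ballpark H100 hourly rates (the rates and the spot/on-demand split are my assumptions, not figures from the paper):

    # Back-of-the-envelope GPU cost: hours * hourly rate.
    # Rates are assumed; interruptible (spot) pricing is often several times
    # cheaper than reliable on-demand pricing, which is roughly the $6 vs $24 gap.
    gpu_hours = 8              # H100-hours claimed for the s1 training run
    spot_rate = 0.75           # assumed $/H100-hour, interruptible
    on_demand_rate = 3.00      # assumed $/H100-hour, reliable/on-demand

    print(f"spot:      ${gpu_hours * spot_rate:.2f}")       # ~$6
    print(f"on-demand: ${gpu_hours * on_demand_rate:.2f}")  # ~$24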

replies(1): >>42953657 #
zaptrem No.42953657
"You can train a SOTA LLM for $0.50" (as long as you're distilling a model that cost $500m into another pretrained model that cost $5m)
replies(2): >>42955053 #>>42955523 #
knutzui No.42955523
The original statement still stands if what you're adding to it is true. If the initial one-time investment of $505m is enough to distill new SOTA models for $0.50 apiece, then the average cost per model trends toward $0.50 as more models are distilled.
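
A quick sketch of that amortization argument, plugging in the dollar figures from the thread (the model counts are arbitrary, just to show the trend):

    # Average cost per distilled model: the one-time upfront cost amortized over
    # n models, plus a $0.50 marginal distillation cost for each one.
    upfront = 505_000_000    # $500m teacher + $5m pretrained student (one-time)
    marginal = 0.50          # cost to distill one additional model

    for n in (1, 1_000, 1_000_000, 1_000_000_000):
        avg = upfront / n + marginal
        print(f"{n:>13,} models -> average ${avg:,.2f} each")

    # As n grows, upfront / n -> 0, so the average trends toward the $0.50
    # marginal cost -- but only at a very large number of distilled models.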