> "Note that this s1 dataset is distillation. Every example is a thought trace generated by another model, Qwen2.5"
The traces were generated by Gemini Flash Thinking, not Qwen2.5.
8 hours of H100 time is probably more like $24 than $6 if you want any kind of reliability.
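To make the arithmetic explicit, here is a quick sketch of the hourly rates those two totals imply. It uses only the figures in this comment (8 hours, $6, $24); the variable and function names are made up for illustration, and none of this is a quoted price from any provider.

```python
# Figures taken from the comment above; these are not measured or quoted prices.
H100_HOURS = 8
CLAIMED_TOTAL_USD = 6.0    # the headline figure being disputed
RELIABLE_TOTAL_USD = 24.0  # the commenter's estimate for reliable capacity


def implied_hourly_rate(total_usd: float, gpu_hours: float) -> float:
    """Return the per-GPU-hour price implied by a total cost."""
    return total_usd / gpu_hours


claimed_rate = implied_hourly_rate(CLAIMED_TOTAL_USD, H100_HOURS)
reliable_rate = implied_hourly_rate(RELIABLE_TOTAL_USD, H100_HOURS)

print(f"$6 total implies  ${claimed_rate:.2f} per H100-hour")
print(f"$24 total implies ${reliable_rate:.2f} per H100-hour")
print(f"ratio: {reliable_rate / claimed_rate:.0f}x")
```

So the $6 figure only works at roughly $0.75 per H100-hour, while $24 corresponds to about $3 per hour.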