
S1: A $6 R1 competitor?

(timkellogg.me)
851 points tkellogg | 2 comments
1. GTP No.42947974
Sorry for being lazy, but I just don't have the time right now to read the paper. Does the paper, or any other source, include a benchmark comparison of S1 vs R1 (the full R1, not quantized or distilled)?
replies(1): >>42948101 #
2. pama No.42948101
The S1 paper is not meant to compete with R1. It simply shows that with 1k well-curated examples for finetuning (26 minutes of training on 16 GPUs) and a simple hack for controlling the length of the thinking process, one can dramatically increase the performance of a non-reasoning model, and that the benefit clearly grows with more test-time compute. It is worth a quick skim.
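
For anyone who wants the gist of the length-control hack without opening the paper, here is a rough sketch of the idea as I understand it, not the paper's code. Everything named here is an assumption for illustration: the `</think>` delimiter, the `Wait` nudge string, the `step` callable standing in for a model's next-token call, and the toy model itself. The idea is to suppress an early end-of-thinking token and nudge the model to continue, and to force the end once a maximum budget is reached.

```python
from typing import Callable, List

END_THINK = "</think>"  # assumed end-of-thinking delimiter (hypothetical)
NUDGE = "Wait"          # assumed continuation string (hypothetical)


def budget_forced_thinking(step: Callable[[List[str]], str],
                           min_tokens: int,
                           max_tokens: int) -> List[str]:
    """Produce a thinking trace of at least min_tokens and at most max_tokens
    tokens, followed by the end-of-thinking delimiter."""
    trace: List[str] = []
    while len(trace) < max_tokens:
        token = step(trace)
        if token == END_THINK:
            if len(trace) >= min_tokens:
                break  # the model stopped on its own, within budget
            # Stopped too early: drop the delimiter and nudge the model on.
            trace.append(NUDGE)
            continue
        trace.append(token)
    # Close the thinking section (model's own stop, or forced at max_tokens).
    trace.append(END_THINK)
    return trace


def toy_step(trace: List[str]) -> str:
    """Hypothetical stand-in for a model: tries to stop after every 3 tokens."""
    return END_THINK if len(trace) % 4 == 3 else "t"


if __name__ == "__main__":
    print(budget_forced_thinking(toy_step, min_tokens=6, max_tokens=10))
    print(budget_forced_thinking(toy_step, min_tokens=0, max_tokens=2))
```

Raising `min_tokens` is the knob that spends more test-time compute; `max_tokens` caps it. With a real model, `step` would be a single decoding step conditioned on the prompt plus the trace so far.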