(timkellogg.me)

851 points tkellogg | 2 comments | 05 Feb 25 11:05 UTC

1. GTP ◴[05 Feb 25 13:09 UTC] No.42947974[source]▶

Sorry for being lazy, but I just don't have the time right now to read the paper. Does the paper, or anything else, include a benchmark comparison of S1 vs. R1 (the full R1, not quantized or distilled)?

replies(1): >>42948101 #

2. pama ◴[05 Feb 25 13:21 UTC] No.42948101[source]▶

>>42947974 (TP) #

The S1 paper is not meant to compete with R1. It simply shows that with 1k well-curated examples for fine-tuning (26 minutes of training on 16 GPUs) and a simple hack for controlling the length of the thinking process, one can dramatically increase the performance of a non-reasoning model, and that the gains clearly grow with more test-time compute. It is worth a quick skim.

S1: A $6 R1 competitor?