S1: A $6 R1 competitor?

(timkellogg.me)

851 points tkellogg | 1 comment | 05 Feb 25 11:05 UTC

1. vagab0nd [08 Feb 25 15:24 UTC] No.42983523

Cool trick. But is this better than reinforcement learning, where the LLM decides for itself the optimal thinking time for each prompt?