S1: A $6 R1 competitor?

(timkellogg.me)
851 points by tkellogg | 1 comment
1. vagab0nd No.42983523
Cool trick. But is this better than reinforcement learning, where the LLM decides for itself the optimal thinking time for each prompt?