(timkellogg.me)

851 points tkellogg | 3 comments | 05 Feb 25 11:05 UTC

1. theturtletalks ◴[05 Feb 25 14:02 UTC] No.42948588[source]▶

DeepSeek R1 uses <think/> and "wait", and you can see it second-guessing itself in the thinking tokens. How does the model know when to wait?

These reasoning models feed into OP's last point about Nvidia and OpenAI data centers not being wasted, since reasoning models require more tokens and faster tps.

replies(2): >>42948620 #>>42952806 #

2. qwertox ◴[05 Feb 25 14:04 UTC] No.42948620[source]▶

>>42948588 (TP) #

Probably when it would expect a human to second-guess himself, as shown in literature and maybe other sources.

3. UncleEntity ◴[05 Feb 25 18:18 UTC] No.42952806[source]▶

>>42948588 (TP) #

From playing around, they seem to 'wait' when there's a contradiction in their logic.

And I think the second point is due to The Market thinking there is no need to spend ever-increasing amounts of compute to get to the next level of AI overlordship.

Of course, the Jevons paradox is also all over the news these days.

S1: A $6 R1 competitor?