S1: A $6 R1 competitor?

(timkellogg.me)

851 points tkellogg | 2 comments | 05 Feb 25 11:05 UTC

mtrovo ◴[05 Feb 25 16:48 UTC] No.42951263[source]▶

I found the discussion around inference scaling with the 'Wait' hack so surreal. The fact that such an ingeniously simple method can impact performance makes me wonder how much low-hanging fruit we're still missing. So weird to think that improvements in a branch of computer science are boiling down to conjuring the right incantation words. How do you even change your mindset to start thinking this way?
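For anyone who hasn't seen the trick, here is a minimal sketch of how I understand the 'Wait' hack: when the model tries to stop reasoning, you drop its end-of-thinking marker, append "Wait," and let it continue. Everything named here is an assumption for illustration: `budget_force`, `toy_model`, and the `</think>` delimiter are hypothetical, and `toy_model` is a canned stand-in for a real LLM call, not the article's actual setup.

```python
from typing import Callable

# Hypothetical end-of-thinking delimiter; real models define their own token.
END_OF_THINKING = "</think>"


def budget_force(
    generate: Callable[[str], str],
    prompt: str,
    extra_rounds: int = 2,
    nudge: str = "Wait,",
) -> str:
    """Extend a reasoning trace `extra_rounds` times by appending `nudge`."""
    trace = generate(prompt)
    for _ in range(extra_rounds):
        # Remove the marker that says "I'm done thinking", if it is there.
        if trace.endswith(END_OF_THINKING):
            trace = trace[: -len(END_OF_THINKING)]
        # Append the nudge so the model continues instead of stopping.
        trace = trace.rstrip() + " " + nudge
        trace += " " + generate(prompt + trace)
    return trace


def toy_model(text: str) -> str:
    """Stand-in for an LLM: emits one canned step, then tries to stop."""
    step = text.count("Wait,") + 1
    return f"step {step} {END_OF_THINKING}"


if __name__ == "__main__":
    print(budget_force(toy_model, "Q: 2+2? ", extra_rounds=2))
```

With a real model you would swap `toy_model` for an actual completion call; the loop itself is the whole trick, which is what makes it feel like an incantation.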

replies(16): >>42951704 #>>42951764 #>>42951829 #>>42953577 #>>42954518 #>>42956436 #>>42956535 #>>42956674 #>>42957820 #>>42957909 #>>42958693 #>>42960400 #>>42960464 #>>42961717 #>>42964057 #>>43000399 #

ascorbic ◴[05 Feb 25 20:18 UTC] No.42954518[source]▶

>>42951263 #

I've noticed that R1 says "Wait," a lot in its reasoning. I wonder if there's something inherently special in that token.

replies(2): >>42954757 #>>42959520 #

lionkor ◴[05 Feb 25 20:38 UTC] No.42954757[source]▶

>>42954518 #

Semantically, "wait" is a bit of a stop-and-breathe point.

Consider the text:

I think I'll go swimming today. Wait, ___

What comes next? Well, not something that would usually follow without the word "wait". Probably something entirely orthogonal that affects the earlier sentence in some fundamental way, like:

Wait, I need to help my dad.

replies(1): >>42960020 #

1. ascorbic ◴[06 Feb 25 07:35 UTC] No.42960020[source]▶

>>42954757 #

Yes, R1 seems to mostly use it like that. It's either to signal a problem with its previous reasoning or to signal that it has thought of a better approach. In coding it's often something like "this API won't work here" or "there's a simpler way to do this".

replies(1): >>43000689 #

2. fennecfoxy ◴[10 Feb 25 14:29 UTC] No.43000689[source]▶

>>42960020 (TP) #

I guess it goes to show how important reiteration is for general logic problems. And tbf, when finding a solution to something myself, I'll consider each part on its own, then parts in relation to each other, then all parts together (on a higher level) before coming to a final solution.

It's weird because I feel like we should've known that from work in general logic/problem solving studies, surely?
