(timkellogg.me)

851 points tkellogg | 3 comments | 05 Feb 25 11:05 UTC

1. cowsaymoo ◴[05 Feb 25 13:55 UTC] No.42948512[source]▶

>>42946854 (OP) #

The part about taking control of a reasoning model's output length using <think></think> tags is interesting.

> In s1, when the LLM tries to stop thinking with "</think>", they force it to keep going by replacing it with "Wait".

I found a few days ago that this lets you 'inject' your own CoT and jailbreak the model more easily. Maybe these are related?

https://pastebin.com/G8Zzn0Lw

https://news.ycombinator.com/item?id=42891042#42896498
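For readers who want to see the "replace </think> with Wait" idea concretely, here is a minimal sketch. It is not the s1 code: `force_more_thinking`, `toy_model`, and the `min_thinking_tokens` parameter are hypothetical names made up for illustration, and the "model" is a stand-in function that returns one token string at a time.

```python
from typing import Callable, List

END_THINK = "</think>"


def force_more_thinking(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    min_thinking_tokens: int,
    max_tokens: int = 256,
) -> List[str]:
    """Generate tokens, swapping an early "</think>" for "Wait"."""
    out = list(prompt)
    generated = 0
    while generated < max_tokens:
        tok = next_token(out)
        if tok == END_THINK:
            if generated < min_thinking_tokens:
                # Too early to stop: suppress the end tag and nudge the
                # model to keep reasoning.
                tok = "Wait"
            else:
                # Budget met: let the model close its thinking block.
                out.append(tok)
                break
        out.append(tok)
        generated += 1
    return out


def toy_model(context: List[str]) -> str:
    # Hypothetical stand-in for an LLM: it tries to stop thinking whenever
    # the context length is a multiple of 3.
    return END_THINK if len(context) % 3 == 0 else "step"


result = force_more_thinking(toy_model, ["<think>"], min_thinking_tokens=5)
print(result)
```

With this toy model, the first attempt to emit "</think>" lands before the 5-token minimum and becomes "Wait"; the second attempt is allowed through. The same hook is presumably where injected text of your own choosing would go.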

replies(2): >>42948587 #>>42965658 #

2. causal ◴[05 Feb 25 14:02 UTC] No.42948587[source]▶

>>42948512 (TP) #

This even points to a reason why OpenAI hides the "thinking" step: it would be too obvious that the context is being manipulated to induce more thinking.

3. zamalek ◴[06 Feb 25 19:30 UTC] No.42965658[source]▶

>>42948512 (TP) #

It's weird that you need to do that at all. Couldn't you just reject that token and use the next most probable one?
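A rough sketch of the alternative being asked about, assuming greedy decoding over a small made-up score table. `greedy_pick` and the numbers in `logits` are hypothetical, not taken from any real model or library.

```python
from typing import Dict, Set


def greedy_pick(logits: Dict[str, float], banned: Set[str]) -> str:
    """Return the highest-scoring token that is not banned."""
    allowed = {tok: score for tok, score in logits.items() if tok not in banned}
    if not allowed:
        raise ValueError("every candidate token is banned")
    return max(allowed, key=allowed.get)


# Made-up scores for the next token at the point where the model wants to stop.
logits = {"</think>": 3.2, "Wait": 1.1, "So": 2.7, "Hmm": 0.4}

unconstrained = greedy_pick(logits, banned=set())
rejected = greedy_pick(logits, banned={"</think>"})
print(unconstrained, rejected)
```

In this toy table, rejecting "</think>" yields "So" rather than "Wait". That is the practical difference between the two approaches: rejection continues with whatever the model ranks next, while substitution always inserts the same fixed word. Which of the two produces better reasoning is not something this sketch can show.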


S1: A $6 R1 competitor?