The part about taking control of a reasoning model's output length using <think></think> tags is interesting.
> In s1, when the LLM tries to stop thinking with "</think>", they force it to keep going by replacing it with "Wait".
I found a few days ago that this lets you 'inject' your own CoT, which makes the model easier to jailbreak. Maybe the two are related?
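
For anyone who wants to poke at it, here is roughly how I picture the trick from the quote, as a toy Python sketch. Everything here is an assumption on my part rather than s1's actual code: `generate_until_stop` is a made-up stand-in for whatever decoding call you use (assumed to stop right before the model would emit `</think>`), and `fake_model` is just a stub so the loop can run without a real model.

```python
from typing import Callable

END_THINK = "</think>"


def force_more_thinking(
    generate_until_stop: Callable[[str], str],
    prompt: str,
    max_forced: int = 2,
    nudge: str = "Wait",
) -> str:
    """Suppress the first `max_forced` attempts to close the think block.

    `generate_until_stop` is hypothetical: it takes the text so far and
    returns the continuation, stopping just before END_THINK.
    """
    text = prompt + "<think>"
    for _ in range(max_forced):
        text += generate_until_stop(text)
        # The model wanted to stop here; append the nudge instead of END_THINK.
        text += nudge
    # Budget used up: let the final attempt to stop go through.
    text += generate_until_stop(text)
    return text + END_THINK


def fake_model(context: str) -> str:
    """Stub standing in for a real model; numbers each reasoning chunk."""
    return " step%d " % context.count("Wait")


out = force_more_thinking(fake_model, "Q:")
print(out)
```

The same loop shape is presumably why injection is possible at all: whoever controls the text inside the think block controls what the model continues from.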
replies(2):