It's not at all obvious where to drop the context, though. Maybe it helps to have similar tasks in the context, maybe not. It did really, shockingly well on a historical HTR task I gave it, so I gave it another one, in some ways an easier one... Thought it wouldn't hurt to have text in a similar style in the context. But then it suddenly did very poorly.
Incidentally, one of the reasons I haven't gotten much into subscribing to these services, is that I always feel like they're triaging how many reasoning tokens to give me, or AB testing a different model... I never feel I can trust that I interact with the same model.