Pricing is really good for these benchmark scores. Let's see how it holds up once people start testing it.
If this is the sonoma-dusk that was in preview on OpenRouter, it's pretty cool. I've tested it on some code reverse engineering tasks, and it is at or above gpt5-mini level, while being faster. It works well on tasks up to about 110-130k tokens, then it gets a case of "get-there-itis" and finishes the task even if not all constraints are met (e.g. it will say "I've solved x/400 tests, the rest can be done later").