Has anyone else done this and felt the same? Every now and then I try to reevaluate all the models. So far it still feels like Claude is in the lead, just because it will predictably do what I want when given a mid-sized problem. Meanwhile o3 will sometimes one-shot a masterpiece and sometimes go down the completely wrong path.
This might also just be an artifact of the change in problem size - perhaps the larger problems that necessitate o3 are also too open-ended and would require much more planning up front. But at that point it's actually more natural to just iterate with Sonnet and stay in the driver's seat a bit. Plus Sonnet runs 5x faster.