Chinese software always has such a design language:
- prepaid and then use credit to subscribe
- strange serif font
- that slider thing for captcha
But I'm going to try it out now.
but in my testing other models do not work well; it looks like the prompts are either heavily optimized for Claude, or other models just aren't great yet in this kind of agentic environment
I was especially disappointed with Grok Code. It is very fast, as advertised, but it kept generating spaces and newlines in function calls until it hit max tokens. I wonder if that isn't why it racks up so many tokens on OpenRouter.
GPT-5 just wasn't using the tools very well
I haven't tested GLM yet, but given the value of the current Anthropic subscription, an alternative would need to be very cheap if you consider daily use
edit: I noticed that they also have a very inexpensive subscription (https://z.ai/subscribe); if they trained the model to work well with CC, this might actually be a viable alternative
I think there are multiple things going on. First, models are either trained with tool calling in mind or not; the ones that aren't won't work well as agents. Second, each company's models are trained with its agent software in mind, and the agent software is built with those specific models in mind. Third, each model responds differently to different system/user prompts, and the difference can be really stark.
I'm currently working on a tool that lets me execute the same prompts in the same environment across multiple agents. Currently I'm running Codex, Claude Code, Gemini, Qwen Code and AMP for every single change, just to see the differences in responses, and even reusing the same system prompt across all of them gives wildly different results. Not to mention how quickly the quality drops off a cliff as soon as you switch in any non-standard model for any of those CLIs. Mix and match models between those five tools, and it becomes clear as day that the model<>software pairing is more interlocked than it seems.
The only project where I've had success switching out the model has been running GPT-OSS-120b locally with Codex, and even that required me to manually hack in support for changing the temperature, and to tweak the prompts Codex uses a bit, to get OK results.
Claude Code Router is a good first step. But you also need to MITM CC while it's running and collect the back-and-forth for a while. I would do it if I had more free time; surprised someone smart hasn't already tried.
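For what it's worth, that MITM step is mostly boilerplate. Here's a minimal sketch as a mitmproxy addon, assuming you run mitmdump and point CC's traffic at it via an HTTPS proxy; the addon name, output filename, and port are illustrative, not anything CC or Anthropic prescribes:

```python
# cc_logger.py -- hypothetical mitmproxy addon that dumps every
# request/response pair (e.g. Claude Code <-> Anthropic API traffic)
# to a JSONL file for later analysis.
#
# Run with:  mitmdump -s cc_logger.py
# then route the client through the proxy, e.g.:
#   HTTPS_PROXY=http://localhost:8080 claude
import json
import time


class CCLogger:
    def __init__(self, path="claude-traffic.jsonl"):
        self.path = path

    def response(self, flow):
        # mitmproxy calls this hook once per completed request/response pair.
        record = {
            "ts": time.time(),
            "url": flow.request.pretty_url,
            "request": flow.request.get_text(),
            "response": flow.response.get_text(),
        }
        # Append one JSON object per line so the log is easy to grep/stream.
        with open(self.path, "a") as f:
            f.write(json.dumps(record) + "\n")


addons = [CCLogger()]
```

You'd still need mitmproxy's CA certificate trusted by the client for TLS interception, and the streaming (SSE) responses come back as one concatenated body, so some post-processing is needed to split out individual events.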