210 points by vincirufus | 2 comments
arjie ◴[] No.45145744[source]
Okay, I'm going to try it, but why didn't you link the information on how to integrate it with Claude Code: https://docs.z.ai/scenario-example/develop-tools/claude
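For anyone else trying it: that page boils down to pointing Claude Code at their Anthropic-compatible endpoint via environment variables, something like this (values copied from the docs at the time of writing, so double-check them there):

    export ANTHROPIC_BASE_URL=https://api.z.ai/api/anthropic
    export ANTHROPIC_AUTH_TOKEN=your-zai-api-key
    claude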

Chinese software always has such a design language:

- prepay first, then use the credit to subscribe

- strange serif font

- that slider thing for captcha

But I'm going to try it out now.

replies(6): >>45145985 #>>45146764 #>>45146977 #>>45147778 #>>45148478 #>>45155380 #
Szpadel ◴[] No.45146977[source]
you can use any model with Claude Code thanks to https://github.com/musistudio/claude-code-router

but in my testing other models do not work well; it looks like the prompts are either heavily optimized for Claude, or other models just aren't great yet in such an agentic environment
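for reference, the router sits as a local proxy and you describe providers in ~/.claude-code-router/config.json, roughly like this (from memory, so check the README for the exact field names):

    {
      "Providers": [{
        "name": "openrouter",
        "api_base_url": "https://openrouter.ai/api/v1/chat/completions",
        "api_key": "sk-...",
        "models": ["z-ai/glm-4.5"]
      }],
      "Router": { "default": "openrouter,z-ai/glm-4.5" }
    }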

I was especially disappointed with Grok Code. It is very fast, as advertised, but it gets stuck generating spaces and newlines in function calls until it hits max tokens. I wonder if that isn't why it racks up so many tokens on OpenRouter.

gpt-5 just wasn't using the tools very well

I haven't tested GLM yet, but given the value of the current Anthropic subscription, an alternative would need to be very cheap if you consider daily use

edit: I noticed they also have a very inexpensive subscription (https://z.ai/subscribe); if they trained the model to work well with CC, this might actually be a viable alternative

replies(3): >>45147046 #>>45147052 #>>45148517 #
diggan ◴[] No.45148517[source]
> but in my testing other models do not work well; it looks like the prompts are either heavily optimized for Claude, or other models just aren't great yet in such an agentic environment

I think there are multiple things going on. First, models are either trained with tool calling in mind or not, and the ones that aren't won't work well as agents. Second, each company's models are trained with their agent software in mind, and the agent software is built with their specific models in mind. Third, each model responds differently to different system/user prompts, and the difference can be really stark.
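To make the first point concrete: "tool calling" means the model has to emit structured tool-use blocks instead of prose. A minimal sketch with the Anthropic Python SDK (the tool name and schema here are invented for illustration):

    import anthropic

    client = anthropic.Anthropic()
    resp = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=[{
            "name": "read_file",  # hypothetical agent tool
            "description": "Read a file from the workspace",
            "input_schema": {
                "type": "object",
                "properties": {"path": {"type": "string"}},
                "required": ["path"],
            },
        }],
        messages=[{"role": "user", "content": "What does main.py do?"}],
    )
    # A model trained for tool use answers with a tool_use block like
    # {"type": "tool_use", "name": "read_file", "input": {"path": "main.py"}}.
    # One that isn't tends to answer in prose or emit malformed JSON,
    # and the agent loop falls apart.
    print(resp.content)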

I'm currently working on a tool that lets me execute the same prompts in the same environment across multiple agents, as sketched below. Currently I'm running Codex, Claude Code, Gemini, Qwen Code and AMP for every single change, just to see the differences in responses, and even reusing the same system prompt across all of them gives wildly different results. Not to mention how quickly the quality drops off a cliff as soon as you swap any non-standard model into any of those CLIs. Mix and match models between those five tools and it becomes clear as day that the model<>software pairing is more interlocked than it seems.
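The harness itself is nothing fancy, essentially this (the commands and flags are illustrative; each CLI spells "run one prompt non-interactively" differently, so check their docs):

    import subprocess

    # illustrative invocations, not gospel
    AGENTS = {
        "claude-code": ["claude", "-p"],
        "codex": ["codex", "exec"],
        "gemini": ["gemini", "-p"],
        "qwen-code": ["qwen", "-p"],
        "amp": ["amp", "-x"],
    }

    def run_all(prompt: str, workdir: str) -> dict[str, str]:
        # in practice each agent gets its own copy of the repo,
        # since they all mutate the working tree
        results = {}
        for name, cmd in AGENTS.items():
            proc = subprocess.run(
                cmd + [prompt],
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=600,
            )
            results[name] = proc.stdout
        return results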

The only setup where I've had success switching out the model has been running GPT-OSS-120b locally with Codex, but even that required me to manually hack in support for changing the temperature, and to tweak the prompts Codex uses a bit, to get OK results.

replies(1): >>45151146 #
oceanplexian ◴[] No.45151146[source]
It's probably not that hard to take the OSS models and fine-tune them for CC, which means that with a little bit of reverse engineering and some free time you could get an open-source model working perfectly with it.

Claude Code Router is a good first step, but you also need to MITM CC while it's running and collect the back-and-forth for a while. I would do it if I had more free time; I'm surprised someone smart hasn't already tried.
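The proxy half is the easy part; a mitmproxy addon along these lines would capture the traffic (untested sketch, and note the responses are SSE streams you'd still have to reassemble):

    # save as log_cc.py, run: mitmproxy -s log_cc.py
    # then point CC at the proxy, e.g. via HTTPS_PROXY
    import json
    from mitmproxy import http

    class LogClaudeTraffic:
        def response(self, flow: http.HTTPFlow) -> None:
            if "anthropic.com" not in flow.request.pretty_host:
                return
            with open("cc_traffic.jsonl", "a") as f:
                f.write(json.dumps({
                    "url": flow.request.pretty_url,
                    "request": flow.request.get_text(),
                    "response": flow.response.get_text(),
                }) + "\n")

    addons = [LogClaudeTraffic()]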

replies(1): >>45151965 #
Szpadel ◴[] No.45151965[source]
Of course it has already been tried; see for example https://github.com/Yuyz0112/claude-code-reverse