435 points crawshaw | 1 comments
_bin_ ◴[] No.43998743[source]
I've found sonnet-3.7 to be incredibly inconsistent. It can do very well but has a strong tendency to get off-track and run off and do weird things.

3.5 is better for this, ime. I hooked Claude Desktop up to an MCP server to fake Claude Code without the extortionate pricing, and it works decently. I've been trying to apply it to Rust work; it's not great yet (still doesn't really seem to "understand" Rust's concepts) but can do some stuff if you make it run `cargo check` after each change and stop it if it doesn't.

I expect something like o3-high is the best out there (aider leaderboards support this) either alone or in combination with 4.1, but tbh that's out of my price range. And frankly, I can't mentally get past paying a very high price for an LLM response that may or may not be useful; it leaves me incredibly resentful as a customer that your model can fail the task, requiring multiple "re-rolls", and you're passing that marginal cost to me.

replies(3): >>43998797 #>>43999022 #>>43999599 #
layoric ◴[] No.43999022[source]
I've been using Mistral Medium 3 for the last couple of days, and I'm honestly surprised at how good it is. Highly recommend giving it a try if you haven't, especially if you are trying to reduce costs. I've basically switched from Claude to Mistral and honestly prefer it even if costs were equal.
replies(1): >>43999216 #
nico ◴[] No.43999216[source]
How are you running the model? Mistral's API, some local version through Ollama, or something else?
replies(2): >>43999701 #>>44000490 #
kyleee ◴[] No.43999701[source]
Is Mistral on OpenRouter?
replies(1): >>44000020 #