
Devstral

(mistral.ai)
701 points mfiguiere | 1 comment
CSMastermind No.44055203
I don't believe the benchmarks they're presenting.

I haven't tried it out yet, but every model I've tested from Mistral has landed towards the bottom of my benchmarks, in a similar place to Llama.

I'd be very surprised if the real-life performance is anything like what they're claiming.

replies(2): >>44056495 #>>44057452 #
1. idonotknowwhy No.44057452
I don't believe them either. We really have to test these ourselves imo.

Qwen3, for example, is a step backwards for me. And GLM4 is my current go-to despite everyone saying it's "only good at HTML".

The 70B cogito model is also really good for me but doesn't get any attention.

I think it depends on the projects and languages we're each using.

Still looking forward to trying this one though :)