(mistral.ai)

216 points | veggieroll | 1 comment | 16 Oct 24 14:31 UTC

tarruda [16 Oct 24 19:58 UTC] No.41863180

They didn't add a comparison to Qwen 2.5 3B, which seems to surpass Ministral 3B on MMLU, HumanEval, and GSM8K: https://qwen2.org/qwen2-5/#qwen25-05b15b3b-performance

These benchmarks don't really matter that much, but it is funny how this blog post conveniently forgot to compare with a model that already exists and performs better.

replies(2): >>41863218 #>>41863231 #

1. DreamGen [16 Oct 24 20:03 UTC] No.41863231

>>41863180 #

Also, the 3B model, which is API-only (so the only things that matter are price, quality, and speed), should be compared to something like Gemini Flash 1.5 8B, which is cheaper than this 3B API and also has higher benchmark performance, super long context support, etc.

Un Ministral, Des Ministraux