Every time I tried a Mistral model I was left rather underwhelmed and just went back to the usual options. Seems like their only USP at this point is Made in EU.
replies(4):
- 1100 tokens/second Mistral Flash Answers https://www.youtube.com/watch?v=CC_F2umJH58
- 189.9 tokens/second Gemini 2.5 Flash Lite https://openrouter.ai/google/gemini-2.5-flash-lite
- 45.92 tokens/second GPT-5 Nano https://openrouter.ai/openai/gpt-5-nano
- 1799 tokens/second gpt-oss-120b (via Cerebras) https://openrouter.ai/openai/gpt-oss-120b
- 666.8 tokens/second Qwen3 235B A22B Thinking 2507 (via Cerebras) https://openrouter.ai/qwen/qwen3-235b-a22b-thinking-2507
Gemini 2.5 Flash Lite and GPT-5 Nano seem to be comparatively slow. That being said, I cannot find non-marketing numbers for Mistral Flash Answers. Its real-world tokens/second are likely lower, so this comparison is not very fair.
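For anyone who wants the relative gaps rather than raw numbers, here is a rough sketch that ranks the figures quoted above and expresses each as a multiple of the slowest entry. The dictionary keys and the `rank_by_speed` helper are my own naming, not anything from OpenRouter or Mistral, and the Mistral value is the marketing figure, so treat its ratio as optimistic.

```python
# Throughput figures exactly as quoted in the list above (tokens/second).
# Assumption: the Mistral number is a marketing figure and likely optimistic.
quoted_tps = {
    "Mistral Flash Answers": 1100.0,
    "Gemini 2.5 Flash Lite": 189.9,
    "GPT-5 Nano": 45.92,
    "gpt-oss-120b (Cerebras)": 1799.0,
    "Qwen3 235B A22B Thinking 2507 (Cerebras)": 666.8,
}


def rank_by_speed(tps):
    """Return (name, tps, speedup over the slowest entry), fastest first."""
    slowest = min(tps.values())
    return sorted(
        ((name, value, value / slowest) for name, value in tps.items()),
        key=lambda row: row[1],
        reverse=True,
    )


ranking = rank_by_speed(quoted_tps)
for name, value, speedup in ranking:
    print(f"{name:<42} {value:>8.1f} tok/s  {speedup:>5.1f}x")
```

This only rearranges the quoted numbers; it says nothing about latency, output quality, or what you would measure on your own workload.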