They are also releasing model weights for most of their models, whereas companies like Anthropic and, until recently, OpenAI were FUDing the world that open source would doom us all.
Mistral's smartest model is still behind Google's and Anthropic's, but they will catch up.
Inspired by the Greek word for human: anthropos / ἄνθρωπος, the same root as English words like anthropology, the study of humans.
(I'd hazard a guess that your first language is a Romance language such as French, where "anthro..." is pronounced as if there were no h? A particularly reasonable letter to forget when typing!)
- 1100 tokens/second Mistral Flash Answers https://www.youtube.com/watch?v=CC_F2umJH58
- 189.9 tokens/second Gemini 2.5 Flash Lite https://openrouter.ai/google/gemini-2.5-flash-lite
- 45.92 tokens/second GPT-5 Nano https://openrouter.ai/openai/gpt-5-nano
- 1799 tokens/second gpt-oss-120b (via Cerebras) https://openrouter.ai/openai/gpt-oss-120b
- 666.8 tokens/second Qwen3 235B A22B Thinking 2507 (via Cerebras) https://openrouter.ai/qwen/qwen3-235b-a22b-thinking-2507
Gemini 2.5 Flash Lite and GPT-5 Nano seem comparatively slow. That said, I cannot find non-marketing numbers for Mistral Flash Answers; real-world tokens/second are likely lower, so this comparison is not entirely fair.
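If you want your own non-marketing numbers, OpenRouter exposes an OpenAI-compatible endpoint, so you can time a streamed completion yourself. A minimal sketch, assuming the openai Python package and an OPENROUTER_API_KEY environment variable; the token count comes from the usage field the provider reports, so treat the result as approximate:

```python
import os
import time

from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible API at this base URL.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

def measure_tps(model: str, prompt: str) -> float:
    """Stream a completion and return observed output tokens/second."""
    start = time.perf_counter()
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
        # Ask the provider to include token usage in the final chunk.
        stream_options={"include_usage": True},
    )
    completion_tokens = 0
    for chunk in stream:
        if chunk.usage is not None:
            completion_tokens = chunk.usage.completion_tokens
    elapsed = time.perf_counter() - start
    return completion_tokens / elapsed

print(measure_tps("openai/gpt-oss-120b", "Summarize the history of the Roman Empire."))
```

Note that the elapsed time includes time-to-first-token, so this slightly understates pure generation speed; it's a rough sanity check, not a benchmark.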
Speed and cost are relevant factors. I have pipelines that need to execute tons of completions and produce summaries. Mistral Small is great at this, and the responses are lightning fast.
For that use case, going with US models would be far more expensive and slower while offering no benefit at all.
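As a concrete illustration of that kind of pipeline: a minimal sketch that fires off many summarization completions concurrently against Mistral's API, which is also OpenAI-compatible. The model alias mistral-small-latest, the system prompt, and the concurrency limit are assumptions; adjust for your own documents and rate limits.

```python
import asyncio
import os

from openai import AsyncOpenAI

# Mistral's API is OpenAI-compatible; mistral-small-latest is an assumed model alias.
client = AsyncOpenAI(
    base_url="https://api.mistral.ai/v1",
    api_key=os.environ["MISTRAL_API_KEY"],
)

async def summarize(text: str, sem: asyncio.Semaphore) -> str:
    async with sem:  # cap concurrent requests to stay under rate limits
        resp = await client.chat.completions.create(
            model="mistral-small-latest",
            messages=[
                {"role": "system", "content": "Summarize the user's text in 2-3 sentences."},
                {"role": "user", "content": text},
            ],
        )
        return resp.choices[0].message.content

async def main(documents: list[str]) -> list[str]:
    sem = asyncio.Semaphore(16)  # assumed concurrency limit
    return await asyncio.gather(*(summarize(d, sem) for d in documents))

if __name__ == "__main__":
    docs = ["First long document...", "Second long document..."]
    for summary in asyncio.run(main(docs)):
        print(summary)
```

At high completion volumes the per-token price difference compounds quickly, which is why a small, fast, cheap model wins here.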
Which makes it particularly hard to write compared to other Romance languages.