The model weights for Ministral 8B Instruct are available for research use. Both models will be available from our cloud partners shortly."
I'm the head of R&D at Rev.ai and this is exactly what we've seen in ASR. We started at $1.20/hr, and our new models are $0.10/hr, in < 2 years. We have done human transcription for ~15 years, and the per-hour revenue from ASR is about 3 orders of magnitude lower ($90/hr vs $0.10/hr) and will likely go lower still. However, our volumes for serving ASR are now many orders of magnitude higher, so in most cases revenue is about even or still growing.
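For anyone who wants to check the arithmetic, here is a rough sketch using only the per-hour prices quoted above. The "break-even volume multiple" is my own simplification (same revenue, ignoring costs and margins), not a figure from Rev.ai.

```python
import math

# Per-hour prices quoted in the comment above (USD per audio hour).
human_price = 90.00
asr_launch_price = 1.20
asr_current_price = 0.10


def orders_of_magnitude(high: float, low: float) -> float:
    """Base-10 logarithm of the ratio between two prices."""
    return math.log10(high / low)


# Gap between human transcription and current ASR pricing.
human_vs_asr = orders_of_magnitude(human_price, asr_current_price)

# How far the ASR price itself has fallen since launch.
asr_drop = asr_launch_price / asr_current_price

# Assumption (not from the comment): how many times more audio hours ASR
# must serve to bring in the same revenue as human transcription.
break_even_volume_multiple = human_price / asr_current_price

print(f"human vs ASR: {human_vs_asr:.2f} orders of magnitude")
print(f"ASR price drop since launch: {asr_drop:.0f}x")
print(f"break-even volume multiple: {break_even_volume_multiple:.0f}x")
```

So "3 orders of magnitude" is a slight round-up of a 900x gap, and the ASR price itself fell about 12x.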
I think for Mistral to compete with Meta they need a better API. The on-prem/self-hosted people will always choose the best models for themselves and you won't be able to monetize them in a FOSS world anyways, so you just need the better platform. Right now, Meta isn't providing a top-tier platform, but that may eventually change.
The quintessential example is "cheval" (horse) which becomes "chevaux" (horses), which is the rule they're following (or being cute about). Un mistral, des mistraux. Un ministral, des ministraux.
(Ironically the plural of the Mistral wind in the Larousse dictionary would technically be Mistrals[1][2], however weird that sounds to my French ears and perhaps to the people who wrote that article!)
[1] https://www.larousse.fr/dictionnaires/francais/mistral_mistr... [2] https://fr.wiktionary.org/wiki/mistral
I'd say keeping up with the Reddit LocalLLaMA community is the "easiest" way, and it's by no means easy.
Hard to imagine anyone competing with AWS/GCP/Azure for slices of GPUs/TPUs. AFAIK, most major models are available a la carte via API on these providers (with a few exclusives). I can’t imagine how anyone can compete with the big clouds on serving an API, and I can’t imagine them staying “non compliant” for long.
I would love a curated list.
I’m not opposed to licensing but “email us for a license” is a bad sign for indie developers, in my experience.
8b weights are here https://huggingface.co/mistralai/Ministral-8B-Instruct-2410
Commercial entities aren’t permitted to use or distribute 8b weights - from the agreement (which states research purposes only):
"Research Purposes": means any use of a Mistral Model, Derivative, or Output that is solely for (a) personal, scientific or academic research, and (b) for non-profit and non-commercial purposes, and not directly or indirectly connected to any commercial activities or business operations. For illustration purposes, Research Purposes does not include (1) any usage of the Mistral Model, Derivative or Output by individuals or contractors employed in or engaged by companies in the context of (a) their daily tasks, or (b) any activity (including but not limited to any testing or proof-of-concept) that is intended to generate revenue, nor (2) any Distribution by a commercial entity of the Mistral Model, Derivative or Output whether in return for payment or free of charge, in any medium or form, including but not limited to through a hosted or managed service (e.g. SaaS, cloud instances, etc.), or behind a software layer.
the classical way to pluralize "–al" words:
un animal → des animaux [en: animal(s)]
un journal → des journaux [en: journal(s)]
with some exceptions: un carnaval → des carnavals [en: carnival(s)]
un festival → des festivals [en: festival(s)]
un idéal → des idéals (OR des idéaux) [en: ideal(s)]
un val → des vals (OR des vaux) [en: valley(s)]
There is no logic there (as with many things in French), so it's up to Mistral to choose what the plural should be.
EDIT: Format + better examples
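To make the "rule plus exception table" shape concrete, here is a toy sketch in Python. It only knows the words listed above; the function name and the two sets are made up for illustration, and a real implementation would need a full dictionary lookup.

```python
# Exceptions that simply take -s, taken from the examples above.
S_PLURAL_EXCEPTIONS = {"carnaval", "festival"}

# Words listed above as accepting both forms.
BOTH_FORMS = {"idéal", "val"}


def pluralize_al(noun: str) -> list[str]:
    """Return the accepted plural(s) of a French noun ending in -al.

    Hypothetical helper: anything not in the tables above falls back
    to the classical -al -> -aux rule.
    """
    if not noun.endswith("al"):
        raise ValueError(f"{noun!r} does not end in -al")
    regular = noun[:-2] + "aux"
    with_s = noun + "s"
    if noun in S_PLURAL_EXCEPTIONS:
        return [with_s]
    if noun in BOTH_FORMS:
        return [with_s, regular]
    return [regular]


for word in ["animal", "journal", "festival", "val", "mistral"]:
    print(word, "->", " / ".join(pluralize_al(word)))
```

Note that "mistral" is not in either table, so it falls through to the default and comes out as "mistraux". Whether it belongs in the exception table instead is exactly the open question.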
The only plural form people will probably know is from the song Mistral Gagnant where the lyrics include les mistrals gagnants but that refers to sweets!
Not sure why anyone would think "les mistraux"... ;)
Apparently val gave vale in English.
At least they're not claiming it's Open Source / Open Weights, kind of happy about that, as other companies didn't get the memo that lying/misleading about stuff like that is bad.
I don't know what the precise rules or patterns actually might be. But one fact that jumped out at me is that -mal and -nal start with nasal consonants and three of the "exceptions" end in -val.
By the way "Le dormeur du val" (The sleeper of the small valley) is one of Rimbaud's most famous poems, often learned at school.
Back to this precise one, there's no precise rule or pattern underneath, no rhyme or reason, it's just exceptions based on usage and even those can have their own exceptions. Like "idéals/idéaux", I (French) personally never even heard that "idéals" was a thing. Yet it is, somehow: https://www.larousse.fr/dictionnaires/francais/idéal/41391
There's nothing to be happy about when businesses try to wall off a feature to make you salivate over it more. You're within your rights to nitpick licensing differences, but unless everyone gets government-subsidized H100s in their garage I don't think the code will be of use to anyone except moneyed competitors that want to undermine foundational work.
The "Trésor de la langue française informatisé" (which hasn't been updated since 1994) says val is deprecated, but it's common in classic literary novels, together with un vallon, a near synonym.
Not a good sign at all as it means their investors are already getting nervous.
https://fr.m.wikipedia.org/wiki/Mistral_gagnant_(confiserie)
I don't think that can work without a significant lobbying push towards models running on the edge but who knows (especially since they have a former French Minister in the founding team).
These benchmarks don't really matter that much, but it is funny how this blog post conveniently forgot to compare with a model that already exists and performs better.
I don’t think it’s fair to claim the weights are available if you need to hammer out a custom agreement with mistral’s sales team first.
If they had a self-serve process, or some sort of shrink-wrapped deal up to say 500k users, that would be great. But bespoke contracts are rarely cheap or easy to get. This comes from my experience building a bunch of custom infra for Flux1-dev, only to find I wasn’t big enough for a custom agreement, because, duh, the service doesn’t exist yet. Mistral is not BFL, but sales teams don’t like speculating on usage numbers for a product that hasn’t been released yet. Which is a bummer considering most innovation happens at a small scale initially.
Imo use the model that makes the most sense when you ask it stuff, and personally I'd go for the one with the least censorship (which imo isn't Alibaba Qwen anything)
Of course, it is not always the opposite, otherwise it wouldn't be random. A penis (un penis) is masculine for instance.
To my knowledge there aren't that many languages that are managed as officially as French is.
I don't quite follow your argument - what exactly is Meta competing for? It doesn't sell access to hosted models and shows no interest in being involved in the cloud business. My guess is Meta is driven by enabling wider adoption of AI, and their bet is that more (AI-generated) content is good for its existing content-hosting-and-ad-selling business, and good for its aspirational Metaverse business too, should it pan out.
Last year Mistral watched as every provider hosted their models with little to no value capture.
Nemo is Apache 2.0 licensed; they could have easily made that a Mistral Research License model.
It's hard to pitch VCs for more money to build more models when releasing under Apache 2.0 means you capture nothing.
Not everyone can be Meta.
Magnet links are cute but honestly, most people would rather use HF to get their models.
The subreddit is… not great. It’s a decent way of keeping up, but don’t read the posts too much (and even then, there is a heavy social aspect, and the models that are discussed there are a very specific subset of what’s available). There is a lot of groupthink, the discussions are never rigorous. Most of the posts are along the lines of “I tested a benchmark and it is 0.5 points ahead of Llama-whatever on that one benchmark I made up, therefore it’s the dog’s and everything else is shite”. The Zuckerberg worshiping is also disconcerting. Returns diminish quickly as you spend more time on that subreddit.
The Académie tried to codify what was used at the time (which varied a lot) to try and create a standard, but that's why there are so many exceptions to the rules everywhere: they went with "tradition" when creating the system instead of logical rules or a purer phonetic approach (which some proposed).
There's a bunch of info on the wikipedia link about it, and how each wave or "réforme" tries to make it simpler (while still keeping the old version around as correct).
Each one is always hotly debated/rejected by parents too when they see their kids learning the newly simplified rules.
Recently, the spelling of onion in French went from "Oignon" (old spelling with a silent I) to "Ognon" (simplifying it out), and even that one made me have a "hmm" moment ;)
-- https://fr.wikipedia.org/wiki/Vall%C3%A9e
I agree, it's weird. I'm sure there are other similar examples.
This is getting off-topic, but anyway…
The Larousse definition is wrong, that’s for sure. The Tramontane comes from the West, between the Pyrenees and the Massif Central, it is not at all the same current as the Mistral.
I am not sure how prevalent “les Mistrals” is in the literature. I don’t doubt that some people wrote this, possibly for some poetic effect, but it sounds very wrong as well. Mistral is a proper noun, and it is not collective like “Alizés”. It means specifically the wind that blows along the Rhône valley, there cannot be more than one.
[edit] as others pointed out, there is the Mistral gagnant sweet, which can indeed be plural.
In any case, this is (officially) obsolete now.
https://en.m.wikipedia.org/wiki/Mistral-class_landing_helico...