216 points veggieroll | 2 comments
ed ◴[] No.41860918[source]
3b is API-only so you won’t be able to run it on-device, which is the killer app for these smaller edge models.

I’m not opposed to licensing but “email us for a license” is a bad sign for indie developers, in my experience.

8b weights are here https://huggingface.co/mistralai/Ministral-8B-Instruct-2410

Commercial entities aren’t permitted to use or distribute 8b weights - from the agreement (which states research purposes only):

"Research Purposes": means any use of a Mistral Model, Derivative, or Output that is solely for (a) personal, scientific or academic research, and (b) for non-profit and non-commercial purposes, and not directly or indirectly connected to any commercial activities or business operations. For illustration purposes, Research Purposes does not include (1) any usage of the Mistral Model, Derivative or Output by individuals or contractors employed in or engaged by companies in the context of (a) their daily tasks, or (b) any activity (including but not limited to any testing or proof-of-concept) that is intended to generate revenue, nor (2) any Distribution by a commercial entity of the Mistral Model, Derivative or Output whether in return for payment or free of charge, in any medium or form, including but not limited to through a hosted or managed service (e.g. SaaS, cloud instances, etc.), or behind a software layer.

replies(8): >>41861229 #>>41861251 #>>41862331 #>>41862714 #>>41862802 #>>41863345 #>>41865597 #>>41866472 #
tarruda ◴[] No.41862331[source]
Isn't 3b the kind of size you'd expect to be able to run on the edge? What is the point of using 3b via API when you can use larger and more capable models?
replies(1): >>41862987 #
littlestymaar ◴[] No.41862987[source]
GP misunderstood: 3b will be available for running on edge devices, but you must sign a deal with Mistral to get access to the weights.

I don't think that can work without a significant lobbying push towards models running on the edge but who knows (especially since they have a former French Minister in the founding team).

replies(1): >>41863187 #
1. ed ◴[] No.41863187[source]
> GP misunderstood

I don’t think it’s fair to claim the weights are available if you need to hammer out a custom agreement with Mistral’s sales team first.

If they had a self-serve process, or some sort of shrink-wrapped deal up to say 500k users, that would be great. But bespoke contracts are rarely cheap or easy to get. This comes from my experience building a bunch of custom infra for Flux1-dev, only to find I wasn’t big enough for a custom agreement, because, duh, the service doesn’t exist yet. Mistral is not BFL, but sales teams don’t like speculating on usage numbers for a product that hasn’t been released yet. Which is a bummer considering most innovation happens at a small scale initially.

replies(1): >>41866847 #
2. littlestymaar ◴[] No.41866847[source]
I'm not defending Mistral here; I don't think it's a good idea. I just wanted to point out that there is no paradox, as there would be if the 3b model were API-only.