
216 points by veggieroll | 3 comments
lairv:
Hard to see how Mistral can compete with Meta: they have an order of magnitude less compute, and their models are only slightly better (at least on the benchmarks) with less permissive licenses.
simonw:
Yeah, the license thing is definitely a problem. It's hard to get excited about an academic research license for a 3B or 8B model when the Llama 3.1 and 3.2 models are SO good, and are licensed for commercial usage.
1. harisec:
Qwen 2.5 models are better than Llama and Mistral.
2. speedgoose:
I disagree. I tried the small ones, but they output Chinese too frequently when the prompt is in English.
3. harisec:
I never had this problem, but I guess it depends on the prompt.
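In my experience a system prompt that pins the output language helps. A minimal sketch with Hugging Face transformers, assuming the Qwen/Qwen2.5-3B-Instruct checkpoint (the model ID and prompt wording are my own picks for illustration, not something tested in this thread):

    # Sketch: pin the output language with a system prompt.
    # Model ID and prompt text are illustrative assumptions.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Qwen/Qwen2.5-3B-Instruct"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    messages = [
        {"role": "system", "content": "You are a helpful assistant. Always respond in English."},
        {"role": "user", "content": "Explain the difference between a research license and a commercial license."},
    ]
    # Format the conversation with the model's chat template, then generate.
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens.
    print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))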