(mistral.ai)

701 points mfiguiere | 2 comments | 21 May 25 14:21 UTC

oofbaroomf ◴[21 May 25 18:19 UTC] No.44054477[source]▶

The SWE-Bench scores are very, very high for an open source model of this size. 46.8% is better than o3-mini (with Agentless-lite) and Claude 3.6 (with AutoCodeRover), though a little lower than Claude 3.6 with Anthropic's proprietary scaffold. And considering you can run this for almost free, this is an extraordinary model.

replies(3): >>44056216 #>>44056570 #>>44058287 #

AstroBen ◴[21 May 25 21:41 UTC] No.44056570[source]▶

>>44054477 #

Extraordinary... or suspicious, a sign that the benchmarks aren't doing their job.

replies(1): >>44057569 #

echelon ◴[22 May 25 00:25 UTC] No.44057569[source]▶

>>44056570 #

I wasn't considering Mistral for anything, but this show of goodwill to open source is amazing. I'll have to give this a try.

replies(1): >>44062135 #

1. qeternity ◴[22 May 25 13:59 UTC] No.44062135[source]▶

>>44057569 #

Mistral have a long history of open weight models...

replies(1): >>44070156 #

2. alhimik45 ◴[23 May 25 05:37 UTC] No.44070156[source]▶

>>44062135 (TP) #

But at the same time they don't open the weights of Codestral...

Devstral