(mistral.ai)

701 points mfiguiere | 1 comments | 21 May 25 14:21 UTC

oofbaroomf ◴[21 May 25 18:19 UTC] No.44054477[source]▶

The SWE-Bench scores are very high for an open-source model of this size. 46.8% is better than o3-mini (with Agentless-lite) and Claude 3.6 (with AutoCodeRover), though it is a little lower than Claude 3.6 with Anthropic's proprietary scaffold. And considering you can run this for almost free, this is an extraordinary model.

replies(3): >>44056216 #>>44056570 #>>44058287 #

falcor84 ◴[21 May 25 20:53 UTC] No.44056216[source]▶

>>44054477 #

Just to confirm, are you referring to Claude 3.7?

replies(1): >>44056250 #

oofbaroomf ◴[21 May 25 20:56 UTC] No.44056250[source]▶

>>44056216 #

No. I am referring to Claude 3.5 Sonnet New, released October 22, 2024, with model ID claude-3-5-sonnet-20241022, colloquially referred to as Claude 3.6 Sonnet because of Anthropic's confusing naming.

replies(4): >>44056271 #>>44056382 #>>44056760 #>>44061050 #

SkyPuncher ◴[21 May 25 20:59 UTC] No.44056271[source]▶

>>44056250 #

> colloquially referred to as Claude 3.6

Interesting. I've never heard this.

replies(2): >>44057177 #>>44060064 #

1. simonw ◴[21 May 25 23:09 UTC] No.44057177[source]▶

>>44056271 #

It's the reason Anthropic called their next release 3.7 Sonnet - the 3.6 version number was already being used by some in the community to refer to their 3.5v2.

Devstral