
Devstral

(mistral.ai)
701 points | mfiguiere | 1 comment
oofbaroomf No.44054477
The SWE-Bench scores are very high for an open-source model of this size. 46.8% is better than o3-mini (with Agentless-lite) and Claude 3.6 (with AutoCodeRover), though it is a little lower than Claude 3.6 with Anthropic's proprietary scaffold. And considering you can run this almost for free, this is an extraordinary model.
falcor84 No.44056216
Just to confirm, are you referring to Claude 3.7?
oofbaroomf No.44056250
No. I am referring to Claude 3.5 Sonnet New, released October 22, 2024, with model ID claude-3-5-sonnet-20241022, colloquially referred to as Claude 3.6 Sonnet because of Anthropic's confusing naming.
1. moffkalast No.44061050
The model formerly known as Claude 3.6 Sonnet?