376 points meetpateltech | 1 comment
ofirpress No.44008115
[I'm one of the co-creators of SWE-bench] The team managed to improve on the already very strong o3 results on SWE-bench, but it's interesting that we're just seeing an improvement of a few percentage points. I wonder if getting to 85% from 75% on Verified is going to take as long as it took to get from 20% to 75%.
replies(2): >>44008209 #>>44009418 #
1. mr_north_london No.44009418
How long did it take to go from 20% to 75%?