OpenAI o3 and o4-mini
(openai.com)
555 points
maheshrijal
| 3 comments |
16 Apr 25 17:01 UTC
brap | 16 Apr 25 17:11 UTC | No. 43707838 | >>43707719 (OP)
Where's the comparison with Gemini 2.5 Pro?
replies(3): >>43707846 >>43707897 >>43708606
gallerdude | 16 Apr 25 17:16 UTC | No. 43707897 | >>43707838
For coding, I like the Aider polyglot benchmark, since it covers multiple programming languages.
Gemini 2.5 Pro got 72.9%
o3 high gets 81.3%, o4-mini high gets 68.9%
replies(4): >>43708090 >>43708632 >>43709557 >>43709763
1. vessenes | 16 Apr 25 18:16 UTC | No. 43708632 | >>43707897
Where do you find those o3 high numbers? https://aider.chat/docs/leaderboards/ currently has Gemini 2.5 Pro as the leader at, as you say, 72.9%.
replies(1): >>43708984
2. re-thc | 16 Apr 25 18:49 UTC | No. 43708984 | >>43708632 (TP)
It's in the OpenAI article (OP), i.e. OpenAI ran the Aider benchmark themselves.
replies(1): >>43730783
3. vessenes | 18 Apr 25 18:43 UTC | No. 43730783 | >>43708984
Update: the leaderboard now has o3 high + 4o at the top of the charts with 82.7%. This is a) amazing and b) 20x more expensive than Gemini.