←back to thread

GPT-5.2

(openai.com)
1053 points atgctg | 1 comments | | HN request time: 0.239s | source
Show context
josalhor ◴[] No.46235005[source]
From GPT 5.1 Thinking:

ARC AGI v2: 17.6% -> 52.9%

SWE Verified: 76.3% -> 80%

That's pretty good!

replies(7): >>46235062 #>>46235070 #>>46235153 #>>46235160 #>>46235180 #>>46235421 #>>46236242 #
minimaxir ◴[] No.46235160[source]
Note that GPT 5.2 newly supports a "xhigh" reasoning level, which could explain the better benchmarks.

It'll be noteworthy to see the cost-per-task on ARC AGI v2.

replies(2): >>46235539 #>>46236189 #
1. walletdrainer ◴[] No.46236189[source]
5.1-codex supports that too, no? Pretty sure I’ve been using xhigh for at least a week now