GPT-5.2

(openai.com)

1019 points atgctg | 4 comments | 11 Dec 25 18:04 UTC

https://platform.openai.com/docs/guides/latest-model

System card: https://cdn.openai.com/pdf/3a4153c8-c748-4b71-8e31-aecbde944...

zone411 ◴[11 Dec 25 19:46 UTC] No.46236209[source]▶

>>46234788 (OP) #

I've benchmarked it on the Extended NYT Connections benchmark (https://github.com/lechmazur/nyt-connections/):

The high-reasoning version of GPT-5.2 improves on GPT-5.1: 69.9 → 77.9.

The medium-reasoning version also improves: 62.7 → 72.1.

The no-reasoning version also improves: 22.1 → 27.5.

Gemini 3 Pro and Grok 4.1 Fast Reasoning still score higher.

replies(4): >>46236325 #>>46236642 #>>46237650 #>>46241682 #

1. Donald ◴[11 Dec 25 19:57 UTC] No.46236325[source]▶

Gemini 3 Pro Preview gets 96.8% on the same benchmark? That's impressive.

replies(2): >>46236367 #>>46236593 #

2. capitainenemo ◴[11 Dec 25 20:01 UTC] No.46236367[source]▶

>>46236325 (TP) #

And it performs very well on the latest 100 puzzles too, so it isn't just learning the data set (unless, I guess, they routinely index this repo).

I wonder how well AIs would do at Bracket City. I tried Gemini on it and was underwhelmed. It made a lot of terrible connections and often bled data from one level into the next.

3. bigyabai ◴[11 Dec 25 20:19 UTC] No.46236593[source]▶

>>46236325 (TP) #

GPT-5.2 might be Google's best Gemini advertisement yet.

replies(1): >>46236759 #

4. outside1234 ◴[11 Dec 25 20:34 UTC] No.46236759[source]▶

Especially when you see the price