(openai.com)

1094 points atgctg | 1 comments | 11 Dec 25 18:04 UTC

zone411 ◴[11 Dec 25 19:46 UTC] No.46236209[source]▶

I've benchmarked it on the Extended NYT Connections benchmark (https://github.com/lechmazur/nyt-connections/):

The high-reasoning version of GPT-5.2 improves on GPT-5.1: 69.9 → 77.9.

The medium-reasoning version also improves: 62.7 → 72.1.

The no-reasoning version also improves: 22.1 → 27.5.

Gemini 3 Pro and Grok 4.1 Fast Reasoning still score higher.

scrollop ◴[11 Dec 25 21:47 UTC] No.46237650[source]▶

Why no grok 4.1 reasoning?

replies(1): >>46239494 #

sanex ◴[12 Dec 25 00:43 UTC] No.46239494[source]▶

Do people other than Elon fans use grok? Honest question. I've never tried it.

1. scrollop ◴[12 Dec 25 12:57 UTC] No.46243696[source]▶

I hate the guy; however, Grok scores high on arc-2, so it would be silly not to at least rank it.