
GPT-5.2

(openai.com)
1019 points atgctg | 13 comments
1. agentifysh ◴[] No.46238067[source]
Looks like they've begun censoring posts at r/Codex and not allowing complaint threads, so here is my honest take:

- It is faster, which is appreciated, but not as fast as Opus 4.5

- I see no changes; very few noticeable improvements over 5.1

- I do not see any value in exchange for +40% in token costs

All in all, I can't help but feel that OpenAI is facing an existential crisis. Gemini 3, even when it's used from AI Studio, offers close to ChatGPT Pro performance for free. Anthropic's Claude Code at $100/month is tough to beat. I am using Codex with the $40 credits, but there's been a silent increase in token costs and usage limitations.

replies(3): >>46238965 #>>46240393 #>>46241715 #
2. AstroBen ◴[] No.46238965[source]
Did you notice much improvement going from Gemini 2.5 to 3? I didn't

I just think they're all struggling to provide real world improvements

replies(8): >>46239052 #>>46239296 #>>46239714 #>>46240131 #>>46240302 #>>46240549 #>>46240983 #>>46241460 #
3. XCSme ◴[] No.46239052[source]
Maybe they are just more consistent, which is a bit hard to notice immediately.
4. dcre ◴[] No.46239296[source]
Nearly everyone else (and every measure) seems to have found 3 a big improvement over 2.5.
5. enraged_camel ◴[] No.46239714[source]
Gemini 3 was a massive improvement over 2.5, yes.
6. cmrdporcupine ◴[] No.46240131[source]
I think what they're actually struggling with is costs. I suspect they're all quantizing models behind the scenes to manage load here and there, and they're all giving inconsistent results.

I noticed a huge improvement from Sonnet 4.5 to Opus 4.5 when it became unthrottled a couple of weeks ago. I wasn't going to sign back up with Anthropic, but I did. Two weeks in, though, it's already starting to seem inconsistent. And when I go back to Sonnet, it feels like they did something to lobotomize it.

Meanwhile I can fire up DeepSeek 3.2 or GLM 4.6 for a fraction of the cost and get results that are almost as good.

7. agentifysh ◴[] No.46240302[source]
Oh yes, I'm noticing significant improvements across the board, but mainly the 1,000,000-token context makes a ton of difference: I can keep digging at a problem without compaction.
8. ◴[] No.46240393[source]
9. free652 ◴[] No.46240549[source]
Yes, 2.5 just couldn't use tools right. 3.0 is way better at coding, better than Sonnet 4.5.
10. dudeinhawaii ◴[] No.46240983[source]
I noticed a marked improvement, to the point where I made it my go-to model for questions. Coding-wise, not so much. As an intelligent model for writing up designs, investigations, and general exploration/research tasks, it's top notch.
11. chillfox ◴[] No.46241460[source]
Gemini 3 Pro is the first model from Google that I have found usable, and it's very good. It has replaced Claude for me in some cases, but Claude is still my goto for use in coding agents.

(I only access these models via API)

12. hmottestad ◴[] No.46241715[source]
I'm curious whether the model has gotten more consistent throughout the full context window. It's something that OpenAI touted in the release, and I'm curious whether it will make a difference for long-running tasks or big code reviews.
replies(1): >>46242088 #
13. agentifysh ◴[] No.46242088[source]
One positive is that 5.2 is very good at finding bugs. Not sure about throughput; I'd imagine it might be improved, but I haven't had a real task to benchmark it on.

What I am curious about is 5.2-codex, but many of us complained about 5.1-codex (it seemed to get tunnel-visioned), and I have been using vanilla 5.1.

It's just getting very tiring to deal with 5 different permutations of 3 completely separate models, but perhaps that's the intent: it keeps you on a chase.