
GPT-5.2

(openai.com)
1053 points by atgctg | 3 comments
system2 ◴[] No.46234981[source]
"Investors are putting pressure, change the version number now!!!"
replies(1): >>46235001 #
exe34 ◴[] No.46235001[source]
I'm quite sad about the S-curve hitting us hard in transformers. For a short period, we had the excitement of "ooh, if GPT-3.5 is so good, GPT-4 is going to be amazing! Ooh, GPT-4 has sparks of AGI!" But now we're back to version inflation for inconsequential gains.
replies(4): >>46235029 #>>46235236 #>>46235245 #>>46235399 #
1. ToValueFunfetti ◴[] No.46235236[source]
Take this all with a grain of salt as it's hearsay:

From what I understand, nobody has done any real scaling since the GPT-4 era. 4.5 was a bit larger than 4, but not by the orders of magnitude that separated 3 and 4, and 5 is smaller than 4.5. Google and Anthropic haven't gone substantially bigger than GPT-4 either. Improvements since 4 have come almost entirely from reasoning and RL. In 2026 or 2027, we should see a model that uses the current datacenter buildout and actually scales up.

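As a rough illustration of what "actually scales up" means in compute terms, here is a back-of-the-envelope sketch using the common C ≈ 6·N·D training-FLOPs rule of thumb. The parameter and token counts below are illustrative assumptions (widely circulated rumors at best; none of the labs publish them), not figures from this thread.

    # Back-of-the-envelope training-compute sketch (not official figures).
    # Uses the common C ~= 6 * N * D rule of thumb (FLOPs ~ 6 x params x tokens).
    # The parameter and token counts below are assumptions, since OpenAI does
    # not publish them.

    def train_flops(params: float, tokens: float) -> float:
        """Approximate training compute in FLOPs via the 6*N*D heuristic."""
        return 6 * params * tokens

    # Hypothetical model sizes to illustrate what "orders of magnitude" of
    # scaling would mean.
    scenarios = {
        "GPT-4-class (assumed)": (1.8e12, 13e12),  # ~1.8T params, ~13T tokens (rumored)
        "10x larger (assumed)":  (1.8e13, 4e13),   # one order of magnitude more params
    }

    for name, (n_params, n_tokens) in scenarios.items():
        c = train_flops(n_params, n_tokens)
        print(f"{name}: ~{c:.1e} FLOPs")

Even with these assumed numbers, one extra order of magnitude in parameters (with proportionally more data) means far more than an order of magnitude more training compute, which is what the datacenter buildout would be paying for.
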
replies(2): >>46235487 #>>46235961 #
2. snovv_crash ◴[] No.46235487[source]
Datacenter capacity is being snapped up for inference too, though.
3. Leynos ◴[] No.46235961[source]
4.5 is widely believed to be an order of magnitude larger than GPT-4, as reflected in its API inference cost. The limiting factor is how many parameters you can fit in the memory of one GPU. Pretty much every large GPT model from 4 onwards has been a mixture of experts, but for a model at the 10-trillion-parameter scale, you'd be talking about a lot of experts and a lot of inter-GPU communication.

With FP4 on the Blackwell GPUs, it should become much more practical to run a model of that size by the time GPT-5.x deployments roll out. We're just going to have to wait for the GBx00 systems to be physically deployed at scale.
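
To make the memory argument concrete, here is a minimal sketch of the weights-only footprint of a hypothetical 10-trillion-parameter model at different precisions. The bytes-per-parameter values and the ~192 GB of HBM per GPU are assumptions for illustration, not specs quoted in this thread.

    # Rough memory-footprint sketch for a hypothetical 10T-parameter model.
    # Bytes-per-parameter and HBM-per-GPU values are illustrative assumptions
    # (e.g. ~192 GB on a Blackwell-class part), not confirmed vendor figures.

    PARAMS = 10e12  # 10 trillion parameters (hypothetical)

    bytes_per_param = {"FP16": 2.0, "FP8": 1.0, "FP4": 0.5}
    hbm_per_gpu_gb = 192  # assumed HBM capacity per GPU

    for fmt, bpp in bytes_per_param.items():
        weights_gb = PARAMS * bpp / 1e9
        # Weights only; KV cache, activations and parallelism overhead come on top.
        gpus_needed = -(-weights_gb // hbm_per_gpu_gb)  # ceiling division
        print(f"{fmt}: ~{weights_gb / 1e3:.0f} TB of weights, "
              f">= {gpus_needed:.0f} GPUs just to hold them")

Even at FP4, the weights alone span dozens of GPUs before accounting for KV cache, activations, or the expert routing mentioned above, which is where the inter-GPU communication cost comes in.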