
GPT-5.2

(openai.com)
1019 points atgctg | 4 comments
sfmike ◴[] No.46234974[source]
Everything is still based on 4/4o, right? Is training a new model just too expensive? They could maybe consult the DeepSeek team for cost-constrained new models.
replies(4): >>46235000 #>>46235052 #>>46235127 #>>46235143 #
1. Wowfunhappy ◴[] No.46235052[source]
I thought whenever the knowledge cutoff increased, it meant they’d trained a new model. I guess that’s completely wrong?
replies(2): >>46235181 #>>46236200 #
2. brokencode ◴[] No.46235181[source]
Typically, I think, but you could also continue pre-training your previous model on new data.

I don’t think it’s publicly known for sure how different the models really are. You can improve a lot just by improving the post-training set.
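
For context, here is a minimal sketch of what "improving the post-training set" looks like mechanically: supervised fine-tuning on instruction-style pairs, where only the answer tokens contribute to the loss. The checkpoint name and the toy Q/A pair are placeholders for illustration, not anything OpenAI has disclosed.

    # Hedged sketch: supervised fine-tuning (one post-training stage) with
    # Hugging Face transformers. "gpt2" and the Q/A pair are stand-ins.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    optim = torch.optim.AdamW(model.parameters(), lr=1e-5)

    examples = [("Q: What is 2+2?\nA:", " 4")]  # toy post-training pair

    model.train()
    for prompt, answer in examples:
        prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
        full = tok(prompt + answer, return_tensors="pt")
        labels = full.input_ids.clone()
        labels[:, :prompt_len] = -100          # ignore prompt tokens in the loss
        loss = model(**full, labels=labels).loss
        loss.backward()
        optim.step()
        optim.zero_grad()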

3. rockinghigh ◴[] No.46236200[source]
They add new data to the existing base model via continual pre-training. You save on pre-training from scratch (the next-token prediction task over the full corpus), but you still have to re-run the mid- and post-training stages: context-length extension, supervised fine-tuning, reinforcement learning, safety alignment, and so on (sketch below).
replies(1): >>46239272 #
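
A minimal sketch of the continued pre-training step described above: resume the plain next-token prediction objective on newer text, starting from an existing checkpoint. The checkpoint name and the "new" document are placeholders.

    # Hedged sketch: continued pre-training with the standard causal-LM loss.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")          # stand-in checkpoint
    model = AutoModelForCausalLM.from_pretrained("gpt2") # previously trained base
    optim = torch.optim.AdamW(model.parameters(), lr=1e-5)

    new_docs = ["Text crawled after the original knowledge cutoff ..."]

    model.train()
    for doc in new_docs:
        batch = tok(doc, return_tensors="pt")
        # labels == input_ids gives the next-token prediction loss
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optim.step()
        optim.zero_grad()
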
4. astrange ◴[] No.46239272[source]
Continual pretraining has issues because the model starts forgetting older material (catastrophic forgetting). There is some research into other approaches.
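
One common mitigation, sketched here under stated assumptions, is replay: mixing a slice of the original pre-training corpus back in with the new data so the model keeps seeing the old distribution. The mixing ratio and corpora below are made up for illustration.

    # Hedged sketch: data replay to reduce forgetting during continued pre-training.
    import random

    OLD_FRACTION = 0.3  # hypothetical replay ratio

    def mixed_stream(old_corpus, new_corpus, old_fraction=OLD_FRACTION):
        """Yield documents, drawing from the old corpus with probability old_fraction."""
        while True:
            if random.random() < old_fraction:
                yield random.choice(old_corpus)
            else:
                yield random.choice(new_corpus)

    # Feed this stream into the same next-token training loop as above.
    stream = mixed_stream(["old pre-training doc ..."], ["newer doc ..."])
    batch = [next(stream) for _ in range(8)]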