
GPT-5.2

(openai.com)
1019 points atgctg | 4 comments
sfmike ◴[] No.46234974[source]
Everything is still based on 4/4o, right? Is training a new model just too expensive? They could maybe consult the DeepSeek team for cost-constrained new models.
replies(4): >>46235000 #>>46235052 #>>46235127 #>>46235143 #
1. Wowfunhappy ◴[] No.46235052[source]
I thought whenever the knowledge cutoff increased, it meant they’d trained a new model. I guess that’s completely wrong?
replies(2): >>46235181 #>>46236200 #
2. brokencode ◴[] No.46235181[source]
Typically, I think, but you could also continue pre-training your previous model on new data.

I don’t think it’s publicly known for sure how different the models really are. You can improve a lot just by improving the post-training set.
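
For context, here is a minimal sketch of what "improving the post-training set" looks like mechanically: supervised fine-tuning on instruction-style pairs, where only the answer tokens contribute to the loss. The checkpoint name and the toy Q/A pair are placeholders for illustration, not anything OpenAI has disclosed.

    # Hedged sketch: supervised fine-tuning (one post-training stage) with
    # Hugging Face transformers. "gpt2" and the Q/A pair are stand-ins.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    optim = torch.optim.AdamW(model.parameters(), lr=1e-5)

    examples = [("Q: What is 2+2?\nA:", " 4")]  # toy post-training pair

    model.train()
    for prompt, answer in examples:
        prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
        full = tok(prompt + answer, return_tensors="pt")
        labels = full.input_ids.clone()
        labels[:, :prompt_len] = -100          # ignore prompt tokens in the loss
        loss = model(**full, labels=labels).loss
        loss.backward()
        optim.step()
        optim.zero_grad()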

3. rockinghigh ◴[] No.46236200[source]
They add new data to the existing base model via continual pre-training. You save on pre-training from scratch (the next-token prediction task over the full corpus), but you still have to re-run the mid- and post-training stages: context-length extension, supervised fine-tuning, reinforcement learning, safety alignment, and so on (sketch below).
replies(1): >>46239272 #
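
A minimal sketch of the continued pre-training step described above: resume the plain next-token prediction objective on newer text, starting from an existing checkpoint. The checkpoint name and the "new" document are placeholders.

    # Hedged sketch: continued pre-training with the standard causal-LM loss.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")          # stand-in checkpoint
    model = AutoModelForCausalLM.from_pretrained("gpt2") # previously trained base
    optim = torch.optim.AdamW(model.parameters(), lr=1e-5)

    new_docs = ["Text crawled after the original knowledge cutoff ..."]

    model.train()
    for doc in new_docs:
        batch = tok(doc, return_tensors="pt")
        # labels == input_ids gives the next-token prediction loss
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optim.step()
        optim.zero_grad()
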
4. astrange ◴[] No.46239272[source]
Continual pretraining has issues because the model starts forgetting older material (catastrophic forgetting). There is some research into other approaches.
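
One common mitigation, sketched here under stated assumptions, is replay: mixing a slice of the original pre-training corpus back in with the new data so the model keeps seeing the old distribution. The mixing ratio and corpora below are made up for illustration.

    # Hedged sketch: data replay to reduce forgetting during continued pre-training.
    import random

    OLD_FRACTION = 0.3  # hypothetical replay ratio

    def mixed_stream(old_corpus, new_corpus, old_fraction=OLD_FRACTION):
        """Yield documents, drawing from the old corpus with probability old_fraction."""
        while True:
            if random.random() < old_fraction:
                yield random.choice(old_corpus)
            else:
                yield random.choice(new_corpus)

    # Feed this stream into the same next-token training loop as above.
    stream = mixed_stream(["old pre-training doc ..."], ["newer doc ..."])
    batch = [next(stream) for _ in range(8)]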