Everything is still based on 4o, right? Is training a new model from scratch just too expensive? Maybe they could consult the DeepSeek team about building new models under cost constraints.
I thought that whenever the knowledge cutoff increased, it meant they'd trained a new model. I guess that's completely wrong?
They add new data to the existing base model via continued pre-training. You save on the full pre-training run, the next-token-prediction task over the original corpus, but you still have to re-run the mid- and post-training stages: context length extension, supervised fine-tuning, reinforcement learning, safety alignment, and so on.
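To make the idea concrete, here's a minimal sketch of continued pre-training using Hugging Face Transformers. The model name and dataset are stand-ins (nobody outside OpenAI knows their actual setup); the point is just that you resume from an existing checkpoint and keep running the same next-token-prediction objective on newer data, rather than training from scratch.

```python
# Sketch of continued pre-training: load an existing checkpoint and keep
# training on "new" data with the same causal-LM objective.
# "gpt2" and "wikitext" are placeholders for illustration only.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

model_name = "gpt2"  # stand-in for the existing base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)  # resume from existing weights

# Placeholder corpus standing in for data past the old knowledge cutoff.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

# mlm=False gives the standard next-token-prediction (causal LM) objective.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="continued-pretrain",
        per_device_train_batch_size=2,
        num_train_epochs=1,
        learning_rate=1e-5,  # low LR to limit forgetting of the original data
    ),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
# After this, the mid/post-training stages (context length extension, SFT,
# RL, safety alignment, ...) would still need to be re-run on top.
```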