
555 points | maheshrijal | 5 comments
_fat_santa ◴[] No.43708027[source]
So at this point OpenAI has 6 reasoning models, 4 flagship chat models, and 7 cost-optimized models. That's 17 models in total, and that's not even counting their older models and more specialized ones. Compare this with Anthropic, which has 7 models in total and 2 main ones that it promotes.

This is just getting to be a bit much; it seems like they are trying to cover for the fact that they haven't actually done much. All these models feel like they took the exact same base model, tweaked a few things, and released it as an entirely new model rather than updating the existing ones. In fact, based on some of the other comments here, it sounds like these are just updates to their existing models, but they release them as new models to create more media buzz.

replies(22): >>43708044 #>>43708100 #>>43708150 #>>43708219 #>>43708340 #>>43708462 #>>43708605 #>>43708626 #>>43708645 #>>43708647 #>>43708800 #>>43708970 #>>43709059 #>>43709249 #>>43709317 #>>43709652 #>>43709926 #>>43710038 #>>43710114 #>>43710609 #>>43710652 #>>43713438 #
shmatt ◴[] No.43708462[source]
I'm old enough to remember the mystery and hype before o*/o1/strawberry, which was supposed to be essentially AGI. We had serious news outlets writing about senior people at OpenAI quitting because o1 was Skynet.

Now we're up to o4, and AGI is still nowhere in sight (depending on your definition, I know). And OpenAI is up to about 5,000 employees. I'd think even before AGI, a new model would be able to cover for at least 4,500 of those employees being fired, is that not the case?

replies(8): >>43708694 #>>43708755 #>>43708824 #>>43709411 #>>43709774 #>>43710199 #>>43710213 #>>43710748 #
irthomasthomas ◴[] No.43708824[source]
Yeah, I don't know exactly what an AGI model will look like, but I think it would have more than a 200k context window.
replies(3): >>43709865 #>>43710042 #>>43710363 #
doug_durham ◴[] No.43710042[source]
Do you have a 200k context window? I don't. Most humans can only keep 6 or 7 things in short-term memory. Beyond those 6 or 7, you are pulling data from your latent space, or replacing one of the short-term slots with new content.
replies(1): >>43710246 #
sixQuarks ◴[] No.43710246[source]
But context windows for LLMs include all the “long term memory” things you’re excluding from humans
replies(1): >>43710333 #
kadushka ◴[] No.43710333[source]
Long term memory in an LLM is its weights.
replies(1): >>43711008 #
echoangle ◴[] No.43711008[source]
Not really, because humans can form long-term memories from conversations, but LLM users aren't finetuning models after every chat so that the model remembers.
replies(2): >>43711243 #>>43711244 #
esafak ◴[] No.43711243[source]
He's right, but most people don't have the resources, nor indeed the weights themselves, to keep training the models. But the weights are very much long term memory.
kadushka ◴[] No.43711244[source]
users aren’t finetuning models after every chat

Users can do that if they want, but it’s more effective and more efficient to do that after every billion chats, and I’m sure OpenAI does it.

replies(1): >>43713784 #
echoangle ◴[] No.43713784[source]
If you want the entire model to remember everything it talked about with every user, sure. But ideally, I would want the model to remember what I told it a few million tokens ago, but not what you told it (because to me, the model should look like my private copy that only talks to me).
replies(1): >>43717591 #
kadushka ◴[] No.43717591{3}[source]
ideally, I would want the model to remember what I told it a few million tokens ago

Yes, you can keep finetuning your model on every chat you have with it. You can definitely make it remember everything you have ever said. LLMs are excellent at remembering their training data.
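As a toy illustration of the weights-as-memory idea being debated above (this is an analogy, not OpenAI's actual training setup), here is a minimal bigram language model in Python: "finetuning" it on a chat updates its counts (its "weights"), after which it can reproduce that chat with no context window at all. The class and method names are hypothetical, invented for this sketch.

```python
from collections import defaultdict

class BigramLM:
    """Toy bigram language model: its 'weights' are next-token counts."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def finetune(self, text):
        # Update the weights on a new "chat" -- the analogue of
        # forming a long-term memory, rather than stuffing the
        # chat into a context window.
        tokens = text.split()
        for a, b in zip(tokens, tokens[1:]):
            self.counts[a][b] += 1

    def generate(self, start, n=10):
        # Greedy decoding from the learned counts alone.
        out = [start]
        for _ in range(n):
            nxt = self.counts.get(out[-1])
            if not nxt:
                break
            out.append(max(nxt, key=nxt.get))
        return " ".join(out)

model = BigramLM()
model.finetune("my favorite color is teal")
print(model.generate("my", n=4))  # the model now "recalls" the chat from its weights
```

The same trade-off from the thread shows up even here: one shared model finetuned on everyone's chats would blend them together, which is why per-user memory would require per-user weights (or adapters).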