
555 points | maheshrijal | 5 comments
_fat_santa ◴[] No.43708027[source]
So at this point OpenAI has 6 reasoning models, 4 flagship chat models, and 7 cost-optimized models. That's 17 models in total, and that's not even counting their older models and more specialized ones. Compare this with Anthropic, which has 7 models in total and 2 main ones that it promotes.

This is just getting to be a bit much; it seems like they are trying to cover for the fact that they haven't actually done much. All these models feel like they took the exact same base model, tweaked a few things, and released it as an entirely new model rather than updating the existing ones. In fact, based on some of the other comments here, it sounds like these are just updates to their existing models, but they release them as new models to create more media buzz.

replies(22): >>43708044 #>>43708100 #>>43708150 #>>43708219 #>>43708340 #>>43708462 #>>43708605 #>>43708626 #>>43708645 #>>43708647 #>>43708800 #>>43708970 #>>43709059 #>>43709249 #>>43709317 #>>43709652 #>>43709926 #>>43710038 #>>43710114 #>>43710609 #>>43710652 #>>43713438 #
shmatt ◴[] No.43708462[source]
I'm old enough to remember the mystery and hype before o*/o1/strawberry, which was supposed to be essentially AGI. We had serious news outlets writing about senior people at OpenAI quitting because o1 was Skynet.

Now we're up to o4, and AGI is still nowhere in sight (depending on your definition, I know). And OpenAI is up to about 5,000 employees. I'd think even before AGI, a new model would be able to cover for at least 4,500 of those employees being fired, is that not the case?

replies(8): >>43708694 #>>43708755 #>>43708824 #>>43709411 #>>43709774 #>>43710199 #>>43710213 #>>43710748 #
irthomasthomas ◴[] No.43708824[source]
Yeah, I don't know exactly what an AGI model will look like, but I think it would have more than a 200k context window.
replies(3): >>43709865 #>>43710042 #>>43710363 #
doug_durham ◴[] No.43710042[source]
Do you have a 200k context window? I don't. Most humans can only keep 6 or 7 things in short-term memory. Beyond those 6 or 7, you are pulling data from your latent space, or replacing one of the short-term slots with new content.
replies(1): >>43710246 #
sixQuarks ◴[] No.43710246[source]
But context windows for LLMs include all the “long term memory” things you’re excluding from humans
replies(1): >>43710333 #
kadushka ◴[] No.43710333[source]
Long term memory in an LLM is its weights.
replies(1): >>43711008 #
echoangle ◴[] No.43711008[source]
Not really, because humans can form long-term memories from conversations, but LLM users aren't finetuning models after every chat so that the model remembers.
replies(2): >>43711243 #>>43711244 #
esafak ◴[] No.43711243[source]
He's right, but most people don't have the resources, nor indeed the weights themselves, to keep training the models. But the weights are very much long term memory.
kadushka ◴[] No.43711244[source]
users aren’t finetuning models after every chat

Users can do that if they want, but it’s more effective and more efficient to do that after every billion chats, and I’m sure OpenAI does it.

replies(1): >>43713784 #
echoangle ◴[] No.43713784[source]
If you want the entire model to remember everything it talked about with every user, sure. But ideally, I would want the model to remember what I told it a few million tokens ago, but not what you told it (because to me, the model should look like my private copy that only talks to me).
replies(1): >>43717591 #
kadushka ◴[] No.43717591{3}[source]
ideally, I would want the model to remember what I told it a few million tokens ago

Yes, you can keep finetuning your model on every chat you have with it. You can definitely make it remember everything you have ever said. LLMs are excellent at remembering their training data.
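As a toy illustration of the weights-as-memory idea being debated above (this is an analogy, not OpenAI's actual training setup), here is a minimal bigram language model in Python: "finetuning" it on a chat updates its counts (its "weights"), after which it can reproduce that chat with no context window at all. The class and method names are hypothetical, invented for this sketch.

```python
from collections import defaultdict

class BigramLM:
    """Toy bigram language model: its 'weights' are next-token counts."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def finetune(self, text):
        # Update the weights on a new "chat" -- the analogue of
        # forming a long-term memory, rather than stuffing the
        # chat into a context window.
        tokens = text.split()
        for a, b in zip(tokens, tokens[1:]):
            self.counts[a][b] += 1

    def generate(self, start, n=10):
        # Greedy decoding from the learned counts alone.
        out = [start]
        for _ in range(n):
            nxt = self.counts.get(out[-1])
            if not nxt:
                break
            out.append(max(nxt, key=nxt.get))
        return " ".join(out)

model = BigramLM()
model.finetune("my favorite color is teal")
print(model.generate("my", n=4))  # the model now "recalls" the chat from its weights
```

The same trade-off from the thread shows up even here: one shared model finetuned on everyone's chats would blend them together, which is why per-user memory would require per-user weights (or adapters).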