
113 points by alexmolas | 1 comment
numlocked
I don’t quite understand. The article says things like:

“With the constant upward pressure on embedding sizes not limited by having to train models in-house, it’s not clear where we’ll slow down: Qwen-3, along with many others is already at 4096”

But aren’t embedding models separate from the LLMs? The size of attention heads in LLMs, etc., isn’t inherently connected to how a lab might train and release an embedding model. I don’t really understand why growth in LLM size fundamentally puts upward pressure on embedding size, since the two aren’t intrinsically connected.

gojomo
The LLMs need the embedding function, benefit from its growth, and do the training; other uses then get that embedding "for free".
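
A minimal sketch of that path, assuming a Hugging Face decoder LLM and simple mean pooling (the model name, pooling choice, and library usage are illustrative assumptions, not something claimed in the thread): the vector you get out has the LLM's hidden size, which is how a 4096-dim embedding falls out "for free".

    # Hypothetical sketch: pull an embedding out of a decoder LLM's hidden states.
    import torch
    from transformers import AutoModel, AutoTokenizer

    name = "Qwen/Qwen3-8B"                        # assumed model; any decoder LLM would do
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)

    inputs = tok("an example sentence", return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state    # [1, seq_len, hidden_size]

    emb = hidden.mean(dim=1).squeeze(0)   # mean-pool tokens into one vector
    print(emb.shape)                      # the LLM's hidden size, e.g. 4096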

So an old downward pressure on sizes (internal training costs and resource limits) is now weaker. And as long as LLMs keep seeing benefits from larger embeddings, larger embeddings will become more common and available. (Of course, via truncation etc., no one is forced to use a larger embedding than works for them... but larger ones may keep becoming more common and available.)
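
To make the truncation point concrete, a small sketch: keep only the first k dimensions of a larger embedding and re-normalize. (This assumes the model was trained so the leading dimensions carry most of the signal, Matryoshka-style; whether that holds is model-dependent and not claimed above.)

    import numpy as np

    def truncate(embedding: np.ndarray, k: int) -> np.ndarray:
        """Keep the first k dimensions and L2-normalize the result."""
        v = embedding[:k]
        return v / np.linalg.norm(v)

    full = np.random.randn(4096)     # stand-in for a 4096-dim model output
    small = truncate(full, 256)      # downstream use picks a cheaper size
    print(small.shape)               # (256,)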