
Embeddings are underrated (2024)

(technicalwriting.dev)
484 points by jxmorris12 | 2 comments
simianwords ◴[] No.43966334[source]
I don't think any of the current consumer LLM tools use embeddings for web search. Instead they do it at the text level.

The evidence for this is the CoT summary in ChatGPT - I have seen the LLM use quoted phrases to grep the web.

Embeddings seem good in theory, but in practice it's probably best to ask an LLM to do a deep search instead, by giving it instructions like "use synonyms and common typos and grep".
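
A rough sketch of that instruction-driven, text-level search: in practice the list of variants would come from prompting the LLM ("use synonyms and common typos"), but here it is hardcoded, and the pages dict stands in for fetched web content.

    import re

    # Hypothetical: an LLM would be prompted to produce these query variants
    # (synonyms, common typos); hardcoded here purely for illustration.
    query_variants = ["embedding", "embeddings", "vector representation", "embeding"]

    def grep_pages(pages: dict[str, str], variants: list[str]) -> dict[str, list[str]]:
        """Return, per page, the lines that literally match any query variant."""
        pattern = re.compile("|".join(re.escape(v) for v in variants), re.IGNORECASE)
        hits = {}
        for url, text in pages.items():
            matched = [line for line in text.splitlines() if pattern.search(line)]
            if matched:
                hits[url] = matched
        return hits

    pages = {"https://example.com/post": "Embeddings map text to vectors.\nUnrelated line."}
    print(grep_pages(pages, query_variants))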

Does anyone know of a live example of a consumer product using embeddings?

replies(2): >>43966401 #>>43966424 #
1. zhobbs ◴[] No.43966401[source]
My understanding is that modern search engines are using embeddings / vector search under the hood.

So even if LLMs aren't directly passing a vector to the search engine, my assumption is that the search engine is converting the query to a vector and searching.

"You interact with embeddings every time you complete a Google Search" from https://cloud.google.com/vertex-ai/generative-ai/docs/embedd...

replies(1): >>43966444 #
2. simianwords ◴[] No.43966444[source]
Fair, and maybe the key point here is that it uses embeddings to improve search results alongside many manual heuristics. I hardly think Google Search works just by dumping embeddings, doing k-NN, and calling it a day.
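
One way to picture "embeddings plus heuristics" is a blended score; the signals and weights below are entirely invented for the example, not how any real engine ranks.

    import numpy as np

    # Invented heuristic signals: title keyword hits and a freshness bonus.
    def heuristic_score(doc: dict, query_terms: set[str]) -> float:
        title_hits = len(query_terms & set(doc["title"].lower().split()))
        freshness = 1.0 if doc["age_days"] < 365 else 0.5
        return 0.3 * title_hits + 0.2 * freshness

    def hybrid_rank(docs, query_vec, query_terms):
        def score(doc):
            sim = float(query_vec @ doc["embedding"])  # assumes unit-normalized vectors
            return 0.5 * sim + heuristic_score(doc, query_terms)
        return sorted(docs, key=score, reverse=True)

    docs = [
        {"title": "Embeddings are underrated", "age_days": 200,
         "embedding": np.array([0.9, 0.1, 0.0])},
        {"title": "Chocolate cake recipe", "age_days": 30,
         "embedding": np.array([0.0, 0.2, 0.9])},
    ]
    print(hybrid_rank(docs, np.array([0.8, 0.2, 0.1]), {"embeddings"}))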