
Embeddings are underrated (2024)

(technicalwriting.dev)
484 points by jxmorris12 | 2 comments
simianwords ◴[] No.43966334[source]
I don't think any of the current consumer LLM tools use embeddings for web search. Instead they do it at the text level.

The evidence for this is the CoT summary in ChatGPT - I have seen the LLM use quoted phrases to grep the web.

Embeddings seem good in theory, but in practice it's probably best to ask an LLM to do a deep search instead, by giving it instructions like "use synonyms and common typos and grep".
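
A rough sketch of that instruction-driven, text-level search: in practice the list of variants would come from prompting the LLM ("use synonyms and common typos"), but here it is hardcoded, and the pages dict stands in for fetched web content.

    import re

    # Hypothetical: an LLM would be prompted to produce these query variants
    # (synonyms, common typos); hardcoded here purely for illustration.
    query_variants = ["embedding", "embeddings", "vector representation", "embeding"]

    def grep_pages(pages: dict[str, str], variants: list[str]) -> dict[str, list[str]]:
        """Return, per page, the lines that literally match any query variant."""
        pattern = re.compile("|".join(re.escape(v) for v in variants), re.IGNORECASE)
        hits = {}
        for url, text in pages.items():
            matched = [line for line in text.splitlines() if pattern.search(line)]
            if matched:
                hits[url] = matched
        return hits

    pages = {"https://example.com/post": "Embeddings map text to vectors.\nUnrelated line."}
    print(grep_pages(pages, query_variants))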

Does anyone know of a live example of a consumer product using embeddings?

replies(2): >>43966401 #>>43966424 #
1. zhobbs ◴[] No.43966401[source]
My understanding is that modern search engines are using embeddings / vector search under the hood.

So even if LLMs aren't directly passing a vector to the search engine, my assumption is that the search engine is converting the query to a vector and searching.

"You interact with embeddings every time you complete a Google Search" from https://cloud.google.com/vertex-ai/generative-ai/docs/embedd...

replies(1): >>43966444 #
2. simianwords ◴[] No.43966444[source]
Fair, and maybe the key point here is that it uses embeddings to improve search results alongside many manual heuristics. I hardly think Google Search works just by dumping embeddings, doing k-NN, and calling it a day.
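
One way to picture "embeddings plus heuristics" is a blended score; the signals and weights below are entirely invented for the example, not how any real engine ranks.

    import numpy as np

    # Invented heuristic signals: title keyword hits and a freshness bonus.
    def heuristic_score(doc: dict, query_terms: set[str]) -> float:
        title_hits = len(query_terms & set(doc["title"].lower().split()))
        freshness = 1.0 if doc["age_days"] < 365 else 0.5
        return 0.3 * title_hits + 0.2 * freshness

    def hybrid_rank(docs, query_vec, query_terms):
        def score(doc):
            sim = float(query_vec @ doc["embedding"])  # assumes unit-normalized vectors
            return 0.5 * sim + heuristic_score(doc, query_terms)
        return sorted(docs, key=score, reverse=True)

    docs = [
        {"title": "Embeddings are underrated", "age_days": 200,
         "embedding": np.array([0.9, 0.1, 0.0])},
        {"title": "Chocolate cake recipe", "age_days": 30,
         "embedding": np.array([0.0, 0.2, 0.9])},
    ]
    print(hybrid_rank(docs, np.array([0.8, 0.2, 0.1]), {"embeddings"}))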