
276 points by Fendy | 1 comment
qaq ◴[] No.45170557[source]
"I recently spoke with the CTO of a popular AI note-taking app who told me something surprising: they spend twice as much on vector search as they do on OpenAI API calls. Think about that for a second. Running the retrieval layer costs them more than paying for the LLM itself. That flips the usual assumption on its head." Hmm well start sending full documents as part of context see it flip back :).
replies(3): >>45170757 #>>45171312 #>>45182178 #
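[A rough back-of-envelope sketch of qaq's point, for the curious. Every number here is an illustrative assumption, not a figure from the thread: the $2.50-per-million-token input price, the 2k-token RAG context, and the 50k-token full document are all made up for the sake of the arithmetic.]

    # Back-of-envelope: token cost of RAG context vs. sending whole
    # documents. All prices and sizes are illustrative assumptions.
    PRICE_PER_1M_INPUT_TOKENS = 2.50   # assumed USD price for LLM input
    QUERIES_PER_MONTH = 1_000_000

    rag_tokens = 2_000        # a few retrieved chunks per query
    full_doc_tokens = 50_000  # one full source document per query

    rag_cost = QUERIES_PER_MONTH * rag_tokens / 1e6 * PRICE_PER_1M_INPUT_TOKENS
    full_cost = QUERIES_PER_MONTH * full_doc_tokens / 1e6 * PRICE_PER_1M_INPUT_TOKENS

    print(f"RAG context:    ${rag_cost:,.0f}/mo")   # $5,000
    print(f"Full documents: ${full_cost:,.0f}/mo")  # $125,000

[At these assumed numbers the LLM bill jumps 25x, so a vector-search bill that merely doubles the LLM spend is still far cheaper than skipping retrieval.]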
heywoods ◴[] No.45170757[source]
Egress costs? I’m really surprised by this. Thanks for sharing.
replies(2): >>45170991 #>>45177575 #
andreasgl ◴[] No.45177575[source]
They’re likely using an HNSW index, which typically requires a lot of memory for large data sets.
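[For intuition, a minimal sketch of why an in-memory HNSW index gets expensive, assuming float32 embeddings and 4-byte neighbor IDs in an hnswlib-style layout; the corpus size, dimensionality, and M value are illustrative assumptions, and exact link storage varies by library.]

    # Rough HNSW memory estimate: embedding storage dominates,
    # with graph links on top. Exact layout varies by library.
    def hnsw_memory_gb(num_vectors: int, dims: int, m: int = 16) -> float:
        vector_bytes = dims * 4   # float32 embedding
        link_bytes = m * 2 * 4    # ~2*M neighbors at layer 0, 4-byte IDs
        return num_vectors * (vector_bytes + link_bytes) / 1e9

    # 100M 1536-dim embeddings at M=16 -> roughly 627 GB of RAM,
    # before higher layers, replicas, or metadata.
    print(f"{hnsw_memory_gb(100_000_000, 1536):.0f} GB")

[Keeping hundreds of gigabytes of RAM hot around the clock is exactly the kind of fixed cost that can outrun a pay-per-token API bill, which would line up with the CTO's observation upthread.]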