(zilliz.com)

1. qaq ◴[08 Sep 25 16:47 UTC] No.45170557[source]▶

"I recently spoke with the CTO of a popular AI note-taking app who told me something surprising: they spend twice as much on vector search as they do on OpenAI API calls. Think about that for a second. Running the retrieval layer costs them more than paying for the LLM itself. That flips the usual assumption on its head." Hmm, well, start sending full documents as part of the context and see it flip back :).

replies(3): >>45170757 #>>45171312 #>>45182178 #

2. heywoods ◴[08 Sep 25 17:01 UTC] No.45170757[source]▶

>>45170557 (TP) #

Egress costs? I’m really surprised by this. Thanks for sharing.

replies(2): >>45170991 #>>45177575 #

3. qaq ◴[08 Sep 25 17:18 UTC] No.45170991[source]▶

>>45170757 #

Sorry, maybe I should've been clearer: it was a sarcastic remark. The whole point of doing vector DB search is to feed the LLM very targeted context so you can save $ on API calls to the LLM.
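
To put rough numbers on that trade-off, here is a minimal sketch comparing prompt cost when sending a full document versus a few retrieved chunks. The token counts, query volume, and per-token price are all made-up placeholders, not figures from the article or from any provider's price list.

```python
def prompt_cost_usd(tokens_per_query: int, queries: int,
                    usd_per_million_tokens: float) -> float:
    """Input-token cost for a batch of queries at a flat per-token price."""
    return tokens_per_query * queries * usd_per_million_tokens / 1_000_000


# All values below are hypothetical, chosen only to illustrate the ratio.
PRICE = 1.0            # USD per million input tokens (placeholder)
QUERIES = 1_000        # queries in the batch (placeholder)
FULL_DOC_TOKENS = 50_000   # whole document stuffed into the context
RETRIEVED_TOKENS = 2_000   # a handful of chunks from vector search

full_doc_cost = prompt_cost_usd(FULL_DOC_TOKENS, QUERIES, PRICE)
retrieved_cost = prompt_cost_usd(RETRIEVED_TOKENS, QUERIES, PRICE)
print(f"full documents: ${full_doc_cost:.2f}, retrieved chunks: ${retrieved_cost:.2f}")
```

With these placeholder values the LLM bill scales directly with tokens per query, so the retrieval layer only looks expensive relative to an LLM bill that retrieval has already shrunk.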

replies(2): >>45171144 #>>45190030 #

4. infecto ◴[08 Sep 25 17:30 UTC] No.45171144{3}[source]▶

>>45170991 #

That’s not the whole point. It’s at the intersection of reducing the tokens sent and making search both specific and generic enough to capture the correct context data.

replies(1): >>45173722 #

5. ◴[08 Sep 25 17:41 UTC] No.45171312[source]▶

>>45170557 (TP) #

6. j45 ◴[08 Sep 25 20:43 UTC] No.45173722{4}[source]▶

>>45171144 #

It's possible to create linking documents between the documents to help smooth out things in some cases.

7. andreasgl ◴[09 Sep 25 04:56 UTC] No.45177575[source]▶

>>45170757 #

They’re likely using an HNSW index, which typically requires a lot of memory for large data sets.
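
For a sense of scale, here is a back-of-the-envelope sketch of HNSW memory use. It assumes the commonly quoted approximation of raw float32 vectors plus about 2 * M neighbor links of 4 bytes each per vector; real implementations add overhead for upper layers and bookkeeping, and the example corpus size and dimension are hypothetical.

```python
def hnsw_memory_gib(num_vectors: int, dim: int, m: int = 16,
                    bytes_per_float: int = 4, bytes_per_link: int = 4) -> float:
    """Rough in-memory size of an HNSW index, in GiB.

    Approximation only: vector storage plus ~2*M base-layer links per
    vector. Upper-layer links and allocator overhead are ignored.
    """
    vector_bytes = dim * bytes_per_float
    link_bytes = 2 * m * bytes_per_link
    return num_vectors * (vector_bytes + link_bytes) / 2**30


# Hypothetical corpus: 100 million embeddings of dimension 1536.
estimate = hnsw_memory_gib(100_000_000, 1536, m=16)
print(f"~{estimate:.0f} GiB of RAM before replication")
```

Under these assumptions the vectors themselves dominate the footprint, which is why keeping everything in RAM gets costly as the data set grows.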

8. dahcryn ◴[09 Sep 25 14:09 UTC] No.45182178[source]▶

>>45170557 (TP) #

If they use AzureSearch, I fully understand it. Those things are hella expensive.

9. heywoods ◴[09 Sep 25 22:11 UTC] No.45190030{3}[source]▶

>>45170991 #

No worries. I should probably make sure I have at least a token understanding of the topic (cloud-based architecture) before commenting next time haha.


Will Amazon S3 Vectors kill vector databases or save them?