
379 points by Sirupsen | 1 comment
omneity No.40922055
> In 2022, production-grade vector databases were relying on in-memory storage

This is irking me. pgvector has existed since before then, doesn't require in-memory storage, and can definitely handle vector search over 100M+ documents with decent performance. Did they have a particular requirement somewhere?

replies(1): >>40922076
jbellis No.40922076
Have you tried it? pgvector performance falls off a cliff once the working set no longer fits in RAM. Vector search isn't like "normal" workloads, where access follows a nice Pareto distribution and caching a small hot set covers most queries.
replies(1): >>40922426
omneity No.40922426
Tried and deployed in production with similarly sized collections.

You only need enough memory to load the index, definitely not the whole collection. A typical index would most likely fit within a few GBs. And even if you need dozens of GBs of RAM, it won’t cost anywhere near the $20k/month the article surmises.

replies(1): >>40926357
lyu07282 No.40926357
How do you get to "a few GBs"? A hundred million embeddings at 1024 dimensions with 4-byte floats would be >400 GB on their own.
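
Back-of-the-envelope in Python (a sketch assuming float32 vectors and zero storage overhead):

    # Raw footprint of 100M float32 embeddings at 1024 dims, no index overhead.
    n_vectors = 100_000_000
    dims = 1024
    bytes_per_float = 4

    total_gb = n_vectors * dims * bytes_per_float / 1e9
    print(f"{total_gb:.0f} GB")  # -> 410 GB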
replies(1): >>40928308
omneity No.40928308
I did say the index, not the embeddings themselves. The index is a more compact representation of your embeddings collection, and that's what you need in memory. One approach to indexing is to compute centroids of your embeddings and keep only those resident, as IVF-style indexes do.
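
To make that concrete, here's a toy sketch of the centroid idea in Python (illustrative only: the sizes are shrunk, the names are made up, and the random-partition "clustering" stands in for the k-means a real system would run):

    import numpy as np

    # Toy IVF-style index: only the centroids stay resident; vectors live
    # in their "inverted lists" and only a few lists are scanned per query.
    rng = np.random.default_rng(0)
    embeddings = rng.standard_normal((10_000, 256)).astype(np.float32)

    n_lists = 100  # more lists = finer partitions, bigger centroid table

    # Crude centroids: means of random partitions (real systems run k-means).
    parts = np.array_split(rng.permutation(len(embeddings)), n_lists)
    centroids = np.stack([embeddings[p].mean(axis=0) for p in parts])

    # Assign each vector to its nearest centroid, expanding ||x - c||^2
    # so we never materialize the full (n, n_lists, dims) tensor.
    d2 = ((embeddings**2).sum(1, keepdims=True)
          - 2.0 * embeddings @ centroids.T
          + (centroids**2).sum(1))
    assignments = d2.argmin(axis=1)

    def search(query, k=10, n_probe=8):
        # Rank centroids by distance, then scan only the closest lists.
        probe = np.argsort(((centroids - query) ** 2).sum(1))[:n_probe]
        cand = np.flatnonzero(np.isin(assignments, probe))
        qd = ((embeddings[cand] - query) ** 2).sum(1)
        return cand[np.argsort(qd)[:k]]

    print(search(embeddings[0]))  # should rank vector 0 first

The memory you can't avoid is the centroid table and the assignments, not the full vector collection; the vectors themselves can stay on disk and be read only for the probed lists.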

You have multiple parameters to tweak that affect retrieval performance as well as the memory footprint of your indexes. Here's a rundown: https://tembo.io/blog/vector-indexes-in-pgvector
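
For concreteness, a minimal sketch of those knobs from Python (the table/column names and the DSN are made up; `lists`, `ivfflat.probes`, `vector_l2_ops`, and the `<->` operator are real pgvector features):

    import psycopg2

    conn = psycopg2.connect("dbname=mydb")  # hypothetical DSN
    cur = conn.cursor()

    # IVFFlat partitions the collection into `lists` clusters and keeps
    # the centroids resident. pgvector's docs suggest rows/1000 lists up
    # to ~1M rows and sqrt(rows) beyond that (sqrt(100M) = 10,000).
    cur.execute("""
        CREATE INDEX ON documents
        USING ivfflat (embedding vector_l2_ops)
        WITH (lists = 10000);
    """)
    conn.commit()

    # At query time, probing more lists raises recall at the cost of
    # scanning a larger fraction of the collection.
    cur.execute("SET ivfflat.probes = 40;")

    query_vector = [0.0] * 1024  # placeholder embedding
    vec_literal = "[" + ",".join(map(str, query_vector)) + "]"
    cur.execute(
        "SELECT id FROM documents ORDER BY embedding <-> %s::vector LIMIT 10;",
        (vec_literal,),
    )
    print(cur.fetchall())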