
379 points by Sirupsen | 1 comment
omneity No.40922055
> In 2022, production-grade vector databases were relying on in-memory storage

This is irking me. pgvector has existed since before then, doesn't require in-memory storage, and can definitely handle vector search over 100M+ documents with decent performance. Did they have a particular requirement somewhere?

replies(1): >>40922076
jbellis No.40922076
Have you tried it? pgvector performance falls off a cliff once the working set no longer fits in RAM. Vector search isn't like "normal" workloads, where access follows a nice Pareto distribution and caching a small hot set covers most queries.
replies(1): >>40922426
omneity No.40922426
Tried and deployed in production with similarly sized collections.

You only need enough memory to load the index, definitely not the whole collection. A typical index would most likely fit within a few GBs. And even if you need dozens of GBs of RAM, it won’t cost anywhere near the $20k/month the article surmises.

replies(1): >>40926357
lyu07282 No.40926357
How do you get to "a few GBs"? A hundred million embeddings at 1024 dimensions with 4-byte floats would be >400 GB on their own.
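
Back-of-the-envelope in Python (a sketch assuming float32 vectors and zero storage overhead):

    # Raw footprint of 100M float32 embeddings at 1024 dims, no index overhead.
    n_vectors = 100_000_000
    dims = 1024
    bytes_per_float = 4

    total_gb = n_vectors * dims * bytes_per_float / 1e9
    print(f"{total_gb:.0f} GB")  # -> 410 GB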
replies(1): >>40928308
omneity No.40928308
I did say the index, not the embeddings themselves. The index is a more compact representation of your embeddings collection, and that's what you need in memory. One approach to indexing is to compute centroids of your embeddings and keep only those resident, as IVF-style indexes do.
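
To make that concrete, here's a toy sketch of the centroid idea in Python (illustrative only: the sizes are shrunk, the names are made up, and the random-partition "clustering" stands in for the k-means a real system would run):

    import numpy as np

    # Toy IVF-style index: only the centroids stay resident; vectors live
    # in their "inverted lists" and only a few lists are scanned per query.
    rng = np.random.default_rng(0)
    embeddings = rng.standard_normal((10_000, 256)).astype(np.float32)

    n_lists = 100  # more lists = finer partitions, bigger centroid table

    # Crude centroids: means of random partitions (real systems run k-means).
    parts = np.array_split(rng.permutation(len(embeddings)), n_lists)
    centroids = np.stack([embeddings[p].mean(axis=0) for p in parts])

    # Assign each vector to its nearest centroid, expanding ||x - c||^2
    # so we never materialize the full (n, n_lists, dims) tensor.
    d2 = ((embeddings**2).sum(1, keepdims=True)
          - 2.0 * embeddings @ centroids.T
          + (centroids**2).sum(1))
    assignments = d2.argmin(axis=1)

    def search(query, k=10, n_probe=8):
        # Rank centroids by distance, then scan only the closest lists.
        probe = np.argsort(((centroids - query) ** 2).sum(1))[:n_probe]
        cand = np.flatnonzero(np.isin(assignments, probe))
        qd = ((embeddings[cand] - query) ** 2).sum(1)
        return cand[np.argsort(qd)[:k]]

    print(search(embeddings[0]))  # should rank vector 0 first

The memory you can't avoid is the centroid table and the assignments, not the full vector collection; the vectors themselves can stay on disk and be read only for the probed lists.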

You have multiple parameters to tweak that affect retrieval performance as well as the memory footprint of your indexes. Here's a rundown: https://tembo.io/blog/vector-indexes-in-pgvector
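
For concreteness, a minimal sketch of those knobs from Python (the table/column names and the DSN are made up; `lists`, `ivfflat.probes`, `vector_l2_ops`, and the `<->` operator are real pgvector features):

    import psycopg2

    conn = psycopg2.connect("dbname=mydb")  # hypothetical DSN
    cur = conn.cursor()

    # IVFFlat partitions the collection into `lists` clusters and keeps
    # the centroids resident. pgvector's docs suggest rows/1000 lists up
    # to ~1M rows and sqrt(rows) beyond that (sqrt(100M) = 10,000).
    cur.execute("""
        CREATE INDEX ON documents
        USING ivfflat (embedding vector_l2_ops)
        WITH (lists = 10000);
    """)
    conn.commit()

    # At query time, probing more lists raises recall at the cost of
    # scanning a larger fraction of the collection.
    cur.execute("SET ivfflat.probes = 40;")

    query_vector = [0.0] * 1024  # placeholder embedding
    vec_literal = "[" + ",".join(map(str, query_vector)) + "]"
    cur.execute(
        "SELECT id FROM documents ORDER BY embedding <-> %s::vector LIMIT 10;",
        (vec_literal,),
    )
    print(cur.fetchall())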