
379 points Sirupsen | 2 comments
bigbones ◴[] No.40920788[source]
Sounds like a source-unavailable version of Quickwit? https://quickwit.io/
replies(2): >>40920922 #>>40943710 #
pushrax ◴[] No.40920922[source]
LSM tree storage engine vs time series storage engine, similar philosophy but different use cases
replies(1): >>40923436 #
singhrac ◴[] No.40923436[source]
Maybe I misunderstood both products but I think neither Quickwit nor Turbopuffer is either of those things intrinsically (though log structured messages are a good fit for Quickwit). I think Quickwit is essentially Lucene/Elasticsearch (i.e. sparse queries or BM25) and Turbopuffer does vector search (or dense queries) like say Faiss/Pinecone/Qdrant/Vectorize, both over object storage.
replies(1): >>40926621 #
1. pushrax ◴[] No.40926621[source]
It's true that turbopuffer does vector search, though it also does BM25.

The biggest difference at a low level is that turbopuffer records have unique primary keys, and can be updated, like in a normal database. Old records that were overwritten won't be returned in searches. The LSM tree storage engine is used to achieve this. The LSM tree also enables maintenance of global indexes that can be used for efficient retrieval without any time-based filter.
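
To make the overwrite behaviour concrete, here is a toy sketch in Python of how an LSM-style store can let a newer write shadow an older one for the same primary key. It is not turbopuffer's implementation: the `TinyLSM` class, its method names, and the substring "search" are all made up for illustration, and a real engine would add compaction, tombstones, and proper indexes.

```python
from typing import Dict, List, Optional


class TinyLSM:
    """Toy LSM tree: one mutable memtable plus immutable flushed segments.

    Hypothetical illustration only, not turbopuffer's actual design.
    """

    def __init__(self) -> None:
        self.memtable: Dict[str, str] = {}
        self.segments: List[Dict[str, str]] = []  # ordered oldest to newest

    def upsert(self, key: str, value: str) -> None:
        # Writes only touch the memtable; older segments are never modified.
        self.memtable[key] = value

    def flush(self) -> None:
        # Freeze the current memtable as the newest segment.
        if self.memtable:
            self.segments.append(self.memtable)
            self.memtable = {}

    def get(self, key: str) -> Optional[str]:
        # Check newest data first so the latest version of a key wins.
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):
            if key in segment:
                return segment[key]
        return None

    def search(self, term: str) -> List[str]:
        # Merge oldest to newest so newer versions replace older ones,
        # then match only against the surviving version of each key.
        live: Dict[str, str] = {}
        for segment in self.segments:
            live.update(segment)
        live.update(self.memtable)
        return sorted(key for key, value in live.items() if term in value)


db = TinyLSM()
db.upsert("doc1", "red apple")
db.flush()
db.upsert("doc1", "green pear")  # overwrites doc1 by primary key
print(db.search("apple"))  # the old version is shadowed, so no match
print(db.search("pear"))
```

The point of the sketch is only that the old "red apple" version still exists in an older segment but is never returned, because lookups resolve each key to its newest version.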

Quickwit records are immutable. You can't overwrite a record (well, you can, but overwritten records will also be returned in searches). The data files it produces are organized into a time series, and if you don't pass a time-based filter it has to look at every file.
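
As a rough sketch of the contrast, the Python below models immutable files that each cover a time range. It is not Quickwit's API or file format: `Split`, `search`, and the integer timestamps are assumptions for illustration. It shows the two effects described above: a time filter lets whole files be skipped, and a record "overwritten" by a later write still comes back alongside the new one.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Record = Tuple[int, str, str]  # (timestamp, key, text)


@dataclass(frozen=True)
class Split:
    """Hypothetical immutable data file covering [start, end] in time."""
    start: int
    end: int
    records: Tuple[Record, ...]


def search(
    splits: List[Split],
    term: str,
    time_range: Optional[Tuple[int, int]] = None,
) -> Tuple[List[Record], int]:
    """Return (matching records, number of splits actually scanned)."""
    hits: List[Record] = []
    scanned = 0
    for split in splits:
        if time_range is not None:
            lo, hi = time_range
            # Skip files whose time range cannot overlap the filter.
            if split.end < lo or split.start > hi:
                continue
        scanned += 1
        for record in split.records:
            ts, _key, text = record
            if time_range is not None and not (time_range[0] <= ts <= time_range[1]):
                continue
            if term in text:
                hits.append(record)
    return hits, scanned


splits = [
    Split(0, 9, ((1, "a", "error: disk"),)),
    Split(10, 19, ((12, "a", "error: disk full"),)),  # same key written again
    Split(20, 29, ((25, "b", "ok"),)),
]

all_hits, scanned_all = search(splits, "error")
print(len(all_hits), scanned_all)  # both versions of key "a"; every split scanned

recent_hits, scanned_recent = search(splits, "error", time_range=(10, 19))
print(len(recent_hits), scanned_recent)  # only the overlapping split is scanned
```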

replies(1): >>40927816 #
2. singhrac ◴[] No.40927816[source]
Ah I didn’t catch that Quickwit had immutable records. That explains the focus on log usage. Thanks!