Turbopuffer: Fast search on object storage

(turbopuffer.com)

379 points | Sirupsen | 1 comment | 09 Jul 24 14:48 UTC

eknkc [09 Jul 24 21:29 UTC] No.40921379

Is there a good general-purpose solution where I can store a large read-only database in S3 or something and do lookups directly on it?

DuckDB can open Parquet files over HTTP and query them, but I found that it triggers a lot of small requests, reading from a bunch of places in the files. I mean a lot.

I mostly need key/value lookups and could potentially store each key in a separate object in S3, but for a couple hundred million objects... It would be a lot more manageable to have a single file and maybe a cacheable index.
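
To make the single-file-plus-index idea concrete, here is a minimal sketch, not a tested production layout. It assumes values are packed back to back in one blob and that a small sorted index of keys and (offset, length) spans is cached client-side. `build_store`, `read_range`, and `lookup` are hypothetical names, and `read_range` stands in for an HTTP GET with a `Range: bytes=start-end` header against the object.

```python
import bisect


def build_store(items):
    """Pack key/value pairs into one blob plus a small sorted index.

    Returns (blob, keys, spans): keys is a sorted list and spans[i] is the
    (offset, length) of the value for keys[i]. Hypothetical layout, not a
    standard file format.
    """
    blob = bytearray()
    keys, spans = [], []
    for key in sorted(items):
        data = items[key].encode("utf-8")
        keys.append(key)
        spans.append((len(blob), len(data)))
        blob += data
    return bytes(blob), keys, spans


def read_range(blob, offset, length):
    """Stand-in for one ranged GET against the single large object."""
    return blob[offset:offset + length]


def lookup(blob, keys, spans, key):
    """Binary-search the cached index, then fetch exactly one byte range."""
    i = bisect.bisect_left(keys, key)
    if i == len(keys) or keys[i] != key:
        return None  # key not present, no request needed
    offset, length = spans[i]
    return read_range(blob, offset, length).decode("utf-8")
```

With this shape each lookup costs one ranged request once the index is cached. For a couple hundred million keys the index itself would probably need to be sparse or tiered, which this sketch does not attempt.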

tionis [10 Jul 24 01:02 UTC] No.40922842

>>40921379 #

You could use a SQLite database and query it over HTTP range requests, using something like this: https://github.com/psanford/sqlite3vfshttp https://github.com/phiresky/sql.js-httpvfs

Simon Willison wrote about it: https://simonwillison.net/2022/Aug/10/sqlite-http/
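
As a rough illustration of why this works: a SQLite file is an array of fixed-size pages, so a page read maps to one byte range. The sketch below is a stand-in under stated assumptions, not the code of either linked project. `fetch_range` reads a local file where a real VFS would issue an HTTP `Range` request, and `make_demo_db`, `page_size`, and `page_range` are hypothetical helper names. The header details used (magic string at offset 0, big-endian page size at offset 16, with 1 meaning 65536) come from the SQLite file format.

```python
import sqlite3
import struct


def make_demo_db(path):
    """Create a small key/value database to read back by byte range."""
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
    rows = [(f"key{i}", f"value{i}") for i in range(1000)]
    con.executemany("INSERT INTO kv VALUES (?, ?)", rows)
    con.commit()
    con.close()


def fetch_range(path, start, end):
    """Stand-in for HTTP GET with `Range: bytes=start-end` (inclusive)."""
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(end - start + 1)


def page_size(path):
    """Read the page size from the 100-byte SQLite header."""
    header = fetch_range(path, 0, 99)
    if header[:16] != b"SQLite format 3\x00":
        raise ValueError("not a SQLite database file")
    (size,) = struct.unpack(">H", header[16:18])
    return 65536 if size == 1 else size


def page_range(page_number, size):
    """Inclusive byte range covering a 1-indexed SQLite page."""
    start = (page_number - 1) * size
    return start, start + size - 1
```

A lookup by primary key only walks a few B-tree pages, so over HTTP it would need a handful of ranged requests rather than a download of the whole file.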

arcanemachiner [10 Jul 24 05:50 UTC] No.40924060

>>40922842 #

That whole thing still blows my mind.
