
379 points Sirupsen | 1 comment
eknkc ◴[] No.40921379[source]
Is there a good general-purpose solution where I can store a large read-only database in S3 or something and do lookups directly on it?

DuckDB can open Parquet files over HTTP and query them, but I found that it triggers a lot of small requests, reading from a bunch of places in the files. I mean a lot.

I mostly need key/value lookups and could potentially store each key in a separate object in S3, but with a couple hundred million objects, it would be a lot more manageable to have a single file and maybe a cacheable index.
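
To make the "single file plus cacheable index" idea concrete, here is a rough sketch in Python. The layout is made up for illustration: values are concatenated into one blob, and a small sorted index of (key, offset, length) is fetched once and cached. `build_blob`, `lookup`, and `http_range_header` are hypothetical names, and the reader is any callable that returns a byte range, so it could wrap an S3 or HTTP range GET.

```python
import bisect
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical layout: one data blob of concatenated values, plus a small
# sorted index of (key, offset, length) that a client fetches once and caches.
IndexEntry = Tuple[bytes, int, int]
RangeReader = Callable[[int, int], bytes]  # (offset, length) -> bytes


def build_blob(items: Dict[bytes, bytes]) -> Tuple[bytes, List[IndexEntry]]:
    """Pack values back to back in key order and record where each one lives."""
    blob = bytearray()
    index: List[IndexEntry] = []
    for key in sorted(items):
        value = items[key]
        index.append((key, len(blob), len(value)))
        blob += value
    return bytes(blob), index


def lookup(index: List[IndexEntry], read_range: RangeReader, key: bytes) -> Optional[bytes]:
    """Binary-search the cached index, then fetch at most one byte range."""
    # Rebuilt on every call to keep the sketch short; a real client would
    # keep the key list around next to the cached index.
    keys = [entry[0] for entry in index]
    i = bisect.bisect_left(keys, key)
    if i == len(index) or index[i][0] != key:
        return None
    _, offset, length = index[i]
    if length == 0:
        return b""  # nothing to fetch for an empty value
    return read_range(offset, length)


def http_range_header(offset: int, length: int) -> Dict[str, str]:
    """Header for a range GET; the end of an HTTP byte range is inclusive."""
    if length <= 0:
        raise ValueError("a byte range needs a positive length")
    return {"Range": f"bytes={offset}-{offset + length - 1}"}
```

Each lookup then costs one range request against the big object, and only the index has to live in a cache. With a couple hundred million keys the index itself gets large, so it would probably need to be sparse or split into levels; that part is not shown here.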

replies(5): >>40922137 #>>40922166 #>>40922842 #>>40923712 #>>40927099 #
tionis ◴[] No.40922842[source]
You could use a SQLite database and query it with HTTP range requests, using something like this: https://github.com/psanford/sqlite3vfshttp https://github.com/phiresky/sql.js-httpvfs

Simon Willison wrote about it: https://simonwillison.net/2022/Aug/10/sqlite-http/
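
The core trick is that a SQLite file is a sequence of fixed-size pages, so a page read can be turned into a byte range. A minimal sketch of that mapping in Python, not taken from either linked project (the function names are hypothetical): the page size sits at byte offset 16 of the 100-byte file header as a big-endian 16-bit value, where 1 means 65536, and pages are numbered from 1.

```python
import struct
from typing import Tuple

SQLITE_MAGIC = b"SQLite format 3\x00"


def parse_page_size(header: bytes) -> int:
    """Read the page size from the first 100 bytes of a SQLite file."""
    if len(header) < 100 or header[:16] != SQLITE_MAGIC:
        raise ValueError("not a SQLite 3 database header")
    (raw,) = struct.unpack(">H", header[16:18])
    return 65536 if raw == 1 else raw  # the stored value 1 encodes 65536


def page_byte_range(page_number: int, page_size: int) -> Tuple[int, int]:
    """Inclusive (start, end) byte offsets of a 1-indexed page."""
    if page_number < 1:
        raise ValueError("SQLite pages are numbered from 1")
    start = (page_number - 1) * page_size
    return start, start + page_size - 1


def range_header(page_number: int, page_size: int) -> str:
    """Value of the HTTP Range header that fetches exactly one page."""
    start, end = page_byte_range(page_number, page_size)
    return f"bytes={start}-{end}"
```

A client would fetch the first 100 bytes once to learn the page size, then request only the pages a query touches. An indexed key lookup walks a few B-tree pages, so it should need a handful of requests rather than a download of the whole file.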

replies(2): >>40924060 #>>40924633 #
1. arcanemachiner ◴[] No.40924060[source]
That whole thing still blows my mind.