285 points ajhit406 | 8 comments
stavros ◴[] No.41832728[source]
This is a really interesting design, but these kinds of smart systems always inhabit an uncanny valley for me. You need them in exactly two cases:

1. You have a really high-load system that you need to figure out some clever ways to scale.

2. You're working on a toy project for fun.

If #2, fine, use whatever you want, it's great.

If this is production, or for Work(TM), you need something proven. If you don't know you need this, you don't need it, go with a boring Postgres database and a VM or something.

If you do know you need this, then you're kind of in a bind: It's not really very mature yet, as it's pretty new, and you're probably going to hit a bunch of weird edge cases, which you probably don't really want to have to debug or live with.

So, who are these systems for, in the end? They're so niche that they can't easily mature and be used by lots of serious players, and they're too complex with too many tradeoffs to be used by 99.9% of companies.

The only people I know for sure are in the target market for this sort of thing are the developers who see something shiny, build a company (or, worse, build someone else's company) on it, and then regret it pretty soon and move to something else (hopefully much more boring).

Does anyone have more insight on this? I'd love to know.

replies(8): >>41832813 #>>41832877 #>>41832980 #>>41832987 #>>41833057 #>>41833093 #>>41833218 #>>41835368 #
1. klabb3 ◴[] No.41833218[source]
Databases are an extremely slow-maturing area, similar to programming languages, but are all deviations from Postgres shiny and hipster?

The idea of colocating data and behavior is really a quantifiable reduction in complexity. It removes latency and bandwidth concerns, which means both operational concerns and development concerns (famously the impact of the N+1 problem is greatly reduced). You can absolutely argue that networked Postgres is better for other reasons (and you may be right) but SQLite is about as boring and predictable as you can get, with known strong advantages. This is the reason it’s getting popular on the server.
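A minimal sketch of why the N+1 problem matters so much less when the database is in-process. The schema and row counts here are made up for illustration; the point is that each of the N extra queries below is a library call into SQLite rather than a network round trip, so the per-query overhead all but disappears:

```python
import sqlite3

# In-process database: every query is a function call, not a network hop.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
""")
db.executemany("INSERT INTO authors VALUES (?, ?)",
               [(i, f"author{i}") for i in range(100)])
db.executemany("INSERT INTO posts VALUES (?, ?, ?)",
               [(i, i % 100, f"post{i}") for i in range(1000)])

# The classic N+1 access pattern: one query for the list, then one query
# per row. Against a networked database each iteration adds a round trip;
# against colocated SQLite the loop stays cheap.
authors = db.execute("SELECT id, name FROM authors").fetchall()
posts_by_author = {
    author_id: db.execute(
        "SELECT title FROM posts WHERE author_id = ?", (author_id,)
    ).fetchall()
    for author_id, _name in authors
}
```

This does not make N+1 free (a single JOIN is still fewer round trips into the engine), but it moves the cost from milliseconds per query to microseconds.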

That said, I don’t like the idea of creating many small databases very much - as they suggest with Durable Objects. That gives noSQL nightmares - breaking all kinds of important invariants of relational dbs. I think it’s much preferable to use SQLite as a monolithic database like it’s done in their D1 product.

replies(4): >>41833285 #>>41833308 #>>41834216 #>>41834497 #
2. crabmusket ◴[] No.41833285[source]
> That gives noSQL nightmares - breaking all kinds of important invariants of relational dbs

IMO Durable Objects map well to use cases where there actually are documents. Think of Figma. There is a ton of data that lives inside the literal Figma document. It would be awful to have a relational table for like "shapes" with one row per rectangle across Figma's entire customer base. That's just not an appropriate use of a relational database.

So let's say I built Figma on MongoDB, where each Figma document is a Mongo document. That corresponds fairly straightforwardly to each Figma document being a Durable Object instance, using either the built-in noSQL storage that Durable Objects already have, or a small Sqlite relational database which does have a "shapes" table, but only containing the shapes in this one document.
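A sketch of the one-small-database-per-document shape being described, using plain stdlib sqlite3 rather than Cloudflare's actual Durable Objects API (the schema and function names are illustrative, not from any real product):

```python
import sqlite3

def open_document(doc_id: str) -> sqlite3.Connection:
    """One small SQLite database per document. In-memory here for the
    sketch; a Durable Object would instead persist its own storage."""
    db = sqlite3.connect(":memory:")
    db.execute("""CREATE TABLE IF NOT EXISTS shapes (
        id   INTEGER PRIMARY KEY,
        kind TEXT NOT NULL,
        x REAL, y REAL, w REAL, h REAL
    )""")
    return db

# The shapes table in each database only ever holds this one document's
# shapes, so queries need no doc_id column and no global index.
doc = open_document("design-review")
doc.execute("INSERT INTO shapes (kind, x, y, w, h) VALUES ('rect', 0, 0, 10, 20)")
rects = doc.execute("SELECT kind, w, h FROM shapes").fetchall()
```

The trade-off is exactly the one debated upthread: queries within a document are small and fast, but anything spanning all documents (migrations, analytics) now means touching every database.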

replies(2): >>41833997 #>>41836913 #
3. ◴[] No.41833308[source]
4. jchanimal ◴[] No.41833997[source]
We are wrestling with questions like this on the new document database we’re building. A database should correspond to some administrative domain object.

Today in Fireproof a database is a unit of sharing, but we are working toward a broader model where a database corresponds to an individual application's state. So one database holds all the shared documents, not just a single unit of sharing.

These small changes early on can have a big impact later. If you're interested in these sorts of design questions, the Fireproof Discord is where we are hashing out the v0.20 api.

(I was an early contributor to Apache CouchDB. Damien Katz, creator of CouchDB, is helping with engineering and raised these questions recently, along with other team members.)

5. masterj ◴[] No.41834216[source]
If you adopt a wide-column db like Cassandra or DynamoDB, don’t you have to pick a shard for your table? The idea behind Durable Objects seems similar
replies(1): >>41834628 #
6. 8n4vidtmkvmk ◴[] No.41834497[source]
N+1 problem is also reduced if you keep your one and only server next to your one and only database.

This was actually the solution we came up with at a very big global company. Well, not 1 server, but 1 data center. If your write leaders are all in one place, it apparently doesn't matter that everything else is global, at least for certain write requests.

7. simpsond ◴[] No.41834628[source]
You have a row key, which gets consistently hashed to a shard / node on the ring.
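A toy sketch of that idea: hash each node to points on a ring, then route a row key to the first node point at or after the key's hash. Node names and vnode count here are invented for illustration, and real systems (Cassandra, DynamoDB) layer replication and rebalancing on top of this:

```python
import bisect
import hashlib

def ring_hash(key: str) -> int:
    # Stable hash onto the ring (Python's built-in hash() is salted per run).
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    """Toy consistent-hash ring: each node owns the arc ending at its points."""
    def __init__(self, nodes, vnodes=64):
        # Multiple virtual points per node smooth out the key distribution.
        self.points = sorted(
            (ring_hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes)
        )
        self.keys = [p for p, _ in self.points]

    def node_for(self, row_key: str) -> str:
        i = bisect.bisect(self.keys, ring_hash(row_key)) % len(self.points)
        return self.points[i][1]

ring = Ring(["node-a", "node-b", "node-c"])
owner = ring.node_for("user:42")  # the same key always lands on the same node
```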
8. klabb3 ◴[] No.41836913[source]
> Durable Objects map well to use cases where there actually are documents

Right. I wouldn’t dispute this. This is akin to a file format from software back in the day (like, say, Photoshop, but now with multiplayer). What this means is that you get different compatibility boundaries, and you relinquish centralized control and the ability to do transparent migrations and analysis. For all intents and purposes, the documents should be more or less opaque and self-contained. I personally like this, but I also recognize that most web engineers of our current generation are not used to thinking in this disciplined and defensive way upfront.