334 points gjvc | 4 comments
throwaway892238 ◴[] No.31849720[source]
This is the future of databases, but nobody seems to realize it yet.

One of the biggest problems with databases (particularly SQL ones) is they're a giant pile of mutable state. The whole idea of "migrations" exists because it is impossible to "just" revert an arbitrary change to a database, or to diff or merge changes automatically. You need some kind of intelligent tool or framework to generate DDL, DML, and DCL; the statements have to be applied in turn; something has to check if they've already been applied; etc. And of course you can't roll back a change once it's been applied, unless you create even more program logic to figure out how to do that. It's all a big hack.

By treating a database as version-controlled, you can treat any operation as immutable. Make any change you want and don't worry about conflicts. You can always just go back to the last working version, revert a specific change, merge in one or more changes from different working databases. Make a thousand changes a day, and when one breaks, revert it. No snapshotting and slowly restoring the whole database due to a non-reversible change. Somebody dropped the main table in prod? Just revert the drop. Need to make a change to the prod database but the staging database is different? Branch the prod database, make a change, test it, merge back into prod.
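
To make the idea concrete, here is a minimal toy sketch in Python of a table whose history is a chain of immutable snapshots. It is not Dolt's storage engine or API; `VersionedTable` and its methods are hypothetical names invented for illustration. It assumes row values are never `None` (which is used to mean "delete") and it leaves out merging entirely.

```python
# Toy model only: NOT Dolt's storage engine or API. All names are hypothetical.
class VersionedTable:
    def __init__(self):
        # commit id -> (parent id, snapshot of all rows at that commit)
        self.commits = {0: (None, {})}
        self.branches = {"main": 0}
        self._next_id = 1

    def rows(self, branch="main"):
        """Return a copy of the rows at the head of a branch."""
        return dict(self.commits[self.branches[branch]][1])

    def commit(self, branch, changes):
        """Apply {key: value} changes (value None = delete) as a new commit."""
        parent = self.branches[branch]
        snapshot = dict(self.commits[parent][1])  # older snapshots are never mutated
        for key, value in changes.items():
            if value is None:
                snapshot.pop(key, None)
            else:
                snapshot[key] = value
        cid = self._next_id
        self._next_id += 1
        self.commits[cid] = (parent, snapshot)
        self.branches[branch] = cid
        return cid

    def branch(self, new, source="main"):
        """Create a branch pointing at the same commit as `source`."""
        self.branches[new] = self.branches[source]

    def revert(self, branch, cid):
        """Undo one earlier commit by committing the inverse of its diff."""
        parent, after = self.commits[cid]
        before = self.commits[parent][1]
        inverse = {
            key: before.get(key)
            for key in set(before) | set(after)
            if before.get(key) != after.get(key)
        }
        return self.commit(branch, inverse)


db = VersionedTable()
db.commit("main", {"alice": 10, "bob": 20})
bad = db.commit("main", {"alice": None})   # a row "dropped" by mistake
db.commit("main", {"carol": 30})           # unrelated later change
db.revert("main", bad)                     # undo only the bad commit
print(db.rows())
```

Because each commit keeps its own snapshot, the revert only has to compare the bad commit with its parent and re-apply the difference in reverse; the later, unrelated change survives. A real system would share structure between snapshots rather than copying every row.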

The effect is going to be as radical as the popularization of containers. Whether you like them or not, they are revolutionizing an industry and are a productivity force multiplier.

replies(11): >>31849825 #>>31849875 #>>31849951 #>>31850566 #>>31850778 #>>31851109 #>>31851356 #>>31852067 #>>31853553 #>>31858826 #>>31865675 #
1. blowski ◴[] No.31849825[source]
It looks incredible, but somehow seems too good to be true.

What are the tradeoffs here? When wouldn't I want to use this?

replies(1): >>31849955 #
2. timsehn ◴[] No.31849955[source]
Creator here.

It's slower. This is `sysbench` Dolt vs MySQL.

https://docs.dolthub.com/sql-reference/benchmarks/latency

We've dedicated this year to performance with a storage engine rewrite. We'll have some performance wins coming in the back half of the year. We think we can get under 2X MySQL.

It also requires more disk. Each change takes, on average, at least 4K on disk. So, you might need more/bigger hard drives.

replies(2): >>31850114 #>>31851592 #
3. EarthLaunch ◴[] No.31850114[source]
Another commenter noted a need for migrations in order to handle rollbacks without data loss.
4. kragen ◴[] No.31851592[source]
(Disclaimer: I haven't tried Dolt.)

In your benchmark it's only 2.1–7.4 times slower than MySQL, average 4.4. And any database someone could fit on a disk 20 years ago (I forget, maybe 8 GB?) fits in RAM now, which makes it about 256 times faster, which is a lot bigger than 4.4. You can get a 20 TB disk now, which is enough space. So anything that could be done with MySQL 20 years ago can be done faster and cheaper with Dolt now, which covers, I think the technical term is, a fucking shitload of applications. It probably includes literally every 20th-century application of relational databases.

Well, except for things that have over 5 billion transactions (20 TB ÷ 4 kB/txn) over their lifetime, I guess, so it might be important to find a way to compact that 4K. 5 billion transactions is 19 months at 100 TPS. If you could get that down to 256 bytes it would be almost 25 years of 100 TPS.
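
As a rough check of the arithmetic above, here is a back-of-envelope sketch. It assumes decimal units (20 TB = 20×10^12 bytes, 4 kB = 4000 bytes) and a 365.25-day year; `DISK_BYTES`, `TPS`, and `lifetime` are illustrative names, not anything from Dolt.

```python
# Back-of-envelope check of the disk-lifetime figures (assumed decimal units).
DISK_BYTES = 20 * 10**12               # 20 TB disk
TPS = 100                              # sustained transactions per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def lifetime(bytes_per_txn):
    """Return (transactions that fit on the disk, years to fill it at TPS)."""
    txns = DISK_BYTES // bytes_per_txn
    years = txns / TPS / SECONDS_PER_YEAR
    return txns, years


for size in (4000, 256):
    txns, years = lifetime(size)
    print(f"{size} B/txn: {txns:,} txns, {years:.1f} years ({years * 12:.0f} months)")
```

Under those assumptions the 4 kB case comes out at 5 billion transactions and about 19 months, and the 256-byte case at just under 25 years, matching the figures quoted.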

Also, as I understand it, and correct me if I'm wrong here, that 4.4× slowdown buys you a bulletproof and highly performant and scalable strategy for backups (with PITR), staging servers, data warehousing, read slaves, disk error detection and recovery, cryptographically secure audit logs, bug reproduction, and backtesting, along with the legal security the Apache 2 license gives you, which you don't have with Datomic.

Sounds fantastic! It sounds like you're selling its performance a bit short. If someone is really concerned about such a small performance loss they probably aren't really in the market for a new RDBMS.