
287 points shadaj | 1 comment | source
1. gregw2 ◴[] No.43198525[source]
What I find missing from this analysis of distributed-systems programming is any recognition or discussion of how distributed databases (and data lakes) that decouple storage from compute have changed the art of the possible.

In the old days of databases, if you put all your data in one place, you could scale up (SMP), but scaling out (MPP) was genuinely challenging. Nowadays, you (via Iceberg) or a DB vendor (Snowflake, Databricks, BigQuery, even BigTable, etc.) can put all your data on S3/GCS/ADLS and scale out compute to match read traffic as much as you want, as long as you accept something like a snapshot-isolation read level and traffic is largely read-only, or writes are spread across your tables rather than all hitting one big table.

You can now share data across your different compute nodes or applications/systems via permissions and pointers managed by a cloud metadata/catalog service. In a sense, you get microservice databases without each service needing a completely separate datastore.
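The sharing model can be sketched the same way: access is a catalog-level grant, not a copy of the data, so each service reads the one shared table in object storage. This is an illustrative toy, not any vendor's API; the `grants` table and `can_read` helper are hypothetical.

```python
# Hypothetical catalog-level grants: per-service permissions on shared tables.
grants = {"events": {"billing-svc": "read", "etl-svc": "write"}}

def can_read(service, table):
    """A service may scan a table if the catalog grants it read or write."""
    return grants.get(table, {}).get(service) in ("read", "write")

print(can_read("billing-svc", "events"))  # True -- shared table, no copy
print(can_read("search-svc", "events"))   # False -- no grant in the catalog
```

The point is that the catalog, not per-service datastores, becomes the unit of isolation: revoking a grant cuts off access without moving or duplicating any data.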