Distributed systems programming has stalled

1. margorczynski ◴[27 Feb 25 18:20 UTC] No.43196933[source]▶

>>43195702 (OP) #

Distributed systems are cool, but most people don't really grasp how much complexity they introduce, which leads to fad-driven decisions like using Event Sourcing where there is no fundamental need for it. I've seen projects get burned by the complexity and overhead it brings in cases where "simpler" approaches worked well and were easy to extend and fix. Bugs become hard to find and fix, features take much longer to add, and there are plenty of other "goodies" that the blogs with toy examples don't mention.

replies(3): >>43196997 #>>43197027 #>>43204750 #

2. nine_k ◴[27 Feb 25 18:27 UTC] No.43196997[source]▶

>>43196933 (TP) #

The best recipe I know is to start from a modular monolith [1] and split it when and if you need to scale way past a few dozen nodes.

Event sourcing is a logical structure; you can implement it with SQLite or even flat files, locally, if your problem domain is served well by it. Adding Kafka as the first step is most likely costly overkill.

[1]: https://awesome-architecture.com/modular-monolith/
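To make the "SQLite or even flat files" point concrete, here is a minimal sketch of a local event store using Python's standard sqlite3 module. The table layout, the event kinds ("deposited", "withdrawn") and the helper names open_store, append and balance are all assumptions made up for illustration, not something from the linked article.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) an append-only event table. Schema is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
        " account TEXT NOT NULL,"
        " kind TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    return db


def append(db, account, kind, **payload):
    """Record a fact; existing rows are never updated or deleted."""
    with db:  # commits on success, rolls back on error
        cur = db.execute(
            "INSERT INTO events (account, kind, payload) VALUES (?, ?, ?)",
            (account, kind, json.dumps(payload)),
        )
    return cur.lastrowid


def balance(db, account):
    """Derive current state by replaying the account's events in order."""
    total = 0
    rows = db.execute(
        "SELECT kind, payload FROM events WHERE account = ? ORDER BY seq",
        (account,),
    )
    for kind, payload in rows:
        cents = json.loads(payload)["cents"]
        if kind == "deposited":
            total += cents
        elif kind == "withdrawn":
            total -= cents
    return total
```

The log is the source of truth and the balance is only a fold over it, which is the whole "logical structure"; no broker is involved. Passing a file path instead of ":memory:" would persist it locally.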

replies(1): >>43197153 #

3. rjbwork ◴[27 Feb 25 18:30 UTC] No.43197027[source]▶

>>43196933 (TP) #

If the choice to build a distributed system has already been made (outside of the engineer's control...), is it then a good idea for the engineer to choose Event Sourcing?

4. margorczynski ◴[27 Feb 25 18:43 UTC] No.43197153[source]▶

>>43196997 #

What you're speaking of is a need/usability-based design and extension where you design the solution with certain "safety valves" that let you scale it up when needed.

This is in contrast to the fad-driven design and over-engineering that I'm speaking of (here I simply used ES as an example), which is usually introduced because someone in power saw a blog post or a 1h talk and it looked cool. And Kafka will be used because it is the most "scalable" and shiny solution, without any pros-vs-cons analysis.

5. mrkeen ◴[28 Feb 25 12:12 UTC] No.43204750[source]▶

>>43196933 (TP) #

In my experience:

1) We are surrounded by distributed systems all the time. When we buy and sell B2B software, we don't know what's stored in our partners' databases, and they don't know what's in ours. Who should ask whom, and when? If the data sources disagree, whose is correct? Just being given access to a REST API and a couple of webhooks is all it takes to be in full distributed systems land.

2) I honestly do not know of a better approach than event-sourcing (i.e. replicated state machine) to coordinate among multiple masters like this. The only technique I can think of that comes close is Paxos - which does not depend on events. But then the first thing I would do if I only had Paxos would be to use it to bootstrap some kind of event system on top of it.

Even the non-event-sourcing technologies like DBs use events (journals, write-ahead logs, sstables, etc.) in their own implementation. (However, that does not imply that you're getting events 'for free' by using these systems.)

My co-workers do not put any alternatives forward. Reading a database, deciding what action to take, and then carrying out said action is basically the working definition of a race condition. Bankers and accountants had this figured out thousands of years ago: a bank can't send a wagon across the country with a query like "How much money is in Joe's account?", wait a week for the reply, and then send a second wagon saying "Update Joe's account so it has $36.43 in it now". It's laughable. But now that we have 50-150ms latencies, we feel comfortable doing GETs and POSTs (with a million times more traffic) and somehow think we're not going to get our numbers wrong.
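The read-decide-write problem can be shown without any threads or network, just by interleaving two clients by hand. This is only a sketch: the Account class and the two function names are hypothetical, and the "ordered" variant stands in for a single owner applying commands in log order, not for any particular product.

```python
class Account:
    """Toy account holding a balance in cents (hypothetical example type)."""

    def __init__(self, cents):
        self.cents = cents
        self.log = []

    # Style 1: callers fetch state, decide elsewhere, write back an absolute value.
    def get(self):
        return self.cents

    def put(self, cents):
        self.cents = cents

    # Style 2: callers send what happened; the owner decides and records it in order.
    def withdraw(self, cents):
        if cents > self.cents:
            self.log.append(("rejected", cents))
            return False
        self.cents -= cents
        self.log.append(("withdrawn", cents))
        return True


def racy_withdrawals():
    acct = Account(10_000)
    seen_a = acct.get()        # client A reads 10000
    seen_b = acct.get()        # client B reads 10000 before A has written
    acct.put(seen_a - 8_000)   # A pays out 8000 and writes 2000
    acct.put(seen_b - 8_000)   # B pays out 8000 and also writes 2000
    return acct.cents          # 16000 left the bank, the books say 8000 did


def ordered_withdrawals():
    acct = Account(10_000)
    results = [acct.withdraw(8_000), acct.withdraw(8_000)]
    return acct.cents, results
```

In the racy version both clients act on the same stale snapshot and the second write silently overwrites the first. In the ordered version the second withdrawal is evaluated against the state left by the first and is rejected.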

Like, what's an alternative? I have a shiny billion-dollar fully-ACID SQL db with my customer accounts in it. And my SaaS partner bank also has that technology. Put forward literally any idea other than events that will let us coordinate their accounts such that they're not able to double-spend money, or are prevented from spending money if a node is down. I want an alternative to event sourcing.
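For what it's worth, one possible shape of the event-based side of this, as a sketch under assumptions: each event carries a unique id, both parties apply the same events in the same order, and a redelivered event (say, a retry after a node was down) is ignored rather than applied twice. The Ledger class and the event fields ("id", "account", "delta_cents") are invented for illustration and say nothing about how any real bank integrates.

```python
class Ledger:
    """Applies events at most once, keyed by a hypothetical unique event id."""

    def __init__(self):
        self.applied = set()
        self.balances = {}

    def apply(self, event):
        """Return True if the event changed state, False if it was a duplicate."""
        if event["id"] in self.applied:
            return False  # redelivery after a retry: safe to ignore
        account = event["account"]
        new_balance = self.balances.get(account, 0) + event["delta_cents"]
        if new_balance < 0:
            raise ValueError("insufficient funds")
        self.balances[account] = new_balance
        self.applied.add(event["id"])
        return True


def replay(events):
    """Build a ledger from an ordered event stream, duplicates included."""
    ledger = Ledger()
    for event in events:
        ledger.apply(event)
    return ledger
```

Two parties that replay the same ordered stream end up with the same balances even if one of them saw some events twice. Agreeing on that order is the hard part, and this sketch does not address it.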

replies(1): >>43204979 #

6. margorczynski ◴[28 Feb 25 12:41 UTC] No.43204979[source]▶

>>43204750 #

Again - do not fixate on the ES thing, as it was put forward only as an example. You're presenting a case where, after analyzing the given scenario and weighing the alternatives, this is the best solution, whereas I'm speaking about introducing unnecessary complexity just because the tech is cool and trendy.