Distributed systems programming has stalled

(www.shadaj.me)

287 points shadaj | 4 comments | 27 Feb 25 16:12 UTC | HN request time: 0.279s | source

Show context

bsnnkv ◴[27 Feb 25 16:52 UTC] No.43196091[source]▶

>>43195702 (OP) #

Last month I switched from a role working on a distributed system (FAANG) to a role working on embedded software which runs on cards in data center racks.

I was in my last role for a year, and 90%+ of my time was spent investigating things that went "missing" at one of many failure points between one of the many distributed components.

I wrote less than 200 lines of code that year and I experienced the highest level of burnout in my professional career.

The technical aspect that contributed the most to this burnout was both the lack of observability tooling and the lack of organizational desire to invest in it. Whenever I would bring up this gap I would be told that we can't spend time/money and wait for people to create "magic tools".

So far the culture in my new embedded (Rust, fwiw) position is the complete opposite. If you're burnt out working on distributed systems and you care about some of the same things that I do, it's worth giving embedded software dev a shot.

replies(24): >>43196122 #>>43196159 #>>43196163 #>>43196180 #>>43196239 #>>43196674 #>>43196899 #>>43196910 #>>43196931 #>>43197177 #>>43197902 #>>43198895 #>>43199169 #>>43199589 #>>43199688 #>>43199980 #>>43200186 #>>43200596 #>>43200725 #>>43200890 #>>43202090 #>>43202165 #>>43205115 #>>43208643 #

EtCepeyd ◴[27 Feb 25 17:07 UTC] No.43196239[source]▶

>>43196091 #

This resonates a lot with me.

Distributed systems require insanely hard math at the bottom (paxos, raft, gossip, vector clocks, ...) It's not how the human brain works natively -- we can learn abstract thinking, but it's very hard. Embedded systems sometimes require the parallelization of some hot spots, but those are more like the exception AIUI, and you have a lot more control over things; everything is more local and sequential. Even data race free multi-threaded programming in modern C and C++ is incredibly annoying; I dislike dealing with both an explicit mesh of peers, and with a leaky abstraction that lies that threads are "symmetric" (as in SMP) while in reality there's a complicated messaging network underneath. Embedded is simpler, and it seems to require less that practitioners become advanced mathematicians for day to day work.

replies(5): >>43196342 #>>43196567 #>>43196906 #>>43197331 #>>43199711 #

1. motorest ◴[27 Feb 25 17:39 UTC] No.43196567[source]▶

>>43196239 #

> Distributed systems require insanely hard math at the bottom (paxos, raft, gossip, vector clocks, ...) It's not how the human brain works natively -- we can learn abstract thinking, but it's very hard.

I think this take is misguided. Most of the systems nowadays, specially those involving any sort of network cals, are already distributed systems. Yet, the amount of systems go even close to touching fancy consensus algorithms is very very limited. If you are in a position to design a system and you hear "Paxos" coming out of your mouth, that's the moment you need to step back and think about what you are doing. Odds are you are creating your own problems, and then blaming the tools.

replies(2): >>43197705 #>>43199817 #

2. convolvatron ◴[27 Feb 25 19:42 UTC] No.43197705[source]▶

>>43196567 (TP) #

this is completely backwards. the tools may have some internal consistency guarantees, handle some classes of failures, etc. They are leaky abstractions that are partially correct. There were not collectively designed to handle all failures and consistent views no matter their composition.

From the other direction, Paxos, two generals, serializability, etc. are not hard concepts at all. Implementing custome solutions in this space _is_ hard and prone to error, but the foundations are simple and sound.

You seem to be claiming that you shouldn't need to understand the latter, that the former gives you everything you need. I would say that if you build systems using existing tools without even thinking about the latter, you're just signing up to handling preventable errors manually and treating this box that you own and black and inscrutable.

3. yodsanklai ◴[27 Feb 25 23:35 UTC] No.43199817[source]▶

>>43196567 (TP) #

I remember when I prepared for system design interviews in FAANG, I was anxious I would get asked about Paxos (which I learned at school). Now that I'm working there, never heard about Paxos or fancy distributed algorithms. We rely on various high-level services for deployment, partitioning, monitoring, logging, service discovery, storage...

And Paxos doesn't require much maths. It's pretty tricky to consider all possible interleavings, but in term of maths, it's really basic discrete maths.

replies(1): >>43202413 #

4. motorest ◴[28 Feb 25 06:33 UTC] No.43202413[source]▶

>>43199817 #

I'm starting to believe these talks of fancy high-complexity solutions come from people who desperately try to come up with convoluted problems they create for themselves only to be able to say they did a fancy high-complexity solution. Instead of going with obvious simple reliable solutions, they opt for convoluted high-complexity unreliable hacks. Then, when they are confronted by the mess they created for themselves, they hide behind the high-complexity of their solution, as if the problem was the solution itself and not making the misjudged call to adopt it.

It's so funny how all of a sudden every single company absolutely must implement Paxos. No exception. Your average senior engineer at a FANG working with global deployments doesn't come close to even hearing about it, but these guys somehow absolutely must have Paxos. Funny.

↑