Distributed systems programming has stalled

(www.shadaj.me)

287 points shadaj | 1 comments | 27 Feb 25 16:12 UTC


bsnnkv ◴[27 Feb 25 16:52 UTC] No.43196091[source]▶

>>43195702 (OP) #

Last month I switched from a role working on a distributed system (FAANG) to a role working on embedded software which runs on cards in data center racks.

I was in my last role for a year, and 90%+ of my time was spent investigating things that went "missing" at one of many failure points between one of the many distributed components.

I wrote less than 200 lines of code that year and I experienced the highest level of burnout in my professional career.

The technical aspect that contributed the most to this burnout was both the lack of observability tooling and the lack of organizational desire to invest in it. Whenever I would bring up this gap I would be told that we can't spend time/money and wait for people to create "magic tools".

So far the culture in my new embedded (Rust, fwiw) position is the complete opposite. If you're burnt out working on distributed systems and you care about some of the same things that I do, it's worth giving embedded software dev a shot.

replies(24): >>43196122 #>>43196159 #>>43196163 #>>43196180 #>>43196239 #>>43196674 #>>43196899 #>>43196910 #>>43196931 #>>43197177 #>>43197902 #>>43198895 #>>43199169 #>>43199589 #>>43199688 #>>43199980 #>>43200186 #>>43200596 #>>43200725 #>>43200890 #>>43202090 #>>43202165 #>>43205115 #>>43208643 #

jasonjayr ◴[27 Feb 25 16:55 UTC] No.43196122[source]▶

>>43196091 #

> Whenever I would bring up this gap I would be told that we can't spend time and wait for people to create "magic tools".

That sounds like an awful organizational ethos. 30hrs to make a "magic tool" to save 300hrs across the organization sounds like a no-brainer to anyone paying attention. It sounds like they didn't even want to invest in out-sourced "magic tools" to help either.

replies(2): >>43196181 #>>43196562 #

1. cmrdporcupine ◴[27 Feb 25 17:38 UTC] No.43196562[source]▶

>>43196122 #

Consider that there is a class of human motivation / work culture that considers "figuring it out" to be the point of the job and just accepts or embraces complexity as "that's what I'm paid to do" and gets an ego-satisfaction from it. Why admit weakness? I can read the logs by timestamp and resolve the confusions from the CAP theorem from there!

Excessive drawing of boxes and lines, and the production of systems around them becomes a kind of Glass Bead Game. "I'm paid to build abstractions and then figure out how to keep them glued together!" Likewise, recomposing events in your head from logs, or from side effects -- that's somehow the marker of being good at your job.

The same kind of motivation underlies people who eschew or disparage GUI debuggers (log statements should be good enough or you're not a real programmer), too.

Investing in observability tools means admitting that the complexity might overwhelm you.

As an older software engineer the complexity overwhelmed me a long time ago and I strongly believe in making the machines do analysis work so I don't have to. Observability is a huge part of that.

Also, many people need to be shown what observability tools / frameworks can do for them, as they may not have had prior exposure.

And back to the topic of the whole thread, too: can we back up and admit that distributed systems are questionable as an end in themselves? They're a means to an end, and distributing something should be considered only when a simpler, monolithic system (that is easier to reason about) no longer suffices.

Finally, I find that the original authors of systems are generally not the ones interested in building out observability hooks and tools, because their experience writing the system makes the way it works (or doesn't work) naturally intuitive to them.
