Distributed systems programming has stalled

(www.shadaj.me)

287 points shadaj | 2 comments | 27 Feb 25 16:12 UTC


bsnnkv ◴[27 Feb 25 16:52 UTC] No.43196091[source]▶

>>43195702 (OP) #

Last month I switched from a role working on a distributed system (FAANG) to a role working on embedded software which runs on cards in data center racks.

I was in my last role for a year, and 90%+ of my time was spent investigating things that went "missing" at one of many failure points between one of the many distributed components.

I wrote less than 200 lines of code that year and I experienced the highest level of burnout in my professional career.

The technical aspect that contributed the most to this burnout was both the lack of observability tooling and the lack of organizational desire to invest in it. Whenever I would bring up this gap I would be told that we can't spend time/money and wait for people to create "magic tools".

So far the culture in my new embedded (Rust, fwiw) position is the complete opposite. If you're burnt out working on distributed systems and you care about some of the same things that I do, it's worth giving embedded software dev a shot.

replies(24): >>43196122 #>>43196159 #>>43196163 #>>43196180 #>>43196239 #>>43196674 #>>43196899 #>>43196910 #>>43196931 #>>43197177 #>>43197902 #>>43198895 #>>43199169 #>>43199589 #>>43199688 #>>43199980 #>>43200186 #>>43200596 #>>43200725 #>>43200890 #>>43202090 #>>43202165 #>>43205115 #>>43208643 #

alabastervlog ◴[27 Feb 25 18:16 UTC] No.43196899[source]▶

>>43196091 #

I've found the rush to distributed computing when it's not strictly necessary kinda baffling. The costs in complexity are extreme. I can't imagine the median company doing this stuff is actually getting either better uptime or performance out of it—sure, it maybe recovers better if something breaks, maybe if you did everything right and regularly test that stuff (approximately nobody does though), but there's also so very much more crap that can break in the first place.

Plus: far worse performance ("but it scales smoothly" OK but your max probable scale, which I'll admit does seem high on paper if you've not done much of this stuff before, can fit on one mid-size server, you've just forgotten how powerful computers are because you've been in cloud-land too long...) and crazy-high costs for related hardware(-equivalents), resources, and services.

All because we're afraid to shell into an actual server and tail a log, I guess? I don't know what else it could be aside from some allergy to doing things the "old way"? I dunno man, seems way simpler and less likely to waste my whole day trying to figure out why, in fact, the logs I need weren't fucking collected in the first place, or got buried in some damn corner of our Cloud I'll never find without writing a 20-line "log query" in some awful language I never use for anything else, in some shitty web dashboard.

Fewer, or cheaper, personnel? I've never seen cloud transitions do anything but the opposite.

It's like the whole industry went collectively insane at the same time.

[EDIT] Oh, and I forgot, for everything you gain in cloud capabilities it seems like you lose two or three things that are feasible when you're running your own servers. Simple shit that's just "add two lines to the nginx config and do an apt-install" becomes three sprints of custom work or whatever, or just doesn't happen because it'd be too expensive. I don't get why someone would give that stuff up unless they really, really had to.

[EDIT EDIT] I get that this rant is more about "the cloud" than distributed systems per se, but trying to build "cloud native" is the way that most orgs accidentally end up dealing with distributed systems in a much bigger way than they have to.

replies(10): >>43197578 #>>43197608 #>>43197740 #>>43199134 #>>43199560 #>>43201628 #>>43201737 #>>43202751 #>>43204072 #>>43225726 #

motorest ◴[28 Feb 25 07:31 UTC] No.43202751[source]▶

>>43196899 #

> I've found the rush to distributed computing when it's not strictly necessary kinda baffling.

I'm not entirely sure you understand the problem domain, or even the high-level problem. There isn't and never was a "rush" to distributed computing.

What you actually have is this global epiphany that having multiple computers communicating over a network to do something actually has a name, and it's called distributed computing.

This means that we had (and still have) guys like you who look at distributed systems and somehow do not understand they are looking at distributed systems. They don't understand that mundane things like a mobile app supporting authentication or someone opening a webpage or email is a distributed system. They don't understand that the discussion on monolith vs microservices is orthogonal to the topic of distributed systems.

So the people railing against distributed systems are essentially complaining about their own ignorance and failure to actually understand the high-level problem.

You have two options: acknowledge that, unless you're writing a desktop app that does nothing over a network, odds are every single application you touch is a node in a distributed system, or keep fooling yourself into believing it isn't. I mean, if a webpage fails to load then you just hit F5, right? And if your app just fails to fetch something from a service you just restart it, right? That can't possibly be a distributed system, and those scenarios can't possibly be mitigated by basic distributed computing strategies, can they?
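For illustration, hitting F5 by hand is an informal version of one such basic strategy: retrying a transient failure with capped exponential backoff and jitter. The following is only a minimal sketch; the function name `retry_with_backoff`, its parameters, and the choice of `ConnectionError` as the "transient" error are assumptions made for the example, not anything taken from a specific library or from the linked article.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1,
                       max_delay=2.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is exhausted.

    Hypothetical helper: only ConnectionError is treated as transient here.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                # Out of attempts: surface the failure to the caller.
                raise
            # Exponential backoff, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: sleep a random amount up to the computed delay,
            # so many clients do not retry in lockstep.
            sleep(random.uniform(0, delay))
```

The `sleep` parameter is injectable so the helper can be exercised without real waiting. Retrying like this is only safe when the operation is idempotent, which is itself a distributed-systems design question.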

Everything is simple to those who do not understand the problem, and those who do are just making things up.

replies(1): >>43206886 #

1. lucyjojo ◴[28 Feb 25 15:44 UTC] No.43206886[source]▶

>>43202751 #

you and the guy you are answering to are not talking the same language (technically yes but you are putting different meanings to the same words).

this would lead to a pointless conversation, if it were to ever happen.

replies(1): >>43217147 #

2. motorest ◴[01 Mar 25 08:10 UTC] No.43217147[source]▶

>>43206886 (TP) #

> you and the guy you are answering to are not talking the same language (technically yes but you are putting different meanings to the same words).

That's the point, isn't it? It's simply wrong to assert that there's a rush to distributed systems when they are already ubiquitous in the real world, even if this comes as a surprise to people like OP. Get acquainted with the definition of distributed computing, and look at reality.

The only epiphany taking place is people looking at distributed systems and thinking that, yes, perhaps they should be treated as distributed systems. Perhaps the interfaces between multiple microservices are points of failure, but replacing them with a monolith does not make it less of a distributed system. Worse, taking down your monolith is also a failure mode, one with higher severity. How do you mitigate that failure mode? Well, educate yourself about distributed computing.

If you look at a distributed system and call it something other than a distributed system, are you really speaking a different language, or are you simply misguided?
