
287 points shadaj | 2 comments
cmrdporcupine | No.43196622
Two things:

Distributed systems are difficult to reason about.

Computer hardware today is very powerful.

There has been a yo-yo process in our industry over the last 50 years between centralization and distribution. We distribute only when we hit the limits of what centralization can accomplish, because in general centralization is easier to reason about.

When we hit those junctures, there's a flush of effort into distributed systems. The last major example of this I can think of was the 2000-2010 period, when MapReduce, "NoSQL" databases, Google's massive arrays of supposedly identical commodity grey boxes (not the case anymore), the High Scalability blog, etc. were the flavour of the time.

But then, frankly, mass adoption of SSDs, much more powerful computers, etc. made a lot of those things less necessary. The stuff that most people are doing doesn't require a high level of distributed systems sophistication.

Distributed systems are an interesting intellectual puzzle. But they should be a means to an end, not an end in themselves.

replies(3): >>43196814, >>43197204, >>43198508
tonyarkles | No.43198508
> But then, frankly, mass adoption of SSDs, much more powerful computers, etc. made a lot of those things less necessary. The stuff that most people are doing doesn't require a high level of distributed systems sophistication.

I did my MSc in Distributed Systems, and it was always funny (to me) to ask one simple question whenever someone presented performance metrics showing how their distributed system scaled across multiple machines: how long does it take your laptop to process the same dataset? No one ever seemed to have that data.

And then the (in)famous COST paper came out and validated the question I'd been asking for years: https://www.usenix.org/system/files/conference/hotos15/hotos...
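
To make the question concrete, the laptop baseline amounts to something like the single-threaded pass below. It's a minimal sketch: the input path, CSV format, and group-by aggregation are all made up for illustration; swap in whatever workload the cluster benchmark actually ran.

    # cost_baseline.py - single-threaded "laptop baseline" in the spirit of the COST paper.
    # Sketch only: the dataset path, record format, and group-by are hypothetical.
    import sys
    import time
    from collections import Counter

    def main(path: str) -> None:
        counts = Counter()
        n_records = 0
        start = time.perf_counter()
        with open(path, "r", encoding="utf-8", errors="replace") as f:
            for line in f:                   # stream the file, don't load it all
                key = line.split(",", 1)[0]  # hypothetical group-by on the first column
                counts[key] += 1
                n_records += 1
        elapsed = time.perf_counter() - start
        print(f"{n_records} records in {elapsed:.1f}s "
              f"({n_records / elapsed:,.0f} records/s), {len(counts)} distinct keys")

    if __name__ == "__main__":
        main(sys.argv[1])

If the cluster's records/sec doesn't comfortably beat a number like that, the distribution is mostly overhead, which is essentially the COST paper's point.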

replies(1): >>43199038
1. cmrdporcupine | No.43199038
“You can have a second computer once you’ve shown you know how to use the first one.” –Paul Barham

Wow, I love that.

Many people in our profession didn't seem to really notice when the number of IOPS on predominant storage media went from under 200 to well over 100,000 in a matter of just a few years.
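
Back-of-envelope, assuming 4 KiB random writes (the figures are illustrative, just to bracket the "under 200" vs "well over 100,000" range):

    # What the IOPS jump means for random 4 KiB writes; illustrative figures only.
    block = 4 * 1024  # bytes per random write
    for label, iops in [("spinning disk", 200), ("commodity SSD", 100_000)]:
        print(f"{label}: {iops} IOPS ~ {iops * block / 1e6:.1f} MB/s of random writes")
    # -> roughly 0.8 MB/s for the disk vs ~410 MB/s for the SSD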

I remember evaluating and using clusters of stuff like Cassandra back in the late 00s because it just wasn't possible to push enough data to disk to keep up with traffic on a single machine. It's such an insanely different scenario now.

replies(1): >>43199289
2. tonyarkles | No.43199289
My not-super-humble opinion is that people didn't notice because SSDs became mainstream/cheap around the same time cloud migration got popular. Lots of VPS providers offer pretty mediocre IOPS and disk bandwidth on the lower tiers; I'd argue disproportionately so. A $300 desktop from Costco with 8GB of RAM and a 500GB SSD is going to kick the crap out of most 8GB RAM VPSes for IO performance. So… right when rack-mounted servers could affordably provide insane amounts of IO performance, we all quit buying rack-mount servers and didn't notice how much worse off we are with VPSes.
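
If you want to see the gap for yourself, a crude random-read probe along these lines will do. It's a sketch, not a benchmark: the path is hypothetical, the test file should be much larger than RAM so the page cache doesn't flatter the numbers, and a real tool like fio does this properly.

    # iops_probe.py - crude random-read probe: run it on the desktop and on the VPS.
    # Sketch only; PATH is hypothetical and must point at a pre-created multi-GB file.
    import os
    import random
    import time

    PATH = "/tmp/testfile.bin"   # hypothetical test file, ideally larger than RAM
    BLOCK = 4096                 # 4 KiB reads
    DURATION = 10                # seconds to run

    fd = os.open(PATH, os.O_RDONLY)
    blocks = os.fstat(fd).st_size // BLOCK

    ops = 0
    deadline = time.perf_counter() + DURATION
    while time.perf_counter() < deadline:
        os.pread(fd, BLOCK, random.randrange(blocks) * BLOCK)  # one random 4 KiB read
        ops += 1
    os.close(fd)

    print(f"~{ops / DURATION:,.0f} random 4 KiB reads/s")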