SSDs have become fast, except in the cloud

(databasearchitects.blogspot.com)
589 points greghn | 11 comments
pclmulqdq ◴[] No.39443994[source]
This was a huge technical problem I worked on at Google, and is sort of fundamental to a cloud. I believe this is actually a big deal that drives people's technology directions.

SSDs in the cloud are attached over a network, and fundamentally have to be. The problem is that this network is so large and slow that it can't give you anywhere near the performance of a local SSD. This wasn't a problem for hard drives, which were the backing technology when a lot of these network-attached storage systems were invented, because they are fundamentally slow compared to networks, but it is a problem for SSDs.

_Rabs_ ◴[] No.39444028[source]
So much of this. The number of times I've seen someone complain about slow DB performance when they're connecting to it from a different VPC, and bottlenecking themselves to 100 Mbit/s, is stupidly high.

Literally depending on where things are in a data center... If the two ends are closely coupled, on a 10G line on the same switch and in the same server rack, I bet performance will be so much more consistent.
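
To put rough numbers on the bandwidth gap mentioned above, here is a minimal back-of-the-envelope sketch. It assumes an ideal link with no latency, protocol overhead, or contention, and the helper name `transfer_seconds` is made up for illustration, so treat the results as lower bounds rather than measurements.

```python
def transfer_seconds(size_bytes: int, link_bits_per_second: float) -> float:
    """Ideal time to move size_bytes over a link.

    Assumption: ignores latency, protocol overhead, and contention,
    so real transfers will take longer than this.
    """
    return size_bytes * 8 / link_bits_per_second


GIB = 1024 ** 3  # one gibibyte in bytes

# Compare the two link speeds mentioned in the comment.
for label, rate in [("100 Mbit/s", 100e6), ("10 Gbit/s", 10e9)]:
    print(f"{label}: {transfer_seconds(GIB, rate):.2f} s per GiB")
```

The two rates differ by a factor of 100, so a bulk read that looks instant on the faster link can take over a minute on the slower one.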

1. silverquiet ◴[] No.39444438[source]
> Literally depending on where things are in a data center

I thought cloud was supposed to abstract this away? That's a bit of a sarcastic question from a long-time cloud skeptic, but... wasn't it?

2. doubled112 ◴[] No.39444488[source]
Reality always beats the abstraction. After all, it's just somebody else's computer in somebody else's data center.
3. bombcar ◴[] No.39444553[source]
Which can cause considerable "amusement" depending on the provider. With one I won't name directly, which is much more centered on actually renting racks than on its (now) cloud offering, if you had a virtual machine older than a year or so, deleting and restoring it would get you on a newer "host" and you'd be faster for the same cost.

Otherwise it'd stay on the same physical piece of hardware it was allocated to when new.

4. doubled112 ◴[] No.39444620{3}[source]
Amusing is a good description.

"Hardware degradation detected, please turn it off and back on again"

I could do a migration with zero downtime in VMware for a decade but they can't seamlessly move my VM to a machine that works in 2024? Great, thanks. Amusing.

5. bombcar ◴[] No.39445263{4}[source]
I have always been incredibly saddened that apparently the cloud providers usually have nothing as advanced as old VMware was.
6. kccqzy ◴[] No.39445334[source]
It's more a matter of adding additional abstraction layers. For example, in most public clouds the best you can hope for is to place two things in the same availability zone to get the best performance. But when I worked at Google, internally they had more sophisticated colocation constraints than that: for example, you could require two things to be on the same rack.
7. wmf ◴[] No.39445751{4}[source]
Cloud providers have live migration now but I guess they don't want to guarantee anything.
8. treflop ◴[] No.39446736[source]
Cloud makes provisioning more servers quicker because you are paying someone to basically have a bunch of servers ready to go right away with an API call instead of a phone call, maintained by a team that isn’t yours, with economies of scale working for the provider.

Cloud does not do anything else.

None of these latency/speed problems are cloud-specific. If you have on-premise servers and you are storing your data on network-attached storage, you have the exact same problems (and also the same advantages).

Unfortunately the gap between local and network storage is wide. You win some, you lose some.
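
One way to see that gap for yourself is to time small random reads against a file on each kind of storage. The following is a rough sketch, not a proper benchmark: the function name `random_read_latencies` is made up for illustration, it relies on `os.pread` (Unix-only), and the OS page cache will flatter the numbers unless the file is much larger than RAM or the cache is dropped first.

```python
import os
import random
import statistics
import tempfile
import time


def random_read_latencies(path, block_size=4096, samples=100):
    """Time `samples` random reads of `block_size` bytes from `path`.

    Returns a list of per-read latencies in seconds.
    Assumption: results include page-cache hits, so they are only a rough guide.
    """
    size = os.path.getsize(path)
    # Highest valid start offset + 1; at least 1 so tiny files still work.
    offset_limit = max(1, size - block_size + 1)
    latencies = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for _ in range(samples):
            offset = random.randrange(offset_limit)
            start = time.perf_counter()
            os.pread(fd, block_size, offset)
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return latencies


if __name__ == "__main__":
    # Demo on a throwaway file; point `path` at a file on local or
    # network-attached storage to compare the two.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(1024 * 1024))
        path = tmp.name
    try:
        lat = random_read_latencies(path)
        print(f"median read latency: {statistics.median(lat) * 1e6:.1f} µs")
    finally:
        os.remove(path)
```

Running it once against a file on a local disk and once against a file on a network mount gives a crude side-by-side comparison. A dedicated tool such as fio is the better choice for anything beyond a sanity check.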

9. bombcar ◴[] No.39447261{5}[source]
It's better (and better still with other providers) but I naively thought that "add more RAM" or "add more disk" was something they would be able to do with a reboot at most.

Nope, some require a full backup and restore.

10. wmf ◴[] No.39447719{6}[source]
Resizing VMs doesn't really fit the "cattle" thinking of public cloud, although IMO that was kind of a premature optimization. This would be a perfect use case for live migration.
11. silverquiet ◴[] No.39447952[source]
Oh, I'm not a complete neophyte (in what seems like a different life now, I worked for a big hosting provider actually), I was just surprised that there was a big penalty for cross-VPC traffic implied by the parent poster.