
SSDs have become fast, except in the cloud

(databasearchitects.blogspot.com)
589 points greghn | 1 comments
pclmulqdq No.39443994
This was a huge technical problem I worked on at Google, and it is sort of fundamental to a cloud. I believe this is actually a big deal that drives people's technology directions.

SSDs in the cloud are attached over a network, and fundamentally have to be. The problem is that this network is so large and slow that it can't give you anywhere near the performance of a local SSD. This wasn't a problem for hard drives, which were the backing technology when a lot of these network-attached storage systems were invented, because hard drives are fundamentally slow compared to networks. But it is a problem for SSDs.

vlovich123 No.39444024
Why do they fundamentally need to be network-attached storage instead of local to the VM?
pclmulqdq No.39444065
Reliability. SSDs break and screw up a lot more frequently and more quickly than CPUs. Amazon has published a lot on the architecture of EBS, and they go through a good analysis of this. If a locally attached disk breaks, you have a broken machine.

RAID helps you locally, but fundamentally relies on locality and low latency (and maybe custom hardware) to minimize the time window where you get true data corruption on a bad disk. That is insufficient for cloud storage.

1. vlovich123 No.39450096
Sure, but there's plenty of software that's written to use distributed, unreliable storage, similar to how cloud providers write their own software (e.g. Kafka). I can understand that many applications just need something like EBS that's durable but looks like a normal disk, but I'm not so sure it's a fundamentally required abstraction.