
SSDs have become fast, except in the cloud

(databasearchitects.blogspot.com)
589 points | greghn | 1 comment
pclmulqdq ◴[] No.39443994[source]
This was a huge technical problem I worked on at Google, and it is sort of fundamental to a cloud. I believe this is actually a big deal that drives people's technology directions.

SSDs in the cloud are attached over a network, and fundamentally have to be. The problem is that this network is so large and slow that it can't give you anywhere near the performance of a local SSD. This wasn't a problem for hard drives, which were the backing technology when a lot of these network-attached storage systems were invented, because hard drives are fundamentally slow compared to networks, but it is a problem for SSDs.

vlovich123 ◴[] No.39444024[source]
Why do they fundamentally need to be network attached storage instead of local to the VM?
1. drewda ◴[] No.39444197[source]
The major clouds do offer VMs with fast local storage, such as NVMe SSDs attached directly to the VM host machine:

- https://cloud.google.com/compute/docs/disks/local-ssd

- https://learn.microsoft.com/en-us/azure/virtual-machines/ena...

- https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ssd-inst...

They sell these VMs at a higher cost because they require more expensive components and are limited to host machines with certain configurations. In our experience, it's also harder to request quota increases to get more of these VMs -- some of the public clouds have a limited supply of these specific configurations in some regions/zones.

As others have noted, instance storage isn't as dependable. But it can be the most performant way to do IO-intense processing or to power one node of a distributed database.
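
To check how much faster local instance storage is for a given workload, one rough approach is to time random small reads against a file on each kind of disk. The snippet below is a minimal sketch, not a rigorous benchmark: the function name `random_read_latency_us`, the 4 KiB block size, and the sample count are illustrative assumptions, and it relies on `os.pread`, which is only available on Unix-like systems. Reads may also be served from the OS page cache, so a test file much larger than RAM (or a dedicated tool such as fio) is needed for numbers that reflect the device itself.

```python
import os
import random
import time


def random_read_latency_us(path, block_size=4096, samples=1000):
    """Return (median, p99) latency in microseconds for random block reads.

    Hypothetical helper for illustration; block_size and samples are
    arbitrary defaults, not values taken from the thread.
    """
    size = os.path.getsize(path)
    blocks = size // block_size
    if blocks == 0:
        raise ValueError("file is smaller than one block")

    fd = os.open(path, os.O_RDONLY)
    try:
        latencies = []
        for _ in range(samples):
            # Pick a random block-aligned offset inside the file.
            offset = random.randrange(blocks) * block_size
            start = time.perf_counter()
            os.pread(fd, block_size, offset)
            latencies.append((time.perf_counter() - start) * 1e6)
    finally:
        os.close(fd)

    latencies.sort()
    median = latencies[len(latencies) // 2]
    p99 = latencies[min(len(latencies) - 1, int(len(latencies) * 0.99))]
    return median, p99
```

Running it once against a file on the local SSD mount and once against a file on the network-attached volume gives a quick side-by-side of median and tail latency for the same access pattern.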