

SSDs have become fast, except in the cloud

(databasearchitects.blogspot.com)
589 points by greghn | 12 comments
pclmulqdq ◴[] No.39443994[source]
This was a huge technical problem I worked on at Google, and is sort of fundamental to a cloud. I believe this is actually a big deal that drives people's technology directions.

SSDs in the cloud are attached over a network, and fundamentally have to be. The problem is that this network is so large and slow that it can't give you anywhere near the performance of a local SSD. This wasn't a problem for hard drives, which were the backing technology when a lot of these network-attached storage systems were invented, because hard drives are fundamentally slow compared to networks, but it is a problem for SSDs.
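A back-of-envelope sketch of that point (all figures are my own rough assumptions, not from the comment): a network hop is noise next to an HDD seek, but is comparable to an NVMe SSD read, so network attachment costs SSDs proportionally far more.

```python
# Rough latency comparison (assumed figures):
HDD_SEEK_US = 8_000   # ~8 ms average seek + rotational delay
NVME_READ_US = 80     # ~80 us random 4K read on a local NVMe SSD
NET_RTT_US = 50       # ~50 us round trip inside a datacenter

# Relative overhead a network hop adds to each storage access.
hdd_overhead = NET_RTT_US / HDD_SEEK_US   # well under 1%
ssd_overhead = NET_RTT_US / NVME_READ_US  # a majority of the access time

print(f"HDD: network adds {hdd_overhead:.1%} per access")
print(f"SSD: network adds {ssd_overhead:.1%} per access")
```

With these assumed numbers the network hop is a sub-1% tax on an HDD access but more than a 60% tax on an SSD access, which is the asymmetry the comment describes.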

replies(30): >>39444009 #>>39444024 #>>39444028 #>>39444046 #>>39444062 #>>39444085 #>>39444096 #>>39444099 #>>39444120 #>>39444138 #>>39444328 #>>39444374 #>>39444396 #>>39444429 #>>39444655 #>>39444952 #>>39445035 #>>39445917 #>>39446161 #>>39446248 #>>39447169 #>>39447467 #>>39449080 #>>39449287 #>>39449377 #>>39449994 #>>39450169 #>>39450172 #>>39451330 #>>39466088 #
jsnell ◴[] No.39444096[source]
According to the submitted article, the numbers are from AWS instance types where the SSD is "physically attached" to the host, not about SSD-backed NAS solutions.

Also, the article isn't just about SSDs being no faster than a network. It's about SSDs being two orders of magnitude slower than datacenter networks.
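As a rough sanity check on that gap (figures below are assumptions for illustration, not taken from the article): even a single common datacenter NIC carries more raw bandwidth than the throughput many cloud "local" SSD offerings advertise, so the network link alone doesn't explain the shortfall.

```python
# Back-of-envelope bandwidth comparison (assumed figures):
NIC_GBIT_PER_S = 100                 # common datacenter NIC speed
nic_gb_per_s = NIC_GBIT_PER_S / 8    # 12.5 GB/s of raw link bandwidth

cloud_ssd_gb_per_s = 1.0             # ~1 GB/s advertised cloud "local" SSD (assumed)
local_nvme_gb_per_s = 7.0            # ~7 GB/s modern PCIe 4.0 drive (assumed)

print(f"NIC capacity:      {nic_gb_per_s:.1f} GB/s")
print(f"Cloud 'local' SSD: {cloud_ssd_gb_per_s:.1f} GB/s "
      f"({nic_gb_per_s / cloud_ssd_gb_per_s:.0f}x below the NIC)")
print(f"Modern local NVMe: {local_nvme_gb_per_s:.1f} GB/s")
```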

replies(3): >>39444161 #>>39444353 #>>39448728 #
pclmulqdq ◴[] No.39444161[source]
It's because the "local" SSDs are not actually physically attached and there's a network protocol in the way.
replies(14): >>39444222 #>>39444248 #>>39444253 #>>39444261 #>>39444341 #>>39444352 #>>39444373 #>>39445175 #>>39446024 #>>39446163 #>>39446271 #>>39446742 #>>39446840 #>>39446893 #
1. candiddevmike ◴[] No.39444253[source]
Depends on the cloud provider. Local SSDs are physically attached to the host on GCP, but that makes them only useful for temporary storage.
replies(3): >>39444326 #>>39444754 #>>39445986 #
2. pclmulqdq ◴[] No.39444326[source]
If you're at G, you should read the internal docs on exactly how this happens and it will be interesting.
replies(3): >>39444529 #>>39450240 #>>39450805 #
3. rfoo ◴[] No.39444529[source]
Why would I lose all data on these SSDs when I initiate a power off of the VM on console, then?

I believe local SSDs are definitely attached to the host. They are just not exposed via NVMe ZNS, hence the performance hit.

replies(2): >>39444859 #>>39445006 #
4. amluto ◴[] No.39444754[source]
Which is a weird sort of limitation. For any sort of you-own-the-hardware arrangement, NVMe disks are fine for long term storage. (Obviously one should have backups, but that’s a separate issue. One should have a DR plan for data on EBS, too.)

You need to migrate that data if you replace an entire server, but this usually isn’t a very big deal.

replies(1): >>39444869 #
5. manquer ◴[] No.39444859{3}[source]
It is because on reboot you may not get the same physical server. They are not rebooting the physical server for you, just the VM.

The same host is not allocated for a variety of reasons: scheduled maintenance, proximity to other hosts on the VPC, balancing quiet and noisy neighbors, and so on.

It is not that the disk is always wiped; sometimes the data is still there on reboot. There is just no guarantee, which lets them freely move VMs between hosts.

replies(1): >>39448758 #
6. supriyo-biswas ◴[] No.39444869[source]
This is Hyrum’s law at play: AWS wants to make sure that the instance stores aren’t seen as persistent, and therefore enforce the failure mode for normal operations as well.

You should also see how they enforce similar things for their other products and APIs, for example, most of their services have encrypted pagination tokens.

7. res0nat0r ◴[] No.39445006{3}[source]
Your EC2 instance with instance-store storage, when stopped, can be launched on any other random host in the AZ when you power it back on. Your root disk is an EBS volume attached across the network, so when you start your instance back up you will likely land somewhere else, in an empty slot with empty local storage. This is why there is always a disclaimer that this local storage is ephemeral, and you shouldn't count on it being around long-term.
replies(1): >>39446333 #
8. throwawaaarrgh ◴[] No.39445986[source]
Yes, that's what their purpose is in cloud applications: temporary high performance storage only.

If you want long term local storage you'll have to reserve an instance host.

9. mrcarrot ◴[] No.39446333{4}[source]
I think the parent was agreeing with you. If the “local” SSDs _weren’t_ actually local, then presumably they wouldn’t need to be ephemeral since they could be connected over the network to whichever host your instance was launched on.
10. mr_toad ◴[] No.39448758{4}[source]
Data persists between reboots, but not shutdowns:

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-inst...

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-inst...

11. jsolson ◴[] No.39450240[source]
In most cases, they're physically plugged into a PCIe CEM slot in the host.

There is no network in the way; you are either misinformed or thinking of a different product.

12. seedless-sensat ◴[] No.39450805[source]
Why are you projecting Google's internal architecture onto AWS? Your Google mental model is not correct here.