←back to thread

SSDs have become fast, except in the cloud

(databasearchitects.blogspot.com)
589 points greghn | 1 comments | | HN request time: 0.966s | source
Show context
pclmulqdq ◴[] No.39443994[source]
This was a huge technical problem I worked on at Google, and is sort of fundamental to a cloud. I believe this is actually a big deal that drives peoples' technology directions.

SSDs in the cloud are attached over a network, and fundamentally have to be. The problem is that this network is so large and slow that it can't give you anywhere near the performance of a local SSD. This wasn't a problem for hard drives, which was the backing technology when a lot of these network attached storage systems were invented, because they are fundamentally slow compared to networks, but it is a problem for SSD.

replies(30): >>39444009 #>>39444024 #>>39444028 #>>39444046 #>>39444062 #>>39444085 #>>39444096 #>>39444099 #>>39444120 #>>39444138 #>>39444328 #>>39444374 #>>39444396 #>>39444429 #>>39444655 #>>39444952 #>>39445035 #>>39445917 #>>39446161 #>>39446248 #>>39447169 #>>39447467 #>>39449080 #>>39449287 #>>39449377 #>>39449994 #>>39450169 #>>39450172 #>>39451330 #>>39466088 #
jsnell ◴[] No.39444096[source]
According to the submitted article, the numbers are from AWS instance types where the SSD is "physically attached" to the host, not about SSD-backed NAS solutions.

Also, the article isn't just about SSDs being no faster than a network. It's about SSDs being two orders of magnitude slower than datacenter networks.

replies(3): >>39444161 #>>39444353 #>>39448728 #
pclmulqdq ◴[] No.39444161[source]
It's because the "local" SSDs are not actually physically attached and there's a network protocol in the way.
replies(14): >>39444222 #>>39444248 #>>39444253 #>>39444261 #>>39444341 #>>39444352 #>>39444373 #>>39445175 #>>39446024 #>>39446163 #>>39446271 #>>39446742 #>>39446840 #>>39446893 #
mike_hearn ◴[] No.39444261[source]
They do this because they want SSDs to be in a physically separate part of the building for operational reasons, or what's the point in giving you a "local" SSD that isn't actually plugged into the real machine?
replies(2): >>39444961 #>>39446759 #
ianburrell ◴[] No.39444961[source]
The reason for having most instances use network storage is that it makes possible migrating instances to other hosts. If the host fails, the network storage can be pointed at the new host with a reboot. AWS sends out notices regularly when they are going to reboot or migrate instances.

Their probably should be more local instance storage types for using with instances that can be recreated without loss. But it is simple for them to have a single way of doing things.

At work, someone used fast NVMe instance storage for Clickhouse which is a database. It was a huge hassle to copy data when instances were going to be restarted because the data would be lost.

replies(3): >>39445385 #>>39446444 #>>39447003 #
youngtaff ◴[] No.39446444[source]
> At work, someone used fast NVMe instance storage for Clickhouse which is a database. It was a huge hassle to copy data when instances were going to be restarted because the data would be lost.

This post on how Discord RAIDed local NVMe volumes with slower remote volumes might be on interest https://discord.com/blog/how-discord-supercharges-network-di...

replies(1): >>39447097 #
1. ianburrell ◴[] No.39447097[source]
We moved to running Clickhouse on EKS with EBS volumes for storage. It can better survive instances going down. I didn't work on it so don't how much slower it is. Lowering the management burden was big priority.