
SSDs have become fast, except in the cloud

(databasearchitects.blogspot.com)
589 points by greghn | 18 comments
pclmulqdq ◴[] No.39443994[source]
This was a huge technical problem I worked on at Google, and it is sort of fundamental to a cloud. I believe this is actually a big deal that drives people's technology directions.

SSDs in the cloud are attached over a network, and fundamentally have to be. The problem is that this network is so large and slow that it can't give you anywhere near the performance of a local SSD. This wasn't a problem for hard drives, which were the backing technology when a lot of these network-attached storage systems were invented, because hard drives are fundamentally slow compared to networks. But it is a problem for SSDs.
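
One way to see the gap concretely is to time small random reads against each device. The snippet below is a minimal sketch, assuming a Linux host, root access, and a hypothetical device path such as /dev/nvme1n1; it uses O_DIRECT so the page cache doesn't hide device latency. Run it once against an instance-store device and once against a network-attached volume and compare the percentiles.

    # Minimal sketch: 4 KiB random-read latency on a block device (Linux).
    # The device path is a hypothetical example; pass your own as argv[1].
    import mmap
    import os
    import random
    import statistics
    import sys
    import time

    DEV = sys.argv[1] if len(sys.argv) > 1 else "/dev/nvme1n1"
    BLOCK = 4096
    SAMPLES = 1000

    fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)   # O_DIRECT bypasses the page cache
    size = os.lseek(fd, 0, os.SEEK_END)
    buf = mmap.mmap(-1, BLOCK)                     # page-aligned buffer, required by O_DIRECT

    latencies = []
    for _ in range(SAMPLES):
        offset = random.randrange(0, size // BLOCK) * BLOCK
        start = time.perf_counter()
        os.preadv(fd, [buf], offset)               # one aligned 4 KiB random read
        latencies.append(time.perf_counter() - start)

    os.close(fd)
    print(f"p50 = {statistics.median(latencies) * 1e6:.0f} us, "
          f"p99 = {statistics.quantiles(latencies, n=100)[98] * 1e6:.0f} us")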

replies(30): >>39444009 #>>39444024 #>>39444028 #>>39444046 #>>39444062 #>>39444085 #>>39444096 #>>39444099 #>>39444120 #>>39444138 #>>39444328 #>>39444374 #>>39444396 #>>39444429 #>>39444655 #>>39444952 #>>39445035 #>>39445917 #>>39446161 #>>39446248 #>>39447169 #>>39447467 #>>39449080 #>>39449287 #>>39449377 #>>39449994 #>>39450169 #>>39450172 #>>39451330 #>>39466088 #
jsnell ◴[] No.39444096[source]
According to the submitted article, the numbers are from AWS instance types where the SSD is "physically attached" to the host, not from SSD-backed NAS solutions.

Also, the article isn't just about SSDs being no faster than a network. It's about SSDs being two orders of magnitude slower than datacenter networks.

replies(3): >>39444161 #>>39444353 #>>39448728 #
pclmulqdq ◴[] No.39444161[source]
It's because the "local" SSDs are not actually physically attached and there's a network protocol in the way.
replies(14): >>39444222 #>>39444248 #>>39444253 #>>39444261 #>>39444341 #>>39444352 #>>39444373 #>>39445175 #>>39446024 #>>39446163 #>>39446271 #>>39446742 #>>39446840 #>>39446893 #
jsnell ◴[] No.39444373[source]
I think you're wrong about that. AWS calls this class of storage "instance storage" [0], and defines it as:

> Many Amazon EC2 instances can also include storage from devices that are located inside the host computer, referred to as instance storage.

There might be some wiggle room in "physically attached", but there's none in "storage devices located inside the host computer". It's not some kind of AWS-only thing either. GCP has "local SSD disks"[1], which I'm going to claim are likewise local, not over-the-network block storage. (Though the language isn't as explicit as for AWS.)

[0] https://aws.amazon.com/ec2/instance-types/

[1] https://cloud.google.com/compute/docs/disks#localssds

replies(5): >>39444464 #>>39445545 #>>39447509 #>>39449306 #>>39450882 #
1. 20after4 ◴[] No.39444464[source]
If the SSD is installed in the host server, doesn't that still allow for it to be shared among many instances running on said host? I can imagine that a compute node has just a handful of SSDs and many hundreds of instances sharing the I/O bandwidth.
replies(5): >>39444584 #>>39444763 #>>39444820 #>>39445938 #>>39446130 #
2. discodave ◴[] No.39444584[source]
If you have one of the metal instance types, then you get the whole host, e.g. i4i.metal:

https://aws.amazon.com/ec2/instance-types/i4i/

3. aeyes ◴[] No.39444763[source]
On AWS, yes. The older instances I'm familiar with had 900 GB drives, which were sliced up into volumes of 600, 450, 300, 150, or 75 GB depending on instance size.

But they also tell you how many IOPS you get: https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/stora...
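
As a rough illustration of that slicing, using the capacities mentioned above. The IOPS column is only an assumption that throughput is divided proportionally with capacity; the linked docs have the real per-instance figures.

    # Illustrative only: slice sizes from the comment above; the IOPS share
    # assumes IOPS scale with capacity, which is an assumption, not an AWS spec.
    DRIVE_GB = 900
    SLICES_GB = [600, 450, 300, 150, 75]

    for slice_gb in SLICES_GB:
        share = slice_gb / DRIVE_GB
        print(f"{slice_gb:>4} GB slice -> {share:.0%} of the drive, "
              f"~{share:.0%} of its IOPS if divided proportionally")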

4. ownagefool ◴[] No.39444820[source]
The PCI bus, etc., is shared too.
5. throwawaaarrgh ◴[] No.39445938[source]
Instance storage is not networked. That's why it's there.
6. queuebert ◴[] No.39446130[source]
How do these machines manage the sharing of one local SSD across multiple VMs? Is there some wrapper around the I/O stack? Does it appear as a network share? Genuinely curious...
replies(4): >>39446222 #>>39446276 #>>39446886 #>>39447488 #
7. felixg3 ◴[] No.39446222[source]
Probably NVMe namespaces [0]?

[0]: https://nvmexpress.org/resource/nvme-namespaces/
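
If namespaces are the mechanism, each one shows up on the host as its own block device that can be handed to a different guest. A quick way to see that layout is the sketch below, which assumes a Linux host with sysfs and at least one NVMe controller.

    # List NVMe controllers and their namespaces via sysfs (Linux).
    from pathlib import Path

    for ctrl in sorted(Path("/sys/class/nvme").glob("nvme[0-9]*")):
        model_file = ctrl / "model"
        model = model_file.read_text().strip() if model_file.exists() else "?"
        print(f"{ctrl.name}: {model}")
        for ns in sorted(ctrl.glob(f"{ctrl.name}n[0-9]*")):
            sectors = int((ns / "size").read_text())   # reported in 512-byte sectors
            print(f"  {ns.name}: {sectors * 512 / 1e9:.1f} GB")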

replies(1): >>39448056 #
8. dan-robertson ◴[] No.39446276[source]
AWS has custom firmware for at least some of its SSDs, so it could be that.
9. magicalhippo ◴[] No.39446886[source]
In, say, VirtualBox you can create a file backed on the physical disk and attach it to the VM so the VM sees it as an NVMe drive.

In my experience this is also orders of magnitude slower than true direct access, i.e. PCIe pass-through, since all access has to pass through the VM storage driver, so that could explain what is happening.

replies(1): >>39448073 #
10. icedchai ◴[] No.39447488[source]
With Linux and KVM/QEMU, you can map an entire physical disk, disk partition, or file to a block device in the VM. For my own VM hosts, I use LVM and map a logical volume to the VM. I assumed cloud providers did something conceptually similar, only much more sophisticated.
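
For a concrete picture of that setup, here is a sketch of the QEMU invocation such a mapping boils down to. The volume group and LV names are hypothetical, and this is the bare command-line form rather than whatever libvirt or a provider's control plane would generate.

    # Hand an existing LVM logical volume (hypothetical names) to a KVM guest
    # as a raw virtio-blk disk, bypassing the host page cache.
    import shlex

    lv_path = "/dev/vg0/guest0"   # e.g. created earlier with lvcreate

    cmd = [
        "qemu-system-x86_64",
        "-enable-kvm",
        "-m", "2048",
        "-drive", f"file={lv_path},format=raw,if=virtio,cache=none,aio=native",
    ]

    print(shlex.join(cmd))
    # import subprocess; subprocess.run(cmd, check=True)  # on a real host
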
replies(2): >>39448087 #>>39455138 #
11. bravetraveler ◴[] No.39448056{3}[source]
Less fancy, quite often... at least on VPS providers [1]. They like to use reflinked files off the base images; that way they only store what differs.

1: Which are really clouds without a certain degree of software-defined networking/compute/storage/whatever.
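
For reference, the reflink trick boils down to a clone ioctl on a copy-on-write filesystem such as XFS or Btrfs: the new file shares the base image's extents, and only the guest's writes take up new space. A sketch with hypothetical file names (this is the same operation `cp --reflink=always` performs):

    # Clone a base image into a per-guest file via FICLONE (Linux, XFS/Btrfs).
    import fcntl

    FICLONE = 0x40049409   # _IOW(0x94, 9, int)

    with open("base.img", "rb") as src, open("guest0.img", "wb") as dst:
        fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())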

12. bravetraveler ◴[] No.39448073{3}[source]
The storage driver may have more impact on VBox. You can get very impressive results with 'virtio' on KVM.
replies(1): >>39448690 #
13. bravetraveler ◴[] No.39448087{3}[source]
Files with reflinks are a common choice, the main benefit being that only deltas are stored; the base OS costs basically nothing.

LVM/block devices like you suggest are a good idea. You'd be surprised how much access time is trimmed by skipping the extra filesystem you'd have with a raw image file.

14. magicalhippo ◴[] No.39448690{4}[source]
Yeah, I've yet to try that. I know I get a similar lack of performance with Bhyve (FreeBSD) using VirtIO, so it's not a given that it's fast.

I have no idea how AWS runs their VMs; I was just saying a slow storage driver could give such results.

replies(1): >>39449157 #
15. bravetraveler ◴[] No.39449157{5}[source]
> just saying a slow storage driver could give such results

Oh, absolutely, not to contest that! There's a whole lot of academic work on paravirtualization and so on in this area.

That's interesting to hear about FreeBSD; basically all of my experience has been with Linux/Windows.

16. jethro_tell ◴[] No.39455138{3}[source]
Heh, you'd probably be surprised. There's some really cool cutting-edge stuff being done in those data centers, but a lot of it is just plain old standard server management without much in the way of tricks. It's just that someone else does it instead of you, and the billing department is counting milliseconds.
replies(1): >>39455931 #
17. icedchai ◴[] No.39455931{4}[source]
Do cloud providers document these internals anywhere? I'd love to read about that sort of thing.
replies(1): >>39456966 #
18. jethro_tell ◴[] No.39456966{5}[source]
Not generally, especially not the super generic stuff. Where they really excel is having the guy who wrote the kernel driver or hypervisor on staff. But a lot of it is just an automated version of what you'd do on a smaller scale.