
SSDs have become fast, except in the cloud

(databasearchitects.blogspot.com)
589 points by greghn | 4 comments
zokier ◴[] No.39444037[source]
> Since then, several NVMe instance types, including i4i and im4gn, have been launched. Surprisingly, however, the performance has not increased; seven years after the i3 launch, we are still stuck with 2 GB/s per SSD.

AWS marketing claims otherwise:

    Up to 800K random write IOPS
    Up to 1 million random read IOPS
    Up to 5600 MB/second of sequential writes
    Up to 8000 MB/second of sequential reads

https://aws.amazon.com/blogs/aws/new-storage-optimized-amazo...
replies(1): >>39444172 #
sprachspiel ◴[] No.39444172[source]
That figure is for all 8 SSDs combined, and a single modern PCIe 5.0 SSD has better specs than this.
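(For context, a back-of-the-envelope ceiling for one Gen5 drive — my own arithmetic, not from the post, assuming 32 GT/s per lane, an x4 link, 128b/130b encoding, and ignoring protocol overhead:)

```shell
# Theoretical link bandwidth of one PCIe 5.0 x4 NVMe SSD, in GB/s:
# 32 GT/s per lane * 4 lanes / 8 bits-per-byte * 128/130 encoding efficiency
awk 'BEGIN { printf "%.2f\n", 32 * 4 / 8 * 128 / 130 }'
```

So one drive's link alone is good for roughly 15.75 GB/s, well above the 8000 MB/s quoted for the whole set.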
replies(2): >>39444346 #>>39444404 #
nik_0_0 ◴[] No.39444404[source]
Is it? The line preceding the bullet list on that page seems to state otherwise:

“”

  Each storage volume can deliver the following performance (all measured using 4 KiB blocks):

  * Up to 8000 MB/second of sequential reads
“”
replies(1): >>39444564 #
sprachspiel ◴[] No.39444564{3}[source]
Just tested an i4i.32xlarge:

  $ lsblk
  NAME         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
  loop0          7:0    0  24.9M  1 loop /snap/amazon-ssm-agent/7628
  loop1          7:1    0  55.7M  1 loop /snap/core18/2812
  loop2          7:2    0  63.5M  1 loop /snap/core20/2015
  loop3          7:3    0 111.9M  1 loop /snap/lxd/24322
  loop4          7:4    0  40.9M  1 loop /snap/snapd/20290
  nvme0n1      259:0    0     8G  0 disk 
  ├─nvme0n1p1  259:1    0   7.9G  0 part /
  ├─nvme0n1p14 259:2    0     4M  0 part 
  └─nvme0n1p15 259:3    0   106M  0 part /boot/efi
  nvme2n1      259:4    0   3.4T  0 disk 
  nvme4n1      259:5    0   3.4T  0 disk 
  nvme1n1      259:6    0   3.4T  0 disk 
  nvme5n1      259:7    0   3.4T  0 disk 
  nvme7n1      259:8    0   3.4T  0 disk 
  nvme6n1      259:9    0   3.4T  0 disk 
  nvme3n1      259:10   0   3.4T  0 disk 
  nvme8n1      259:11   0   3.4T  0 disk
Since nvme0n1 is the EBS boot volume, we have 8 SSDs. And here's the read bandwidth for one of them:

  $ sudo fio --name=bla --filename=/dev/nvme2n1 --rw=read --iodepth=128 --ioengine=libaio --direct=1 --blocksize=16m
  bla: (g=0): rw=read, bs=(R) 16.0MiB-16.0MiB, (W) 16.0MiB-16.0MiB, (T) 16.0MiB-16.0MiB, ioengine=libaio, iodepth=128
  fio-3.28
  Starting 1 process
  ^Cbs: 1 (f=1): [R(1)][0.5%][r=2704MiB/s][r=169 IOPS][eta 20m:17s]
So we should have a total bandwidth of 2.7*8 ≈ 21.6 GB/s. Not that great for 2024.
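(Spelling out that estimate — it assumes the ~2704 MiB/s per-drive fio result above scales linearly across all 8 instance-store SSDs, which a parallel fio run would need to confirm:)

```shell
# Aggregate read bandwidth if all 8 drives deliver the single-drive figure:
per_drive_mib=2704   # MiB/s, from the fio run above
drives=8
total_mib=$(( per_drive_mib * drives ))
total_gib=$(( total_mib / 1024 ))
echo "${total_mib} MiB/s total, ~${total_gib} GiB/s aggregate"
```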
replies(6): >>39444657 #>>39444735 #>>39444982 #>>39445321 #>>39445456 #>>39543485 #
zokier ◴[] No.39445321{4}[source]
I wonder if there is some tuning that needs to be done here; it seems surprising that the advertised rate would otherwise be this far off.
replies(1): >>39445596 #
jeffbee ◴[] No.39445596[source]
I would start with the LBA format, which is likely to be suboptimal for compatibility.
replies(1): >>39447787 #
zokier ◴[] No.39447787[source]
Somehow the i4g drives don't like to get formatted:

    # nvme format /dev/nvme1 -n1 -f
    NVMe status: INVALID_OPCODE: The associated command opcode field is not valid(0x2001)
    # nvme id-ctrl /dev/nvme1 | grep oacs
    oacs      : 0
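(That error is consistent with the capabilities field: in NVMe, oacs is a bitfield and bit 1 advertises Format NVM support, so oacs = 0 means the controller simply doesn't implement the format command. Checking the bit, with the value hard-coded from the id-ctrl output above:)

```shell
# oacs is a bitfield; bit 1 (mask 2) advertises Format NVM support.
# oacs=0 here, so the format command isn't implemented -- matching the
# INVALID_OPCODE error above.
oacs=0
if [ $(( oacs & 2 )) -ne 0 ]; then
  echo "Format NVM supported"
else
  echo "Format NVM not supported"
fi
```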
but the LBA format is indeed sus:

    LBA Format  0 : Metadata Size: 0   bytes - Data Size: 512 bytes - Relative Performance: 0 Best (in use)
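(If a 4 KiB format were available, `nvme id-ns -H` would list it as an additional "LBA Format" line, and you'd select it with `nvme format --lbaf=<n>`. A toy parser over such output — the 4096-byte line below is purely illustrative, since these drives only expose 512 B:)

```shell
# Hypothetical helper: pick the index of a 4 KiB LBA format from
# `nvme id-ns -H` style output, for use as the --lbaf argument.
pick_4k_lbaf() {
  grep '^LBA Format' | awk '/Data Size: 4096/ { print $3; exit }'
}

# Illustrative sample -- NOT what the i4i drives report (they have no 4K format):
sample='LBA Format  0 : Metadata Size: 0   bytes - Data Size: 512 bytes - Relative Performance: 0 Best (in use)
LBA Format  1 : Metadata Size: 0   bytes - Data Size: 4096 bytes - Relative Performance: 0 Best'

echo "$sample" | pick_4k_lbaf
# On a drive that actually supports it (and advertises Format NVM in oacs):
#   sudo nvme format /dev/nvme1n1 --lbaf=1
```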
replies(1): >>39448000 #
jeffbee ◴[] No.39448000[source]
It's a shame. The recent "datacenter NVMe" standards involving fb, goog, et al. mandate 4K LBA support.
replies(1): >>39448552 #
zokier ◴[] No.39448552{3}[source]
It'd be great if you could throw together a quick blog post about i4g I/O perf; there's obviously something funny going on, and I imagine you guys could figure it out much more easily than anybody else, especially if you already have some figures in the marketing.