SSDs have become fast, except in the cloud

(databasearchitects.blogspot.com)
589 points by greghn
pclmulqdq ◴[] No.39443994[source]
This was a huge technical problem I worked on at Google, and it is sort of fundamental to the cloud. I believe this is actually a big deal that drives people's technology directions.

SSDs in the cloud are attached over a network, and fundamentally have to be. The problem is that this network is so large and slow that it can't give you anywhere near the performance of a local SSD. This wasn't a problem for hard drives, which were the backing technology when a lot of these network-attached storage systems were invented, because hard drives are fundamentally slow compared to networks, but it is a problem for SSDs.
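
A rough back-of-envelope sketch of that asymmetry (the device and network numbers below are illustrative assumptions, not measurements from any particular cloud): adding a ~0.5 ms network round trip barely matters on top of a ~10 ms disk seek, but it dominates a ~100 µs SSD read.

    # All numbers are assumed round figures, for illustration only.
    HDD_SEEK_US = 10_000   # assumed: ~10 ms average seek + rotation on a spinning disk
    NVME_READ_US = 100     # assumed: ~100 µs random 4 KiB read on a local NVMe SSD
    NETWORK_RTT_US = 500   # assumed: ~0.5 ms intra-datacenter round trip

    for name, device_us in [("HDD", HDD_SEEK_US), ("NVMe SSD", NVME_READ_US)]:
        remote_us = device_us + NETWORK_RTT_US   # the remote side still pays the device read
        print(f"{name:8s} local ~{device_us:>6} µs, network-attached ~{remote_us:>6} µs "
              f"(network adds {NETWORK_RTT_US / device_us:.0%})")

Under these assumptions the network adds roughly 5% to a disk read but several hundred percent to an SSD read.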

scottlamb ◴[] No.39444952[source]
> The problem is that this network is so large and slow that it can't give you anywhere near the performance of a local SSD. This wasn't a problem for hard drives, which were the backing technology when a lot of these network-attached storage systems were invented, because hard drives are fundamentally slow compared to networks, but it is a problem for SSDs.

Certainly true that SSD bandwidth and latency improvements are hard to match, but I don't understand why intra-datacenter network latency in particular is so bad. The ~2020 (I think) version of "Latency Numbers Everyone Should Know" says 0.5 ms round trip (and mentions "10 Gbps network" on another line). [1] It was the same in a 2012 version (which only mentions "1 Gbps network"). [2] Why no improvement? That 2020 figure might be a bit conservative, and nice datacenters may even have multiple 100 Gbit/sec NICs per machine in 2024, but the round trip still seems strangely bad.

I've seen experimental networking stuff (e.g. RDMA) that claims significantly better latency, so I don't think it's a physical limitation of the networking gear but rather something in the machine/OS interaction. I would design large distributed systems significantly differently (and be much more excited about extra tiers in my stack) if the standard RPC system offered, say, 10 µs typical round-trip latency. (A rough way to measure this for yourself is sketched after the footnotes.)

[1] https://static.googleusercontent.com/media/sre.google/en//st...

[2] https://gist.github.com/jboner/2841832
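
As a point of reference, a quick-and-dirty way to eyeball what a plain socket round trip looks like between two machines is a TCP ping-pong like the sketch below (Python standard library only). The host and port are placeholders, and this measures the ordinary userspace/kernel/NIC path, nothing RDMA-like, with no attempt at a rigorous benchmark.

    import socket, statistics, time

    HOST, PORT = "127.0.0.1", 9000   # placeholders: loopback here; for a real test, bind
    N = 1000                         # the server to its own address and point the client at it

    def server():
        # Run this on one host: echo every byte straight back.
        with socket.create_server((HOST, PORT)) as srv:
            conn, _ = srv.accept()
            with conn:
                conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
                while data := conn.recv(1):
                    conn.sendall(data)

    def client():
        # Run this on the other host: time N one-byte round trips.
        with socket.create_connection((HOST, PORT)) as s:
            s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
            rtts_us = []
            for _ in range(N):
                t0 = time.perf_counter_ns()
                s.sendall(b"x")
                s.recv(1)
                rtts_us.append((time.perf_counter_ns() - t0) / 1000)
            print(f"p50={statistics.median(rtts_us):.1f} µs  "
                  f"p99={statistics.quantiles(rtts_us, n=100)[98]:.1f} µs")

Running server() on one machine and client() on another in the same datacenter gives a rough point of comparison against the 0.5 ms figure above.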

KaiserPro ◴[] No.39446206[source]
Networks are not reliable, despite what you hear, so part of that latency budget goes to masking retries and delays.

The other thing to note is that big inter-DC links are heavily QoS'd and contended, because they are both expensive and a pain to maintain.

Also, from what I recall, 40 gig links are just parallel 10 gig links, so they have no lower latency. I'm not sure whether 100/400 gig links are ten/forty lanes of 10 gig in parallel or can actually put packets on the wire at 10/40 times the rate of a 10 gig link. I've been away from networking too long.

wmf ◴[] No.39446971[source]
> 40 gig links are just parallel 10 gig links, so they have no lower latency

That's not correct. Higher link speeds do have lower serialization latency, although that's a small fraction of overall network latency.
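
For a sense of scale, the serialization time of one full-size Ethernet frame falls straight out of the link speed; the 0.5 ms round trip is the figure quoted upthread, and the rest is arithmetic.

    FRAME_BITS = 1500 * 8   # one full-size Ethernet frame payload, in bits
    RTT_US = 500            # the ~0.5 ms round trip quoted upthread

    for gbps in (1, 10, 40, 100, 400):
        serialize_us = FRAME_BITS / (gbps * 1e3)   # bits divided by bits-per-microsecond
        print(f"{gbps:>3} Gbps: {serialize_us:6.3f} µs on the wire "
              f"({serialize_us / RTT_US:.3%} of the round trip)")

Even at 1 Gbps a single frame costs only ~12 µs on the wire, a few percent of a 0.5 ms round trip; faster links shrink that further, which is why link speed alone barely moves the end-to-end latency.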