283 points ghuntley | 2 comments
avallach ◴[] No.45136107[source]
Maybe I'm misunderstanding, but after reading it, it sounds to me less like "io_uring is faster than mmap" and more like "RAID 0 with 8 SSDs has more throughput than 3-channel DRAM".
replies(1): >>45136255 #
1. nine_k ◴[] No.45136255[source]
The title has been edited incorrectly. The original page title is "Memory is slow, Disk is fast", and it states exactly what you say: an NVMe RAID can offer more bandwidth than RAM.
replies(1): >>45138212 #
2. kentonv ◴[] No.45138212[source]
No, the title edit is fair; it's the original title that is misleading.

Obviously, no matter how you read from disk, it has to go through RAM. Disk bandwidth cannot exceed memory bandwidth.*

But what the article actually tests is a program that uses mmap() to read from page cache, vs. a program that uses io_uring to read directly from disk (with O_DIRECT). You'd think the mmap() program would win, because the data in page cache is already in memory, whereas the io_uring program is explicitly skipping cache and pulling from disk.

However, the io_uring program uses 6 threads to pull from disk, which then feed into one thread that sequentially processes the data, whereas the program using mmap() uses a single thread for everything. And even though the mmap() program is pulling from page cache, that single thread still gets interrupted by page faults as it reads, because the kernel does not proactively map the pages from cache even if they are available (unless, you know, you tell it to, with madvise() etc., but the test did not). So the mmap() test has one thread that has to keep switching between kernel and userspace and, surprise, that is not as fast as a thread which just stays in userspace while 6 other threads feed it data.
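
For anyone wondering what "telling it to" could look like, here is a rough sketch, not the article's code. It assumes Linux and Python 3. The function name sum_bytes_mmap and its populate parameter are made up for illustration. MAP_POPULATE asks the kernel to prefault the page tables when the mapping is created, and MADV_WILLNEED is only a readahead hint, so neither is guaranteed to remove every fault. Both constants are looked up defensively because older Python versions do not expose them.

```python
import mmap
import os

def sum_bytes_mmap(path, populate=True):
    """Sum every byte of a file through a read-only private mapping.

    With populate=True the kernel is asked to set up the mapping's pages
    up front, so the reading loop should take fewer page faults.
    """
    flags = mmap.MAP_PRIVATE
    if populate:
        # MAP_POPULATE is Linux-specific and only exposed by newer Pythons;
        # fall back to a plain mapping when it is missing.
        flags |= getattr(mmap, "MAP_POPULATE", 0)

    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return 0  # mmap refuses to map an empty file

        with mmap.mmap(f.fileno(), 0, flags=flags, prot=mmap.PROT_READ) as m:
            if populate and hasattr(m, "madvise") and hasattr(mmap, "MADV_WILLNEED"):
                # Advisory only: the kernel may start readahead for the range.
                m.madvise(mmap.MADV_WILLNEED)

            total = 0
            step = 1 << 20  # read the mapping in 1 MiB slices
            for off in range(0, len(m), step):
                total += sum(m[off:off + step])
            return total
```

Calling it with populate=False gives the same answer but leaves every page to be faulted in on first touch, which is closer to what the mmap() test in the article did.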

To be fair, the article says all this, if you read it. Other than the title being cheeky it's not hiding anything.

* OK, the article does mention that there exist CPUs which can do I/O directly into L3 cache, which could theoretically beat memory bandwidth, but this is not actually something that is tested in the article.