←back to thread

283 points ghuntley | 1 comments | | HN request time: 0s | source
Show context
ayende ◴[] No.45135399[source]
This is wrong, because your mmap code is being stalled for page faults (including soft page faults that you have when the data is in memory, but not mapped to your process).

The io_uring code looks like it is doing all the fetch work in the background (with 6 threads), then just handing the completed buffers to the counter.

Do the same with 6 threads that would first read the first byte on each page and then hand that page section to the counter, you'll find similar performance.

And you can use both madvice / huge pages to control the mmap behavior

replies(4): >>45135629 #>>45138707 #>>45140052 #>>45147766 #
mrlongroots ◴[] No.45138707[source]
Yes, it doesn't take a benchmark to find out that storage can not be faster than memory.

Even if you had a million SSDs and somehow were able to connect them to a single machine somehow, you would not outperform memory, because the data needs to be read into memory first, and can only then be processed by the CPU.

Basic `perf stat` and minor/major faults should be a first-line diagnostic.

replies(3): >>45139067 #>>45143065 #>>45152315 #
alphazard ◴[] No.45139067[source]
> storage can not be faster than memory

This is an oversimplification. It depends what you mean by memory. It may be true when using NVMe on modern architectures in a consumer use case, but it's not true about computer architecture in general.

External devices can have their memory mapped to virtual memory addresses. There are some network cards that do this for example. The CPU can load from these virtual addresses directly into registers, without needing to make a copy to the general purpose fast-but-volatile memory. In theory a storage device could also be implemented in this way.

replies(3): >>45140329 #>>45143170 #>>45147924 #
1. mrlongroots ◴[] No.45147924{3}[source]
> The CPU can load from these virtual addresses directly into registers

This requires that device to bring meaningful amounts of its own memory. GPUs do that with VRAM. A storage device does not come with its own RAM, but interesting point!