> This was a comparison of two methods of moving data from the VFS to application memory. Depending on cache status this would run the whole gambit of mapping existing memory pages, kernel to userspace memory copies, and actual disk access.
Yes, I think maybe a reasonable statement is that a benchmark is supposed to isolate a meaningful effect. This benchmark was not set up correctly to isolate a meaningful effect IMO.
> Also, while we’re being annoyingly technical, a lot of server CPUs can DMA straight to the L3 cache so your proof of impossibility is not correct.
Interesting, didn't know that, thanks!
I think this does not invalidate the point though. You can temporarily stream directly to the L3 cache with DDIO, but as it fills out the cache will get flushed back to the main memory anyway and you will ultimately be memory-bound. I don't think there is some way to do some non-temporal magic here that circumvents main memory entirely.