
283 points ghuntley | 1 comments
nextaccountic ◴[] No.45136365[source]
The real difference is that with io_uring and O_DIRECT you manage the cache yourself (so it can't be shared with other processes, and the OS can't reclaim it automatically under memory pressure), whereas with mmap the cache is managed by the OS.
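To make the contrast concrete, here is a minimal sketch of the two read paths. It uses plain `os.readv` rather than io_uring (Python's standard library has no io_uring binding), and `BLOCK`, `read_via_mmap`, `read_via_o_direct` and `demo` are names made up for illustration. The 4096-byte alignment is an assumption, and the code falls back to buffered I/O on filesystems that reject O_DIRECT.

```python
import mmap
import os
import tempfile

BLOCK = 4096  # assumed alignment; real code should query the device's block size


def read_via_mmap(path, length):
    """Kernel-managed: pages live in the shared page cache and can be reclaimed."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        return m[:length]


def _read_into_own_buffer(path, length, flags):
    size = -(-length // BLOCK) * BLOCK  # round up to a whole number of blocks
    fd = os.open(path, flags)
    try:
        # An anonymous mapping is page-aligned and private to this process,
        # which satisfies the alignment O_DIRECT typically requires.
        with mmap.mmap(-1, size) as buf:
            n = os.readv(fd, [buf])  # a real reader would loop on short reads
            return buf[:min(n, length)]
    finally:
        os.close(fd)


def read_via_o_direct(path, length):
    """Application-managed: data lands in our buffer, bypassing the page cache."""
    direct = getattr(os, "O_DIRECT", 0)  # not defined on every platform
    try:
        return _read_into_own_buffer(path, length, os.O_RDONLY | direct)
    except OSError:
        # Some filesystems reject O_DIRECT; fall back to buffered reads.
        return _read_into_own_buffer(path, length, os.O_RDONLY)


def demo():
    """Write a two-block temp file and read it back through both paths."""
    data = os.urandom(2 * BLOCK)
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(data)
        path = f.name
    try:
        return data, read_via_mmap(path, len(data)), read_via_o_direct(path, len(data))
    finally:
        os.unlink(path)
```

In the mmap path nothing in the process decides when the pages go away; in the O_DIRECT path the buffer lives exactly as long as the application keeps it, and no other process can see it.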

If Linux had an API to say "manage this buffer you handed me from io_uring as if it were a VFS page cache (and as such it can be shared with other processes, like mmap), if you want it back just call this callback (so I can clean up my references to it) and you are good to go", then io_uring could really replace mmap.

What Linux has currently is PSI (pressure stall information), which lets the OS signal memory pressure so you can give memory back when needed, but it doesn't help with the buffer sharing thing.
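For reference, PSI is exposed as text under `/proc/pressure/` on kernels built with it. Below is a rough sketch of how a process that owns its cache might poll it and decide to shrink. The helper names and the `threshold` value are assumptions for illustration, not a recommended policy.

```python
def parse_psi(text):
    """Parse PSI lines such as 'some avg10=0.12 avg60=0.05 avg300=0.01 total=123456'."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        result[kind] = {k: float(v) for k, v in (f.split("=", 1) for f in fields)}
    return result


def read_memory_pressure(path="/proc/pressure/memory"):
    """Return parsed PSI data, or None if the kernel does not expose it."""
    try:
        with open(path) as f:
            return parse_psi(f.read())
    except OSError:
        return None


def should_shrink_cache(psi, threshold=5.0):
    """Hypothetical policy: shrink when tasks stalled on memory for more than
    `threshold` percent of the last 10 seconds."""
    return psi is not None and psi["some"]["avg10"] > threshold
```

This only tells the application that pressure exists; deciding which buffers to drop is still entirely up to the application.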

replies(1): >>45140261 #
touisteur ◴[] No.45140261[source]
Yes, Linus has been ranting for decades against O_DIRECT, saying similar things (i.e. asking for better hints on pages and cache usage).

The notorious archive of Linus rants at [0] starts with "The thing that has always disturbed me about O_DIRECT is that the whole interface is just stupid, and was probably designed by a deranged monkey on some serious mind-controlling substances". It gets better afterwards, though I'm not clear whether the vision he articulates has been implemented yet.

[0] https://yarchive.net/comp/linux/o_direct.html

replies(1): >>45141128 #
jandrewrogers ◴[] No.45141128[source]
I know people like to post this rant, but in this case Linus simply doesn't understand the problem domain. O_DIRECT is commonly used in contexts where the fundamental mechanisms of the kernel cache are inappropriate. It can't be fixed with hints.

As a database example, there are major classes of optimization that require perfect visibility into the state of the entire page cache with virtually no overhead, and strict control over every change of state that occurs. O_DIRECT allows you to achieve this. The optimizations are predicated on the impossibility of an external process modifying state. They require perfect control of the schedule, which is invalidated if the kernel borrows part of the page cache. Whether or not the kernel asks nicely doesn't matter; it breaks a design invariant.
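As a toy illustration of that invariant (not how any particular database does it), here is a user-space buffer pool where every state change goes through the application. `BufferPool`, `pin`, `unpin` and the `read_page` callback are hypothetical names; in a real engine `read_page` would be an O_DIRECT or io_uring read.

```python
from collections import OrderedDict


class BufferPool:
    """Fixed-size page cache where only the application loads or evicts pages."""

    def __init__(self, capacity, read_page):
        self.capacity = capacity
        self.read_page = read_page   # callback: page_id -> bytes
        self.frames = OrderedDict()  # page_id -> [data, pin_count], oldest first
        self.evictions = []          # full history of what was evicted, in order

    def pin(self, page_id):
        """Return the page, loading it if needed; it cannot be evicted while pinned."""
        frame = self.frames.get(page_id)
        if frame is None:
            if len(self.frames) >= self.capacity:
                self._evict_one()
            frame = [self.read_page(page_id), 0]
            self.frames[page_id] = frame
        self.frames.move_to_end(page_id)  # mark as most recently used
        frame[1] += 1
        return frame[0]

    def unpin(self, page_id):
        frame = self.frames[page_id]
        if frame[1] == 0:
            raise ValueError("page is not pinned")
        frame[1] -= 1

    def _evict_one(self):
        # Evict the least recently used unpinned page; nothing else may evict.
        for page_id, (_, pins) in self.frames.items():
            if pins == 0:
                del self.frames[page_id]
                self.evictions.append(page_id)
                return
        raise RuntimeError("all frames are pinned")
```

Because nothing outside the process can change `frames`, a scheduler built on top can reason about exactly which pages are resident at any moment. A kernel that reclaimed a frame behind its back would break that assumption.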

The Linus rant is from a long time ago. Given the existence of things like io_uring, which explicitly enable this type of behavior almost to the point of encouraging it, Linus may understand the use cases better now.

replies(1): >>45141608 #
1. touisteur ◴[] No.45141608[source]
I discovered his rant(s) about this recently and indeed thought it was interesting in the light of io_uring. If there's a similar compendium of Linus rants against io_uring I'm interested.