
lsr: ls with io_uring

(rockorager.dev)
335 points mpweiher | 9 comments
1. api ◴[] No.44605484[source]
Why isn’t it possible (or is it?) to make libc just use io_uring instead of regular syscalls?

Yes, I know io_uring is an async interface, but it’s trivial to implement sync behavior on top of a single chain of async send-wait pairs, like doing a simple single-threaded “conversational” implementation of a network protocol.

It wouldn’t make a difference in most individual cases but overall I wonder how big a global speed boost you’d get by removing a ton of syscalls?

Or am I failing to understand something about the performance nuances here?

replies(3): >>44605692 #>>44605822 #>>44605934 #
2. ninkendo ◴[] No.44605692[source]
In order to make this work, libc would have to:

- Start some sort of async executor thread to service the io_uring requests/responses

- Make it so every call to "normal" syscalls causes the calling thread to sleep until the result is available (that's 1 syscall)

- When the executor thread gets a result, have it wake up the original thread (that's another syscall)

So you're basically turning 1 syscall into 2 in order to emulate the legacy syscalls.

io_uring only makes sense if you're already async. Emulating sync on top of async is nearly always a terrible idea.

replies(1): >>44605962 #
3. loeg ◴[] No.44605822[source]
In addition to the sibling comment's concern about syscall amplification, the asynchrony just isn't useful to the application (from a latency perspective) if you just serialize a bunch of sync requests through it.
replies(1): >>44607769 #
4. yencabulator ◴[] No.44605934[source]
Not speaking of ls which is more about metadata operations, but general file read/write workloads:

io_uring requires API changes because you don't call it like the old read(please_fill_this_buffer). You maintain a pool of buffers that belong to the ring buffer, and reads take buffers from the pool. You consume the data from the buffer and return it to the pool.

With the older style, you're required to maintain O(pending_reads) buffers. With the io_uring style, you have a pool of O(num_reads_completing_at_once) (I assume with backpressure but haven't actually checked).
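
A toy model of that ownership pattern, in plain C with no kernel involved, might look like the sketch below. Everything in it (buf_pool, pool_take, pool_give, the sizes) is a made-up name for illustration; the real mechanism, io_uring's provided buffers, differs in detail. An empty pool returning -1 stands in for the backpressure assumed above.

```c
#include <stddef.h>

#define POOL_BUFS 4     /* hypothetical: max reads completing at once */
#define BUF_SIZE  4096  /* hypothetical buffer size */

/* A fixed set of buffers plus a stack of the ids that are currently free. */
struct buf_pool {
    char data[POOL_BUFS][BUF_SIZE];
    int  free_ids[POOL_BUFS];
    int  nfree;
};

/* Mark every buffer as free. */
void pool_init(struct buf_pool *p)
{
    for (int i = 0; i < POOL_BUFS; i++)
        p->free_ids[i] = i;
    p->nfree = POOL_BUFS;
}

/* A completing read takes a buffer; -1 means the pool is exhausted. */
int pool_take(struct buf_pool *p)
{
    if (p->nfree == 0)
        return -1;
    return p->free_ids[--p->nfree];
}

/* The application has consumed the data and hands the buffer back. */
void pool_give(struct buf_pool *p, int id)
{
    p->free_ids[p->nfree++] = id;
}

/* Access the bytes of a buffer the caller currently holds. */
char *pool_data(struct buf_pool *p, int id)
{
    return p->data[id];
}
```

The pool is sized by how many buffers are held at once, not by how many reads are pending, which is the difference described above.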

replies(1): >>44607754 #
5. wtallis ◴[] No.44605962[source]
You don't need to start spawning new threads to use io_uring as a backend for synchronous IO APIs. You just need to set up the rings once. Then, when the program does an fwrite or whatever, that gets implemented as writing a submission queue entry followed by a single io_uring_enter syscall, which informs the kernel there's something in the submission queue while using the arguments indicating that the calling process wants to block until there's something in the completion queue.
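
A minimal sketch of that flow using the raw Linux syscalls, assuming kernel headers that define IORING_OP_WRITE (roughly 5.6 and later). The names struct ring, ring_init, ring_write and sync_write are hypothetical; this is not how any existing libc is implemented, and it falls back to plain write(2) when the ring cannot be set up.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/io_uring.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

struct ring {
    int fd; /* -1: not set up yet, -2: io_uring unavailable */
    unsigned *sq_tail, *sq_mask, *sq_array;
    unsigned *cq_head, *cq_tail, *cq_mask;
    struct io_uring_sqe *sqes;
    struct io_uring_cqe *cqes;
};

static struct ring g_ring = { .fd = -1 };

/* One-time setup: create the ring and map its shared memory.
 * (A real implementation would also munmap on partial failure.) */
static int ring_init(struct ring *r)
{
    struct io_uring_params p;
    memset(&p, 0, sizeof p);
    int fd = (int)syscall(SYS_io_uring_setup, 4, &p);
    if (fd < 0)
        return -1;
    size_t sq_len = p.sq_off.array + p.sq_entries * sizeof(unsigned);
    size_t cq_len = p.cq_off.cqes + p.cq_entries * sizeof(struct io_uring_cqe);
    size_t sqes_len = p.sq_entries * sizeof(struct io_uring_sqe);
    int prot = PROT_READ | PROT_WRITE;
    char *sq = mmap(NULL, sq_len, prot, MAP_SHARED, fd, IORING_OFF_SQ_RING);
    char *cq = mmap(NULL, cq_len, prot, MAP_SHARED, fd, IORING_OFF_CQ_RING);
    void *sqes = mmap(NULL, sqes_len, prot, MAP_SHARED, fd, IORING_OFF_SQES);
    if (sq == MAP_FAILED || cq == MAP_FAILED || sqes == MAP_FAILED) {
        close(fd);
        return -1;
    }
    r->sq_tail = (unsigned *)(sq + p.sq_off.tail);
    r->sq_mask = (unsigned *)(sq + p.sq_off.ring_mask);
    r->sq_array = (unsigned *)(sq + p.sq_off.array);
    r->cq_head = (unsigned *)(cq + p.cq_off.head);
    r->cq_tail = (unsigned *)(cq + p.cq_off.tail);
    r->cq_mask = (unsigned *)(cq + p.cq_off.ring_mask);
    r->cqes = (struct io_uring_cqe *)(cq + p.cq_off.cqes);
    r->sqes = sqes;
    r->fd = fd;
    return 0;
}

/* Queue one write, then submit and wait with a single io_uring_enter. */
static ssize_t ring_write(struct ring *r, int fd, const void *buf, unsigned len)
{
    unsigned tail = *r->sq_tail, idx = tail & *r->sq_mask;
    struct io_uring_sqe *sqe = &r->sqes[idx];
    memset(sqe, 0, sizeof *sqe);
    sqe->opcode = IORING_OP_WRITE;
    sqe->fd = fd;
    sqe->addr = (unsigned long)buf;
    sqe->len = len;
    sqe->off = (__u64)-1; /* -1: use the file's current position */
    r->sq_array[idx] = idx;
    __atomic_store_n(r->sq_tail, tail + 1, __ATOMIC_RELEASE);

    unsigned head = *r->cq_head, to_submit = 1;
    do { /* normally one pass; repeated only if a signal interrupts the wait */
        long n = syscall(SYS_io_uring_enter, r->fd, to_submit, 1,
                         IORING_ENTER_GETEVENTS, NULL, (size_t)0);
        if (n < 0 && errno != EINTR)
            return -1;
        if (n > 0)
            to_submit = 0;
    } while (head == __atomic_load_n(r->cq_tail, __ATOMIC_ACQUIRE));

    int res = r->cqes[head & *r->cq_mask].res; /* result of our one request */
    __atomic_store_n(r->cq_head, head + 1, __ATOMIC_RELEASE);
    if (res < 0) {
        errno = -res;
        return -1;
    }
    return res;
}

/* Blocking write(2) look-alike; sets up the ring lazily on first use. */
ssize_t sync_write(int fd, const void *buf, size_t len)
{
    if (g_ring.fd == -1 && ring_init(&g_ring) < 0)
        g_ring.fd = -2;
    if (g_ring.fd < 0)
        return write(fd, buf, len); /* fallback: io_uring not available */
    return ring_write(&g_ring, fd, buf, (unsigned)len);
}
```

Something like sync_write(STDOUT_FILENO, "hi\n", 3) would then behave like write(2). The steady-state cost is still one syscall per call, just io_uring_enter instead of write.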
replies(1): >>44607784 #
6. api ◴[] No.44607754[source]
In a single-threaded flow your buffer pool is just the buffer you were given, and you don't return until the call completes. There are no actual concurrent calls in the ring. All you're doing is using io_uring to avoid a syscall.

Other replies lead me to believe it's not worth doing though, that it would not actually save syscalls and might make things worse.

replies(1): >>44609153 #
7. ◴[] No.44607769[source]
8. ninkendo ◴[] No.44607784{3}[source]
> using the arguments indicating the calling process wants to block

Nice to know io_uring has facilities for backwards compatibility with blocking code here. But yeah, that's still a syscall, and given that the whole benefit of io_uring is in avoiding (or at least, coalescing) syscalls, I doubt having libc "just" use io_uring is going to give any tangible benefit.

9. yencabulator ◴[] No.44609153{3}[source]
Can you use io_uring in a way that doesn't gain the benefits of using it? Yes. Does the traditional C/POSIX API force you into that pattern? Almost certainly.