Performance claims aside, the real win with io_uring is how much control it gives you over async I/O while cutting syscall overhead: submissions and completions go through ring buffers shared with the kernel, so you can batch many operations into a single io_uring_enter call. mmap’s great for simplicity, but once you hit high-concurrency or multi-buffer use cases, io_uring starts to pull ahead. Has anyone benchmarked it with real-world workloads (e.g., DB-backed APIs or log ingestion)?