Calling mmap “zero copy” is generous. I guess we glaze over the whole page fault thing, or the fact that performance is heavily dependent on how much memory pressure the process is under.
This is the same n00b trap that derailed the llama.cpp project last year because people don’t understand how memory maps and paging works, and the tradeoffs.