
283 points by ghuntley | 3 comments
jared_hulbert:
Cool. Original author here. AMA.
1. whizzter:
Like people mention, hugetlb etc. could be an improvement, but the core issue holding it down probably has to do with mmap, 4K pages, and paging behaviour: mmap will cause a fault for each "small" 4K page not in memory, causing a jump into the kernel and then whatever machinery is needed to fill in the page cache (and bring the data up from disk, with the associated latency).
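
To make that pattern concrete, here's a rough sketch of the mmap access pattern being described (the filename, the byte-per-page stride, and the madvise hint are illustrative, not from the article; error handling mostly omitted):

    /* Scan a big file through mmap, touching one byte per 4K page.
     * Each page not already in the page cache triggers a fault, a jump
     * into the kernel, and a page-cache fill (possibly a disk read). */
    #include <fcntl.h>
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("data.bin", O_RDONLY);        /* illustrative file */
        struct stat st;
        fstat(fd, &st);

        uint8_t *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }
        madvise(p, st.st_size, MADV_SEQUENTIAL);    /* hint readahead */

        uint64_t sum = 0;
        for (off_t off = 0; off < st.st_size; off += 4096)
            sum += p[off];                /* up to one fault per 4K page */

        printf("%" PRIu64 "\n", sum);
        munmap(p, st.st_size);
        close(fd);
        return 0;
    }

MAP_HUGETLB (or transparent huge pages) is the hugetlb angle: 2MB pages mean roughly 512x fewer faults for the same range.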

This is in contrast with the io_uring worker method, where you keep the thread busy by submitting requests and letting the kernel do the work without expensive crossings.
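
A minimal sketch of that style with liburing, assuming a plain buffered-read loop with a fixed queue depth (DEPTH, BLOCK, and the filename are made up; error handling mostly omitted; link with -luring):

    /* Keep DEPTH reads in flight so the thread stays busy instead of
     * stalling on one page fault at a time. */
    #include <fcntl.h>
    #include <liburing.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/stat.h>
    #include <unistd.h>

    #define DEPTH 32
    #define BLOCK (256 * 1024)

    int main(void) {
        int fd = open("data.bin", O_RDONLY);
        struct stat st;
        fstat(fd, &st);

        struct io_uring ring;
        io_uring_queue_init(DEPTH, &ring, 0);

        char *bufs[DEPTH];
        int free_slots[DEPTH], free_top = 0;
        for (int i = 0; i < DEPTH; i++) {
            bufs[i] = malloc(BLOCK);
            free_slots[free_top++] = i;
        }

        off_t next = 0;
        int inflight = 0;
        uint64_t sum = 0;

        while (next < st.st_size || inflight > 0) {
            /* Top up the submission queue from the free buffer list. */
            while (free_top > 0 && next < st.st_size) {
                int slot = free_slots[--free_top];
                struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
                io_uring_prep_read(sqe, fd, bufs[slot], BLOCK, next);
                io_uring_sqe_set_data(sqe, (void *)(uintptr_t)slot);
                next += BLOCK;
                inflight++;
            }
            io_uring_submit(&ring);

            /* Reap one completion and consume its data. */
            struct io_uring_cqe *cqe;
            io_uring_wait_cqe(&ring, &cqe);
            int slot = (int)(uintptr_t)io_uring_cqe_get_data(cqe);
            for (int i = 0; i < cqe->res; i++)
                sum += (uint8_t)bufs[slot][i];
            io_uring_cqe_seen(&ring, cqe);
            free_slots[free_top++] = slot;
            inflight--;
        }

        printf("%llu\n", (unsigned long long)sum);
        io_uring_queue_exit(&ring);
        close(fd);
        return 0;
    }

The point is that the user thread only crosses into the kernel at submit/wait time, instead of once per faulted page.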

The 2GB fully in-memory case shows the CPU's real perf. The dip at 50GB is interesting; perhaps when going over 50% of memory the Linux kernel evicts pages or does something similar that hurts perf. Maybe plot a graph of perf vs. test size to see if there is an obvious cliff.

2. pianom4n:
The in-memory solution creates a second copy of the data, so at 50GB it doesn't fit in memory anymore. The kernel is forced to drop and then reload part of the cached file.
3. jared_hulbert:
When I run the 50GB in-memory setup I still have 40GB+ of free memory, and I drop the page cache before the run with "sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'", so there wouldn't really be anything to evict from the page cache, and swap isn't changing.

I think I'm crossing the NUMA boundary, which means some percentage of the accesses have higher latency.
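
One way to test that would be to pin both the thread and the allocation to a single node (e.g. numactl --cpunodebind=0 --membind=0, or programmatically) and re-run the 50GB case. A rough libnuma sketch, assuming node 0 has enough memory; link with -lnuma:

    /* Bind the thread and its buffer to NUMA node 0 so no accesses
     * cross the node boundary, then run the in-memory benchmark. */
    #include <numa.h>
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        if (numa_available() < 0) {
            fprintf(stderr, "no NUMA support\n");
            return 1;
        }
        size_t size = 50UL * 1024 * 1024 * 1024;   /* 50GB working set */

        numa_run_on_node(0);                       /* keep the thread on node 0 */
        char *buf = numa_alloc_onnode(size, 0);    /* and its memory on node 0  */
        if (!buf) { perror("numa_alloc_onnode"); return 1; }

        memset(buf, 0, size);                      /* fault the pages in locally */
        /* ... run the in-memory benchmark against buf here ... */

        numa_free(buf, size);
        return 0;
    }

If the dip disappears when everything is on one node, remote-node latency is the likely explanation.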