283 points ghuntley | 24 comments
1. bawolff ◴[] No.45133765[source]
Shouldn't you also compare to mmap with the huge page option? My understanding is it's precisely meant for this circumstance. I don't think it's a fair comparison without it.

Respectfully, the title feels a little clickbaity to me. Both methods are still ultimately reading out of memory; they are just using different I/O methods.

replies(2): >>45134007 #>>45138806 #
2. jared_hulbert ◴[] No.45134007[source]
The original blog post title is intentionally clickbaity. You know, to bait people into clicking. Also I do want to challenge people to really think here.

Seeing if the cached file data can be accessed quickly is the point of the experiment. I can't get mmap() to open a file with huge pages.

`void* buffer = mmap(NULL, size_bytes, PROT_READ, (MAP_HUGETLB | MAP_HUGE_1GB), fd, 0);` doesn't work.

You can see my code here: https://github.com/bitflux-ai/blog_notes. Any ideas?

replies(2): >>45134269 #>>45134410 #
3. mastax ◴[] No.45134269[source]
MAP_HUGETLB can't be used for mmaping files on disk; it can only be used with MAP_ANONYMOUS, with a memfd, or with a file on a hugetlbfs pseudo-filesystem (which is also in memory).
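
For anyone who wants to try this restriction themselves, here is a minimal sketch, assuming Linux with glibc. The helper names `try_hugetlb_map` and `make_regular_file` are hypothetical and made up for illustration; they are not from the linked repositories. The idea is to request `MAP_HUGETLB` once on a regular temp file and once anonymously, then compare the errno values.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to map `len` bytes read-only with MAP_HUGETLB.
 * fd < 0 requests an anonymous mapping; otherwise fd is mapped privately.
 * `len` should be a multiple of the huge page size so munmap() accepts it.
 * Returns 0 on success (the mapping is released again) or mmap's errno. */
int try_hugetlb_map(int fd, size_t len)
{
    int flags = MAP_PRIVATE | MAP_HUGETLB;
    if (fd < 0)
        flags |= MAP_ANONYMOUS;

    void *p = mmap(NULL, len, PROT_READ, flags, fd, 0);
    if (p == MAP_FAILED)
        return errno;

    munmap(p, len);
    return 0;
}

/* Hypothetical helper: create an unlinked regular temp file of `len` bytes.
 * Assumes /tmp is an ordinary (non-hugetlbfs) filesystem.
 * Returns an open file descriptor, or -1 on failure. */
int make_regular_file(size_t len)
{
    char path[] = "/tmp/hugetlb_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file disappears once fd is closed */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling `try_hugetlb_map(make_regular_file(len), len)` is expected to fail on a typical setup (commonly with EINVAL). `try_hugetlb_map(-1, len)` should succeed only when huge pages have been reserved, for example through /proc/sys/vm/nr_hugepages; otherwise it tends to fail with ENOMEM.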
replies(2): >>45134451 #>>45135606 #
4. jandrewrogers ◴[] No.45134410[source]
Read the man pages: there are restrictions on using the huge page option with mmap() that mean it won’t do what you might intuit it will in many cases. Getting reliable huge page mappings is a bit fussy on Linux. It is easier to control in a direct I/O context.
5. inetknght ◴[] No.45134451{3}[source]
> MAP_HUGETLB can't be used for mmaping files on disk

False. I've successfully used it to memory-map networked files.

replies(4): >>45134599 #>>45134638 #>>45135603 #>>45140875 #
6. loloquwowndueo ◴[] No.45134599{4}[source]
Share your code?
replies(2): >>45134637 #>>45140890 #
7. inetknght ◴[] No.45134637{5}[source]
I don't work there any more (it was a decade ago) and I'm pretty busy right now with a new job coming up (offered today).

Do you have kernel documentation that says that hugetlb doesn't work for files? I don't see that stated anywhere.

replies(1): >>45136551 #
8. minitech ◴[] No.45134638{4}[source]
That doesn’t sound like the intended meaning of “on disk”.
replies(1): >>45134653 #
9. inetknght ◴[] No.45134653{5}[source]
Kernel doesn't really care about "on disk", it cares about "on filesystem".

The "on disk" distinction is a simplification.

replies(1): >>45134845 #
10. pclmulqdq ◴[] No.45134845{6}[source]
The kernel absolutely does care about the "on disk" distinction because it determines what driver to use.
replies(1): >>45135217 #
11. ddtaylor ◴[] No.45135217{7}[source]
The interface is handled by the kernel.
12. squirrellous ◴[] No.45135603{4}[source]
This is quite interesting since I, too, was under the impression that mmap cannot be used on disk-backed files with huge pages. I tried and failed to find any official kernel documentation around this, but I clearly remember trying to do this at work (on a regular ECS machine with Ubuntu) and getting errors.

Based on this SO discussion [1], it is possibly a limitation with popular filesystems like ext4?

If anyone knows more about this, I'd love to know what exactly the requirements are for using hugepages this way.

[1] https://stackoverflow.com/questions/44060678/huge-pages-for-...

replies(2): >>45136878 #>>45140880 #
13. mananaysiempre ◴[] No.45135606{3}[source]
It looks like there is in theory support for that[1]? But the patches for ext4[2] did not go through.

[1] https://lwn.net/Articles/686690/

[2] https://lwn.net/Articles/718102/

14. Sesse__ ◴[] No.45136551{6}[source]
It's filesystem-dependent. In particular, tmpfs will work. To the best of my knowledge, no “normal” filesystems (e.g., ext4, xfs) will.
replies(1): >>45145898 #
15. bawolff ◴[] No.45136878{5}[source]
Trying to google this, I found https://lwn.net/Articles/718102/, which suggests that there was discussion about it back in 2017. But I can't find anything else about it except a patchset that I guess wasn't merged (?). So maybe it was just a proposal that never made it in.

Honestly, I never knew any of this; I thought huge pages just worked for all of mmap.

16. mrlongroots ◴[] No.45138806[source]
You don't need hugepages for basic 5GB/s sequential scans. I don't know the exact circumstances that would cause TLB pressure, but this is not it.

You can maybe reduce the number of page faults, but you can do that by walking the mapped address space once before the actual benchmark too.

17. inetknght ◴[] No.45140875{4}[source]
My bad, don't use `MAP_HUGETLB`, just use `MAP_HUGE_1GB`.

See a quick example I whipped up here: https://github.com/inetknght/mmap-hugetlb

replies(2): >>45141621 #>>45142964 #
18. inetknght ◴[] No.45140880{5}[source]
My bad, don't use `MAP_HUGETLB`, just use `MAP_HUGE_1GB`.

See a quick example I whipped up here: https://github.com/inetknght/mmap-hugetlb

replies(1): >>45147460 #
19. inetknght ◴[] No.45140890{5}[source]
My bad, don't use `MAP_HUGETLB`, just use `MAP_HUGE_1GB`.

See a quick example I whipped up here: https://github.com/inetknght/mmap-hugetlb

20. jared_hulbert ◴[] No.45141621{5}[source]
Adding MAP_HUGE_1GB and not MAP_HUGETLB does compile and run for me. Not convinced that it's actually doing anything. Performance is the same.
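
One way to check whether a mapping is really backed by huge pages is to look it up in /proc/self/smaps, which reports a `KernelPageSize:` line per mapping. Below is a minimal sketch, assuming Linux with /proc mounted. `parse_kernel_page_kb` and `kernel_page_kb_for` are hypothetical helper names, not part of any library.

```c
#include <stdint.h>
#include <stdio.h>

/* Parse a line such as "KernelPageSize:        4 kB".
 * Returns the size in kB, or -1 if the line is a different field. */
long parse_kernel_page_kb(const char *line)
{
    long kb;
    if (sscanf(line, "KernelPageSize: %ld kB", &kb) == 1)
        return kb;
    return -1;
}

/* Look up the mapping that contains `addr` in /proc/self/smaps and return
 * its KernelPageSize in kB, or -1 if it cannot be determined. */
long kernel_page_kb_for(const void *addr)
{
    FILE *f = fopen("/proc/self/smaps", "r");
    if (!f)
        return -1;

    uintptr_t target = (uintptr_t)addr;
    char line[1024];
    int in_target = 0;
    long result = -1;

    while (fgets(line, sizeof line, f)) {
        unsigned long lo, hi;
        /* Mapping header lines look like "start-end perms offset ...". */
        if (sscanf(line, "%lx-%lx ", &lo, &hi) == 2) {
            in_target = (target >= lo && target < hi);
            continue;
        }
        if (in_target) {
            long kb = parse_kernel_page_kb(line);
            if (kb >= 0) {
                result = kb;
                break;
            }
        }
    }
    fclose(f);
    return result;
}
```

If `kernel_page_kb_for(buffer)` reports 4 on x86-64, the mapping is using regular 4 kB pages as far as hugetlb is concerned. Transparent huge pages would show up in other smaps fields such as `AnonHugePages` or `FilePmdMapped` instead.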
replies(1): >>45141792 #
21. inetknght ◴[] No.45141792{6}[source]
Well now that it works, feel free to start poking around at it for a follow-up blog post :)
22. bawolff ◴[] No.45142964{5}[source]
The mmap man page kind of implies that would be a no-op, but I haven't tested it myself.
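
For context, mmap(2) describes `MAP_HUGE_1GB` as a size selector used together with `MAP_HUGETLB`: the base-2 logarithm of the page size is encoded in six bits at offset `MAP_HUGE_SHIFT` (26). The sketch below only illustrates that encoding. The `DEMO_*` constants and both function names are hypothetical, defined locally so the example does not depend on header versions.

```c
/* Values as documented in mmap(2); named DEMO_* to make clear that they are
 * local copies for illustration, not the system header definitions. */
enum { DEMO_MAP_HUGE_SHIFT = 26, DEMO_MAP_HUGE_MASK = 0x3f };

/* Encode log2(page size) into the flag bits, e.g. 30 for 1 GB pages. */
unsigned int huge_size_flag(unsigned int log2_size)
{
    return (log2_size & DEMO_MAP_HUGE_MASK) << DEMO_MAP_HUGE_SHIFT;
}

/* Extract the encoded log2(page size) from a flags word. */
unsigned int huge_size_log2(unsigned int flags)
{
    return (flags >> DEMO_MAP_HUGE_SHIFT) & DEMO_MAP_HUGE_MASK;
}
```

Here `huge_size_flag(30)` yields 0x78000000, the bit pattern for 1 GB pages. Since these bits only pick a size for a hugetlb mapping, passing them without `MAP_HUGETLB` plausibly leaves nothing for them to modify, which would match the no-op reading; that part is an inference from the man page, not something verified here.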
23. inetknght ◴[] No.45145898{7}[source]
It works fine on my ext4 fs...
24. squirrellous ◴[] No.45147460{6}[source]
Cool! Thanks for the example. The aforementioned work thing requires MAP_SHARED as well, which IIRC is the reason it would fail when used together with files and huge pages, but private mappings work as you show.