283 points ghuntley | 20 comments
bawolff ◴[] No.45133765[source]
Shouldn't you also compare to mmap with the huge page option? My understanding is it's precisely meant for this circumstance. I don't think it's a fair comparison without it.

Respectfully, the title feels a little clickbaity to me. Both methods are still ultimately reading out of memory, they are just using different i/o methods.

replies(2): >>45134007 #>>45138806 #
jared_hulbert ◴[] No.45134007[source]
The original blog post title is intentionally clickbaity. You know, to bait people into clicking. Also I do want to challenge people to really think here.

Seeing if the cached file data can be accessed quickly is the point of the experiment. I can't get mmap() to open a file with huge pages.

void* buffer = mmap(NULL, size_bytes, PROT_READ, (MAP_HUGETLB | MAP_HUGE_1GB), fd, 0); doesn't work.

You can see my code here: https://github.com/bitflux-ai/blog_notes. Any ideas?

replies(2): >>45134269 #>>45134410 #
1. mastax ◴[] No.45134269[source]
MAP_HUGETLB can't be used for mmaping files on disk, it can only be used with MAP_ANONYMOUS, with a memfd, or with a file on a hugetlbfs pseudo-filesystem (which is also in memory).
replies(2): >>45134451 #>>45135606 #
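A minimal sketch of the distinction described in the comment above, assuming Linux with glibc and a 2 MiB default huge page size (an assumption; the real value is listed as Hugepagesize in /proc/meminfo). The helper names `map_anon_prefer_huge` and `try_file_hugetlb` are hypothetical and are not taken from anyone's repository in this thread. The first helper requests an anonymous `MAP_HUGETLB` mapping and falls back to normal pages if none are reserved. The second attempts `MAP_HUGETLB` on an ordinary file descriptor and reports the resulting errno instead of guessing what it will be. Both calls pass `MAP_PRIVATE`, because mmap() requires one of `MAP_PRIVATE` or `MAP_SHARED`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumption: 2 MiB default huge page size (typical on x86-64). */
#define HUGE_2MB ((size_t)2 << 20)

/*
 * Hypothetical helper: map n_pages * 2 MiB of anonymous memory,
 * preferring huge pages. Sets *used_huge to 1 if the MAP_HUGETLB
 * request succeeded and to 0 if the fallback to normal pages was used.
 * Returns NULL only if both attempts fail.
 */
static void *map_anon_prefer_huge(size_t n_pages, int *used_huge)
{
    assert(n_pages > 0 && used_huge != NULL);
    size_t len = n_pages * HUGE_2MB;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p != MAP_FAILED) {
        *used_huge = 1;
        return p;
    }

    /* Commonly fails with ENOMEM when no huge pages are reserved. */
    *used_huge = 0;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical helper: try MAP_HUGETLB on an existing file descriptor.
 * Returns 0 if the mapping succeeded, otherwise the errno set by mmap().
 * For a regular on-disk file the expected result is EINVAL, but that is
 * an expectation to verify on your own kernel and filesystem.
 */
static int try_file_hugetlb(int fd, size_t len)
{
    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE | MAP_HUGETLB, fd, 0);
    if (p == MAP_FAILED)
        return errno;
    munmap(p, len);
    return 0;
}
```

Calling `try_file_hugetlb(fd, HUGE_2MB)` on a descriptor opened from ext4, tmpfs, or hugetlbfs and printing the result with strerror() is one way to see which filesystems accept the flag on a given machine. A mapping from `map_anon_prefer_huge(n, &used)` should be released with `munmap(p, n * HUGE_2MB)`.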
2. inetknght ◴[] No.45134451[source]
> MAP_HUGETLB can't be used for mmaping files on disk

False. I've successfully used it to memory-map networked files.

replies(4): >>45134599 #>>45134638 #>>45135603 #>>45140875 #
3. loloquwowndueo ◴[] No.45134599[source]
Share your code?
replies(2): >>45134637 #>>45140890 #
4. inetknght ◴[] No.45134637{3}[source]
I don't work there any more (it was a decade ago) and I'm pretty busy right now with a new job coming up (offered today).

Do you have kernel documentation that says that hugetlb doesn't work for files? I don't see that stated anywhere.

replies(1): >>45136551 #
5. minitech ◴[] No.45134638[source]
That doesn’t sound like the intended meaning of “on disk”.
replies(1): >>45134653 #
6. inetknght ◴[] No.45134653{3}[source]
Kernel doesn't really care about "on disk", it cares about "on filesystem".

The "on disk" distinction is a simplification.

replies(1): >>45134845 #
7. pclmulqdq ◴[] No.45134845{4}[source]
The kernel absolutely does care about the "on disk" distinction because it determines what driver to use.
replies(1): >>45135217 #
8. ddtaylor ◴[] No.45135217{5}[source]
The interface is handled by the kernel.
9. squirrellous ◴[] No.45135603[source]
This is quite interesting since I, too, was under the impression that mmap cannot be used on disk-backed files with huge pages. I tried and failed to find any official kernel documentation around this, but I clearly remember trying to do this at work (on a regular ECS machine with Ubuntu) and getting errors.

Based on this SO discussion [1], it is possibly a limitation with popular filesystems like ext4?

If anyone knows more about this, I'd love to know what exactly are the requirements for using hugepages this way.

[1] https://stackoverflow.com/questions/44060678/huge-pages-for-...

replies(2): >>45136878 #>>45140880 #
10. mananaysiempre ◴[] No.45135606[source]
It looks like there is in theory support for that[1]? But the patches for ext4[2] did not go through.

[1] https://lwn.net/Articles/686690/

[2] https://lwn.net/Articles/718102/

11. Sesse__ ◴[] No.45136551{4}[source]
It's filesystem-dependent. In particular, tmpfs will work. To the best of my knowledge, no “normal” filesystems (e.g., ext4, xfs) will.
replies(1): >>45145898 #
12. bawolff ◴[] No.45136878{3}[source]
Trying to google this i found https://lwn.net/Articles/718102/ which suggests that there was discussion about it back in 2017. But i can't find anything else about it except a patchset that i guess wasn't merged (?). So maybe it was just a proposal that never made it in.

Honestly, i never knew any of this; i thought huge pages just worked for all of mmap.

13. inetknght ◴[] No.45140875[source]
My bad, don't use `MAP_HUGETLB`, just use `MAP_HUGE_1GB`.

See a quick example I whipped up here: https://github.com/inetknght/mmap-hugetlb

replies(2): >>45141621 #>>45142964 #
14. inetknght ◴[] No.45140880{3}[source]
My bad, don't use `MAP_HUGETLB`, just use `MAP_HUGE_1GB`.

See a quick example I whipped up here: https://github.com/inetknght/mmap-hugetlb

replies(1): >>45147460 #
15. inetknght ◴[] No.45140890{3}[source]
My bad, don't use `MAP_HUGETLB`, just use `MAP_HUGE_1GB`.

See a quick example I whipped up here: https://github.com/inetknght/mmap-hugetlb

16. jared_hulbert ◴[] No.45141621{3}[source]
Adding MAP_HUGE_1GB and not MAP_HUGETLB does compile and run for me. Not convinced that it's actually doing anything. Performance is the same.
replies(1): >>45141792 #
17. inetknght ◴[] No.45141792{4}[source]
Well now that it works, feel free to start poking around at it for a follow-up blog post :)
18. bawolff ◴[] No.45142964{3}[source]
The mmap man page kind of implies that would be a no-op, but i haven't tested myself.
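One way to check rather than guess is to ask the kernel which page size backs a mapping. The sketch below assumes Linux with a readable /proc/self/smaps. The function name `kernel_page_size_kb` is hypothetical. It finds the smaps entry containing an address and returns that entry's KernelPageSize field in kB. A hugetlb-backed mapping should report the huge page size there (for example 2048 or 1048576), while a value of 4 on x86-64 would suggest the flag had no effect. Transparent huge pages are reported in other smaps fields (such as AnonHugePages), which this sketch does not read.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: return the KernelPageSize (in kB) that
 * /proc/self/smaps reports for the mapping containing addr,
 * or -1 if the file cannot be read or no entry matches.
 */
static long kernel_page_size_kb(const void *addr)
{
    assert(addr != NULL);

    FILE *f = fopen("/proc/self/smaps", "r");
    if (!f)
        return -1;

    uintptr_t target = (uintptr_t)addr;
    char line[1024];
    int in_target = 0;   /* true while inside the entry that holds addr */
    long result = -1;

    while (fgets(line, sizeof line, f)) {
        unsigned long start, end;
        long kb;

        /* Entry headers look like "start-end perms offset dev inode path". */
        if (sscanf(line, "%lx-%lx ", &start, &end) == 2) {
            in_target = (target >= start && target < end);
        } else if (in_target &&
                   sscanf(line, "KernelPageSize: %ld kB", &kb) == 1) {
            result = kb;
            break;
        }
    }

    fclose(f);
    return result;
}
```

Calling `kernel_page_size_kb(buffer)` right after mmap() returns, and again after touching the memory, gives a direct answer for a particular combination of flags and filesystem.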
19. inetknght ◴[] No.45145898{5}[source]
It works fine on my ext4 fs...
20. squirrellous ◴[] No.45147460{4}[source]
Cool! Thanks for the example. The aforementioned work thing requires MAP_SHARED as well which IIRC is the reason it would fail when used together with files and huge pages, but private mappings work as you show.