
95 points by ingve | 1 comment
londons_explore:
This is dumb. The abstraction is at the wrong level.

Applications should assume the page size is 1 byte. One should be able to map, protect, etc., memory ranges down to byte granularity - which is the granularity of everything else in computers. One fewer thing for programmers to worry about. History has shown that performance hacks with ongoing complexity tend not to survive (e.g. interlaced video).
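For concreteness, here is a minimal sketch of what the page-size constraint looks like from user space today with POSIX mmap/mprotect: the rounding-up step is exactly what a byte-granular interface would make unnecessary. The 100-byte size is just an arbitrary example.

    /* Today's page-granular protection model (POSIX mmap/mprotect).
     * mprotect() only works on whole, page-aligned pages, so protecting
     * a 100-byte object forces rounding out to the full page size. */
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        long page = sysconf(_SC_PAGESIZE);          /* typically 4096 or 16384 */
        size_t want = 100;                          /* bytes we actually care about */
        size_t len  = ((want + page - 1) / page) * page;  /* round up to whole pages */

        void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED) { perror("mmap"); return 1; }

        /* Make the region read-only: the granularity is the whole page,
         * not the 100 bytes we wanted. */
        if (mprotect(buf, len, PROT_READ) != 0) { perror("mprotect"); return 1; }

        printf("protected %zu bytes to cover a %zu-byte object (page = %ld)\n",
               len, want, page);
        munmap(buf, len);
        return 0;
    }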

At the hardware level, rather than fixing the page size by carving a set number of address bits out of every address, you have multiple page tables and multiple TLBs - e.g. one for 1-megabyte pages, one for 4-kilobyte pages, and one for individual-byte pages. The hardware checks all the tables simultaneously (parallelism is cheap in hardware!).
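A toy software model of that lookup, not real hardware, might look like the sketch below: one small TLB per page size, each indexed by the virtual page number at its own granularity. All structure and field names are invented for illustration; in silicon the three lookups would happen in the same cycle rather than in a loop.

    #include <stdint.h>
    #include <stdbool.h>

    #define TLB_ENTRIES 64

    struct tlb_entry { uint64_t vpn; uint64_t pfn; bool valid; };

    struct tlb {
        unsigned shift;                  /* log2 of the page size this TLB serves */
        struct tlb_entry e[TLB_ENTRIES];
    };

    /* One lookup in one TLB: index by the low bits of the virtual page number. */
    static bool tlb_lookup(const struct tlb *t, uint64_t va, uint64_t *pa)
    {
        uint64_t vpn = va >> t->shift;
        const struct tlb_entry *ent = &t->e[vpn % TLB_ENTRIES];
        if (ent->valid && ent->vpn == vpn) {
            *pa = (ent->pfn << t->shift) | (va & ((1ULL << t->shift) - 1));
            return true;
        }
        return false;
    }

    /* Three TLBs: shift 20 (1 MiB pages), 12 (4 KiB pages), 0 (byte "pages").
     * Hardware would probe all three in parallel; this loop just models it. */
    static bool translate(struct tlb tlbs[3], uint64_t va, uint64_t *pa)
    {
        for (int i = 0; i < 3; i++)
            if (tlb_lookup(&tlbs[i], va, pa))
                return true;
        return false;                    /* miss everywhere: walk the page tables */
    }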

The benefit is that, assuming the vast majority of bytes in a process's address space are covered by large mappings, you can fit far more mappings into the (divided-up) TLB - which gives better performance too, while still allowing precise byte-level protections.

The only added complexity is in the OS, which has to fit the mappings the application asks for into what the hardware can do (i.e. 123456 bytes might become 30 four-kilobyte pages plus 576 single-byte pages).
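A minimal sketch of that greedy split, reproducing the 123456-byte example (30 × 4096 = 122880, leaving 576 single-byte pages); the set of page sizes is just the one assumed above:

    #include <stdio.h>
    #include <stddef.h>

    int main(void)
    {
        /* Largest-first decomposition into the hardware's page sizes. */
        const size_t sizes[] = { 1 << 20, 4096, 1 };   /* 1 MiB, 4 KiB, 1 B */
        size_t remaining = 123456;

        for (size_t i = 0; i < sizeof sizes / sizeof sizes[0]; i++) {
            size_t count = remaining / sizes[i];
            remaining   -= count * sizes[i];
            if (count)
                printf("%zu page(s) of %zu byte(s)\n", count, sizes[i]);
        }
        return 0;
    }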

pjc50:
If you're going to go that far, you might as well move malloc() into hardware and start using ARM-style secure tagged pointers. Then finally C users can be free of memory allocation bugs.
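To make the tagged-pointer idea concrete, here is a toy software mock-up of the concept - not ARM's actual MTE instructions or intrinsics, and all helper names are invented. It stashes a small tag in the unused top byte of a 64-bit pointer and checks it before use, the way MTE compares a pointer's tag against the memory granule's tag in hardware; it assumes real user-space pointers have a zero top byte, as on typical 64-bit systems.

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define TAG_SHIFT 56

    /* Put a 1-byte tag into the top byte of a pointer. */
    static void *tag_ptr(void *p, uint8_t tag)
    {
        return (void *)(((uintptr_t)p & ~(0xFFULL << TAG_SHIFT))
                        | ((uintptr_t)tag << TAG_SHIFT));
    }

    /* Check the tag and strip it before dereferencing; MTE would do this
     * check in hardware and fault on mismatch. */
    static void *check_and_strip(void *p, uint8_t expected_tag)
    {
        uint8_t tag = (uintptr_t)p >> TAG_SHIFT;
        if (tag != expected_tag) {
            fprintf(stderr, "tag mismatch: possible use-after-free or overflow\n");
            abort();
        }
        return (void *)((uintptr_t)p & ~(0xFFULL << TAG_SHIFT));
    }

    int main(void)
    {
        uint8_t tag = 0x5;
        char *raw = malloc(16);
        char *tp  = tag_ptr(raw, tag);   /* what a tagging allocator would hand out */

        char *usable = check_and_strip(tp, tag);
        usable[0] = 'x';                 /* access allowed: tags match */

        free(raw);
        /* A tagging allocator would retag the memory on free, so reusing the
         * stale tagged pointer would then trip the check above. */
        return 0;
    }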