This is great, and a bit of a buried lede. Some of the economics of mercenary spyware depend on chains with interchangeable parts, and countermeasures targeting that property directly are interesting.
They also imply a very different system architecture.
Why would you need MTE if you have CHERI?
But here’s a reason to do both: CHERI’s UAF story isn’t great. Adding MTE means you at least get a probabilistic story.
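To put a rough number on "probabilistic", here is a back-of-the-envelope sketch. It assumes 4-bit tags chosen uniformly at random and independent guesses; real allocators may reserve or exclude some tag values, so treat the figures as illustrative. The function names are made up for this example.

```python
def miss_probability(tag_bits: int = 4) -> float:
    """Chance that a stale or forged pointer's tag happens to match the
    memory tag, assuming tags are uniformly random (an assumption)."""
    return 1 / (1 << tag_bits)


def all_guesses_succeed(guesses: int, tag_bits: int = 4) -> float:
    """Chance that an exploit needing `guesses` independent tag matches
    gets through without a single tag check fault."""
    return miss_probability(tag_bits) ** guesses


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, all_guesses_succeed(k))
```

Under those assumptions a single bad access slips through 1 time in 16, and every additional tag the exploit has to guess multiplies the odds against it by another factor of 16.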
I think it's two halves of the same coin and Apple chose the second half of the coin.
The two systems are largely orthogonal; I think if Apple chose to go from one to the other it will be a generational change rather than an incremental one. The advantage of MTE/MIE is you can do it incrementally by just changing the high bits the allocator supplies; CHERI requires a fundamental paradigm shift. Apple love paradigm shifts but there's no indication they're going to do one here; if they do, it will be a separate effort.
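As a rough illustration of "just changing the high bits the allocator supplies", here is a toy Python model of tag checking. It assumes the Arm MTE layout (a 4-bit tag in pointer bits 59:56, one memory tag per 16-byte granule); the `TaggedHeap` class, its bump allocator, and the `TagCheckFault` exception are hypothetical names for this sketch, not any real allocator or API.

```python
import secrets

GRANULE = 16                      # one memory tag per 16-byte granule
TAG_BITS = 4                      # 16 possible tag values
TAG_SHIFT = 56                    # tag sits in pointer bits 59:56
ADDR_MASK = (1 << TAG_SHIFT) - 1
TAG_MASK = (1 << TAG_BITS) - 1


class TagCheckFault(Exception):
    """Raised when a pointer's tag does not match the memory tag."""


class TaggedHeap:
    """Toy bump allocator that tags memory (hypothetical, never reuses memory)."""

    def __init__(self, size: int):
        self.tags = [0] * (size // GRANULE)
        self.next = 0

    def _retag(self, addr: int, size: int, exclude: int) -> int:
        # Pick a random tag that differs from `exclude`, then apply it
        # to every granule covered by the allocation.
        tag = exclude
        while tag == exclude:
            tag = secrets.randbelow(1 << TAG_BITS)
        first = addr // GRANULE
        last = (addr + size + GRANULE - 1) // GRANULE
        for g in range(first, last):
            self.tags[g] = tag
        return tag

    def malloc(self, size: int) -> int:
        size = -(-size // GRANULE) * GRANULE      # round up to a granule
        if self.next + size > len(self.tags) * GRANULE:
            raise MemoryError("toy heap exhausted")
        addr = self.next
        self.next += size
        tag = self._retag(addr, size, self.tags[addr // GRANULE])
        return (tag << TAG_SHIFT) | addr          # tag goes in the high bits

    def check(self, ptr: int) -> int:
        # What the hardware does on every load/store: compare the tags.
        addr = ptr & ADDR_MASK
        if (ptr >> TAG_SHIFT) & TAG_MASK != self.tags[addr // GRANULE]:
            raise TagCheckFault(hex(ptr))
        return addr

    def free(self, ptr: int, size: int) -> None:
        addr = self.check(ptr)
        # Retag on free so stale pointers stop matching.
        self._retag(addr, size, (ptr >> TAG_SHIFT) & TAG_MASK)
```

In this model only `malloc` and `free` know about tags; code that just passes the returned pointer around is unchanged, which is the incremental-adoption point. A dangling pointer used right after `free` always faults here, but once memory is reused the stale tag can match the new one by chance, so detection is probabilistic in general.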
Overall my _personal_ opinion is that CHERI is a huge win at a huge cost, while MTE is a huge win at a low cost. But, there are definitely vulnerability classes that each system excels at.
That’s strictly better, in theory.
(Not sure it’s practically better. You could make an argument that it’s not.)
And CHERI fixes it only optionally, if you accept having to change a lot more code
I also think this argument is compelling because one exists in millions of consumer devices, with more to come (MTE -> MIE), and one does not.
> We have used CHERI’s ISA facilities as a foundation to build a software object-capability model supporting orders of magnitude greater compartmentalization performance, and hence granularity, than current designs. We use capabilities to build a hardware-software domain-transition mechanism and programming model suitable for safe communication between mutually distrusting software
and https://github.com/CTSRD-CHERI/cheripedia/wiki/Colocation-Tu...
> Processes are Unix' natural compartments, and a lot of existing software makes use of that model. The problem is, they are heavy-weight; communication and context switching overhead make using them for fine-grained compartmentalisation impractical. Cocalls, being fast (order of magnitude slower than a function call, order of magnitude faster than a cheapest syscall), aim to fix that problem.
This functionality revolves around two functions: cocall(2) for the caller (client) side, and coaccept(2) for the callee (service) side. Underneath they are implemented using CHERI magic in the form of CInvoke / LDPBR CPU instruction to switch protection domains without the need to enter the kernel, but from the API user point of view they mostly look like ordinary system calls and follow the same conventions, errno et al.
There's a decent chance that we get back whatever performance we pay for CHERI with interest as new systems architecture possibilities open up.
MTE helps us secure existing architectures. CHERI makes new architectures possible.
Okay a bit drastic, I don’t really know if this will affect them.
That's Apple and here is Google (who have been at memory safety since the early Chrome/Android days):
Google folks were responsible for pushing on Hardware MTE ... It originally came from the folks who also did work on ASAN, syzkaller, etc ... with the help and support of folks in Android ... ARM/etc as well.
I was the director for the teams that created/pushed on it ... So I'm very familiar with the tradeoffs.
...
Put another way - the goal was to make it possible to have the equivalent of ASAN be flipped on and off when you want it.
Keeping it on all the time as a security mitigation was a secondary possibility, and has issues besides memory overhead.
For example, you will suddenly cause tons of user-visible crashes. But not even consistently. You will crash on phones with MTE, but not without it (which is most of them).
This is probably not the experience you want for a user.
For a developer, you would now have to force everyone to test on MTE enabled phones when there are ~1mn of them. This is not likely to make developers happy.
Are there security exploits it will mitigate? Yes, they will crash instead of be exploitable. Are there harmless bugs it will catch? Yes.
...
As an aside - It's also not obvious it's the best choice for run-time mitigation.
https://news.ycombinator.com/item?id=39671337
Google Security (ex: TAG & Project Zero) do so much to tackle CVEs but with MTE the mothership dropped the ball so hard.
AOSP's security posture is frustrating (as Google seemingly solely decides what's good and what's bad and imposes that decision on each of their 3bn users & ~1m developers, despite some in the security community, like Daniel Micay, urging them to reconsider). The steps Apple has been taking (in both empowering the developers and locking down its own OS) in response to the Celebgate and Pegasus hacks have been commendable.
There is a section in the technical reports that talks about garbage collection.
I don't think CHERI is currently being used with different privileged threads in the same address space.
I do agree it is a pain not seeing this becoming widely adopted.
As for disabling JIT, it would have the same effect as early Androids, lagging behind Symbian devices, with applications that were wrappers around NDK code.
Well, Apple already routinely forces developers to recompile their applications so if Apple wants to introduce something needing a compiler / toolchain update they can do that easily. And they also control the entire SoC from start to finish and unlike pretty much everyone else also hold an ARM Architecture License so they can go and change whatever they want in the hardware side as well.
Not to mention the dynamic linker.
Maybe you've been confused by a description of how it works inside a processor. In early CHERI designs, capabilities were in different architectural processor registers from integers.
In recent CHERI designs, the same register numbers are used for capabilities and other registers. A micro-architecture could be designed to have either all registers be capability registers with the tag bit, or use register renaming to separate integer and capability registers.
I suppose a CHERI MCU for embedded systems with small memory could theoretically have tag pages in separate SRAM instead of caching main memory, but I have not seen that.
With CHERI, there is nothing to guess. You either have a capability or you don't.
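A toy Python model of that "have it or you don't" property, for illustration only. It assumes the usual CHERI ingredients (a validity tag, bounds, and monotonic narrowing); the `Cap` class and the `restrict`, `load`, and `from_integer` helpers are invented for this sketch and do not correspond to any real CHERI API. Permissions and sealing are left out.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cap:
    """Toy capability: bounds, a cursor address, and a validity tag."""
    base: int
    length: int
    addr: int
    tag: bool = True

    def restrict(self, base: int, length: int) -> "Cap":
        # Monotonicity: a derived capability can only shrink its bounds.
        if (not self.tag or length < 0 or base < self.base
                or base + length > self.base + self.length):
            raise PermissionError("cannot derive wider bounds")
        return Cap(base, length, base)

    def offset(self, delta: int) -> "Cap":
        # Moving the cursor is allowed; the check happens on access.
        return replace(self, addr=self.addr + delta)


def from_integer(addr: int) -> Cap:
    # Plain integer data reinterpreted as a capability has its tag clear.
    return Cap(0, 2 ** 64, addr, tag=False)


def load(mem: bytearray, cap: Cap, size: int = 1) -> bytes:
    if not cap.tag:
        raise PermissionError("tag violation")
    if cap.addr < cap.base or cap.addr + size > cap.base + cap.length:
        raise PermissionError("bounds violation")
    return bytes(mem[cap.addr:cap.addr + size])
```

Unlike a tag match, nothing here depends on chance: an access either goes through a valid capability whose bounds cover it, or it faults. The same `restrict` step applied to a field inside an object is the toy equivalent of sub-object bounds.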
When I say that this optional feature would force you to change a lot more code, I’m comparing CHERI without intra-object overflow protection to CHERI with intra-object overflow protection.
Finally, 6 million lines of code is not that impressive. Real OSes are measured in billions.