
146 points returningfory2 | 3 comments
mmastrac ◴[] No.43645485[source]
This is a great way to see why invalid UTF-8 strings and invalid Unicode chars cause undefined behaviour in Rust. `char` is a special integer type whose valid range is known to be a sub-range of its storage type. Outside of dataless enums, this is the only datatype with this behaviour (EDIT: I neglected NonZero<...>/NonZeroXXX and some other zero-niche types).
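
A rough sketch of how to observe the niche from safe code: if `Option<T>` is the same size as `T`, the compiler has found an unused bit pattern in `T` to encode `None`. The helper name `option_is_free` is made up for illustration, and the result for `char` is what current rustc does rather than something I would rely on as a documented guarantee.

```rust
use std::mem::size_of;

/// Returns true if wrapping `T` in `Option` costs no extra bytes,
/// which suggests the compiler used a niche in `T` to represent `None`.
/// (Hypothetical helper, not part of the standard library.)
pub fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}
```

With this, `option_is_free::<char>()` and `option_is_free::<std::num::NonZeroU32>()` should be true, while `option_is_free::<u32>()` is false because every `u32` bit pattern is a valid value and a separate tag is needed.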

If you manage to construct an invalid char from an invalid string or any other way, you can defeat the niche optimization and accidentally create what amounts to an unsound transmute, which is game over for soundness.
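
For contrast, a minimal sketch of the safe entry points, which refuse to produce an invalid `char` or `str` in the first place. The wrapper names `checked_char` and `checked_str` are invented for this example; `char::from_u32` and `std::str::from_utf8` are the standard library functions doing the actual validation.

```rust
/// Converts a raw code point to `char`, rejecting surrogates
/// (0xD800..=0xDFFF) and anything above 0x10FFFF.
pub fn checked_char(raw: u32) -> Option<char> {
    char::from_u32(raw)
}

/// Borrows the bytes as `&str` only if they are valid UTF-8.
pub fn checked_str(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}
```

Getting an invalid value past these requires `unsafe` (for example the `_unchecked` variants), which is where the undefined behaviour comes from.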

replies(5): >>43645776 #>>43645961 #>>43646463 #>>43646643 #>>43651356 #
tlb ◴[] No.43646643[source]
If the compiler is using a niche, it should really check on every assignment that the value isn't accidentally the niche. That's still faster than also writing the tag.
replies(1): >>43647456 #
1. jenadine ◴[] No.43647456[source]
It doesn't need to check the assignment, because a value of that type cannot be the niche by construction.
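
A small sketch of what "by construction" looks like with `NonZeroU32`: the check happens once, in the safe constructor, so later assignments of an already-built value need no check. `make_id` is a hypothetical name for this example.

```rust
use std::num::NonZeroU32;

/// The only safe way in is `NonZeroU32::new`, which returns `None` for 0.
/// Any `NonZeroU32` that exists in safe code has therefore already been
/// validated and can be copied or assigned freely.
pub fn make_id(raw: u32) -> Option<NonZeroU32> {
    NonZeroU32::new(raw)
}
```

So `make_id(0)` is `None` and `make_id(7)` yields a value whose `.get()` is 7; there is no safe path that stores the niche value 0 in the type.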
replies(1): >>43651422 #
2. imtringued ◴[] No.43651422[source]
The problem is that you need to validate every potentially written niche after an unsafe block.

There is no generic way to re-validate structs in a bounded address space. You'd need something akin to a garbage collector that traces references at fixed offsets including type knowledge. This is not completely infeasible since Rust has a lot of information at compile time to avoid checks, but the extreme cases where people are writing to complicated graph-like structures inside `unsafe {}` can realistically only be dealt with through tracing all safe references that lie inside the bounded address space.

In practice it will also be a struggle to sandbox C code into a small enough CHERI-style address space so that you don't have to check literally your entire computer's memory after an FFI call.

It's not the enums that are the problem. `unsafe` can break anything if you are determined enough.

replies(1): >>43654054 #
3. im3w1l ◴[] No.43654054[source]
Isn't it the other way around? An unsafe block must respect niches.
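
To illustrate that reading, here is a sketch of an `unsafe` block that takes on the obligation itself: the code validates the value before calling `char::from_u32_unchecked`, so nothing needs to be re-checked afterwards. The function name `char_from_raw` is made up; in real code `char::from_u32` already does this.

```rust
/// Builds a `char` from a raw `u32`, upholding the niche invariant
/// before entering `unsafe` rather than relying on a later check.
pub fn char_from_raw(raw: u32) -> Option<char> {
    // A Unicode scalar value is at most 0x10FFFF and not a surrogate.
    let is_scalar = raw <= 0x10FFFF && !(0xD800..=0xDFFF).contains(&raw);
    if is_scalar {
        // SAFETY: `raw` was just checked to be a valid Unicode scalar value,
        // which is the precondition of `from_u32_unchecked`.
        Some(unsafe { char::from_u32_unchecked(raw) })
    } else {
        None
    }
}
```

Skipping the check and calling `from_u32_unchecked(0xD800)` would be undefined behaviour at that point, regardless of whether the value is ever stored inside an `Option` or enum.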