A less baity title might be "Rust pitfalls: Runtime correctness beyond memory safety."
This regularly drives C++ programmers mad: the statement "C++ is all unsafe" is taken as some kind of hyperbole, attack or dogma, while the intent may well be to factually point out the lack of statically checked guarantees.
It is subtle but not inconsistent that strong static checks ("safe Rust") may still leave the possibility of runtime errors. So there is a legitimate, useful broader notion of "safety" where Rust's static checking is not enough. That's a bit hard to express in a title - "correctness" is not bad, but maybe a bit too strong.
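A minimal illustration of that first point: the following passes every one of safe Rust's static checks, yet still fails at runtime.

    fn main() {
        let v = vec![1, 2, 3];
        // Compiles under all of safe Rust's checks, yet panics at
        // runtime with an out-of-bounds error: "safe" rules out
        // memory unsafety, not runtime failure.
        println!("{}", v[10]);
    }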
You might be talking about "correctness", and it's true that Rust generally favors correctness more than most other languages (e.g. Rust being obstinate about turning a byte array into a file path, because not all file paths are byte arrays, or e.g. the myriad string types that encode their semantics).
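A small sketch of what that obstinacy looks like in practice, using only standard library APIs: there is no direct, portable &[u8] -> &Path conversion, so you either go through a checked UTF-8 conversion or opt in to a platform-specific one.

    use std::path::Path;
    use std::str;

    fn main() {
        let bytes: &[u8] = b"/tmp/\xff/log";

        // Path::new(bytes) does not compile: paths are not byte
        // arrays on every platform (Windows uses 16-bit units).
        // The checked conversion forces you to handle failure:
        match str::from_utf8(bytes) {
            Ok(s) => println!("valid UTF-8 path: {:?}", Path::new(s)),
            Err(e) => println!("not UTF-8, decide what to do: {e}"),
        }

        // On Unix only, an explicit platform-specific opt-in exists:
        #[cfg(unix)]
        {
            use std::os::unix::ffi::OsStrExt;
            let p = Path::new(std::ffi::OsStr::from_bytes(bytes));
            println!("unix-only path: {:?}", p);
        }
    }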
But this can be used with other enums too, and in those cases, having a zero NonZero would essentially transmute the enum into an unexpected variant, which may cause an invariant to break, thus potentially causing memory unsafety in whatever required that invariant.
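To see the niche concretely, a small demonstration with std's NonZero (the generic form needs Rust 1.79+): Option<NonZero<u8>> is the same size as u8, because the value 0 is reserved to represent None.

    use std::mem::size_of;
    use std::num::NonZero;

    fn main() {
        // No separate tag byte is needed: the compiler uses 0 as
        // the niche, so None is represented as the byte 0.
        assert_eq!(size_of::<NonZero<u8>>(), 1);
        assert_eq!(size_of::<Option<NonZero<u8>>>(), 1);

        // A zero smuggled into a NonZero<u8> would therefore read
        // back as None here: the "unexpected variant" described above.
    }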
By that standard anything and everything might be tainted as "unsafe", which is precisely GP's point. Whether the unsafety should be blamed on the outside code that's allowed to create a 0-valued NonZero<…> or on the code that requires this purported invariant in the first place is ultimately a matter of judgment, that people may freely disagree about.
The issue is that this could potentially allow creating a struct whose invariants are broken in safe Rust. This breaks encapsulation, which means modules that use unsafe code internally (like `std::vec`) have no way to stop safe code from handing them values whose safety invariants are already broken. Let me give an example, starting with an enum definition:
    // Assume std::vec has this definition
    struct Vec<T> {
        capacity: usize,
        length: usize,
        arena: *mut T,
    }

    enum Example {
        First {
            capacity: usize,
            length: usize,
            arena: usize,
            discriminator: NonZero<u8>,
        },
        Second {
            vec: Vec<u8>,
        },
    }
Now assume the compiler has applied the niche optimization: if the byte corresponding to `discriminator` is 0, the enum is `Example::Second`, while if that byte is non-zero, the enum is `Example::First` with `discriminator` equal to the given non-zero value. Furthermore, assume that `Example::First`'s `capacity`, `length`, and `arena` fields sit at the same offsets as the fields of the same name in `Example::Second.vec`. If we allowed `fn NonZero::new_unchecked(u8) -> NonZero<u8>` to be a safe function, we could create an invalid Vec:

    fn main() {
        let evil = NonZero::new_unchecked(0);
        // We write an Example::First,
        // but it is read back as an Example::Second,
        // because discriminator == 0 and niche optimization
        let first = Example::First {
            capacity: 9001,
            length: 9001,
            arena: 0x20202020,
            discriminator: evil,
        };
        if let Example::Second { vec: bad_vec } = first {
            // If the layout of Example is as I described,
            // and no optimizations occur, we end up in here.
            // This writes 255 to address 0x20202020
            bad_vec[0] = 255;
        }
    }
So if we allowed new_unchecked to be safe, it would be impossible to write a sound definition of Vec.

It's not a matter of judgment, though. NonZero<T> has an invariant that a zero value is undefined behavior. Therefore, any API that allows creating a zero-valued NonZero<T> must be unsafe. This is a very straightforward case.
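And that is exactly how the standard library splits the API today (sketch using the generic NonZero, Rust 1.79+): the checked constructor is safe because it forces the caller to handle zero, while the unchecked one shifts that proof obligation to the caller and is marked unsafe.

    use std::num::NonZero;

    fn main() {
        // Safe: the zero case is surfaced as None, so the
        // invariant cannot be silently broken.
        assert!(NonZero::<u8>::new(0).is_none());
        let ok = NonZero::<u8>::new(42).unwrap();
        println!("{}", ok.get());

        // Unsafe: the caller promises the value is non-zero.
        // Passing 0 here would be undefined behavior.
        let also_ok = unsafe { NonZero::<u8>::new_unchecked(42) };
        assert_eq!(ok, also_ok);
    }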