Smart pointers for the kernel

(lwn.net)


smodo ◴[18 Oct 24 02:38 UTC] No.41875908[source]▶

I’m not very well versed in kernel development. But I am a Rust dev and have observed the discussion about Rust in Linux with interest… Having said that, this part of the article has me baffled:

>> implementing these features for a smart-pointer type with a malicious or broken Deref (the trait that lets a programmer dereference a value) implementation could break the guarantees Rust relies on to determine when objects can be moved in memory. (…) [In] keeping with Rust's commitment to ensuring safe code cannot cause memory-safety problems, the RFC also requires programmers to use unsafe (specifically, implementing an unsafe marker trait) as a promise that they've read the relevant documentation and are not going to break Pin.

To the uninformed this seems like crossing the very boundary that you wanted Rust to uphold? Yes it’s only an impl Trait but still… I can hear the C devs now. ‘We pinky promise to clean up after our mallocs too!’
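
For readers unfamiliar with the mechanism the article describes, here is a minimal sketch of an unsafe marker trait. The names `StableDeref`, `MyBox`, and `address_is_stable` are hypothetical and chosen for illustration; they are not the names used in the RFC. The point is only that the promise has to be written down as `unsafe impl`, while code that relies on the promise stays safe.

```rust
use std::ops::Deref;

/// Hypothetical marker trait (an assumption for this sketch, not the RFC's trait).
///
/// # Safety
/// Implementors promise that `deref` keeps returning the same address
/// for as long as the pointer itself is not mutated or dropped.
unsafe trait StableDeref: Deref {}

/// A toy smart pointer that simply wraps a `Box`.
struct MyBox<T>(Box<T>);

impl<T> Deref for MyBox<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}

// The `unsafe` keyword here is the written promise: the compiler cannot
// check the contract above, so the implementor vouches for it.
unsafe impl<T> StableDeref for MyBox<T> {}

/// Safe generic code can demand the promise through a trait bound.
/// This helper only observes two consecutive derefs; it is not a proof.
fn address_is_stable<P: StableDeref>(p: &P) -> bool {
    std::ptr::eq(&**p, &**p)
}

fn main() {
    let boxed = MyBox(Box::new(42));
    assert!(address_is_stable(&boxed));
    assert_eq!(*boxed, 42);
}
```

Removing the `unsafe impl` line makes the call to `address_is_stable` fail to compile, so the only place a broken promise can hide is a line that contains the word `unsafe`.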

replies(7): >>41875965 #>>41876037 #>>41876088 #>>41876177 #>>41876213 #>>41876426 #>>41877004 #

foundry27 ◴[18 Oct 24 02:53 UTC] No.41875965[source]▶

>>41875908 #

Rust’s whole premise of guaranteed memory safety through compile-time checks has always been undermined when confronted with the reality that certain foundational operations must still be implemented using unsafe. Inevitably folks concede that lower-level libraries will have these unsafe blocks and still expect higher-level code to trust them, and at that point we’ve essentially recreated the core paradigm of C: trust in the programmer’s diligence. Yeah, Rust makes this trust visible, but it doesn’t actually eliminate it in “hard” code.

The punchline here, so to speak, is that for all Rust’s claims to revolutionize safety, it simply(!) formalizes the same unwritten social contract C developers have been meandering along with for decades. The uniqueness boils down to “we still trust the devs, but at least now we’ve made them swear on it in writing”.

replies(10): >>41876016 #>>41876042 #>>41876122 #>>41876128 #>>41876303 #>>41876330 #>>41876352 #>>41876459 #>>41876891 #>>41877732 #

kelnos ◴[18 Oct 24 03:09 UTC] No.41876042[source]▶

>>41875965 #

I don't think you're giving Rust enough credit here.

For those projects that don't use any unsafe, we can say -- absent compiler bugs or type system unsoundness -- that there will be no memory leaks or data races or undefined behavior. That's useful! Very useful!

For projects that do need unsafe, that unsafe code can be cordoned off into a corner where it can be made as small as possible, and can be audited. The rest of the code base is just as safe as one with no unsafe at all. This is also very useful!
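
As a rough illustration of "cordoned off into a corner", the sketch below keeps a single `unsafe` block behind a safe function. The helper `pair_mut` is a made-up example, not an API from the standard library or the kernel; the checks before the `unsafe` block are what an auditor would need to read.

```rust
/// Returns mutable references to two distinct elements of a slice,
/// or `None` if the indices are equal or out of bounds.
/// (Hypothetical helper, written for this sketch.)
fn pair_mut<T>(items: &mut [T], i: usize, j: usize) -> Option<(&mut T, &mut T)> {
    if i == j || i >= items.len() || j >= items.len() {
        return None;
    }
    let base = items.as_mut_ptr();
    // SAFETY: both indices are in bounds and distinct (checked above), so
    // the two references point into the slice and never alias each other.
    // The slice stays mutably borrowed for as long as they are alive.
    unsafe { Some((&mut *base.add(i), &mut *base.add(j))) }
}

fn main() {
    let mut values = vec![1, 2, 3];
    if let Some((first, last)) = pair_mut(&mut values, 0, 2) {
        std::mem::swap(first, last);
    }
    assert_eq!(values, vec![3, 2, 1]);

    // Callers cannot misuse the function from safe code: bad input is rejected.
    assert!(pair_mut(&mut values, 1, 1).is_none());
    assert!(pair_mut(&mut values, 0, 3).is_none());
}
```

Everything outside `pair_mut` is ordinary safe Rust, so an audit for memory-safety bugs in this program only has to cover the few lines inside the function.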

Now, sure, if most projects needed to use unsafe, and/or if most projects had to write a significant amount of unsafe, then sure, I'd agree with you. But that's just not the reality for nearly all projects.

With C, everything is unsafe. Everything can have memory leaks or data races or undefined behavior. Audits for these issues need to examine every single line of code. Compilers and linters and sanitizers can help you here, but they can never be comprehensive or guarantee the absence of problems.

I've been writing C for more than 20 years now. I still write memory leaks. I still write NULL pointer dereferences. I still struggle sometimes to get my data ownership (and/or locking) right when I have to write multithreaded code. When I get to write Rust, I'm so happy that I don't have to worry about those things, or spend time with valgrind or ASAN or clang's scan-build to figure out what I've done wrong. Rust lets me focus more on what I actually care about, the actual code and algorithms and structure of my program.

replies(11): >>41876080 #>>41876102 #>>41876214 #>>41876335 #>>41876602 #>>41876895 #>>41877492 #>>41877865 #>>41880946 #>>41882084 #>>41888463 #

1. hansvm ◴[18 Oct 24 05:33 UTC] No.41876602[source]▶

>>41876042 #

This is giving Rust a bit too much credit though.

- Memory leaks are not just possible in Rust, they're easy to write and mildly encouraged by the constraints the language places on you. IME I see more leaks in Rust in the wild than in C, C#, Python, C++, ...

- You can absolutely have data races in a colloquial sense in Rust, just not in the sense of the narrower definition they created to be able to say they don't have data races. An easy way to do so is choosing the wrong memory ordering for atomic loads and stores, including subtle issues like those arising from mixing `seq_cst` and `acquire`. I think those kinds of bugs are rare in the wild, but one project I inherited was riddled with data races in safe Rust.

- Unsafe is a kind of super-unsafe that's harder to write correctly than C or C++, limiting its utility as an escape hatch. It'll trigger undefined behavior in surprising ways if you don't adhere to a long list of rules in your unsafe code blocks (in a way which safe code can detect). The list changes between Rust versions, requiring re-audits. Some algorithms (especially multi-threaded ones) simply can't even be written in small, easily verifiable unsafe blocks without causing UB. The unsafeness colors surrounding code.
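
To make the first bullet concrete, here is one well-known way to leak memory without any `unsafe`: a reference cycle between `Rc` values. This is only a sketch of that single pattern, and the `Node` type and `destructors_run` function are invented for the example; it says nothing about how often such leaks occur in practice.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

/// A node that counts how many times any node's destructor has run.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
    drops: Rc<Cell<u32>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Builds two nodes, optionally links them into a cycle, lets both handles
/// go out of scope, and reports how many destructors actually ran.
fn destructors_run(make_cycle: bool) -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let a = Rc::new(Node {
            next: RefCell::new(None),
            drops: Rc::clone(&drops),
        });
        let b = Rc::new(Node {
            next: RefCell::new(Some(Rc::clone(&a))),
            drops: Rc::clone(&drops),
        });
        if make_cycle {
            // a -> b -> a: neither strong count can ever reach zero.
            *a.next.borrow_mut() = Some(Rc::clone(&b));
        }
    }
    drops.get()
}

fn main() {
    // Without a cycle both nodes are freed when the handles go out of scope.
    assert_eq!(destructors_run(false), 2);
    // With a cycle no destructor runs: both nodes are leaked, in safe code.
    assert_eq!(destructors_run(true), 0);
}
```

The usual fix is to make one direction of the link a `std::rc::Weak`, which does not keep the target alive.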

replies(2): >>41877210 #>>41881575 #

2. simonask ◴[18 Oct 24 07:50 UTC] No.41877210[source]▶

>>41876602 (TP) #

Wait, when exactly did the soundness rules change since 1.0? When have you had to re-audit unsafe code?

The Rustonomicon [1] serves as a decent introduction to what you can or can't do in unsafe code, and none of that changed to my knowledge.

I agree that it's sometimes challenging to contain `unsafe` in a small blast zone, but it's pretty rare IME.

[1]: https://doc.rust-lang.org/nomicon/intro.html

replies(2): >>41878761 #>>41879338 #

3. hansvm ◴[18 Oct 24 12:25 UTC] No.41878761[source]▶

>>41877210 #

> Wait, when exactly did the soundness rules change since 1.0? When have you had to re-audit unsafe code?

At a minimum you have to check that the rules haven't changed for each version [0].

The issue with destructors just before 1.0 dropped [1] would have been something to scrutinize pretty closely. I'm not aware of any major changes since then which would affect previously audited code, but new code for new Rust versions (e.g., when SIMD stabilized) needs to be considered with new rules as well.

> none of that changed to my knowledge

This is perhaps a bit pedantic, but the nomicon has bug fixes all the time (though the underlying UB scenarios in the compiler remain stable), and it's definitely worth re-examining your unsafe Rust when you see changes which might have incorrectly led a programmer to write some UB.

[0] https://doc.rust-lang.org/reference/behavior-considered-unde...

[1] https://cglab.ca/~abeinges/blah/everyone-poops/

4. steveklabnik ◴[18 Oct 24 13:36 UTC] No.41879338[source]▶

>>41877210 #

There was at least one in the first year after 1.0; we had warnings on for like nine months and then finally broke the code later.

That I only remember such things vaguely and not in a “oh yeah here’s the last ten times this happened and here’s the specifics” speaks to how often it happens, which is not often.

Lots of times soundness fixes are found by people looking for them, not for code in the wild. Fixing cve-rs will mean a “breaking” change in the literal sense that that code will no longer compile, but outside of that example, no known code in the wild triggers that bug, so nobody will notice the breakage.

5. littlestymaar ◴[18 Oct 24 17:27 UTC] No.41881575[source]▶

>>41876602 (TP) #

There's some truth in what you're saying, but it's also wildly exaggerated and “everything that is exaggerated is insignificant”.

replies(1): >>41883690 #

6. hansvm ◴[18 Oct 24 21:31 UTC] No.41883690[source]▶

>>41881575 #

> but its also wildly exaggerated

Such as?

> everything that is exaggerated is insignificant

But are the non-exaggerated things significant?

replies(1): >>41892597 #

7. vacuity ◴[20 Oct 24 03:14 UTC] No.41892597{3}[source]▶

>>41883690 #

> Memory leaks are not just possible in Rust...IME I see more leaks in Rust in the wild than in C, C#, Python, C++, ...

Perhaps, but data here would be nice. Yes, Rust has Rc/Arc and whatnot. If we're talking anecdotes, I mostly see "we rewrote it in Rust and it uses less memory".

> You can absolutely have data races in a colloquial sense in Rust, just not in the sense of the narrower definition they created to be able to say they don't have data races...

Sure, although it's not a contrived Rust definition. Race conditions are far more general and harder to prevent. Race conditions often arise in higher levels of abstraction, so it's not so much Rust's focus. I don't know what to say about your comment on atomics except that Rust isn't making them much harder than C++ is.
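
The distinction can be sketched with a shared counter. In the code below (the function `count` and its constants are made up for the example) every access is atomic, so there is no data race in the language's sense, yet the load-then-store variant can still lose updates, which is a race condition. The `fetch_add` variant performs the read-modify-write as one atomic step.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

const THREADS: u64 = 4;
const PER_THREAD: u64 = 10_000;

/// Increments a shared counter from several threads and returns the total.
/// With `atomic_increment == false` each increment is a separate load and
/// store, so two threads can read the same value and one update is lost.
fn count(atomic_increment: bool) -> u64 {
    let counter = Arc::new(AtomicU64::new(0));
    let handles: Vec<_> = (0..THREADS)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..PER_THREAD {
                    if atomic_increment {
                        // One indivisible read-modify-write.
                        counter.fetch_add(1, Ordering::Relaxed);
                    } else {
                        // Two atomic operations with a gap between them.
                        // Even the strongest ordering does not close the gap.
                        let seen = counter.load(Ordering::SeqCst);
                        counter.store(seen + 1, Ordering::SeqCst);
                    }
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    counter.load(Ordering::SeqCst)
}

fn main() {
    // The atomic increment always reaches the full total.
    assert_eq!(count(true), THREADS * PER_THREAD);
    // The load-then-store version can never exceed it and may fall short.
    let racy = count(false);
    assert!(racy <= THREADS * PER_THREAD);
    println!("load-then-store total: {racy}");
}
```

Both variants compile as safe Rust, which is the sense in which the compiler rules out data races but not logic-level races.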

> Unsafe is a kind of super-unsafe that's harder to write correctly than C or C++, limiting its utility as an escape hatch...

This is definitely important for anyone considering or learning Rust to know. However, how often is this a problem compared to what would be done in C or C++? Someone upthread mentioned enhanced greppability/auditability, Rust has much stronger prevention of memory corruption by default (even if unsafe can poke holes), and the culture is generally more averse to "unsafe". This seems more like a theoretical concern with little practical grounding. I highly doubt this is a real problem.
