169 points signa11 | 35 comments
smodo ◴[] No.41875908[source]
I’m not very well versed in kernel development. But I am a Rust dev and have observed the discussion about Rust in Linux with interest… Having said that, this part of the article has me baffled:

>> implementing these features for a smart-pointer type with a malicious or broken Deref (the trait that lets a programmer dereference a value) implementation could break the guarantees Rust relies on to determine when objects can be moved in memory. (…) [In] keeping with Rust's commitment to ensuring safe code cannot cause memory-safety problems, the RFC also requires programmers to use unsafe (specifically, implementing an unsafe marker trait) as a promise that they've read the relevant documentation and are not going to break Pin.

To the uninformed this seems like crossing the very boundary that you wanted Rust to uphold? Yes it’s only an impl Trait but still… I can hear the C devs now. ‘We pinky promise to clean up after our mallocs too!’
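
For readers unfamiliar with the pattern: an unsafe marker trait has no methods, so there is nothing to implement; writing the impl at all requires the unsafe keyword, and that keyword is the promise. A minimal sketch, with a hypothetical trait name (the RFC's actual trait differs):

    // SAFETY contract (enforced by review, not the compiler): implementors
    // promise their Deref impl is well-behaved, so Pin's guarantees hold.
    unsafe trait WellBehavedDeref {}

    struct MyBox<T>(Box<T>);

    impl<T> std::ops::Deref for MyBox<T> {
        type Target = T;
        fn deref(&self) -> &T {
            &self.0
        }
    }

    // The `unsafe` keyword here is the programmer's signed statement that
    // they have read the docs; the impl body itself is empty.
    unsafe impl<T> WellBehavedDeref for MyBox<T> {}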

replies(7): >>41875965 #>>41876037 #>>41876088 #>>41876177 #>>41876213 #>>41876426 #>>41877004 #
foundry27 ◴[] No.41875965[source]
Rust’s whole premise of guaranteed memory safety through compile-time checks has always been undermined by the reality that certain foundational operations must still be implemented using unsafe. Inevitably folks concede that lower-level libraries will have these unsafe blocks and still expect higher-level code to trust them, and at that point we’ve essentially recreated the core paradigm of C: trust in the programmer’s diligence. Yeah, Rust makes this trust visible, but it doesn’t actually eliminate it in “hard” code.

The punchline here, so to speak, is that for all Rust’s claims to revolutionize safety, it simply(!) formalizes the same unwritten social contract C developers have been meandering along with for decades. The uniqueness boils down to “we still trust the devs, but at least now we’ve made them swear on it in writing”.

replies(10): >>41876016 #>>41876042 #>>41876122 #>>41876128 #>>41876303 #>>41876330 #>>41876352 #>>41876459 #>>41876891 #>>41877732 #
1. kelnos ◴[] No.41876042[source]
I don't think you're giving Rust enough credit here.

For those projects that don't use any unsafe, we can say -- absent compiler bugs or type system unsoundness -- that there will be no memory leaks or data races or undefined behavior. That's useful! Very useful!

For projects that do need unsafe, that unsafe code can be cordoned off into a corner where it can be made as small as possible, and can be audited. The rest of the code base is just as safe as one with no unsafe at all. This is also very useful!
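
A minimal sketch of what "cordoned off" can look like in practice (the function is illustrative): the unsafe block is one line, the SAFETY comment states the invariant, and no safe caller can violate it.

    pub fn first_byte(v: &[u8]) -> Option<u8> {
        if v.is_empty() {
            return None;
        }
        // SAFETY: we just checked the slice is non-empty, so index 0 is in
        // bounds. This is the only line an auditor has to reason about.
        Some(unsafe { *v.get_unchecked(0) })
    }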

Now, sure, if most projects needed to use unsafe, and/or if most projects had to write a significant amount of unsafe, then sure, I'd agree with you. But that's just not the reality for nearly all projects.

With C, everything is unsafe. Everything can have memory leaks or data races or undefined behavior. Audits for these issues need to examine every single line of code. Compilers and linters and sanitizers can help you here, but they can never be comprehensive or guarantee the absence of problems.

I've been writing C for more than 20 years now. I still write memory leaks. I still write NULL pointer dereferences. I still struggle sometimes to get my data ownership (and/or locking) right when I have to write multithreaded code. When I get to write Rust, I'm so happy that I don't have to worry about those things, or spend time with valgrind or ASAN or clang's scan-build to figure out what I've done wrong. Rust lets me focus more on what I actually care about, the actual code and algorithms and structure of my program.

replies(11): >>41876080 #>>41876102 #>>41876214 #>>41876335 #>>41876602 #>>41876895 #>>41877492 #>>41877865 #>>41880946 #>>41882084 #>>41888463 #
2. dzaima ◴[] No.41876080[source]
nit - Rust does allow memory leaks in safe code. https://doc.rust-lang.org/std/mem/fn.forget.html#safety
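
Two compiling illustrations of leaks in entirely safe code, via `mem::forget` and via an `Rc` cycle:

    use std::cell::RefCell;
    use std::rc::Rc;

    struct Node {
        next: Option<Rc<RefCell<Node>>>,
    }

    fn main() {
        // 1. Explicitly leaking is a safe function:
        std::mem::forget(vec![0u8; 1024]); // allocated, never freed

        // 2. Reference cycles leak with no unsafe anywhere:
        let a = Rc::new(RefCell::new(Node { next: None }));
        let b = Rc::new(RefCell::new(Node { next: Some(a.clone()) }));
        a.borrow_mut().next = Some(b.clone());
    } // the refcounts never reach zero; both nodes leak
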
replies(2): >>41876601 #>>41877088 #
3. foobazgt ◴[] No.41876102[source]
Yes, the drawback of unsafe is that a single goof in just one unsafe block can blow your entire program wide open. The advantage is that your entire program isn't one gigantic unsafe block (like C).

The magnitude matters.

replies(1): >>41876201 #
4. gauge_field ◴[] No.41876201[source]
Also, in my experience, the locality and the unsafe API boundary are better for testing purposes compared to an unsafe language. If I have unsafe code that provides a safe API with certain safety conditions:

1) I have a more ergonomic/precise/local contract to satisfy for safety

2) Since this unsafe block is local, it is easier to set up its testing conditions for various scenarios (see the sketch below). Otherwise, testing a bigger unsafe block (e.g. an unsafe language) would also have to handle the coupling between the API from which UB originates and the rest of the code.
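
A sketch of point 2, reusing the hypothetical `first_byte` wrapper from upthread: the safety contract is small and local enough to exercise with ordinary unit tests.

    #[cfg(test)]
    mod tests {
        use super::first_byte;

        #[test]
        fn contract_holds_at_the_boundary() {
            assert_eq!(first_byte(b""), None);
            assert_eq!(first_byte(b"xy"), Some(b'x'));
        }
    }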

5. foundry27 ◴[] No.41876214[source]
I’ll propose that most Rust projects that do useful work (in the potential energy sense?) depend on unsafe code, and it’s likely going to be found in the codebases of their dependencies and transitive dependencies. But I agree with almost all of what you’re saying about C and Rust; I work on a C operating system professionally, and I know those same pain points intimately. I program in Rust for fun, and it’s great to use.

At the end of the day this isn’t a technical argument I’m trying to make, it’s a philosophical one. I think that the more we normalize eroding the core benefits the language's safety features provide, one enhancement proposal at a time, one escape hatch added each year for special interfaces, the less implicit trust you can have in Rust projects without reviewing them and their dependencies for correctness.

I think that trust has enormous value, and I think it would suck to lose it. (reflect: what does seeing “written in rust” as a suffix make you think about a project’s qualities before you ever read the code)

replies(3): >>41876350 #>>41876614 #>>41877241 #
6. gary_0 ◴[] No.41876335[source]
Also Rust is far from the only language that gives you escape-hatches out of the safety sandbox where you can make a mess if you're reckless. Java, Python, Go, C#... (heck, C# also has an `unsafe` keyword) but hardly anyone would argue those languages have the same safety issues that C has.
replies(2): >>41877474 #>>41892539 #
7. GolDDranks ◴[] No.41876350[source]
I’ll propose that ALL Rust projects that do useful work depend on unsafe code.

If one claims otherwise, I say they have no understanding of Rust. But also, if one holds that against Rust's value promise, I, again, say that they have no understanding of Rust.

replies(2): >>41877027 #>>41877122 #
8. eru ◴[] No.41876601[source]
Yes, memory leaks are rarer in Rust than in C, but they are an entirely different topic than 'unsafe' blocks.
9. hansvm ◴[] No.41876602[source]
This is giving Rust a bit too much credit though.

- Memory leaks are not just possible in Rust, they're easy to write and mildly encouraged by the constraints the language places on you. IME I see more leaks in Rust in the wild than in C, C#, Python, C++, ...

- You can absolutely have data races in a colloquial sense in Rust, just not in the sense of the narrower definition they created to be able to say they don't have data races. An easy way to do so is choosing the wrong memory ordering for atomic loads and stores, including subtle issues like those arising from mixing `seq_cst` and `acquire` (see the sketch after this list). I think those kinds of bugs are rare in the wild, but one project I inherited was riddled with data races in safe Rust.

- Unsafe is a kind of super-unsafe that's harder to write correctly than C or C++, limiting its utility as an escape hatch. It'll trigger undefined behavior in surprising ways if you don't adhere to a long list of rules in your unsafe code blocks (in a way which safe code can detect). The list changes between Rust versions, requiring re-audits. Some algorithms (especially multi-threaded ones) simply can't even be written in small, easily verifiable unsafe blocks without causing UB. The unsafeness colors surrounding code.
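
A compiling sketch of the atomics point: no unsafe, and no data race by Rust's definition, yet the `Relaxed` orderings make it a synchronization bug all the same (the intended orderings are noted in the comments):

    use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
    use std::thread;

    static DATA: AtomicU32 = AtomicU32::new(0);
    static READY: AtomicBool = AtomicBool::new(false);

    fn main() {
        let writer = thread::spawn(|| {
            DATA.store(42, Ordering::Relaxed);
            // Bug: Relaxed creates no happens-before edge, so a reader may
            // observe READY == true while still reading DATA == 0.
            READY.store(true, Ordering::Relaxed); // should be Release
        });
        while !READY.load(Ordering::Relaxed) {} // should be Acquire
        let _value = DATA.load(Ordering::Relaxed); // may be 0 or 42
        writer.join().unwrap();
    }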

replies(2): >>41877210 #>>41881575 #
10. kloop ◴[] No.41876614[source]
> reflect: what does seeing “written in rust” as a suffix make you think about a project’s qualities before you ever read the code

That the community is going to be significantly more dramatic than average

11. weinzierl ◴[] No.41876895[source]
"For projects that do need unsafe, that unsafe code can be cordoned off into a corner, where it can be made as small as possible, and can be audited. The rest of the code base is just as safe as one with no unsafe at all. This is also very useful!"

Exactly this, and very well put!

I'd just like to add one small but important detail. It's one of the things that is so obvious to one group that they rarely even mention it, but at the same time so obscure to the others that they are completely oblivious to it.

While the unsafe code is cordoned off into a corner, its effects are not. A bug in an unsafe block in one part of your program can trigger, in a completely different and safe part of your program, an outcome that safe Rust would normally prevent.

To put it more metaphorically, Rust restricts the places where bombs can be placed, it does not limit the blast radius in case a bomb goes off.

This is still huge progress compared to C/C++, where bombs can be, and usually are, everywhere, and trying to write it safely feels a lot like playing minesweeper.
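
A minimal sketch of the blast radius (the function and its mistaken SAFETY claim are hypothetical):

    fn write_header(out: &mut [u8]) {
        // SAFETY claim (wrong!): "callers always pass at least 4 bytes".
        // A shorter buffer makes this write past the end, corrupting
        // whatever neighboring data some perfectly safe code relies on.
        unsafe { std::ptr::write_bytes(out.as_mut_ptr(), 0xFF, 4) };
    }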

replies(1): >>41879134 #
12. I_AM_A_SMURF ◴[] No.41877027{3}[source]
It's definitely all of them. Even HashMap uses unsafe.
replies(1): >>41879265 #
13. thayne ◴[] No.41877088[source]
It's also possible to leak memory in languages with tracing garbage collection: just create a data structure that holds strong references to objects that are no longer needed. This commonly happens when using something like a HashMap as a cache without any kind of expiration.
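
The same unbounded-cache shape, sketched in Rust for concreteness (names are illustrative):

    use std::collections::HashMap;

    // The cache only ever grows: entries for keys that will never be
    // requested again stay strongly referenced, so they are never freed.
    struct Cache {
        map: HashMap<String, Vec<u8>>,
    }

    impl Cache {
        fn get_or_compute(&mut self, key: &str) -> &Vec<u8> {
            self.map
                .entry(key.to_string())
                .or_insert_with(|| expensive_compute(key))
        }
    }

    fn expensive_compute(key: &str) -> Vec<u8> {
        key.bytes().collect() // stand-in for real work
    }
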
14. Dylan16807 ◴[] No.41877122{3}[source]
I get the impression they're only counting code outside the standard library, in which case tons of useful programs are fully safe.
15. simonask ◴[] No.41877210[source]
Wait, when exactly did the soundness rules change since 1.0? When have you had to re-audit unsafe code?

The Rustonomicon [1] serves as a decent introduction to what you can or can't do in unsafe code, and none of that changed to my knowledge.

I agree that it's sometimes challenging to contain `unsafe` in a small blast zone, but it's pretty rare IME.

[1]: https://doc.rust-lang.org/nomicon/intro.html

replies(2): >>41878761 #>>41879338 #
16. jdiez17 ◴[] No.41877241[source]
Of course all software ultimately runs on hardware, which has things like registers and hidden internal state which affect how that hardware accesses or writes to physical memory and all sorts of other "unsafe" things.

In a more practical sense, all software, even a Python program, ultimately calls C functions that are unsafe.

It's like that saying "all abstractions are wrong, some are useful".

> what does seeing “written in rust” as a suffix make you think about a project’s qualities before you ever read the code

By itself, that tells me very little about a project. Same thing if I see a project written in Python or Go, which are nominally memory safe programming languages. I perceive a statistically significant likelihood that software written in these languages will not segfault on me, but it's no guarantee. If I see two programs with the same functionality, where one is written in Python and another one in Rust, I also have some expectation that the one written in Rust will be more performant.

But you cannot draw general conclusions from that piece of information alone.

However, as a programmer, Rust is a tool that makes it easier for me to write code that will not segfault or cause data races.

17. Y_Y ◴[] No.41877474[source]
In C unsafe code is typically marked by surrounding it with {braces}.
replies(1): >>41878316 #
19. lertn ◴[] No.41877865[source]
With C you can take proven algorithms from CLRS and translate them directly without boilerplate.

The same algorithms already become ugly/obfuscated in idiomatic C++.

Looking at the macro in the LWN article, Rust's approach of using wrappers and boxes and complex macros to emulate features appears to go in the same direction as C++.

Still in 2024, gdb is far less useful for C++ than for C. C++ error messages are far less useful.

All of that matters for reliable software, crashes (which can occur anyway with unsafe) are just a tiny part of the equation.

replies(3): >>41878724 #>>41887959 #>>41892565 #
20. m4rtink ◴[] No.41878316{3}[source]
Good one! ;-)
21. ArtixFox ◴[] No.41878724[source]
With C, you are not 100% sure that your code will work. You have to verify and extensively test it. With C++ you have some very vague guarantees about your code, but you can easily transition from C and even get some interesting type safety like mp-units. With Rust, you have some good guarantees that your code won't have use-after-free bugs, will be thread-safe, etc., and you can probably invent some interesting type safety like mp-units.

In all 3, you gotta verify [Frama-C, Astrée, Bedrock, the many projects working on Rust, esp. the Coq one] and extensively test it.

But by default, all 3 provide a different level of guarantees.

22. hansvm ◴[] No.41878761{3}[source]
> Wait, when exactly did the soundness rules change since 1.0? When have you had to re-audit unsafe code?

At a minimum you have to check that the rules haven't changed for each version [0].

The issue with destructors just before 1.0 dropped [1] would have been something to scrutinize pretty closely. I'm not aware of any major changes since then which would affect previously audited code, but new code for new Rust versions (e.g., when SIMD stabilized) needs to be considered with new rules as well.

> none of that changed to my knowledge

This is perhaps a bit pedantic, but the nomicon has bug fixes all the time (though the underlying UB scenarios in the compiler remain stable), and it's definitely worth re-examining your unsafe Rust when you see changes which might have incorrectly led a programmer to write some UB.

[0] https://doc.rust-lang.org/reference/behavior-considered-unde...

[1] https://cglab.ca/~abeinges/blah/everyone-poops/

23. tialaramex ◴[] No.41879134[source]
An important element of Rust's culture of safety (if anything more important than the safety technology, which merely enables that culture to flourish) is as follows:

It is categorically the fault of that unsafe code when the bomb goes off. In a language like C++ it is very tempting for the person who planted the bomb to say "Oh, actually in paragraph sixteen of the documentation it does tell you about the bomb so it's not my fault" but nobody reads documentation, so Rust culturally requires that they mark the function unsafe, which is one last reminder to go read that documentation if you must use it.

Because this is a matter of culture, not technology, we can expect further refinement both in what exactly the rules are and in the technology needed to deliver them. Rust 1.82, which shipped yesterday, adds unsafe extern (previously all extern functions were unsafe, but, er, maybe we should flag the whole block? This will become the norm going forward) and unsafe attributes (the attributes which meddle with linking are not safe to just sprinkle on things, for example; again, this will become the norm for those attributes).
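
Concretely, the 1.82 syntax looks like this (assuming a C library providing `strlen` and `abs` is linked, as libc is on most platforms):

    // The extern block itself now carries the unsafe keyword,
    // acknowledging that the declarations are a soundness claim.
    unsafe extern "C" {
        // Still requires an unsafe block to call:
        pub fn strlen(s: *const std::ffi::c_char) -> usize;

        // Newly expressible: a foreign function declared safe to call.
        pub safe fn abs(x: std::ffi::c_int) -> std::ffi::c_int;
    }

    // Unsafe attributes: link-affecting attributes must be acknowledged.
    #[unsafe(no_mangle)]
    pub extern "C" fn entry_point() {}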

24. steveklabnik ◴[] No.41879265{4}[source]
It’s more fundamental than that: the Rust language does not encode hardware specifics into the language, and so way deep down there, you have to write bytes to an address that Rust considers arbitrary. Unless you only want to run programs that accept no input and produce no output, which is not exactly a useful subset of programs.
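
A sketch of that point (the address is made up; writing there is only meaningful on hardware that actually maps a device register at it):

    // A memory-mapped UART transmit register, at a hypothetical address.
    const UART_TX: *mut u8 = 0x1000_0000 as *mut u8;

    fn putc(c: u8) {
        // SAFETY: sound only on hardware where this address really is a
        // write-only UART transmit register.
        unsafe { core::ptr::write_volatile(UART_TX, c) }
    }
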
25. steveklabnik ◴[] No.41879338{3}[source]
There was at least one in the first year after 1.0; we had warnings on for like nine months and then finally broke the code.

That I only remember such things vaguely and not in a “oh yeah here’s the last ten times this happened and here’s the specifics” speaks to how often it happens, which is not often.

Lots of times soundness fixes are found by people looking for them, not for code in the wild. Fixing cve-rs will mean a “breaking” change in the literal sense that that code will no longer compile, but outside of that example, no known code in the wild triggers that bug, so nobody will notice the breakage.

26. littlestymaar ◴[] No.41880946[source]
> For those projects that don't use any unsafe, we can say -- absent compiler bugs or type system unsoundness -- that there will be no memory leaks or data races or undefined behavior. That's useful! Very useful!

It's very useful indeed, I've been programming in Rust daily for the past 7 years (wow time flies) and the times when I've needed unsafe code can still be counted on my two hands.

27. littlestymaar ◴[] No.41881575[source]
There's some truth in what you're saying, but it's also wildly exaggerated, and “everything that is exaggerated is insignificant”.
replies(1): >>41883690 #
28. fshbbdssbbgdd ◴[] No.41882084[source]
Just to give an experience report as someone maintaining a 50k-line Rust codebase at work: I didn’t write this code and have only read parts of it. I am not a Rust expert. I faced a really puzzling bug - basically errors coming out of an API that had nothing to do with the call site. After struggling to debug, I searched for “unsafe” and looked at the 6 unsafe blocks in the project (totaling a few dozen lines of code), and found that one of them had a bug. It turns out the unsafe operation was corrupting the system the code was interacting with, causing errors that popped up during later calls. This bug would have been much more difficult to track down if I couldn’t narrow down the tricky code by searching for “unsafe”.
replies(1): >>41886685 #
29. hansvm ◴[] No.41883690{3}[source]
> but its also wildly exaggerated

Such as?

> everything that is exaggerated is insignificant

But are the non-exaggerated things significant?

replies(1): >>41892597 #
31. pjmlp ◴[] No.41887959[source]
That is a gdb problem; there are much better debuggers out there.

In C, boilerplate is called pre-processor magic.

32. throwawaymaths ◴[] No.41888463[source]
> Compilers and linters ... can help you here

Yeah. They can. You could even conceivably start a new C project with a "sufficiently right toolset" so that you get the same safety as Rust...

Just, no one does.

33. vacuity ◴[] No.41892539[source]
Unlike the other slew of "memory safe languages", Rust aims for a middle ground where "unsafe" is more visible and acknowledged but also guarded against: a third way of treating the inevitable escape hatches. It's more about how it's taught and treated socially that makes Rust's unsafe a different experience from, say, how C or, alternatively, Java approaches "unsafe".
34. vacuity ◴[] No.41892565[source]
It may be that a typical well-written C program for an algorithm is more concise and/or elegant, but how exactly do we scale developers writing C programs well? For that matter, forget algorithms, what about C string handling? Yes, Rust has a gazillion string types (offtopic: I don't really understand those complaints, as usually you only need a few), but whether in the standard library or a third-party library, I don't get the sense that I need to be very diligent. And I don't think people should need high diligence for using strings.

As an OS nerd, this is what I like to use as an example: yes, the seL4 verified microkernel is impressive and if it was written in a language other than C, it wouldn't have both the practicality and the assurance. It was specified in Haskell but ultimately the C is what is deployed, so C it is. A Rust version might not be verifiable even in the next ten years. But the people who can't use seL4 and need an 80% "reasonably secure" or whatever OS have a strong case to use Rust over C. The formal verification for the C code of seL4 is partly a crutch for C's lack of safety and correctness by default.

35. vacuity ◴[] No.41892597{4}[source]
> Memory leaks are not just possible in Rust...IME I see more leaks in Rust in the wild than in C, C#, Python, C++, ...

Perhaps, but data here would be nice. Yes, Rust has Rc/Arc and whatnot. If we're talking anecdotes, I mostly see "we rewrote it in Rust and it uses less memory".

> You can absolutely have data races in a colloquial sense in Rust, just not in the sense of the narrower definition they created to be able to say they don't have data races...

Sure, although it's not a contrived Rust definition. Race conditions are far more general and harder to prevent. Race conditions often arise in higher levels of abstraction, so it's not so much Rust's focus. I don't know what to say about your comment on atomics except that Rust isn't making them much harder than C++ is.

> Unsafe is a kind of super-unsafe that's harder to write correctly than C or C++, limiting its utility as an escape hatch...

This is definitely important for anyone considering or learning Rust to know. However, how often is this a problem compared to what would be done in C or C++? Someone upthread mentioned enhanced greppability/auditability, Rust has much stronger prevention of memory corruption by default (even if unsafe can poke holes), and the culture is generally more averse to "unsafe". This seems more like a theoretical concern with little practical grounding. I highly doubt this is a real problem.