
Pitfalls of Safe Rust

(corrode.dev)
168 points by pjmlp | 16 comments
1. forrestthewoods ◴[] No.43604132[source]
> Overflow errors can happen pretty easily

No they can’t. Overflows aren’t a real problem. Do not add checked_mul to all your maths.

Thankfully Rust changed overflow behavior from “undefined” to “well-defined two's complement”.
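(For reference, a rough sketch of the two behaviors in play; the numbers are made up, and by default plain * panics in debug builds and wraps in release builds:)

    fn main() {
        let a: u32 = 3_000_000_000;
        let b: u32 = 2;

        // Either way the result is defined, unlike signed overflow in C,
        // which is undefined behavior.
        assert_eq!(a.wrapping_mul(b), 1_705_032_704); // 6_000_000_000 mod 2^32
        assert_eq!(a.checked_mul(b), None);           // overflow detected
    }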

replies(4): >>43604262 #>>43605035 #>>43605473 #>>43605491 #
2. LiamPowell ◴[] No.43604262[source]
What makes you think this is the case?

Having done a bunch of formal verification I can say that overflows are probably the most common type of bug by far.

replies(1): >>43605030 #
3. imtringued ◴[] No.43605030[source]
Yeah, they're so common they've become a part of our culture when it comes to interacting with computers.

Arithmetic overflows have become the punchline of video game exploits.

Unsigned underflow is also one of the most dangerous kinds: you go from one of the smallest representable values straight to one of the biggest.
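(A tiny sketch of that jump, using wrapping_sub to spell out the modular result; a plain 0 - 1 would panic in a debug build:)

    fn main() {
        let count: u32 = 0;
        // 0 - 1 underflows: the result jumps to the largest u32 value.
        assert_eq!(count.wrapping_sub(1), u32::MAX); // 4_294_967_295
    }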

replies(1): >>43606425 #
4. conradludgate ◴[] No.43605035[source]
Overflow errors absolutely do happen. They're just no longer UB. It doesn't make them non-errors though. If your bank account balance overflowed, you'd be pretty upset.
replies(1): >>43605368 #
5. bogeholm ◴[] No.43605368[source]
On the other hand, there’s a solid use case for underflow.
6. int_19h ◴[] No.43605473[source]
The vast majority of code that does arithmetic will not produce a correct result once two's complement wrapping actually kicks in. It is simply assuming that the values involved are small enough that it won't matter. Sometimes that is a correct assumption, but whenever anything derived from inputs is involved, it can go very wrong.
replies(2): >>43605520 #>>43607770 #
7. wongarsu ◴[] No.43605491[source]
I'm a big fan of liberal use of saturating_mul/add/sub whenever there is a conceivable risk of coming within a couple orders of magnitude of overflow. Or checked_*() or whatever the best behavior in the given case is. For my code it happens to mostly be saturating.

Overflow bugs are a real pain, and so easy to prevent in Rust with just a function call. It's pretty high on my list of favorite improvements over C/C++.
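(For anyone unfamiliar with the variants, a quick sketch; the numbers are made up:)

    fn main() {
        let base: u64 = u64::MAX - 10;

        // saturating_*: clamp to the representable range instead of wrapping.
        assert_eq!(base.saturating_add(100), u64::MAX);

        // checked_*: return an Option you have to handle explicitly.
        assert_eq!(base.checked_add(100), None);
        assert_eq!(base.checked_add(5), Some(u64::MAX - 5));
    }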

replies(1): >>43606416 #
8. zozbot234 ◴[] No.43605520[source]
For any arithmetic expression that involves only + - * operators and equally-sized machine words, two's complement will actually yield a "correct" result. It's just that the result is only correct modulo 2^n, so it may land in a different range than you expect.
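(Concretely, a small sketch with made-up values: the intermediate overflow cancels out because +, - and * are all consistent modulo 2^32:)

    fn main() {
        let a: u32 = 4_000_000_000;
        let b: u32 = 1_000_000_000;
        // a + b overflows, but subtracting b again lands back on a exactly,
        // because every step is evaluated modulo 2^32.
        assert_eq!(a.wrapping_add(b).wrapping_sub(b), a);
    }
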
9. forrestthewoods ◴[] No.43606416[source]
If you saturate, you almost never want to actually use the saturated result. You need to check, and if it saturated, do something else.
replies(2): >>43609149 #>>43613178 #
10. forrestthewoods ◴[] No.43606425{3}[source]
Unsigned integers were largely a mistake. Use i64 and call it a day. (Rust's refusal to allow indexing with i64 or isize is a huge mistake.)

Don’t do arithmetic with u8 or probably even u16.
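(A sketch of the indexing friction: slice indexing only takes usize, so signed index math needs an explicit conversion; the values here are made up:)

    fn main() {
        let data = vec![10, 20, 30, 40];
        let i: i64 = 3;
        let j: i64 = 1;
        let offset = i - j;              // signed math, no underflow risk
        // data[offset] does not compile: the index must be usize.
        let x = data[usize::try_from(offset).unwrap()];
        assert_eq!(x, 30);
    }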

replies(1): >>43606973 #
11. throwaway17_17 ◴[] No.43606973{4}[source]
Can you expand on your thoughts here? What is the root issue with unsigned integers? Is your complaint primarily based on the implications/consequences of overflow or underflow? I’m genuinely curious as I very often prefer u32 for most numeric computations (although in a more ‘mathematics’ domain signed ints will often be the correct choice).
replies(1): >>43608493 #
12. Spivak ◴[] No.43607770[source]
This is something that's always bugged me because, yes, this is a real problem that produces real bugs. But at the same time, if you really care about this issue then every arithmetic operation is unsafe and there is never a time you should use one without an overflow check. Sometimes you can know something won't overflow, but outside of some niche type systems you can't really prove that to the compiler to elide the check in a way that is safe against code modifications, i.e. such that the build errs if someone later edits the code in a way that breaks the assumption you relied on.

But at the same time in real code in the real world you just do the maths, throw caution to the wind, and if it overflows and produces a bug you just fix it there. It's not worth the performance hit and your fellow developers will call you mad if you try to have a whole codebase with only checked maths.
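(To make the tradeoff concrete, a sketch with a made-up helper: the fully checked version forces Option handling at every call site, which is exactly the ergonomic cost being weighed here:)

    // Hypothetical example: area of a rectangle, checked vs. "just do the maths".
    fn area_checked(w: u32, h: u32) -> Option<u32> {
        w.checked_mul(h)
    }

    fn area_plain(w: u32, h: u32) -> u32 {
        w * h // panics in debug builds, wraps in release builds on overflow
    }

    fn main() {
        assert_eq!(area_checked(100_000, 100_000), None); // 10^10 > u32::MAX
        assert_eq!(area_plain(3, 4), 12);
    }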

replies(1): >>43613793 #
13. forrestthewoods ◴[] No.43608493{5}[source]
> What is the root issue with unsigned integers?

If I play "Appeal to Authority" you can read some thoughts on this from Alexandrescu, Stroustrup, and Carruth here: https://stackoverflow.com/questions/18795453/why-prefer-sign...

Unsigned integers are appealing because they make a range of invalid values impossible to represent. That's good! Indices can't be negative so simply do not allow negative values.

The issues are numerous, and the benefits are marginal. First and foremost, it is extremely common to do offset math on indices, where negative values are perfectly valid. Given two indices idxA and idxB, if your indices are unsigned then one of (idxB - idxA) or (idxA - idxB) will underflow and cause catastrophe (unless they're equal, of course).
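(A sketch with made-up indices, assuming a 64-bit target:)

    fn main() {
        let idx_a: usize = 7;
        let idx_b: usize = 3;
        // idx_b - idx_a "should" be -4, but usize cannot represent it:
        // it panics in debug builds and wraps to a huge value in release.
        assert_eq!(idx_b.wrapping_sub(idx_a), usize::MAX - 3);
        // With signed indices the same difference is just -4.
        assert_eq!(idx_b as i64 - idx_a as i64, -4);
    }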

The benefits are marginal because even though unsigned cannot represent a value below the valid range it can represent a value above container.size() so you still need to bounds check the upper range. If you can't go branchless then who cares about eliminating one branch that can always be treated as cold.

On a modern 64-bit machine doing math on smaller integers isn't any faster and may in fact be slower!

Now it can be valuable to store smaller integers. Especially for things like index lists. But in this case you're probably not doing any math so the overflow/underflow issue is somewhat moot.

Anyhow. Use unsigned when doing bitmasking or bit manipulation. Otherwise default to signed integers, and default to i64/int64_t. You can use smaller integer types and even unsigned; just use i64 by default and only use something else if you have a particular reason.

I'm kinda rambling and those thoughts are scattered. Hope it was helpful.

14. wongarsu ◴[] No.43609149{3}[source]
You obviously have to decide it on a case-by-case basis. But anything that is only used in a comparison is usually fine with saturating. And many things that measure values or work with measurements are fine with saturating if it's documented. Saturating is how most analog equipment works too, and in non-interactive use cases "just pick the closest value we can represent" is often better than erroring out or recording nothing at all.

Of course don't use saturating_add to calculate account balance, there you should use checked_add.
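(A sketch of the comparison case, with made-up numbers: saturating still gives the right answer, whereas wrapping would produce a tiny value and the wrong one:)

    fn main() {
        const LIMIT: u64 = 1_000_000;
        let measured: u64 = u64::MAX - 5;
        let adjusted = measured.saturating_add(100); // clamps at u64::MAX
        // The clamped value still compares correctly against the threshold;
        // a wrapped value (94) would not.
        assert!(adjusted > LIMIT);
    }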

15. sfink ◴[] No.43613178{3}[source]
If you're going to check then you shouldn't be saturating, you should just be checking.
16. int_19h ◴[] No.43613793{3}[source]
I think this is very much a cultural issue rather than a technical one. Just look at array bounds checking: widespread in the mainframe era even in systems languages, relegated to high-level languages for a very long time on the basis of an unacceptable perf hit in low-level code, but more recently seeing more acceptance in new systems languages (e.g. Rust).

Similarly in this case, it's not like we don't have languages that do checked arithmetic throughout by default. VB.NET, for example, does exactly that. Higher-level languages have other strategies to deal with the problem; e.g. unbounded integer types as in Python, which simply never overflow. And, like you say, this sort of thing is considered unacceptable for low-level code on perf grounds, but, given the history with nulls and OOB checking, I think there is a lesson here.