←back to thread

Futurelock: A subtle risk in async Rust

(rfd.shared.oxide.computer)
421 points bcantrill | 1 comments | | HN request time: 0.331s | source

This RFD describes our distillation of a really gnarly issue that we hit in the Oxide control plane.[0] Not unlike our discovery of the async cancellation issue[1][2][3], this is larger than the issue itself -- and worse, the program that hits futurelock is correct from the programmer's point of view. Fortunately, the surface area here is smaller than that of async cancellation and the conditions required to hit it can be relatively easily mitigated. Still, this is a pretty deep issue -- and something that took some very seasoned Rust hands quite a while to find.

[0] https://github.com/oxidecomputer/omicron/issues/9259

[1] https://rfd.shared.oxide.computer/rfd/397

[2] https://rfd.shared.oxide.computer/rfd/400

[3] https://www.youtube.com/watch?v=zrv5Cy1R7r4

Show context
dvt ◴[] No.45776550[source]
> &mut future1 is dropped, but this is just a reference and so has no effect. Importantly, the future itself (future1) is not dropped.

There's a lot of talk about Rust's await implementation, but I don't really think that's the issue here. After all, Rust doesn't guarantee convergence. Tokio, on the other hand (being a library that handles multi-threading), should (at least when using its own constructs, e.g. the `select!` macro).

So, since the crux of the problem is the `tokio::select!` macro, it seems like a pretty clear tokio bug. Side note, I never looked at it before, but the macro[1] is absolutely hideous.

[1] https://docs.rs/tokio/1.34.0/src/tokio/macros/select.rs.html

replies(4): >>45776605 #>>45776791 #>>45776868 #>>45777449 #
dap ◴[] No.45777449[source]
(author here)

Although the design of the `tokio::select!` macro creates ways to run into this behavior, I don't believe the problem is specific to `tokio`. Why wouldn't the example from the post using Streams happen with any other executor?

replies(1): >>45780559 #
1. dvt ◴[] No.45780559[source]
First of all, great write-up! Had a blast reading it :) I think there's a difference between a language giving you a footgun and a library giving you a footgun. Libraries, by definition, are supposed to be as user-friendly as possible.

For example, I can just do `loop { }` which the language is perfectly okay with letting me do anywhere in my code (and essentially hanging execution). But if I'm using a library and I'm calling `innocuous()` and there's a `loop { }` buried somewhere in there, that is (in my opinion) the library's responsibility.

N.B. I don't know enough about tokio's internals to suggest any changes and don't want to pretend like I'm an expert, but I do think this caveat should be clearly documented and a "safe" version of `select!` (which wouldn't work with references) should be provided.