Futurelock: A subtle risk in async Rust

(rfd.shared.oxide.computer)
427 points bcantrill | 10 comments | | HN request time: 1.374s | source | bottom

This RFD describes our distillation of a really gnarly issue that we hit in the Oxide control plane.[0] Not unlike our discovery of the async cancellation issue[1][2][3], this is larger than the issue itself -- and worse, the program that hits futurelock is correct from the programmer's point of view. Fortunately, the surface area here is smaller than that of async cancellation and the conditions required to hit it can be relatively easily mitigated. Still, this is a pretty deep issue -- and something that took some very seasoned Rust hands quite a while to find.

[0] https://github.com/oxidecomputer/omicron/issues/9259

[1] https://rfd.shared.oxide.computer/rfd/397

[2] https://rfd.shared.oxide.computer/rfd/400

[3] https://www.youtube.com/watch?v=zrv5Cy1R7r4
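
For a concrete feel for the failure mode, here is a minimal sketch in the spirit of the RFD's reproduction (the helper do_async_thing and the timings are made up for illustration): a future takes a tokio::sync::Mutex, its task stops polling it once select! resolves on the other branch, and that same task then awaits the same Mutex -- which can never be released, because the future that owns the guard will never be polled again.

    // Requires the `tokio` crate (e.g. features = ["full"]).
    use std::sync::Arc;
    use std::time::Duration;
    use tokio::sync::Mutex;

    async fn do_async_thing(label: &str, lock: Arc<Mutex<()>>) {
        let _guard = lock.lock().await;
        println!("{label} acquired the lock");
        // Simulate more async work while still holding the lock.
        tokio::time::sleep(Duration::from_millis(50)).await;
    }

    #[tokio::main]
    async fn main() {
        let lock = Arc::new(Mutex::new(()));

        // Created here, but only ever polled inside the select! below.
        let mut future1 = Box::pin(do_async_thing("future1", lock.clone()));

        tokio::select! {
            _ = &mut future1 => {}
            _ = tokio::time::sleep(Duration::from_millis(10)) => {
                // future1 was polled far enough to take the Mutex, but this
                // branch won the select!. future1 is still alive (we only
                // passed &mut to it), so its guard is never dropped -- and
                // because this same task now blocks awaiting the lock,
                // future1 will never be polled again. This await never
                // completes: futurelock.
                do_async_thing("future2", lock.clone()).await;
            }
        }
    }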

1. levodelellis ◴[] No.45777969[source]
In October alone I've seen 5+ articles and comments about multi-threading, and I don't know why

I've always said that if your code locks or uses atomics, it's wrong. Everyone says I'm wrong, but then you get things like what's described in the article. I'd like to recommend a solution, but there's pretty much no reasonable way to implement multi-threading when you're not an expert. I've heard Erlang and Elixir are good, but I haven't tried them so I can't really comment

replies(3): >>45777993 #>>45778558 #>>45779598 #
2. umvi ◴[] No.45777993[source]
> I've always said that if your code locks or uses atomics, it's wrong. Everyone says I'm wrong, but then you get things like what's described in the article.

Ok, so say you are simulating high-energy photons (x-rays) flowing through a 3D patient volume. You need to simulate 2 billion particles propagating through the patient in order to get an accurate estimate of how the radiation is distributed. How do you accomplish this without locks or atomics, and without the simulation taking 100 hours to run? Obviously it would take forever to simulate 1 particle at a time, but without locks or atomics the particles will step on each other's toes when updating the radiation distribution in the patient. I suppose you could have 2 billion copies of the patient's volume in memory, each particle gets its own private copy, and then you merge them all at the end...
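
Not the only option, but the standard way to do exactly that without locks or atomics is one private copy of the dose grid per worker thread (not per particle), merged once at the end. A rough sketch, where simulate_particle, the grid size, and the thread count are made-up stand-ins for the real transport code:

    use std::thread;

    const GRID_CELLS: usize = 1 << 20;     // voxelized patient volume (illustrative)
    const PARTICLES: u64 = 2_000_000_000;
    const THREADS: u64 = 16;

    // Stand-in for real photon transport: deposits "dose" into one cell.
    fn simulate_particle(seed: u64, dose: &mut [f64]) {
        let cell = (seed.wrapping_mul(6364136223846793005) >> 32) as usize % dose.len();
        dose[cell] += 1.0;
    }

    fn main() {
        // Each worker owns a private copy of the dose grid.
        let partials: Vec<Vec<f64>> = thread::scope(|s| {
            let handles: Vec<_> = (0..THREADS)
                .map(|t| {
                    s.spawn(move || {
                        let mut dose = vec![0.0f64; GRID_CELLS];
                        let (lo, hi) = (t * PARTICLES / THREADS, (t + 1) * PARTICLES / THREADS);
                        for p in lo..hi {
                            simulate_particle(p, &mut dose);
                        }
                        dose
                    })
                })
                .collect();
            handles.into_iter().map(|h| h.join().unwrap()).collect()
        });

        // One merge at the end: the only cross-thread coordination is join().
        let mut total = vec![0.0f64; GRID_CELLS];
        for grid in partials {
            for (t, d) in total.iter_mut().zip(grid) {
                *t += d;
            }
        }
        println!("total dose deposited: {}", total.iter().sum::<f64>());
    }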

replies(1): >>45778046 #
3. levodelellis ◴[] No.45778046[source]
From my understanding, this talk describes how he implemented a solution to a similar problem: https://www.youtube.com/watch?v=Kvsvd67XUKw

I'm saying if you're not writing multi-threaded code every day, use a library. It can use atomics/locks internally, but you shouldn't use them directly. If the library is designed well, it'd be impossible to deadlock.
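
For example, plain std::sync::mpsc channels already go a long way: the synchronization lives inside the channel, and your code never holds a lock, so there is nothing to hold across a call into foreign code. A small sketch (worker count and payload are arbitrary):

    use std::sync::mpsc;
    use std::thread;

    fn main() {
        let (tx, rx) = mpsc::channel();

        let workers: Vec<_> = (0..4u32)
            .map(|id| {
                let tx = tx.clone();
                thread::spawn(move || {
                    for n in 0..10u64 {
                        // The channel does the synchronization; no lock or
                        // atomic appears in this code.
                        tx.send((id, n * n)).unwrap();
                    }
                })
            })
            .collect();
        drop(tx); // close the original sender so the receive loop can end

        for (id, value) in rx {
            println!("worker {id} produced {value}");
        }
        for w in workers {
            w.join().unwrap();
        }
    }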

replies(1): >>45789034 #
4. levodelellis ◴[] No.45778558[source]
To clarify, by "your code" I mean your code excluding libraries. A good library would make it impossible to deadlock. When I wrote mine, I never called outside code while holding a lock, so it was impossible for it to deadlock. My atomic code had auditing and tests. I don't recommend people write their own threading library unless they want to put a lot of work into it
5. 0x1ceb00da ◴[] No.45779598[source]
> I've always said that if your code locks or uses atomics, it's wrong.

Why atomics?

replies(1): >>45779724 #
6. levodelellis ◴[] No.45779724[source]
People mess up the ordering all the time. When you mess up locks you get a deadlock; when you mess up an atomic you get items in the queue dropped or processed twice, or some other weird behavior you didn't expect (waking up the wrong thread). You just get hard-to-understand race conditions, which are always a pain to debug

Just say no to atomics (unless they're hidden in a well-written library)
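
To make the "mess up the order" point concrete, here's a small, made-up example of the classic publish-behind-a-flag pattern, where the choice of memory ordering is the whole correctness argument and the wrong choice still compiles and usually passes tests on x86:

    use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
    use std::thread;

    static DATA: AtomicU64 = AtomicU64::new(0);
    static READY: AtomicBool = AtomicBool::new(false);

    fn main() {
        let writer = thread::spawn(|| {
            DATA.store(42, Ordering::Relaxed);
            // Release makes the DATA store visible before the flag. Writing
            // Ordering::Relaxed here also compiles and will appear to work
            // in testing on x86 -- the bug only surfaces as a rare, baffling
            // race on weaker memory models.
            READY.store(true, Ordering::Release);
        });

        // Acquire pairs with the Release above; Relaxed here would be the
        // same silent mistake on the reader side.
        while !READY.load(Ordering::Acquire) {
            std::hint::spin_loop();
        }
        assert_eq!(DATA.load(Ordering::Relaxed), 42);
        writer.join().unwrap();
    }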

replies(1): >>45781265 #
7. jstimpfle ◴[] No.45781265{3}[source]
People mess up any number of things all the time. The more important question is: do you need to risk messing up in a particular situation? I.e., do you need multithreading? In many cases, for example HPC or GUI programming, the answer is yes: you need multithreading to avoid blocking and to get much higher performance.

With a little bit of experience and a bit of care, multithreading isn't _that_ hard. You just need to design for it. You can reduce the number of critical pieces.

replies(1): >>45786315 #
8. levodelellis ◴[] No.45786315{4}[source]
I completely disagree with you. I do write performance code (though not specifically HPC), and my current day job is highly async code with a GUI.

replies(1): >>45788926 #
9. jstimpfle ◴[] No.45788926{5}[source]
Are you saying GUIs in general don't need multithreading, or just that you think you haven't needed it so far? Or that you use some high-level async framework that hides just the synchronisation bits (at the cost of async type complexity)?
10. jstimpfle ◴[] No.45789034{3}[source]
If you take programming seriously, learn it.

With a library that encapsulates only a small number of patterns (like message passing), you'll be very limited. If you never start learning about lower-level multi-threading issues, you'll never learn them. And it's not _that_ hard.

I'm not writing multi-threaded code every day (far from it), but often enough that I can write useful things (using shared memory, atomics, mutexes, condition variables, etc.). And I'm looking forward to learning more, understanding various issues better, and learning new patterns.