517 points bkolobara | 41 comments
1. Spivak ◴[] No.45041771[source]
How do you encode the locking issue in the type system? It seems magical. Can you just never hold any locks when calling await? Is it smart enough to know that this scheduler might move work between threads?
replies(5): >>45041806 #>>45041833 #>>45041852 #>>45041891 #>>45041898 #
2. jvanderbot ◴[] No.45041806[source]
Yes, Rust has ways of verifying single access to locks through the borrow checker and lifetimes.
replies(1): >>45041860 #
3. vlovich123 ◴[] No.45041833[source]
Presumably the author is using tokio, which requires the constructed future (e.g. the async function) to be Send (either because of the rules of Rust or because it is annotated as Send), since tokio is a work-stealing runtime and any thread might end up executing a given future (or even start executing it and then, during a pause, move it to another thread for completion). std::sync::MutexGuard intentionally isn't Send because there are platforms that require the acquiring thread to be the one that unlocks the mutex.

One caveat though - using a normal std Mutex within an async environment is an antipattern and should not be done - you can cause all sorts of issues & I believe even deadlock your entire code. You should be using tokio sync primitives (e.g. tokio Mutex), which can yield to the reactor when they need to block. Otherwise the thread that's running the future blocks waiting for that mutex and that reactor never does anything else, which isn't how tokio is designed.

So the compiler is warning about one problem, but you also have to know to be careful not to call blocking functions in an async function.
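
To make this concrete, here is a minimal sketch of the failing shape (not the article's exact code; the function and counter are made up for illustration):

    use std::sync::{Arc, Mutex};

    // Sketch: holding a std MutexGuard across an await point.
    async fn bump(counter: Arc<Mutex<u64>>) {
        let mut guard = counter.lock().unwrap(); // std MutexGuard is !Send
        *guard += 1;
        tokio::task::yield_now().await;          // guard is still alive here
        *guard += 1;
    }
    // tokio::spawn(bump(counter)) is rejected: the generated future has to
    // store the !Send guard across the await, so the future itself is !Send,
    // and tokio::spawn requires the future to be Send.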

replies(6): >>45041892 #>>45041938 #>>45041964 #>>45042014 #>>45042145 #>>45042479 #
4. synalx ◴[] No.45041852[source]
More the latter. The lock guard is not `Send`, so holding it across the await point makes the `impl Future` returned by the async function also not `Send`. Therefore it can't be passed to a scheduler which does work stealing, since that scheduler is typed to require futures to be `Send`.
5. Spivak ◴[] No.45041860[source]
Well yes, but this seems more complicated. If the code were executed on a single thread it would be correct. The compiler somehow knows that await might move the work to another thread, in which case the code would still be correct if it weren't for the pesky undefined behavior. At all points it seems like it would be single access, and yet it catches this.
replies(3): >>45042126 #>>45042240 #>>45042247 #
6. maplant ◴[] No.45041892[source]
> using a normal std Mutex within an async environment is an antipattern and should not be done

This is simply not true, and the tokio documentation says as much:

"Contrary to popular belief, it is ok and often preferred to use the ordinary Mutex from the standard library in asynchronous code."

https://docs.rs/tokio/latest/tokio/sync/struct.Mutex.html#wh...

replies(2): >>45042193 #>>45044049 #
7. bkolobara ◴[] No.45041891[source]
Yes, if you use a scheduler that doesn't move work between threads, it will not require the task to be Send, and the example code would compile.

Rust can use that type information and lifetimes to figure out when it's safe and when not.

8. mrtracy ◴[] No.45041898[source]
Rust uses the traits “Send” and “Sync” to encode this information; there is a lot of special tooling in the compiler around these.

A type is “Send” if it can be moved from one thread to another, and it is “Sync” if it can be simultaneously accessed from multiple threads.

These traits are automatically applied whenever the compiler knows it is safe to do so. In cases where automatic application is not possible, the developer can explicitly declare a type to have these traits, but doing so is unsafe (requires the ‘unsafe’ keyword and everything that entails).

You can read more in the Rustonomicon, if you are interested: https://doc.rust-lang.org/nomicon/send-and-sync.html
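
A small sketch of what that looks like in practice (the types here are invented for illustration):

    // Auto-derived: every field is Send + Sync, so the struct is too.
    struct Stats { hits: u64 }

    // A raw pointer field opts a type out of the automatic impls...
    struct Handle { ptr: *mut u8 }

    // ...and the developer can opt back in, taking responsibility
    // for the safety claim themselves.
    unsafe impl Send for Handle {}
    unsafe impl Sync for Handle {}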

9. dilawar ◴[] No.45041938[source]
> One caveat though - using a normal std Mutex within an async environment is an antipattern and should not be done - you can cause all sorts of issues & I believe even deadlock your entire code.

True. I used a std Mutex with tokio and after a few days my API would not respond unless I restarted the container. I was under the impression that if it compiles, it's gonna just work (fearless concurrency), which is usually the case.

replies(2): >>45042113 #>>45042735 #
10. Spivak ◴[] No.45041964[source]
Thank you! It seems so simple in hindsight to have a type that means "can be moved to another thread safely." But the fact that using a Mutex in your function changes the "type" is really novel. It becoming Send once the lock is released before await is just fantastic.
replies(2): >>45042232 #>>45042345 #
11. rcxdude ◴[] No.45042014[source]
It can be an issue, but only if you have long enough contention periods for the mutex to matter. A Mutex which is only held for short time periods can work perfectly fine in an async context. And it's only going to cause deadlocks if it would cause deadlocks in a threaded context, since if Mutexes are only ever held between yield points then it'll only ever be running tasks that are contending for them.
replies(1): >>45044110 #
12. ◴[] No.45042113{3}[source]
13. 0x1ceb00da ◴[] No.45042126{3}[source]
An async function call returns a future which is an object holding the state of the async computation. Multithreaded runtimes like tokio require this future to be `Send`, which means it can be moved from one thread to another. The future generated by the compiler is not Send if there is a local variable crossing an await boundary that is not `Send`.
14. sunshowers ◴[] No.45042145[source]
Using a Tokio mutex is even more of an antipattern :) come to my RustConf talk about async cancellation next week to find out why!
replies(1): >>45044123 #
15. ◴[] No.45042193{3}[source]
16. VWWHFSfQ ◴[] No.45042232{3}[source]
Rust's marker traits (Send, Sync, etc.) + RAII is really what makes it so magical to me.
17. ViewTrick1002 ◴[] No.45042240{3}[source]
That comes from where the tokio task is spawned: spawn annotates the future with a Send bound.

https://docs.rs/tokio/latest/tokio/task/fn.spawn.html

If you want to run everything on the same thread, then LocalSet enables that. See how its spawn_local function does not include the Send bound.

https://docs.rs/tokio/latest/tokio/task/struct.LocalSet.html
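
Roughly, a sketch of the single-threaded version (assuming a current-thread runtime; the task body is made up):

    use std::rc::Rc;
    use tokio::task::LocalSet;

    #[tokio::main(flavor = "current_thread")]
    async fn main() {
        let local = LocalSet::new();
        local
            .run_until(async {
                // spawn_local has no Send bound, so holding a !Send value
                // (here an Rc) across an await point is accepted.
                tokio::task::spawn_local(async {
                    let rc = Rc::new(1);
                    tokio::task::yield_now().await;
                    println!("{rc}");
                })
                .await
                .unwrap();
            })
            .await;
    }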

18. veber-alex ◴[] No.45042247{3}[source]
The compiler doesn't know anything about threads.

The compiler knows the Future doesn't implement the Send trait because MutexGuard is not Send and it crosses await points.

Then, tokio the async runtime requires that futures that it runs are Send because it can move them to another thread.

This is how Rust safety works. The internals of std, tokio and other low level libraries are unsafe but they expose interfaces that are impossible to misuse.

19. NobodyNada ◴[] No.45042345{3}[source]
The way that this fully works together is:

- The return type of Mutex::lock() is a MutexGuard, which is a smart pointer type that 1) implements Deref so it can be dereferenced to access the underlying data, 2) implements Drop to unlock the mutex when the guard goes out of scope, and 3) implements !Send so the compiler knows it is unsafe to send between threads: https://doc.rust-lang.org/std/sync/struct.MutexGuard.html

- Rust's implementation of async/await works by transforming an async function into a state machine object implementing the Future trait. The compiler generates an enum that stores the current state of the state machine and all the local variables that need to live across yield points, with a poll function that (synchronously) advances the coroutine to the next yield point: https://doc.rust-lang.org/std/future/trait.Future.html

- In Rust, a composite type like a struct or enum automatically implements Send if all of its members implement Send.

- An async runtime that can move tasks between threads requires task futures to implement Send.

So, in the example here: because the author held a lock across an await point, the compiler must store the MutexGuard smart pointer as a field of the Future state machine object. Since MutexGuard is !Send, the future also is !Send, which means it cannot be used with an async runtime that moves tasks between threads.

If the author releases the lock (i.e. drops the lock guard) before awaiting, then the guard does not live across yield points and thus does not need to be persisted as part of the state machine object -- it will be created and destroyed entirely within the span of one call to Future::poll(). Thus, the future object can be Send, meaning the task can be migrated between threads.
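
As a sketch of that fix (hypothetical names, not the article's exact code):

    use std::sync::{Arc, Mutex};

    async fn bump(counter: Arc<Mutex<u64>>) {
        {
            let mut guard = counter.lock().unwrap();
            *guard += 1;
        } // guard dropped here, within a single call to poll()

        tokio::task::yield_now().await; // no !Send value lives across this await

        *counter.lock().unwrap() += 1;  // re-acquire after the await if needed
    }
    // The generated future no longer stores a MutexGuard as a field, so it is
    // Send and can be spawned onto a work-stealing runtime.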

20. benmmurphy ◴[] No.45042479[source]
The original poster was 'lucky' that he was using a work-stealing engine. If the engine was not moving tasks between threads then I think the compiler would have been happy, and he could have had the fun of debugging what happens when the same thread tries to lock the same mutex twice. The Rust compiler won't save you from this kind of bug.

> The exact behavior on locking a mutex in the thread which already holds the lock is left unspecified. However, this function will not return on the second call (it might panic or deadlock, for example).
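
A contrived sketch of that failure mode on a single-threaded runtime (assuming tokio's current-thread flavor; scheduling order makes this illustrative rather than a guaranteed repro):

    use std::sync::{Arc, Mutex};
    use tokio::task::{spawn_local, yield_now, LocalSet};

    #[tokio::main(flavor = "current_thread")]
    async fn main() {
        let m = Arc::new(Mutex::new(0u32));
        let m2 = m.clone();
        LocalSet::new()
            .run_until(async move {
                spawn_local(async move {
                    let _guard = m2.lock().unwrap(); // task A takes the lock...
                    yield_now().await;               // ...and yields while holding it
                });
                yield_now().await;                   // give task A a chance to run
                // The same OS thread now locks a mutex it already holds:
                // unspecified behavior per the std docs (panic or deadlock).
                let _second = m.lock().unwrap();
            })
            .await;
    }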

21. maplant ◴[] No.45042735{3}[source]
It sounds like you have a deadlock somewhere in your code and it's unlikely that the choice of Mutex really fixed that
replies(1): >>45043991 #
22. vlovich123 ◴[] No.45043991{4}[source]
Or you do something super expensive while holding the mutex.

Or your server is heavily contended enough that all worker threads are blocked on this mutex and no reactor can make forward progress.

23. vlovich123 ◴[] No.45044049{3}[source]
I think that’s a dangerous recommendation: it ignores the consequence of doing this within a single-threaded reactor (your app will hang permanently if the lock is contended), or assumes you won’t have N threads trying to acquire this lock where N is the size of your thread pool, which is entirely possible on a heavily accessed server if the core state that needs to be accessed by every request is locked with said mutex.
replies(1): >>45044232 #
24. vlovich123 ◴[] No.45044110{3}[source]
Imagine if every thread tries to acquire this mutex on every HTTP request. And then you also lock it in a spawn_blocking. On a heavily contended lock, all the reactor threads will end up stuck waiting for the spawn_blocking to finish, and the service will appear hung since even requests that don’t need that lock can’t get processed. And if spawn_blocking is doing synchronous network I/O or reading a file from a hung NFS mount, you’re completely hosed.
replies(1): >>45044763 #
25. vlovich123 ◴[] No.45044123{3}[source]
Most people on this forum are not attending RustConf. It might be helpful to post at least the abstract of your idea.
replies(1): >>45044490 #
26. maplant ◴[] No.45044232{4}[source]
If you're using a single threaded reactor you don't need a mutex at all; a Rc<RefCell<_>> will do just fine. And if you want other tasks to yield until you're done holding the borrow, the solution is simple: don't await until you're done. https://docs.rs/tokio/latest/tokio/task/struct.LocalSet.html...

There are absolutely situations where tokio's mutex and rwlock are useful, but the vast majority of the time you shouldn't need them.
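
For instance, a sketch of the Rc<RefCell<_>> version (the state and task here are made up):

    use std::cell::RefCell;
    use std::rc::Rc;
    use tokio::task::LocalSet;

    #[tokio::main(flavor = "current_thread")]
    async fn main() {
        let state = Rc::new(RefCell::new(Vec::<u32>::new()));
        let local = LocalSet::new();

        let s = state.clone();
        local
            .run_until(async move {
                tokio::task::spawn_local(async move {
                    // Borrow, mutate, and release without awaiting in between.
                    s.borrow_mut().push(1);
                })
                .await
                .unwrap();
            })
            .await;

        println!("{:?}", state.borrow());
    }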

replies(1): >>45045733 #
27. sunshowers ◴[] No.45044490{4}[source]
The big thing is that futures are passive, so any future can be cancelled at any await point by dropping it or not polling it any more. So if you have a situation like this, which in my experience is a very common way to use mutexes:

  let mut guard = mutex.lock().await;
  // guard.data is Option<T>, Some to begin with
  let data = guard.data.take(); // guard.data is now None

  let new_data = process_data(data).await;
  guard.data = Some(new_data); // guard.data is Some again

Then you could cancel the future at the await point in between while the lock is held, and as a result guard.data will not be restored to Some.
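
For example, one concrete way that cancellation can happen (a sketch; update_data is a made-up wrapper around the code above):

  // If the timeout elapses while process_data(..).await is still pending,
  // the update_data future is dropped at that await point. The tokio
  // MutexGuard is released on drop, but the data behind the mutex is left
  // as None for whoever locks it next.
  let result = tokio::time::timeout(
      std::time::Duration::from_millis(100),
      update_data(&mutex),
  )
  .await;
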
replies(2): >>45045706 #>>45045804 #
28. rcxdude ◴[] No.45044763{4}[source]
You would be in a similarly bad situation if such a lock was async as well. You want to keep locks as short as possible, regardless, and a hung task with a lock is generally going to propagate issues throughout the system until it's dealt with.
replies(1): >>45045753 #
29. neandrake ◴[] No.45045706{5}[source]
Same would be true for any resource that needs to be cleaned up, right? Referring to stop-polling-the-future as canceling is probably not good nomenclature. Typically canceling some work requires cleanup, if only to be graceful, let alone properly releasing resources.
replies(1): >>45045790 #
30. vlovich123 ◴[] No.45045733{5}[source]
As you know, Rc<RefCell> won't work for a data structure that's being accessed by other threads (e.g. spawn_blocking or explicitly spawned threads). Just because you are using the single-threaded reactor does not mean you don't have other threads accessing the data.
replies(1): >>45045859 #
31. vlovich123 ◴[] No.45045753{5}[source]
If you have an async lock that can be held across await points, your motivation to use spawn_blocking decreases. Additionally, even though the long-running lock would be a problem, APIs that aren’t taking that lock continue to work vs the entire HTTP service being hung, which is a huge difference (e.g. health monitoring would time out, making the service seem unresponsive, vs 1 highly contended task just waiting on a lock).
replies(1): >>45047260 #
32. sunshowers ◴[] No.45045790{6}[source]
Yes, this is true of any resource. But Tokio mutexes, being shared mutable state, are inherently likely to run into bugs in production.

In the Rust community, cancellation is pretty well-established nomenclature for this.

Hopefully the video of my talk will be up soon after RustConf, and I'll make a text version of it as well for people that prefer reading to watching.

replies(1): >>45045836 #
33. vlovich123 ◴[] No.45045804{5}[source]
I'm not sure this introduces any new failure though:

    let data = mutex.lock().unwrap().take();
    let new_data = process_data(data).await;
    *mutex.lock().unwrap() = Some(new_data);

Here you are using a traditional lock, and a cancellation at process_data results in the lock holding the undesired state you're worried about. It's a general footgun of cancellation and asynchronous tasks that at every await boundary your data has to be in some kind of valid, internally consistent state, because the await may never return. To fix this more robustly you'd need the async drop language feature.
replies(1): >>45045842 #
34. neandrake ◴[] No.45045836{7}[source]
Thank you, I look forward to watching your presentation.
35. sunshowers ◴[] No.45045842{6}[source]
True! This is the case with std mutexes as well. But holding a std MutexGuard across an await point makes the future not Send, and therefore not typically spawnable on a Tokio runtime [1]. This isn't really an intended design decision, though, as far as I can tell -- just one that worked out to avoid this footgun.

Tokio MutexGuards are Send, unfortunately, so they are really prone to cancellation bugs.

(There's a related discussion about panic-based cancellations and mutex poisoning, which std's mutex has but Tokio's doesn't.)

[1] spawn_local does exist, though I guess most people don't use it.

replies(1): >>45045943 #
36. maplant ◴[] No.45045859{6}[source]
It seems like you've found a very specific situation where tokio's Mutex is helpful, but it is simply not common to structure an async Rust application with a single-threaded reactor and copious uses of spawn_blocking. The advice to go for the std Mutex first is generally good.
37. vlovich123 ◴[] No.45045943{7}[source]
Your argument then is that the requirement to acquire the lock multiple times makes it more likely you'll think about cancellation & keeping it in a valid interim state? Otherwise I'm not sure how MutexGuards being Send really makes it any more or less prone to cancellation bugs.
replies(2): >>45046029 #>>45049526 #
38. sunshowers ◴[] No.45046029{8}[source]
Right, in general a lot of uses of mutexes are to temporarily violate invariants that are otherwise upheld while the mutex is released, something that can usually be reasoned out locally. Reasoning about cancellation at an await point is inherently non-local and so is much harder to do. (And Rust is all about scaling up local reasoning to global correctness, so async cancellation feels like a knife in the back to many practitioners.)

The generally recommended alternative is message passing/channels/"actor model" where there's a single owner of data which ensures cancellation doesn't occur -- or, at least that if cancellation happens the corresponding invalid state is torn down as well. But that has its own pitfalls, such as starvation.

This is all very unsatisfying, unfortunately.

39. rcxdude ◴[] No.45047260{6}[source]
>e.g. health monitoring would time out making the service seem unresponsive vs 1 highly contended task just waiting on a lock

If anything that's a disadvantage. You want your health monitoring to be the canary, not something that keeps on trucking even if the system is no longer doing useful work. (See the classic safety critical software fail of 'I need a watchdog... I'll just feed it regularly in an isolated task')

replies(1): >>45053738 #
40. sunshowers ◴[] No.45049526{8}[source]
I realize now that you were proposing unlocking and relocking the mutex several times. Generally, if someone unlocks a mutex in the middle, they're already thinking about races with other callers (even if the code isn't cancelled in between).

You can definitely argue that developers should think about await points the same way they think about letting go of the mutex entirely, in case cancellation happens. Are mutexes conducive to that kind of thinking? Practically, I've found this to be very easy to get wrong.

41. vlovich123 ◴[] No.45053738{7}[source]
I disagree and hope I can convince you. You're assuming that any request has an equal probability of causing this problem so it's kind of OK if the server deadlocks under load so that the failing health check triggers a restart by whatever is watching it. However, let's say you have three API endpoints:

    /healthz
    /very_common_operation
    /may_deadlock_server

Normally, /may_deadlock_server doesn't get enough traffic to cause problems (let's say it's 10 RPS and 1000 RPS is /very_common_operation and the server operates fine). However, a sudden influx of requests to /may_deadlock_server may cause your service to deadlock (and not a lot, let's say on the order of a few hundred requests). Do you still want the server to lock up completely and forever and wait for a healthz timeout to reboot the service? What if healthz still remains fine but the entire service goes from 10ms response times for requests to 200ms, just enough to cause problems but not enough to make healthz actually unavailable? And all this just because /may_deadlock_server saw a spike in traffic. Also, the failing healthz check just restarts your service, but it won't mitigate the traffic spike if it's sustained. Now consider also that /may_deadlock_server is a trivial gadget for an attacker to DoS your site.

Or do you want the web server responding healthily and relying on metrics and alerts to let you know that /may_deadlock_server is taking a long time to handle requests / impacting performance? Your health monitoring is an absolute last step for automatically mitigating an issue, but it'll only help if the bug is some state stuck in a transient condition - if it'll restart into the same conditions leading to the starvation then you're just going to be in an infinite reboot loop, which is worse.

Healthz is not an alternative to metrics and alerting - it's a last stopgap measure to try to automatically get out of a bad situation. But it can also cause a worse problem if the situation is outside of the state of the service - so generally you want the service to remain available if a reboot wouldn't fix the problem.