Concurrency in Haskell: Fast, Simple, Correct

1. FuckButtons ◴[17 Apr 25 03:24 UTC] No.43712793[source]▶

I thought it was a bit odd that the author claims there’s no mutexes in sight, the TVar is effectively a mutex guard unless I’m misunderstanding this? (I’ve written exactly 0 lines of Haskel). Or is the claim that the lack of ceremony and accidental complexity around threading is the real win for concurrency here?

replies(6): >>43713040 #>>43713043 #>>43713044 #>>43713127 #>>43714265 #>>43714335 #

2. chongli ◴[17 Apr 25 04:11 UTC] No.43713040[source]▶

>>43712793 (TP) #

No, a TVar is not a mutex guard. A TVar is a software transactional memory (STM) variable. STM works just like a database: you batch together a sequence of operations into a transaction and then execute them. During execution of a transaction, all changes made to the contents of the TVar are stored in a transaction log. If some other transaction occurs during the execution then the whole thing is aborted and re-run.

This can take any ordinary Haskell data structure and give you a lock-free concurrent data structure with easy-to-use transactional semantics. How it performs is another matter! That depends on the amount of contention and the cost of re-playing transactions.

replies(2): >>43713062 #>>43715091 #

3. pkjens04 ◴[17 Apr 25 04:11 UTC] No.43713043[source]▶

>>43712793 (TP) #

A mutex locks the thread, which doesn't happen with TVar. For mutexes TMVar would be used.

4. dsign ◴[17 Apr 25 04:11 UTC] No.43713044[source]▶

>>43712793 (TP) #

You are correct, Haskell has quite a few mutex-like types. MVar is one of them.

However, if memory serves me right, TVar is a building block for the transactional memory subsystem. The guard on TVar with, say, modifyTVar is not really stopping execution at entrance but simply indicating that the block modifies the variable. In my mental model, some magic happens in an STM block that checks if two concurrent STM blocks acted upon the same data at the same time, and if so, it reverts the computations of one of the blocks and repeats them with new data.

To my knowledge, Haskell is the only programming language (+runtime) that has a working transactional memory subsystem. It has been in the language for about 20 years, and in that time many have tried (and failed) to also implement STM.

replies(2): >>43713050 #>>43713083 #

5. dionian ◴[17 Apr 25 04:12 UTC] No.43713050[source]▶

>>43713044 #

https://zio.dev/reference/stm/

replies(1): >>43714227 #

6. whateveracct ◴[17 Apr 25 04:14 UTC] No.43713062[source]▶

>>43713040 #

https://hackage.haskell.org/package/stm-containers

This library is full of STM-oriented data structures. They perform better than a simple `TVar (Map k v)`.

It's kind of a fun trick actually. The stock Map is just a tree. The STM Map is also a tree [1] but with TVars at each node. So this helps a lot with contention - you only contend along a "spine" instead of across the whole tree, which is O(log n).

[1] Technically a HAMT a la unordered-containers - trie, tree, you get the idea :)

7. whateveracct ◴[17 Apr 25 04:18 UTC] No.43713083[source]▶

>>43713044 #

I think Clojure has some kind of STM too?

Haskell's STM is pretty world-class though. That's fair to say :)

replies(1): >>43713157 #

8. dwohnitmok ◴[17 Apr 25 04:27 UTC] No.43713127[source]▶

>>43712793 (TP) #

No a TVar isn't a mutex guard. As a sibling comment points out it gives you transactional semantics similar to most relational databases.

Here's an example in perhaps more familiar pseudocode.

  var x = "y is greater than 0"
  var y = 1
  
  forkAndRun {() =>
    y = y - 1
    if (y <= 0) {
      x = "y is less than or equal to 0"
    }
  }
  
  forkAndRun {() =>
    y = y + 1
    if (y > 0) {
      x = "y is greater than 0"
    }
  }

In the above example, it's perfectly possible, depending on how the forked code blocks interact with each other, to end up with

  x = "y is less than or equal to 0"
  y = 1

because we have no guarantee of atomicity/transactionality in what runs within the `forkAndRun` blocks.

The equivalent of what that Haskell code is doing is replacing `var` with a new keyword `transactional_var` and introducing another keyword `atomically` such that we can do

  transactional_var x = "y is greater than 0"
  transactional_var y = 1
  
  forkAndRun {
    atomically {() =>
      y = y - 1
      if (y <= 0) {
        x = "y is less than or equal to 0"
      }
    }
  }
  
  forkAndRun {
    atomically {() =>
      y = y + 1
      if (y > 0) {
        x = "y is greater than 0"
      }
    }
  }

and never end up with a scenario where `x` and `y` disagree with each other, because all their actions are done atomically together and `x` and `y` are specifically marked so that in an atomic block all changes to the variables either happen together or are all rolled back together (and tried again), just like in a database.

`transactional_var` is the equivalent of a `TVar` and `atomically` is just `atommically`.

9. dwohnitmok ◴[17 Apr 25 04:35 UTC] No.43713157{3}[source]▶

>>43713083 #

Clojure's STM never really took off because, for various reasons, it's not as easy to compose as Haskell's (where you can build up a big library of STM blocks and piece them together at the very edges of your program). As such Clojure's STM implementation doesn't actually have a great reputation within the Clojure ecosystem where it isn't usually used in most production codebases (whereas in Haskell STM is often one of the first tools used in any production codebase with concurrency).

Basically it's the difference between focusing only on transactional variables without having a good way of marking what is and isn't part of a larger transaction and having a higher-order abstraction of an `STM` action that clearly delineates what things are transactions and what aren't.

replies(2): >>43714468 #>>43718199 #

10. mrkeen ◴[17 Apr 25 08:06 UTC] No.43714227{3}[source]▶

>>43713050 #

> Implication of Using STM Running I/O Inside STM— There is a strict boundary between the STM world and the ZIO world. This boundary propagates even deeper because we are not allowed to execute arbitrary effects in the STM universe. Performing side effects and I/O operations inside a transaction is problematic. In the STM the only effect that exists is the STM itself. We cannot print something or launch a missile inside a transaction as it will nondeterministically get printed on every reties that transaction does that.

Does Zio actually offer any protection here, or is it just telling the reader that they're on their own and should be wary of footguns?

replies(2): >>43716861 #>>43718605 #

11. mrkeen ◴[17 Apr 25 08:15 UTC] No.43714265[source]▶

>>43712793 (TP) #

Mutexes lock code, TVars lock data.

If you lock a section of code (to protect data), there's no guarantee against mutations of that data from other sections of code.

If you lock the data itself, you can freely pass it around and anyone can operate on it concurrently (and reason about it as if it were single-threaded).

It's the same approach as a transactional database, where you share one gigantic bucket of mutable state with many callers, yet no-one has to put acquire/release/synchronise into their SQL statements.

12. ghusbands ◴[17 Apr 25 08:32 UTC] No.43714335[source]▶

>>43712793 (TP) #

As siblings note, TVar is a transactional variable. However, it's not just protective against concurrent writes but also against concurrent reads of altered variables, so it offers true atomicity across any accessed state in a transaction.

So if you have a thread altering `foo` and checking that `foo+bar` isn't greater than 5 and a thread altering `bar` and checking the same, then it's guaranteed that `foo+bar` does not exceed 5. Whereas if only write conflicts were detected (as is default with most databases) then `foo+bar` could end up greater than 5 through parallel changes.

13. pjmlp ◴[17 Apr 25 08:59 UTC] No.43714468{4}[source]▶

>>43713157 #

You missed a very important detail, the language runtime.

While Haskell's runtime is designed for Haskell needs, Clojure has to be happy with whatever JVM designers considered relevant for Java the language, the same on the other platforms targeted by Clojure.

This is yet another example of a platform being designed for a language, and being a guest language on a platform.

replies(1): >>43716662 #

14. quibono ◴[17 Apr 25 10:54 UTC] No.43715091[source]▶

>>43713040 #

> How it performs is another matter!

I know you say it depends on how much contention one sees but I'm interested in the performance hit. Also, is STM the "standard" (or accepted) way to do async in Haskell?

15. dwohnitmok ◴[17 Apr 25 13:39 UTC] No.43716662{5}[source]▶

>>43714468 #

I don't think this is a limitation of the JVM. When I've used Clojure's STM implementation it's been perfectly serviceable (barring the composability issues I mentioned). Likewise when I've used the various STM libraries in Scala. Eta (basically a Haskell implementation on the JVM that unfortunately stalled in development) also had a fine STM implementation.

It's more of a combination of API and language decisions rather than the underlying JVM.

16. dionian ◴[17 Apr 25 13:52 UTC] No.43716861{4}[source]▶

>>43714227 #

I am not super familiar with it, great question - my guess would be its the latter! Does Haskell's provide protections?

17. eduction ◴[17 Apr 25 15:21 UTC] No.43718199{4}[source]▶

>>43713157 #

My impression at least watching chatter over the last several years isn’t that it has a bad reputation but rather that people haven’t found a need for it, atoms are good enough for vast bulk of shared mutable state. Heck even Datomic, an actual bona fide database, doesn’t need STM it’s apparently all just an atom.

But I’ve never heard someone say it messed up in any way, that it was buggy or hard to use or failed to deliver on its promises.

replies(1): >>43719540 #

18. hackingonempty ◴[17 Apr 25 15:50 UTC] No.43718605{4}[source]▶

>>43714227 #

STM happens inside the STM monad while regular effects happen in the ZIO monad. If you try to do ZIO effects inside an STM transaction you'll get a type error.

Scala doesn't enforce purity like Haskell though so it wont stop you if you call some normal Scala or Java code with side effects. In practice its not a problem because you're wrapping any effectful outside APIs before introducing them into your code.

19. robto ◴[17 Apr 25 17:02 UTC] No.43719540{5}[source]▶

>>43718199 #

Clojure atoms use STM, though. I've been writing Clojure for almost a decade now, it's not that STM isn't great, it's just that immutable data will carry you a very long way - you just don't need coordinated mutation except in very narrow circumstances. In those circumstances STM is great! I have no complaints. But it just doesn't come up very often.

replies(2): >>43719854 #>>43724093 #

20. dwohnitmok ◴[17 Apr 25 17:30 UTC] No.43719854{6}[source]▶

>>43719540 #

I would say to the contrary it would come up all the time if the right idioms were in place.

For example, when it comes to concurrent access to a map the Clojure community generally forces a dichotomy, either stick a standard Clojure map in an atom and get fully atomic semantics at the expense of serial write performance or use a Java ConcurrentMap at the expense of inter-key atomicity (or do a more gnarly atom around a map itself containing atoms which gets quite messy quite fast).

Such a stark tradeoff doesn't need to exist! In theory STM gives you exactly the granularity you need where you can access the keys that you need atomicity for and only those keys together while allowing concurrent writes to anything else that doesn't touch those keys (this is exactly how e.g. the stm-containers library for Haskell works that's linked elsewhere).

21. eduction ◴[18 Apr 25 01:38 UTC] No.43724093{6}[source]▶

>>43719540 #

That’s incorrect. Only refs+dosync use stm. https://clojure.org/reference/refs

Not atoms.

From Hickey’s History of Clojure paper:

“ Taking on the design and implementation of an STM was a lot to add atop designing a programming language. In practice, the STM is rarely needed or used. It is quite common for Clojure programs to use only atoms for state, and even then only one or a handful of atoms in an entire program. But when a program needs coordinated state it really needs it, and without the STM I did not think Clojure would be fully practical.”

https://dl.acm.org/doi/pdf/10.1145/3386321

Atoms do an atomic compare and swap. It’s not the same thing.

replies(2): >>43724145 #>>43733941 #

22. eduction ◴[18 Apr 25 01:50 UTC] No.43724145{7}[source]▶

>>43724093 #

PS I agree atoms and stm are both solid though — and that you can go a very long way without touching either!

23. robto ◴[19 Apr 25 03:14 UTC] No.43733941{7}[source]▶

>>43724093 #

Haha, I read The Joy of Clojure way back in 2013 and conflated the different reference types with STM. So thanks for mentioning that, I always thought it weird that you'd need STM for vars and atoms too.

That said, I have never used a ref, nor seen one in use outside of a demo blogpost.

replies(1): >>43752241 #

24. eduction ◴[21 Apr 25 14:13 UTC] No.43752241{8}[source]▶

>>43733941 #

Totally! They are rare. Cheers