This can take any ordinary Haskell data structure and give you a lock-free concurrent data structure with easy-to-use transactional semantics. How it performs is another matter! That depends on the amount of contention and the cost of re-playing transactions.
However, if memory serves me right, TVar is a building block for the transactional memory subsystem. The guard on TVar with, say, modifyTVar is not really stopping execution at entrance but simply indicating that the block modifies the variable. In my mental model, some magic happens in an STM block that checks if two concurrent STM blocks acted upon the same data at the same time, and if so, it reverts the computations of one of the blocks and repeats them with new data.
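To make that mental model concrete, here is a minimal sketch using the `stm` package. `runCounter` is a name made up for this example. Several threads bump one `TVar` with `modifyTVar'`; nothing blocks on entry, and as far as I understand it the runtime re-runs any transaction whose reads turned out to be stale at commit time.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM (atomically, modifyTVar', newTVarIO, readTVarIO)
import Control.Monad (forM_, replicateM_)

-- Hypothetical helper: fork `threads` workers, each doing `perThread`
-- increments, and return the final counter value.
runCounter :: Int -> Int -> IO Int
runCounter threads perThread = do
  counter <- newTVarIO (0 :: Int)
  done <- newEmptyMVar
  forM_ [1 .. threads] $ \_ -> forkIO $ do
    -- Each increment is its own small transaction.
    replicateM_ perThread (atomically (modifyTVar' counter (+ 1)))
    putMVar done ()
  -- Wait for every worker to signal completion.
  replicateM_ threads (takeMVar done)
  readTVarIO counter

main :: IO ()
main = runCounter 10 1000 >>= print
```

No increments are lost, so the final value is always `threads * perThread`, however the threads interleave.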
To my knowledge, Haskell is the only programming language (+runtime) that has a working transactional memory subsystem. It has been in the language for about 20 years, and in that time many have tried (and failed) to also implement STM.
This library is full of STM-oriented data structures. They perform better than a simple `TVar (Map k v)`.
It's kind of a fun trick actually. The stock Map is just a tree. The STM Map is also a tree [1] but with TVars at each node. So this helps a lot with contention: you only contend along a "spine" of O(log n) nodes instead of across the whole tree.
[1] Technically a HAMT a la unordered-containers - trie, tree, you get the idea :)
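A rough sketch of the "TVar at each node" idea, using a plain unbalanced binary search tree rather than the HAMT the library actually uses. The `Tree`, `insert` and `member` names are invented for this example and are not the library's API.

```haskell
import Control.Concurrent.STM

-- Each child pointer is its own TVar, so a transaction only touches
-- the nodes on the path it walks.
data Tree k = Leaf | Node k (TVar (Tree k)) (TVar (Tree k))

insert :: Ord k => k -> TVar (Tree k) -> STM ()
insert k ref = do
  t <- readTVar ref
  case t of
    Leaf -> do
      l <- newTVar Leaf
      r <- newTVar Leaf
      writeTVar ref (Node k l r)   -- the only write: one TVar at the bottom
    Node k' l r
      | k < k'    -> insert k l
      | k > k'    -> insert k r
      | otherwise -> pure ()       -- key already present

member :: Ord k => k -> TVar (Tree k) -> STM Bool
member k ref = do
  t <- readTVar ref
  case t of
    Leaf -> pure False
    Node k' l r
      | k < k'    -> member k l
      | k > k'    -> member k r
      | otherwise -> pure True

main :: IO ()
main = do
  root <- newTVarIO Leaf
  atomically (mapM_ (`insert` root) [5, 2, 8 :: Int])
  found <- atomically (member 8 root)
  missing <- atomically (member 7 root)
  print (found, missing)
```

Two inserts that end up in different subtrees read the shared ancestors but write different TVars, so neither invalidates the other. With a single `TVar (Map k v)` every write would conflict with every other transaction that read the map.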
Haskell's STM is pretty world-class though. That's fair to say :)
Here's an example in perhaps more familiar pseudocode.
    var x = "y is greater than 0"
    var y = 1
    forkAndRun {() =>
        y = y - 1
        if (y <= 0) {
            x = "y is less than or equal to 0"
        }
    }
    forkAndRun {() =>
        y = y + 1
        if (y > 0) {
            x = "y is greater than 0"
        }
    }
In the above example, it's perfectly possible, depending on how the forked code blocks interact with each other, to end up with `x = "y is less than or equal to 0"` and `y = 1`, because we have no guarantee of atomicity/transactionality in what runs within the `forkAndRun` blocks.
The equivalent of what that Haskell code is doing is replacing `var` with a new keyword `transactional_var` and introducing another keyword `atomically` such that we can do
    transactional_var x = "y is greater than 0"
    transactional_var y = 1
    forkAndRun {
        atomically {() =>
            y = y - 1
            if (y <= 0) {
                x = "y is less than or equal to 0"
            }
        }
    }
    forkAndRun {
        atomically {() =>
            y = y + 1
            if (y > 0) {
                x = "y is greater than 0"
            }
        }
    }
and never end up with a scenario where `x` and `y` disagree with each other. All their actions are done atomically together, and `x` and `y` are specifically marked so that in an atomic block all changes to the variables either happen together or are all rolled back together (and tried again), just like in a database. `transactional_var` is the equivalent of a `TVar` and `atomically` is just `atomically`.
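For anyone who wants to run it, here is roughly how I'd translate the pseudocode above into real Haskell with the `stm` package. The `decrement`, `increment` and `runBoth` names are mine, and the `MVar` is only there so the main thread waits for both forks.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM
import Control.Monad (when)

-- Decrement y and update the message as one transaction.
decrement :: TVar String -> TVar Int -> STM ()
decrement x y = do
  modifyTVar' y (subtract 1)
  v <- readTVar y
  when (v <= 0) $ writeTVar x "y is less than or equal to 0"

-- Increment y and update the message as one transaction.
increment :: TVar String -> TVar Int -> STM ()
increment x y = do
  modifyTVar' y (+ 1)
  v <- readTVar y
  when (v > 0) $ writeTVar x "y is greater than 0"

-- Run both transactions on separate threads and return the final state.
runBoth :: IO (String, Int)
runBoth = do
  x <- newTVarIO "y is greater than 0"
  y <- newTVarIO (1 :: Int)
  done <- newEmptyMVar
  _ <- forkIO (atomically (decrement x y) >> putMVar done ())
  _ <- forkIO (atomically (increment x y) >> putMVar done ())
  takeMVar done
  takeMVar done
  -- Read both variables in a single transaction for a consistent snapshot.
  atomically ((,) <$> readTVar x <*> readTVar y)

main :: IO ()
main = runBoth >>= print
```

Whichever transaction commits first, the other sees its result, so the final state is always `("y is greater than 0",1)`.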
Basically it's the difference between focusing only on transactional variables without having a good way of marking what is and isn't part of a larger transaction and having a higher-order abstraction of an `STM` action that clearly delineates what things are transactions and what aren't.
Does Zio actually offer any protection here, or is it just telling the reader that they're on their own and should be wary of footguns?
If you lock a section of code (to protect data), there's no guarantee against mutations of that data from other sections of code.
If you lock the data itself, you can freely pass it around and anyone can operate on it concurrently (and reason about it as if it were single-threaded).
It's the same approach as a transactional database, where you share one gigantic bucket of mutable state with many callers, yet no-one has to put acquire/release/synchronise into their SQL statements.
So if you have a thread altering `foo` and checking that `foo+bar` isn't greater than 5 and a thread altering `bar` and checking the same, then it's guaranteed that `foo+bar` does not exceed 5. Whereas if only write conflicts were detected (as is default with most databases) then `foo+bar` could end up greater than 5 through parallel changes.
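A small sketch of that `foo`/`bar` case, assuming a limit of 5 and a made-up helper `bumpIfRoom`. Each transaction reads both variables but writes only one. Because GHC's STM validates what a transaction read as well as what it wrote, the two cannot both commit on stale reads.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM

-- Hypothetical helper: add n to `target` only if target + other
-- stays within `limit`. Returns whether the update was applied.
bumpIfRoom :: Int -> TVar Int -> TVar Int -> Int -> STM Bool
bumpIfRoom limit target other n = do
  t <- readTVar target
  o <- readTVar other
  if t + n + o <= limit
    then writeTVar target (t + n) >> pure True
    else pure False

main :: IO ()
main = do
  foo <- newTVarIO 0
  bar <- newTVarIO 0
  done <- newEmptyMVar
  -- One thread alters foo, the other alters bar; both check the sum.
  _ <- forkIO (atomically (bumpIfRoom 5 foo bar 3) >>= putMVar done)
  _ <- forkIO (atomically (bumpIfRoom 5 bar foo 3) >>= putMVar done)
  r1 <- takeMVar done
  r2 <- takeMVar done
  total <- atomically ((+) <$> readTVar foo <*> readTVar bar)
  print (r1 /= r2, total)
```

Exactly one of the two bumps is applied, so the sum ends at 3 rather than 6.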
While Haskell's runtime is designed for Haskell's needs, Clojure has to be happy with whatever the JVM designers considered relevant for Java the language. The same goes for the other platforms targeted by Clojure.
This is yet another example of the difference between a platform being designed for a language and a language being a guest on a platform.
It's more of a combination of API and language decisions rather than the underlying JVM.
But I’ve never heard someone say it messed up in any way, that it was buggy or hard to use or failed to deliver on its promises.
Scala doesn't enforce purity like Haskell though, so it won't stop you if you call some normal Scala or Java code with side effects. In practice it's not a problem because you're wrapping any effectful outside APIs before introducing them into your code.
For example, when it comes to concurrent access to a map the Clojure community generally forces a dichotomy: either stick a standard Clojure map in an atom and get fully atomic semantics at the expense of serial write performance, or use a Java ConcurrentMap at the expense of inter-key atomicity (or do a more gnarly atom around a map itself containing atoms, which gets quite messy quite fast).
Such a stark tradeoff doesn't need to exist! In theory STM gives you exactly the granularity you need: you can access the keys that you need atomicity for, and only those keys, together, while allowing concurrent writes to anything else that doesn't touch those keys (this is exactly how e.g. the stm-containers library for Haskell, linked elsewhere, works).
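To illustrate the granularity point without pulling in stm-containers, here is a sketch with a fixed-key `Data.Map` whose values are `TVar`s. The `Accounts` type and `transfer` function are invented for this example. A transfer touches only the two keys involved, so transactions on other keys never conflict with it.

```haskell
import Control.Concurrent.STM
import qualified Data.Map.Strict as Map

-- Assumption for this sketch: the key set is fixed up front, so the
-- Map itself is immutable and only the per-key TVars change.
type Accounts = Map.Map String (TVar Int)

-- Move n from one key to another atomically; False if a key is
-- missing or the source balance is too low.
transfer :: Accounts -> String -> String -> Int -> STM Bool
transfer accts from to n =
  case (Map.lookup from accts, Map.lookup to accts) of
    (Just a, Just b) -> do
      bal <- readTVar a
      if bal >= n
        then do
          writeTVar a (bal - n)
          modifyTVar' b (+ n)
          pure True
        else pure False
    _ -> pure False

main :: IO ()
main = do
  accts <- Map.fromList <$>
    mapM (\(k, v) -> (,) k <$> newTVarIO v) [("a", 10), ("b", 0), ("c", 7)]
  ok <- atomically (transfer accts "a" "b" 4)
  -- Snapshot every balance in one transaction.
  balances <- atomically (mapM readTVar accts)
  print (ok, Map.toList balances)
```

Supporting inserts and deletes of keys is where a structure like the one in stm-containers comes in; this sketch sidesteps that by never changing the key set.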
Not atoms.
From Hickey’s History of Clojure paper:
“Taking on the design and implementation of an STM was a lot to add atop designing a programming language. In practice, the STM is rarely needed or used. It is quite common for Clojure programs to use only atoms for state, and even then only one or a handful of atoms in an entire program. But when a program needs coordinated state it really needs it, and without the STM I did not think Clojure would be fully practical.”
https://dl.acm.org/doi/pdf/10.1145/3386321
Atoms do an atomic compare and swap. It’s not the same thing.
That said, I have never used a ref, nor seen one in use outside of a demo blogpost.