This can take any ordinary Haskell data structure and give you a lock-free concurrent data structure with easy-to-use transactional semantics. How it performs is another matter! That depends on the amount of contention and the cost of re-playing transactions.
However, if memory serves me right, TVar is a building block for the transactional memory subsystem. The guard on TVar with, say, modifyTVar is not really stopping execution at entrance but simply indicating that the block modifies the variable. In my mental model, some magic happens in an STM block that checks if two concurrent STM blocks acted upon the same data at the same time, and if so, it reverts the computations of one of the blocks and repeats them with new data.
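Roughly, in code (my own sketch using the stock `stm` API, nothing from the article):

import Control.Concurrent.STM

-- Both transactions read and write the same TVars; the runtime runs each
-- optimistically, validates the reads at commit time, and re-runs one of
-- them if they overlapped. Nothing blocks "at entrance" to modifyTVar'.
transfer :: TVar Int -> TVar Int -> Int -> IO ()
transfer from to n = atomically $ do
  modifyTVar' from (subtract n)
  modifyTVar' to   (+ n)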
To my knowledge, Haskell is the only programming language (+runtime) that has a working transactional memory subsystem. It has been in the language for about 20 years, and in that time many have tried (and failed) to also implement STM.
This library is full of STM-oriented data structures. They perform better than a simple `TVar (Map k v)`.
It's kind of a fun trick actually. The stock Map is just a tree. The STM Map is also a tree [1], but with a TVar at each node. That helps a lot with contention: you only contend along the O(log n) "spine" from the root down to your key instead of across the whole structure.
[1] Technically a HAMT a la unordered-containers - trie, tree, you get the idea :)
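For contrast, the naive thing the parent mentions, as my own sketch: one big TVar around an ordinary Map means every write rewrites that single TVar, so any two concurrent writers conflict even on unrelated keys.

import Control.Concurrent.STM
import qualified Data.Map.Strict as Map

-- The whole map lives behind one TVar, so it is a single contention point:
-- every insert invalidates every other in-flight transaction that read it.
insertNaive :: Ord k => TVar (Map.Map k v) -> k -> v -> STM ()
insertNaive var k v = modifyTVar' var (Map.insert k v)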
The concurrency stuff in the stdlib plus the mainstays in the ecosystem are pretty stable and noncontroversial... there are parts of Haskell that churn, but this is not one of them.
Haskell's STM is pretty world-class though. That's fair to say :)
Here's an example in perhaps more familiar pseudocode.
var x = "y is greater than 0"
var y = 1
forkAndRun {() =>
  y = y - 1
  if (y <= 0) {
    x = "y is less than or equal to 0"
  }
}
forkAndRun {() =>
  y = y + 1
  if (y > 0) {
    x = "y is greater than 0"
  }
}
In the above example, it's perfectly possible, depending on how the forked code blocks interleave with each other, to end up with

x = "y is less than or equal to 0"
y = 1

because we have no guarantee of atomicity/transactionality in what runs within the `forkAndRun` blocks. The equivalent of what that Haskell code is doing is replacing `var` with a new keyword `transactional_var` and introducing another keyword `atomically`, such that we can do
transactional_var x = "y is greater than 0"
transactional_var y = 1
forkAndRun {
  atomically {() =>
    y = y - 1
    if (y <= 0) {
      x = "y is less than or equal to 0"
    }
  }
}
forkAndRun {
  atomically {() =>
    y = y + 1
    if (y > 0) {
      x = "y is greater than 0"
    }
  }
}
and never end up with a scenario where `x` and `y` disagree with each other, because all their actions are done atomically together: `x` and `y` are specifically marked so that within an atomic block all changes to them either happen together or are all rolled back together (and tried again), just like in a database. `transactional_var` is the equivalent of a `TVar`, and `atomically` is just `atomically`.
Basically it's the difference between focusing only on transactional variables, without a good way of marking what is and isn't part of a larger transaction, and having a higher-order abstraction of an `STM` action that clearly delineates what is a transaction and what isn't.
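For reference, the pseudocode above maps onto real Haskell roughly like this (my own sketch with the stock `stm` API):

import Control.Concurrent (forkIO)
import Control.Concurrent.STM
import Control.Monad (when)

main :: IO ()
main = do
  x <- newTVarIO "y is greater than 0"
  y <- newTVarIO (1 :: Int)

  _ <- forkIO $ atomically $ do
    modifyTVar' y (subtract 1)
    y' <- readTVar y
    when (y' <= 0) $ writeTVar x "y is less than or equal to 0"

  _ <- forkIO $ atomically $ do
    modifyTVar' y (+ 1)
    y' <- readTVar y
    when (y' > 0) $ writeTVar x "y is greater than 0"

  return ()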
In any other language I've used (barring maybe Ada), that is a refactoring that would take at least days, if not weeks, of manually tracking down all the places where it interacts with the system directly or indirectly, because mixing e.g. int with long long is not a type error.
In Haskell, I change the type and the compiler spits out a list of locations that needs to change. This is repeated for a few iterations until all transitive interactions are worked out. Done!
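A tiny made-up illustration of that workflow: wrap the quantity in a newtype, and the compiler's error list becomes the refactoring to-do list.

-- Hypothetical example: timeouts used to be a bare Int. Wrapping them means
-- every call site that still passes a bare Int stops compiling, and GHC
-- lists exactly the places that need to change.
newtype TimeoutMicros = TimeoutMicros Int

setTimeout :: TimeoutMicros -> IO ()
setTimeout (TimeoutMicros us) = print us   -- stand-in for the real effect

main :: IO ()
main = do
  setTimeout (TimeoutMicros 5000)   -- fine
  -- setTimeout 5000                -- type error: Int vs TimeoutMicros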
Btw, too bad the author mentions a use case with microsecond guarantees but doesn't provide a link; that would be interesting reading.
Why would there be large memory allocations because of immutable data structures? Btw, you can also use immutable data structures in e.g. Rust fairly easily. And Haskell also supports mutation and mutable data structures.
However, Haskell can use a lot of memory, but that's more to do with pervasive 'boxing' by default, and perhaps laziness.
Does Zio actually offer any protection here, or is it just telling the reader that they're on their own and should be wary of footguns?
If you lock a section of code (to protect data), there's no guarantee against mutations of that data from other sections of code.
If you lock the data itself, you can freely pass it around and anyone can operate on it concurrently (and reason about it as if it were single-threaded).
It's the same approach as a transactional database, where you share one gigantic bucket of mutable state with many callers, yet no-one has to put acquire/release/synchronise into their SQL statements.
So if you have a thread altering `foo` and checking that `foo+bar` isn't greater than 5 and a thread altering `bar` and checking the same, then it's guaranteed that `foo+bar` does not exceed 5. Whereas if only write conflicts were detected (as is default with most databases) then `foo+bar` could end up greater than 5 through parallel changes.
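Sketched with stock `stm` (my own names; `check` aborts and retries the transaction until the condition holds):

import Control.Concurrent.STM

-- Each transaction reads *both* variables, so a concurrent commit to either
-- one invalidates it and it re-runs; the foo+bar <= 5 invariant can't be
-- broken by two updates that each looked fine against a stale snapshot.
bumpFoo :: TVar Int -> TVar Int -> STM ()
bumpFoo foo bar = do
  f <- readTVar foo
  b <- readTVar bar
  check (f + 1 + b <= 5)   -- blocks/retries until the new value would be legal
  writeTVar foo (f + 1)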
"Beware of bugs in the above code; I have only proved it correct, not tried it."
While Haskell's runtime is designed for Haskell needs, Clojure has to be happy with whatever JVM designers considered relevant for Java the language, the same on the other platforms targeted by Clojure.
This is yet another example of the difference between a platform designed for a language and a language being a guest on someone else's platform.
Is it because it is just a very hard thing, or is it because it's a synchronous language with async bolted on? (I'm talking from a purely language point of view, not from a Python VM / GIL point of view.)
When data is immutable, it can be freely shared. Changes to the data essentially use copy-on-write, and only the delta gets written, since immutability means you never need a deep copy. Add that the garbage collectors of Haskell and Erlang are designed for a high allocation rate and have zero cost for dead data, and this is much faster than people think.
The way you implement a webserver in either Haskell or Erlang is rather trivial. Whenever there's an incoming request, you make a thread to handle it. So you don't have 1 webserver serving 10k requests. You have 10k webservers serving 1 request each. And since they are started from the same core data, they'll share that due to immutability. See also old-style Apache or PHP and fork().
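The shape of it, as a sketch (the accept/handle functions here are hypothetical stand-ins for whatever socket or HTTP library you use):

import Control.Concurrent (forkIO)
import Control.Monad (forever, void)

-- One lightweight (green) thread per request. The shared config/core data is
-- immutable, so all of the handler threads can read it without any locking.
serve :: IO conn -> (cfg -> conn -> IO ()) -> cfg -> IO ()
serve acceptConn handle config = forever $ do
  c <- acceptConn
  void $ forkIO (handle config c)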
> It gets weirder: in Haskell, exceptions can be thrown to other threads!
What's really interesting is that because of purity, you have to have asynchronous exceptions, otherwise you give up a lot of modularity. At least that's what Simon Marlow and Simon Peyton Jones argue in Asynchronous Exceptions in Haskell (2006): https://www.microsoft.com/en-us/research/wp-content/uploads/...
> While the semi-asynchronous approach avoids breaking synchronization abstractions, it is non-modular in that the target code must be written to use the signalling mechanism. Worse still (for us), the semi-asynchronous approach is simply incompatible with a purely-functional language, such as Concurrent Haskell. The problem is that polling a global flag is not a functional operation, yet in a Concurrent Haskell program, most of the time is spent in purely-functional code. On the other hand, since there is absolutely no problem with abandoning a purely-functional computation at any point, asynchronous exceptions are safe in a functional setting. In short, in a functional setting, fully-asynchronous exceptions are both necessary and safe — whereas in an imperative context fully-asynchronous exceptions are not the only solution and are unsafe.
If you can read PLTese, it's really quite a nice paper.
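A small example of what "thrown to other threads" looks like with the base library (`throwTo` can interrupt a thread even while it's deep in pure code):

import Control.Concurrent (forkIO, threadDelay)
import Control.Exception (AsyncException (ThreadKilled), throwTo)

main :: IO ()
main = do
  -- A worker stuck forcing a pure, never-ending computation.
  tid <- forkIO $ print (length [1 :: Integer ..])
  threadDelay 100000          -- let it spin for 0.1 s
  throwTo tid ThreadKilled    -- interrupt it from the outside
  putStrLn "worker interrupted"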
Tries (like scala’s Vector) or trie maps (the core map types of Scala, Clojure and probably Haskell?) aren’t copied on updates.
In fact, whether a data structure is an immutable or persistent data structure or merely an unmodifiable data structure (like Kotlin uses) is based on whether it requires full copies on most updates or not. In FP languages, immutable data structures aren’t “specialized” at all.
Async is hard, no doubt—but some languages are designed to reduce the surface area of what can go wrong. I’ve heard great things about Erlang, Elixir, and BEAM-based languages in general. They treat async not as an add-on, but as a core architectural principle.
In practice, it is not. The canonical Haskell compiler, GHC, is excellent at transforming operations on immutable data, as Haskell programs are written, into efficient mutations, at the runtime level. Also, since web development is quite popular in the Haskell community, lots of people have spent many hours optimizing this precise use-case.
In my experience, the real downside is that compilation times are a bit long -- the compiler is doing a LOT of work after all.
The one construct that unlocks this lack of colored functions, STM, did require runtime support (as opposed to language support), which at least is transparent to downstream developers
[0]: https://journal.stuffwithstuff.com/2015/02/01/what-color-is-...
I should really learn Rust. Or Zig. I tried Nim (best of both worlds, Python-esque code that compiles to C!), but it wasn’t nearly as fast as my Python + C for my specific use case.
Yes, at the level of native machine code and memory cells, there's not that much of a difference between immutability + garbage collection, and higher level source code that mutates. Thanks to GC you are going to overwrite the same memory locations over and over again, too.
Either you have a specialised GC that works like this, or a good general generational GC can probably pick up on this pattern on its own.
It's more of a combination of API and language decisions rather than the underlying JVM.
This hurt my brain. It seems that in some places (e.g. Java land) unmodifiable refers to something that you can't modify but could just be a wrapper around a structure that can be modified. In that case they use immutable to mean something that is nowhere modifiable.
I may be misrepresenting this idea, but I think the terminology is so poor that it deserves to be misunderstood.
Step 0 is missing:

Decompose the program into several lanes of execution, traditionally executed via SIMD.

This is a massive amount of performance left on the table on modern computer architectures when you assume threading is the first manifestation of concurrency.
2. The way you call a function depends on its color.
`<-` or `>>=` vs `=`

3. You can only call a red function from within another red function.

This should sound pretty familiar! You can only call an IO function from within another IO function. STM in this case makes a third colour:

IO can call IO functions.
IO can call STM functions. (*)
IO can call pure functions.
STM can call STM functions.
STM can call pure functions.
pure functions can call pure functions.
(*) Calling into an STM block from IO is what makes it 'happen for real': it's the `atomically`, which has type STM a -> IO a.

Having these coloured functions is what made STM achievable back in the mid-late 2000s, since the mechanism to prevent STM or pure functions from calling IO was already in place.
Other languages either tried to figure out how to contain the side-effects and gave up, or just released STM and put the onus on the user not to use side effects.
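In types, the colour rules look like this (a sketch; `atomically :: STM a -> IO a` is the only door from STM back into IO):

import Control.Concurrent.STM

pureF :: Int -> Int              -- "uncoloured": callable from anywhere
pureF = (+ 1)

stmF :: TVar Int -> STM ()       -- STM-coloured: may call STM and pure code
stmF v = modifyTVar' v pureF
-- stmF v = putStrLn "effect!"   -- rejected: IO () is not STM ()

ioF :: TVar Int -> IO ()         -- IO-coloured: may call IO, pure code,
ioF v = atomically (stmF v)      -- and STM (only via atomically)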
// Using mutability.
// `increment` is void, and makes 2 bigger for everyone.
increment(2);
// Typical Java "safety".
// It's still void, but now it throws a RuntimeException
// because the developers are saving you from making everyone's 2 bigger.
increment(2);
// Immutable
// Returns 3
increment(2);
I wish there were better author-time feedback to the developer on where they're getting such a perf boost. As far as I'm aware there's no popular linting or blue squiggle to guide you in the right direction.
In games it seems like the popular pattern is to rewrite everything entirely in an entity component system framework.
But I’ve never heard someone say it messed up in any way, that it was buggy or hard to use or failed to deliver on its promises.
Haskell's IO type system doesn't model concurrency at all. `IO a` could be a fork and join, an infinite event loop, really anything, it's a black box in terms of "correctness". Better than using JavaScript maybe, but hardly "correct" in any sort of formal, tractable sense.
Scala doesn't enforce purity like Haskell though, so it won't stop you if you call some normal Scala or Java code with side effects. In practice it's not a problem, because you're wrapping any effectful outside APIs before introducing them into your code.
ISPC comes close, but does come with a learning curve.
> In bare-metal embedded systems or inside the operating system, it’s not unusual to manually break computation into state machines, driven by interrupts.
Although this is not the topic of TFA (in fact, a footnote notes it's "a whole different ball game"), does anyone have a good source for this aspect of 'low-level'/OS development? I'm more than capable of chasing down sources from a more high-level introduction or overview, so anything would be helpful. This concept seems like it may just be a more pragmatic description of embedded/OS development than what I've read previously.
For example, when it comes to concurrent access to a map, the Clojure community generally forces a dichotomy: either stick a standard Clojure map in an atom and get fully atomic semantics at the expense of serial write performance, or use a Java ConcurrentMap at the expense of inter-key atomicity (or do a gnarlier atom around a map that itself contains atoms, which gets quite messy quite fast).
Such a stark tradeoff doesn't need to exist! In theory STM gives you exactly the granularity you need where you can access the keys that you need atomicity for and only those keys together while allowing concurrent writes to anything else that doesn't touch those keys (this is exactly how e.g. the stm-containers library for Haskell works that's linked elsewhere).
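A sketch of that granularity with nothing but plain TVars, assuming the key set is fixed up front (stm-containers does the same trick inside a hash trie, so keys can come and go):

import Control.Concurrent.STM
import qualified Data.Map.Strict as Map

type Accounts = Map.Map String (TVar Int)

-- Only the two TVars involved are read/written, so transfers between
-- disjoint pairs of keys commit concurrently without ever conflicting.
transfer :: Accounts -> String -> String -> Int -> STM ()
transfer accts from to n =
  case (Map.lookup from accts, Map.lookup to accts) of
    (Just a, Just b) -> do
      modifyTVar' a (subtract n)
      modifyTVar' b (+ n)
    _ -> return ()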
If you want multiple writers, you can always use the Arc container and use the built-in lock.
I've written low thousands of lines of Haskell. Similar to mikojan, I love Haskell in theory, but ended up not enjoying it as much in practice.
1. The multitude of string-y types. I end up converting between String, Text, lazy Text, ByteString, lazy ByteString, and I forget what else. Each library wants a specific string type passed in, and each returns yet another one. LLMs are good at this, and for a while I had a ton of helper functions to convert between any two string types (see the sketch after this list). Still, perhaps there's a better way?
2. The error messages. I come from Elm, so I'm spoiled. Yes, I understand a language with HKTs will never have as nice error messages. Yes, LLMs are pretty good at explaining GHC error messages.
3. The stdlib. Haskell gets a lot of credit for safety, but a `head` blows up instead of returning a `Maybe`. I know there are other - safer - preludes, but I don't know how to choose between them. I don't know how using a different prelude would impact my projects.
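On point 1, for what it's worth, these are the conversions I keep around (a sketch, nothing exhaustive):

import qualified Data.ByteString      as BS
import qualified Data.ByteString.Lazy as BSL
import qualified Data.Text            as T
import qualified Data.Text.Encoding   as TE
import qualified Data.Text.Lazy       as TL

-- String <-> strict Text
toText   :: String -> T.Text
toText   = T.pack
fromText :: T.Text -> String
fromText = T.unpack

-- lazy <-> strict
strictText :: TL.Text -> T.Text
strictText = TL.toStrict
strictBS   :: BSL.ByteString -> BS.ByteString
strictBS   = BSL.toStrict

-- Text <-> ByteString, picking the encoding explicitly
toUtf8   :: T.Text -> BS.ByteString
toUtf8   = TE.encodeUtf8
fromUtf8 :: BS.ByteString -> T.Text
fromUtf8 = TE.decodeUtf8   -- decodeUtf8' gives an Either for invalid bytes

And on point 3, `Data.Maybe.listToMaybe` is the total `head`: it returns `Nothing` on an empty list.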
I feel like my next step is either towards Idris, with its polished standard library (and dependent types baked into the language!), or towards something simpler and more Elm-like (Gleam perhaps, or Roc). But if you can sell me on Haskell, I'm all ears!
However, if you decide to start learning, the path is hard, especially if you come from a non-computer-science background like me. I attempted to learn Haskell twice; I bounced off the first time, quite hard, and didn't try again for years.
What worked for me is a combination of two things:
* Having a goal in mind that has nothing to do with the choice of language. For me, it was building a personal website.
* The book Haskell Programming from First Principles [0]
and if you have more questions, reach out.
Haskell's GC is also fast when you are mostly generating garbage, which is inherently true for web server handlers.
containers and unordered-containers handle most of your needs and they only copy their trees' spines (O log n) on update.
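Illustrated: after an insert, the old and new maps share all nodes except the path from the root down to the changed key.

import qualified Data.Map.Strict as Map

demo :: (Map.Map Int String, Map.Map Int String)
demo =
  let old = Map.fromList [(i, show i) | i <- [1 .. 1000]]
      new = Map.insert 500 "changed" old  -- copies ~log n nodes, shares the rest
  in  (old, new)                          -- both versions remain valid and cheap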
Not atoms.
From Hickey’s History of Clojure paper:
“ Taking on the design and implementation of an STM was a lot to add atop designing a programming language. In practice, the STM is rarely needed or used. It is quite common for Clojure programs to use only atoms for state, and even then only one or a handful of atoms in an entire program. But when a program needs coordinated state it really needs it, and without the STM I did not think Clojure would be fully practical.”
https://dl.acm.org/doi/pdf/10.1145/3386321
Atoms do an atomic compare and swap. It’s not the same thing.
A composition of catamorphic and anamorphic functions can eliminate a lot of the in-between allocations (a hylomorphism)
Basically it looks like you’re building a ton of intermediate structure then consuming it - meaning much of the in-between stuff can be eliminated.
Interesting optimizations and a little mind blowing when you see it.
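The definition itself is tiny (a sketch; the point is that the intermediate structure is produced and consumed one layer at a time, so it never has to exist in full):

{-# LANGUAGE DeriveFunctor #-}

-- Unfold a seed one layer at a time (anamorphism step), then immediately
-- fold that layer away (catamorphism step).
hylo :: Functor f => (f b -> b) -> (a -> f a) -> a -> b
hylo alg coalg = alg . fmap (hylo alg coalg) . coalg

-- Example: factorial as "build the list [n, n-1 .. 1], then take its
-- product", without ever materialising the whole list.
data ListF e r = Nil | Cons e r deriving Functor

factorial :: Integer -> Integer
factorial = hylo alg coalg
  where
    coalg 0 = Nil
    coalg n = Cons n (n - 1)
    alg Nil        = 1
    alg (Cons n r) = n * r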
1. You get single-writer multiple reader for free from the type system, without any runtime overhead.
2. For the same reason, the type system does not allow multiple writers. If you want multiple writers, then you are forced to use locks. Once you use locks, the runtime guarantees safety for this case.
Either way, you get 100% safety.
Of course, even a moving GC has limits; it won't turn a hashtable into something that has local accesses.
Yes, yes, that's exactly what my encounters with Haskell looked like. The last one is ~1k lines of code backend for a personal project. I feel that's about as much as I could manage at this point.
> The book Haskell Programming from First Principles
That book is getting recommended all the time! I'm concerned if it's perhaps a little too basic for me. (I understand monads, monad transformers, have some notion of final tagless and free monad. Yet I get perpetually confused by various relatively simple things.)
I guess what I'm missing is haskell-language-server to help me a little. Here I'm confused about the interplay between `haskell-stack` (which is in Debian repos and which I think I'd like to use), ghcup, cabal, and haskell-language-server.
> 1. You get single-writer multiple reader for free from the type system, without any runtime overhead.
Take the example from the article which is accepted by the Haskell type system:
writeTBCQueue :: TBCQueue a -> a -> STM ()
writeTBCQueue q v = do
  stillOpen <- readTVar q.open
  when stillOpen $ writeTBQueue q.queue v
Rust would reject this because of multiple writers.

Also, thinking about it more, I'm now very skeptical of Rust even providing 'single-writer multiple-reader'. Is it in fact single-reader-writer xor multiple-reader? In other words, how does it handle a goblin constantly moving money between accounts while a gnome is constantly trying to count the total amount?
goblinBankerThread = forever $ do
  seed <- randomSeed
  atomically $ do
    (acctA, acctB) <- chooseTwoRandomAccounts seed
    if amount acctA > amount acctB
      then moveAmount 5 acctA acctB
      else moveAmount 5 acctB acctA

gnomeAccountantThread = forever $
  atomically $ do
    accounts <- readAllAccounts
    assert someConstantAmount accounts
Yes, Rust is 100% safe because it would reject this code, so it would never run. Not running code also has guaranteed no overhead!

> 2. For the same reason, the type system does not allow multiple writers. If you want multiple writers, then you are forced to use locks
* Locks are problematic, which is why I chose STM over locks in the first place.
* Locks are in all the languages. Does your comment about 100% safety really apply to all languages?
In your goblin example, I believe the gnomeAccountantThread would have to constantly retry, because the writer (if successful) would have produced a new version of two accounts, which would trip up the reader, forcing it to start again. In general, Haskell's STM is built for short-lived transactions; for longer running transactions or those that touch a lot of objects, you'd need something like multi-versioned objects seen in databases or epochs to get a consistent snapshot. Neither Rust nor Haskell is suited to this example out of the box.
For your second question, you assume axiomatically that locks are problematic. They aren't in Rust (except, see later, about deadlocks). Unlike any other language with in-place mutation, Rust will force you to use a mutex in order to share something for read-write (in a multiple-writer scenario), otherwise it won't compile. You have to use lock() in order to get access to the underlying object, and once you have that object, the type system makes sure only the owner can mutate it. In C/C++/Java/Go, you don't get this guarantee at all... it is possible to mistakenly use the object without using a mutex. So there is no guarantee of safety in those other languages. There is a 100% guarantee in Rust.
---
That said, the problematic part about locks (whether it is mutexes or MVars in Haskell) is deadlocks, which is solved by having a deterministic lock order. In your Haskell example, if acctA and acctB were MVars, you'd do
let (first, second) = if acctA < acctB then (acctA, acctB) else (acctB, acctA)
withMVar first $ \_ ->
  withMVar second $ \_ -> do
    ...
That said, I have never used a ref, nor seen one in use outside of a demo blogpost.