
Why SSA Compilers?

(mcyoung.xyz)
158 points by transpute | 15 comments
1. zachixer ◴[] No.45675783[source]
Every time I see a clean SSA explainer like this, I’m reminded that the “simplicity” of SSA only exists because we’ve decided mutation is evil. It’s not that SSA is simpler — it’s that we’ve engineered our entire optimization pipeline around pretending state doesn’t exist.

It’s a brilliant illusion that works… until you hit aliasing, memory models, or concurrency, and suddenly the beautiful DAG collapses into a pile of phi nodes and load/store hell.
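
To make that concrete, here's roughly the rewrite I mean -- a sketch, with made-up SSA names and phi syntax rather than any real compiler's output:

    # One reassigned local in ordinary code...
    def f(c):
        x = 1
        if c:
            x = 2
        return x + 3

    print(f(0), f(1))  # 4 5

    # ...and the same thing in SSA (illustrative sketch):
    #
    #   x0 = 1
    #   br c, then, join
    # then:
    #   x1 = 2
    # join:
    #   x2 = phi(entry: x0, then: x1)   # value depends on which edge we arrived by
    #   ret x2 + 3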

replies(8): >>45675858 #>>45675871 #>>45676050 #>>45676212 #>>45676222 #>>45676288 #>>45678531 #>>45678840 #
2. vidarh ◴[] No.45675858[source]
SSA is appealing because you can defer the load/store hell until after a bunch of optimizations, though, and a lot of those optimizations become much easier to reason about when you get to pretend state doesn't exist.
3. achierius ◴[] No.45675871[source]
You have it backwards. Modern compilers don't use SSA because it's "simpler"; we use it because it enables very fast data-flow optimizations (constant prop, CSE, register allocation, etc.) that would otherwise require a lot of state. It doesn't "pretend state doesn't exist"; it's actually exactly what makes it possible/practical for the compiler to handle changes in state.

As some evidence for the second point: Haskell is a language that does enforce immutability, but its compiler, GHC, does not use SSA for its main IR -- it uses a "Spineless Tagless G-machine" graph representation that does, in fact, rely on that immutability. SSA only happens later, once it's lowered to a mutating form. If your variables aren't mutated, then you don't even need to transform them to SSA!

Of course, you're welcome to try something else, people certainly have -- take a look at how V8's move to Sea-of-Nodes has gone for them.
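
To make the first point concrete, here's a toy sketch (a hypothetical three-instruction SSA encoded as a Python dict, not any real compiler's IR): because every name has exactly one definition, constant propagation over straight-line code is a single forward pass over the defs, with no iterative dataflow.

    # Hypothetical straight-line SSA: each name is defined exactly once.
    defs = {
        "x0": ("const", 4),
        "y0": ("add", "x0", "x0"),
        "z0": ("mul", "y0", "x0"),
    }

    consts = {}
    for name, (op, *args) in defs.items():     # defs are already in program order
        if op == "const":
            consts[name] = args[0]
        elif all(a in consts for a in args):   # every operand already a known constant
            a, b = (consts[x] for x in args)
            consts[name] = a + b if op == "add" else a * b

    print(consts)   # {'x0': 4, 'y0': 8, 'z0': 32}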

replies(3): >>45676333 #>>45676374 #>>45679131 #
4. toast0 ◴[] No.45676050[source]
> pretending state doesn’t exist.

Speaking as a fan of a functional language: immutability doesn't mean state doesn't exist. You still keep state through assignment --- in SSA, every piece of state gets a new name.

If you want to keep state beyond the scope of a function, you have to return it, or call another function with it (and hope you have tail call elimination). Or, stash it in a mutable escape hatch.
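
A tiny sketch of what that looks like (Python, names made up; the loop rebinding stands in for a tail call):

    # Nothing is mutated in place: each update returns a fresh value,
    # and the caller carries it into the next call.
    def deposit(balance, amount):
        return balance + amount              # new state; the old balance is untouched

    balance = 0
    for amount in (10, 25, 5):
        balance = deposit(balance, amount)   # rebind the name to the new value

    print(balance)   # 40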

5. rpearl ◴[] No.45676212[source]
SSA form is a state representation. It encodes data-flow information explicitly, which simplifies all the other analysis passes, including alias analysis.
6. antonvs ◴[] No.45676222[source]
> the “simplicity” of SSA only exists because we’ve decided mutation is evil.

Mutation is the result of sloppy thinking about the role of time in computation. Sloppy thinking is a hindrance to efficient and tractable code transformations.

When you "mutate" a value, you're implicitly indexing it on a time offset - the variable had one value at time t_0 and another value at time t_1. SSA simply uses naming to make this explicit. (As do CPS and ANF, which is where that "equivalence" comes from.)

If you don't use SSA, CPS, or ANF for this purpose, you need to do something else to make the time dimension explicit, or you're going to be dealing with some very hairy problems.
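
A minimal illustration of that renaming (the t_n comments and SSA-style names below are mine, not any particular IR's syntax):

    def g():
        x = 1        # x at t_0
        x = x + 1    # x at t_1, defined in terms of x at t_0
        x = x * 2    # x at t_2
        return x

    print(g())  # 4

    # With the time index made explicit as a name, the "mutation" disappears:
    #
    #   x0 = 1
    #   x1 = x0 + 1
    #   x2 = x1 * 2
    #   ret x2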

"Evil" in this case is shorthand for saying that mutable variables are an unsuitable model for the purpose. That's not a subjective decision - try to achieve similar results without dealing with the time/mutation issue and you'll find out why.

7. jcranmer ◴[] No.45676288[source]
SSA isn't about saying mutation is evil. It's about trivializing chasing down def-use chains. In the Dragon Book, essentially the first two dataflow analyses introduced are "reaching definitions" and "live variables"; within an SSA-based IR, these algorithms are basically "traverse a few pointers". There are also some ancillary benefits -- SSA makes a flow-insensitive algorithm partially flow-sensitive just by virtue of splitting each variable into several names.

Sure, you still need to keep those algorithms around to reason about memory loads and stores. But if you put effort into kicking memory operations into virtual register operations (where you get SSA for free), then you also make the compiler faster: you're not constantly rerunning these analyses, only running them on demand for the handful of passes that specifically care about eliminating or moving loads and stores.
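
A sketch of how trivial it gets (hypothetical SSA as a Python dict, not any real IR): reaching definitions for a use is one lookup, and the def-use chains are a reverse index built in a single pass.

    # Each SSA name has exactly one definition, so "which definition
    # reaches this use?" is just defs[name].
    defs = {
        "x0": ("const", 1),
        "x1": ("add", "x0", 2),
        "y0": ("mul", "x1", "x0"),
    }

    # Def-use chains in one pass -- no iterative dataflow required.
    uses = {}
    for name, (_, *operands) in defs.items():
        for operand in operands:
            if isinstance(operand, str):          # skip literal constants
                uses.setdefault(operand, []).append(name)

    print(uses)   # {'x0': ['x1', 'y0'], 'x1': ['y0']}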

8. antonvs ◴[] No.45676333[source]
> take a look at how V8's move to Sea-of-Nodes has gone for them.

Are you implying it hasn't gone well? I thought it bought some performance at least. What are the major issues? Any sources I can follow up on?

replies(1): >>45676673 #
9. mananaysiempre ◴[] No.45676374[source]
To appreciate the “fast” part, nothing beats reading through LuaJIT’s lj_opt_fold.c, none of which would work without SSA.

Of course, LuaJIT is cheating, because compared to most compilers it has redefined the problem to handling exactly two control-flow graphs (a line and a line followed by a loop), so most of the usual awkward parts of SSA simply do not apply. But isn’t creatively redefining the problem the software engineer’s main tool?..

10. kibwen ◴[] No.45676673{3}[source]
V8 blog post from March: "Land ahoy: leaving the Sea of Nodes" https://v8.dev/blog/leaving-the-sea-of-nodes
replies(2): >>45676772 #>>45679135 #
11. antonvs ◴[] No.45676772{4}[source]
Fascinating, thanks!
12. titzer ◴[] No.45678531[source]
> we’ve decided mutation is evil.

SSA isn't (primarily) concerned with memory; it's concerned with local variables. It completely virtualizes the storage of local variables--in fact, of all intermediate computations. By connecting computations through dataflow edges and not storage, it removes ordering (except that induced by dependence edges) from consideration.

It is, after all, what CPUs do under the hood with register renaming! They are doing dynamic SSA, a trick they stole from us compiler people!
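
A trivial sketch of the ordering point (the names are mine):

    # a0 and b0 have no dependence edge between them, so nothing orders
    # them relative to each other -- the same freedom register renaming
    # gives an out-of-order CPU.
    a0 = 3 * 7
    b0 = 10 - 4          # independent of a0
    c0 = a0 + b0         # the only real constraint: c0 needs both inputs
    print(c0)            # 27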

13. mrkeen ◴[] No.45678840[source]
> we’ve decided mutation is evil

The functional programmers have decided mutability is evil. The imperative programmers have not.

We functional programmers do crazy stuff and pretend we're not on real hardware - a new variable instead of mutating (how wasteful!) and infinite registers (what real-world machine supports that!?).

Anyway, there's plenty of room for alternatives to SSA/CPS/ANF. It's always possible to come up with something with more mutation.

14. pjmlp ◴[] No.45679131[source]
Meanwhile Java HotSpot, where Sea-of-Nodes was originally made mainstream, and GraalVM are doing just fine.

The author of the Sea-of-Nodes approach is quite critical of V8's decision, as one would expect.

https://www.youtube.com/watch?v=Zo801M9E--M

15. pjmlp ◴[] No.45679135{4}[source]
Feedback from the creator of the Sea-of-Nodes algorithm: https://www.youtube.com/watch?v=Zo801M9E--M