But it's not a very robust one: it's never true of fast programs on realistic hardware, for example (and hasn't been for a long time now). And all the rule bending (-fstrict-aliasing and a bunch of other stuff) exists in this tension between the grade school natural language paradigm and the reality of computers. I say grade school not to be pejorative, but rather because it is roughly the boundary where written natural languages begin to have interesting tensions around past and future and simultaneous, changing and not changing.
Functors and applicatives and monads and other type classes like these are the source of endless analogies because there isn't an accepted, broadly understood terminology for this "well, it's roughly what would happen if you had a piece of paper and wrote things on it at every statement boundary and scratched off the old ones" idea (Turing and von Neumann did formalize this in useful ways, but those formalizations just don't generalize well to realistic computers anymore).
Monads are the mathematical object that is forced on you if you want a rigorous way to describe the semantics of program execution in the vicinity of this "common sense" notion. That's really what everyone is dancing around: your program is only well defined with either:
- a big rulebook full of exceptions and edge cases
- a compositional rule strict enough to give some useful predictability but lax enough to admit most useful programs.
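To make the second option concrete, here is a minimal sketch of the "piece of paper" model written as a hand-rolled State monad. Everything in it (State, get, put, counter) is made up for illustration rather than taken from a particular library; the point is only that the whole "rulebook" for sequencing is the one definition of >>=.

```haskell
-- Illustrative only: a tiny State monad defined from scratch.
-- The "piece of paper" is the value of type s threaded through each step.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad (State s) where
  -- The compositional rule: run the first step, hand its result and the
  -- updated paper to the next step.
  State g >>= k = State $ \s ->
    let (a, s') = g s
    in runState (k a) s'

-- Read what is currently written on the paper.
get :: State s s
get = State $ \s -> (s, s)

-- Scratch off the old value and write a new one.
put :: s -> State s ()
put s = State $ \_ -> ((), s)

-- Reads like "x = 1; x = x + 41; return x * 2" in an imperative language.
counter :: State Int Int
counter = do
  put 1
  x <- get
  put (x + 41)
  y <- get
  pure (y * 2)

main :: IO ()
main = print (runState counter 0)  -- prints (84,42)
```

Each line of the do block is a "statement boundary", and the only thing that decides what the next line sees is >>=.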
It is this rigor/laxity tension as concerns text on a page and gates on a semiconductor that gives monads a privileged place in the towers of categories. When I worked on Sigma we were among the earlier adopters of ApplicativeDo, for example, because we wanted a slightly different rigor/laxity tradeoff for performance reasons.
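For anyone who hasn't seen ApplicativeDo, here is a toy sketch of the tradeoff it buys. The Fetch type and fetch function are hypothetical and only stand in for "something that wants to see all of its requests up front"; this is not how Sigma was actually built. With the extension on, a do block whose statements don't depend on each other is desugared with <*> instead of >>=, so the weaker (laxer) Applicative structure is all that is required.

```haskell
{-# LANGUAGE ApplicativeDo #-}

-- Hypothetical type: the keys a computation wants, plus its result.
-- It has an Applicative instance but deliberately no Monad instance.
data Fetch a = Fetch [String] a

instance Functor Fetch where
  fmap f (Fetch ks a) = Fetch ks (f a)

instance Applicative Fetch where
  pure = Fetch []
  -- Both sides are visible at once, so their requests can be collected
  -- together (and, in a real system, batched or run concurrently).
  Fetch ks f <*> Fetch ks' a = Fetch (ks ++ ks') (f a)

-- Stand-in for a data fetch: "fetching" a key just returns its length.
fetch :: String -> Fetch Int
fetch key = Fetch [key] (length key)

-- Neither statement uses the other's result and the block ends in pure,
-- so with ApplicativeDo this compiles without any Monad instance.
both :: Fetch Int
both = do
  a <- fetch "user"
  b <- fetch "posts"
  pure (a + b)

main :: IO ()
main = case both of
  Fetch keys total -> print (keys, total)  -- prints (["user","posts"],9)
```

Without the pragma, the same do block is rejected because Fetch has no Monad instance. That is the tradeoff in miniature: give up "the next statement can depend on the previous result" and in exchange the whole set of requests is known before anything runs.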
Monads are what happens when you do shift the giant pile of "back of the book" compiler details that describe program execution semantics into a much simpler set of rules, but at the cost of increasing the barrier to entry because you need to know the rules before you can print "hello world".