> and perhaps for good reasons
For the very good reason that the underlying math is insanely complicated and tiresome for mere practitioners (which, although I have a background in math, I openly aim to be).
For example, even if you assume sequential consistency (which is an expensive assumption) in a multi-threaded C or C++ program, reasoning about the program isn't easy. And once you consider barriers, atomics, and load-acquire/store-release explicitly, the "SMP" (shared memory) proposition falls apart, and you can't avoid programming for a message-passing system with independent actors, be those separate networked servers or separate CPUs on a board. I claim that struggling with async messaging between independent peers as a baseline is not why most people get interested in programming.
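To make the "message passing" point concrete, here is a minimal C++ sketch. It is my own illustration, not taken from any particular codebase, and the names `payload`, `ready`, `producer`, `consumer` and `run_once` are made up. Even in this tiny case, the plain variable is only safe to read because the release store and the acquire load pair up, which is effectively a send and a receive between two peers.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain, non-atomic data: the "message body"
std::atomic<bool> ready{false};  // the "message has been sent" flag

// Sender: write the message, then publish it with a store-release.
void producer() {
    payload = 42;
    ready.store(true, std::memory_order_release);
}

// Receiver: spin until the flag is observed with a load-acquire.
// The acquire load that reads `true` synchronizes with the release store,
// so the earlier write to `payload` is visible here.
int consumer() {
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;
}

// Hypothetical helper: run one send/receive round on two threads.
int run_once() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);  // reset before the threads start
    int received = 0;
    std::thread c([&received] { received = consumer(); });
    std::thread p(producer);
    p.join();
    c.join();  // join makes the write to `received` visible to this thread
    return received;
}
```

Calling `run_once()` should return 42. If you weaken both orderings to `memory_order_relaxed`, the read of `payload` becomes a data race (undefined behavior), even though it may well appear to work on x86.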
Our systems (= normal motherboards on one end, and networked peer-to-peer systems on the other) have become so concurrent that doing nearly anything efficiently nowadays requires us to think about messaging between peers, and that's very foreign to our traditional, sequential, imperative programming languages. (It's also foreign to how most of us think.)
Thus, I certainly don't want a simple (but leaky) software / programming abstraction that hides the underlying hardware complexity; instead, I want the hardware to be simple (as little internally distributed as possible), so that the simplicity of the (sequential, imperative) programming language reflects and matches the hardware well. I think this can only be found in embedded nowadays (if at all), which is why I think many have been drawn to embedded recently.