Myths Programmers Believe about CPU Caches (2018)

1. daemontus ◴[01 Nov 25 12:26 UTC] No.45781114[source]▶

I may be completely out of line here, but isn't the story on ARM very different? I vaguely recall that the whole point of things like weak atomics is that on x86 they don't do anything, while on ARM they are essential for cache coherency and memory ordering. But then again, I may just be conflating memory ordering and coherency.

replies(2): >>45783040 #>>45784336 #

2. jeffbee ◴[01 Nov 25 16:36 UTC] No.45783040[source]▶

>>45781114 (TP) #

Well, since this is a thread about how programmers use the wrong words to model how they think a CPU cache works, I think it bears mentioning that you've used "atomics" here to mean something irrelevant. It is not true that x86 atomics do nothing. Atomic instructions or, on x86, the LOCK prefix, make a naturally non-atomic operation, such as a read-modify-write, atomic. Apart from the long-deprecated SWP, the ARM ISA actually had no single-instruction atomic read-modify-write until ARMv8.1; before that, such operations were built from load-exclusive/store-exclusive (LDREX/STREX) retry loops.

The instructions to which you refer are not atomics, but rather instructions that influence the ordering of loads and stores. x86 has total store ordering by design. On ARM, the program has to use instructions such as LDAR/STLR (or explicit DMB barriers) to establish ordering.
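
The split between atomicity and ordering described above can be sketched in C++. This is only an illustration, not anything from the article: the function name `count_with_relaxed_rmw` is hypothetical, and the instruction names in the comments are what compilers typically emit, not a guarantee. The increment is an atomic read-modify-write, but `memory_order_relaxed` asks for no ordering at all.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: `threads` workers each increment a shared counter
// `iters` times and the final value is returned.
long count_with_relaxed_rmw(int threads, int iters) {
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, iters] {
            for (int i = 0; i < iters; ++i) {
                // Atomic read-modify-write with no ordering requested.
                // Typically a LOCK-prefixed instruction on x86, and either
                // an ARMv8.1 atomic or a load/store-exclusive loop on ARM.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    // join() synchronizes with each worker, so the final load sees every increment.
    for (auto& w : workers) w.join();
    return counter.load(std::memory_order_relaxed);
}
```

Calling `count_with_relaxed_rmw(4, 10000)` returns 40000 on any conforming implementation, because no increment can be lost. With a plain `long` and `++`, the same code would be a data race and could lose updates on x86 just as easily as on ARM.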

3. phire ◴[01 Nov 25 19:01 UTC] No.45784336[source]▶

>>45781114 (TP) #

Everything it says about cache coherency is exactly the same on ARM.

Memory ordering has nothing to do with cache coherency; it's all about what happens within the CPU pipeline itself. On ARM, reads and writes can be reordered within the pipeline before they hit the caches (which are still fully coherent).

ARM still has strict memory ordering for code within a single core (some older processors do not), but the writes from one core might become visible to other cores in a different order than the program issued them.
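
The cross-core reordering described above is what release/acquire pairs guard against. Here is a minimal message-passing sketch in C++, assuming a hypothetical function `pass_message` and the variable names `payload` and `ready`, none of which come from the thread. The notes about generated instructions describe what compilers typically do and are not guaranteed.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: one thread publishes a value, another waits for it.
int pass_message() {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};   // flag that publishes the data
    int seen = -1;

    std::thread producer([&] {
        payload = 42;
        // Release store: the write to payload may not be reordered after it.
        // Typically STLR on ARMv8 and an ordinary MOV on x86.
        ready.store(true, std::memory_order_release);
    });
    std::thread consumer([&] {
        // Acquire load: the read of payload may not be reordered before it.
        // Typically LDAR on ARMv8 and an ordinary MOV on x86.
        while (!ready.load(std::memory_order_acquire)) {
            // spin until the flag is published
        }
        seen = payload;
    });

    producer.join();
    consumer.join();
    return seen;
}
```

`pass_message()` returns 42. If both accesses to `ready` used `memory_order_relaxed` instead, the C++ memory model would treat the accesses to `payload` as a data race, and a weakly ordered core could let the consumer observe the flag before the payload, even though the caches stay coherent throughout.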