Effective Go has always said:
Do not communicate by sharing memory; instead, share memory by communicating.
This approach can be taken too far. Reference counts may be best done by putting a mutex around an integer variable, for instance.
https://golang.org/doc/effective_go.html#sharing
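For anyone who wants to see what "a mutex around an integer variable" looks like in practice, here is a minimal sketch. The type and method names (MutexRefCount, Acquire, Release, Count) are made up for illustration and do not come from Effective Go.

```go
package main

import (
	"fmt"
	"sync"
)

// MutexRefCount is a hypothetical reference counter: a plain int
// guarded by a mutex, as the Effective Go passage suggests.
type MutexRefCount struct {
	mu sync.Mutex
	n  int
}

// Acquire adds one reference.
func (r *MutexRefCount) Acquire() {
	r.mu.Lock()
	r.n++
	r.mu.Unlock()
}

// Release drops one reference and reports whether the count reached zero.
func (r *MutexRefCount) Release() bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.n--
	return r.n == 0
}

// Count returns the current number of references.
func (r *MutexRefCount) Count() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.n
}

func main() {
	var rc MutexRefCount
	var wg sync.WaitGroup
	// 100 goroutines each take one reference concurrently.
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			rc.Acquire()
		}()
	}
	wg.Wait()
	fmt.Println(rc.Count()) // prints 100
}
```

No channel is involved; the lock alone serializes the increments, which is the point of the quoted caveat.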
Reference counts are best done using interlocked increment/decrement primitives.
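In Go the interlocked primitives live in sync/atomic. A rough sketch of the same counter without a mutex follows; AtomicRefCount and its methods are hypothetical names, and this is only one way to write it. On amd64 the Go toolchain usually lowers atomic.AddInt64 to a LOCK-prefixed XADD, though that is an implementation detail and not something the language guarantees.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// AtomicRefCount is a hypothetical reference counter built on
// atomic add/load instead of a mutex.
type AtomicRefCount struct {
	n int64
}

// Acquire atomically adds one reference.
func (r *AtomicRefCount) Acquire() {
	atomic.AddInt64(&r.n, 1)
}

// Release atomically drops one reference and reports whether the
// count reached zero. AddInt64 returns the new value, so exactly one
// caller observes zero.
func (r *AtomicRefCount) Release() bool {
	return atomic.AddInt64(&r.n, -1) == 0
}

// Count atomically reads the current number of references.
func (r *AtomicRefCount) Count() int64 {
	return atomic.LoadInt64(&r.n)
}

func main() {
	var rc AtomicRefCount
	var wg sync.WaitGroup
	// 100 goroutines each take one reference concurrently.
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			rc.Acquire()
		}()
	}
	wg.Wait()
	fmt.Println(rc.Count()) // prints 100
}
```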
I wonder if there are any compilers that can replace
mutex.lock { x++ }
with a 'lock xaddl x 1' instruction.
It's conceivable if you made mutexes a compiler/language intrinsic. But as long as you're calling pthread_mutex_lock, the compiler has to assume that the pthread library, which is linked dynamically, is interchangeable and can do anything it likes to memory. That includes mutating x.
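That hasn't inhibited optimizations for a long time; compilers treat well-known library functions as builtins with known semantics. Disassemble a call to printf("Hello world\n") in optimized clang output and look at what it turns into (typically a call to puts).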