There aren’t many reasons to write an inline asm block that the compiler will elide for having no apparent effects; more likely you screwed up the constraints. If the concern is getting memory accesses right with respect to the compiler, it’s usually better to define appropriate “m” constraints so the compiler can see exactly what the asm reads and writes. If the access pattern is too complex or loopy to describe that way, that is what the “memory” clobber is for, not volatile.
So I strongly disagree with 2 and 3.
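To make the “m” constraint point concrete, here is a minimal sketch, assuming GCC or Clang targeting x86 with the default AT&T syntax. The function names `inc_mem` and `load_via_asm` are made up for illustration. Neither asm statement is volatile and neither uses a “memory” clobber; the operands alone tell the compiler which object is touched.

```c
#include <stdint.h>

/* Hypothetical helper: increment *p directly in memory.
 * "+m"(*p) declares that the asm both reads and writes *p, so the
 * compiler keeps the statement whenever *p is used afterwards and
 * will not cache a stale copy of *p in a register across it.
 * "cc" is listed because incl modifies the flags. */
static inline void inc_mem(uint32_t *p)
{
    __asm__("incl %0" : "+m"(*p) : : "cc");
}

/* Hypothetical helper: load *p through an asm instruction.
 * "m"(*p) as an input tells the compiler the asm reads *p, so any
 * pending store to *p must be completed before the asm runs. */
static inline uint32_t load_via_asm(const uint32_t *p)
{
    uint32_t v;
    __asm__("movl %1, %0" : "=r"(v) : "m"(*p));
    return v;
}
```

Calling `inc_mem(&x)` and then reading `x` gives the incremented value, because the compiler knows from the constraint that `x` changed. If the result were never used, the compiler would be free to drop the asm, which is the behaviour you want from a block whose only effect is described by its operands.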