
386 points ingve | 1 comments
kookamamie ◴[] No.35738041[source]
Trusting the C++ compiler to emit a specific instruction for guaranteed performance - cute, but not realistic.
replies(1): >>35738060 #
vkazanov ◴[] No.35738060[source]
I don't think it is the instruction that makes the difference. Cmov is a branch either way.

Branches can be unpredictable and slow, no matter what instructions (or tables, or whatever) are used to introduce the branch in question.

It is the predictable nature of branching in this algo that makes the difference. Which makes me wonder...

EDIT: the cmov vs explicit branching story is a bit more complicated than just branch vs no branch.

replies(1): >>35738115 #
2102922286 ◴[] No.35738115[source]
Cmov doesn't branch. A branch refers specifically to the program counter ending up in more than one possible place after an instruction has executed. It is this behavior that mucks with the CPU state and slows everything down.

It's true that the cmov instruction uses the CPU flags register (which I'm sure the CPU designers at Intel hate), but that doesn't mean that it branches.

You can achieve the same effect as cmov by using bitwise operations and a subtraction, though it'd just be a few cycles slower--but it would be even more clear that it doesn't branch.
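
To make the bitwise version concrete, here is a minimal sketch of a select built from a subtraction and a few bitwise operations. The helper name `select_branchless` and its signature are made up for illustration, and nothing here guarantees what the compiler emits: it may well turn this back into a cmov, or something else entirely.

```cpp
#include <cstdint>

// Hypothetical helper: returns a when cond is true and b otherwise,
// without any conditional jump in the source.
constexpr std::uint64_t select_branchless(bool cond, std::uint64_t a, std::uint64_t b) {
    // 0 - 1 wraps around to all ones in unsigned arithmetic, so mask is
    // 0xFFFFFFFFFFFFFFFF when cond is true and 0 when cond is false.
    const std::uint64_t mask = std::uint64_t{0} - static_cast<std::uint64_t>(cond);
    // Keep the bits of a under the mask and the bits of b outside it.
    return (a & mask) | (b & ~mask);
}

// Compile-time checks of both cases.
static_assert(select_branchless(true, 7, 9) == 7, "true selects a");
static_assert(select_branchless(false, 7, 9) == 9, "false selects b");
```

Both inputs are always evaluated and the result is pure data flow, which is the same property that makes cmov not a branch.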

replies(4): >>35738178 #>>35738206 #>>35738285 #>>35742166 #
vkazanov ◴[] No.35738178[source]
Sure, it doesn't change the PC. But it can introduce a branch indirectly, i.e. when there is a jump to an address MOVed by cmov.

Either way, it seems that the wisdom has changed since the last time I wrote and read assembly. Cmov used to be slow. It seems that the current answer is "it depends".

replies(1): >>35738226 #
1. 2102922286 ◴[] No.35738226[source]
If what you're saying is (roughly)

        cmovne  rax, rdx
        jmp     rax
that is, a cmov followed by an indirect jump to the address contained in rax, then "jmp rax" is _always_ an indirect jump. It doesn't matter whether rax was set via a conditional move instruction or not.
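
A rough source-level analogue, as a sketch only: pick a function pointer with a ternary, then call through it. The names `Handler`, `double_it`, `flip_sign` and `dispatch` are hypothetical, and whether the selection compiles to a cmov is entirely up to the compiler and optimization level.

```cpp
#include <cassert>

// Hypothetical handler type and two trivial targets, for illustration only.
using Handler = int (*)(int);

int double_it(int x) { return 2 * x; }
int flip_sign(int x) { return -x; }

int dispatch(bool cond, int x) {
    // Choosing the target is a data operation; a compiler may use a
    // conditional move here, or it may not.
    Handler target = cond ? &double_it : &flip_sign;
    // Calling through the pointer is an indirect transfer of control,
    // regardless of how target was computed.
    return target(x);
}

// Small self-check of both paths.
void check_dispatch() {
    assert(dispatch(true, 3) == 6);
    assert(dispatch(false, 3) == -3);
}
```

The indirect transfer comes from the call through `target`, not from how `target` got its value, which is the same point as with `jmp rax` above.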