170 points judicious | 2 comments
Joker_vD ◴[] No.45407788[source]
Just so you know, the "return x < 0 ? -x : x;" compiles into

    abs_branch:
        mov     eax, edi
        neg     eax
        cmovs   eax, edi
        ret
on x64, and into

    abs_branch:
        srai    a5,a0,31
        xor     a0,a5,a0
        sub     a0,a0,a5
        ret
on RISC-V if you use a C compiler with a half-decent codegen. And "branchy" clamp() translates into

    clamp:
        cmp     edi, edx
        mov     eax, esi
        cmovle  edx, edi
        cmp     edi, esi
        cmovge  eax, edx
        ret
Seriously, the automatic transformation between ?: and if-then-else (in both directions) is quite well studied by now. And if you try to benchmark the difference between branching and branchless implementations, please make sure that the branches you expect are actually there in the compiler's output.
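For reference, here is a minimal sketch of C source that could plausibly produce listings like the ones above. The abs_branch body is the expression quoted at the top of the comment; the clamp body and its parameter names (x, lo, hi) are an assumption reconstructed from the register usage in the x64 listing, not the article's actual code.

```c
/* "Branchy" absolute value, written with the ternary quoted above.
   Like abs(), the result is undefined for INT_MIN. */
int abs_branch(int x) {
    return x < 0 ? -x : x;
}

/* "Branchy" clamp written with plain if statements.
   Assumed signature: the original source is not shown in the thread. */
int clamp(int x, int lo, int hi) {
    if (x < lo) return lo;   /* below the range: snap to the lower bound */
    if (x > hi) return hi;   /* above the range: snap to the upper bound */
    return x;                /* already inside [lo, hi] */
}
```

Compiling with something like `gcc -O2 -S` and reading the generated assembly is one way to check whether a conditional jump survives; the exact instructions depend on the compiler, its version, and the target.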
replies(3): >>45408887 #>>45408971 #>>45411930 #
1. zeckalpha ◴[] No.45408971[source]
The inverse problem is here too:

> int partition_branchless(int* arr, int low, int high) {... for (int j = low; j < high; j++) { ... }... }

That for loop is just a sugared while loop, which is just a sugared cmp and jmp.
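To make the desugaring concrete, here is a small sketch with the same loop header as the quoted snippet. The function names sum_for and sum_goto and the summing body are hypothetical, chosen only for illustration; the second version spells out the compare and the two jumps that the for statement hides.

```c
/* The loop as one would normally write it. */
int sum_for(const int *arr, int low, int high) {
    int total = 0;
    for (int j = low; j < high; j++) {
        total += arr[j];
    }
    return total;
}

/* The same loop with the control flow written out by hand. */
int sum_goto(const int *arr, int low, int high) {
    int total = 0;
    int j = low;
loop:
    if (!(j < high)) goto done;  /* the compare plus conditional jump */
    total += arr[j];
    j++;
    goto loop;                   /* the unconditional jump back to the test */
done:
    return total;
}
```

Both functions compute the same result; an optimizing compiler will usually emit a different loop shape than either of them (for example, testing at the bottom of the loop), but a loop branch of some form remains unless the loop is fully unrolled.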

replies(1): >>45417846 #
2. xigoi ◴[] No.45417846[source]
The article says that it optimizes just the inside of the loop (the loop jump can be optimized via unrolling, which the compiler may do automatically).
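As a rough illustration of what "just the inside of the loop" can mean, here is a sketch of a Lomuto-style partition whose loop body has no data-dependent if. This is not the article's partition_branchless (its body is elided in the quote above); the name partition_sketch, the choice of arr[high] as the pivot, and the unconditional-swap trick are all assumptions made for the example.

```c
/* Partitions arr[low..high] around the pivot arr[high] and returns
   the pivot's final index. Hypothetical sketch, not the article's code. */
int partition_sketch(int *arr, int low, int high) {
    int pivot = arr[high];
    int i = low;                 /* arr[low..i) holds elements < pivot */
    for (int j = low; j < high; j++) {
        int v = arr[j];
        int less = v < pivot;    /* 0 or 1; no if statement in the source */
        arr[j] = arr[i];         /* swap arr[i] and arr[j] unconditionally */
        arr[i] = v;
        i += less;               /* advance the boundary only when v < pivot */
    }
    int t = arr[i];              /* move the pivot into its final slot */
    arr[i] = arr[high];
    arr[high] = t;
    return i;
}
```

The for loop's own compare-and-jump is still there, which is the point made in the parent comment. Whether the body really compiles without a branch, and whether the loop gets unrolled, is up to the compiler, so the generated assembly is worth checking before benchmarking.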