173 points daviducolo | 1 comments
johnisgood ◴[] No.43335889[source]
Those CPU features (AVX2 and so on) also need to be detected at runtime, however.

Those ifdefs only detect whether the compiler has those features enabled, i.e. at build time only.

So your program is only compiled with the AVX2 (and other) code paths if the compiler has them enabled. You should therefore build where the compiler has all of those features available (because you want everything compiled into one executable, of course), and then use runtime checks to make sure the CPU the program is running on actually supports AVX2, for example. That way it can select the best implementation based on the available CPU features.

To make things a bit more complicated, let me quote a part from one of the projects he has: "The detection is performed at configure time through both CPUID flags and actual instruction execution tests on the host machine, verifying support in both the CPU and operating system." Currently you are only covering the "OS" part, or rather the compiler, since you rely solely on macro definitions.

Once you add this, then "Automatically leverages SSE4.2 and AVX2 instructions when available for maximum throughput." from the list of features on the website will be correct / accurate.
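To make the idea concrete, here is a minimal sketch of build-once, dispatch-at-runtime, assuming GCC or Clang on x86. It is not taken from the project being discussed: the byte-sum kernel and the names `sum_scalar`, `sum_avx2`, `select_sum` and `sum_bytes` are made up for illustration. `__builtin_cpu_supports` and the `target` attribute are compiler extensions, so other compilers would need their own CPUID wrapper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable fallback: always compiled, runs on any CPU. */
static uint64_t sum_scalar(const uint8_t *data, size_t n) {
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) total += data[i];
    return total;
}

/* Assumption: GCC/Clang on x86. Elsewhere only the scalar path is built. */
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_AVX2_KERNEL 1

/* The target attribute compiles just this function with AVX2 enabled,
   so the rest of the binary does not need -mavx2. */
__attribute__((target("avx2")))
static uint64_t sum_avx2(const uint8_t *data, size_t n) {
    const __m256i zero = _mm256_setzero_si256();
    __m256i acc = _mm256_setzero_si256();
    size_t i = 0;
    for (; i + 32 <= n; i += 32) {
        __m256i v = _mm256_loadu_si256((const __m256i *)(data + i));
        /* Sum of absolute differences against zero adds up the bytes,
           producing four 64-bit partial sums per 32-byte chunk. */
        acc = _mm256_add_epi64(acc, _mm256_sad_epu8(v, zero));
    }
    uint64_t lanes[4];
    _mm256_storeu_si256((__m256i *)lanes, acc);
    uint64_t total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    for (; i < n; i++) total += data[i]; /* scalar tail */
    return total;
}
#endif

typedef uint64_t (*sum_fn)(const uint8_t *, size_t);

/* Runtime check: asks the CPU the program is actually running on. */
static sum_fn select_sum(void) {
#ifdef HAVE_AVX2_KERNEL
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return sum_avx2;
#endif
    return sum_scalar;
}

/* Entry point: resolves the kernel on first use, then reuses it.
   The lazy initialisation here is not synchronised between threads. */
static uint64_t sum_bytes(const uint8_t *data, size_t n) {
    static sum_fn kernel = NULL;
    if (kernel == NULL) kernel = select_sum();
    assert(data != NULL || n == 0);
    return kernel(data, n);
}
```

Callers only ever use `sum_bytes(buf, len)`. The same executable then runs the AVX2 kernel on machines that have it and the scalar loop everywhere else, which is what "when available" should mean.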

If you are interested, someone I know (or rather, follow) has a single header file for detecting CPU features at runtime (for C), and he also has a build-time detection one, but that has many more features.

replies(2): >>43339194 #>>43339808 #
doctorsher ◴[] No.43339808[source]
I am interested in the CPU intrinsics detection in a single header file, if you don’t mind dropping the link.
replies(2): >>43340702 #>>43341726 #
1. ashvardanian ◴[] No.43340702[source]
Here is the code I use to detect available CPU features at runtime and dispatch the correct kernels - https://github.com/ashvardanian/StringZilla/blob/main/c/lib....

There are also several AVX-512 kernels for different CPU generations, so it’s obviously longer than just checking for AVX.