(So, for example, this bug could never have been written in Rust unless the language was deeply misused.)
Though I really like the _mm_undefined_ps() intrinsic for SSE, which makes it clear that you're purposefully not initialising a variable. Something like that for plain ints and floats would be pretty sweet.
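A minimal sketch of where that intrinsic genuinely earns its keep (function and parameter names here are mine): merging loads that each overwrite only half the register, so any "initial" value would be dead anyway.

    #include <immintrin.h>

    // _mm_loadl_pi/_mm_loadh_pi replace only half of the destination, so the
    // starting value of the other half is irrelevant. _mm_undefined_ps() says
    // "don't bother zeroing it" explicitly, instead of a token _mm_setzero_ps().
    __m128 load_two_halves(const __m64* lo, const __m64* hi) {
        __m128 v = _mm_undefined_ps();
        v = _mm_loadl_pi(v, lo);  // lanes 0-1 <- lo
        v = _mm_loadh_pi(v, hi);  // lanes 2-3 <- hi
        return v;
    }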
> The performance impact is negligible (less than 0.5% regression) to slightly positive (that is, some code gets faster by up to 1%). The code size impact is negligible (smaller than 0.5%). Compile-time regressions are negligible. Were overheads to matter for particular coding patterns, compilers would be able to obviate most of them.
> The only significant performance/code regressions are when code has very large automatic storage duration objects. We provide an attribute to opt-out of zero-initialization of objects of automatic storage duration. We then expect that programmers can audit their code for this attribute, and ensure that the unsafe subset of C++ is used in a safe manner.
> This change was not possible 30 years ago because optimizations simply were not as good as they are today, and the costs were too high. The costs are now negligible.
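For a concrete picture of that opt-out, here is roughly how it looks with the closest shipping equivalent today (Clang/GCC's -ftrivial-auto-var-init=zero plus the Clang/GCC `uninitialized` variable attribute; the paper's attribute may be spelled differently):

    // Build with: clang++ -O2 -ftrivial-auto-var-init=zero example.cpp
    void produce(char* dst, unsigned n);  // assumed to write dst[0..n)

    unsigned checksum(unsigned n) {
        // 64 KiB of automatic storage: zero-filling it on every call is
        // exactly the "very large object" case the paper flags, so it is
        // opted out here and must instead be audited by hand.
        char buf[65536] __attribute__((uninitialized));
        produce(buf, n);
        unsigned sum = 0;  // ordinary locals stay zero-initialized
        for (unsigned i = 0; i < n; i++)
            sum += (unsigned char)buf[i];
        return sum;
    }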
[1] https://github.com/cplusplus/papers/issues/1401
[2] https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p27...
A trick we were using with SSE was something like:

    __m128 zero = _mm_undefined_ps();  // contents don't matter...
    zero = _mm_xor_ps(zero, zero);     // ...because x ^ x == 0 for any x
Now, we were really careful to view our ops as data dependencies when reasoning about pipelining efficiency, but our profiling tools were not measuring any of this.
We did avoid _mm_set_ps1(0.0f), which was actually showing up as cache misses.
I wonder if we ended up actually slower overall, simply because cache misses were the one thing we could measure?!
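For reference, the three zeroing spellings side by side (my summary, not measured, and compiler-dependent): the first two typically compile to the self-xor that x86 cores recognize as a dependency-breaking zeroing idiom, so there is usually no pipeline stall to find; the literal broadcast is the one that may turn into a constant load from memory, which would explain the cache misses.

    #include <immintrin.h>

    __m128 zero_canonical() { return _mm_setzero_ps(); }  // xorps xmm0, xmm0

    __m128 zero_xor_trick() {            // usually the exact same xorps: the
        __m128 v = _mm_undefined_ps();   // CPU treats x ^ x as having no input
        return _mm_xor_ps(v, v);         // dependency, whatever x held before
    }

    __m128 zero_broadcast() {            // may become a 16-byte constant load
        return _mm_set_ps1(0.0f);        // from memory -> the misses above
    }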