Zlib-rs is faster than C

(trifectatech.org)

341 points dochtman | 2 comments | 16 Mar 25 19:35 UTC


YZF ◴[16 Mar 25 20:12 UTC] No.43381858[source]▶

I found out I already know Rust:

        unsafe {
            let x_tmp0 = _mm_clmulepi64_si128(xmm_crc0, crc_fold, 0x10);
            xmm_crc0 = _mm_clmulepi64_si128(xmm_crc0, crc_fold, 0x01);
            xmm_crc1 = _mm_xor_si128(xmm_crc1, x_tmp0);
            xmm_crc1 = _mm_xor_si128(xmm_crc1, xmm_crc0);
            // ...
        }

Kidding aside, I thought the purpose of Rust was safety, but the keyword unsafe is sprinkled liberally throughout this library. At what point does it stop mattering whether this is C or Rust?

Presumably with inline assembly both languages can emit what is effectively the same machine code. Is the Rust compiler a better optimizing compiler than C compilers?

replies(30): >>43381895 #>>43381907 #>>43381922 #>>43381925 #>>43381928 #>>43381931 #>>43381934 #>>43381952 #>>43381971 #>>43381985 #>>43382004 #>>43382028 #>>43382110 #>>43382166 #>>43382503 #>>43382805 #>>43382836 #>>43383033 #>>43383096 #>>43383480 #>>43384867 #>>43385039 #>>43385521 #>>43385577 #>>43386151 #>>43386256 #>>43386389 #>>43387043 #>>43388529 #>>43392530 #

torginus ◴[17 Mar 25 10:48 UTC] No.43387043[source]▶

>>43381858 #

I wonder why writing SIMD in high-level languages hasn't been figured out yet for CPUs (it has been the norm for GPUs since forever). Auto-vectorization universally sucks, and so do OpenMP directives.

There was ISPC, a separate C-like programming language just for SIMD, but I don't understand why regular compilers can't generate high-quality vectorized code.

replies(2): >>43388044 #>>43388570 #

1. YoshiRulz ◴[17 Mar 25 12:56 UTC] No.43388044[source]▶

>>43387043 #

.NET (C#) is getting there with Vector<T>.

replies(1): >>43392279 #

2. torginus ◴[17 Mar 25 20:09 UTC] No.43392279[source]▶

>>43388044 (TP) #

That's just syntactic sugar (and a bit of architecture independence) over intrinsics. You can get the same in C++ just by wrapping intrinsics in classes, plus a few ifdefs.
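To make the "wrap intrinsics in a type, plus a few conditionals" idea concrete, here is a minimal sketch in Rust (the language of the library under discussion) rather than C++. The type name F32x4 and its methods are hypothetical, invented for illustration; they are not part of std, zlib-rs, or any crate. The sketch assumes only that SSE is available on every x86_64 target, which is part of that architecture's baseline, and it falls back to a scalar loop elsewhere.

```rust
// Hypothetical 4-lane f32 vector type. The name and API are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct F32x4([f32; 4]);

impl F32x4 {
    pub fn new(lanes: [f32; 4]) -> Self {
        F32x4(lanes)
    }

    pub fn to_array(self) -> [f32; 4] {
        self.0
    }

    // x86_64 path: one SSE add over all four lanes.
    #[cfg(target_arch = "x86_64")]
    pub fn add(self, rhs: Self) -> Self {
        use std::arch::x86_64::{_mm_add_ps, _mm_loadu_ps, _mm_storeu_ps};
        let mut out = [0.0f32; 4];
        // SAFETY: SSE is part of the x86_64 baseline, and the unaligned
        // load/store intrinsics touch exactly four f32 values, which is
        // the full length of each array.
        unsafe {
            let a = _mm_loadu_ps(self.0.as_ptr());
            let b = _mm_loadu_ps(rhs.0.as_ptr());
            _mm_storeu_ps(out.as_mut_ptr(), _mm_add_ps(a, b));
        }
        F32x4(out)
    }

    // Fallback path (the "ifdef" branch): plain scalar loop.
    #[cfg(not(target_arch = "x86_64"))]
    pub fn add(self, rhs: Self) -> Self {
        let mut out = [0.0f32; 4];
        for i in 0..4 {
            out[i] = self.0[i] + rhs.0[i];
        }
        F32x4(out)
    }
}
```

Callers only see the safe add method: F32x4::new([1.0, 2.0, 3.0, 4.0]).add(F32x4::new([10.0, 20.0, 30.0, 40.0])).to_array() gives [11.0, 22.0, 33.0, 44.0] on either code path. The unsafe block stays confined to one small, auditable function.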
