Zlib-rs is faster than C

(trifectatech.org)

Show context

YZF ◴[16 Mar 25 20:12 UTC] No.43381858[source]▶

I found out I already know Rust:

        unsafe {
            let x_tmp0 = _mm_clmulepi64_si128(xmm_crc0, crc_fold, 0x10);
            xmm_crc0 = _mm_clmulepi64_si128(xmm_crc0, crc_fold, 0x01);
            xmm_crc1 = _mm_xor_si128(xmm_crc1, x_tmp0);
            xmm_crc1 = _mm_xor_si128(xmm_crc1, xmm_crc0);

Kidding aside, I thought the purpose of Rust was for safety but the keyword unsafe is sprinkled liberally throughout this library. At what point does it really stop mattering if this is C or Rust?

Presumably with inline assembly both languages can emit what is effectively the same machine code. Is the Rust compiler a better optimizing compiler than C compilers?

replies(30): >>43381895 #>>43381907 #>>43381922 #>>43381925 #>>43381928 #>>43381931 #>>43381934 #>>43381952 #>>43381971 #>>43381985 #>>43382004 #>>43382028 #>>43382110 #>>43382166 #>>43382503 #>>43382805 #>>43382836 #>>43383033 #>>43383096 #>>43383480 #>>43384867 #>>43385039 #>>43385521 #>>43385577 #>>43386151 #>>43386256 #>>43386389 #>>43387043 #>>43388529 #>>43392530 #

Aurornis ◴[16 Mar 25 20:21 UTC] No.43381931[source]▶

>>43381858 #

Using unsafe blocks in Rust is confusing when you first see it. The idea is that you have to opt-out of compiler safety guarantees for specific sections of code, but they’re clearly marked by the unsafe block.

In good practice it’s used judiciously in a codebase where it makes sense. Those sections receive extra attention and analysis by the developers.

Of course you can find sloppy codebases where people reach for unsafe as a way to get around Rust instead of writing code the Rust way, but that’s not the intent.

You can also find die-hard Rust users who think unsafe should never be used and make a point to avoid libraries that use it, but that’s excessive.

replies(10): >>43381986 #>>43382095 #>>43382102 #>>43382323 #>>43385098 #>>43385651 #>>43386071 #>>43386189 #>>43386569 #>>43392018 #

timschmidt ◴[16 Mar 25 20:27 UTC] No.43381986[source]▶

>>43381931 #

Unsafe is a very distinct code smell. Like the hydrogen sulfide added to natural gas to allow folks to smell a gas leak.

If you smell it when you're not working on the gas lines, that's a signal.

replies(6): >>43382188 #>>43382239 #>>43384810 #>>43385163 #>>43385670 #>>43386705 #

cmrdporcupine ◴[16 Mar 25 20:51 UTC] No.43382188[source]▶

>>43381986 #

Look, no. Just go read the unsafe block in question. It's just SIMD intrinsics. No memory access. No pointers. It's unsafe in name only.

No need to get all moral about it.

replies(3): >>43382234 #>>43382266 #>>43382480 #

kccqzy ◴[16 Mar 25 20:56 UTC] No.43382234[source]▶

>>43382188 #

By your line of reasoning, SIMD intrinsics functions should not be marked as unsafe in the first place. Then why are they marked as unsafe?

replies(4): >>43382276 #>>43382451 #>>43384972 #>>43385883 #

thrance ◴[17 Mar 25 03:39 UTC] No.43384972[source]▶

>>43382234 #

For now the caller has to ensure proper alignment of SMID lines. But in the future a safe API will be made available, once the kinks are ironed out. You can already use it in fact, by enabling a specific compiler feature [1].

[1] https://doc.rust-lang.org/std/simd/index.html

replies(1): >>43385024 #

anonymoushn ◴[17 Mar 25 03:49 UTC] No.43385024[source]▶

>>43384972 #

there are no loads in the above unsafe block, in practice loadu is just as fast as load, and even if you manually use the aligned load or store, you get a crash. it's silly to say that crashes are unsafe.

replies(1): >>43385188 #

1. jchw ◴[17 Mar 25 04:21 UTC] No.43385188[source]▶

>>43385024 #

Well, there's a category difference between a crash as in a panic and a crash as in a CPU exception. Usually, "safe" programming limits crashes to language-level error handling, which allows you to easily reason about the nature of crashes: if the type system is sound and your program doesn't use unsafe, the only way it should crash is by panic, and panics are recoverable and leave your program in a well-defined state. By the time you get to a signal handler, you're too late. Admittedly, there are some cases where this is less important than others... misaligned load/store wouldn't lead to a potential RCE, but if it can bring down a program it still is a potential DoS vector.

Of course, in practice, even in Rust, it isn't strictly true that programs without unsafe can't crash with fatal runtime errors. There's always stack overflows, which will crash you with a SIGABRT or equivalent operating system error.

replies(2): >>43387323 #>>43387638 #

2. gpderetta ◴[17 Mar 25 11:27 UTC] No.43387323[source]▶

>>43385188 (TP) #

As you point out later, a SIGBRT or a SIGBUS would both be perfectly safe and really no different than a panic. With enough infra you could convert them to panic anyway (but probably not worth the effort).

replies(1): >>43388398 #

3. thrance ◴[17 Mar 25 12:11 UTC] No.43387638[source]▶

>>43385188 (TP) #

Also, AFAIK panics are not always recoverable in Rust. You can compile your project with `panic = "abort"`, in which case the program will quit immediately whenever a panic is encountered.

replies(1): >>43388463 #

4. jchw ◴[17 Mar 25 13:29 UTC] No.43388398[source]▶

>>43387323 #

Well, that's the thing though: in terms of Rust and Go and other safe programming languages, CPU exceptions are not "safe" even though they are not inherently dangerous. The point is that the subset of the language that is safe can't generate them, period. They are not accounted for in safe code.

There are uses for this, especially since some code will run in environments where you can not simply handle it, but it's also just cleaner this way; you don't have to worry about the different behaviors between operating systems and possibly CPU architectures with regards to error recovery if you simply don't generate any.

Since there are these edge cases where it wouldn't be possible to handle faults easily (e.g. some kernel code) it needs to be considered unsafe in general.

replies(1): >>43393003 #

5. jchw ◴[17 Mar 25 13:35 UTC] No.43388463[source]▶

>>43387638 #

Sure, but that is beside the point: if you compile code like that, you're intentionally making panics unrecoverable. The nature of panics from the language perspective is not any different; you're still in a well-defined state when it happens.

It's also possible to go a step further and practice "panic-free" Rust where you write code in such a way that it never links to the panic handler. Seems pretty hard to do, but seems like it might be worth it sometimes, especially if you're in an environment where you don't have anything sensible to do on a panic.

6. comex ◴[17 Mar 25 21:41 UTC] No.43393003{3}[source]▶

>>43388398 #

That’s largely true, but there are some exceptions (pun not intended).

In Rust, the CPU exception resulting from a stack overflow is considered safe. The compiler uses stack probing to ensure that as long as there is at least one page of unmapped memory below the stack (guard page), the program will reliably fault on it rather than continuing to access memory further below. In most environments it is possible to set up a guard page, including Linux kernel code if CONFIG_VMAP_STACK is enabled. But there are other environments where it’s not, such as WebAssembly and some microcontrollers. In those environments, the backend would have to add explicit checks to function prologs to ensure enough stack is available. I say “would have to”, not “does”: I’ve heard that on at least the microcontrollers, there are no such checks and Rust is just unsound at the moment. Not sure about WebAssembly.

Meanwhile, Go uses CPU exceptions to handle nil dereferences.

replies(1): >>43393106 #

7. jchw ◴[17 Mar 25 21:53 UTC] No.43393106{4}[source]▶

>>43393003 #

Yeah, I glossed over the Rust stack overflow case. I don't know why: Literally two parent comments up I did bother to mention it.

That said, I actually entirely forgot Go catches nil derefs in a segfault handler. I guess it's not a big deal since Go isn't really suitable for free-standing environments where avoiding CPU exceptions is sometimes more useful, so there's no particular reason why the runtime can't rely on it.

↑