←back to thread

278 points love2read | 10 comments | | HN request time: 0.842s | source | bottom
1. pizza234 ◴[] No.42481083[source]
I've ported some projects to Rust (including C, where I've used C2Rust as first step), and I've drawn some conclusions.

1. Converting a C program to Rust, even if it includes unsafe code, often uncovers bugs quickly thanks to Rust’s stringent constraints (bounds checking, strict signatures, etc.).

2. automated C to Rust conversion is IMO something that will never be solved entirely, because the design of C program is fundamentally different from Rust; such conversions require a significant redesign to be made safe (of course, not all C programs are the same).

3. in some cases, it’s plain impossible to port a program from C to Rust while preserving the exact semantics, because unsafety can be inherent in the design.

That said, tooling is essential to porting, and as tools continue to evolve, the process will become more streamlined.

replies(2): >>42481307 #>>42482340 #
2. LPisGood ◴[] No.42481307[source]
>because unsafety can be inherent in the design

I agree in principle, and I have strong feelings based on my experience that this is the case, but I think it would be illustrative to have some hard examples in mind. Does anyone know any simple cases to ground this discussion in?

replies(2): >>42481571 #>>42481575 #
3. nuancebydefault ◴[] No.42481571[source]
Suppose it is a dll that has exported functions returning or accepting unsafe strings. No way to make it safe without changing the API.
replies(1): >>42482125 #
4. colejohnson66 ◴[] No.42481575[source]
Maybe a JIT? Especially one that can poke back into the runtime (like dotnet).
replies(1): >>42482090 #
5. LPisGood ◴[] No.42482090{3}[source]
I know Unity game engine uses some transpiler called IL2CPP that manages to preserve some of the safety features of C# but still gets the speed of CPP, so one direction is definitely possible
replies(1): >>42483064 #
6. tatref ◴[] No.42482125{3}[source]
In Rust, there is no unsafe String, only blocks of code can be unsafe, no?
replies(1): >>42482331 #
7. whytevuhuni ◴[] No.42482331{4}[source]
They likely mean a char* pointer to a null-terminated string, or a char* pointer and a length, as is usual for C.

If Rust was forced to expose such an API (to be on par with C's old API), it would have to use `*const u8` in its signature. Converting that to something that can be used in Rust is unsafe.

Even once converted to &[u8], it now has to deal with non-UTF8 inputs throughout its whole codebase, which is a lot more inconvenient. A lot of methods, like .split_ascii_whitespace, are missing on &[u8]. A lot of libraries won't take anything but a &str.

Or they might be tempted to convert such an input to a String, in which case the semantics will differ (it will now panic on non-UTF8 inputs).

8. int_19h ◴[] No.42482340[source]
> automated C to Rust conversion is IMO something that will never be solved entirely

Automated conversion of C to safe fast Rust is hard. Automated conversion of C to safe Rust in general is much easier - you just need to represent memory as an array, and treat pointers as indices into said array. Now you can do everything C can do - unchecked pointer arithmetic, unions etc - without having to fight the borrow checker. Semantics fully preserved. Similar techniques have been used for C-to-Java for a long time now.

Of course, the value of such a conversion is kinda dubious. You basically end up with something like C compiled to wasm, but even slower, and while the resulting code is technically "safe", it is still susceptible to issues buffer overflows producing invalid state, dangling pointers allowing access to data in contexts where it shouldn't be allowed etc.

replies(1): >>42482663 #
9. zozbot234 ◴[] No.42482663[source]
You can do a lot better than that. You can treat memory ranges coming from separate allocations as distinct segments, and pointers as tuples of a segment ID and a linear offset within the segment. This is essentially what systems like CHERI are built on, and how C and C++ are implemented on segmented architectures like the 8086 and 80286. The C standard includes a somewhat limited notion of "objects" that's intended to support this exact case.
10. neonsunset ◴[] No.42483064{4}[source]
Oh, it's mainly for platform compatibility. IL2CPP performance is really problematic since it still carries many issues of Mono, even if transpiled to C++: https://meetemq.com/2023/09/18/is-net-8-performant-enough/ (don't look at just the starting graph - make sure to scroll down to notes where RyuJITs code competes with other fast entries or even outperforms them).

Perhaps what you were looking for is NativeAOT? Either way C ports really well to C# since it supports a large subset of it "as is" and then some with generics and other features originating from C# itself.