Wild – A fast linker for Linux

(github.com)

KerrAvon ◴[24 Jan 25 21:59 UTC] No.42817318[source]▶

I'm curious: what's the theory behind why this would be faster than mold in the non-incremental case? "Because Rust" is a fine explanation for a bunch of things, but doesn't explain expected performance benefits.

"Because there's low hanging concurrent fruit that Rust can help us get?" would be interesting but that's not explicitly stated or even implied.

replies(1): >>42818238 #

1. davidlattimore ◴[25 Jan 25 00:20 UTC] No.42818238[source]▶

>>42817318 #

I'm not actually sure, mostly because I'm not really familiar with the Mold codebase. One clue is that I've heard that Mold gets about a 10% speedup by using a faster allocator (mimalloc). I've tried using mimalloc with Wild and didn't get any measurable speedup. This suggests to me that Mold is probably making heavier use of the allocator than Wild is. With Wild, I've certainly tried to optimise the number of heap allocations.

But in general, I'd guess just different design decisions. As for how this might be related to Rust - I'm certain that were Wild ported from Rust to C or C++, it would perform very similarly. However, code patterns that are fine in Rust due to the borrow checker would be footguns in languages like C or C++, so maintaining that code could be tricky. Certainly when I've coded in C++ in the past, I've found myself coding more defensively, even at a small performance cost, whereas with Rust, I'm able to be a lot bolder because I know the compiler has got my back.

replies(2): >>42820415 #>>42820731 #

2. menaerus ◴[25 Jan 25 08:46 UTC] No.42820415[source]▶

>>42818238 (TP) #

> Mold gets about a 10% speedup by using a faster allocator (mimalloc). I've tried using mimalloc with Wild and didn't get any measurable speedup

Perhaps it is worth repeating the experiment with heavy MLoC codebases, using either jemalloc or mimalloc.

3. einpoklum ◴[25 Jan 25 10:13 UTC] No.42820731[source]▶

>>42818238 (TP) #

Rust is a perfectly fine language, and there's no reason you should not be able to implement fast incremental linking using Rust, so - I wish you success in doing that.

... however...

> code patterns that are fine in Rust due to the borrow checker, would be footguns in languages like C or C++,

That "dig" is probably not true. Or rather, your very conflation of C and C++ suggests that you are talking about the kind of code which would not be used in modern C++ of the past decade-or-more. While one _can_ write footguns in C++ easily, one can also very easily choose not to do so - especially when writing a new project.

replies(1): >>42820796 #

4. panstromek ◴[25 Jan 25 10:35 UTC] No.42820796[source]▶

>>42820731 #

Tell me you don't have rust experience without telling me you don't have rust experience.

replies(1): >>42820836 #

5. panstromek ◴[25 Jan 25 10:44 UTC] No.42820836{3}[source]▶

>>42820796 #

I mean, sorry for the snark but really, there are so many of these things that it's just ridiculous to even attempt to compare. E.g. I wouldn't ever use something like string_view or span unless the code is absolutely performance critical. There's a lot of defensive copying in C(++), because all the risks of losing track of pointers are just not worth it. In Rust, you can go really wild with this, there's no comparison.
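Editor's illustration (not part of the original comment): a minimal sketch of the kind of borrowed "view" being described, assuming a hypothetical helper `longest_word` that returns a slice of its input without copying. The commented-out lines show the misuse that the compiler is expected to reject; nothing here is taken from Wild's code.

```rust
// Hypothetical helper: returns a view into `text`, no allocation or copy.
// The returned &str borrows from `text`, so it cannot outlive it.
fn longest_word(text: &str) -> &str {
    text.split_whitespace()
        .max_by_key(|w| w.len())
        .unwrap_or("")
}

fn main() {
    let owned = String::from("a fast linker");
    let word = longest_word(&owned);
    println!("{word}");

    // Rejected at compile time if uncommented (error E0597,
    // "`short_lived` does not live long enough"):
    //
    //     let dangling;
    //     {
    //         let short_lived = String::from("a fast linker");
    //         dangling = longest_word(&short_lived);
    //     }
    //     println!("{dangling}");
}
```

The equivalent std::string_view code in C++ compiles, which is the reason given above for copying defensively.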

replies(2): >>42822546 #>>42840671 #

6. einpoklum ◴[25 Jan 25 16:24 UTC] No.42822546{4}[source]▶

>>42820836 #

> because all the risks of losing track of pointers are just not worth it.

These risks are mostly, and often entirely, gone when you write modern C++. You don't lose track of them, because you don't track them, and you only use them when you don't need to track them. (Except for inside the implementations of a few data structures, which one can think of as the equivalent of unsafe code in Rust). Of course I'm generalizing here, but again, you just don't write C-style code, and you don't have those problems.

(You may have some other problems of course, C++ has many warts.)

replies(1): >>42829548 #

7. panstromek ◴[26 Jan 25 12:01 UTC] No.42829548{5}[source]▶

>>42822546 #

I don't see how modern C++ solves any of those problems, and especially without performance implications.

Like, how do you make sure that you don't hold any dangling references to a vector that reallocated? How do you make sure that code that needs synchronization is synchronized? How do you make sure that non-thread safe code is never used from multiple threads? How do you make sure that you don't ever invalidate an iterator? How do you make sure that you don't hold a reference to data owned by a unique pointer that went out of scope? How do you make sure you don't hold a string view for a string that went out of scope?

As far as I know (and how I experienced it), the answer to all of those questions is to either use some special api that you have to know about, or do something non-optimal, like creating a defensive copy, using a shared pointer or adding a "just in case" mutex, or "just remember you might cause problem a and be careful."

In Rust all of those problems are a compile error and you have to make an extra effort to trigger them at runtime with unsafe code. That's a very big difference and I don't understand how modern C++ can come even close to it.
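Editor's illustration (not part of the original comment): a small sketch of the first case in the list, a reference into a vector that may reallocate. The function name `first_then_grow` is made up for this example. The rejected variant is left in comments so that the block still compiles.

```rust
// Hypothetical helper: read the first element, then grow the vector.
fn first_then_grow(v: &mut Vec<i32>) -> i32 {
    // Rejected by the borrow checker if uncommented (error E0502):
    //
    //     let first = &v[0]; // shared borrow of `v`
    //     v.push(4);         // mutable borrow; push may reallocate
    //     return *first;     // `first` could point at freed memory
    //
    // Accepted: copy the value out, so no reference is alive across the push.
    let first = v[0];
    v.push(4);
    first
}

fn main() {
    let mut v = vec![1, 2, 3];
    let first = first_then_grow(&mut v);
    println!("first = {first}, len = {}", v.len());
}
```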

replies(1): >>42831399 #

8. einpoklum ◴[26 Jan 25 16:46 UTC] No.42831399{6}[source]▶

>>42829548 #

> Like, how do you make sure that you don't hold any dangling references to a vector that reallocated?

So, I'll first nitpick and say that's not a problem with pointer tracking.

To answer the question, though:

When I'm writing a function which receives a reference to a vector, then - either it's a const reference, in which case I don't change it, or it's a non-const reference, in which case I can safely assume I'm allowed to change it - but I can't keep any references or pointers into it, or iterators from it etc. I also expect and rely on functions that I call with a non-const reference to that vector, to act the same.

And when I create a vector, I just rely on the above in functions I call.

This is not some gamble. It's how C++ code is written. Yes, you can write code which breaks that principle if you like, but - I don't, and library authors don't.

> How do you make sure that code that needs synchronization is synchronized?

You mean, synchronization between threads which work on the same data? There's no one answer for that. It depends. If you want to be super-safe, you don't let your threads know about anything other than multithread-aware data structures, whose methods ensure synchronization. Like a concurrent queue or map or something. If it's something more performance-critical where synchronization is too expensive, then you might work out when/where it's safe for the threads to work on the same data, and keep the synchronization to a minimum. Which is kind of like unsafe Rust, I imagine. But it's true that it's pretty easy to ignore synchronization and just "let it rip", and C++ will not warn you about doing that. Still, you won't enter that danger zone unless you've explicitly decided to do multithreaded work.

About the Rust side of things... isn't it Turing-complete to know whether, and when, threads need to synchronize? I'm guessing that safe Rust demands that you not share data which has unsynchronized access, between threads.
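Editor's illustration (not part of the original comment): the guess above is roughly right. Safe Rust does not try to decide when threads need to synchronize; it conservatively requires that anything shared across threads has a type marked `Sync` (and anything moved across them `Send`). A minimal sketch, assuming a made-up helper `parallel_count`:

```rust
use std::sync::Mutex;
use std::thread;

// Hypothetical helper: several threads increment one shared counter.
fn parallel_count(threads: usize, per_thread: u64) -> u64 {
    // Mutex<u64> is Sync, so a shared reference to it may cross threads.
    // Swapping it for std::cell::Cell<u64> (not Sync) is a compile error,
    // as is mutating a plain `let mut total` from more than one closure.
    let total = Mutex::new(0u64);

    // Scoped threads are joined before `scope` returns, so they may borrow locals.
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    *total.lock().unwrap() += 1;
                }
            });
        }
    });

    total.into_inner().unwrap()
}

fn main() {
    println!("{}", parallel_count(4, 1000));
}
```

The check is type-based, so it can reject code that would in fact be safe; that is the tradeoff for not needing whole-program analysis.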

> the answer to all of those questions is to either use some special api that you have to know about

C++ language features and standard library facilities are a "special API" that you have to know about. But then, so are raw pointers. A novice C++ programming student might not even be taught about using them until late in their first programming course.

My main point was, that if you talk about "C/C++ programming", then you will necessarily not use most of those language features and facilities - which are commonly used in modern code and can keep you safe. You would be writing C-like code, and will have to be very careful (or reinvent the wheel, creating such mechanisms yourself).

replies(1): >>42844642 #

9. account42 ◴[27 Jan 25 13:00 UTC] No.42840671{4}[source]▶

>>42820836 #

That you subject yourself to FUD is not an argument for anything.

replies(1): >>42845203 #

10. panstromek ◴[27 Jan 25 19:24 UTC] No.42844642{7}[source]▶

>>42831399 #

Most of what you describe, especially in the multithreading part, is already a defensive practice. That's kind of the whole point. I don't deny that some modern C++ constructs help, I've used them, but the level of confidence is just not there. Note that I lump C and C++ together intentionally. For this purpose, they are almost equivalent as Rust tackles the problems they have in common.

I think it'd be better if you first try to understand what Rust actually does here, for which I usually recommend this talk for C++ developers, which describes the most important ideas on snippets of C++ and Rust side by side: https://youtu.be/IPmRDS0OSxM

That's probably my favourite demonstration.

replies(1): >>42863869 #

11. panstromek ◴[27 Jan 25 20:22 UTC] No.42845203{5}[source]▶

>>42840671 #

No, it's just business. Memory corruption bugs are crazy expensive. One of those N cases goes wrong at some point and somebody will have to spend a week in gdb with corrupt stacktraces from production on some issue that's non-deterministic and doesn't reproduce on a dev machine.

12. einpoklum ◴[29 Jan 25 11:46 UTC] No.42863869{8}[source]▶

>>42844642 #

> I don't deny that some modern C++ constructs help

This thread started because you essentially denied these constructs have any significance, as you lumped the two languages together. You are still overstating your point.

Moreover - Rust has different design goals than either of these two languages. Indeed, neither of them guarantees memory safety at the language level; Rust makes different tradeoffs, pays a certain price, and does guarantee it. I will watch that video though.
