←back to thread

271 points mithcs | 3 comments | | HN request time: 0s | source
Show context
woodruffw ◴[] No.45953391[source]
Intentionally or not, this post demonstrates one of the things that makes safer abstractions in C less desirable: the shared pointer implementation uses a POSIX mutex, which means it’s (1) not cross platform, and (2) pays the mutex overhead even in provably single-threaded contexts. In other words, it’s not a zero-cost abstraction.

C++’s shared pointer has the same problem; Rust avoids it by having two types (Rc and Arc) that the developer can select from (and which the compiler will prevent you from using unsafely).

replies(13): >>45953466 #>>45953495 #>>45953667 #>>45954940 #>>45955297 #>>45955366 #>>45955631 #>>45955835 #>>45959088 #>>45959352 #>>45960616 #>>45962213 #>>45975677 #
kouteiheika ◴[] No.45953466[source]
> the shared pointer implementation uses a POSIX mutex [...] C++’s shared pointer has the same problem

It doesn't. C++'s shared pointers use atomics, just like Rust's Arc does. There's no good reason (unless you have some very exotic requirements, into which I won't get into here) to implement shared pointers with mutexes. The implementation in the blog post here is just suboptimal.

(But it's true that C++ doesn't have Rust's equivalent of Rc, which means that if you just need a reference counted pointer then using std::shared_ptr is not a zero cost abstraction.)

replies(2): >>45953492 #>>45953505 #
woodruffw ◴[] No.45953492[source]
To be clear, the “same problem” is that it’s not a zero-cost abstraction, not that it uses the same specific suboptimal approach as this blog post.
replies(1): >>45953527 #
kouteiheika ◴[] No.45953527[source]
I think that's an orthogonal issue. It's not that C++'s shared pointer is not a zero cost abstraction (it's as much a zero cost abstraction as in Rust), but that it only provides one type of a shared pointer.

But I suppose we're wasting time on useless nitpicking. So, fair enough.

replies(1): >>45953589 #
woodruffw ◴[] No.45953589[source]
I think they’re one and the same: C++ doesn’t have program-level thread safety by construction, so primitives like shared pointers need to be defensive by default instead of letting the user pick the right properties for their use case.

Edit: in other words C++ could provide an equivalent of Rc, but we’d see no end of people complaining when they shoot themselves in the foot with it.

(This is what “zero cost abstraction” means: it doesn’t mean no cost, just that the abstraction’s cost is no greater than the semantically equivalent version written by the user. So both Arc and shared_ptr are zero-cost in a MT setting, but only Rust has a zero-cost abstraction in a single-threaded setting.)

replies(3): >>45953732 #>>45957354 #>>45960578 #
SR2Z ◴[] No.45957354[source]
Isn't the point of using atomics that there is virtually no performance penalty in single threaded contexts?

IMO "zero cost abstraction" just means "I have a slightly less vague idea of what this will compile to."

replies(1): >>45957420 #
SkiFire13 ◴[] No.45957420[source]
No, atomics do have a performance penality compared to the equivalent single threaded code due to having to fetch/flush the impacted cache lines in the eventuality that another thread is trying to atomically read/write the same memory location at the same time.
replies(1): >>45958919 #
1. CyberDildonics ◴[] No.45958919[source]
Atomics have almost no impact when reading, which is what would happen in a shared pointer the vast majority of the time.
replies(2): >>45959048 #>>45959568 #
2. woodruffw ◴[] No.45959048[source]
> which is what would happen in a shared pointer the vast majority of the time.

This seems workload dependent; I would expect a lot of workloads to be write-heavy or at least mixed, since copies imply writes to the shared_ptr's control block.

3. oconnor663 ◴[] No.45959568[source]
I think it's pretty rare to do a straight up atomic load of a refcount. (That would be the `use_count` method in C++ or the `strong_count` method in Rust.) More of the time you're doing either a fetch-add to copy the pointer or a fetch-sub to destroy your copy, both of which involve stores. Last I heard the fetch-add can use the "relaxed" atomic ordering, which should make it very cheap, but the fetch-sub needs to use the "release" ordering, which is where the cost comes in.