Most active commenters

dmitrygr(3)
(3)
Animats(3)

Popular/hot comments

>>43979123 #

←back to thread

Flattening Rust’s learning curve

(corrode.dev)

1. dmitrygr ◴[13 May 25 23:24 UTC] No.43978986[source]▶

>>43978435 (OP) #

> Treat the borrow checker as a co-author, not an adversary

Why would I pair-program with someone who doesn’t understand doubly-linked lists?

replies(6): >>43979041 #>>43979123 #>>43979152 #>>43980150 #>>43980304 #>>43982624 #

2. dwattttt ◴[13 May 25 23:31 UTC] No.43979041[source]▶

>>43978986 (TP) #

I'd rather pair program with someone wary of double-linked lists, but is really hot on understanding ownership than the other way around.

replies(1): >>43979330 #

3. mre ◴[13 May 25 23:43 UTC] No.43979123[source]▶

>>43978986 (TP) #

For people who don't get the reference, this might be referring to the notoriously gnarly task of implementing a doubly-linked lists in Rust [1]

It is doable, just not as easy as in other languages because a production-grade linked-list is unsafe because Rust's ownership model fundamentally conflicts with the doubly-linked structure. Each node in a doubly-linked list needs to point to both its next and previous nodes, but Rust's ownership rules don't easily allow for multiple owners of the same data or circular references.

You can implement one in safe Rust using Rc<RefCell<Node>> (reference counting with interior mutability), but that adds runtime overhead and isn't as performant. Or you can use raw pointers with unsafe code, which is what most production implementations do, including the standard library's LinkedList.

https://rust-unofficial.github.io/too-many-lists/

replies(4): >>43979377 #>>43979467 #>>43980233 #>>43980462 #

4. pornel ◴[13 May 25 23:48 UTC] No.43979152[source]▶

>>43978986 (TP) #

So that you learn that loaning is for giving temporary shared^exclusive access within a statically-known scope, and not for storing data.

Trying to construct permanent data structures using non-owning references is a very common novice mistake in Rust. It's similar to how users coming from GC languages may expect pointers to local variables to stay valid forever, even after leaving the scope/function.

Just like in C you need to know when malloc is necessary, in Rust you need to know when self-contained/owning types are necessary.

replies(2): >>43979733 #>>43982547 #

5. ◴[14 May 25 00:12 UTC] No.43979330[source]▶

>>43979041 #

6. ◴[14 May 25 00:18 UTC] No.43979377[source]▶

>>43979123 #

7. Animats ◴[14 May 25 00:34 UTC] No.43979467[source]▶

>>43979123 #

Rust still needs a way out of that mess. It's conceptually possible to have compile time checking for this. Think of RefCell/Weak and .upgrade() and .borrow() being checked at compile time.

I've discussed this with some of the Rust devs. The trouble is traits. You'd need to know if a trait function could borrow one of its parameters, or something referenced by one of its parameters. This requires analysis that can't be done until after generics have been expanded. Or a lot more attributes on trait parameters. This is a lot of heavy machinery to solve a minor problem.

replies(2): >>43980106 #>>43981122 #

8. mplanchard ◴[14 May 25 01:22 UTC] No.43979733[source]▶

>>43979152 #

The biggest thing I’ve run into where I really want self-referential types is for work that I want to perform once and then cache, while still needing access to the original data.

An example: parsing a cookie header to get cookie names and values.

In that case, I settled on storing indexes indicating the ranges of each key and value instead of string slices, but it’s obviously a bit more error prone and hard to read. Benchmarking showed this to be almost twice as fast as cloning the values out into owned strings, so it was worth it, given it is in a hot path.

I do wish it were easier though. I know there are ways around this with Pin, but it’s very confusing IMO, and still you have to work with pointers rather than just having a &str.

9. umanwizard ◴[14 May 25 02:29 UTC] No.43980106{3}[source]▶

>>43979467 #

> Rust still needs a way out of that mess.

In practice, it really doesn't. The difficulty of implementing doubly linked lists has not stopped people from productively writing millions of lines of Rust in the real world. Most programmers spend less than 0.1% of their time reimplementing linked data structures; rust is pretty useful for the other 99.9%.

replies(1): >>43980848 #

10. lmm ◴[14 May 25 02:37 UTC] No.43980150[source]▶

>>43978986 (TP) #

Because you care about productivity and safety more than l33t h4x0r hazing rituals?

11. worik ◴[14 May 25 02:50 UTC] No.43980233[source]▶

>>43979123 #

I am working on a code base, that among its many glories and poo balls every list is a doubly linked list.

Stop!

If you are using a doubly linked list you (probably) do not have to, or want to.

There is almost no case where you need to traverse a list in both directions (do you want a tree?)

A doubly linked list wastes memory with the back links that you do not need.

A singly linked list is trivial to reason about: There is this node and the rest. A doubly linked list more than doubles that cognitive load.

Think! Spend time carefully reasoning about the data structures you are using. You will not need that complicated, wasteful, doubly linked list

replies(2): >>43980277 #>>43981044 #

12. dmitrygr ◴[14 May 25 02:59 UTC] No.43980277{3}[source]▶

>>43980233 #

> There is almost no case where you need to traverse a list in both directions

But you might need to remove a given element that you have a pointer to in O(1), which a singly linked list will not do

replies(2): >>43980367 #>>43980371 #

13. Ar-Curunir ◴[14 May 25 03:02 UTC] No.43980304[source]▶

>>43978986 (TP) #

Because most of my code is not doubly-linked lists!

14. dwattttt ◴[14 May 25 03:13 UTC] No.43980367{4}[source]▶

>>43980277 #

If that's a specific use case you need to handle, it's O(1) again if you have a pointer to both the node to be removed and the previous node.

Whether it's more efficient to carry a second pointer around when manipulating the list, or store a second pointer in every list node (aka double linked list) is up to your problem space.

Or whether an O(n) removal is acceptable.

15. MeetingsBrowser ◴[14 May 25 03:13 UTC] No.43980371{4}[source]▶

>>43980277 #

Getting the pointer to that element means randomly hopping around the heap to traverse the list though.

Linked lists are perfect for inserting/deleting nodes, as long as you never need to traverse the list or access any specific node.

replies(1): >>43992141 #

16. sbrother ◴[14 May 25 03:30 UTC] No.43980462[source]▶

>>43979123 #

Apologies since I have not taken the time to learn rust yet, but I've written a lot of modern C++. Is the ownership model kind of like std::unique_ptr and std::move, and `Rc<RefCell<Node>>` the same idea as `std::shared_ptr`? But less idiomatic? Or do I have the wrong idea?

replies(1): >>43980810 #

17. khuey ◴[14 May 25 04:28 UTC] No.43980810{3}[source]▶

>>43980462 #

Not really, because Rust enforces a "many readers or one writer" invariant on everything that has no C++ equivalent. That invariant is precisely what makes the doubly-linked list case hard (because every interior node in the list would be readable from two places, which means it can never be written to).

18. Animats ◴[14 May 25 04:33 UTC] No.43980848{4}[source]▶

>>43980106 #

Doubly linked lists are rare, but backlinks to the owner are often needed. It's the same problem, mostly.

replies(1): >>43983559 #

19. ◴[14 May 25 05:11 UTC] No.43981044{3}[source]▶

>>43980233 #

20. bigstrat2003 ◴[14 May 25 05:27 UTC] No.43981122{3}[source]▶

>>43979467 #

> Rust still needs a way out of that mess.

It has one: use raw pointers and unsafe. People are way too afraid of unsafe, it's there specifically to be used when needed.

21. pjmlp ◴[14 May 25 09:24 UTC] No.43982547[source]▶

>>43979152 #

Note that some users of GC languages that support stack allocation, are used that it is a compiler error trying to have such a pointer/reference.

D example, https://godbolt.org/z/bbfbeb19a

> Error: returning `& my_value` escapes a reference to local variable `my_value`

C# example, https://godbolt.org/z/Y8MfYMMrT

> error CS8168: Cannot return local 'num' by reference because it is not a ref local

22. scotty79 ◴[14 May 25 09:38 UTC] No.43982624[source]▶

>>43978986 (TP) #

It's not that it doesn't understand doubly linked list. It's just that you don't understand their consequences and have no objections against turning your program state briefly into inconsistent bullshit to facilitate them. The compiler minds. Unless you use Rc<>. That's what this language has for expressing inconsistency. Or unsafe {} if you are cocky. Borrows are not named pointers for a reason.

23. mplanchard ◴[14 May 25 12:10 UTC] No.43983559{5}[source]▶

>>43980848 #

Backlinks work fine with weak Arc references, don’t they?

replies(1): >>43987311 #

24. Animats ◴[14 May 25 17:47 UTC] No.43987311{6}[source]▶

>>43983559 #

Yes. But the Arc has to wrap a Mutex, which means you have to lock to get access. It's a dual of the Rc/RefCell/borrow mechanism.

The trouble with calling .lock() is that there is a potential for deadlock. There are some people working on static analysis for deadlock prevention, which is a dual of the static analysis for double borrow protection problem. We're maybe a PhD thesis or two from a solution. Here's some current research, out of Shanghai.[1] Outlines the theory, but code does not yet seem to be available.

[1] https://arxiv.org/pdf/2401.01114

25. dmitrygr ◴[15 May 25 05:32 UTC] No.43992141{5}[source]▶

>>43980371 #

You’re assuming no other data structure points to the element. It may. Example: implement a cache.

Each element is: key, value, linked list node for hash table bucket, linked list node for LRU. Hash table to look up element. Element is both a member of hash table and of linked list. Linked list is used as LRU for feeling memory when needed.

LRU never traversed but often needs removal and reinsertion.

↑