
Go channels are bad

(www.jtolds.com)
298 points by jtolds | 16 comments
hacknat ◴[] No.11211002[source]
I think I've just come to accept that synchronization is the pain point in any language. It's callbacks, promises, and the single event loop in nodejs. It's channels in golang.

No one can come up with a single abstraction for synchronization without it failing in some regard. I code in go quite a bit and I just try to avoid synchronization like the plague. Are there gripes I have with the language? Sure, CS theory states that a thread-safe hash table can perform just about as well as a non-thread-safe one, so why don't we have one in go? However...

Coming up with a valid case where a language's synchronization primitive fails and then flaming it as an anti-pattern (for the clicks and the attention, I presume) is trolling and stupid.

replies(3): >>11211077 #>>11211292 #>>11211863 #
1. yetihehe ◴[] No.11211077[source]
> No one can come up with a single abstraction for synchronization without it failing in some regard.

Erlang did. Or at least it comes as close as possible.

replies(2): >>11211145 #>>11211260 #
2. hacknat ◴[] No.11211145[source]
I'm not saying Erlang isn't great, but if you need to pass a large data structure around between Erlang processes, then copying it on every message send starts to cost a lot and you need to share memory. You can do it in Erlang, but I'd hardly call it great, and you're avoiding the sync primitive that Erlang offers.
replies(2): >>11211256 #>>11211361 #
3. catnaroek ◴[] No.11211256[source]
How about Rust's “share by transferring ownership”?

(0) In the general case, whatever object you give to a third party, you don't own anymore. And the type checker enforces this.

(1) Unless the object's type supports shallow copying, in which case, you get to keep a usable copy after the move.

(2) If the object's type doesn't support shallow copying, but supports deep cloning, you can also keep a copy [well, clone], but only if you explicitly request it.

This ensures that communication is always safe, and never more expensive than it needs to be.
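
For a concrete feel of (0)-(2), here is a rough sketch using std::sync::mpsc and std::thread; the Payload struct and the channel setup are made up purely for illustration:

    use std::sync::mpsc;
    use std::thread;

    // No Copy and no Clone here: once a Payload is sent, the sender can't use it anymore.
    struct Payload {
        data: Vec<u8>,
    }

    fn main() {
        let (tx, rx) = mpsc::channel();

        let consumer = thread::spawn(move || {
            let p: Payload = rx.recv().unwrap();
            println!("received {} bytes", p.data.len());
        });

        let payload = Payload { data: vec![1, 2, 3] };
        tx.send(payload).unwrap(); // ownership moves into the channel; nothing is copied
        // payload.data;           // compile error: payload was moved by the send above

        let x: u64 = 7;            // u64 is shallow-copyable (case 1 above):
        let y = x;                 // a "move" leaves a usable copy behind
        println!("both still usable: {} {}", x, y);

        consumer.join().unwrap();
    }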

---

Sorry, I can't post a proper reply because I'm “submitting too fast”, so I'll reply here...

The solution consists of multiple steps:

(0) Wrap the resource in an RwLock [read-write lock: http://doc.rust-lang.org/std/sync/struct.RwLock.html], which can be locked either by multiple readers or by a single writer.

(1) The RWLock itself can't be cloned, so wrap it in an Arc [atomically reference-counted pointer: http://doc.rust-lang.org/std/sync/struct.Arc.html], which can be cloned.

(2) Clone and send to as many parties as you wish.
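
A rough sketch of those three steps, assuming the shared resource is just a Vec and the number of parties is arbitrary:

    use std::sync::{Arc, RwLock};
    use std::thread;

    fn main() {
        // (0) Wrap the shared resource in an RwLock.
        // (1) Wrap the RwLock in an Arc so the handle itself can be cloned.
        let shared = Arc::new(RwLock::new(vec![1, 2, 3]));

        // (2) Clone the Arc and hand one handle to each party.
        let handles: Vec<_> = (0..4)
            .map(|i| {
                let shared = Arc::clone(&shared);
                thread::spawn(move || {
                    if i == 0 {
                        shared.write().unwrap().push(99); // one writer at a time
                    } else {
                        let v = shared.read().unwrap();   // any number of readers at once
                        println!("reader {} sees {} elements", i, v.len());
                    }
                })
            })
            .collect();

        for h in handles {
            h.join().unwrap();
        }
    }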

---

I still can't post a proper reply, so...

Rust's ownership and borrowing system is precisely what makes RwLock and Arc work correctly.

replies(1): >>11211307 #
4. jerf ◴[] No.11211260[source]
I've been bitten by the fact that Erlang lacks a channel-like primitive. You've got half a dozen "pool" abstractions on GitHub because it's actually sorta hard to run a pool on pure asynchronous messages when there is absolutely no way to send a message out to "somebody", the way Go channels can have multiple listeners. I know that would only work on a local node, but there are already a couple of functions that have penetrated that abstraction anyhow.

You also have to deal with mailboxes filling up, still have problems with single processes becoming bottlenecks, and the whole system is pervasively dynamically typed which is fine until it isn't.

It is pretty good, but it's not the best possible. (Neither is Go. I still like Erlang's default of async messages better in a lot of ways. I wish there was a way to get synchronous messages to multiple possible listeners somehow in Erlang, but I still think async is the better default.)

replies(1): >>11211606 #
5. hacknat ◴[] No.11211307{3}[source]
What if you want multiple readers at once, and a writer thrown in once in a while?

Edit:

Okay, my point was that the sync primitives of most languages alone can't save you and you're using RWLock in your example, so clearly ownership by itself doesn't solve everything, right? That's the point I'm trying to make.

Edit2:

Hmm, I'll have to check that out. I don't know that I would call Rust's ownership model super easy to reason about, but it is nice that the compiler prevents you from doing so much stupid $#^&.

replies(2): >>11211466 #>>11213916 #
6. felixgallo ◴[] No.11211361[source]
Erlang lifts sufficiently large binaries out of the process heap and passes them around by reference, which isn't perfect but pragmatically helps a lot with that problem.
7. pcwalton ◴[] No.11211466{4}[source]
> Okay, my point was that the sync primitives of most languages alone can't save you and you're using RWLock in your example, so clearly ownership by itself doesn't solve everything, right?

The thing is that Rust ensures that you take the locks properly. It's a compile-time error to forget to take the lock or to forget to release the lock†. You can't access the guarded data without doing that.

† For lock release, it's technically possible to hold onto a lock forever by intentionally creating cycles and leaking, but you really have to go out of your way to do so and it never happens in practice.
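
A small sketch of what "can't access the guarded data" looks like in practice; the RwLock-wrapped counter is made up for illustration:

    use std::sync::RwLock;

    fn main() {
        let counter = RwLock::new(0_u32);

        {
            // The inner u32 is only reachable through the guard that write() returns.
            let mut guard = counter.write().unwrap();
            *guard += 1;
        } // The guard is dropped here, which releases the lock (RAII), so
          // "forgot to unlock" is not something you can write by accident.

        // counter += 1; // compile error: there is no way to reach the data without a lock
        println!("{}", counter.read().unwrap());
    }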

8. yetihehe ◴[] No.11211606[source]
> You've got half-a-dozen "pool" abstractions on github because it's actually sorta hard to run a pool on pure asynchronous messages when there is absolutely no way to send a message out to "somebody"

You can store receivers in an ETS table and implement any type of selection algorithm you want, or have some process which selects workers. There is no default method, because one default method is not good for everyone and people will complain that it's not good for them. Implementing pools is easy in Erlang; I've done tailored implementations for several projects.

> You also have to deal with mailboxes filling up

Yeah, unless you implement a back-pressure mechanism, like waiting for confirmation of receipt. In ALL systems you have to deal with queues filling up.

> I wish there was a way to get synchronous messages to multiple possible listeners somehow in Erlang

You can implement a receiver which waits for messages and exits when all are received or after a timeout; it's trivial in Erlang, but I haven't needed it yet. Here is a simple example:

    %% Collect {special, Data} messages until Num of them have arrived,
    %% or give up after 5 seconds of silence and return what we have.
    receive_multi(Acc, 0) ->
        Acc;
    receive_multi(Acc, Num) ->
        receive
            {special, Data} ->
                receive_multi([Data | Acc], Num - 1)
        after 5000 ->
            Acc
        end.
replies(2): >>11211833 #>>11212236 #
9. jerf ◴[] No.11211833{3}[source]
"You can store receivers in ets table and implement any type of selection algorithm you want or have some process which selects workers."

Your process that selects workers has no mechanism for telling which are already busy.

It is easy to implement a pool in Erlang where you may accidentally select a busy worker when there's a free one available. Unfortunately, due to the nature of the network and the way computations work at scale, that's actually worse than it sounds; if one of the pool members gets tied up, legitimately or otherwise, in a long request, it will keep getting requests that it ignores until done, unnecessarily upping the latency of those other requests, possibly past the tolerance of the rest of the system.

"You can implement receiver which waits for messages and exits when all are received or after timeout, it's trivial in erlang but I haven't needed it yet."

That's the opposite of the direction I was talking about. You can't turn that around trivially. You can fling N messages out to N listeners, you can fling a message out to what always boils down to a random one of N listeners (any attempt to be more clever requires coordination, which requires creating a one-process bottleneck), but there is no way to say "Here's a message, let the first one of these N processes that gets to it take it".

You wouldn't have so many pool implementations if they weren't trying to get around this problem. It would actually be relatively easy to solve in the runtime but you can't bodge it in at the Erlang level; you simply lack the necessary primitives.

replies(2): >>11212031 #>>11213639 #
10. yetihehe ◴[] No.11212031{4}[source]
Then it's even easier: the pool selector just hands out free workers and deletes them from the queue. When a worker is free, it sends an "I'm free" message and gets added back to the "free" pool. Yes, it means "one master process is a choke point", but that's only a problem when your tasks are so short that sending messages is slower than doing the work. But then sending messages is probably the wrong way to do those tasks. There are so many pool implementations because there are many possible solutions depending on what exact problem you have.
replies(1): >>11212177 #
11. jerf ◴[] No.11212177{5}[source]
"Yes, it will be "one master process is a choke point" but it's only a problem when your tasks are so short that sending messages is slower than doing the work."

You're simply reiterating my point now, while still sounding like you think you're disagreeing. Yes, if you drop some of the requirements, the problem gets a lot easier. Unfortunately these are not such bizarre requirements, and Erlang tends to be positioned in exactly the spaces where they are most likely to come up.

"But then probably sending messages is the wrong way to do those tasks."

That translates to "Erlang is the wrong solution if that's your problem". Since my entire point all along here has been that Erlang is not the magic silver bullet, that's not a big problem for me.

12. querulous ◴[] No.11212236{3}[source]
Message sending has backpressure built in: as a mailbox's size increases, it gets more and more expensive (in reductions, the currency Erlang uses for scheduling processes) for a process to send a message to it.
13. lgas ◴[] No.11213639{4}[source]
Is there any reason you couldn't just have the workers request work from the pool process when they are ready for work instead of trying to push it to them?
14. azth ◴[] No.11213916{4}[source]
> Hmm, I'll have to check that out. I don't know that I would call Rust's ownership model super easy to reason about, but it is nice that the compiler prevents you from doing so much stupid $#^&.

It's much better to get compile-time errors than to deal with very hard to reproduce data races.
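
As a small illustration (using scoped threads from std; the counter is made up), the racy version does not even compile, and the version that does compile has the synchronization spelled out:

    use std::sync::atomic::{AtomicUsize, Ordering};
    use std::thread;

    fn main() {
        // Bumping a plain "let mut count = 0;" from several threads at once is
        // rejected at compile time, so the racy version never gets to run.
        let count = AtomicUsize::new(0);

        thread::scope(|s| {
            for _ in 0..4 {
                s.spawn(|| {
                    count.fetch_add(1, Ordering::Relaxed); // shared mutation the compiler accepts
                });
            }
        });

        println!("{}", count.load(Ordering::Relaxed));
    }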

replies(1): >>11213942 #
15. kazinator ◴[] No.11213942{5}[source]
Only, as usual, in situations when all else is equal.

By the way, on a related note, data races themselves are easier to reproduce than the visible negative consequences of those races on the execution of the program. That's the basis of tools like Valgrind's Helgrind. That is to say, we can determine that some data is being accessed without a consistently held lock even when that access is working fine by dumb luck. We don't need an accident to prove that racing was going on, in other words. :)

replies(1): >>11214691 #
16. catnaroek ◴[] No.11214691{6}[source]
> By the way, on a related note, data races themselves are easier to reproduce than the visible negative consequences of those races on the execution of that program.

Perhaps, but a data race by itself isn't sufficiently loud to catch my attention (no idea about yours), unless it consistently has visible consequences during debugging - preferably not too long after the data race itself takes place.

> That is to say, we can [emphasis mine] determine that some data is being accessed without a consistently held lock even when that access is working fine by dumb luck.

By “we”, do you mean human beings or computers? And, by “can”, do you mean “in theory” or “in practice”? Also, “only when we're lucky” or “reliably”?

> We don't need an accident to prove that racing was going on, in other words.

What I want to prove is the opposite - that there are no races going on.