
267 points lawik | 2 comments
jhgg ◴[] No.42189283[source]
When I worked at Discord, we used BEAM hot code loading pretty extensively and built a bunch of tooling around it to apply and track hot-patches to nodes (which in turn could update the code on >100M processes in the system). It allowed us to deploy hot-fixes to our stateful real-time system in minutes (a full-tilt deploy could complete in a matter of seconds), rather than the usual ~hour-long deploy cycle. We generally only used it for "emergency" updates though.

The tooling let us patch multiple modules at a time. It basically wrapped `:rpc.call/4` and `Code.eval_string/1` to propagate the update across the cluster, which is to say, the hot-patch was deployed entirely over Erlang's built-in distribution.
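To make the mechanism concrete, here is a minimal sketch of that kind of wrapper. It is not Discord's tooling: the `HotPatch` module, its `push/2` function, and the `Demo.Greeter` module in the usage note are hypothetical names, and the sketch assumes the patch arrives as a string of Elixir source containing one or more `defmodule` blocks.

```elixir
defmodule HotPatch do
  # Hypothetical helper, for illustration only.
  # Evaluates `source` on each node in `nodes` via :rpc.call/4 and
  # Code.eval_string/1, so the patch travels over Erlang distribution.
  # Returns a map of node => :ok | {:error, reason}.
  def push(source, nodes \\ [node() | Node.list()]) when is_binary(source) do
    Map.new(nodes, fn n ->
      case :rpc.call(n, Code, :eval_string, [source]) do
        # Unreachable node, or the evaluation raised on the remote side.
        {:badrpc, reason} -> {n, {:error, reason}}
        # Code.eval_string/1 returns {result, binding} on success.
        {_result, _binding} -> {n, :ok}
      end
    end)
  end
end
```

Calling `HotPatch.push("defmodule Demo.Greeter do\n  def hello, do: :patched\nend")` would redefine `Demo.Greeter` on the local node and every connected node. A real tool would presumably also record what was patched and where, which this sketch leaves out.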

1. davisp ◴[] No.42189462[source]
This matches my experience. I spent a decade operating Erlang clusters, and using hot code upgrades is a superpower for debugging a whole class of hard-to-track bugs. Although, without the tracking for cluster state it can be its own footgun when a hotpatch gets unpatched during a code deploy.

As for relups, I once tried starting a project to make them easier but eventually decided that the number of bazookas pointed at each and every toe made them basically a non-starter for anything that isn't trivial. And if it's trivial, it was already covered by the nl (network load: send a local module to all nodes in the cluster and hot load it) style tooling.
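For readers who have not used it, the nl-style load mentioned above amounts to roughly the following. This is a sketch under assumptions, not the shell's actual implementation: `NetLoad` and its `nl/2` function are hypothetical names, and the module is assumed to have a compiled `.beam` on the local code path. The Erlang shell and IEx both ship their own `nl` helpers for this.

```elixir
defmodule NetLoad do
  # Hypothetical helper, for illustration only.
  # Reads the compiled object code for `module` from the local node and
  # asks every node in `nodes` to load that exact binary.
  def nl(module, nodes \\ [node() | Node.list()]) when is_atom(module) do
    case :code.get_object_code(module) do
      {^module, binary, filename} ->
        # replies holds one :code.load_binary/3 result per node that
        # answered; bad_nodes lists the nodes that did not.
        {replies, bad_nodes} =
          :rpc.multicall(nodes, :code, :load_binary, [module, filename, binary])

        {:ok, replies, bad_nodes}

      :error ->
        # No .beam file for this module on the local code path.
        {:error, :no_object_code}
    end
  end
end
```

Because the binary is shipped as-is, nothing on disk changes on the remote nodes, which is why a later deploy or restart silently reverts the patch.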

2. scotty79 ◴[] No.42192907[source]
> Although, without the tracking for cluster state it can be its own footgun when a hotpatch gets unpatched during a code deploy.

This and everything else said sounds so much like the PHP+FTP workflow. It's so good.