Using Erlang hot code updates

(underjord.io)

268 points lawik | 2 comments | 19 Nov 24 20:29 UTC


jhgg ◴[19 Nov 24 23:43 UTC] No.42189283[source]▶

When I worked at Discord, we used BEAM hot code loading pretty extensively and built a bunch of tooling around it to apply and track hot-patches to nodes (which in turn could update the code on >100M processes in the system). It allowed us to deploy hot-fixes in minutes (a full-tilt deploy could complete in a matter of seconds) to our stateful real-time system, rather than going through the usual ~hour-long deploy cycle. We generally only used it for "emergency" updates though.

The tooling let us patch multiple modules at a time. It basically wrapped `:rpc.call/4` and `Code.eval_string/1` to propagate the update across the cluster, which is to say, the hot-patch was deployed entirely over Erlang's built-in distribution.
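For readers who have not seen this pattern, here is a minimal sketch of how a wrapper around `:rpc.call/4` and `Code.eval_string/1` might look. It is not Discord's actual tooling: the `HotPatch` module, the `apply_patch/2` function, and the `Greeter` example module are hypothetical names used purely for illustration, and real tooling would also need the tracking mentioned above.

```elixir
defmodule HotPatch do
  # Hypothetical helper, not the tooling described in the comment.
  # Evaluates `source` (Elixir code, typically one or more `defmodule`
  # blocks) on every given node and returns a map of node => result.
  def apply_patch(source, nodes \\ [Node.self() | Node.list()]) when is_binary(source) do
    for node <- nodes, into: %{} do
      result =
        # :rpc.call/4 returns the remote return value, or
        # {:badrpc, reason} if the call fails or the code raises.
        case :rpc.call(node, Code, :eval_string, [source]) do
          {:badrpc, reason} -> {:error, reason}
          {_value, _bindings} -> :ok
        end

      {node, result}
    end
  end
end

# Example patch: (re)defines a module on every node that evaluates it.
patch = """
defmodule Greeter do
  def hello, do: "patched"
end
"""

HotPatch.apply_patch(patch)
```

Because the patch travels as a plain string over Erlang distribution, the target nodes need no access to the build machine's filesystem. The flip side is that nothing on disk changes, so a restarted node comes back up with the unpatched code.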

replies(2): >>42189462 #>>42191479 #

1. davisp ◴[20 Nov 24 00:08 UTC] No.42189462[source]▶

>>42189283 #

This matches my experience. I spent a decade operating Erlang clusters, and using hot code upgrades is a superpower for debugging a whole class of hard-to-track bugs. Although, without the tracking for cluster state it can be its own footgun when a hotpatch gets unpatched during a code deploy.

As for relups, I once tried starting a project to make them easier but eventually decided that the number of bazookas pointed at each and every toe made them basically a non-starter for anything that isn’t trivial. And if it's trivial, it was already covered by the nl (network load: send a local module to all nodes in the cluster and hot load it) style tooling.
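As a rough illustration of what nl-style tooling does, the sketch below reads a module's compiled object code on the local node and loads it on other nodes with `:code.load_binary/3`. The `NetLoad` module and its `load/2` function are hypothetical names for this example; the Erlang shell's `nl/1` and IEx's `nl/2` helpers provide the real thing.

```elixir
defmodule NetLoad do
  # Hypothetical helper illustrating the "network load" idea.
  # Sends the local object code for `module` to each node and hot loads it.
  def load(module, nodes \\ Node.list()) when is_atom(module) do
    case :code.get_object_code(module) do
      {^module, binary, filename} ->
        for node <- nodes, into: %{} do
          # On success the remote call returns {:module, module};
          # otherwise {:error, reason} or {:badrpc, reason}.
          {node, :rpc.call(node, :code, :load_binary, [module, filename, binary])}
        end

      :error ->
        # No .beam file for this module was found on the local code path.
        {:error, :no_object_code}
    end
  end
end
```

Calling `NetLoad.load(MyApp.Worker)` (where `MyApp.Worker` is a placeholder module name) would push the locally compiled version to every connected node. As with any hot load, this changes only what is in memory on those nodes.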

replies(3): >>42189797 #>>42189898 #>>42192907 #

2. scotty79 ◴[20 Nov 24 11:35 UTC] No.42192907[source]▶

>>42189462 (TP) #

> Although, without the tracking for cluster state it can be its own footgun when a hotpatch gets unpatched during a code deploy.

This and everything else said sounds so much like the PHP+FTP workflow. It's so good.
