Using Erlang hot code updates

1. rozap ◴[19 Nov 24 22:47 UTC] No.42188905[source]▶

I used to work on a pretty big elixir project that had many clients with long lived connections that ran jobs that weren't easily resumable. Our company had a language agnostic deployment strategy based on docker, etc which meant we couldn't do hot code updates even though they would have saved our customers some headache.

Honestly I wish we had had the ability to do both. Sometimes a change is so tricky that the argument that "hot code updates are complicated and it'll cause more issues than it will solve" is very true, and maybe a deploy that forces everyone to reconnect is best for that sort of change. But often times we'd deploy some mundane thing where you don't have to worry about upgrading state in a running gen server or whatever, and it'd be nice to have minimal impact.

Obviously that's even more complexity piled onto the system, but every time I pushed some minor change and caused a retry that (in a perfect world at least...) didn't need to retry, I winced a bit.

replies(1): >>42189011 #

2. ElevenLathe ◴[19 Nov 24 23:03 UTC] No.42189011[source]▶

>>42188905 (TP) #

I work in gaming and have experienced the opposite side of this: many of our services have more than one "kind" of update, each with its own caveats and gotchas, so that it takes an expert in the whole system (meaning really almost ALL of our systems) to determine which would be the least impactful possible one if nothing goes wrong. Not only is there a lot of complexity and lost productivity in managing this process ("Are we sure this change is zero downtime-able?" "Does it need a schema reload?" etc) but we often get it wrong. The result is that, in practice, anything even remotely questionable gets done during a full downtime where we kick players out.

It's sometimes helpful to have the option to just restart one little corner of the full system, to minimize impact, but it is helpful to customer experience (if we don't screw it up) and very much the opposite for developer experience (it's crippling to velocity to need to discuss each change with multiple experts and determine the appropriate type of release).

replies(1): >>42189179 #

3. rozap ◴[19 Nov 24 23:26 UTC] No.42189179[source]▶

>>42189011 #

No doubt that traditional deployments are much better for dev experience at (sometimes) the cost of customer experience.

replies(2): >>42189324 #>>42189443 #

4. ElevenLathe ◴[19 Nov 24 23:48 UTC] No.42189324{3}[source]▶

>>42189179 #

One thing that would help both is deployment automation that could examine the desired changes and work out the best way to deploy them without human input. For distributed systems, this would require rock-solid contracts between individual services for all relevant scenarios, and would also require each update to be specified completely in code (or at least something machine readable), ideally in one commit. This is a level of maturity that seems elusive in gaming.

5. toast0 ◴[20 Nov 24 00:05 UTC] No.42189443{3}[source]▶

>>42189179 #

I disagree. Hot loading means I can have a very short cycle on an issue, and move onto something else. Having to think about the implications of hot loading is worth it for the rapid cycle time and not having to hold as many changes in my mind at once.