
267 points by lawik | 6 comments
1. arnon ◴[] No.42188528[source]
A few years ago, the biggest problem with Erlang's hot code updates was getting the files updated on all of the nodes. Has this been solved or improved in any way?
replies(2): >>42188728 #>>42189232 #
2. comboy ◴[] No.42188728[source]
I don't think updating files is the problem. The biggest issue with hot code updates seems to be that they can create states that cannot be replicated in either release on its own.
replies(2): >>42188800 #>>42189026 #
3. ketralnis ◴[] No.42188800[source]
This is my experience. About 25% of the time I'd encounter a bug that's impossible to reproduce without both versions of the code in memory, and I'd end up restarting the node anyway, dropping requests in the process. Whereas if I'd architected around not having hot code updates, I could have built it in a way that never has to drop requests.
4. faizshah ◴[] No.42189026[source]
In general, you can save your team a lot of ops trouble just by periodically restarting your long-running services from scratch instead of trying to keep a process or container alive for a long time.

I’m still new to the Erlang/Elixir community and I haven’t run it in prod yet, but this is my experience coming from Java, Node, and Python.
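
On the periodic-restart point: if the service is managed by systemd, the policy can be expressed directly in the unit file. This is just a sketch; the service name and path are hypothetical, and `RuntimeMaxSec=` requires a reasonably recent systemd (v229+):

```ini
# myapp.service — restart the service from scratch once a day
[Service]
ExecStart=/srv/myapp/bin/myapp foreground
RuntimeMaxSec=86400
Restart=always
```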

5. toast0 ◴[] No.42189232[source]
There's about a thousand different ways to update files on servers?

You can build OS packages and push those however you like.

You can use rsync.

You could push the files over dist, if you want.

You could probably do something cool with bittorrent (maybe that trend is over?)

If you write Makefiles to push, you can use make -j X to get low-effort parallelization, which works OK if your node count isn't too big and you don't need updates to be as instant as possible.
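
A minimal sketch of that Makefile approach, assuming rsync and hypothetical host names/paths — each host is a phony target, so `make -j 8 push` fans the copies out eight at a time:

```make
# Hypothetical inventory and destination path; rsync assumed available.
HOSTS := app1 app2 app3

.PHONY: push $(HOSTS)

push: $(HOSTS)

$(HOSTS):
	rsync -az ebin/ $@:/srv/myapp/ebin/
```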

Erlang source and beam files don't tend to get very large, and most people's dist clusters aren't very large either; I don't think I've seen anyone posting large cluster numbers lately, but I'd be surprised if anyone was pushing to 10,000 nodes at once.

Assuming they're well connected, pushing to 10,000 nodes takes some prep, but not that much. If you're driving it from your laptop, you probably want an intermediate pusher node in your datacenter, so you can push once from home/office internet to the pusher node and then fork a bunch of pushers in the datacenter to push to the other hosts. If you've got multiple locations and you're feeling fancy, have a pusher node at each location and push to the pusher node nearest you; that pushes to the node at each location, and from there to the individual nodes.
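
The fan-out step on a pusher node can be sketched with plain xargs. Here `echo` stands in for the real rsync/ssh push, and hosts.txt is a hypothetical one-host-per-line inventory:

```shell
# Build a hypothetical host inventory, one target per line.
printf 'app1\napp2\napp3\n' > hosts.txt

# Push to up to 16 hosts in parallel; swap `echo` for the real
# rsync/ssh command when running this on an actual pusher node.
xargs -P 16 -I{} echo "rsync ebin/ to {}" < hosts.txt
```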

Other issues are more pressing, like making sure you write your code so it's hotload-friendly, and maybe testing that to confirm you won't use the immense power of hotloading to very rapidly crash all your server processes.

replies(1): >>42189502 #
6. samgranieri ◴[] No.42189502[source]
I think Twitter once cobbled together a BitTorrent-based deployment strategy for Capistrano called murder; that was a cool read from their eng blog back in the day.

I wish I had used a pusher node back when a colleague's video call was eating almost all the upstream bandwidth in the office while my bosses were giving a demo, and the fix I coded for an issue discovered during the demo couldn't be deployed via Capistrano.