
169 points hunvreus | 7 comments
londons_explore ◴[] No.43653973[source]
Unmentioned: there are serious security issues with cloning the memory of code that wasn't designed for it.

For example, an SSL library might have pre-calculated the random nonce for the next incoming SSL connection.

If you clone the VM containing a process using that library, both child VMs will now use the same nonce. Some crypto is completely broken if a nonce is reused.
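
To make the nonce-reuse failure concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `ToyTlsState` is a hypothetical stand-in for a library that pre-computes its next nonce, `keystream` is a toy SHA-256 counter-mode construction rather than a real cipher, and `copy.deepcopy` stands in for a VM memory snapshot. It only shows the general property that two messages encrypted under the same key and nonce with a stream-style cipher leak the XOR of their plaintexts.

```python
import copy
import hashlib
import os


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key+nonce+counter. Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


class ToyTlsState:
    """Hypothetical library state that pre-computes the next nonce."""

    def __init__(self, key: bytes):
        self.key = key
        self.next_nonce = os.urandom(12)  # generated ahead of time

    def encrypt(self, plaintext: bytes) -> bytes:
        # Use the pre-computed nonce, then prepare a fresh one for next time.
        nonce, self.next_nonce = self.next_nonce, os.urandom(12)
        return xor(plaintext, keystream(self.key, nonce, len(plaintext)))


original = ToyTlsState(key=os.urandom(32))
clone = copy.deepcopy(original)  # stands in for cloning the VM's memory

msg_a = b"transfer $100 to alice"
msg_b = b"transfer $999 to bobby"  # same length as msg_a

ct_a = original.encrypt(msg_a)  # both sides consume the SAME pre-computed nonce
ct_b = clone.encrypt(msg_b)

# The keystream cancels out: an eavesdropper learns msg_a XOR msg_b
# without ever knowing the key.
leak = xor(ct_a, ct_b)
assert leak == xor(msg_a, msg_b)

# Knowing (or guessing) one plaintext then reveals the other.
assert xor(leak, msg_a) == msg_b
```

In this sketch only the first message after the clone is affected, because each copy draws a fresh nonce afterwards. A library whose random number generator state is itself cloned could keep producing identical values for longer.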

replies(7): >>43654026 #>>43654396 #>>43654513 #>>43654702 #>>43654894 #>>43655157 #>>43657321 #
1. sunshinekitty ◴[] No.43654513[source]
GCP’s ‘live migrations’ have been doing this for close to a decade or more. Must not be that big of a problem.
replies(2): >>43654524 #>>43657289 #
2. londons_explore ◴[] No.43654524[source]
It isn't a problem if you guarantee only one child of the clone lives on - which GCP does.
replies(1): >>43654845 #
3. matt-p ◴[] No.43654845[source]
How do we know that isn't enforced here too?
replies(1): >>43655491 #
4. jsnell ◴[] No.43655491{3}[source]
Because their main selling point is to run the copies concurrently with the original.
5. oceanplexian ◴[] No.43657289[source]
Live migration on VMware has been a thing since before Google even had a cloud service.
replies(1): >>43657602 #
6. tanelpoder ◴[] No.43657602[source]
VMware even has a vSphere Fault Tolerance product that creates a "live shadow instance" of a VM that mirrors the primary virtual machine (with up to 4 vCPUs). So you can do a quick failover in an "immediate planned" failover case, but apparently even when the primary DB goes down. I guess this might work when some external system (like a storage array) goes down in the primary: you can just switch to the other VM (with the latest memory/CPU state), replay that I/O there, and keep going... But if it actually does work after a hard crash of the primary, then they must be doing lots of reasoning about internal state change ordering and external device side effects (somewhat like Antithesis, but for a different purpose). Back in the day, they supported only uniprocessor VMs (with something called vLockstep) and later up to 4 vCPUs with something called Fast Checkpointing.

I've always wanted to test this out for fun, but 15 years have gone by and I've never gotten around to it...

https://www.vmware.com/products/cloud-infrastructure/vsphere...

replies(1): >>43657915 #
7. umachin ◴[] No.43657915{3}[source]
VMware has also had a patent on live VM cloning (they called it VMfork) for quite a few years now. I worked on the team that built related features. The feature itself was in the desktop product. https://blogs.vmware.com/euc/2016/02/horizon-7-view-instant-...

Live migration had some very cool demos. They would run an intensive workload, such as a game, then cause a crash, and the VM would resume with zero buffering.