169 points hunvreus | 22 comments
1. londons_explore ◴[] No.43653973[source]
Unmentioned: there are serious security issues with cloning the memory of code that was not designed for it.

For example, an SSL library might have pre-calculated the random nonce for the next incoming SSL connection.

If you clone the VM containing a process using that library, now both child VMs will use the same nonce. Some crypto is 100% broken open if a nonce is reused.
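
A minimal sketch of why a reused nonce can be fatal, assuming a stream-style cipher. The `keystream` function here is a toy built from SHA-256 for illustration only, not a real cipher or anything a particular SSL library does; the key, nonce, and messages are made up. With the same key and nonce, both clones produce the same keystream, so XORing the two ciphertexts cancels it out:

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    return xor(plaintext, keystream(key, nonce, len(plaintext)))


key = b"shared-session-key"    # hypothetical key present in both clones
nonce = b"precomputed-nonce"   # pre-calculated before the clone, so identical in both

msg_a = b"transfer 100 to alice"  # sent by clone A
msg_b = b"transfer 999 to mallo"  # sent by clone B (same length, for simplicity)

ct_a = encrypt(key, nonce, msg_a)
ct_b = encrypt(key, nonce, msg_b)

# An eavesdropper who never learns the key can still compute this:
leaked = xor(ct_a, ct_b)

# Knowing (or guessing) one plaintext then reveals the other.
recovered_b = xor(leaked, msg_a)
```

Here `leaked` equals the XOR of the two plaintexts, which is why knowing one message exposes the other.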

replies(7): >>43654026 #>>43654396 #>>43654513 #>>43654702 #>>43654894 #>>43655157 #>>43657321 #
2. generalizations ◴[] No.43654026[source]
Sounds like it would simply be inappropriate to clone & use a VM that's assuming its data is unique. This would also be true of other conditions, e.g. if you needed to spoof a MAC or IPv6 address & picked one randomly.
replies(1): >>43654077 #
3. londons_explore ◴[] No.43654077[source]
The problem is modern software is so fiendishly complicated there almost certainly is stuff like that in the code. The question is where, and does it matter?
replies(1): >>43654228 #
4. generalizations ◴[] No.43654228{3}[source]
And the last question is, can the parts with stuff like that be extracted from the rest and run separately?
5. hypeatei ◴[] No.43654396[source]
> might have pre-calculated the random nonce

Isn't this still a concern even if you're not pre-calculating way ahead of time? If you generate it when needed, it could still catch you at the wrong time (e.g. right before encryption, but right after nonce generation)

replies(1): >>43654654 #
6. sunshinekitty ◴[] No.43654513[source]
GCP’s ‘live migrations’ have been doing this for close to a decade or more. Must not be that big of a problem.
replies(2): >>43654524 #>>43657289 #
7. londons_explore ◴[] No.43654524[source]
It isn't a problem if you guarantee only one child of the clone lives on - which GCP does.
replies(1): >>43654845 #
8. zamadatix ◴[] No.43654654[source]
Unless your encryption and transport protocols are 100% stateless only 1 connection will actually be able to form, even if you duplicate the machine during connection creation.

The problem with pre-computing a bunch and keeping them in memory is brand new connections made post cloning would use the same list of nonces.
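
A rough sketch of that difference, with `copy.deepcopy` standing in for a VM memory snapshot. `NoncePool` and `on_demand_nonce` are hypothetical names, and the model covers only user-space state: it assumes the OS random source behind `secrets` is not itself duplicated, which a real VM clone would have to ensure separately.

```python
import copy
import secrets


class NoncePool:
    """Hypothetical library state: nonces generated ahead of time."""

    def __init__(self, size: int = 4):
        self._pool = [secrets.token_bytes(12) for _ in range(size)]

    def next_nonce(self) -> bytes:
        # Fall back to a fresh nonce once the pre-computed ones run out.
        if not self._pool:
            return secrets.token_bytes(12)
        return self._pool.pop()


def on_demand_nonce() -> bytes:
    """Generate a nonce only at the moment it is needed."""
    return secrets.token_bytes(12)


original = NoncePool()
clone = copy.deepcopy(original)  # stand-in for cloning the VM's memory

# New connections made after the clone draw from identical lists.
pooled_a = [original.next_nonce() for _ in range(4)]
pooled_b = [clone.next_nonce() for _ in range(4)]

# Nonces generated after the clone differ (under the assumption above).
fresh_a = on_demand_nonce()
fresh_b = on_demand_nonce()
```

In this model `pooled_a` and `pooled_b` are identical, while `fresh_a` and `fresh_b` are not.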

9. hedora ◴[] No.43654702[source]
I was about to say you were being paranoid, then I read the article. It hadn’t occurred to me that anyone would be so reckless!

The proposed workflow involves cloning your dev environment and sharing it with the internet.

At most places, that’s equivalent to publishing your production keys, or at least github credentials.

Even for open source projects where confidentiality doesn’t matter, there are issues like using cargo/npm/etc keys to launch supply chain attacks.

Your nonce attack is harder to pull off, but more devastating if the attacker can man in the middle things like dependency downloads.

10. matt-p ◴[] No.43654845{3}[source]
How do we know that isn't enforced here too?
replies(1): >>43655491 #
11. perching_aix ◴[] No.43654894[source]
I don't really follow, what's the issue with that? The two nodes will encrypt using the same key, so they can snoop at each other's traffic that they send out? Doesn't sound that big of a deal per se.
replies(2): >>43655173 #>>43655673 #
12. CompuIves ◴[] No.43655157[source]
Yes, that's right. The Firecracker team has written a fantastic doc about this as well: https://github.com/firecracker-microvm/firecracker/blob/main....

It's important to refresh entropy immediately after clone. Still, there can be code that didn't assume it could be cloned (even though there's always been `fork`, of course). Because of this, we don't live clone across workspaces for unlisted/private sandboxes and limit the use case to dev envs where no secrets are stored.
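
For the `fork` case, a minimal sketch of refreshing cached randomness in the child, using Python's `os.register_at_fork` (Unix only, hence the `hasattr` guard). `NoncePool` is a hypothetical class. This only covers process forks: a VM clone triggers no such hook inside the guest, so something else would have to call `refill()` after restore, and the kernel's own entropy would need refreshing too, which is out of scope here.

```python
import os
import secrets


class NoncePool:
    """Hypothetical cache of pre-computed nonces."""

    def __init__(self, size: int = 4):
        self.size = size
        self.refill()

    def refill(self) -> None:
        # Throw away everything generated before the fork/clone.
        self._pool = [secrets.token_bytes(12) for _ in range(self.size)]

    def snapshot(self) -> list:
        return list(self._pool)


pool = NoncePool()

# Ask the runtime to refill the pool in every forked child,
# so parent and child never share pre-computed nonces.
if hasattr(os, "register_at_fork"):
    os.register_at_fork(after_in_child=pool.refill)

# What a post-clone hook would do, called by hand here:
before = pool.snapshot()
pool.refill()
after = pool.snapshot()
```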

13. Rygian ◴[] No.43655173[source]
A nonce is not a key; it's a piece of random data that is meant to be used at most once.

If an attacker sees valid nonces on a VM, and knows of another VM sharing the same nonces, then your crypto on both* VMs becomes vulnerable to replay attacks.

*read: all

replies(2): >>43655417 #>>43656303 #
14. nodesocket ◴[] No.43655417{3}[source]
How would a replay attack work in production assuming multiple VMs share a nonce?
replies(1): >>43655794 #
15. jsnell ◴[] No.43655491{4}[source]
Because their main selling point is to run the copies concurrently with the original.
16. londons_explore ◴[] No.43655673[source]
Reusing a nonce often allows the entire world to decrypt or MITM the data.
17. saagarjha ◴[] No.43655794{4}[source]
You record the traffic going to one VM and send it to another, which will now accept it because the nonce is the same.
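
A small sketch of that replay, under assumptions: a made-up challenge-response login where the server pre-generates its next challenge nonce, and `copy.deepcopy` stands in for the VM clone. The `Server` class and the shared key are hypothetical.

```python
import copy
import hashlib
import hmac
import secrets


class Server:
    """Hypothetical server that pre-generates its next challenge nonce."""

    def __init__(self, key: bytes):
        self._key = key
        self._challenge = secrets.token_bytes(16)

    def challenge(self) -> bytes:
        return self._challenge

    def verify(self, response: bytes) -> bool:
        expected = hmac.new(self._key, self._challenge, hashlib.sha256).digest()
        ok = hmac.compare_digest(expected, response)
        if ok:
            # Rotate the nonce so the same response is never accepted twice.
            self._challenge = secrets.token_bytes(16)
        return ok


key = b"client-server-shared-secret"
vm_a = Server(key)
vm_b = copy.deepcopy(vm_a)  # clone taken while the challenge is pending

# A legitimate client answers VM A's challenge; the attacker records it.
recorded = hmac.new(key, vm_a.challenge(), hashlib.sha256).digest()

accepted_by_a = vm_a.verify(recorded)  # the genuine login succeeds
replay_on_a = vm_a.verify(recorded)    # replay on A fails: nonce was rotated
replay_on_b = vm_b.verify(recorded)    # replay on B succeeds: B still holds the old nonce
```

The attacker never learns the key; the cloned, unused nonce on VM B is enough.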
18. trollied ◴[] No.43656303{3}[source]
“Number ONCE”. NONCE. Indeed.
19. oceanplexian ◴[] No.43657289[source]
Live Migration on VMware was a thing before Google even had a cloud service.
replies(1): >>43657602 #
20. dietr1ch ◴[] No.43657321[source]
A neat use case for cloning is not truly duplicating a machine, but moving it from one machine that will go off to another one.

There are caveats on the network side though, as packets targeting the old address need to be re-routed until all connections target the new machine.

21. tanelpoder ◴[] No.43657602{3}[source]
VMware even has a vSphere Fault Tolerance product that creates a "live shadow instance" of a VM that mirrors the primary virtual machine (with up to 4 vCPUs). So you can do a quick failover in an "immediate planned" failover case, but apparently even when the primary DB goes down. I guess this might work when some external system (like a storage array) goes down in the primary: you can just switch to the other VM (with the latest memory/CPU state), replay that I/O there, and keep going... But if there's a hard crash of the primary and it actually does work, then they must be doing lots of reasoning about internal state change ordering & external device side effects (somewhat like Antithesis, but for a different purpose). Back in the day, they supported only uniprocessor VMs (with something called vLockstep) and later up to 4 vCPUs with something called Fast Checkpointing.

I've always wanted to test this out for fun, but by now 15 years have gone by and I've never got to it...

https://www.vmware.com/products/cloud-infrastructure/vsphere...

replies(1): >>43657915 #
22. umachin ◴[] No.43657915{4}[source]
VMware has also had a patent on live VM cloning (called it VMfork) for quite a few years now. I worked on the team that built related features. Feature itself was in the desktop product. https://blogs.vmware.com/euc/2016/02/horizon-7-view-instant-...

Live migration had some very cool demos. They would have an intensive workload such as a game playing and cause a crash and the VM would resume with 0 buffering.