They mentioned "microVM" in the live stream. Notably there's no browser or internet access. It makes sense, running specialized Firecracker/Unikraft/etc microkernels is way faster and cheaper so you can scale it up. But there will be a big technical scalability difficulty jump from this to the "agents with their own computers". ChatGPT Operator already does have a browser, so they definitely can do this, but I imagine the demand is orders of magnitudes different.
There must be room for a Modal/Cloudflare/etc infrastructure company that focuses only on providing full-fledged computer environments specifically for AI with forking/snapshotting (pause/resume), screen access, human-in-the-loop support, and so forth, and it would be very lucrative. We have browser-use, etc, but they don't (yet) capture the whole flow.
replies(1):