A generative operating system that directly predicts screen images based on mouse and keyboard inputs, powered by an RNN for state modeling and a diffusion model for image generation.
See my tweet for more details: https://x.com/yuntiandeng/status/1944802154314916331
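
To make the two-part design concrete, here is a toy sketch of the loop described above: a recurrent state that absorbs each mouse/keyboard event, and an iterative denoiser that turns that state into a frame. This is not the project's actual code. The class name `ToyNeuralOS`, the dimensions, the input encoding, and the random untrained weights are all assumptions for illustration, and the "denoiser" is a simple interpolation toward a state-dependent target standing in for a trained diffusion network.

```python
import math
import random

STATE_DIM = 8       # hypothetical size of the recurrent state
INPUT_DIM = 4       # hypothetical input encoding: (dx, dy, click, key)
FRAME_PIXELS = 16   # toy "screen": 16 grayscale pixels
DENOISE_STEPS = 10  # number of refinement steps in the toy sampler


def make_matrix(rows, cols, rng):
    """Random weights; a real system would learn these from recorded sessions."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product on Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyNeuralOS:
    """Illustrative only: RNN-style state update plus a stand-in denoising loop."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.w_state = make_matrix(STATE_DIM, STATE_DIM, self.rng)
        self.w_input = make_matrix(STATE_DIM, INPUT_DIM, self.rng)
        self.w_frame = make_matrix(FRAME_PIXELS, STATE_DIM, self.rng)
        self.state = [0.0] * STATE_DIM

    def update_state(self, user_input):
        # Elman-style recurrent step: h' = tanh(W_h h + W_x x)
        recurrent = matvec(self.w_state, self.state)
        driven = matvec(self.w_input, user_input)
        self.state = [math.tanh(r + d) for r, d in zip(recurrent, driven)]
        return self.state

    def render_frame(self):
        # Stand-in for diffusion sampling: start from Gaussian noise and
        # refine step by step toward a target computed from the state.
        # A real diffusion model would call a trained denoising network here.
        target = [math.tanh(t) for t in matvec(self.w_frame, self.state)]
        frame = [self.rng.gauss(0.0, 1.0) for _ in range(FRAME_PIXELS)]
        for step in range(DENOISE_STEPS):
            blend = 1.0 / (DENOISE_STEPS - step)  # reaches 1.0 on the last step
            frame = [f + blend * (t - f) for f, t in zip(frame, target)]
        return frame

    def step(self, user_input):
        """One tick: fold the input into the state, then generate the next frame."""
        self.update_state(user_input)
        return self.render_frame()


# Usage: feed a short sequence of made-up input events and collect frames.
toy_os = ToyNeuralOS(seed=0)
events = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
frames = [toy_os.step(event) for event in events]
```

The point of the split is that the recurrent state carries what has happened so far (open windows, cursor position, and so on), while the image generator only has to render the current state as pixels.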
replies(1):