208 points yuntian | 2 comments
yuntian ◴[] No.44564532[source]
A generative operating system that directly predicts screen images based on mouse and keyboard inputs, powered by an RNN for state modeling and a diffusion model for image generation.

See my tweet for more details: https://x.com/yuntiandeng/status/1944802154314916331
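
For readers who want a feel for the two-stage loop described above, here is a rough toy sketch, not the project's actual code. It assumes a plain tanh RNN cell for the state and replaces the diffusion model with a simple iterative refinement from noise toward a state-conditioned target. Every name in it (`ToyScreenModel`, `update_state`, `render`, the 4-value event encoding, the 4x4 "screen") is hypothetical and chosen only for illustration.

```python
import math
import random

STATE_SIZE = 8
FRAME_PIXELS = 16  # toy 4x4 grayscale "screen" (assumption, not the real resolution)


def make_matrix(rows, cols, rng):
    """Random weight matrix; a trained model would learn these values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyScreenModel:
    """Hypothetical stand-in: RNN tracks state, a refinement loop draws the frame."""

    def __init__(self, input_size=4, seed=0):
        self.rng = random.Random(seed)
        self.w_in = make_matrix(STATE_SIZE, input_size, self.rng)
        self.w_state = make_matrix(STATE_SIZE, STATE_SIZE, self.rng)
        self.w_out = make_matrix(FRAME_PIXELS, STATE_SIZE, self.rng)
        self.state = [0.0] * STATE_SIZE

    def update_state(self, event):
        # Plain tanh RNN cell: h' = tanh(W_in x + W_state h)
        from_input = matvec(self.w_in, event)
        from_state = matvec(self.w_state, self.state)
        self.state = [math.tanh(a + b) for a, b in zip(from_input, from_state)]

    def render(self, steps=10):
        # Placeholder for diffusion sampling: start from Gaussian noise and
        # repeatedly move halfway toward a target computed from the state.
        target = [0.5 + 0.5 * math.tanh(v) for v in matvec(self.w_out, self.state)]
        frame = [self.rng.gauss(0.0, 1.0) for _ in range(FRAME_PIXELS)]
        for _ in range(steps):
            frame = [f + 0.5 * (t - f) for f, t in zip(frame, target)]
        return frame

    def step(self, event):
        """One tick: consume an input event, then predict the next frame."""
        self.update_state(event)
        return self.render()


# Hypothetical event encoding: (mouse_x, mouse_y, mouse_click, key_pressed)
events = [(0.1, 0.2, 0.0, 0.0), (0.1, 0.2, 1.0, 0.0), (0.6, 0.7, 0.0, 1.0)]
model = ToyScreenModel()
frames = [model.step(event) for event in events]
```

The point of the split is that the recurrent state carries what happened earlier (open windows, cursor position), while the image generator only has to turn the current state into pixels.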

replies(1): >>44565170 #
1. 5- ◴[] No.44565170[source]
I like how most of your demo video is clicking through various Firefox and Google popups.
replies(2): >>44565732 #>>44566398 #
2. arm32 ◴[] No.44566398[source]
Pretty realistic, actually.