208 points yuntian | 3 comments
yuntian ◴[] No.44565750[source]
Thanks everyone for trying out NeuralOS, and apologies for the frustrating user experience!

I coded up the demo myself and didn't anticipate how disruptive the intermittent warning messages about waiting users would become. The demo is quite resource-intensive: each session currently requires its own H100 GPU, and I'm already using a dispatcher-worker setup with 8 parallel workers. Unfortunately, demand exceeded what my setup could handle, causing significant lag, and I had to limit each session to 60 more seconds whenever other users are waiting. Additionally, the underlying diffusion model itself is slow to run, so the frame rate is typically below 2 fps, and network bottlenecks make it worse.

As for model capabilities, NeuralOS is indeed quite limited at this point (as acknowledged in my paper abstract). That's why the demo interactions shown in my tweet were minimal (opening Firefox, typing a URL).

Overall, this is meant as a proof-of-concept demonstrating the potential of generative, neural-network-powered GUIs. It's fully open-source, and I hope others can help improve it going forward!

Thanks again for the honest feedback.

replies(5): >>44565771 #>>44566320 #>>44566525 #>>44567688 #>>44569273 #
1. cupantae ◴[] No.44566320[source]
Nǐ hăo, xìe xìe Yuntian! I read the readme and paper but haven’t played around much yet. I find this fascinating and I don’t care much about poor “experience” because intuitively I feel this idea couldn’t produce something as reliable and flexible as a real OS anyway. I see you talked about inability to install new software and my reaction was “well obviously”, because surely it will be at least as limited as the training data, while a real OS provides lots of software of great complexity which is seldom used.

Could you talk about your hopes for the future on this project? What are your thoughts on having a more simplified interface which could combine inputs in a more abstract way, or are you only interested in simulating a traditional OS?

Thanks again.

PS the waiting time while firefox “loads” made me laugh. I presume this is also simulated.

replies(2): >>44566332 #>>44566900 #
2. ◴[] No.44566332[source]
3. yuntian ◴[] No.44566900[source]
Thanks for your comment! I completely agree that NeuralOS is currently far from being as reliable as a real OS. The Firefox loading time is indeed a funny artifact of the neural model simulating the delay of a real OS.

However, my real dream behind this project is to blur the boundaries across applications, not just simulate traditional OS interactions. For example, imagine converting a movie we're watching directly into an interactive video game, or instantly changing the interface of an app (like Signal) to something we prefer (like Facebook Messenger) on the fly.

Of course, the current training data severely limits what's achievable today. But looking forward, I envision combining techniques from controllable text generation (such as Zhiting Hu's "Toward Controlled Generation of Text" paper) or synthesizing new interaction data to achieve greater customization. I believe this is a promising path toward creating truly generative and personalized interfaces.

Thanks again for your interest!