1122 points felixrieseberg | 1 comment
jl6 ◴[] No.43907000[source]
IIRC, Clippy’s most famous feature was interrupting you to offer advice. The advice was usually basic/useless/annoying, hence Clippy’s reputation, but a powerful LLM could actually make the original concept work. It would not be simply a chatbot that responds to text, but rather would observe your screen, understand it through a vision model, and give appropriate advice. Things like “did you know there’s an easier way to do what you’re doing”. I don’t think the necessary trust exists yet to do this using public LLM APIs, nor does the hardware exist to do it locally, but crack either of those and I could see ClipGPT being genuinely useful.
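
A minimal sketch of what that loop could look like, purely for concreteness — assuming Pillow for the screen grab and the OpenAI Python SDK’s image_url message format; the model name and the prompt are placeholders, not anything the comment specifies:

    import base64
    import io

    from openai import OpenAI
    from PIL import ImageGrab

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def screen_as_data_url() -> str:
        """Grab the current screen and return it as a base64 PNG data URL."""
        buf = io.BytesIO()
        ImageGrab.grab().save(buf, format="PNG")
        return "data:image/png;base64," + base64.b64encode(buf.getvalue()).decode()

    def clippy_tip() -> str:
        """Show the screen to a vision-capable model and ask for one concrete tip."""
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder: any vision-capable model
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "Here is my screen. If there is an easier way to do what "
                             "I appear to be doing, say so in one sentence; otherwise "
                             "reply 'nothing to suggest'."},
                    {"type": "image_url", "image_url": {"url": screen_as_data_url()}},
                ],
            }],
        )
        return response.choices[0].message.content

    print(clippy_tip())

The hard parts the comment points at — trust in public APIs and hardware good enough to run this locally — are exactly what a sketch like this glosses over.
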
replies(10): >>43907133 #>>43907138 #>>43907168 #>>43907265 #>>43907418 #>>43907981 #>>43908398 #>>43908908 #>>43909895 #>>43913051 #
vunderba ◴[] No.43907133[source]
We are probably getting closer to that with the newer multimodal LLMs, but you'd almost need to take screenshots at intervals and feed them directly to the LLM to provide a sort of chronological context, helping it understand what the user is trying to do and gauge the user's intentions.
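
A rough illustration of that interval idea (same assumed Pillow + OpenAI SDK stack as the sketch above; the 15-second interval, four-frame buffer, and model name are arbitrary placeholders): keep a rolling buffer of recent frames and send them oldest-first so the model sees a crude timeline.

    import base64
    import io
    import time
    from collections import deque

    from openai import OpenAI
    from PIL import ImageGrab

    client = OpenAI()
    frames = deque(maxlen=4)  # rolling buffer of the most recent screenshots

    def capture_frame() -> str:
        """Grab the screen and return it as a base64 PNG data URL."""
        buf = io.BytesIO()
        ImageGrab.grab().save(buf, format="PNG")
        return "data:image/png;base64," + base64.b64encode(buf.getvalue()).decode()

    def ask_for_advice() -> str:
        """Send the buffered frames, oldest first, so the model sees a rough timeline."""
        content = [{"type": "text",
                    "text": "These screenshots were taken about 15 seconds apart, "
                            "oldest first. Infer what I'm trying to do and suggest a "
                            "faster way, or reply 'nothing to suggest'."}]
        content += [{"type": "image_url", "image_url": {"url": f}} for f in frames]
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder vision-capable model
            messages=[{"role": "user", "content": content}],
        )
        return response.choices[0].message.content

    while True:
        frames.append(capture_frame())
        if len(frames) == frames.maxlen:
            print(ask_for_advice())
        time.sleep(15)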

As you say though, I don't know how many people would be comfortable having screenshots of their computer sent arbitrarily to a non-local LLM.

replies(5): >>43907196 #>>43907413 #>>43907760 #>>43908782 #>>43913893 #
1. nrmitchi ◴[] No.43907196[source]
> As you say though, I don't know how many people would be comfortable having screenshots of their computer sent arbitrarily to a non-local LLM.

Of the technical, hang-out-on-HN crowd? Ya, probably not many.

Of the other 99.99% of computer users? The majority of them wouldn't even think about it, let alone care. To quote a phrase, “the user is going to pick dancing pigs over security every time”.

Even without the nonchalant attitude towards security, the majority of the population has been so conditioned to assume that everything they do on a computer is already being sent to 1) Apple, 2) Google, 3) Microsoft, or 4) their employer that they've burnt out on caring.

All that is to say that if you can make a widely-available real-time LLM assistant that appeals to non-technical users, please invite me to your private-island-celebrity-filled-yacht-parties.