ChatGPT Atlas | slacker news

1. bilsbie ◴[21 Oct 25 18:45 UTC] No.45659896[source]▶

>>45658479 (OP) #

Super dumb question but why was this so hard for someone to build.

I’ve been wanting to simply ask AI about whatever is currently on my screen for years.

I don’t get why we can’t easily have this.

replies(3): >>45660171 #>>45660199 #>>45663243 #

2. Sean-Der ◴[21 Oct 25 19:05 UTC] No.45660171[source]▶

>>45659896 (TP) #

You can already do this! I saw this on X[0]. You can do WebRTC to Realtime API + getDisplayMedia.

[0] https://www.loom.com/share/22a165508ae5491dbd536fbbc5348fcc

3. AtNightWeCode ◴[21 Oct 25 19:07 UTC] No.45660199[source]▶

>>45659896 (TP) #

It is very basic. I have built my own version of this based on Chromium that integrates both Claude and ChatGPT in the browser. It can do a lot of tasks like translate or shorten the text I selected and so on. It took me like a couple of hours to build. The problem is the cost of using the LLMs, especially since they are still pretty stupid and requires huge prompts.

EDIT: I think I misunderstood your Q. Sorry. You can take a screenshot and post it to ChatGPT and get back what it is seeing, in theory. I mean, I use ChatGPT to post screenshots of my sites to get feedback on my layout and designs...

4. nsonha ◴[21 Oct 25 23:49 UTC] No.45663243[source]▶

>>45659896 (TP) #

We have this though, as a (controversial) built-in Windows's feature called "Recall". We have many apps like that (vercept.com) and MCP servers that do that. It's just, besides privacy concerns, it doesn't works well yet for agentic usecases.