
422 points simedw | 3 comments
qsort ◴[] No.44433579[source]
This is actually very cool. Not really replacing a browser, but it could enable an alternative way of browsing the web with a combination of deterministic search and prompts. It would probably work even better as a command line tool.

A natural next step could be doing things with multiple "tabs" at once, e.g.: tab 1 contains news outlet A's coverage of a story, tab 2 has outlet B's coverage, tab 3 has Wikipedia; summarize and provide references. I guess the problem at that point is whether the underlying model can support this type of workflow, which doesn't really seem to be the case even with SOTA models.

replies(4): >>44433628 #>>44435758 #>>44436819 #>>44440998 #
simedw ◴[] No.44433628[source]
Thank you.

I was thinking of showing multiple tabs/views at the same time, but only from the same source.

Maybe we could have one tab with the original content optimised for CLI viewing, and another tab just doing fact checking (we could ground it with Google Search or Brave). Would be a fun experiment.

replies(5): >>44434149 #>>44434300 #>>44434460 #>>44435067 #>>44439084 #
1. nextaccountic ◴[] No.44434300[source]
In your cleanup step, after cleaning obvious junk, I think you should do whatever Firefox's reader mode does to clean up further, and if that fails, fall back to the current output. That should reduce the number of tokens you send to the LLM even more.
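
To make the fallback idea concrete, here is a rough sketch. Both `basic_cleanup` and `reader_extract` are hypothetical callables standing in for the existing cleanup step and for whatever reader-mode extractor gets plugged in; neither name comes from the project.

```python
from typing import Callable, Optional


def clean_for_llm(
    html: str,
    basic_cleanup: Callable[[str], str],
    reader_extract: Callable[[str], Optional[str]],
) -> str:
    """Run the existing cleanup, then try a reader-mode pass on top of it.

    basic_cleanup and reader_extract are hypothetical hooks, not real APIs.
    """
    cleaned = basic_cleanup(html)
    try:
        article = reader_extract(cleaned)
    except Exception:
        # Treat any extractor error as "could not parse this page".
        article = None
    # Keep the reader-mode text only if it actually produced something.
    if article and article.strip():
        return article
    # Otherwise fall back to the current output.
    return cleaned
```

The point of the `try`/`except` is that a failing extractor never makes things worse than what you have today.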

You should also have some way for the LLM to indicate there is no useful output, because perhaps the page is supposed to be a SPA. This would force you to execute JavaScript to render that particular page, though.
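
One way this could look, as a sketch only: ask the model to answer with a fixed sentinel when the page is empty, and retry with a rendered copy when you see it. The sentinel string, the prompt wording, and the three callables (`fetch_static`, `fetch_rendered`, `ask_llm`) are all made up for illustration.

```python
from typing import Callable

# Hypothetical sentinel the model is asked to return for empty pages.
NO_CONTENT = "NO_USEFUL_CONTENT"


def build_prompt(page_text: str) -> str:
    return (
        "Summarize the page below. If it contains no readable content "
        f"(for example, an empty app shell), reply with exactly {NO_CONTENT}.\n\n"
        + page_text
    )


def summarize_page(
    url: str,
    fetch_static: Callable[[str], str],
    fetch_rendered: Callable[[str], str],
    ask_llm: Callable[[str], str],
) -> str:
    """Try the cheap static fetch first; render with JavaScript only if asked to."""
    reply = ask_llm(build_prompt(fetch_static(url)))
    if reply.strip() == NO_CONTENT:
        # The model says there is nothing to read, so pay for a JS render once.
        reply = ask_llm(build_prompt(fetch_rendered(url)))
    return reply
```

This keeps the expensive headless-browser path off the common case and only triggers it for pages the model flags.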

replies(1): >>44434421 #
2. simedw ◴[] No.44434421[source]
Just had a look and there is quite a lot going into Firefox's reader mode.

https://github.com/mozilla/readability

replies(1): >>44440596 #
3. dotancohen ◴[] No.44440596[source]
For the vast majority of pages you'd actually want to read, isProbablyReaderable() quickly returns a reasonable boolean guess as to whether the page can be parsed.
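
If you don't want to pull in the JS library just for the check, a cheap gate in the same spirit could look like the sketch below. This is not Readability's code: it is a simplified stand-in loosely modeled on the idea of scoring long text blocks, it only looks at `<p>` and `<pre>`, and the `min_content_length` and `min_score` defaults are my assumptions rather than confirmed values.

```python
import math
from html.parser import HTMLParser


class _BlockText(HTMLParser):
    """Collects the text inside <p> and <pre> elements."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buf = None  # None means "not inside a tracked block"

    def flush(self):
        if self._buf is not None:
            self.blocks.append("".join(self._buf).strip())
            self._buf = None

    def handle_starttag(self, tag, attrs):
        if tag in ("p", "pre"):
            self.flush()  # a new block implicitly closes an unclosed one
            self._buf = []

    def handle_endtag(self, tag):
        if tag in ("p", "pre"):
            self.flush()

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)


def probably_readerable(html, min_content_length=140, min_score=20.0):
    """Cheap guess at whether a page has enough article-like text to parse."""
    parser = _BlockText()
    parser.feed(html)
    parser.close()
    parser.flush()
    score = 0.0
    for text in parser.blocks:
        if len(text) < min_content_length:
            continue  # short blocks are likely navigation or captions
        score += math.sqrt(len(text) - min_content_length)
        if score > min_score:
            return True
    return False
```

You would call this before the heavier extraction step and skip straight to the fallback output when it returns False.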