(simedw.com)

422 points simedw | 3 comments | 01 Jul 25 12:49 UTC

1. treyd ◴[01 Jul 25 14:14 UTC] No.44434108[source]▶

I wonder if you could use a less sophisticated model (maybe even something based on LSTMs) to walk over the DOM and extract just the chunks that should be emitted and collected into the browsable data structure, all running locally. I feel like it'd be straightforward to generate training data for this, using an LLM-based toolchain like the one the author wrote for direct use.
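
As a rough illustration of the "walk over the DOM and extract chunks" step, here is a minimal sketch using only Python's standard-library `html.parser`. It is not how Spegel works, and the names `ChunkCollector` and `extract_chunks` are hypothetical. It assumes static HTML and only gathers candidate (tag path, text) pairs; deciding which chunks to keep would be the job of the small local model, with labels coming from an LLM-based pipeline.

```python
from html.parser import HTMLParser

# Tags whose text content is never shown to the reader.
SKIP_TAGS = {"script", "style", "noscript"}
# Void elements never get a closing tag, so they are kept off the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ChunkCollector(HTMLParser):
    """Collects (tag_path, text) pairs while streaming through HTML."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, outermost first
        self.chunks = []  # candidate chunks for a downstream classifier

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating sloppy markup.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if text and not SKIP_TAGS.intersection(self.stack):
            self.chunks.append(("/".join(self.stack), text))


def extract_chunks(html):
    """Return every visible text chunk with the tag path leading to it."""
    parser = ChunkCollector()
    parser.feed(html)
    parser.close()
    return parser.chunks


if __name__ == "__main__":
    page = ("<html><body><nav><a href='/'>Home</a></nav>"
            "<script>var x = 1;</script>"
            "<p>Hello <b>world</b></p></body></html>")
    for path, text in extract_chunks(page):
        print(path, "->", text)
```

Each pair could then be labeled keep/drop by the LLM toolchain and used to train the smaller model on the tag path plus the text.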

replies(1): >>44435662 #

2. askonomm ◴[01 Jul 25 16:39 UTC] No.44435662[source]▶

>>44434108 (TP) #

Unfortunately, on the modern web simply walking the DOM doesn't cut it if the website's content is loaded by JS. You could only walk the DOM once the JS has run and all the requests it makes have finished, and at that point you're already using a whole browser renderer anyway.

replies(1): >>44437827 #

3. kccqzy ◴[01 Jul 25 20:41 UTC] No.44437827[source]▶

>>44435662 #

Yeah, but this project doesn't use JS anyway.

Show HN: Spegel, a Terminal Browser That Uses LLMs to Rewrite Webpages