
302 points | simonw | 2 comments
luke-stanley No.41877416
I'm glad this worked for Simon, but I would probably prefer a User Script that scrapes DOM text changes and streams them to a small local web server, which appends the URL, text change, and timestamp to a JSONL file. I already have something doing this, so it lets me back up things I'm looking at in real time, like streaming LLM generations, and it relies only on normal browser technology. I should probably share my code since it's quite useful. I'm a bit uncomfortable relying on an LLM to transcribe something when there is a stream of text that could be captured robustly, with real data, vs well-trained but indirect token magic. A middle ground might be grounded extraction with evidence chains: timestamps, screenshots, the cropped regions it's sourcing from, and spelled-out reasoning. There's the extraction/retrieval step, and there's a kind of data normalisation. Of course, it's nice that he's got something that just works in two or three steps, and it's good the technology is getting quite reliable and cheap a lot of the time, but still, we could do better.
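A minimal sketch of the userscript side described above, assuming a local server listening at http://localhost:8765/log that appends each POSTed body as one line of a JSONL file — the endpoint, port, and record field names are assumptions, not the commenter's actual code:

```javascript
// ==UserScript==
// @name         DOM change logger (sketch)
// @match        *://*/*
// @grant        none
// ==/UserScript==

// Turn one observed change into a JSONL record; kept pure so it is easy to test.
function makeRecord(url, text, ts) {
  return JSON.stringify({ url, text, ts });
}

// Pull the text out of a MutationRecord: either the new value of a text node
// (characterData) or the text of any newly added nodes (childList).
function extractText(mutation) {
  if (mutation.type === 'characterData') {
    return ((mutation.target && mutation.target.textContent) || '').trim();
  }
  const parts = [];
  for (const node of mutation.addedNodes || []) {
    const t = (node.textContent || '').trim();
    if (t) parts.push(t);
  }
  return parts.join(' ');
}

// Browser-only part: observe the whole page and POST each change to the
// local server, which appends the body to the JSONL file.
if (typeof window !== 'undefined' && typeof MutationObserver !== 'undefined') {
  const observer = new MutationObserver((mutations) => {
    for (const m of mutations) {
      const text = extractText(m);
      if (!text) continue;
      fetch('http://localhost:8765/log', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: makeRecord(location.href, text, Date.now()),
      }).catch(() => {}); // best effort: never break the page if the server is down
    }
  });
  observer.observe(document.body, { childList: true, characterData: true, subtree: true });
}
```

The matching server can be a few lines of Node or Python that appends request bodies to a file. Watching `characterData` as well as `childList` matters here, because streaming LLM output often shows up as in-place edits to an existing text node rather than newly added nodes.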
replies(3): >>41880801, >>41890864, >>41895929
1. ranger_danger No.41890864
The userscript idea is great, I could think of some uses for this, such as text-to-speech for live comments. Do you know of any examples of projects already doing this?
replies(1): >>41893644
2. ian_hn No.41893644
Things like this? https://greasyfork.org/en/scripts?q=speech