663 points nikisweeting | 2 comments

We've been pushing really hard over the last 6mo to develop this release. I'd love to hear feedback from people who've worked on big plugin systems in the past, or anyone who's tried our betas!
1. millvalleydev ◴[] No.41861779[source]
For devs like us: ArchiveBox or browsertrix-crawler for scraping entire sites for our own use, maybe to keep content that's behind paywalls while we have subscriptions, or maybe to feed it to local LLMs and ask questions?
replies(1): >>41863051 #
2. nikisweeting ◴[] No.41863051[source]
For scraping entire sites, browsertrix is currently better suited, until we add full-depth recursive crawling in v0.9. For feeding to LLMs, ArchiveBox might be better (imho) because we extract the raw content, and you likely don't need the whole WARC.