454 points positiveblue | 1 comment
matt-p No.45066473
I have zero issue with AI agents if there's a real user behind them somewhere. I DO have a major issue with my sites being crawled extremely aggressively by offenders including Meta, Perplexity and OpenAI - it's really annoying to realise that we're tying up several CPU cores on AI crawling. Less than on real users and Google et al.
1. chatmasta No.45067321
I wonder how many CPU cycles are spent because of AI companies scraping content. This factor isn't usually considered when estimating “environmental impact of AI.” What’s the overhead of this on top of inference and training?

To be fair, an accurate measurement would need to consider how many of those CPU cycles would otherwise be spent by the human user who is driving the bot. From that perspective, maybe the scrapers can “make up for it” by crawling efficiently, i.e. by not loading tracker scripts, images, etc. unless necessary to answer the query. This way they’ll still burn CPU cycles, but at least it’ll be fewer cycles than a human user with a headful browser instance.
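
To make the "crawl efficiently" idea concrete, here's a rough sketch (my own illustration, not how any real crawler works) that lists the subresources a headful browser would normally fetch for a page and that a text-only scraper could skip. The function name `skipped_subresources` and the choice of tags are assumptions; it doesn't measure CPU cycles at all, it only counts the extra requests avoided.

```python
from html.parser import HTMLParser

# Assumption: these are the tags we treat as "subresources" for illustration.
# A real browser fetches more (fonts, CSS imports, XHR calls, and so on).
SUBRESOURCE_ATTRS = {"script": "src", "img": "src", "iframe": "src"}


class SubresourceCollector(HTMLParser):
    """Collects URLs that a headful browser would fetch after the HTML itself."""

    def __init__(self):
        super().__init__()
        self.skipped = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "link":
            # Only stylesheets are fetched; rel="canonical" etc. are not.
            rel = (attr_map.get("rel") or "").lower().split()
            url = attr_map.get("href") if "stylesheet" in rel else None
        else:
            url = attr_map.get(SUBRESOURCE_ATTRS.get(tag, ""))
        if url:
            self.skipped.append(url)


def skipped_subresources(html):
    """Return the subresource URLs a text-only scraper would not request."""
    collector = SubresourceCollector()
    collector.feed(html)
    return collector.skipped
```

Calling `len(skipped_subresources(page_html))` on a fetched page gives a crude count of the requests the origin server (and any third-party trackers) never sees from a scraper that only reads the HTML.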