253 points akyuu | 1 comment | source
BinaryIgor ◴[] No.45945045[source]
I wonder why it is that we've seen an increase in these automated scrapers and attacks over the last few years. Is there better (open-source?) technology that enables it? Is hosting infrastructure also getting cheaper for the attackers? Both? Something else?

Maybe the long-term solution to such attacks is to hide most of the internet behind some kind of Proof of Work system/network, so that it's mostly humans, not machines, who get access to our websites.
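
Concretely, by Proof of Work I mean a Hashcash-style puzzle: the server issues a challenge, the client has to find a nonce whose hash has enough leading zero bits, and the server verifies the answer with a single hash. A minimal Python sketch; the difficulty and challenge format here are made up for illustration, not taken from any real system:

```python
import hashlib
import itertools
import os

# Illustrative difficulty; a real deployment would tune this so one human
# pageview costs a moment of CPU while bulk scraping costs a fortune.
DIFFICULTY_BITS = 20

def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
        else:
            bits += 8 - byte.bit_length()
            break
    return bits

def make_challenge() -> str:
    """Server side: issue a fresh random challenge per request."""
    return os.urandom(16).hex()

def solve(challenge: str) -> int:
    """Client side: brute-force a nonce. This loop is the 'work'."""
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= DIFFICULTY_BITS:
            return nonce

def verify(challenge: str, nonce: int) -> bool:
    """Server side: checking the answer costs a single hash."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= DIFFICULTY_BITS
```

The asymmetry is the point: solving is expensive and verifying is cheap, so the cost lands on whoever makes the most requests.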

replies(6): >>45945393 #>>45945467 #>>45945584 #>>45945643 #>>45945917 #>>45945959 #
Vegenoid ◴[] No.45945917[source]
I'm pretty sure it's the commercial demand for data from AI companies. That's certainly the prevailing view among sysadmins, that AI companies are responsible for the wave of scrapers over the past few years, and I see no compelling alternative.
replies(1): >>45946032 #
1. embedding-shape ◴[] No.45946032[source]
> and I see no compelling alternative.

Another potential cause: it's now far easier for pretty much anyone connected to the internet to "create" their own automation software by asking an LLM. I'd wager even the less capable LLMs can handle "Create a program that checks this website every second for any product updates on all pages" and produce instructions clear enough for the average computer user to run the result without thinking or considering much.
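
For illustration, the kind of thing an LLM hands back for that prompt is a naive polling loop along these lines (the URLs and the one-second interval are placeholders, not taken from any real script):

```python
import time
import urllib.request

# Hypothetical targets, standing in for "all pages" of "website Y".
PAGES = [
    "https://example.com/products/page1",
    "https://example.com/products/page2",
]

last_seen: dict[str, bytes] = {}

while True:
    for url in PAGES:
        try:
            body = urllib.request.urlopen(url, timeout=10).read()
        except OSError:
            continue  # no backoff, no logging; just keep hammering
        if last_seen.get(url) != body:
            print(f"change detected on {url}")
            last_seen[url] = body
    time.sleep(1)  # "every second", exactly as asked
```

No conditional requests, no robots.txt check, no rate limiting: one request per page per second, forever, from someone who never looks at it again.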

Multiply this by every person with access to an LLM who wants to "do X with website Y" and you get an order-of-magnitude increase in traffic across the internet. This has been possible since, what, sometime in 2023? Not sure if the patterns would line up, but it's another guess at the cause(s).
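
Back-of-envelope, with made-up numbers: one such script polling once per second is 86,400 requests a day, and 10,000 people independently pointing one at the same site is roughly 864 million requests a day. From the server's side that's indistinguishable from an attack.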