
211 points | CrankyBear | 1 comment
dizlexic No.45107404
I was sysadmining a virtual art gallery with thousands of "exhibits", including sound, video, and images.

We had never had any issues before, then suddenly we got taken down three times in as many days. When I investigated, it was all Claude.

They were just pounding every route, regardless of timeouts, with no throttle. It was nasty.

They give web scrapers a bad rep.
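
For what it's worth, this is the kind of abuse a simple per-client throttle catches. A minimal sketch in Python, assuming a token-bucket limiter keyed on client IP; the rate, burst size, and function names are made up for illustration, not anything the gallery actually ran:

    import time
    from collections import defaultdict

    RATE = 2.0    # hypothetical: tokens (requests) earned per second
    BURST = 10.0  # hypothetical: maximum bucket size per client

    _buckets = defaultdict(lambda: {"tokens": BURST, "last": time.monotonic()})

    def allow_request(client_id: str) -> bool:
        """Return True if the client is under its rate limit, False otherwise."""
        bucket = _buckets[client_id]
        now = time.monotonic()
        # Refill tokens for the time elapsed since this client's last request.
        bucket["tokens"] = min(BURST, bucket["tokens"] + (now - bucket["last"]) * RATE)
        bucket["last"] = now
        if bucket["tokens"] >= 1.0:
            bucket["tokens"] -= 1.0
            return True
        return False  # caller would answer with HTTP 429 Too Many Requests

    # A bot pounding every route burns through its burst and gets cut off.
    for i in range(15):
        print(i, allow_request("203.0.113.7"))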

replies(1): >>45108720 #
1. dylan604 No.45108720
Web scrapers earned their bad rep all on their own, thank you very much. This is nothing new. Scrapers have no concern for whether a site is mostly static with stale text or constantly updated. Most sites are not FB/Twitter (er, X)/etc. Even retail sites other than Amazon don't have new products being listed every minute. But noticing that would require someone on the scraper's side to pay attention; instead they just let the computer run, even if it is reading the same data every time.
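
A scraper that actually cared about not re-reading the same data could use standard conditional requests instead of blind re-fetches. A rough sketch with Python's standard library; the URL handling here is illustrative, not any particular bot's code:

    import urllib.request
    import urllib.error

    def fetch_if_changed(url, etag=None, last_modified=None):
        """Fetch url only if it changed since the last visit.

        Returns (body, etag, last_modified); body is None when the server
        answers 304 Not Modified and nothing needs to be re-downloaded.
        """
        req = urllib.request.Request(url)
        if etag:
            req.add_header("If-None-Match", etag)
        if last_modified:
            req.add_header("If-Modified-Since", last_modified)
        try:
            with urllib.request.urlopen(req) as resp:
                return (resp.read(),
                        resp.headers.get("ETag"),
                        resp.headers.get("Last-Modified"))
        except urllib.error.HTTPError as err:
            if err.code == 304:
                # Unchanged since last visit: keep the cached copy, no body transferred.
                return None, etag, last_modified
            raise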

Even if sites offered their content in a single downloadable file for bots, the bot creators wouldn't trust that it wasn't stale and out of date, so they'd still continue to scrape, ignoring the easy method.