←back to thread

597 points classichasclass | 2 comments | | HN request time: 0.008s | source
Show context
bob1029 ◴[] No.45011628[source]
I think a lot of really smart people are letting themselves get taken for a ride by the web scraping thing. Unless the bot activity is legitimately hammering your site and causing issues (not saying this isn't happening in some cases), then this mostly amounts to an ideological game of capture the flag. The difference being that you'll never find their flag. The only thing you win by playing is lost time.

The best way to mitigate the load from diffuse, unidentifiable, grey area participants is to have a fast and well engineered web product. This is good news, because your actual human customers would really enjoy this too.

replies(7): >>45011652 #>>45011830 #>>45011850 #>>45012424 #>>45012462 #>>45015038 #>>45015451 #
phito ◴[] No.45011652[source]
My friend has a small public gitea instance, only use by him a a few friends. He's getting thousounds of requests an hour from bots. I'm sorry but even if it does not impact his service, at the very least it feels like harassment
replies(7): >>45011694 #>>45011816 #>>45011999 #>>45013533 #>>45013955 #>>45014807 #>>45025114 #
bob1029 ◴[] No.45011816[source]
Thousands of requests per hour? So, something like 1-3 per second?

If this is actually impacting perceived QoS then I think a gitea bug report would be justified. Clearly there's been some kind of a performance regression.

Just looking at the logs seems to be an infohazard for many people. I don't see why you'd want to inspect the septic tanks of the internet unless absolutely necessary.

replies(5): >>45014694 #>>45014705 #>>45015142 #>>45016540 #>>45019745 #
dkiebd ◴[] No.45015142{3}[source]
I love the snark here. I work at a hosting company and the only customers who have issues with crawlers are those who have stupidly slow webpages. It’s hard to have any sympathy for them.
replies(1): >>45018800 #
1. egypturnash ◴[] No.45018800{4}[source]
Isn't it part of your job to help them fix that?
replies(1): >>45019092 #
2. 0x457 ◴[] No.45019092[source]
How? They are hosting company, not a webshop.