597 points classichasclass
bob1029 ◴[] No.45011628[source]
I think a lot of really smart people are letting themselves get taken for a ride by the web scraping thing. Unless the bot activity is legitimately hammering your site and causing issues (not saying this isn't happening in some cases), then this mostly amounts to an ideological game of capture the flag. The difference being that you'll never find their flag. The only thing you win by playing is lost time.

The best way to mitigate the load from diffuse, unidentifiable, grey-area participants is to have a fast and well-engineered web product. This is good news, because your actual human customers would really enjoy this too.

themafia ◴[] No.45011850[source]
The way I get a fast web product is to pay a premium for data. So, no, it's not "lost time" by banning these entities, it's actual saved costs on my bandwidth and compute bills.

The bonus is my actual customers get the same benefits and don't notice any material loss from my content _not_ being scraped. How you see this as me being secretly taken advantage of is completely beyond me.

1. zelphirkalt ◴[] No.45017742[source]
You are paying a premium for data? Do you mean for traffic? Sounds like a bad deal to me. The tiniest Hetzner servers give you 20 TB included per month. Either you really have lots of traffic, or you are paying for bad hosting deals.
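
For a sense of scale, here is a rough back-of-the-envelope sketch of what a 20 TB monthly allowance means as sustained throughput. It assumes decimal terabytes (10^12 bytes) and a 30-day month, neither of which is stated above, and the function name is made up for illustration.

```python
def avg_mbit_per_s(tb_per_month: float, days: int = 30) -> float:
    """Average sustained rate, in Mbit/s, that uses up a monthly traffic quota.

    Assumptions (not from the thread): 1 TB = 10**12 bytes, traffic spread
    evenly across the whole month.
    """
    total_bits = tb_per_month * 10**12 * 8   # quota in bits
    seconds = days * 24 * 60 * 60            # length of the month in seconds
    return total_bits / seconds / 10**6      # bits/s -> Mbit/s


if __name__ == "__main__":
    # 20 TB over 30 days works out to roughly 61.7 Mbit/s around the clock.
    print(f"{avg_mbit_per_s(20):.1f} Mbit/s")
```

Under those assumptions, you would need to push a bit over 60 Mbit/s continuously, all month, before exceeding the included traffic.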