
597 points classichasclass | 6 comments
1. that_lurker ◴[] No.45011319[source]
Why not just block the User Agent?
replies(4): >>45011848 #>>45012056 #>>45012119 #>>45013130 #
2. N_Lens ◴[] No.45011848[source]
Bots often rotate the UA too. Their entire goal is to get through and scrape as much content as possible, by any means possible.
3. aspenmayer ◴[] No.45012056[source]
I think the UA is easily spoofed, whereas the AS and IP are less easily spoofed. You already have everything you need to spoof the UA, while spoofing your IP takes resources: wall-clock time to set it up, CPU time to insert another network hop, and/or peers or other third parties to route your traffic, and so on. The User-Agent is just a string that you can easily change, with no real effort, expense, or third parties required.
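To illustrate how little effort the UA takes, here is a minimal sketch using Python's standard `urllib`. The URL and the UA string are made up for illustration, and nothing is actually sent over the network.

```python
from urllib.request import Request

# Hypothetical URL and UA string, for illustration only; no request is sent.
req = Request(
    "https://example.com/",
    headers={"User-Agent": "Mozilla/5.0 (compatible; AnythingYouLike/1.0)"},
)

# urllib stores header names capitalized, so the key is "User-agent".
claimed_ua = req.get_header("User-agent")
```

The client alone chooses `claimed_ua`; changing the source IP, by contrast, needs another host or network path.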
4. lexicality ◴[] No.45012119[source]
Because you have to parse the HTTP request to do that, while blocking the IP can be done at the firewall.
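A rough sketch of that difference, assuming a made-up blocklist (`BLOCKED_NETS` uses a documentation-only range) and hypothetical helper names: the IP check needs only the connection's source address, while the UA check has to read and parse the request head first.

```python
import ipaddress

# Hypothetical blocklist; 203.0.113.0/24 is a documentation-only range.
BLOCKED_NETS = [ipaddress.ip_network("203.0.113.0/24")]

def blocked_by_ip(peer_ip: str) -> bool:
    """Decide from the source address alone; no request bytes are read."""
    addr = ipaddress.ip_address(peer_ip)
    return any(addr in net for net in BLOCKED_NETS)

def user_agent_of(raw_request: bytes) -> str:
    """Extract the User-Agent header, which requires parsing the request head."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "user-agent":
            return value.strip()
    return ""
```

A real firewall does the first check on packet headers before any application code runs; the second can only happen once a connection has been accepted and the request bytes have arrived.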
5. arewethereyeta ◴[] No.45013130[source]
Because it's the single most easily falsified piece of information, as you would find in ANY "how to scrape for dummies" article out there. They all start with changing your UA.
replies(1): >>45017778 #
6. ryantgtg ◴[] No.45017778[source]
Sure, but the article is about a bot that expressly identifies itself in the user agent, and its user agent string contains a sentence suggesting you block its IP if you don’t like it. Since it uses at least 74 IPs, blocking its user agent seems like a fine idea.
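For a bot that announces itself, a single substring rule covers every address it uses. A minimal sketch, assuming a placeholder token `examplebot` (the thread does not give the bot's real name):

```python
# Hypothetical token; substitute whatever name the bot puts in its UA.
BLOCKED_UA_TOKENS = ("examplebot",)

def should_block(user_agent: str) -> bool:
    """Return True if the UA contains any blocked token, ignoring case."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_UA_TOKENS)
```

This only holds while the bot keeps identifying itself honestly; once it rotates its UA, the rule stops matching.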