
597 points by classichasclass | 1 comment
sneak No.45011199
I feel like people forget that an HTTP request is, after all, a request. When you serve a webpage to a client, you are consenting to that interaction; the response is voluntary.

You can take a blunt instrument and 403-geoblock entire countries if you want, or any user agent, netblock, or ASN. It’s entirely up to you; it’s your own server, and nobody can legitimately be mad at you.
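A sketch of that blunt instrument as Go HTTP middleware, assuming a local GeoLite2 database read with oschwald/geoip2-golang; the blocked country codes and the database path are placeholders, not recommendations:

    package main

    import (
        "log"
        "net"
        "net/http"

        "github.com/oschwald/geoip2-golang"
    )

    // Country codes to 403 outright; "XX"/"YY" are placeholders.
    var blocked = map[string]bool{"XX": true, "YY": true}

    func geoblock(db *geoip2.Reader, next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            host, _, err := net.SplitHostPort(r.RemoteAddr)
            if err == nil {
                rec, err := db.Country(net.ParseIP(host))
                if err == nil && blocked[rec.Country.IsoCode] {
                    http.Error(w, "Forbidden", http.StatusForbidden)
                    return
                }
            }
            next.ServeHTTP(w, r)
        })
    }

    func main() {
        db, err := geoip2.Open("GeoLite2-Country.mmdb") // path is an assumption
        if err != nil {
            log.Fatal(err)
        }
        defer db.Close()
        log.Fatal(http.ListenAndServe(":8080",
            geoblock(db, http.FileServer(http.Dir(".")))))
    }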

You can rate-limit IPs to x responses per hour, per day, or per week, whatever you like.
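The same idea per IP, sketched with golang.org/x/time/rate; the 100-requests-per-hour budget and burst of 10 are arbitrary examples, and a real version would evict idle entries from the map:

    package main

    import (
        "log"
        "net"
        "net/http"
        "sync"

        "golang.org/x/time/rate"
    )

    var (
        mu       sync.Mutex
        limiters = map[string]*rate.Limiter{}
    )

    // limiterFor hands out one token bucket per client IP. The map grows
    // without bound here; real code would age entries out.
    func limiterFor(ip string) *rate.Limiter {
        mu.Lock()
        defer mu.Unlock()
        l, ok := limiters[ip]
        if !ok {
            // ~100 requests/hour with a burst of 10; purely an example budget.
            l = rate.NewLimiter(rate.Limit(100.0/3600.0), 10)
            limiters[ip] = l
        }
        return l
    }

    func rateLimit(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            ip, _, err := net.SplitHostPort(r.RemoteAddr)
            if err != nil {
                ip = r.RemoteAddr
            }
            if !limiterFor(ip).Allow() {
                http.Error(w, "Too Many Requests", http.StatusTooManyRequests)
                return
            }
            next.ServeHTTP(w, r)
        })
    }

    func main() {
        log.Fatal(http.ListenAndServe(":8080",
            rateLimit(http.FileServer(http.Dir(".")))))
    }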

This whole AI scraper panic is so incredibly overblown.

I’m currently working on a sniffer that tracks all inbound TCP connections and UDP/ICMP traffic and can trigger firewall rule addition/removal based on traffic attributes (such as firewalling or rate-limiting all traffic from certain ASNs or countries) without actually having to be a reverse proxy in the HTTP flow. That way your in-kernel tables don’t need to be huge; they can be adjusted dynamically from userspace in response to actual observed traffic.
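In rough outline, a sketch of that shape: sniff passively with gopacket/libpcap, count per-source traffic in userspace, and push a verdict into an nftables set when a source crosses a threshold. The interface name, set name, and threshold are all placeholders, and the nftables set is assumed to already exist:

    package main

    import (
        "log"
        "os/exec"
        "time"

        "github.com/google/gopacket"
        "github.com/google/gopacket/pcap"
    )

    const threshold = 1000 // packets per one-minute window; arbitrary

    func main() {
        // Passive capture on the wire; we never sit in the HTTP path.
        handle, err := pcap.OpenLive("eth0", 96, true, pcap.BlockForever)
        if err != nil {
            log.Fatal(err)
        }
        defer handle.Close()
        if err := handle.SetBPFFilter("tcp or udp or icmp"); err != nil {
            log.Fatal(err)
        }

        counts := map[string]int{}
        blocked := map[string]bool{}
        packets := gopacket.NewPacketSource(handle, handle.LinkType()).Packets()
        window := time.NewTicker(time.Minute)

        for {
            select {
            case <-window.C:
                counts = map[string]int{} // start a fresh counting window
            case pkt, ok := <-packets:
                if !ok {
                    return
                }
                nl := pkt.NetworkLayer()
                if nl == nil {
                    continue
                }
                src := nl.NetworkFlow().Src().String()
                counts[src]++
                if counts[src] > threshold && !blocked[src] {
                    blocked[src] = true
                    // Push only the verdict into the kernel; assumes a
                    // set named "blocked" in table inet filter.
                    if err := exec.Command("nft", "add", "element", "inet",
                        "filter", "blocked", "{ "+src+" }").Run(); err != nil {
                        log.Println("nft:", err)
                    }
                }
            }
        }
    }

A real version would also age blocks back out, but the division of labor is the point: observe and decide in userspace, keep only the verdicts in the kernel.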

replies(1): >>45011296 #
worthless-trash No.45011296
> This whole AI scraper panic is so incredibly overblown.

The problem is that it’s eating into people’s costs, and if you’re not concerned with money, I’m just asking: can you send me $50.00 USD?

replies(1): >>45015019 #
sneak No.45015019
If people don’t want to spend the money serving the requests, then their own servers are misconfigured because responding is optional.
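Concretely, a Go handler can even hang up without sending a byte; isAbusive here is a hypothetical stand-in for whatever signal you trust:

    package main

    import (
        "log"
        "net/http"
    )

    // isAbusive is a hypothetical stand-in for whatever signal you trust
    // (ASN, user agent, observed rate); always false in this sketch.
    func isAbusive(r *http.Request) bool { return false }

    func dropAbusive(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            if isAbusive(r) {
                if hj, ok := w.(http.Hijacker); ok {
                    if conn, _, err := hj.Hijack(); err == nil {
                        conn.Close() // hang up: no status line, headers, or body
                        return
                    }
                }
                http.Error(w, "Forbidden", http.StatusForbidden) // fallback
                return
            }
            next.ServeHTTP(w, r)
        })
    }

    func main() {
        log.Fatal(http.ListenAndServe(":8080",
            dropAbusive(http.FileServer(http.Dir(".")))))
    }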
replies(2): >>45038069 #>>45039054 #
worthless-trash No.45039054
So, that is a no on the fifty?

When AI can now register and break the captchas on my site to log in, how do I compete in this arms race of keeping my protections ahead of the AI?