
556 points campuscodi | 2 comments
amatecha ◴[] No.41867018[source]
I get blocked from websites with some regularity, running Firefox with strict privacy settings, "resist fingerprinting" etc. on OpenBSD. They just give a 403 Forbidden with no explanation, but it's only ever on sites fronted by CloudFlare. Good times. Seems legit.
replies(13): >>41867245 #>>41867420 #>>41867658 #>>41868030 #>>41868383 #>>41868594 #>>41869190 #>>41869439 #>>41869685 #>>41869823 #>>41871086 #>>41873407 #>>41873926 #
mzajc ◴[] No.41868594[source]
I randomize my User-Agent header and many websites outright block me, most often with no captcha and no useful error message.

The most egregious is Microsoft (just about every Microsoft service/page, really), where all you get is a "The request is blocked." and a few pointless identifiers listed at the bottom, purely because it thinks your browser is too old.

CF's captcha page isn't any better either, usually putting me in an endless loop if it doesn't like my User-Agent.

replies(3): >>41868763 #>>41868916 #>>41870975 #
charrondev ◴[] No.41868763[source]
Are you sending an actual random string as your UA or sending one of a set of actual user agents?

You’re best off just picking real ones. We got hit by a botnet sending 10k+ requests from 40 different ASNs with 1000s of different IPs. The only way we were able to identify/block the traffic was by excluding user agents matching some regex (for whatever reason they weren’t spoofing real user agents, but they weren’t sending actual ones either).
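A minimal sketch of that kind of regex filter, in Python. The pattern is purely hypothetical (the regex we actually used isn't given here); it assumes the bogus user agents are bare alphanumeric tokens with no slash or spaces, which no mainstream browser sends.

```python
import re

# Hypothetical deny pattern: a single alphanumeric token, 8 to 32 characters,
# with no "/" or spaces. Real browser user agents contain both.
BLOCKED_UA = re.compile(r"^[A-Za-z0-9]{8,32}$")


def should_block(user_agent: str) -> bool:
    """Return True when the User-Agent is empty or matches the deny pattern."""
    if not user_agent:
        return True
    return BLOCKED_UA.fullmatch(user_agent) is not None
```

A randomized string such as `aB3dE5fG7hJ9` is rejected, while a real Firefox user agent passes. This is also why randomizing from a pool of real user agents tends to survive this sort of filter.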

replies(2): >>41868802 #>>41869639 #
RALaBarge ◴[] No.41868802[source]
I worked at an anti-spam email security company in the aughts, and we had a Perl engine that would rip apart the MIME boundaries and measure everything: UA, SMTP client fingerprint headers, even the number of anchor or paragraph tags. A large combination of IF/OR evaluations with a regex engine did a pretty good job, since botnets usually don't bother to fully randomize or really opsec the payloads they send; it is a cannon rather than a flyswatter.
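A rough Python sketch of the idea (the original engine was Perl and is not reproduced here). It uses the standard library `email` package to walk the MIME parts and count a few features; the function names and the single scoring rule are hypothetical and only there for illustration.

```python
import re
from email import message_from_string


def extract_features(raw: str) -> dict:
    """Walk the MIME tree and measure a few simple properties of a message."""
    msg = message_from_string(raw)
    html = ""
    leaf_parts = 0
    for part in msg.walk():
        if part.is_multipart():
            continue
        leaf_parts += 1
        if part.get_content_type() == "text/html":
            payload = part.get_payload(decode=True) or b""
            html += payload.decode("utf-8", errors="replace")
    return {
        "mailer": msg.get("X-Mailer") or msg.get("User-Agent") or "",
        "parts": leaf_parts,
        "anchors": len(re.findall(r"<a\b", html, flags=re.I)),
        "paragraphs": len(re.findall(r"<p\b", html, flags=re.I)),
    }


def looks_like_spam(features: dict) -> bool:
    """Hypothetical IF/OR rule: no mailer header and a link-heavy HTML body."""
    return features["mailer"] == "" and features["anchors"] >= 3
```

A real rule set would combine many such checks; the point is only that cheap structural counts become usable signals when the sender doesn't vary them.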
replies(1): >>41869593 #
1. kccqzy ◴[] No.41869593[source]
Similar techniques are known in the HTTP world too. There were things like detecting the order of HTTP request headers and matching them to known software, or even just comparing the actual content of the Accept header.
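As a rough illustration of the header-order idea in Python: the orderings below are illustrative assumptions, not an authoritative list of what any client sends, and `guess_client` is a made-up name.

```python
from typing import Optional, Sequence

# Illustrative (assumed) header orderings; real fingerprint databases are
# larger and are maintained per client version.
KNOWN_ORDERS = {
    "curl-like": ("host", "user-agent", "accept"),
    "firefox-like": (
        "host",
        "user-agent",
        "accept",
        "accept-language",
        "accept-encoding",
        "connection",
    ),
}


def guess_client(header_names: Sequence[str]) -> Optional[str]:
    """Match the order in which headers arrived against known orderings."""
    order = tuple(name.lower() for name in header_names)
    for client, expected in KNOWN_ORDERS.items():
        if order == expected:
            return client
    return None
```

The names must be captured in wire order, before any framework normalizes them into a map, otherwise the signal is lost.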
replies(1): >>41872048 #
2. miki123211 ◴[] No.41872048[source]
And then there's also TLS fingerprinting.

Different browsers use TLS in slightly different ways: they send data in a slightly different order and support different sets of extensions, algorithms, etc.

If your user agent says Safari 18, but your TLS fingerprint looks like Curl and not Safari, sophisticated services will immediately detect that something isn't right.
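One widely used way to turn those differences into a comparable value is a JA3-style hash (my choice of example, not something named above): the ClientHello's TLS version, cipher suites, extensions, curves, and point formats are joined into a string and MD5-hashed. A minimal Python sketch, assuming the ClientHello has already been parsed into lists of integers; `ua_family`, `is_consistent`, and the `known` table are hypothetical.

```python
import hashlib
from typing import Dict, Sequence, Set


def ja3_style_hash(
    version: int,
    ciphers: Sequence[int],
    extensions: Sequence[int],
    curves: Sequence[int],
    point_formats: Sequence[int],
) -> str:
    """Join the fields JA3-style (commas between fields, dashes within) and hash."""
    lists = (ciphers, extensions, curves, point_formats)
    fields = [str(version)] + ["-".join(str(v) for v in xs) for xs in lists]
    return hashlib.md5(",".join(fields).encode("ascii")).hexdigest()


def ua_family(user_agent: str) -> str:
    """Very naive, hypothetical mapping from User-Agent to a client family."""
    if user_agent.startswith("curl/"):
        return "curl"
    if "Safari/" in user_agent and "Chrome/" not in user_agent:
        return "safari"
    return "other"


def is_consistent(user_agent: str, fingerprint: str, known: Dict[str, Set[str]]) -> bool:
    """True unless the claimed family has known fingerprints and this isn't one."""
    expected = known.get(ua_family(user_agent))
    return expected is None or fingerprint in expected
```

Because the order of ciphers and extensions is part of the hashed string, two clients that support the same algorithms but list them differently still get different fingerprints.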