Using Cloudflare on your website could be blocking RSS users

(openrss.org)

559 points campuscodi | 2 comments | 16 Oct 24 22:46 UTC


amatecha ◴[17 Oct 24 06:43 UTC] No.41867018[source]▶

I get blocked from websites with some regularity, running Firefox with strict privacy settings, "resist fingerprinting" etc. on OpenBSD. They just give a 403 Forbidden with no explanation, but it's only ever on sites fronted by Cloudflare. Good times. Seems legit.

replies(14): >>41867245 #>>41867420 #>>41867658 #>>41868030 #>>41868383 #>>41868594 #>>41869190 #>>41869439 #>>41869685 #>>41869823 #>>41871086 #>>41873407 #>>41873926 #>>42002463 #

mzajc ◴[17 Oct 24 11:32 UTC] No.41868594[source]▶

>>41867018 #

I randomize my User-Agent header and many websites outright block me, most often with no captcha and no useless error message.

The most egregious is Microsoft (just about every Microsoft service/page, really), where all you get is a "The request is blocked." and a few pointless identifiers listed at the bottom, purely because it thinks your browser is too old.

CF's captcha page isn't any better either, usually putting me in an endless loop if it doesn't like my User-Agent.

replies(3): >>41868763 #>>41868916 #>>41870975 #

charrondev ◴[17 Oct 24 11:53 UTC] No.41868763[source]▶

>>41868594 #

Are you sending an actual random string as your UA or sending one of a set of actual user agents?

You’re best off just picking real ones. We got hit by a botnet sending 10k+ requests from 40 different ASNs with 1000s of different IPs. The only way we were able to identify/block the traffic was by excluding user agents matching some regex (for whatever reason they weren’t spoofing real user agents, but weren’t sending actual ones either).
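A minimal sketch of that kind of User-Agent filter, for illustration only. The comment does not say which regex was used, so the patterns and the `is_suspicious` name below are assumptions: they flag empty values, long strings that are purely alphanumeric, and a bare `Mozilla/x.y` token with nothing after it.

```python
import re

# Hypothetical patterns (NOT the regex from the comment, which is not given):
#   ^$                   empty User-Agent
#   ^[A-Za-z0-9]{16,}$   one long alphanumeric token, e.g. a random string
#   ^Mozilla/[0-9.]+$    bare product token with no platform details
SUSPICIOUS_UA = re.compile(r"^$|^[A-Za-z0-9]{16,}$|^Mozilla/[0-9.]+$")


def is_suspicious(user_agent: str) -> bool:
    """Return True if the User-Agent matches one of the assumed bad shapes."""
    return SUSPICIOUS_UA.search(user_agent.strip()) is not None


if __name__ == "__main__":
    for ua in ("", "x8Kq2LmZ0pQw9RtYv3Bn", "Mozilla/5.0",
               "Mozilla/5.0 (X11; OpenBSD amd64; rv:131.0) Gecko/20100101 Firefox/131.0"):
        print(repr(ua), is_suspicious(ua))
```

A filter like this only works while the bot's strings stay distinguishable from real ones, which is also why a randomized but well-formed real User-Agent is less likely to trip it.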

replies(2): >>41868802 #>>41869639 #

RALaBarge ◴[17 Oct 24 11:59 UTC] No.41868802[source]▶

>>41868763 #

I worked at an anti-spam email security company in the aughts, and we had a Perl engine that would rip apart the MIME boundaries and measure everything: UA, SMTP client fingerprint headers, even the number of anchor or paragraph tags. A large combination of IF/OR evaluations with a regex engine did a pretty good job, since the botnets usually don't bother to fully randomize or really opsec the payloads they are sending. It is a cannon instead of a flyswatter.
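A rough sketch of the idea, in Python rather than Perl and using only the standard `email` module. It is not the engine described above: the feature names, the choice of the `X-Mailer` header as the "UA", and the thresholds in `looks_like_spam` are all assumptions made for illustration.

```python
import re
from email import message_from_string

ANCHOR_TAG = re.compile(r"<a\b", re.IGNORECASE)
PARAGRAPH_TAG = re.compile(r"<p\b", re.IGNORECASE)


def extract_features(raw_message: str) -> dict:
    """Split a message on its MIME boundaries and count a few simple features."""
    msg = message_from_string(raw_message)
    features = {
        "mailer": msg.get("X-Mailer", ""),  # assumed stand-in for the mail "UA"
        "parts": 0,
        "anchors": 0,
        "paragraphs": 0,
    }
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no body of their own
        features["parts"] += 1
        if part.get_content_type() == "text/html":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            html = payload.decode(charset, errors="replace")
            features["anchors"] += len(ANCHOR_TAG.findall(html))
            features["paragraphs"] += len(PARAGRAPH_TAG.findall(html))
    return features


def looks_like_spam(features: dict) -> bool:
    """Hypothetical IF/OR rules; the thresholds are made up for this sketch."""
    rules = [
        features["mailer"] == "" and features["anchors"] > 0,
        features["anchors"] > 3 * max(features["paragraphs"], 1),
    ]
    return any(rules)
```

A real engine would combine many more signals than this; the point is only that cheap structural counts plus a handful of boolean rules can separate bulk payloads that were never randomized.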

replies(1): >>41869593 #

1. kccqzy ◴[17 Oct 24 13:42 UTC] No.41869593[source]▶

>>41868802 #

Similar techniques are known in the HTTP world too. There were things like detecting the order of HTTP request headers and matching them to known software, or even just comparing the actual content of the Accept header.
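As a hedged illustration of the header-order idea: the sketch below reads the header names from a raw HTTP/1.1 request in the order they arrive and looks that order up in a table. The `KNOWN_ORDERS` entries are simplified assumptions for the example, not measured fingerprints; real clients vary by version and configuration.

```python
from typing import Tuple

# Hypothetical, simplified header orders used only for this example.
KNOWN_ORDERS = {
    ("host", "user-agent", "accept"): "curl-like",
    ("host", "user-agent", "accept", "accept-language", "accept-encoding"): "firefox-like",
}


def header_order_signature(raw_request: bytes) -> Tuple[str, ...]:
    """Return the lowercased header names in the order the client sent them."""
    head = raw_request.split(b"\r\n\r\n", 1)[0]
    lines = head.split(b"\r\n")[1:]  # drop the request line
    return tuple(
        line.split(b":", 1)[0].strip().lower().decode("ascii", errors="replace")
        for line in lines
        if b":" in line
    )


def guess_client(raw_request: bytes) -> str:
    """Map the observed header order to a client family, or 'unknown'."""
    return KNOWN_ORDERS.get(header_order_signature(raw_request), "unknown")
```

The same lookup could be extended to compare the literal `Accept` value against what the claimed browser normally sends.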

replies(1): >>41872048 #

2. miki123211 ◴[17 Oct 24 18:04 UTC] No.41872048[source]▶

>>41869593 (TP) #

And then there's also TLS fingerprinting.

Different browsers use TLS in slightly different ways, send data in a slightly different order, have a different set of supported extensions / algorithms etc.

If your user agent says Safari 18, but your TLS fingerprint looks like curl and not Safari, sophisticated services will immediately detect that something isn't right.
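One publicly documented way to do this is a JA3-style fingerprint, which is a name not mentioned in the comment and is used here only as an example of the technique. The sketch assumes the ClientHello fields have already been parsed into integers; `ua_matches_tls` and the lookup table passed to it are hypothetical helpers for illustration.

```python
import hashlib

# GREASE placeholder values (0x0A0A, 0x1A1A, ..., 0xFAFA) are skipped so that
# they do not change the fingerprint from one connection to the next.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def ja3_style_hash(version, ciphers, extensions, curves, point_formats):
    """MD5 over 'version,ciphers,extensions,curves,formats' with '-' joined lists."""
    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    fields = [str(version), join(ciphers), join(extensions),
              join(curves), join(point_formats)]
    return hashlib.md5(",".join(fields).encode("ascii")).hexdigest()


def ua_matches_tls(user_agent, fingerprint, known_fingerprints):
    """Compare the claimed User-Agent with the family tied to the fingerprint.

    known_fingerprints is a hypothetical dict: hash -> client family name.
    Returns None when the fingerprint is not in the table (no verdict).
    """
    family = known_fingerprints.get(fingerprint)
    if family is None:
        return None
    return family.lower() in user_agent.lower()
```

With a table that maps a given hash to `"curl"`, a request claiming to be Safari while presenting that hash would come back as `False`, which is the mismatch described above.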
