Using Cloudflare on your website could be blocking RSS users

(openrss.org)

1. amatecha ◴[17 Oct 24 06:43 UTC] No.41867018[source]▶

>>41864632 (OP) #

I get blocked from websites with some regularity, running Firefox with strict privacy settings, "resist fingerprinting" etc. on OpenBSD. They just give a 403 Forbidden with no explanation, but it's only ever on sites fronted by CloudFlare. Good times. Seems legit.

replies(14): >>41867245 #>>41867420 #>>41867658 #>>41868030 #>>41868383 #>>41868594 #>>41869190 #>>41869439 #>>41869685 #>>41869823 #>>41871086 #>>41873407 #>>41873926 #>>42002463 #

2. BiteCode_dev ◴[17 Oct 24 07:26 UTC] No.41867245[source]▶

>>41867018 (TP) #

Cloudflare is a fantastic service with an unmatched value proposition, but it's unfortunately slowly killing web privacy, with 1000s of paper cuts.

Another problem is "resist fingerprinting" prevents some canvas processing, and many websites like bluesky, linked in or substack use canvas to handle image upload, so your images appear to be stripes of pixels.

Then you have mobile apps that just don't run if you don't have a google account, like chatgpt's native app.

I understand why people give up, trying to fight for your privacy is an uphill battle with no end in sight.

replies(3): >>41867859 #>>41867883 #>>41869163 #

3. viraptor ◴[17 Oct 24 08:05 UTC] No.41867420[source]▶

>>41867018 (TP) #

I know it's not a solution for you specifically here, but if anyone has access to the CF enterprise plan, they can report specific traffic as non-bot and hopefully improve the situation. They need to have access to the "Bot Management" feature though. It's a shitty situation, but some of us here can push back a little bit - so do it if you can.

And yes, it's sad that the "make internet work again" is behind an expensive paywall..

replies(1): >>41868257 #

4. Jazgot ◴[17 Oct 24 08:51 UTC] No.41867658[source]▶

>>41867018 (TP) #

My rss reader was blocked on kvraudio.com by cloudflare. This issue wasn't solved for months. I simply stopped reading anything on kvraudio. Thank you cloudflare!

5. madeofpalk ◴[17 Oct 24 09:28 UTC] No.41867859[source]▶

>>41867245 #

> Then you have mobile apps that just don't run if you don't have a google account, like chatgpt's native app.

Is that true? At least on iOS you can log into ChatGPT with the same email/password as the website.

I never use Google login for stuff and ChatGPT works fine for me.

replies(1): >>41867966 #

6. KomoD ◴[17 Oct 24 09:32 UTC] No.41867883[source]▶

>>41867245 #

> Then you have mobile apps that just don't run if you don't have a google account, like chatgpt's native app.

That's not true, I use ChatGPT's app on my phone without logging into a Google account.

You don't even need any kind of account at all to use it.

replies(1): >>41867965 #

7. BiteCode_dev ◴[17 Oct 24 09:47 UTC] No.41867965{3}[source]▶

>>41867883 #

On Android at least, even if you don't need to log in to your google account when connecting to chatgpt, the app won't work if your phone isn't signed in to google play, which doesn't work if your phone isn't linked to a google account.

An android phone asks you to link a google account when you use it for the first time. It takes a very dedicated user to refuse that, then to avoid logging in to the gmail, youtube or app store apps, which will all also link your phone to your google account when you sign in.

But I do actively avoid this, I use Aurora, F-droid, K9 and NewPipeX, so no link to google.

But then no ChatGPT app. When I start it, I get hit with a login page for the app store and it's game over.

replies(4): >>41868612 #>>41868843 #>>41869090 #>>41869477 #

8. BiteCode_dev ◴[17 Oct 24 09:47 UTC] No.41867966{3}[source]▶

>>41867859 #

See other comment.

9. wakeupcall ◴[17 Oct 24 09:59 UTC] No.41868030[source]▶

>>41867018 (TP) #

Also running FF with strict privacy settings and several blockers. The annoyances are constantly increasing. Cloudflare, captchas, "we think you're a bot", constantly recurring cookie popups and absurd requirements are making me hate most of the websites and services I hit nowadays.

I tried for a long time to get around it, but now when I hit a website like this I just close the tab and don't bother anymore.

replies(9): >>41868417 #>>41868617 #>>41869080 #>>41869225 #>>41870092 #>>41870195 #>>41871235 #>>41873515 #>>41884694 #

10. meeb ◴[17 Oct 24 10:39 UTC] No.41868257[source]▶

>>41867420 #

The issue here is that RSS readers are bots. Obviously perfectly sensible and useful bots, but they’re not “real people using a browser”. I doubt you could get RSS readers listed on Cloudflare’s “good bots” list either, which would let them past the default bot protection feature, given they’ll all run off random residential IPs.

replies(3): >>41868668 #>>41868842 #>>41872245 #

11. anal_reactor ◴[17 Oct 24 11:01 UTC] No.41868383[source]▶

>>41867018 (TP) #

On my phone Opera Mobile won't be allowed into some websites behind CloudFlare, most importantly 4chan

replies(1): >>41869306 #

12. afh1 ◴[17 Oct 24 11:07 UTC] No.41868417[source]▶

>>41868030 #

Same, but for VPN (either corporate or personal). Reddit blocks it completely, requires you to sign-in but even the sign-in page is "network restricted"; LinkedIn shows you a captcha but gives an error when submitting the result (several reports online); and overall a lot of 403's. All go magically away when turning off the VPN. Companies, especially adtechs like Reddit and LinkedIn, do NOT want you to browse privately, to the point they'd rather you don't use their website at all unless without a condom.

replies(4): >>41868602 #>>41868822 #>>41869694 #>>41870144 #

13. mzajc ◴[17 Oct 24 11:32 UTC] No.41868594[source]▶

>>41867018 (TP) #

I randomize my User-Agent header and many websites outright block me, most often with no captcha and no useful error message.

The most egregious is Microsoft (just about every Microsoft service/page, really), where all you get is a "The request is blocked." and a few pointless identifiers listed at the bottom, purely because it thinks your browser is too old.

CF's captcha page isn't any better either, usually putting me in an endless loop if it doesn't like my User-Agent.

replies(3): >>41868763 #>>41868916 #>>41870975 #

14. appendix-rock ◴[17 Oct 24 11:32 UTC] No.41868602{3}[source]▶

>>41868417 #

I don’t follow the logic here. There seems to be an implication of ulterior motive but I’m not seeing what it is. What aspect of ‘privacy’ offered by a VPN do you think that Reddit / LinkedIn are incentivised to bypass? From a privacy POV, your VPN is doing nothing to them, because your IP address means very little to them from a tracking POV. This is just FUD perpetuated by VPN advertising.

However, the undeniable reality is that accessing the website with a non-residential IP is a very, very strong indicator of sinister behaviour. Anyone that’s been in a position to operate one of these services will tell you that. For every…let’s call them ‘privacy-conscious’ user, there are 10 (or more) nefarious actors that present largely the same way. It’s easy to forget this as a user.

I’m all but certain that if Reddit or LinkedIn could differentiate, they would. But they can’t. That’s kinda the whole point.

replies(6): >>41869070 #>>41869084 #>>41869570 #>>41871928 #>>41873295 #>>41873620 #

15. lioeters ◴[17 Oct 24 11:34 UTC] No.41868617[source]▶

>>41868030 #

Same here. I occasionally encounter websites that won't work with ad blockers, sometimes with Cloudflare involved, and I don't even bother with those sites anymore. Same with sites that display a cookie "consent" form without an option to not accept. I reject the entire site.

Site owners probably don't even see these bounced visits, and it's such a tiny percentage of visitors who do this that it won't make a difference. Meh, it's just another annoyance to be able to use the web on our own terms.

replies(1): >>41871966 #

16. j16sdiz ◴[17 Oct 24 11:41 UTC] No.41868668{3}[source]▶

>>41868257 #

They can't whitelist by User-Agent, otherwise bots would pass just by spoofing the agent string.

If you have an enterprise plan, you can have custom rules, including allowing by URL.

17. charrondev ◴[17 Oct 24 11:53 UTC] No.41868763[source]▶

>>41868594 #

Are you sending an actual random string as your UA or sending one of a set of actual user agents?

You’re best off just picking real ones. We got hit by a botnet sending 10k+ requests from 40 different ASNs with 1000s of different IPs. The only way we were able to identify/block the traffic was by excluding user agents matching some regex (for whatever reason they weren’t spoofing real user agents but weren’t sending actual ones either).
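
For anyone wondering what that kind of filter might look like, here is a minimal sketch in Python. The pattern is made up for illustration (it flags blank values and long alphanumeric blobs); the actual regex used above was not shared, so treat `SUSPICIOUS_UA` and `is_blocked` as hypothetical names.

```python
import re

# Hypothetical pattern, NOT the rule from the comment above: it matches
# blank values and long runs of letters/digits with no product/version token.
SUSPICIOUS_UA = re.compile(r"^\s*$|^[A-Za-z0-9]{16,}$")


def is_blocked(user_agent):
    """Return True when the User-Agent is missing or matches the pattern."""
    if user_agent is None:
        return True
    return SUSPICIOUS_UA.search(user_agent) is not None
```

A real browser string such as `Mozilla/5.0 (X11; Linux x86_64; rv:131.0) Gecko/20100101 Firefox/131.0` contains spaces and slashes, so it does not match this particular pattern.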

replies(2): >>41868802 #>>41869639 #

18. RALaBarge ◴[17 Oct 24 11:59 UTC] No.41868802{3}[source]▶

>>41868763 #

I worked at an anti-spam email security company in the aughts, and we had a perl engine that would rip apart the MIME boundaries and measure everything - UA, SMTP client fingerprint headers, even the number of anchor or paragraph tags. A large combination of IF/OR evaluations with a regex engine did a pretty good job since the botnets usually don't bother to fully randomize or really opsec the payloads they are sending since it is a cannon instead of a flyswatter.

replies(1): >>41869593 #

19. acdha ◴[17 Oct 24 12:02 UTC] No.41868822{3}[source]▶

>>41868417 #

> Companies, especially adtechs like Reddit and LinkedIn, do NOT want you to browse privately, to the point they'd rather you don't use their website at all unless without a condom.

That’s true in some cases, I’m sure, but also remember that most site owners deal with lots of tedious abuse. For example, some people get really annoyed about Tor being blocked but for most sites Tor is a tiny fraction of total traffic but a fairly large percentage of the abuse probing for vulnerabilities, guessing passwords, spamming contact forms, etc. so while I sympathize for the legitimate users I also completely understand why a busy site operator is going to flip a switch making their log noise go down by a double-digit percentage.

replies(1): >>41871038 #

20. sam345 ◴[17 Oct 24 12:05 UTC] No.41868842{3}[source]▶

>>41868257 #

Not sure if I get this. It seems to me an RSS reader is as much of a bot as a browser is for HTML. It just reads RSS rather than HTML.

replies(1): >>41869616 #

21. acdha ◴[17 Oct 24 12:05 UTC] No.41868843{4}[source]▶

>>41867965 #

So the requirement is to pass the phone’s system validation process rather than having a Google account. I don’t love that but I can understand why they don’t want to pay the bill for the otherwise ubiquitous bots, and it’s why it’s an Android-specific issue.

replies(1): >>41869497 #

22. pushcx ◴[17 Oct 24 12:15 UTC] No.41868916[source]▶

>>41868594 #

Rails is going to make this much worse for you. All new apps include naive agent sniffing and block anything “old” https://github.com/rails/rails/pull/50505

replies(2): >>41869122 #>>41869662 #

23. bo1024 ◴[17 Oct 24 12:36 UTC] No.41869070{4}[source]▶

>>41868602 #

Not following what could be sinister about a GET request to a public website.

> From a privacy POV, your VPN is doing nothing to them, because your IP address means very little to them from a tracking POV.

I disagree. (1) Since I have javascript disabled, IP address is generally their next best thing to go on. (2) I don't want to give them IP address to correlate with the other data they have on me, because if they sell that data, now someone else who only has my IP address suddenly can get a bunch of other stuff with it too.

replies(3): >>41870266 #>>41871166 #>>41871182 #

24. orbisvicis ◴[17 Oct 24 12:38 UTC] No.41869080[source]▶

>>41868030 #

I have to solve captchas for Amazon while logged into my Amazon account.

replies(2): >>41873241 #>>41873646 #

25. afh1 ◴[17 Oct 24 12:38 UTC] No.41869084{4}[source]▶

>>41868602 #

IP address is a fingerprint to be shared with third parties, of course it's relevant. It's not ulterior motive, it's explicit, it's not caring about your traffic because you're not good product. They can and do differentiate by requiring a sign-in. They just don't care enough to make it actually work. Because they are adtechs and not interested in you as a user.

26. ForHackernews ◴[17 Oct 24 12:38 UTC] No.41869090{4}[source]▶

>>41867965 #

You might like: https://e.foundation/e-os/

replies(1): >>41869512 #

27. GoblinSlayer ◴[17 Oct 24 12:42 UTC] No.41869122{3}[source]▶

>>41868916 #

  def blocked?
    user_agent_version_reported? && unsupported_browser?
  end

well, you know what to do here :)

28. pjc50 ◴[17 Oct 24 12:48 UTC] No.41869163[source]▶

>>41867245 #

The privacy battle has to be at the legal layer. GDPR is far from perfect (bureaucratic and unclear with weak enforcement), but it's a step in the right direction.

In an adversarial environment, especially with both AI scrapers and AI posters, websites have to be able to identify and ban persistent abusers. Which unfortunately implies having some kind of identification of everybody.

replies(4): >>41869472 #>>41869801 #>>41870438 #>>41871900 #

29. neilv ◴[17 Oct 24 12:50 UTC] No.41869190[source]▶

>>41867018 (TP) #

Similar here. It's not unusual to be blocked from a site by CloudFlare when I'm running Firefox (either ESR or current release) on Linux.

I suspect that people operating Web sites have no idea how many legitimate users are blocked by CloudFlare.

And, based on the responses I got when I contacted two of the companies whose sites were chronically blocked by CloudFlare for months, it seemed like it wasn't worth any employee's time to try to diagnose.

Also, I'm frequently blocked by CloudFlare when running Tor Browser. Blocking by Tor exit node IP address (if that's what's happening) is much more understandable than blocking Firefox from a residential IP address, but still makes CloudFlare not a friend of people who want or need to use Tor.

replies(5): >>41869245 #>>41870049 #>>41870881 #>>41871039 #>>41872316 #

30. anilakar ◴[17 Oct 24 12:54 UTC] No.41869225[source]▶

>>41868030 #

Heck, I cannot even pass ReCAPTCHA nowadays. No amount of clicking buses, bicycles, motorcycles, traffic lights, stairs, crosswalks, bridges and fire hydrants will suffice. The audio transcript feature is the only way to get past a prompt.

replies(2): >>41869330 #>>41871389 #

31. pjc50 ◴[17 Oct 24 12:57 UTC] No.41869245[source]▶

>>41869190 #

> CloudFlare not a friend of people who want or need to use Tor

The adversarial aspect of all this is a problem: P(malicious|Tor) is much higher than P(malicious|!Tor)
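
To put hypothetical numbers on that: suppose Tor is 1% of a site's requests but carries 10% of the abusive ones, and 2% of all requests are abusive. None of these figures come from the thread; they are assumptions chosen only to show how Bayes' rule produces the gap.

```python
def posterior(p_group_given_mal, p_mal, p_group):
    """Bayes' rule: P(malicious | group) = P(group | malicious) * P(malicious) / P(group)."""
    return p_group_given_mal * p_mal / p_group


# Assumed inputs, for illustration only.
p_tor = 0.01            # share of all requests that arrive via Tor
p_mal = 0.02            # share of all requests that are abusive
p_tor_given_mal = 0.10  # share of abusive requests that arrive via Tor

p_mal_given_tor = posterior(p_tor_given_mal, p_mal, p_tor)
p_mal_given_not_tor = posterior(1 - p_tor_given_mal, p_mal, 1 - p_tor)

print(f"P(malicious|Tor)  = {p_mal_given_tor:.3f}")
print(f"P(malicious|!Tor) = {p_mal_given_not_tor:.3f}")
```

Under these assumptions a Tor request is roughly eleven times more likely to be abusive than a non-Tor one, even though most Tor requests are still legitimate.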

32. dialup_sounds ◴[17 Oct 24 13:07 UTC] No.41869306[source]▶

>>41868383 #

4chan's CF config is so janky at this point it's the only site I have to use a VPN for.

33. josteink ◴[17 Oct 24 13:09 UTC] No.41869330{3}[source]▶

>>41869225 #

Just a heads up that this is how Google treats connections it suspects to originate from bots. Silently keeping you in an endless loop promising reward if you can complete it correctly.

I discovered this when I set up IPv6 using Hurricane Electric as a tunnel broker for IPv6 connectivity.

Seemingly Google has all HEnet IPv6 tunnel subnets listed for such behaviour without it being documented anywhere. It was extremely annoying until I figured out what was going on.

replies(2): >>41869544 #>>41869837 #

34. pessimizer ◴[17 Oct 24 13:23 UTC] No.41869439[source]▶

>>41867018 (TP) #

Also, Cloudflare won't let you in if you forge your referer (it's nobody's business what site I'm coming from.) For years, you could just send the root of the site you were visiting, then last year somebody at Cloudflare flipped a switch and took a bite out of everyone's privacy. Now it's just endless reloading captchas.

replies(2): >>41869588 #>>41871887 #

35. BiteCode_dev ◴[17 Oct 24 13:27 UTC] No.41869472{3}[source]▶

>>41869163 #

That's another problem, we want cheap easy solutions like tracking people, instead of more targeted or systemic ones.

36. __MatrixMan__ ◴[17 Oct 24 13:28 UTC] No.41869477{4}[source]▶

>>41867965 #

I have a similar experience with the pager duty app. It loads up and then exits with "security problem detected by app" because I've made it more secure by isolating it from Google (a competitor). Workaround is to just control it via slack instead.

replies(1): >>41869532 #

37. BiteCode_dev ◴[17 Oct 24 13:30 UTC] No.41869497{5}[source]▶

>>41868843 #

You can make a very rational case for each privacy invasive technical decision ever made.

In the end, the fact remains: no chatgpt app without giving up your privacy, to google no less.

replies(1): >>41870085 #

38. BiteCode_dev ◴[17 Oct 24 13:32 UTC] No.41869512{5}[source]▶

>>41869090 #

That won't make chatgpt's app work though.

replies(1): >>41869944 #

39. BiteCode_dev ◴[17 Oct 24 13:34 UTC] No.41869532{5}[source]▶

>>41869477 #

Well, you can use the web-based chatgpt, so there is a workaround. Except it's a worse experience.

40. n4r9 ◴[17 Oct 24 13:36 UTC] No.41869544{4}[source]▶

>>41869330 #

> Silently keeping you in an endless loop promising reward if you can complete it correctly.

Sounds suspiciously like how product managers talk to developers as well.

41. homebrewer ◴[17 Oct 24 13:39 UTC] No.41869570{4}[source]▶

>>41868602 #

It's equally easy to forget about users from countries with way less freedom of speech and information sharing than in Western rich societies. These anti-abuse measures have made it much more difficult to access information blocked by my internet provider during the last few years. I'm relatively competent and can find ways around it, but my friends and relatives who pursue other career choices simply don't bother anymore.

Telegram channels have been a good alternative, but even that is going downhill thanks to French authorities.

Cloudflare and Google also often treat us like bots (endless captchas, etc) which makes it even more difficult.

42. zamadatix ◴[17 Oct 24 13:41 UTC] No.41869588[source]▶

>>41869439 #

Why go through that hassle instead of just removing the referer?

replies(1): >>41871044 #

43. kccqzy ◴[17 Oct 24 13:42 UTC] No.41869593{4}[source]▶

>>41868802 #

Similar techniques are known in the HTTP world too. There were things like detecting the order of HTTP request headers and matching them to known software, or even just comparing the actual content of the Accept header.
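
A rough sketch of the header-order idea in Python, for illustration. The two orderings in `KNOWN_ORDERS` are invented placeholders, not measured fingerprints of any real client, and the function names are hypothetical.

```python
# Placeholder orderings; real fingerprint tables are built from observed traffic.
KNOWN_ORDERS = {
    ("host", "user-agent", "accept", "accept-language", "accept-encoding"):
        "browser-like (hypothetical)",
    ("host", "user-agent", "accept-encoding", "accept"):
        "script-like (hypothetical)",
}


def header_order_signature(raw_request):
    """Return the lowercased header names in the order they were sent."""
    head = raw_request.split("\r\n\r\n", 1)[0]
    header_lines = head.split("\r\n")[1:]  # skip the request line
    return tuple(
        line.split(":", 1)[0].strip().lower()
        for line in header_lines
        if ":" in line
    )


def classify(raw_request):
    """Look up the header order in the table; anything unseen is 'unknown'."""
    return KNOWN_ORDERS.get(header_order_signature(raw_request), "unknown")
```

The point is that the User-Agent value itself is never consulted, so changing that string alone does not change the result.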

replies(1): >>41872048 #

44. kccqzy ◴[17 Oct 24 13:44 UTC] No.41869616{4}[source]▶

>>41868842 #

The difference is that RSS readers usually do background fetches on their own rather than waiting for a human to navigate to a page. So in theory, you could just set up a crontab (or systemd timer) that simply xdg-open various pages on a schedule and not be treated as bots.

45. mzajc ◴[17 Oct 24 13:48 UTC] No.41869639{3}[source]▶

>>41868763 #

I use the Random User-Agent Switcher[1] extension on Firefox. It does pick real agents, but some of them might show a really outdated browser (eg. Firefox 5X), which I assume is the reason I'm getting blocked.

[1]: https://addons.mozilla.org/en-US/firefox/addon/random_user_a...

46. mzajc ◴[17 Oct 24 13:50 UTC] No.41869662{3}[source]▶

>>41868916 #

This is horrifying. What happened to simply displaying a "Your browser is outdated, consider upgrading" banner on the website?

replies(4): >>41870155 #>>41870213 #>>41870800 #>>41875058 #

47. anthk ◴[17 Oct 24 13:52 UTC] No.41869685[source]▶

>>41867018 (TP) #

Or any Dillo user, with a PSP User Agent which is legit for small displays.

48. anthk ◴[17 Oct 24 13:53 UTC] No.41869694{3}[source]▶

>>41868417 #

For Reddit I just use it r/o under gopher://gopherddit.com

A good client is either Lagrange (multiplatform), the old Lynx or Dillo with the Gopher plugin.

replies(1): >>41870388 #

49. nonameiguess ◴[17 Oct 24 14:05 UTC] No.41869801{3}[source]▶

>>41869163 #

No, it's more than that. Cloudflare's bot protection has blocked me from sites where I have a paid account, paid for by my real checking account with my real name attached. Even when I am perfectly willing to give out my identity and be tracked, I still can't because I can't even get to the login page.

replies(1): >>41873954 #

50. jasonlotito ◴[17 Oct 24 14:07 UTC] No.41869823[source]▶

>>41867018 (TP) #

Cloudflare has always been a dumpster fire in usability. The number of times it would block me in that way was enough to make me seriously question anyone's technical knowledge that used it. It's a dumpster fire. Friends don't let friends use Cloudflare. To me, it's like the Spirit airlines of CDNs.

Sure, tech wise it might work great, but from your users' perspective: it's trash.

replies(1): >>41872085 #

51. anilakar ◴[17 Oct 24 14:09 UTC] No.41869837{4}[source]▶

>>41869330 #

Sadly my biggest crime is running Firefox with default privacy settings and uBlock Origin installed. No VPNs or IPv6 tunnels, no Tor traffic whatsoever, no Google search history poisoning plugins.

If only there was a law that allowed one to be excluded from automatic behavior profiling...

52. ForHackernews ◴[17 Oct 24 14:21 UTC] No.41869944{6}[source]▶

>>41869512 #

It might well do, depending on what ChatGPT's app is asking the OS for. /e/OS is an Android fork that removes Google services and replaces them with open source stubs/re-implementations from https://microg.org/

I haven't tried the ChatGPT app, but I know that, for example my bank and other financial services apps work with on-device fingerprint authentication and no Google account on /e/OS.

replies(1): >>41877120 #

53. jorams ◴[17 Oct 24 14:32 UTC] No.41870049[source]▶

>>41869190 #

> I suspect that people operating Web sites have no idea how many legitimate users are blocked by CloudFlare.

I sometimes wonder if all Cloudflare employees are on some kind of whitelist that makes them not realize the ridiculous false positive rate of their bot detection.

54. acdha ◴[17 Oct 24 14:36 UTC] No.41870085{6}[source]▶

>>41869497 #

“Giving up your privacy” is a pretty sweeping claim – it sounds like you’re saying that Android inherently leaks private data to Google, which is broader than even Apple fans tend to say.

replies(3): >>41870994 #>>41871023 #>>41874026 #

55. amanda99 ◴[17 Oct 24 14:37 UTC] No.41870092[source]▶

>>41868030 #

Yes and the most infuriating thing is the "we need to verify the security of your connection" text.

56. Adachi91 ◴[17 Oct 24 14:43 UTC] No.41870144{3}[source]▶

>>41868417 #

> Reddit blocks it completely, requires you to sign-in but even the sign-in page is "network restricted";

I've been creating accounts every time I need to visit Reddit now to read a thread about [insert subject]. They do not validate E-Mail, so I just use `example@example.com`, whatever random username it suggests, and `example` as a password. I've created at least a thousand accounts at this point.

Malicious Compliance, until they disable this last effort at accessing their content.

replies(3): >>41871155 #>>41871451 #>>41872024 #

57. shbooms ◴[17 Oct 24 14:44 UTC] No.41870155{4}[source]▶

>>41869662 #

idk, even that seems too much to me, but maybe I'm just being too sensitive.

but like, why is it a website's job to tell me what browser version to use? unless my outdated browser is lacking legitimate functionality which is required by your website, just serve the page and be done with it.

replies(1): >>41871281 #

58. JohnFen ◴[17 Oct 24 14:49 UTC] No.41870195[source]▶

>>41868030 #

> when I hit a website like this I just close the tab and don't bother anymore.

Yeah, that's my solution as well. I take those annoyances as the website telling me that they don't want me there, so I grant them their wish.

replies(1): >>41872040 #

59. freedomben ◴[17 Oct 24 14:51 UTC] No.41870213{4}[source]▶

>>41869662 #

Wow. And this is now happening right as I've blacklisted google-chrome due to manifest v3 removal :facepalm:

60. zahllos ◴[17 Oct 24 14:56 UTC] No.41870266{5}[source]▶

>>41869070 #

SQL injection?

Get parameters can be abused like any parameter. This could be sql, could be directory traversal attempts, brute force username attempts, you name it.

replies(1): >>41871401 #

61. ◴[17 Oct 24 15:08 UTC] No.41870388{4}[source]▶

>>41869694 #

62. wbl ◴[17 Oct 24 15:15 UTC] No.41870438{3}[source]▶

>>41869163 #

You notice that Analog Devices puts their (incredibly useful) information up for free. That's because they make money other ways. The ad-supported content-farm Internet had a nice run, but we will get on without it.

63. whoopdedo ◴[17 Oct 24 15:51 UTC] No.41870800{4}[source]▶

>>41869662 #

The irony being you can get around the block by pretending to be a bot.

https://github.com/rails/rails/pull/52531

64. amatecha ◴[17 Oct 24 16:00 UTC] No.41870881[source]▶

>>41869190 #

Yeah, I've contacted numerous owners of personal/small sites and they are usually surprised, and never have any idea why I was blocked (not sure if it's an aspect of CF not revealing the reason, or the owner not knowing how to find that information). One or two allowlisted my IP but that doesn't strike me as a solution.

I've contacted companies about this and they usually just tell me to use a different browser or computer, which is like "duh, really?" , but also doesn't solve the problem for me or anyone else.

65. lovethevoid ◴[17 Oct 24 16:10 UTC] No.41870975[source]▶

>>41868594 #

Not sure a random UA extension is giving you much privacy. Check your results on coveryourtracks.eff.org and see. A random UA would provide a lot of identifying information despite being randomized.

From experience, a lot of the things people do in hopes of protecting their privacy only makes them far easier to profile.

replies(1): >>41871173 #

66. michaelt ◴[17 Oct 24 16:11 UTC] No.41870994{7}[source]▶

>>41870085 #

A person who was maximally distrustful of Google would assume they link your phone and your IP through the connection used to receive push notifications, and the wifi-network-visibility-to-location API, and the software update checker, and the DNS over HTTPS, and suchlike. As a US company, they could even be forced to do this in secret against their will, and lie about it.

Of course as Google doesn't claim they do this, many people would consider it unreasonably fearful/cynical.

replies(1): >>41872182 #

67. BiteCode_dev ◴[17 Oct 24 16:14 UTC] No.41871023{7}[source]▶

>>41870085 #

Google and Apple were both part of the PRISM program, of course I'm making this claim.

It's the opposite stance that would be bonkers.

replies(1): >>41872251 #

68. rolph ◴[17 Oct 24 16:15 UTC] No.41871038{4}[source]▶

>>41868822 #

funny thing, when FF is blocked i can get through with TOR.

replies(1): >>41875575 #

69. lovethevoid ◴[17 Oct 24 16:15 UTC] No.41871039[source]▶

>>41869190 #

What are some examples? I've been running ff on linux for quite some time now and am rarely blocked. I just run it with ublock origin.

replies(1): >>41871996 #

70. bityard ◴[17 Oct 24 16:16 UTC] No.41871044{3}[source]▶

>>41869588 #

Lots of sites see an empty referrer and send you to their main page or marketing page. Which means you can't get anywhere else on their site without a valid referrer. They consider it a form of "hotlink" protection.
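
As a sketch of how such a check might work on the server side (an assumption about the general pattern, not any specific site's code; `route_request` and `example.com` are hypothetical):

```python
from urllib.parse import urlsplit


def route_request(path, referer, site_host="example.com"):
    """Return the path to serve; deep links without a same-site Referer go to '/'."""
    if path == "/":
        return "/"
    if not referer:
        # Empty or missing Referer: bounce to the main page.
        return "/"
    if urlsplit(referer).hostname != site_host:
        # Assumed here: off-site referers are treated like hotlinks too.
        return "/"
    return path
```

A Referer set to the site's own root, as described earlier in the thread, would pass this particular check, while a stripped Referer would not.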

(I'm not saying I agree with it, just that it exists.)

replies(1): >>41872789 #

71. DrillShopper ◴[17 Oct 24 16:20 UTC] No.41871086[source]▶

>>41867018 (TP) #

Maybe after the courts break up Amazon the FTC can turn its eye to Cloudflare.

replies(1): >>41871107 #

72. gjsman-1000 ◴[17 Oct 24 16:23 UTC] No.41871107[source]▶

>>41871086 #

A. Do you think courts give a darn about the 0.1% of users that are still using RSS? We might as well care about the 0.1% of users who want the ability to set every website's background color to purple with neon green anchor tags. RSS never caught on as a standard to begin with, peaking at 6% adoption by 2005.

B. Cloudflare has healthy competition with AWS, Akamai, Fastly, Bunny.net, Mux, Google Cloud, Azure, you name it, there's a competitor. This isn't even an Apple vs Google situation.

replies(1): >>41874383 #

73. hombre_fatal ◴[17 Oct 24 16:29 UTC] No.41871155{4}[source]▶

>>41870144 #

Most subreddits worth posting on usually have a minimum account age + minimum account karma. I've found it annoying to register new accounts too often.

74. ◴[17 Oct 24 16:31 UTC] No.41871166{5}[source]▶

>>41869070 #

75. mzajc ◴[17 Oct 24 16:32 UTC] No.41871173{3}[source]▶

>>41870975 #

coveryourtracks.eff.org is a great service, but it has a few limitations that apply here:

- The website judges your fingerprint based on how unique it is, but assumes that it's otherwise persistent. Randomizing my User-Agent serves the exact opposite - a given User-Agent might be more unique than using the default, but I randomize it to throw trackers off.

- To my knowledge, its "One in x browsers" metric (and by extension the "Bits of identifying information" and the final result) are based off of visitor statistics, which would likely be skewed as most of its visitors are privacy-conscious. They only say they have a "database of many other Internet users' configurations," so I can't verify this.

- Most of the measurements it makes rely on javascript support. For what it's worth, it claims my fingerprint is not unique when javascript is disabled, which is how I browse the web by default.

The other extreme would be fixing my User-Agent to the most common value, but I don't think that'd offer me much privacy unless I also used a proxy/NAT shared by many users.

replies(2): >>41873984 #>>41874723 #

76. hombre_fatal ◴[17 Oct 24 16:32 UTC] No.41871182{5}[source]▶

>>41869070 #

At the very least, they're wasting bandwidth to a (likely) low quality connection.

But anyone making malicious POST requests, like spamming chatGPT comments, first makes GET requests to load the submission and find comments to reply to. If they think you're a low quality user, I don't see why they'd bother just locking down POSTs.

77. SoftTalker ◴[17 Oct 24 16:40 UTC] No.41871235[source]▶

>>41868030 #

Same. If a site doesn't want me there, fine. There's no website that's so crucial to my life that I will go through those kinds of contortions to access it.

78. michaelt ◴[17 Oct 24 16:43 UTC] No.41871281{5}[source]▶

>>41870155 #

Back when the sun was setting on IE6, sites deployed banners that basically meant "We don't test on this, there's a good chance it's broken, but we don't know the specifics because we don't test with it"

79. marssaxman ◴[17 Oct 24 16:55 UTC] No.41871389{3}[source]▶

>>41869225 #

There's a pho restaurant near where I work which wants you to scan a QR code at the table, then order and pay through their website instead of talking to a person. In three visits, I have not once managed to get past their captcha!

(The actual process at this restaurant is to sit down, fuss with your phone a bit, then get up like you're about to leave; someone will arrive promptly to take your order.)

replies(1): >>41873314 #

80. kam ◴[17 Oct 24 16:57 UTC] No.41871401{6}[source]▶

>>41870266 #

If your site is vulnerable to SQL injection, you need to fix that, not pretend Cloudflare will save you.
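
One common way to do that fix is a parameterized query, where the driver passes the value separately from the SQL text. A minimal sketch with Python's built-in `sqlite3` module; the `users` table and `find_user` helper are made up for the example.

```python
import sqlite3


def find_user(conn, username):
    """Look up a user by name using a bound parameter instead of string formatting."""
    cur = conn.execute(
        "SELECT id, name FROM users WHERE name = ?",  # "?" is a placeholder
        (username,),
    )
    return cur.fetchall()


# Throwaway in-memory database for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

print(find_user(conn, "alice"))
print(find_user(conn, "' OR '1'='1"))
```

The second call is the classic injection string; because it is bound as a value, it is compared literally against `name` and matches nothing.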

replies(1): >>41874774 #

81. zargon ◴[17 Oct 24 17:02 UTC] No.41871451{4}[source]▶

>>41870144 #

They verify signup emails now. At least for me.

82. philsnow ◴[17 Oct 24 17:49 UTC] No.41871887[source]▶

>>41869439 #

Ah, maybe this is what’s happening to me. I use Firefox with uBlock Origin, Privacy Badger, multi-account containers, and temporary containers.

Whenever I click a link to another site, I get a new tab in either a pre-assigned container or a “tmpNNNN” container, and I think that, either by default or because I configured it that way, Referer headers are omitted on those new-tab navigations.

83. Gormo ◴[17 Oct 24 17:51 UTC] No.41871900{3}[source]▶

>>41869163 #

> The privacy battle has to be at the legal layer.

I couldn't disagree more. The way to protect privacy is to make privacy the standard at the implementation layer, and to make it costly and difficult to breach it.

Trying to rely on political institutions without the practical and technical incentives favoring privacy will inevitably result in the political institutions themselves becoming the main instrument that erodes privacy.

replies(1): >>41873938 #

84. miki123211 ◴[17 Oct 24 17:54 UTC] No.41871928{4}[source]▶

>>41868602 #

> For every…let’s call them ‘privacy-conscious’ user, there are 10 (or more) nefarious actors that present largely the same way.

And each one of these could potentially create thousands of accounts, and do 100x as many requests as a normal user would.

Even if only 1% of the people using your service are fraudsters, a normal user has at most a few accounts, while fraudsters may try to create thousands per day. This means that e.g. 90% of your signups are fraudulent, despite the population of fraudsters being extremely small.
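The arithmetic behind that claim can be sketched in a few lines. The population and per-user account counts below are made-up illustrative numbers, not figures from any real service; the point is only that a 1% minority can produce roughly 90% of signups.

```python
def fraudulent_signup_share(fraudsters, normal_users,
                            accounts_per_fraudster, accounts_per_normal_user):
    """Return the fraction of all signups that come from fraudsters."""
    fraud_signups = fraudsters * accounts_per_fraudster
    normal_signups = normal_users * accounts_per_normal_user
    return fraud_signups / (fraud_signups + normal_signups)


# Hypothetical population: 1,000 people, 1% of them fraudsters.
# Each normal user signs up once; each fraudster creates 1,000 accounts.
share = fraudulent_signup_share(
    fraudsters=10,
    normal_users=990,
    accounts_per_fraudster=1000,
    accounts_per_normal_user=1,
)
print(f"{share:.0%} of signups are fraudulent")  # 91% of signups are fraudulent
```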

85. capitainenemo ◴[17 Oct 24 17:57 UTC] No.41871966{3}[source]▶

>>41868617 #

It's a tiny percentage of visitors, but a tech-savvy one, and depending on your website, they could be a higher-than-average percentage of useful users or product purchasers. The impact could be disproportionate. What's frustrating is that many websites don't even realise it is happening, because the reporting from the intermediary (Cloudflare, say) is inaccurate or incorrectly represents how it works. Fingerprinting has become integral to bot "protection". It's also frustrating when people think this can be a drop-in, and put it in front of APIs that are completely incapable of handling the challenge, with no special casing (encountered on FedEx, GoFundMe), much like the RSS reader problem.

86. capitainenemo ◴[17 Oct 24 17:59 UTC] No.41871996{3}[source]▶

>>41871039 #

Odds are they have Resist Fingerprinting turned on. When I use it in a Firefox profile I encounter this all over the place. Drupal, FedEx.. some sites handle it better than others. Some it's a hard block with a single terse error. Some it is a challenge which gets blocked due to using remote javascript. Some it's a local challenge you can get past. But it has definitely been getting worse. Fingerprinting is being normalised, and the excuse of "bot protection" (bots can make unique fingerprints too, though) means that it can now be used maliciously (or by ad networks like google, same diff) as a standard feature.

replies(1): >>41874764 #

87. immibis ◴[17 Oct 24 18:02 UTC] No.41872024{4}[source]▶

>>41870144 #

I've created a few thousand accounts through a VPN (random node per account). After doing that, I found out Reddit accounts created through VPNs are automatically shadow banned the second time they comment (I think the first is also shadow deleted in some way). But they allow you to browse from a shadow banned account just fine.

88. immibis ◴[17 Oct 24 18:03 UTC] No.41872040{3}[source]▶

>>41870195 #

That's fine. You were an obstacle to their revenue gathering anyway.

89. miki123211 ◴[17 Oct 24 18:04 UTC] No.41872048{5}[source]▶

>>41869593 #

And then there's also TLS fingerprinting.

Different browsers use TLS in slightly different ways, send data in a slightly different order, have a different set of supported extensions / algorithms etc.

If your user agent says Safari 18, but your TLS fingerprint looks like Curl and not Safari, sophisticated services will immediately detect that something isn't right.
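As a rough illustration of how such a mismatch could be detected, here is a minimal sketch in the style of JA3, one publicly documented fingerprinting scheme that hashes ClientHello fields in the order the client sent them. The comment above does not say which scheme any particular service uses, and every numeric value, the `known` table, and the `claim_is_consistent` helper are hypothetical. Real implementations also handle details (such as ignoring GREASE values) that are skipped here.

```python
import hashlib


def ja3_style_fingerprint(tls_version, ciphers, extensions, curves, point_formats):
    """Hash ClientHello fields JA3-style: values joined by '-', fields by ','."""
    fields = [
        str(tls_version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    return hashlib.md5(",".join(fields).encode("ascii")).hexdigest()


def claim_is_consistent(claimed_client, hello, known_fingerprints):
    """True if the observed hello matches the fingerprint on file for the claimed client."""
    return known_fingerprints.get(claimed_client) == ja3_style_fingerprint(**hello)


# Made-up ClientHello contents; order matters, so reordering changes the hash.
safari_like = dict(tls_version=771, ciphers=[4865, 4866, 4867],
                   extensions=[0, 23, 65281, 10, 11], curves=[29, 23, 24],
                   point_formats=[0])
curl_like = dict(tls_version=771, ciphers=[4866, 4867, 4865],
                 extensions=[0, 11, 10, 23], curves=[29, 23],
                 point_formats=[0])

# Hypothetical table mapping a client family to its expected fingerprint.
known = {"Safari": ja3_style_fingerprint(**safari_like)}

print(claim_is_consistent("Safari", safari_like, known))  # True
print(claim_is_consistent("Safari", curl_like, known))    # False
```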

90. immibis ◴[17 Oct 24 18:08 UTC] No.41872085[source]▶

>>41869823 #

It's got the best vendor lock-in enshittification story - it's free - and that's all that matters.

91. acdha ◴[17 Oct 24 18:17 UTC] No.41872182{8}[source]▶

>>41870994 #

Sure, but that says you shouldn’t have a phone, not that ChatGPT is forcing you to give up your privacy.

92. viraptor ◴[17 Oct 24 18:23 UTC] No.41872245{3}[source]▶

>>41868257 #

I was responding to a person with Firefox issues, not RSS.

I'm not sure either if RSS bots could be added to good bots, but if anyone has traffic from them, we can definitely try. (No high hopes though, given the responses I got from support so far)

93. acdha ◴[17 Oct 24 18:23 UTC] No.41872251{8}[source]▶

>>41871023 #

PRISM covered communications through U.S. companies’ servers. It was not a magic back door giving them access to your device’s local data, and even if you did believe that it was, the answer would be not using a phone. A major intelligence agency does not need you to have a Google account so they can spy on you.

replies(1): >>41877118 #

94. johnklos ◴[17 Oct 24 18:30 UTC] No.41872316[source]▶

>>41869190 #

I've had several discussions that were literally along the lines of, "we don't see what you're talking about in our logs". Yes, you don't - traffic is blocked before it gets to your servers!

95. zamadatix ◴[17 Oct 24 19:09 UTC] No.41872789{4}[source]▶

>>41871044 #

Fair and valid answer to my wording. Rewritten for what I meant to ask: "Why set referrer to the base of the destination origin instead of something like Referrer-Policy: strict-origin?". I.e. remove it completely for cross-origin instead of always making up that you came from the destination.

Though what you mention does beg the question "is there really much privacy gain in that over using Referrer-Policy: same-origin and having referrer based pages work right?" I suppose so if you're randomizing your identity in an untrackable way for each connection it could be attractive... though I think that'd trigger being suspected as a bot far before the lack of proper same origin info :p.
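For readers comparing the policies named above, here is a simplified sketch of what ends up in the Referer header under four standard `Referrer-Policy` values. It is an approximation for illustration, not a full implementation of the spec: it ignores credentials in URLs and treats only an https-to-non-https navigation as a downgrade. Worth noting: `strict-origin` still sends the bare origin on cross-origin requests, while `same-origin` is the one that omits the header entirely. The function names and example URLs are hypothetical.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_tuple(url):
    """(scheme, host, port) triple used to compare origins."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port or DEFAULT_PORTS.get(parts.scheme))


def origin_only(url):
    """The origin with a trailing slash, as sent by origin-style policies."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"


def referer_for(policy, source_url, dest_url):
    """Return the Referer value for a navigation, or None if it is omitted."""
    same_origin = origin_tuple(source_url) == origin_tuple(dest_url)
    downgrade = (urlsplit(source_url).scheme == "https"
                 and urlsplit(dest_url).scheme != "https")
    if policy == "no-referrer":
        return None
    if policy == "same-origin":
        # Full URL (minus fragment) within the origin, nothing across origins.
        full = urlsplit(source_url)._replace(fragment="").geturl()
        return full if same_origin else None
    if policy == "origin":
        return origin_only(source_url)
    if policy == "strict-origin":
        return None if downgrade else origin_only(source_url)
    raise ValueError(f"policy not covered by this sketch: {policy}")


src = "https://a.example/page?x=1#frag"
for policy in ("no-referrer", "same-origin", "origin", "strict-origin"):
    print(policy, "->", referer_for(policy, src, "https://b.example/"))
```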

96. tenken ◴[17 Oct 24 20:07 UTC] No.41873241{3}[source]▶

>>41869080 #

Why?! ... I've had 404 pages on Amazon, but never a captcha...

97. ◴[17 Oct 24 20:13 UTC] No.41873295{4}[source]▶

>>41868602 #

98. eddythompson80 ◴[17 Oct 24 20:16 UTC] No.41873314{4}[source]▶

>>41871389 #

I’ve only seen that at Asian restaurants near a university in my city. When I asked I was told that this is a common way in China and they get a lot of international students who prefer/expect it that way.

99. KPGv2 ◴[17 Oct 24 20:26 UTC] No.41873407[source]▶

>>41867018 (TP) #

Reddit seems to do this to me (sometimes) when I use Zen browser. When I switch over to Safari or Chrome, the site always works great.

100. doctor_radium ◴[17 Oct 24 20:37 UTC] No.41873515[source]▶

>>41868030 #

Hey, same here! For better or worse, I use Opera Mini for much of my mobile browsing, and it fares far worse than Firefox with uBlock Origin and ResistFingerprinting. I complained about this roughly a year ago on a similar HN thread, on which a Cloudflare rep also participated. Since then something changed, but both sides being black boxes, I can't tell if Cloudflare is wising up or Mini has stepped up. I still get the same challenge pages, but Mini gets through them automatically now, more often than not.

But not always. My most recent stumbling block is https://www.napaonline.com. Guess I'm buying oxygen sensors somewhere else.

101. ruszki ◴[17 Oct 24 20:48 UTC] No.41873620{4}[source]▶

>>41868602 #

Has anybody actually been stopped from doing nefarious things by these annoyances?

It's like at my current and previous companies. They impose a lot of security restrictions. The problem is, if somebody wants to get data out (or in), they can do it anytime. The security department says the restrictions are there to prevent "accidental" leaks. I'm still waiting for a single instance where they caught an "accidental" leak rather than just adding extra steps, after which I achieve the exact same thing. Even when I caused a real potential leak, nobody stopped me from doing it. The only reason they have these security services/apps is to push responsibility onto other companies.

102. m463 ◴[17 Oct 24 20:50 UTC] No.41873646{3}[source]▶

>>41869080 #

At one point I couldn't access Amazon at night.

I would get a different captcha, a convoluted one that wouldn't even load the required images.

And I would get the oops sorry dog page for everything.

I finally contacted amazon, gave them my (static) ip address and it was good.

In other locations, I have to solve a 6-distorted-letter captcha to log in, but that's the extent of it.

103. kjkjadksj ◴[17 Oct 24 21:20 UTC] No.41873926[source]▶

>>41867018 (TP) #

Reddit has been bad about it as of late too

104. HappMacDonald ◴[17 Oct 24 21:21 UTC] No.41873938{4}[source]▶

>>41871900 #

Yet without regulation nothing stops large companies from simply swapping the implementation layer for one that pads their bottom line better, or just rebuilding it from scratch.

If people who valued privacy really controlled the implementation layer we wouldn't have gotten to this point in the first place.

replies(1): >>41874117 #

105. HappMacDonald ◴[17 Oct 24 21:23 UTC] No.41873954{4}[source]▶

>>41869801 #

They block such visits because their pragma suspects that your visit is the account of a real human that was hacked by a bot.

106. HappMacDonald ◴[17 Oct 24 21:27 UTC] No.41873984{4}[source]▶

>>41871173 #

I would just fingerprint you as "the only person on the internet who is scrambling their UA string" :)

107. ForHackernews ◴[17 Oct 24 21:31 UTC] No.41874026{7}[source]▶

>>41870085 #

> it sounds like you’re saying that Android inherently leaks private data to Google, which is broader than even Apple fans tend to say.

Yes? I mean, not "leaks" - it's designed to upload your private data to Google and others.

https://www.tcd.ie/news_events/articles/study-reveals-scale-...

> Even when minimally configured and the handset is idle, with the notable exception of e/OS, these vendor-customised Android variants transmit substantial amounts of information to the OS developer and to third parties such as Google, Microsoft, LinkedIn, and Facebook that have pre-installed system apps. There is no opt-out from this data collection.

108. Gormo ◴[17 Oct 24 21:42 UTC] No.41874117{5}[source]▶

>>41873938 #

The point we're at is one in which privacy is still attainable via implementation-layer measures, even if it requires investing some effort and making some trade-offs to sustain. The alternative -- placing trust in regulation, which never works in the long run -- will inevitably result in regulatory capture that eliminates those remaining practical measures and replaces them with, at best, a performative illusion.

109. HappMacDonald ◴[17 Oct 24 22:19 UTC] No.41874383{3}[source]▶

>>41871107 #

Cloudflare doesn't offer the same product suite as the other companies you mention, though. Cloudflare is primarily DDoS prevention while the others are primarily cloud hosting.

And it is the DDoS prevention measures at issue here.

replies(1): >>41875086 #

110. lovethevoid ◴[17 Oct 24 23:08 UTC] No.41874723{4}[source]▶

>>41871173 #

Randomizing to throw trackers off only works if you only ever visit sites once.

But yes, without javascript a lot of tracking functions fail to operate. That is good for privacy, and EFF notes that on the site.

You can fix your UA to a common value. It's about providing the fewest identifying bits, and randomizing it just provides another bit to identify you by. Always remember: an absence of information is also valuable information!
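One way to make "identifying bits" concrete is the surprisal of a value: a trait shared by a fraction p of visitors reveals -log2(p) bits. The shares below are invented for illustration and are not measurements from EFF or anyone else.

```python
import math


def identifying_bits(share_of_visitors):
    """Bits of identifying information revealed by a value seen in this share of visitors."""
    if not 0 < share_of_visitors <= 1:
        raise ValueError("share must be in (0, 1]")
    return -math.log2(share_of_visitors)


# Hypothetical: a common UA string used by 1 in 8 visitors.
print(identifying_bits(1 / 8))  # 3.0

# Hypothetical: a randomized UA string that only 1 in a million visitors sends.
print(round(identifying_bits(1 / 1_000_000), 1))  # 19.9
```

Under these assumed shares, the rare randomized string gives a tracker far more to work with than the common one, which is the commenter's point.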

111. lovethevoid ◴[17 Oct 24 23:13 UTC] No.41874764{4}[source]▶

>>41871996 #

I also use Mullvad Browser (a browser based on Firefox), and it supports resisting fingerprinting without any of those blocks. Tried it on Drupal and FedEx. Loads Cloudflare sites normally.

I'm guessing if it's really Resist Fingerprinting on Firefox (something Mullvad also has on by default), then there are other settings that aren't being enabled causing the issue. Mullvad actually lists the settings related to resisting fingerprinting here - https://mullvad.net/en/browser/hard-facts

replies(1): >>41875177 #

112. zahllos ◴[17 Oct 24 23:15 UTC] No.41874774{7}[source]▶

>>41871401 #

Obviously. But I was responding to "what is sinister about a GET request". To put it a slightly different way, it does not matter so much whether the request is a read or a write. For example, DNS amplification attacks work by asking a DNS server (a read) for a record much larger than the request packet, and faking the request IP to match the victim. That's not even a connection the victim initiated, but that packet still travels along the network path. In fact, if it crashes a switch or something along the way, that's just as good from the point of view of the attacker, maybe even better as it will have more impact.

I am absolutely not a fan of all these "are you human?" checks at all, doubly so when ad-blockers trigger them. I think there are very legitimate reasons for wanting to access certain sites without being tracked - anything related to health is an example.

Maybe I should have made a more substantive comment, but I don't believe this is as simple a problem as reducing it to request types.

113. hombre_fatal ◴[18 Oct 24 00:00 UTC] No.41875058{4}[source]▶

>>41869662 #

It does do that, though.

https://github.com/rails/rails/pull/50505/files#diff-dce8d06...

114. gjsman-1000 ◴[18 Oct 24 00:04 UTC] No.41875086{4}[source]▶

>>41874383 #

Five years ago, you would’ve been right, but Cloudflare is very different now.

Nowadays, Cloudflare has image compression and CDN services, video storage and delivery services, serverless compute with Workers, domain registration, (soon) container support with optional GPUs, Durable Objects (basically serverless storage), serverless SQL databases (D1), even an AWS S3 competitor with R2. They even have bespoke services like Cloudflare Tunnels - what’s AWS got that’s anything like it?

Cloudflare is getting close to full-on AWS. At least, the parts most customers use. If they just added boring old VPSs, people would realize very quickly how full featured they are.

As for DDoS mitigation - you’ve still got AWS Shield, Akamai, Azure, Radware, F5, even Oracle (Dyn) competing in that market. Unless you could show Cloudflare did illegal tying as a monopolist specifically to sell DDoS prevention, there’s no case.

115. capitainenemo ◴[18 Oct 24 00:20 UTC] No.41875177{5}[source]▶

>>41874764 #

Or it could simply be that, since it is on by default for Mullvad, Cloudflare and others have an explicit exception built in for it. It might also depend on where the traffic is coming from. I have had different behaviour with different ISPs. Perhaps your entire VPN network gets a pass, depending on how they manage abuse, or on how much unique information they can get from the few bits of info the browser leaks combined with the uniqueness of the browser and VPN connection IPs.

116. mmooss ◴[18 Oct 24 01:32 UTC] No.41875575{5}[source]▶

>>41871038 #

With what browser? The same one that's blocked?

117. BiteCode_dev ◴[18 Oct 24 07:31 UTC] No.41877118{9}[source]▶

>>41872251 #

Forest for the trees.

Google and Apple are both heavily invested in ads (Apple made 4.7 billion from ads in 2022), they have a track record of exfiltrating your data (remember contractors listening to your Siri recordings?) and of lying to customers (remember the home button scandal on iPhone?), and they have control over a device that holds your whole life yet runs partially on code you can't evaluate.

Trusting those people makes no sense at all. You have a business relationship with them, that's it.

replies(1): >>41878967 #

118. BiteCode_dev ◴[18 Oct 24 07:32 UTC] No.41877120{7}[source]▶

>>41869944 #

I already have microg installed.

119. acdha ◴[18 Oct 24 12:52 UTC] No.41878967{10}[source]▶

>>41877118 #

It’s interesting how each time you say something which isn’t accurate you try to distract by changing the topic.

120. Terr_ ◴[19 Oct 24 00:37 UTC] No.41884694[source]▶

>>41868030 #

The worst part is that a lot of it is mysteriously capricious with no recourse.

Like, you visit Site A too often while blocking some javascript, and now Site B doesn't work for no apparent reason, and there's no resolution path. Worse, the bad information may become permanent if an owner uses it to taint your account, again with no clear reason or appeal.

I suspect Reddit effectively killed my 10+ year account (appeal granted, but somehow still shadowbanned) because I once used the "wrong" public wifi to access it.

121. amatecha ◴[31 Oct 24 01:22 UTC] No.42002463[source]▶

>>41867018 (TP) #

Nice, today I found I'm blocked from subway.com, that's cool. Good bot detection, my brand new Debian Linux install with Firefox must be really suspicious.
