Another problem is that "resist fingerprinting" prevents some canvas processing, and many websites like Bluesky, LinkedIn or Substack use canvas to handle image uploads, so your images end up looking like stripes of pixels.
Then you have mobile apps that just don't run if you don't have a google account, like chatgpt's native app.
I understand why people give up, trying to fight for your privacy is an uphill battle with no end in sight.
And yes, it's sad that "make the internet work again" is behind an expensive paywall.
Is that true? At least on iOS you can log into ChatGPT with the same email/password as on the website.
I never use Google login for stuff and ChatGPT works fine for me.
That's not true, I use ChatGPT's app on my phone without logging into a Google account.
You don't even need any kind of account at all to use it.
An Android phone asks you to link a Google account when you use it for the first time. It takes a very dedicated user to refuse that, and then to avoid logging into the Gmail, YouTube or app store apps, which will all also link your phone to your Google account when you sign in.
But I do actively avoid this, I use Aurora, F-droid, K9 and NewPipeX, so no link to google.
But then no ChatGPT app. When I start it, I get hit with a login page for the app store and it's game over.
I tried for a long time to get around it, but now when I hit a website like this I just close the tab and don't bother anymore.
The most egregious is Microsoft (just about every Microsoft service/page, really), where all you get is a "The request is blocked." and a few pointless identifiers listed at the bottom, purely because it thinks your browser is too old.
CF's captcha page isn't any better either, usually putting me in an endless loop if it doesn't like my User-Agent.
However, the undeniable reality is that accessing the website with a non-residential IP is a very, very strong indicator of sinister behaviour. Anyone that’s been in a position to operate one of these services will tell you that. For every…let’s call them ‘privacy-conscious’ user, there are 10 (or more) nefarious actors that present largely the same way. It’s easy to forget this as a user.
I’m all but certain that if Reddit or LinkedIn could differentiate, they would. But they can’t. That’s kinda the whole point.
Site owners probably don't even see these bounced visits, and it's such a tiny percentage of visitors who do this that it won't make a difference. Meh, it's just another annoyance to be able to use the web on our own terms.
You’re best off just picking real ones. We got hit by a botnet sending 10k+ requests from 40 different ASNs with 1000s of different IPs. The only way we were able to identify/block the traffic was by excluding user agents matching some regex (for whatever reason they weren’t spoofing real user agents but weren’t sending actual ones either).
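For anyone wondering what that kind of filter looks like, here is a minimal sketch. The pattern and the method name are my own assumptions for illustration (the comment above doesn't say which regex was used), and this version flags anything that doesn't look like a mainstream browser rather than matching one specific botnet's strings.

```ruby
# Hypothetical pattern: matches User-Agent strings shaped like mainstream browsers.
# A real deployment would tune this against its own logs.
PLAUSIBLE_UA = %r{\AMozilla/5\.0 \([^)]+\) .*(Firefox|Chrome|Safari)/\d+}

# Returns true when the User-Agent is missing, blank, or not browser-shaped.
def suspicious_user_agent?(user_agent)
  return true if user_agent.nil? || user_agent.strip.empty?

  !PLAUSIBLE_UA.match?(user_agent)
end
```

Note that this is exactly the kind of rule that also catches legitimate users with unusual or randomized User-Agent strings, which is the complaint elsewhere in this thread.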
That’s true in some cases, I’m sure, but also remember that most site owners deal with lots of tedious abuse. For example, some people get really annoyed about Tor being blocked, but for most sites Tor is a tiny fraction of total traffic and a fairly large percentage of the abuse: probing for vulnerabilities, guessing passwords, spamming contact forms, etc. So while I sympathize with the legitimate users, I also completely understand why a busy site operator is going to flip a switch that makes their log noise go down by a double-digit percentage.
> From a privacy POV, your VPN is doing nothing to them, because your IP address means very little to them from a tracking POV.
I disagree. (1) Since I have javascript disabled, my IP address is generally their next best thing to go on. (2) I don't want to give them my IP address to correlate with the other data they have on me, because if they sell that data, someone else who only has my IP address can suddenly get a bunch of other stuff with it too.
  def blocked?
    user_agent_version_reported? && unsupported_browser?
  end
well, you know what to do here :)
In an adversarial environment, especially with both AI scrapers and AI posters, websites have to be able to identify and ban persistent abusers. Which unfortunately implies having some kind of identification of everybody.
I suspect that people operating Web sites have no idea how many legitimate users are blocked by CloudFlare.
And, based on the responses I got when I contacted two of the companies whose sites were chronically blocked by CloudFlare for months, it seemed like it wasn't worth any employee's time to try to diagnose.
Also, I'm frequently blocked by CloudFlare when running Tor Browser. Blocking by Tor exit node IP address (if that's what's happening) is much more understandable than blocking Firefox from a residential IP address, but still makes CloudFlare not a friend of people who want or need to use Tor.
I discovered this when I set up IPv6 using hurricane electric as a tunnel broker for IPv6 connectivity.
Seemingly Google has all HEnet IPv6 tunnel subnets listed for such behaviour without it being documented anywhere. It was extremely annoying until I figured out what was going on.
In the end, the fact remains: no ChatGPT app without giving up your privacy, to Google no less.
Telegram channels have been a good alternative, but even that is going downhill thanks to French authorities.
Cloudflare and Google also often treat us like bots (endless captchas, etc) which makes it even more difficult.
[1]: https://addons.mozilla.org/en-US/firefox/addon/random_user_a...
Sure, tech-wise it might work great, but from your users' perspective: it's trash.
If only there was a law that allowed one to be excluded from automatic behavior profiling...
I haven't tried the ChatGPT app, but I know that, for example my bank and other financial services apps work with on-device fingerprint authentication and no Google account on /e/OS.
I sometimes wonder if all Cloudflare employees are on some kind of whitelist that makes them not realize the ridiculous false positive rate of their bot detection.
I've been creating accounts every time I need to visit Reddit now to read a thread about [insert subject]. They do not validate E-Mail, so I just use `example@example.com`, whatever random username it suggests, and `example` as a password. I've created at least a thousand accounts at this point.
Malicious Compliance, until they disable this last effort at accessing their content.
but like, why is it a website's job to tell me what browser version to use? unless my outdated browser is lacking legitimate functionality which is required by your website, just serve the page and be done with it.
I've contacted companies about this and they usually just tell me to use a different browser or computer, which is like "duh, really?", but also doesn't solve the problem for me or anyone else.
From experience, a lot of the things people do in hopes of protecting their privacy only make them far easier to profile.
Of course as Google doesn't claim they do this, many people would consider it unreasonably fearful/cynical.
It's the opposite stance that would be bonkers.
(I'm not saying I agree with it, just that it exists.)
B. Cloudflare has healthy competition with AWS, Akamai, Fastly, Bunny.net, Mux, Google Cloud, Azure, you name it, there's a competitor. This isn't even an Apple vs Google situation.
- The website judges your fingerprint based on how unique it is, but assumes that it's otherwise persistent. Randomizing my User-Agent serves the exact opposite - a given User-Agent might be more unique than using the default, but I randomize it to throw trackers off.
- To my knowledge, its "One in x browsers" metric (and by extension the "Bits of identifying information" and the final result) are based off of visitor statistics, which would likely be skewed as most of its visitors are privacy-conscious. They only say they have a "database of many other Internet users' configurations," so I can't verify this.
- Most of the measurements it makes rely on javascript support. For what it's worth, it claims my fingerprint is not unique when javascript is disabled, which is how I browse the web by default.
The other extreme would be fixing my User-Agent to the most common value, but I don't think that'd offer me much privacy unless I also used a proxy/NAT shared by many users.
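To make the "one in x browsers" and "bits of identifying information" numbers concrete, here is a small sketch. It assumes the bits figure is the usual surprisal, log2(x), and that traits are independent; both are my assumptions, and the method names are made up for illustration.

```ruby
# Surprisal, in bits, of a trait shared by one in `one_in_x` browsers.
def identifying_bits(one_in_x)
  Math.log2(one_in_x)
end

# Independent traits add up; correlated traits would contribute less than this sum.
def combined_bits(*one_in_values)
  one_in_values.sum { |x| identifying_bits(x) }
end
```

So a trait seen in one of 1024 browsers is worth 10 bits, and a trait everyone shares is worth 0. The catch raised above still applies: the "x" is only as good as the visitor sample it was measured on.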
But anyone making malicious POST requests, like spamming chatGPT comments, first makes GET requests to load the submission and find comments to reply to. If they think you're a low quality user, I don't see why they'd bother just locking down POSTs.
(The actual process at this restaurant is to sit down, fuss with your phone a bit, then get up like you're about to leave; someone will arrive promptly to take your order.)
Whenever I click a link to another site, I get a new tab in either a pre-assigned container or else in a “tmpNNNN” container, and I think it omits Referer headers on those new-tab navigations, either by default or because I configured it that way.
I couldn't disagree more. The way to protect privacy is to make privacy the standard at the implementation layer, and to make it costly and difficult to breach it.
Trying to rely on political institutions without the practical and technical incentives favoring privacy will inevitably result in the political institutions themselves becoming the main instrument that erodes privacy.
And each one of these could potentially create thousands of accounts, and do 100x as many requests as a normal user would.
Even if only 1% of the people using your service are fraudsters, a normal user has at most a few accounts, while fraudsters may try to create thousands per day. This means that e.g. 90% of your signups are fraudulent, despite the population of fraudsters being extremely small.
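The arithmetic behind that is easy to check. A quick sketch, where every number is an illustrative assumption rather than a measurement:

```ruby
# Share of signups that are fraudulent, given how many accounts each group creates.
# All inputs are illustrative assumptions, not data from any real service.
def fraudulent_signup_share(users:, fraud_rate:, accounts_per_user:, accounts_per_fraudster:)
  fraudsters = users * fraud_rate
  legit_users = users - fraudsters
  fraud_signups = fraudsters * accounts_per_fraudster
  legit_signups = legit_users * accounts_per_user
  fraud_signups / (fraud_signups + legit_signups).to_f
end

share = fraudulent_signup_share(
  users: 10_000, fraud_rate: 0.01,
  accounts_per_user: 1, accounts_per_fraudster: 1_000
)
puts format("%.2f", share)
```

With 1% fraudsters each creating 1,000 accounts against one account per normal user, roughly 91% of signups are fraudulent.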
Different browsers use TLS in slightly different ways, send data in a slightly different order, have a different set of supported extensions / algorithms etc.
If your user agent says Safari 18, but your TLS fingerprint looks like Curl and not Safari, sophisticated services will immediately detect that something isn't right.
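A rough sketch of that consistency check is below. Everything in it is hypothetical: the fingerprint keys are placeholders and not real hashes, and a real system would derive the key from the TLS ClientHello (cipher order, extensions and so on) rather than receive a label.

```ruby
# Hypothetical lookup table: fingerprint label -> client family.
# The keys are placeholders, not real TLS fingerprint hashes.
KNOWN_TLS_FINGERPRINTS = {
  "fp-safari-example"  => :safari,
  "fp-firefox-example" => :firefox,
  "fp-curl-example"    => :curl
}.freeze

# Very rough guess at which client a User-Agent string claims to be.
def claimed_family(user_agent)
  case user_agent
  when %r{Firefox/}                  then :firefox
  when %r{curl/}i                    then :curl
  when %r{Version/[\d.]+.*Safari/}   then :safari
  else :unknown
  end
end

# True when the claimed client and the client implied by the TLS fingerprint disagree.
def mismatch?(user_agent, tls_fingerprint)
  observed = KNOWN_TLS_FINGERPRINTS[tls_fingerprint]
  return false if observed.nil? # unknown fingerprint: no verdict in this sketch

  claimed = claimed_family(user_agent)
  claimed != :unknown && claimed != observed
end
```

This is also why changing only the User-Agent tends to make things worse: the string changes, but the TLS handshake still looks like whatever client actually sent it.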
I'm not sure either if RSS bots could be added to good bots, but if anyone has traffic from them, we can definitely try. (No high hopes though, given the responses I got from support so far)
Though what you mention does beg the question "is there really much privacy gain in that over using Referrer-Policy: same-origin and having referrer based pages work right?" I suppose so if you're randomizing your identity in an untrackable way for each connection it could be attractive... though I think that'd trigger being suspected as a bot far before the lack of proper same origin info :p.
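For reference, this is roughly what `same-origin` means in practice: the Referer is sent only when scheme, host and port all match, and dropped otherwise. The sketch below is a simplified model with a made-up method name; real browsers also strip credentials from the URL, which this skips.

```ruby
require "uri"

# An origin is the (scheme, host, port) triple.
def origin(uri)
  [uri.scheme, uri.host, uri.port]
end

# Returns the Referer value sent under `Referrer-Policy: same-origin`,
# or nil when the header would be omitted (any cross-origin navigation).
def referer_for(from_url, to_url)
  from = URI.parse(from_url)
  to = URI.parse(to_url)
  return nil unless origin(from) == origin(to)

  from.fragment = nil # fragments are never sent in Referer
  from.to_s
end
```

So in-site navigation keeps working for pages that depend on the referrer, while other sites see nothing.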
But not always. My most recent stumbling block is https://www.napaonline.com. Guess I'm buying oxygen sensors somewhere else.
It's like at my current and previous companies. They impose a lot of security restrictions. The problem is, if somebody wants to get data out (or in), they can do it anytime. The security department says the restrictions are there to prevent "accidental" leaks. I'm still waiting for a single instance where they caught an "accidental" leak rather than just introducing extra steps after which I achieve the exact same thing. Even when I caused a real potential leak, nobody stopped me from doing it. The only reason they have these security services/apps is to push responsibility onto other companies.
I would get a different captcha, a convoluted one that wouldn't even load the required images.
And I would get the oops sorry dog page for everything.
I finally contacted amazon, gave them my (static) ip address and it was good.
In other locations, I have to solve a 6-distorted-letter captcha to log in, but that's the extent of it.
If people who valued privacy really controlled the implementation layer we wouldn't have gotten to this point in the first place.
Yes? I mean, not "leaks" - it's designed to upload your private data to Google and others.
https://www.tcd.ie/news_events/articles/study-reveals-scale-...
> Even when minimally configured and the handset is idle, with the notable exception of e/OS, these vendor-customised Android variants transmit substantial amounts of information to the OS developer and to third parties such as Google, Microsoft, LinkedIn, and Facebook that have pre-installed system apps. There is no opt-out from this data collection.
And it is the DDoS prevention measures at issue here.
But yes, without javascript a lot of tracking functions fail to operate. That is good for privacy, and EFF notes that on the site.
You can fix your UA to a common value, it's about providing the least amount of identifying bits, and randomizing it just provides another bit to identify you by. Always remember: an absence of information is also valuable information!
I'm guessing if it's really Resist Fingerprinting on Firefox (something Mullvad also has on by default), then there are other settings that aren't being enabled causing the issue. Mullvad actually lists the settings related to resisting fingerprinting here - https://mullvad.net/en/browser/hard-facts
I am absolutely not a fan of all these "are you human?" checks at all, doubly so when ad-blockers trigger them. I think there are very legitimate reasons for wanting to access certain sites without being tracked - anything related to health is an example.
Maybe I should have made a more substantive comment, but I don't believe this is as simple a problem as reducing it to request types.
https://github.com/rails/rails/pull/50505/files#diff-dce8d06...
Nowadays, Cloudflare has image compression and CDN services, video storage and delivery services, serverless compute with Workers, domain registration, (soon) container support with optional GPUs, durable objects (basically serverless storage), serverless SQL databases (D1), even an AWS S3 competitor with R2. They even have bespoke services like CloudFlare Tunnels - what’s AWS got that’s anything like it?
Cloudflare is getting close to full-on AWS. At least, the parts most customers use. If they just added boring old VPSs, people would realize very quickly how full featured they are.
As for DDoS mitigation - you’ve still got AWS Shield, Akamai, Azure, Radware, F5, even Oracle (Dyn) competing in that market. Unless you could show Cloudflare did illegal tying as a monopolist specifically to sell DDoS prevention, there’s no case.
Google and Apple are both heavily invested in ads (Apple made 4.7 billion from ads in 2022), they have a track record of exfiltrating your data (remember contractors listening to your Siri recordings?) and of lying to their customers (remember the home button scandal on iPhone?), and they have control over a device that holds your whole life yet runs partially on code you can't evaluate.
Trusting those people makes no sense at all. You have a business relationship with them, that's it.
Like, you visit Site A too often while blocking some javascript, and now Site B doesn't work for no apparent reason, and there's no resolution path. Worse, the bad information may become permanent if an owner uses it to taint your account, again with no clear reason or appeal.
I suspect Reddit effectively killed my 10+ year account (appeal granted, but somehow still shadowbanned) because I once used the "wrong" public wifi to access it.