
556 points campuscodi | 18 comments
amatecha ◴[] No.41867018[source]
I get blocked from websites with some regularity, running Firefox with strict privacy settings, "resist fingerprinting" etc. on OpenBSD. They just give a 403 Forbidden with no explanation, but it's only ever on sites fronted by CloudFlare. Good times. Seems legit.
replies(13): >>41867245 #>>41867420 #>>41867658 #>>41868030 #>>41868383 #>>41868594 #>>41869190 #>>41869439 #>>41869685 #>>41869823 #>>41871086 #>>41873407 #>>41873926 #
1. mzajc ◴[] No.41868594[source]
I randomize my User-Agent header and many websites outright block me, most often with no captcha, just a useless error message.

The most egregious is Microsoft (just about every Microsoft service/page, really), where all you get is "The request is blocked." and a few pointless identifiers listed at the bottom, purely because it thinks your browser is too old.

CF's captcha page isn't any better either, usually putting me in an endless loop if it doesn't like my User-Agent.

replies(3): >>41868763 #>>41868916 #>>41870975 #
2. charrondev ◴[] No.41868763[source]
Are you sending an actual random string as your UA or sending one of a set of actual user agents?

You’re best off just picking real ones. We got hit by a botnet sending 10k+ requests from 40 different ASNs with 1000s of different IPs. The only way we were able to identify/block the traffic was by excluding user agents matching some regex (for whatever reason they weren’t spoofing real user agents, but they weren’t sending actual ones either).
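For illustration, a minimal sketch (in Ruby, since the thread already quotes Rails code) of the kind of filter described above. The pattern and helper name are hypothetical, not the actual regex that was used; the idea is just that real browser UAs have "Product/Version" structure and the junk ones often don't:

```ruby
# Hypothetical heuristic: flag UAs that look like random blobs or lack
# any "Product/Version" token. Not the commenter's actual regex.
BOT_UA_PATTERN = %r{\A[A-Za-z0-9]{8,}\z} # bare random alphanumeric blob

def suspicious_user_agent?(ua)
  return true if ua.nil? || ua.empty?      # no UA header at all
  return true if ua.match?(BOT_UA_PATTERN) # random-looking blob
  !ua.include?("/")                        # no "Product/Version" token anywhere
end

suspicious_user_agent?("Xk3f9QzLw2")
# real browser UAs like "Mozilla/5.0 (...) Firefox/130.0" pass the check
```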

replies(2): >>41868802 #>>41869639 #
3. RALaBarge ◴[] No.41868802[source]
I worked at an anti-spam email security company in the aughts, and we had a perl engine that would rip apart the MIME boundaries and measure everything - UA, SMTP client fingerprint headers, even the number of anchor or paragraph tags. A large combination of IF/OR evaluations with a regex engine did a pretty good job since the botnets usually don't bother to fully randomize or really opsec the payloads they are sending since it is a cannon instead of a flyswatter.
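A toy sketch of that scoring approach, with made-up features, weights, and threshold standing in for the real engine's rules (the original was Perl; Ruby is used here to match the rest of the thread):

```ruby
# Illustrative only: combine a few cheap structural measurements and
# regex checks into a score, as the comment above describes.
def spam_score(headers, body)
  score = 0
  score += 2 if headers["User-Agent"].to_s.empty?    # missing client header
  score += 1 if body.scan(/<a\b/i).count > 10        # unusually many anchor tags
  score += 1 if body.scan(/<p\b/i).count.zero?       # no paragraph structure
  score += 2 if body.match?(/\b(viagra|casino)\b/i)  # crude keyword regex
  score
end

def spam?(headers, body, threshold: 3)
  spam_score(headers, body) >= threshold
end
```

Botnets rarely randomize all of these dimensions at once, which is why a pile of IF/OR rules like this catches so much of the "cannon" traffic.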
replies(1): >>41869593 #
4. pushcx ◴[] No.41868916[source]
Rails is going to make this much worse for you. All new apps include naive agent sniffing and block anything “old” https://github.com/rails/rails/pull/50505
replies(2): >>41869122 #>>41869662 #
5. GoblinSlayer ◴[] No.41869122[source]

  def blocked?
    user_agent_version_reported? && unsupported_browser?
  end
well, you know what to do here :)
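The quoted check is an AND of two predicates, so (as the smiley implies) a User-Agent that reports no version at all can never be blocked. A standalone sketch, with illustrative stand-ins for the two Rails predicates:

```ruby
# Stand-in implementations for illustration; the real predicates live in
# Rails' allow_browser machinery.
def blocked?(ua)
  version_reported = ua.match?(%r{/\d})  # crude stand-in for user_agent_version_reported?
  too_old = ua.match?(/Firefox\/5\d\./)  # illustrative "unsupported browser" rule
  version_reported && too_old
end

blocked?("Mozilla/5.0 Firefox/52.0") # old version reported -> blocked
blocked?("MyBrowser")                # no version reported -> never blocked
```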
6. kccqzy ◴[] No.41869593{3}[source]
Similar techniques are known in the HTTP world too. There were things like detecting the order of HTTP request headers and matching them to known software, or even just comparing the actual content of the Accept header.
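A sketch of the header-order idea: compare the sequence of header names a client sends against known profiles. The profiles below are illustrative, not real browser signatures:

```ruby
# Hypothetical profiles; real fingerprinting databases are far larger and
# also account for header casing and Accept values.
KNOWN_ORDERS = {
  "firefox-like" => %w[Host User-Agent Accept Accept-Language Accept-Encoding Connection],
  "curl-like"    => %w[Host User-Agent Accept],
}

def match_client(header_names)
  KNOWN_ORDERS.find { |_, order| order == header_names }&.first || "unknown"
end

match_client(%w[Host User-Agent Accept]) # matches the "curl-like" profile
```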
replies(1): >>41872048 #
7. mzajc ◴[] No.41869639[source]
I use the Random User-Agent Switcher[1] extension on Firefox. It does pick real agents, but some of them might report a really outdated browser (e.g. Firefox 5X), which I assume is the reason I'm getting blocked.

[1]: https://addons.mozilla.org/en-US/firefox/addon/random_user_a...

8. mzajc ◴[] No.41869662[source]
This is horrifying. What happened to simply displaying a "Your browser is outdated, consider upgrading" banner on the website?
replies(4): >>41870155 #>>41870213 #>>41870800 #>>41875058 #
9. shbooms ◴[] No.41870155{3}[source]
idk, even that seems too much to me, but maybe I'm just being too sensitive.

but like, why is it a website's job to tell me what browser version to use? unless my outdated browser is lacking legitimate functionality which is required by your website, just serve the page and be done with it.

replies(1): >>41871281 #
10. freedomben ◴[] No.41870213{3}[source]
Wow. And this is now happening right as I've blacklisted google-chrome due to manifest v3 removal :facepalm:
11. whoopdedo ◴[] No.41870800{3}[source]
The irony being you can get around the block by pretending to be a bot.

https://github.com/rails/rails/pull/52531

12. lovethevoid ◴[] No.41870975[source]
Not sure a random UA extension is giving you much privacy. Try your results on EFF's Cover Your Tracks and see. A random UA can provide a lot of identifying information despite being randomized.

From experience, a lot of the things people do in hopes of protecting their privacy only makes them far easier to profile.

replies(1): >>41871173 #
13. mzajc ◴[] No.41871173[source]
coveryourtracks.eff.org is a great service, but it has a few limitations that apply here:

- The website judges your fingerprint based on how unique it is, but assumes that it's otherwise persistent. Randomizing my User-Agent serves the exact opposite - a given User-Agent might be more unique than using the default, but I randomize it to throw trackers off.

- To my knowledge, its "One in x browsers" metric (and by extension the "Bits of identifying information" and the final result) are based off of visitor statistics, which would likely be skewed as most of its visitors are privacy-conscious. They only say they have a "database of many other Internet users' configurations," so I can't verify this.

- Most of the measurements it makes rely on javascript support. For what it's worth, it claims my fingerprint is not unique when javascript is disabled, which is how I browse the web by default.

The other extreme would be fixing my User-Agent to the most common value, but I don't think that'd offer me much privacy unless I also used a proxy/NAT shared by many users.

replies(2): >>41873984 #>>41874723 #
14. michaelt ◴[] No.41871281{4}[source]
Back when the sun was setting on IE6, sites deployed banners that basically meant "We don't test on this, there's a good chance it's broken, but we don't know the specifics because we don't test with it"
15. miki123211 ◴[] No.41872048{4}[source]
And then there's also TLS fingerprinting.

Different browsers use TLS in slightly different ways, send data in a slightly different order, have a different set of supported extensions / algorithms etc.

If your user agent says Safari 18, but your TLS fingerprint looks like Curl and not Safari, sophisticated services will immediately detect that something isn't right.
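The common scheme here is JA3: the ClientHello's TLS version, cipher list, extension list, elliptic curves, and point formats are dash-joined per field, comma-separated, and MD5-hashed, so two TLS stacks with even slightly different cipher lists produce different fingerprints no matter what the User-Agent header claims. A sketch with made-up field values:

```ruby
require "digest"

# JA3-style digest over ClientHello fields (values below are illustrative,
# not taken from any real browser).
def ja3_fingerprint(version:, ciphers:, extensions:, curves:, point_formats:)
  raw = [version, ciphers.join("-"), extensions.join("-"),
         curves.join("-"), point_formats.join("-")].join(",")
  Digest::MD5.hexdigest(raw)
end

# Two clients both claiming "Safari" but offering different cipher suites:
a = ja3_fingerprint(version: 771, ciphers: [4865, 4866], extensions: [0, 10],
                    curves: [29], point_formats: [0])
b = ja3_fingerprint(version: 771, ciphers: [4865], extensions: [0, 10],
                    curves: [29], point_formats: [0])
a == b # different fingerprints, mismatch detected
```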

16. HappMacDonald ◴[] No.41873984{3}[source]
I would just fingerprint you as "the only person on the internet who is scrambling their UA string" :)
17. lovethevoid ◴[] No.41874723{3}[source]
Randomizing to throw trackers off only works if you only ever visit sites once.

But yes, without javascript a lot of tracking functions fail to operate. That is good for privacy, and EFF notes that on the site.

You can fix your UA to a common value; the goal is to provide the fewest identifying bits, and randomizing it just adds another bit to identify you by. Always remember: an absence of information is also valuable information!
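The "bits of identifying information" metric mentioned upthread is just surprisal: a trait value shared by one in N visitors carries log2(N) bits, and a handful of traits multiplied together is enough to single someone out. A quick sketch:

```ruby
# Cover Your Tracks-style arithmetic: rarity of a trait -> bits of entropy.
def identifying_bits(one_in_n)
  Math.log2(one_in_n)
end

identifying_bits(2)    # a 50/50 trait carries exactly 1 bit
identifying_bits(1500) # ~10.55 bits -- rare traits identify you fast
```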

18. hombre_fatal ◴[] No.41875058{3}[source]
It does do that, though.

https://github.com/rails/rails/pull/50505/files#diff-dce8d06...