770 points ta988 | 55 comments
markerz ◴[] No.42551173[source]
One of my websites was absolutely destroyed by Meta's AI bot: Meta-ExternalAgent https://developers.facebook.com/docs/sharing/webmasters/web-...

It seems a bit naive for some reason and doesn't do performance back-off the way I would expect from Google Bot. It just kept repeatedly requesting more and more until my server crashed, then it would back off for a minute and then request more again.

My solution was to add a Cloudflare rule to block requests from their User-Agent. I also added more nofollow rules to links and a robots.txt but those are just suggestions and some bots seem to ignore them.

Cloudflare also has a feature to block known AI bots and even suspected AI bots: https://blog.cloudflare.com/declaring-your-aindependence-blo... As much as I dislike Cloudflare centralization, this was a super convenient feature.
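
A minimal sketch of the two measures described above, written in Python rather than as an actual Cloudflare rule. The robots.txt contents, the `BLOCKED_UA_TOKENS` list, and the `robots_allows` and `is_blocked` helpers are illustrative assumptions. Only the `Meta-ExternalAgent` name comes from the comment. The standard-library `urllib.robotparser` shows what a well-behaved crawler would conclude from the robots.txt, while the substring check stands in for a server-side rule that also stops crawlers which ignore it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask Meta's crawler to stay out, allow everyone else.
ROBOTS_TXT = """\
User-agent: meta-externalagent
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical blocklist of lowercase User-Agent substrings.
BLOCKED_UA_TOKENS = ("meta-externalagent",)


def robots_allows(user_agent: str, url: str) -> bool:
    """Return what a compliant crawler would decide from ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


def is_blocked(user_agent_header: str) -> bool:
    """Server-side check: reject the request no matter what the bot honors."""
    ua = user_agent_header.lower()
    return any(token in ua for token in BLOCKED_UA_TOKENS)
```

The robots.txt half only works if the crawler cooperates. The `is_blocked` half is the part that is actually enforced, which matches the comment's point that robots.txt rules are "just suggestions".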

replies(14): >>42551260 #>>42551410 #>>42551412 #>>42551513 #>>42551649 #>>42551742 #>>42552017 #>>42552046 #>>42552437 #>>42552763 #>>42555123 #>>42562686 #>>42565119 #>>42572754 #
1. MetaWhirledPeas ◴[] No.42551742[source]
> Cloudflare also has a feature to block known AI bots and even suspected AI bots

In addition to other crushing internet risks, add "wrongly blacklisted as a bot" to the list.

replies(4): >>42551773 #>>42552921 #>>42562510 #>>42564887 #
2. throwaway290 ◴[] No.42551773[source]
What do you mean crushing risk? Just solve these 12 puzzles by moving tiny icons on tiny canvas while on the phone and you are in the clear for a couple more hours!
replies(3): >>42552006 #>>42552586 #>>42552825 #
3. gs17 ◴[] No.42552006[source]
If it clears you at all. I accidentally set a user agent switcher on for every site instead of the one I needed it for, and Cloudflare would give me an infinite loop of challenges. At least turning it off let me use the Internet again.
4. homebrewer ◴[] No.42552586[source]
If you live in a region which it is economically acceptable to ignore the existence of (I do), you sometimes get blocked by website r̶a̶c̶k̶e̶t̶ protection for no reason at all, simply because some "AI" model saw a request coming from an unusual place.
5. benhurmarcel ◴[] No.42552825[source]
Sometimes it doesn’t even give you a Captcha.

I have come across some websites that block me using Cloudflare with no way of solving it. I’m not sure why, I’m in a large first-world country, I tried a stock iPhone and a stock Windows PC, no VPN or anything.

There’s just no way to know.

replies(2): >>42555004 #>>42570541 #
6. JohnMakin ◴[] No.42552921[source]
These features are opt-in and often paid features. I struggle to see how this is a "crushing risk," although I don't doubt that sufficiently unskilled shops would be completely crushed by an IP/userAgent block. Since Cloudflare has a much more informed and broader view of internet traffic than maybe any other company in the world, I'll probably use that feature without any qualms at some point in the future. Right now their normal WAF rules do a pretty good job of not blocking legitimate traffic, at least on enterprise.
replies(1): >>42553817 #
7. MetaWhirledPeas ◴[] No.42553817[source]
The risk is not to the company using Cloudflare; the risk is to any legitimate individual who Cloudflare decides is a bot. Hopefully their detection is accurate because a false positive would cause great difficulties for the individual.
replies(1): >>42562004 #
8. dannyw ◴[] No.42555004{3}[source]
That’s probably a page/site rule set by the website owner. Some sites block EU IPs as the costs of complying with GDPR outweigh the gain.
replies(2): >>42556915 #>>42557953 #
9. throwaway290 ◴[] No.42556915{4}[source]
I saw GDPR related blockage like literally twice in a few years and I connect from EU IP almost all the time

Overload of captcha is not about GDPR...

But the issue is strange. @benhurmarcel, I would check whether somebody or some company nearby is abusing things and you got caught under the hammer. Maybe an unscrupulous VPN company. Using a good VPN can in fact make things better (but will cost money), or if you have a place to host your own, all the better. Otherwise check whether you can change your IP with your provider, change providers, or move, I guess...

not to excuse CF racket but as this thread shows the data hungry artificial stupidity leaves no choice to some sites

replies(2): >>42557971 #>>42565139 #
10. benhurmarcel ◴[] No.42557953{4}[source]
One of the affected websites is a local cafe in the EU. It doesn’t make any sense to block EU IPs.
11. benhurmarcel ◴[] No.42557971{5}[source]
Does it work only based on the IP?

I also tried from a mobile 4G connection, it’s the same.

replies(1): >>42564209 #
12. neilv ◴[] No.42562004{3}[source]
For months, my Firefox was locked out of gitlab.com and some other sites I wanted to use, because CloudFlare didn't like my browser.

Lesson learned: even when you contact the sales dept. of multiple companies, they just don't/can't care about random individuals.

Even if they did care, a company successfully doing an extended three-way back-and-forth troubleshooting with CloudFlare, over one random individual, seems unlikely.

13. kmeisthax ◴[] No.42562510[source]
This is already a thing for basically all of the second[0] and third worlds. A non-trivial amount of Cloudflare's security value is plausible algorithmic discrimination and collective punishment as a service.

[0] Previously Soviet-aligned countries; i.e. Russia and eastern Europe.

replies(5): >>42562599 #>>42563762 #>>42564357 #>>42566973 #>>42567500 #
14. ls612 ◴[] No.42562599[source]
People hate collective punishment because it works so well.
replies(5): >>42562792 #>>42563310 #>>42563642 #>>42563761 #>>42563805 #
15. eckesicle ◴[] No.42562792{3}[source]
Anecdatally, by default, we now block all Chinese and Russian IPs across our servers.

After doing so, all of our logs, like ssh auth etc, are almost completely free and empty of malicious traffic. It’s actually shocking how well a blanket ban worked for us.
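
A rough sketch of what such a blanket ban amounts to, assuming the block is expressed as a list of CIDR ranges checked per connection. The ranges below are placeholders taken from the reserved documentation blocks, not real country allocations, and `DENIED_NETWORKS` and `is_denied` are hypothetical names. A real deployment would load per-country ranges from a registry or GeoIP data and would more likely enforce them in the firewall than in application code.

```python
import ipaddress

# Placeholder ranges (reserved documentation blocks), NOT real country data.
DENIED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_denied(client_ip: str) -> bool:
    """Return True if client_ip falls inside any denied network."""
    addr = ipaddress.ip_address(client_ip)
    return any(
        addr.version == net.version and addr in net
        for net in DENIED_NETWORKS
    )
```

For example, `is_denied("203.0.113.7")` is true, while `is_denied("192.0.2.1")` is false because that address is outside every listed range.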

replies(5): >>42562837 #>>42563023 #>>42567554 #>>42569757 #>>42574189 #
16. macintux ◴[] No.42562837{4}[source]
~20 years ago I worked for a small IT/hosting firm, and the vast majority of our hostile traffic came from APNIC addresses. I seriously considered blocking all of it, but I don’t think I ever pulled the trigger.
17. panic ◴[] No.42563310{3}[source]
Works how? Are these blocks leading to progress toward solving any of the underlying issues?
replies(2): >>42563743 #>>42573501 #
18. ◴[] No.42563642{3}[source]
19. forgetfreeman ◴[] No.42563743{4}[source]
It's unclear that there are actors below the regional-conglomerate-of-nation-states level that could credibly resolve the underlying issues, and given legislation and enforcement regimes' sterling track record of resolving technological problems, it seems questionable that solutions could exist in practice. Anyway, this kind of stuff is well outside the bounds of what a single org hosting an online forum could credibly address. Pragmatism uber alles.
20. anonym29 ◴[] No.42563761{3}[source]
Innocent people hate being punished for the behavior of other people, whom the innocent people have no control over.*

FTFY.

replies(1): >>42563952 #
21. shark_laser ◴[] No.42563762[source]
Yep. Same for most of Asia too.

Cloudflare's filters are basically straight up racist.

I have stopped using so many sites due to their use of Cloudflare.

replies(2): >>42570553 #>>42571441 #
22. saagarjha ◴[] No.42563805{3}[source]
Putting everyone in jail also works well to prevent crime.
replies(1): >>42575772 #
23. zdragnar ◴[] No.42563952{4}[source]
The phrase "this is why we can't have nice things" springs to mind. Other people are the number one cause of most people's problems.
replies(1): >>42564188 #
24. thwarted ◴[] No.42564188{5}[source]
Tragedy of the Commons Ruins Everything Around Me.
25. throwaway290 ◴[] No.42564209{6}[source]
This may be too paranoid, but if your mobile IP is persistent and phone was compromised and is serving as a proxy for bots then it could explain why your IP fell out of favor
replies(1): >>42565170 #
26. grishka ◴[] No.42564357[source]
I have a growing Mastodon thread of this shit: https://mastodon.social/@grishka/111934602844613193

It's of course trivially bypassable with a VPN, but getting a 403 for an innocent get request of a public resource makes me angry every time nonetheless.

replies(1): >>42596582 #
27. CalRobert ◴[] No.42564887[source]
We’re rapidly approaching a login-only internet. If you’re not logged in with google on chrome then no website for you!

Attestation/WEI enable this.

replies(1): >>42596595 #
28. EVa5I7bHFq9mnYK ◴[] No.42565139{5}[source]
I found it's best to use VPSes from young and little known hosting companies, as their IP is not yet on the blacklists.
29. EVa5I7bHFq9mnYK ◴[] No.42565170{7}[source]
You don't get your own external IP with the phone, it's shared, like NAT.
replies(2): >>42565485 #>>42566337 #
30. throwaway290 ◴[] No.42565485{8}[source]
Depends on provider/plan
31. scarface_74 ◴[] No.42566337{8}[source]
I get a different IPv4 and IPv6 address every time I toggle airplane mode on and off.
replies(1): >>42571480 #
32. QuadmasterXLII ◴[] No.42566973[source]
The difference between politics and diplomacy is that you can survive in politics without resorting to collective punishment.
33. d0mine ◴[] No.42567500[source]
unrelated: USSR might have been 2nd world. Russia is 3rd world (since 1991) -- banana republic
replies(1): >>42571123 #
34. TacticalCoder ◴[] No.42567554{4}[source]
> Anecdatally, by default, we now block all Chinese and Russian IPs across our servers.

This. Just get several countries' entire IP address space and block these. I've posted I was doing just that only to be told that this wasn't in the "spirit" of the Internet or whatever similar nonsense.

In addition to that, only allow SSH in from the few countries / ISPs that legit traffic should be coming from. This quiets the logs, saves bandwidth, saves resources, saves the planet.

replies(1): >>42570683 #
35. citrin_ru ◴[] No.42569757{4}[source]
Being slightly annoyed by noise in SSH logs, I’ve blocked APNIC IPs and now see a comparable number of brute-force attempts from ARIN IPs (mostly US ones). Geo blocks are totally ineffective against threat actors that use a global network of proxies.
36. ◴[] No.42570541{3}[source]
37. brianwawok ◴[] No.42570553{3}[source]
If 90% of your problem users come from 1-2 countries, seems pretty sensible to block that country. I know I have 0 paying users in those countries, so why deal with it? Let them go fight it out doing bot wars in local sites
replies(1): >>42573043 #
38. brianwawok ◴[] No.42570562{5}[source]
That is not at all the reason for the great firewall.
39. xp84 ◴[] No.42570683{5}[source]
I agree with your approach. It’s easy to empathize with innocent people in, say, Russia, blocked from a site which has useful information for them. However, the thing these “spirit/openness” people miss is that many sites have a narrow purpose, which makes it pointless to open them up to people across the world. For instance, local government. Nobody in India or Russia needs to see the minutes from some US city council meeting, or get building permit information. Likewise with e-commerce. If I sell chocolate bars and ship to the US and Canada, why wouldn’t I turn off all access from overseas? You might say “oh, but what if some friend in $COUNTRY wants to order a treat for someone here?” And the response to that is always “the hypothetical loss from that is minuscule compared to the cost of serving tons of bot traffic, as well as possible exploits those bots might do.”

(Yes, yes, VPNs and proxies exist and can be used by both good and bad actors to evade this strategy, and those are another set of IPs widely banned for the same reason. It’s a cat and mouse game but you can’t argue with the results)

40. crote ◴[] No.42571123{3}[source]
No, Russia is by definition the 2nd world. It's about spheres of influence, not any kind of economic status. The First World is the Western Bloc centered around the US, the Second World is the Eastern Bloc centered around then-USSR and now-Russia (although these days more centered on China), the Third World is everyone else.
replies(2): >>42573124 #>>42577011 #
41. lazide ◴[] No.42571441{3}[source]
Well, not racist per-se - if you visit the countries (regardless of race) you’re screwed too.

Geo-location-ist?

42. lazide ◴[] No.42571480{9}[source]
Externally routable IPv4, or just a different behind-CGNAT address?
replies(1): >>42571679 #
43. scarface_74 ◴[] No.42571679{10}[source]
Externally routable IPv4 as seen by whatismyip.com.
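
A small sketch of the distinction being asked about, assuming the address in question is the one the phone itself reports for its mobile interface. An external service can only report the egress address, which under carrier-grade NAT may be shared with other subscribers. The `classify_ipv4` helper and its labels are hypothetical. The `100.64.0.0/10` range is the shared address space reserved for carrier-grade NAT in RFC 6598.

```python
import ipaddress

# Shared address space reserved for carrier-grade NAT (RFC 6598).
CGNAT_RANGE = ipaddress.ip_network("100.64.0.0/10")


def classify_ipv4(address: str) -> str:
    """Label an IPv4 address as 'cgnat', 'private', or 'other'."""
    addr = ipaddress.IPv4Address(address)
    if addr in CGNAT_RANGE:
        return "cgnat"
    if addr.is_private:
        return "private"
    # Not in a NAT-only range; it may be externally routable.
    return "other"
```

If the interface address comes back as `"cgnat"` or `"private"`, the device sits behind some form of NAT regardless of what an external lookup shows.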
44. lazide ◴[] No.42573043{4}[source]
Keep in mind, this is literally why stereotypes and racism exists. It’s the exact same process/reasoning.
replies(2): >>42575893 #>>42576995 #
45. d0mine ◴[] No.42573124{4}[source]
By which definition? Here’s the first result in google: “The term "second world" was initially used to refer to the Soviet Union and countries of the communist bloc. It has subsequently been revised to refer to nations that fall between first and third world countries in terms of their development status and economic indicators.” https://www.investopedia.com/terms/s/second-world.asp#:~:tex....

Notice the word economic in it.

46. victorbjorklund ◴[] No.42573501{4}[source]
The underlying issue is that countries like Russia support abuse like this. So by blocking them, perhaps the people there will demand that their govt stops supporting crimes and abuse so that they can be allowed back into the internet.

(In the case of Russians though, I guess they will never change)

replies(1): >>42574222 #
47. ◴[] No.42574189{4}[source]
48. petre ◴[] No.42574222{5}[source]
> people there will demand that their govt stops supporting crimes and absuse so that they can be allowed back into the internet

Sure. It doesn't work that way, not in Russia or China. First they have to revert back to 1999 when Putin took over. Then they have to extradite criminals and crack down on cybercrime. Then maybe they could be allowed back onto the open Internet.

In my country one would be extradited to the US in no time. In fact the USSS came over for a guy who had been laundering money through BTC from a nearby office. Not a month passed and he got extradited to the US, never to be heard from again.

49. singleshot_ ◴[] No.42575772{4}[source]
Having a door with a lock on it prevents other people from committing crime in my house. This metaphor has the added benefit of making some amount of sense in context.
50. qeternity ◴[] No.42575893{5}[source]
No, racism would be “I won’t deal with customers of Chinese ethnicity irrespective of their country of operation”.

Blocking Chinese (or whatever) IPs because they are responsible for a huge amount of malicious behavior is not racist.

Frankly I don’t care what the race of the Chinese IP threat actor is.

replies(1): >>42576507 #
51. lazide ◴[] No.42576507{6}[source]
You really might want to re-read my comment.
52. ◴[] No.42576995{5}[source]
53. ◴[] No.42577011{4}[source]
54. neop1x ◴[] No.42596582{3}[source]
Exactly. I have to use a VPN just for this kind of bu**it. :/
55. neop1x ◴[] No.42596595[source]
And not just a login but soon probably also the real verified identity tied to it. The internet is becoming a worse place than the real world.