
454 points positiveblue | 1 comments
TIPSIO ◴[] No.45066555[source]
Everyone loves the dream of a free-for-all, open web.

But the reality is: how can someone small protect their blog or content from AI training bots? E.g., do they just blindly trust that someone is sending Agent vs. Training bots and super duper respecting robots.txt? Get real...

Or, fine, what if they do respect robots.txt, but they buy data that may or may not have been shielded through liability layers as "licensed data"?

Unless you're Reddit, X, Google, or Meta with scary, unlimited-budget legal teams, you have no power.

Great video: https://www.youtube.com/shorts/M0QyOp7zqcY

replies(37): >>45066600 #>>45066626 #>>45066827 #>>45066906 #>>45066945 #>>45066976 #>>45066979 #>>45067024 #>>45067058 #>>45067180 #>>45067399 #>>45067434 #>>45067570 #>>45067621 #>>45067750 #>>45067890 #>>45067955 #>>45068022 #>>45068044 #>>45068075 #>>45068077 #>>45068166 #>>45068329 #>>45068436 #>>45068551 #>>45068588 #>>45069623 #>>45070279 #>>45070690 #>>45071600 #>>45071816 #>>45075075 #>>45075398 #>>45077464 #>>45077583 #>>45080415 #>>45101938 #
Gud ◴[] No.45066626[source]
By developing Free Software that combats this hostile software.

Corporations develop hostile AI agents,

Capable hackers develop anti-AI-agents.

This defeatist attitude: "we have no power".

replies(7): >>45066667 #>>45066678 #>>45066770 #>>45066789 #>>45066830 #>>45067106 #>>45067374 #
victorbjorklund ◴[] No.45067374[source]
So basically cloudflare but self-hosted (with all the pain that comes from that)?
replies(1): >>45067532 #
Gud ◴[] No.45067532[source]
What’s so painful about self-hosting? I’ve been self-hosting since before I hit puberty. If 12-year-old me can run an httpd, anyone can.

And if you don’t want to self-host, at least try to use services from organisations that aren’t hostile to the open web.

replies(2): >>45067558 #>>45067559 #
victorbjorklund ◴[] No.45067558[source]
I self-host lots of stuff. But yes, it is more pain to host a WAF that can handle billions of requests per minute. Even harder to do it for free like Cloudflare. And in the end, the result for the user is exactly the same whether you use a self-hosted WAF or let someone else host it for you.
replies(2): >>45068536 #>>45069544 #
lucb1e ◴[] No.45068536[source]
If you're handling billions of requests per second, you're not a self hoster. That's a commercial service with a dedicated team to handle traffic around the clock. Most ISPs probably don't even operate lines that big

To put that in perspective, even if they're sending empty TCP packets, "several billion" pps is 200 to 1800 gigabits of traffic, depending on what you mean by that. Add a cookieless HTTP payload and you're at many terabits per second. The average self hoster is more likely to get struck by lightning than encounter and need protection from this (even without considering the, probably modest, consequences of being offline a few hours if it does happen)

Edit: off by a factor of 60, whoops. Thanks to u/Gud for pointing that out. I stand by the conclusion though: less likely to occur than getting struck by lightning (or maybe it's around equally likely now? But somewhere in that ballpark) and the consequences of being down for a few hours are generally not catastrophic anyway. You can always still put big brother in front if this event does happen to you and your ISP can't quickly drop the abusive traffic
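For anyone who wants to redo the packets-to-bandwidth arithmetic, here is a minimal sketch. The packet sizes are assumptions, not figures from the comment: 40 bytes for bare IPv4 + TCP headers with no options or payload, and 84 bytes on the wire for a minimum-size Ethernet frame including preamble and inter-frame gap. It uses exactly one billion packets, so the output will not reproduce the 200 to 1800 gigabit range above, which depends on what "several billion" is taken to mean. It does show the factor of 60 between "per second" and "per minute".

```python
# Assumed per-packet sizes (illustrative, not taken from the thread).
IP_TCP_HEADERS_BYTES = 40  # IPv4 header (20) + TCP header (20), no options, no payload
ETHERNET_WIRE_BYTES = 84   # 64-byte minimum frame + 8 preamble/SFD + 12 inter-frame gap


def gbit_per_second(packets: float, per_seconds: float, bytes_per_packet: int) -> float:
    """Bandwidth in Gbit/s when `packets` packets arrive every `per_seconds` seconds."""
    packets_per_second = packets / per_seconds
    return packets_per_second * bytes_per_packet * 8 / 1e9


if __name__ == "__main__":
    for label, window in (("per second", 1), ("per minute", 60)):
        for size in (IP_TCP_HEADERS_BYTES, ETHERNET_WIRE_BYTES):
            rate = gbit_per_second(1e9, window, size)
            print(f"1 billion packets {label} at {size} B each: {rate:.1f} Gbit/s")
```

Under these assumptions, a billion packets per minute is on the order of single-digit to low double-digit Gbit/s, while a billion per second is hundreds of Gbit/s.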

replies(2): >>45068644 #>>45069180 #
PaulHoule ◴[] No.45069180[source]
If somebody decides they hate you, your site that could handle, say, 100,000 legitimate requests per day could suddenly get billions of illegitimate requests.
replies(2): >>45069486 #>>45070817 #
Gud ◴[] No.45069486[source]
Not everybody wants to manage some commercial-grade packet filter that can handle some DDoSing script kiddie; that’s a strong argument.

But another argument against using the easiest choice, the near monopoly, is that we need a diverse, thriving ecosystem.

We don’t want to end up in a situation where suddenly Cloudflare gets to dictate what is allowed on the web.

We have already lost email to the tech giants; try running your own mail sometime. The technical aspect is easy; the problem is you will end up in so many spam folders it’s disgusting.

What we need are better decentralized protocols.

replies(1): >>45069554 #
immibis ◴[] No.45069554{3}[source]
Please do try running your own mail sometime. It's not nearly as hard as doomers would have you think. And if you only receive, you don't have any problems at all.

At first, you can use it for less serious stuff until you see how well it works.

replies(1): >>45070305 #
Gud ◴[] No.45070305{4}[source]
I do, I host my own mail server.

Technically it's not very challenging. The problem is the total dominance of a few actors and a lot of spammers.

replies(1): >>45070866 #
1. lucb1e ◴[] No.45070866{5}[source]
I haven't had spam issues since using a catch-all and giving everyone a unique address, blocking ones that receive spam

Won't work if you need a fixed address on a business card or something, but in case you don't...

Waiting for the day they catch on. Then it's time for a challenge-response protocol I guess
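
The catch-all scheme above (one unique address per correspondent, block any address that starts receiving spam) could be sketched roughly as follows. This is only the bookkeeping, not a mail server integration: the `AliasBook` class, its method names, the random suffix, and the `example.org` domain are all hypothetical, invented for illustration.

```python
import secrets
from typing import Optional


class AliasBook:
    """Tracks one unique address per correspondent on a catch-all domain (hypothetical helper)."""

    def __init__(self, domain: str) -> None:
        self.domain = domain.lower()
        self.owner_of = {}    # alias address -> correspondent it was handed to
        self.blocked = set()  # aliases that have received spam

    def issue(self, correspondent: str) -> str:
        """Create a fresh address to give to exactly one correspondent."""
        local = "".join(c for c in correspondent.lower() if c.isalnum())
        tag = secrets.token_hex(3)  # assumption: random suffix so aliases are hard to guess
        address = f"{local}-{tag}@{self.domain}"
        self.owner_of[address] = correspondent
        return address

    def report_spam(self, address: str) -> Optional[str]:
        """Block an alias that received spam; return who it was issued to, if known."""
        address = address.lower()
        self.blocked.add(address)
        return self.owner_of.get(address)

    def accepts(self, address: str) -> bool:
        """Catch-all policy: accept any address on the domain unless it is blocked."""
        address = address.lower()
        return address.endswith("@" + self.domain) and address not in self.blocked


if __name__ == "__main__":
    book = AliasBook("example.org")
    alias = book.issue("Some Shop")
    print(alias, book.accepts(alias))
    print("leaked by:", book.report_spam(alias))
    print(alias, book.accepts(alias))
```

A side effect of this design is that `report_spam` tells you which correspondent leaked or sold the address, since each alias was only ever given to one party.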