
454 points positiveblue | 2 comments
TIPSIO ◴[] No.45066555[source]
Everyone loves the dream of a free for all and open web.

But in reality, how can someone small protect their blog or content from AI training bots? Are they supposed to blindly trust that a crawler honestly declares itself an Agent rather than a Training bot, and that it is super duper respecting robots.txt? Get real...

Or, fine, what if they do respect robots.txt, but then buy data sold as "licensed data" that may or may not have been shielded through liability layers?

Unless you're Reddit, X, Google, or Meta, with scary, unlimited-budget legal teams, you have no power.

Great video: https://www.youtube.com/shorts/M0QyOp7zqcY

Gud ◴[] No.45066626[source]
By developing Free Software that combats this hostile software.

Corporations develop hostile AI agents,

Capable hackers develop anti-AI-agents.

This defeatist attitude: "we have no power".

victorbjorklund ◴[] No.45067374[source]
So basically Cloudflare, but self-hosted (with all the pain that comes from that)?
Gud ◴[] No.45067532[source]
What’s so painful about self-hosting? I’ve been self-hosting since before I hit puberty. If 12-year-old me could run an httpd, anyone can.

And if you don’t want to self-host, at least try to use services from organisations that aren’t hostile to the open web.

victorbjorklund ◴[] No.45067558[source]
I self-host lots of stuff. But yes, it is more painful to host a WAF that can handle billions of requests per minute. It is even harder to do it for free, like Cloudflare does. And in the end, the result for the user is exactly the same whether you use a self-hosted WAF or let someone else host it for you.
1. immibis ◴[] No.45069544{4}[source]
But you don't get billions of requests per minute. You get maybe five requests per second (300 per minute) on a bad day. The sites that seem to be getting badly attacked get 200 per second, which is still within reach of a self-hosted firewall. Think about how many CPU cycles per packet that allows for. Hardly a real DDoS.

The only reason you even want to firewall 200 requests per second is that the code downstream of the firewall takes more than 5ms to service a request, so you could also consider improving that. And if you're only getting fewer than 5 per second and your server isn't overloaded, then why block anything at all?
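
For anyone who wants to check the arithmetic above, here is a rough sketch of the per-request budget at the two rates mentioned. It assumes requests are handled one at a time on a single core, and the 3 GHz clock speed is an assumption made purely for illustration; the function names are hypothetical.

```python
def per_request_budget_ms(requests_per_second: float) -> float:
    """Average time available per request, in milliseconds, before a
    backlog starts to build (assumes strictly serial handling)."""
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    return 1000.0 / requests_per_second


def cycles_per_request(requests_per_second: float, clock_hz: float = 3e9) -> float:
    """CPU cycles available per request on one core.

    The default 3 GHz clock is an assumed figure, not one from the thread.
    """
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    return clock_hz / requests_per_second


if __name__ == "__main__":
    # 5 req/s is the "bad day" rate, 200 req/s the "badly attacked" rate.
    for rate in (5, 200):
        print(
            f"{rate} req/s: {rate * 60} per minute, "
            f"{per_request_budget_ms(rate):.0f} ms and "
            f"{cycles_per_request(rate):,.0f} cycles per request"
        )
```

With more than one worker or core the budget scales up accordingly, so the serial case is the pessimistic one.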

2. Symbiote ◴[] No.45070071[source]
Such entitlement.

How much additional tax money should I spend at work so the AI scum can make 200 searches per second?

Humans and 'nice' bots make about 5 per second.