
454 points positiveblue | 3 comments
TIPSIO ◴[] No.45066555[source]
Everyone loves the dream of a free for all and open web.

But the reality is: how can someone small protect their blog or content from AI training bots? E.g., do they just blindly trust that someone is sending agent bots rather than training bots, and super duper respecting robots.txt? Get real...

Or, fine, what if they do respect robots.txt, but they buy data that may or may not have been shielded through liability layers as "licensed data"?

Unless you're Reddit, X, Google, or Meta with scary unlimited-budget legal teams, you have no power.

Great video: https://www.youtube.com/shorts/M0QyOp7zqcY

replies(37): >>45066600 #>>45066626 #>>45066827 #>>45066906 #>>45066945 #>>45066976 #>>45066979 #>>45067024 #>>45067058 #>>45067180 #>>45067399 #>>45067434 #>>45067570 #>>45067621 #>>45067750 #>>45067890 #>>45067955 #>>45068022 #>>45068044 #>>45068075 #>>45068077 #>>45068166 #>>45068329 #>>45068436 #>>45068551 #>>45068588 #>>45069623 #>>45070279 #>>45070690 #>>45071600 #>>45071816 #>>45075075 #>>45075398 #>>45077464 #>>45077583 #>>45080415 #>>45101938 #
Gud ◴[] No.45066626[source]
By developing Free Software that combats this hostile software.

Corporations develop hostile AI agents,

Capable hackers develop anti-AI-agents.

This defeatist attitude "we have no power".

replies(7): >>45066667 #>>45066678 #>>45066770 #>>45066789 #>>45066830 #>>45067106 #>>45067374 #
TIPSIO ◴[] No.45066667[source]
Yes, I obviously agree with you. I think you missed my comment's point a little: CF is making these tools and giving access to them to millions of people.
replies(1): >>45066773 #
supriyo-biswas ◴[] No.45066773[source]
Well there's open source stuff like https://github.com/TecharoHQ/anubis; one doesn't need a top-down mandated solution coming from a corporation.

In general, Cloudflare has been pushing DRMization of the web for quite some time, and while I understand why they want to do it, I wish they didn't always present themselves as taking the moral high ground.

replies(1): >>45067410 #
Klonoar ◴[] No.45067410[source]
Anubis doesn’t necessarily stop the most well funded actors.

If anything, we’ve seen a rise in complaints that it just annoys average users.

replies(1): >>45067521 #
1. supriyo-biswas ◴[] No.45067521[source]
What Anubis was actually created in response to is seemingly a strange kind of DDOS attack that has been misattributed to LLMs, but is some kind of attacker that makes partial GET requests that are aborted soon after sending the request headers, mostly coming from residential proxies. (Yes, it doesn’t help that the author of Anubis also isn’t fully aware of the mechanics of the attack. In fact, there is no proper write-up of the mechanism of the attack, which I hope to write about someday.)

Having said that, the solution is effective enough: a lightweight proxy component that issues proof-of-work tokens to such bogus requests works well enough, as various users on HN seem to point out.
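
For anyone unfamiliar with the proof-of-work idea, here is a rough sketch in Python. It is not Anubis's actual code or wire format; the function names, the `challenge:nonce` string layout, and the difficulty value are all made up for illustration. The point is only that the client has to burn CPU finding a nonce, while the proxy verifies it with a single hash.

```python
import hashlib
import secrets


def make_challenge() -> str:
    """Proxy side: issue a random challenge string (hypothetical format)."""
    return secrets.token_hex(16)


def is_valid(challenge: str, nonce: int, difficulty: int) -> bool:
    """Proxy side: one hash to check that the digest starts with
    `difficulty` zero hex digits."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce. On average this takes about
    16**difficulty hash attempts."""
    nonce = 0
    while not is_valid(challenge, nonce, difficulty):
        nonce += 1
    return nonce


# Assumed difficulty of 3 hex digits (~4096 attempts on average), kept
# small so the sketch runs instantly.
challenge = make_challenge()
nonce = solve(challenge, difficulty=3)
print(is_valid(challenge, nonce, difficulty=3))
```

The asymmetry is the design choice: a real browser pays the cost once and keeps its token, while a client that opens and abandons huge numbers of requests never gets past the challenge unless it pays the cost each time.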

replies(1): >>45068203 #
2. theamk ◴[] No.45068203[source]
> a strange kind of DDOS attack that has been misattributed to LLMs, but is some kind of attacker that makes partial GET requests that are aborted soon after sending the request headers, mostly coming from residential proxies.

Um, no? Where did you get this strange bit of info?

The original reports say nothing of the sort: https://news.ycombinator.com/item?id=42790252 ; and even the original motivation for Anubis was an Amazon AI crawler: https://news.ycombinator.com/item?id=42750420

(I've seen more posts with analysis, including one showing an AI crawler that would identify itself properly but, once it hit the rate limit, would switch to a fake user agent from proxies... but I cannot find it now.)

replies(1): >>45072360 #
3. ◴[] No.45072360[source]