←back to thread

454 points positiveblue | 1 comments | | HN request time: 0s | source
Show context
impure ◴[] No.45066528[source]
Well, if you have a better way to solve this that’s open I’m all ears. But what Cloudflare is doing is solving the real problem of AI bots. We’ve tried to solve this problem with IP blocking and user agents, but they do not work. And this is actually how other similar problems have been solved. Certificate authorities aren’t open and yet they work just fine. Attestation providers are also not open and they work just fine.
replies(6): >>45066914 #>>45067091 #>>45067829 #>>45072492 #>>45072740 #>>45072778 #
viktorcode ◴[] No.45066914[source]
AI poisoning is a better protection. Cloudflare is capable of serving stashes of bad data to AI bots as protective barrier to their clients.
replies(2): >>45066992 #>>45067679 #
verdverm ◴[] No.45067679[source]
You don't think that the AI companies will take efforts to detect and filter bad data for training? Do you suppose they are already doing this, knowing that data quality has an impact on model capabilities?
replies(2): >>45067756 #>>45071793 #
culi ◴[] No.45071793[source]
The current state of the art in AI poisoning is Nightshade from the University of Chicago. It's meant to eventually be an addon to their WebGlaze[1] which is an invite-only tool meant for artists to protect their art from AI mimicry

If these companies are adding extra code to bypass artists trying to protect their intellectual property from mimicry then that is an obvious and egregious copyright violation

More likely it will push these companies to actually pay content creators for the content they work on to be included in their models.

[0] https://nightshade.cs.uchicago.edu/whatis.html

[1] https://glaze.cs.uchicago.edu/webglaze.html

replies(1): >>45079362 #
1. verdverm ◴[] No.45079362[source]
Seems like their poisoning is something that shouldn't be hard to detect and filter on. There is enough perturbation to create visual artifacts people can see. Steganography research is much further along in being undetectable. I would imaging in order to disrupt training sufficiently, you would not be able to have so few perturbations that it would go undetected