
454 points positiveblue | 1 comment
TIPSIO No.45066555
Everyone loves the dream of a free-for-all, open web.

But the reality is: how can someone small protect their blog or content from AI training bots? E.g., do they just blindly trust that someone is sending agent bots rather than training bots, and super duper respecting robots.txt? Get real...

Or, fine, what if they do respect robots.txt, but they buy data that may or may not have been shielded through liability layers as "licensed data"?

Unless you're Reddit, X, Google, or Meta with scary, unlimited-budget legal teams, you have no power.

Great video: https://www.youtube.com/shorts/M0QyOp7zqcY

1. raxxorraxor No.45101938
This isn't even an issue; you have made that problem up. I host a blog and some AI bots come around. Big deal. Most of them do respect robots.txt. Some don't. Not a big deal either.

In contrast, trying to change the infrastructure of the net, which previously was quite resistant to censorship, is quite a big deal.

This sounds exactly like a crazy preacher warning about the dangers of rock music. A completely made-up threat, and we supposedly need the protection of god against these evil AI bots.

Wow, a bot that ignored a robots.txt. How can the internet survive...

Also, OpenAI already has the data. Putting up barriers now just ensures they will never get competitors. It makes no sense...