454 points positiveblue | 1 comments
TIPSIO ◴[] No.45066555[source]
Everyone loves the dream of a free for all and open web.

But the reality is how can someone small protect their blog or content from AI training bots? E.g., are they supposed to just blindly trust that someone is honestly labeling agent vs. training bots and super duper respecting robots.txt? Get real...

Or, fine, what if they do respect robots.txt but buy the data, which may or may not have been shielded through liability layers, as "licensed data"?

Unless you're Reddit, X, Google, or Meta with scary unlimited-budget legal teams, you have no power.

Great video: https://www.youtube.com/shorts/M0QyOp7zqcY

wvenable ◴[] No.45067955[source]
> Everyone loves the dream of a free for all and open web... But the reality is how can someone small protect their blog or content from AI training bots?

Aren't these statements entirely in conflict? You either have a free for all open web or you don't. Blocking AI training bots is not free and open for all.

BrenBarn ◴[] No.45068929[source]
No, that is not true. It is only true if you just equate "AI training bots" with "people" on some kind of nominal basis without considering how they operate in practice.

It is like saying "If your grocery store is open to the public, why is it not open to this herd of rhinoceroses?" Well, the reason is that rhinoceroses are simply not going to stroll up and down the aisles and head to the checkout line quietly with a box of cereal and a few bananas. They're going to knock over displays and maybe even shelves and they're going to damage goods and generally make the grocery store unusable for everyone else. You can say "Well, then your problem isn't rhinoceroses, it's entities that damage the store and impede others from using it" and I will say "Yes, and rhinoceroses are in that group, so they are banned".

It's certainly possible to imagine a world where AI bots use websites in more acceptable ways --- in fact, it's more or less the world we had prior to about 2022, where scrapers did exist but were generally manageable with widely available techniques. But that isn't the world that we live in today. It's also certainly true that many humans are using websites in evil ways (notably including the humans who are controlling many of these bots), and it's also very true that those humans should be held accountable for their actions. But that doesn't mean that blocking bots makes the internet somehow unfree.

This type of thinking that freedom means no restrictions makes sense only in a sort of logical dreamworld disconnected from practical reality. It's similar to the idea that "freedom" in the socioeconomic sphere means the unrestricted right to do whatever you please with resources you control. Well, no, that is just your freedom. But freedom globally construed requires everyone to have autonomy and be able to do things, not just those people with lots of resources.

akoboldfrying ◴[] No.45073489[source]
> "If your grocery store is open to the public, why is it not open to this herd of rhinoceroses?"

What this scenario actually reveals is that the words "open to the public" are not intended to mean "access is completely unrestricted".

It's fine to not want to give completely unrestricted access to something. What's not fine, or at least what complicates things unnecessarily, is using words like "open and free" to describe this desired actually-we-do-want-to-impose-certain-unstated-restrictions contract.

I think people use words like "open and free" to describe the actually-restricted contracts they want to have because they're often among like-minded people for whom these unstated additional restrictions are tacitly understood -- or, simply because it sounds good. But for precise communication with a diverse audience, using this kind of language is at best confusing, at worst disingenuous.

BrenBarn ◴[] No.45077082[source]
Using "open and free" to mean "I actually want no restrictions at all" is also confusing and disingenuous, because, as you yourself point out, a lot of people don't mean that by those words.

The other thing, though, is that there's a difference between "I personally want to release my personal work under open, free, and unrestricted terms" and "I want to release my work into a system that allows people to access information in general under open, free, and unrestricted terms". You can't just look at the individual and say "Oh, well, the conditions you want to put on your content mean it's not open and free so you must not actually want openness and freedom". You have to look at the reality of the entire system. When bots are overloading sites, when information is gated behind paywalls, when junk is firehosed out to everyone on behalf of paid advertisers while actual websites are down on page 20 of the search results, the overall situation is not one of open and free information exchange, and it's naive to think that individuals simply dumping their content "openly and freely" into this environment is going to result in an open and free situation.

Asking people to just unilaterally disarm by imposing no restrictions, while other less noble actors continue to impose all sorts of restrictions, will not produce a result that is free of restrictions. In fact quite the opposite. In order to actually get a free and open world in the large, it's not sufficient for good actors to behave in a free and open manner. Bad actors also must be actively prevented from behaving in an unfree and closed manner. Until they are, one-sided "gifts" of free and open content by the good actors will just feed the misdeeds of the bad actors.

akoboldfrying ◴[] No.45079043[source]
> Asking people to just unilaterally disarm by imposing no restrictions

I'm not asking for this. I'm asking for people who want such restrictions (most of which I consider entirely reasonable) to say so explicitly. It would be enough to replace words like "free" or "open" with "fair use", which immediately signals that some restrictions are intended, without getting bogged down in details.

BrenBarn ◴[] No.45079196[source]
Why? It seems you already know what people mean by "open and free", and it does have a connection to the ideals of openness and freedom, namely in the systemic context that I described above. So why bother about the terminology?
akoboldfrying ◴[] No.45080835[source]
What people mean by words like "open" and "free" varies. It varies a lot, and a lot turns on what they actually mean.

The only sensible way forward is to be explicit.

Why fight this obvious truth? Why does it hurt so much to say what you mean?