←back to thread

439 points diggan | 1 comments | | HN request time: 0.214s | source
Show context
TheRoque ◴[] No.45065446[source]
To be honest, these companies already stole terabytes of data and don't even disclose their dataset, so you have to assume they'll steal and train at anything you throw at them
replies(4): >>45066376 #>>45066970 #>>45068970 #>>45077378 #
marssaxman ◴[] No.45066376[source]
"Reading stuff freely posted on the internet" constitutes stealing now?

Seems like an excessively draconian interpretation of property rights.

replies(10): >>45066424 #>>45066467 #>>45066537 #>>45068095 #>>45068974 #>>45069163 #>>45069363 #>>45069550 #>>45074841 #>>45076689 #
michaelmior ◴[] No.45066424[source]
"Reading stuff freely posted on the internet" is also very different from a business having machines consume large volumes of data posted on the Internet for the purpose of generating value for them without compensating the creators. I'm not making a value judgement one way or the other, but "reading stuff freely posted on the Internet" is an oversimplification.
replies(5): >>45066511 #>>45066562 #>>45068503 #>>45070930 #>>45071058 #
marssaxman ◴[] No.45066511[source]
Okay, but "stealing" is also an oversimplification, to the point of absurdity.

It makes no sense to put stuff up on the internet where it can freely be downloaded by anyone at any time, by people who are then free to do whatever they like with it on their own hardware, then complain that people have downloaded that stuff and done what they liked with it on their own hardware.

"Having machines consume large volumes of data posted on the Internet for the purpose of generating value for them without compensating the creators" is equally a description of Google.

replies(9): >>45066575 #>>45067827 #>>45068034 #>>45068085 #>>45068365 #>>45069767 #>>45070721 #>>45072004 #>>45073608 #
thrwaway55 ◴[] No.45073608[source]
Ok so if I publish under a license saying I don't allow for it to be used for AI do you believe they respect it? What word would you use to describe this violation? Go ahead throw up a robots.txt, throw up a license. You will be able to coax the "fair use" stochastic parrots to render it verbatim.

Sam Altman and his ilk are exploiting the incredibly slow moving legal system to enrich themselves.

replies(1): >>45100640 #
1. heavyset_go ◴[] No.45100640[source]
> Ok so if I publish under a license saying I don't allow for it to be used for AI do you believe they respect it?

It's even worse than that, they don't even legally have to respect it if courts find it to be fair use, and so far they have. If it's fair use to train models on it, your license means nothing.

The only way to "win" is to not publish your code at all, anywhere.