439 points diggan | 2 comments
TheRoque ◴[] No.45065446[source]
To be honest, these companies already stole terabytes of data and don't even disclose their datasets, so you have to assume they'll steal and train on anything you throw at them.
replies(4): >>45066376 #>>45066970 #>>45068970 #>>45077378 #
marssaxman ◴[] No.45066376[source]
"Reading stuff freely posted on the internet" constitutes stealing now?

Seems like an excessively draconian interpretation of property rights.

replies(10): >>45066424 #>>45066467 #>>45066537 #>>45068095 #>>45068974 #>>45069163 #>>45069363 #>>45069550 #>>45074841 #>>45076689 #
michaelmior ◴[] No.45066424[source]
"Reading stuff freely posted on the internet" is also very different from a business having machines consume large volumes of data posted on the Internet for the purpose of generating value for them without compensating the creators. I'm not making a value judgement one way or the other, but "reading stuff freely posted on the Internet" is an oversimplification.
replies(5): >>45066511 #>>45066562 #>>45068503 #>>45070930 #>>45071058 #
marssaxman ◴[] No.45066511[source]
Okay, but "stealing" is also an oversimplification, to the point of absurdity.

It makes no sense to put stuff up on the internet where it can freely be downloaded by anyone at any time, by people who are then free to do whatever they like with it on their own hardware, then complain that people have downloaded that stuff and done what they liked with it on their own hardware.

"Having machines consume large volumes of data posted on the Internet for the purpose of generating value for them without compensating the creators" is equally a description of Google.

replies(9): >>45066575 #>>45067827 #>>45068034 #>>45068085 #>>45068365 #>>45069767 #>>45070721 #>>45072004 #>>45073608 #
1. pigeons ◴[] No.45067827[source]
But they didn't only train on information the creators made freely available. They trained on copyrighted materials obtained illicitly.
replies(1): >>45071073 #
2. pigeons ◴[] No.45071073[source]
I know we're not supposed to comment about downvotes, but the original comment was talking about "these companies", and none of the information indicating that they, or at the very least Meta, trained on terabytes of books downloaded from zlib, libgen, and other torrent sites is in dispute. So even if you believe that copyright should not exist, I don't see why this is not a valid rebuttal of the parent's argument that they only trained on information creators made freely available.