To be honest, these companies already stole terabytes of data and don't even disclose their dataset, so you have to assume they'll steal and train at anything you throw at them
replies(4):
Seems like an excessively draconian interpretation of property rights.
It makes no sense to put stuff up on the internet where it can freely be downloaded by anyone at any time, by people who are then free to do whatever they like with it on their own hardware, then complain that people have downloaded that stuff and done what they liked with it on their own hardware.
"Having machines consume large volumes of data posted on the Internet for the purpose of generating value for them without compensating the creators" is equally a description of Google.