To be honest, these companies have already stolen terabytes of data and don't even disclose their datasets, so you have to assume they'll steal and train on anything you throw at them.
"Reading stuff freely posted on the internet" constitutes stealing now?
Seems like an excessively draconian interpretation of property rights.
This is a quintessential bad faith comment.
The "terabytes of stolen data" refers to copyrighted material. I think you know this but chose to frame it as "stuff freely posted on the internet" in order to mislead and strawman the other comment.
I meant it exactly as I said it. I do not agree that any theft occurred, either in law or in spirit, and I believe that reinterpretation of intellectual-property law in order to make it a crime would cause significant harm, greatly outweighing the benefits, as has been the case with every other expansion of intellectual property law I have seen.
Anthropic downloaded books from Library Genesis and Pirate Library Mirror. This is factual and has been reported from court documents [0].
What’s the angle that describes this as fair use?
[0] https://www.businessinsider.com/anthropic-cut-pirated-millio...
The simple fact that they are not republishing any of that data. Fair use does not apply, because copyright does not apply, because nothing is being copied.
So you don't think downloading something from The Pirate Bay constitutes copyright infringement provided you don't republish it?
Precisely. The person sharing is the one breaking the law.