Anthropic cut up millions of used books, and downloaded 7M pirated ones – judge

(www.businessinsider.com)


codedokode ◴[07 Jul 25 16:49 UTC] No.44492195[source]▶

If AI companies are allowed to use pirated material to create their products, does it mean that everyone can use pirated software to create products? Where is the line?

Also, please don't use the word "learning"; use "creating software using copyrighted materials".

Also, let's think together about how we can prevent AI companies from using our work through technical measures if the law doesn't work.

replies(5): >>44492257 #>>44492400 #>>44492975 #>>44493804 #>>44493829 #

redcobra762 ◴[07 Jul 25 17:06 UTC] No.44492400[source]▶

>>44492195 #

It's abusive and wrong to try and prevent AI companies from using your works at all.

The whole point of copyright is to ensure you're paid for your work. AI companies shouldn't pirate, but if they pay for your work, they should be able to use it however they please, including training an LLM on it.

If that LLM reproduces your work, then the AI company is violating copyright, but if the LLM doesn't reproduce your work, then you have not been harmed. Trying to claim harm when you haven't been due to some philosophical difference in opinion with the AI company is an abuse of the courts.

replies(4): >>44492530 #>>44492615 #>>44492908 #>>44492935 #

1. codedokode ◴[07 Jul 25 17:19 UTC] No.44492530[source]▶

>>44492400 #

It is not wrong at all. The author decides what to do with their work. AI companies are rich and can simply buy the rights or hire people to create works.

I could agree with exceptions for non-commercial activity like scientific research, but AI companies are made for extracting profits and not for doing research.

> AI companies shouldn't pirate, but if they pay for your work, they should be able to use it however they please, including training an LLM on it.

It doesn't work this way. If you buy a movie it doesn't mean you can sell goods with movie characters.

> then you have not been harmed.

I am harmed because fewer people will buy the book if they can simply get an answer from an LLM. Fewer people will hire me to write code if an LLM trained on my code can do it. Maybe instead of books we should start making applications that protect the content and do not allow copying text or making screenshots. And instead of open-source code we should provide binary WASM modules.

replies(2): >>44492572 #>>44493443 #

2. redcobra762 ◴[07 Jul 25 17:23 UTC] No.44492572[source]▶

>>44492530 (TP) #

If you reproduce the material from a work you've purchased then of course you're in violation of copyright, but that's not what an LLM does (and when it does I already conceded it's in violation and should be stopped). An LLM that doesn't "sell goods with movie characters" is not in violation.

And the harm you describe is not a recognized harm. You don't own information, you own creative works in their entirety. If your work is simply a reference, then the fact being referenced isn't something you own, thus you are not harmed if that fact is shared elsewhere.

It is an abuse of the courts to attempt to prevent people who have purchased your works from using those works to train an LLM. It's morally wrong.

replies(2): >>44492664 #>>44493431 #

3. codedokode ◴[07 Jul 25 17:29 UTC] No.44492664[source]▶

>>44492572 #

To load a printed book into a computer one has to reproduce it in digital form without authorization. That's making a copy.

replies(1): >>44492706 #

4. redcobra762 ◴[07 Jul 25 17:34 UTC] No.44492706{3}[source]▶

>>44492664 #

Making a digital copy of a physical book is fair use under every legal structure I am aware of.

When you do it for a transformative purpose (turning it into an LLM) it's certainly fair use.

But more importantly, it's ethical to do so, as the agreement you've made with the person you've purchased the book from included permission to do exactly that.

replies(1): >>44493030 #

5. seadan83 ◴[07 Jul 25 18:01 UTC] No.44493030{4}[source]▶

>>44492706 #

Per the ruling, the problem is that the books were not purchased; they were downloaded from black market websites. It's akin to shoplifting; what you do later with the goods is a different matter.

Reasonable minds could debate the ethics of how the material was used, but this ruling judged the usage to be legal and fair use. The only problem is that the material was in effect stolen.

6. CaptainFever ◴[07 Jul 25 18:43 UTC] No.44493431[source]▶

>>44492572 #

> It is worse than ineffective; it is wrong too, because software developers should not exercise such power over what users do. Imagine selling pens with conditions about what you can write with them; that would be noisome, and we should not stand for it. Likewise for general software. If you make something that is generally useful, like a pen, people will use it to write all sorts of things, even horrible things such as orders to torture a dissident; but you must not have the power to control people's activities through their pens. It is the same for a text editor, compiler or kernel.

Sorry for the long quote, but basically this, yeah. A major point of free software is that creators should not have the power to impose arbitrary limits on the users of their works. It is unethical.

It's why the GPL allows the user to disregard any additional conditions, why it's viral, and why the FSF spends so much effort on fighting "open source but..." licenses.

7. CaptainFever ◴[07 Jul 25 18:44 UTC] No.44493443[source]▶

>>44492530 (TP) #

> Maybe instead of books we should start making applications that protect the content and do not allow copying text or making screenshots.

https://en.wikipedia.org/wiki/Analog_hole

replies(1): >>44496376 #

8. codedokode ◴[08 Jul 25 02:15 UTC] No.44496376[source]▶

>>44493443 #

That would be "circumvention of DRM".
