394 points pyman | 3 comments
ramon156 ◴[] No.44488798[source]
Pirating and paying the fine is probably a hell of a lot cheaper than individually buying all these books. I'm not saying this is justified, but what would you have done in their situation?

Saying "they have the money" is not an argument. It's about the amount of effort that is needed to individually buy, scan, and process millions of pages. If that's done for you, why re-do it all?

replies(11): >>44488878 #>>44488900 #>>44488933 #>>44489076 #>>44489255 #>>44489312 #>>44489833 #>>44490433 #>>44491603 #>>44491921 #>>44493173 #
suyjuris ◴[] No.44489312[source]
Just downloading them is of course cheaper, but it is worth pointing out that, as the article states, they did also buy legitimate copies of millions of books. (This includes all the books involved in the lawsuit.) Based on the judgement itself, Anthropic appears to train only on the books legitimately acquired. Used books are quite cheap, after all, and can be bought in bulk.
replies(1): >>44491385 #
1. asadotzler ◴[] No.44491385[source]
Buying a book is not a license to re-sell that content for your own profit. I can't buy a copy of your book, make a million Xeroxes of it and sell those. The license you get when you buy a book is for a single use, not a license to do whatever you want with the contents of that book.
replies(2): >>44492012 #>>44492144 #
2. thedevilslawyer ◴[] No.44492012[source]
What are you on about? The judge has literally said this was not reselling, and that it is transformative and fair use.
3. suyjuris ◴[] No.44492144[source]
Yes, of course! In this case, the judge identified three separate instances of copying: (1) downloading books without authorisation to add to their internal library, (2) scanning legitimately purchased books to add to their internal library, and (3) taking data from their internal library for the purposes of training LLMs. The purchasing part is only relevant for (2) — there the judge ruled that this is fair use. This makes a lot of sense to me, since no additional copies were created (they destroyed the physical books after scanning), so this is just a single use, as you say. The judge also ruled that (3) is fair use, but for a different reason. (They declined to decide whether (1) is fair use at this point, deferring to a later trial.)