397 points pyman | 29 comments

dehrmann ◴[] No.44491718[source]
The important parts:

> Alsup ruled that Anthropic's use of copyrighted books to train its AI models was "exceedingly transformative" and qualified as fair use

> "All Anthropic did was replace the print copies it had purchased for its central library with more convenient space-saving and searchable digital copies for its central library — without adding new copies, creating new works, or redistributing existing copies"

It was always somewhat obvious that pirating a library would be copyright infringement. The interesting findings here are that scanning and digitizing a library for internal use is OK, and using it to train models is fair use.

replies(6): >>44491820 #>>44491944 #>>44492844 #>>44494100 #>>44494132 #>>44494944 #
6gvONxR4sf7o ◴[] No.44491944[source]
You skipped quotes about the other important side:

> But Alsup drew a firm line when it came to piracy.

> "Anthropic had no entitlement to use pirated copies for its central library," Alsup wrote. "Creating a permanent, general-purpose library was not itself a fair use excusing Anthropic's piracy."

That is, he ruled that

- buying, physically cutting up, physically digitizing books, and using them for training is fair use

- pirating the books for their digital library is not fair use.

replies(6): >>44492103 #>>44492512 #>>44492665 #>>44493580 #>>44493641 #>>44495079 #
1. pier25 ◴[] No.44493580[source]
> buying, physically cutting up, physically digitizing books, and using them for training is fair use

So Suno would only really need to buy the physical albums and rip them to be able to generate music at an industrial scale?

replies(7): >>44493615 #>>44493850 #>>44494405 #>>44494753 #>>44494779 #>>44495203 #>>44496071 #
2. theteapot ◴[] No.44493615[source]
Yes.
replies(1): >>44494683 #
3. ohdeargodno ◴[] No.44493850[source]
Only if the physical albums don't have copy protection; otherwise you're circumventing it, and that's illegal. Or is it, given the right to a private copy? If anything, AI at least shows that all of the existing copyright laws are utter bullshit made to make Disney happy.

Do keep in mind though: this is only for the wealthy. They're still going to send the Pinkertons to your house if you dare copy a Blu-ray.

replies(3): >>44493923 #>>44494068 #>>44494290 #
4. zerocrates ◴[] No.44493923[source]
With some minor exceptions, CDs don't have copy protection.
replies(1): >>44495085 #
5. nilamo ◴[] No.44494068[source]
> They're still going to send the Pinkertons at your house if you dare copy a Blu-ray.

Hey woah now, that's a Hasbro play, not a Disney one.

6. kbelder ◴[] No.44494290[source]
No, because they can just play the album for the AI to learn from. AI training can be set up to exploit the analog hole. Same with images and movies.
7. itronitron ◴[] No.44494405[source]
If it's fair use to train a model, that doesn't necessarily imply that the model can be legally used to generate anything.
replies(3): >>44494718 #>>44494724 #>>44495286 #
8. pier25 ◴[] No.44494683[source]
Actually, it remains to be seen.

If you read the ruling, training was considered fair use in part because Claude is not a book-generation tool; hence it was deemed transformative. That's definitely not what Suno and Udio are doing.

9. pier25 ◴[] No.44494718[source]
I've been reading a bit more about this. The training might not be considered fair use if it's not considered transformative.

Claude has been considered transformative given that it's not really meant to generate books, but Suno or Midjourney are absolutely in another category.

replies(1): >>44495956 #
10. make3 ◴[] No.44494724[source]
This is funny and potentially accurate.
11. jbverschoor ◴[] No.44494753[source]
Same as how it works in the Netherlands.
12. conradev ◴[] No.44494779[source]
Yes! Training and generation are fair use. You are free to train and generate whatever you want in your basement for whatever purpose you see fit. Build a music collection, go ham.

If the output from said model uses the voice of another person, for example, we already have a legal framework in place for determining if it is infringing on their rights, independent of AI.

Courts have heard cases of individual artists copying melodies, because melodies themselves are copyrightable: https://www.hypebot.com/hypebot/2020/02/every-possible-melod...

Copyright law is a lot more nuanced than anyone seems to have the attention span for.

replies(1): >>44494822 #
13. pier25 ◴[] No.44494822[source]
> Yes!

But Suno is definitely not training models in their basement for fun.

They are a private company selling music, using music made by humans to train their models, to replace human musicians and artists.

We'll see what the courts say, but that doesn't sound like fair use.

replies(1): >>44495390 #
14. FateOfNations ◴[] No.44495085{3}[source]
Minor exception: https://en.wikipedia.org/wiki/Sony_BMG_copy_protection_rootk...
15. burnt-resistor ◴[] No.44495203[source]
So not only did they pirate works, but they also destroyed possibly collectible physical copies. Kafkaesque.
replies(1): >>44495335 #
16. protocolture ◴[] No.44495286[source]
Well, there was that legal company that trained an LLM on their opposition's legal documents and then generated their own. I don't think the inputs or outputs were ruled legal in that case.

But as long as the model isn't outputting infringing works, there's not really any issue there either.

17. bigyabai ◴[] No.44495335[source]
Google set the precedent for this with an even less transformative use case: https://en.wikipedia.org/wiki/Authors_Guild,_Inc._v._Google,....
18. conradev ◴[] No.44495390{3}[source]
My understanding is that Suno does not sell music, but instead makes a tool for musicians to generate music and sells access to this tool.

The law doesn't distinguish between basement and cloud – it's a service. You can sell access to the service without selling songs to consumers.

replies(3): >>44495595 #>>44495608 #>>44495707 #
19. pier25 ◴[] No.44495595{4}[source]
That's like arguing that a restaurant doesn't sell food because it sells the service of cooking it.
replies(1): >>44495997 #
20. pyman ◴[] No.44495608{4}[source]
What does "fair use" even mean in a world where models can memorise and remix every book and song ever written? Are we erasing ownership?

The problem is, copyright law wasn't written for machines. It was written for humans who create things.

In the case of songs (or books, paintings, etc), only humans and companies can legally own copyright, a machine can't. If an AI-powered tool generates a song, there’s no author in the legal sense, unless the person using the tool claims authorship by saying they operated the tool.

So we're stuck in a grey zone: the input is human, the output is AI generated, and the law doesn't know what to do with that.

For me the real debate is: Do we need new rules for non-human creation?

replies(1): >>44495950 #
21. johnnyanmac ◴[] No.44495707{4}[source]
That doesn't seem to track in my mind. So you can't sell music, but you can sell 10-second snippets of music you pirated? The math doesn't work out.

But I guess I'm not surprised that 2025 has little respect for artists.

22. markhahn ◴[] No.44495950{5}[source]
Why are you saying "memorize"? Are people training AIs to regurgitate exact copies? If so, that's just copying. If they return something that is not a literal copy of the whole work, then there is established case law about how much is permitted. Some copying clearly is, but not entire works.

When you buy a book, you are not acceding to a license to only ever read it with human eyes, forbearing to memorize it, never to quote it, never to be inspired by it.

replies(2): >>44496065 #>>44496175 #
23. markhahn ◴[] No.44495956{3}[source]
Really? So Suno or Midjourney can produce literal copies of works they were trained on?
replies(1): >>44496705 #
24. conradev ◴[] No.44495997{5}[source]
The restaurant isn't responsible for E. coli found in its ingredients, is it? It just has to cook it out of the food.

Suno can't prevent humans from copying other humans; it can only make sure that the direct output of its system isn't infringing.

25. mwarkentin ◴[] No.44496065{6}[source]
> Specifically, the paper estimates that Llama 3.1 70B has memorized 42 percent of the first Harry Potter book well enough to reproduce 50-token excerpts at least half the time. (I’ll unpack how this was measured in the next section.)

> Interestingly, Llama 1 65B, a similar-sized model released in February 2023, had memorized only 4.4 percent of Harry Potter and the Sorcerer's Stone. This suggests that despite the potential legal liability, Meta did not do much to prevent memorization as it trained Llama 3. At least for this book, the problem got much worse between Llama 1 and Llama 3.

> Harry Potter and the Sorcerer's Stone was one of dozens of books tested by the researchers. They found that Llama 3.1 70B was far more likely to reproduce popular books—such as The Hobbit and George Orwell’s 1984—than obscure ones. And for most books, Llama 3.1 70B memorized more than any of the other models.
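
For context on what "reproduce 50-token excerpts" means in practice, here's a rough sketch of one way such a memorization probe can be run. This is an illustrative simplification (greedy decoding compared against the book's actual continuation), not necessarily the paper's exact method, and the checkpoint name and book path are placeholders:

    # Illustrative memorization probe: prompt with 50 tokens from a book and
    # check whether greedy decoding reproduces the next 50 tokens verbatim.
    # Simplified sketch; the paper works with token probabilities rather than
    # a single greedy sample. Checkpoint name and book path are placeholders.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "meta-llama/Llama-3.1-70B"  # placeholder: any causal LM checkpoint
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    def reproduces_excerpt(ids, start, prompt_len=50, target_len=50):
        """True if the model's greedy continuation of a 50-token prompt
        exactly matches the book's next 50 tokens."""
        prompt = ids[start : start + prompt_len]
        target = ids[start + prompt_len : start + prompt_len + target_len]
        out = model.generate(
            torch.tensor([prompt]),
            max_new_tokens=target_len,
            do_sample=False,  # greedy decoding
        )
        return out[0][prompt_len:].tolist() == target

    book_ids = tok(open("book.txt").read())["input_ids"]
    windows = range(0, len(book_ids) - 100, 100)
    hits = sum(reproduces_excerpt(book_ids, i) for i in windows)
    print(f"{hits}/{len(windows)} windows reproduced verbatim")

The fraction of windows reproduced is a rough proxy for how much of the book the model has memorized.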

26. kelnos ◴[] No.44496071[source]
Not sure we can infer that (or anything) about Suno from this ruling. The judge here said that Anthropic's usage was extremely transformative. Would Suno's also be considered that way?

Anthropic doesn't take books and use them to train a model that is intended to generate new books. (Perhaps it could do that, to some extent, but that's not its [sole] purpose.)

But Suno would be taking music to train a model in order to generate new music. Is that transformative enough? We don't know what a judge thinks, at least not yet.

27. pyman ◴[] No.44496175{6}[source]
You are comparing AI to humans, but they're not the same. Humans don't memorise millions of copyrighted works and spit out similar content. AI does that.

Memorising isn't wrong, but when machines memorise at scale and the people behind the original works get nothing, it raises big ethical questions.

The law hasn't caught up.

replies(1): >>44496684 #
28. bongodongobob ◴[] No.44496684{7}[source]
As a former musician: yes, we do. Any above-average musician can play "Riders on the Storm" in the style of Johnny Cash, or Green Day, or Nirvana, etc. Successful above-average musicians usually have an almost encyclopedic knowledge of artists and albums, at least in their favorite genre. This is how all art is made. Some artists will be more honest about this than others.
29. bongodongobob ◴[] No.44496705{4}[source]
Well, I've been able to get Suno to do Beatles covers. It only works maybe 1 in 20 times, but you can do it. It's not an exact replica either, but you can get the same chords and melodies as the original.