This feels like an unwarranted anthropomorphization of what LLMs are doing.
I don't see why it would be different for LLMs.
The issue is the recall LLMs have over copyrighted contents.
Personally, my read is that the issue with most of these cases is that we treat and talk about LLMs as if they do things that humans do. They don't. They don't reason. They don't think. They don't know. They just map input to probabilistic output. LLMs are a tool like any other for more easily achieving some outcome.
It's precisely because we insist on treating LLMs as if they are more than an inefficient storage device (with a neat/useful trick) that we run into questions like this. I personally think the illegal status of current models should be pretty clear simply based on the pirated nature of their input material. To my understanding, fair use has never before applied to works that were obtained illegally.