
290 points nobody9999 | 2 comments
jawns ◴[] No.45187038[source]
I'm an author, and I've confirmed that 3 of my books are in the 500K dataset.

Thus, I stand to receive about $9,000 as a result of this settlement.

I think that's fair, considering that two of those books received advances under $20K and never earned out. Also, while I'm sure that Anthropic has benefited from training its models on this dataset, that doesn't necessarily mean that those models are a lasting asset.

replies(22): >>45187319 #>>45187366 #>>45187519 #>>45187839 #>>45188602 #>>45189683 #>>45189684 #>>45190184 #>>45190223 #>>45190237 #>>45190555 #>>45190731 #>>45191633 #>>45192016 #>>45192191 #>>45192348 #>>45192404 #>>45192630 #>>45193043 #>>45195516 #>>45201246 #>>45218895 #
visarga ◴[] No.45187519[source]
How is it fair? Do you expect $9,000 from Google, Meta, OpenAI, and everyone else? Were your books imitated by AI?

Infringement was supposed to imply substantial similarity. Now it is supposed to mean statistical similarity?

replies(4): >>45187577 #>>45187677 #>>45187811 #>>45187853 #
gruez ◴[] No.45187577[source]
>Were your books imitated by AI?

Given that books can be imitated by humans with no compensation, this isn't as strong an argument as you think. Moreover, AFAIK the training itself has been ruled legal, so Anthropic could theoretically have bought the book for $20 (or whatever) and been in the clear, which would obviously bring less revenue than the $9k settlement.

replies(2): >>45187621 #>>45188044 #
visarga ◴[] No.45187621[source]
Copyright should be about copying rights, not statistical similarities. Similarity vs. causal link: a different standard altogether.
replies(3): >>45187751 #>>45187806 #>>45187851 #
Retric ◴[] No.45187751[source]
The entire purpose of training materials is to copy aspects of them. That’s the causal link.
replies(2): >>45187830 #>>45193880 #
1. visarga ◴[] No.45193880[source]
> That’s the causal link.

But copyright was based on substantial similarity, not causal links. That is the subtle change. Copyright is expanding more and more.

In my view, unless there is substantial similarity to the infringed work, copyright should not be invoked.

Even the substantial similarity concept is already an expansion of the original "protected expression".

It makes no sense to attack gen-AI for infringement. If we wanted the originals, we would get the originals; you can copy anything you like on the web. Generating bootleg Harry Potter is slow, expensive, and unfaithful to the original. We use gen-AI for creating things different from the training data.

replies(1): >>45211202 #
2. Retric ◴[] No.45211202[source]
Substantial similarity is less stringent than causal links. With substantial similarity, the world’s a minefield of unpopular media.

Copyright isn’t supposed to apply if you happen to write a story that bears an uncanny similarity to a story you never read, written in 1952 in a language you don’t know, that sold 54 copies.