290 points nobody9999 | 9 comments
jawns ◴[] No.45187038[source]
I'm an author, and I've confirmed that 3 of my books are in the 500K dataset.

Thus, I stand to receive about $9,000 as a result of this settlement.

I think that's fair, considering that two of those books received advances under $20K and never earned out. Also, while I'm sure that Anthropic has benefited from training its models on this dataset, that doesn't necessarily mean that those models are a lasting asset.

replies(22): >>45187319 #>>45187366 #>>45187519 #>>45187839 #>>45188602 #>>45189683 #>>45189684 #>>45190184 #>>45190223 #>>45190237 #>>45190555 #>>45190731 #>>45191633 #>>45192016 #>>45192191 #>>45192348 #>>45192404 #>>45192630 #>>45193043 #>>45195516 #>>45201246 #>>45218895 #
tartoran ◴[] No.45187839[source]
> I think that's fair, considering that two of those books received advances under $20K and never earned out.

It may be fair to you but how about other authors? Maybe it's not fair at all to them.

replies(2): >>45187873 #>>45189724 #
terminalshort ◴[] No.45189724[source]
Do they sell their books for more than $3000 per copy? In that case it isn't fair. Otherwise they are getting a windfall because of Anthropic's stupidity in not buying the books.
replies(5): >>45189898 #>>45190191 #>>45190448 #>>45192764 #>>45196449 #
1. godelski ◴[] No.45190191[source]

  | Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.[0]
Please don't be disingenuous. You know that none of the authors were selling their books for $3k apiece, so obviously this is about something more.

  > because of Anthropic's stupidity in not buying the books.
And what about OpenAI, who did the same thing?

What about Meta, who did the same thing?

What about Google, who did the same thing?

What about Nvidia, who did the same thing?

Clearly something should be done because it's not like these companies can't afford the cost of the books. I mean Meta recently hired people with >$100m packages and bought a data company for $15bn. Do you think they can't afford to buy the books, videos, or even the porn? We're talking about trillion dollar companies.

It's been what, a year since Eric Schmidt said to steal everything and let the lawyers figure it out if you become successful?[1] Personally, I'm not a big fan of "the ends justify the means" arguments. They've led to a lot of unrest, theft, wars, and death.

Do you really not think it's possible to make useful products ethically?

[0] https://news.ycombinator.com/newsguidelines.html

[1] https://www.theverge.com/2024/8/14/24220658/google-eric-schm...

replies(3): >>45190454 #>>45190829 #>>45191515 #
2. janalsncm ◴[] No.45190454[source]
This isn’t a deal to sell their books. The authors are getting $3k per book while maintaining the rights to their IP. The settlement is to avoid statutory damages which are between $750 and $30k or more per infringement.

One of the consequences of retaining their rights is that they can also sue Meta and Google and OpenAI etc for the same thing.

replies(1): >>45190500 #
3. godelski ◴[] No.45190500[source]
I think we are in agreement[0]. I was just focusing on a different part

[0] https://news.ycombinator.com/item?id=45190232

4. terminalshort ◴[] No.45190829[source]
Where is your evidence that Meta, Google, and OpenAI did the same thing? (As for NVIDIA, do they even train models?) Because if they did, why haven't they been sued? This is a garden variety copyright infringement case and would be a slam dunk win for the plaintiffs. The only novel part of the case is the claim that the plaintiffs lost on, which establishes president that training an LLM is fair use.

> Clearly something should be done because it's not like these companies can't afford the cost of the books

Yes indeed it should, and it has. They have been forced to pay $3000 per book they pirated, which is more than 100x what they would have gained if they had gotten away with it.

IMO a fine of 100x the value of a copy of the pirated work is more than sufficient as a punishment for piracy. If you want to argue that the penalty should be more, you can do that, but it is completely missing my point. You are talking about what is fair punishment to the companies, and my comment was talking about what is fair compensation to the authors. Those are two completely different things.

replies(3): >>45193777 #>>45195142 #>>45195829 #
5. kelnos ◴[] No.45191515[source]
> And what about $OTHER_AI_COMPANY, who did the same thing?

If there's evidence of this that will stand up in court, they should be sued as well, and they'll presumably lose. If this hasn't happened, or isn't in the works, then I guess they covered their tracks well enough. That's unfortunate, but that's life.

replies(1): >>45193795 #
6. godelski ◴[] No.45193777[source]
I mean you can Google these... They have also been popping up on HN for the last year, they're even referenced in the article, and there's even another post in the sidebar titled "Anthropic Record AI Copyright Pact Sets Bar for OpenAI, Meta"[0], so I really didn't feel it was necessary to provide links. But sure, if you're feeling lazy, I got your back. I'll even limit it to HN posts so you don't even have to leave the site.

  Torrenting:
  Meta Pirating Books[1,2,3]
    - [1] Fun fact, [1] is the most popular post of all time on HN for the search word "torrent" and the 5th ranking for "Meta". [2] is the 16th for "illegal"
  Nvidia [4,5]
  Apple, Nvidia, Anthropic[6]
  GitHub [7,8]
  OpenAI [9,10]
  Google [11]
    - I mean this one was even mentioned in the article from the Anthropic post from a few days ago[12]
I hope that's sufficient. You can find plenty more if you do a good old-fashioned search instead of just using the HN search. But most of these were pretty high profile stories, so they were pretty quick to look up.

  > which establishes president that training an LLM is fair use.
                      ~~~~~~~~~
                      precedent
I think you misunderstand. The precedent is over the issue of piracy. This has not set precedent on the issue of fair use. There is ongoing litigation, but there was precedent set in another lawsuit with Meta[13], which is currently going through appeals. I'll give you a head start on that one [14,15]. But the issue of fair use is still being debated. These things take years and I don't think anyone will be surprised when this stuff lands in some of the highest courts and gets revisited in a different administration.

  > IMO a fine of 100x the value of a copy of the pirated work is more than sufficient as a punishment for piracy.
Sure. You can have whatever opinion you want. I wasn't arguing about your opinion. I even agreed with it[16]!

But that is a different topic altogether. I still think you've vastly oversimplified the conversation and are thus unintentionally making some naive assumptions. It's the whole reason I said "probably" in [16]. The big difference being just that you're smart enough to figure out how law works and I'm smart enough to know that neither of us are lawyers.

And please don't ask me for more citations unless they are difficult to Google... I think I already set some kinda record here...

  [0] https://archive.is/3oCg8
  [1] https://news.ycombinator.com/item?id=42971446
  [2] https://news.ycombinator.com/item?id=43125840
  [3] https://news.ycombinator.com/item?id=42772771
  [4] https://news.ycombinator.com/item?id=40505480
  [5] https://news.ycombinator.com/item?id=41163032
  [6] https://news.ycombinator.com/item?id=40987971
  [7] https://news.ycombinator.com/item?id=33457063
  [8] https://news.ycombinator.com/item?id=27724042
  [9] https://news.ycombinator.com/item?id=42273817
  [10] https://news.ycombinator.com/item?id=38781941
  [11] https://news.ycombinator.com/item?id=11520633
  [12] https://news.ycombinator.com/item?id=45142885
  [13] https://perkinscoie.com/insights/update/court-sides-meta-fair-use-and-dmca-questions-leaves-door-open-future-challenges
  [14] https://arstechnica.com/tech-policy/2025/07/meta-pirated-and-seeded-porn-for-years-to-train-ai-lawsuit-says/
  [15] https://torrentfreak.com/copyright-lawsuit-accuses-meta-of-pirating-adult-films-for-ai-training/
  [16] https://news.ycombinator.com/item?id=45190232
7. godelski ◴[] No.45193795[source]
I mean they are being sued? I provided a long list of HN links in the sibling comment. But you know... you can also check Google[0]

[0] https://gprivate.com/6ib6y

8. vidarh ◴[] No.45195142[source]
> As for NVIDIA, do they even train models?

Yes. Nemotron:

https://www.nvidia.com/en-gb/ai-data-science/foundation-mode...

9. jimmydorry ◴[] No.45195829[source]
> IMO a fine of 100x the value of a copy of the pirated work is more than sufficient as a punishment for piracy.

Anti-piracy groups use scare letters on pirates where they threaten to sue for tens of thousands of dollars per instance of piracy. Why should it be lower for a company?