
989 points acomjean | 2 comments
aeon_ai ◴[] No.45143392[source]
To be very clear on this point - this is not related to model training.

It’s important in the fair use assessment to understand that the training itself is fair use, but the pirating of the books is the issue at hand here, and is what Anthropic “whoopsied” into in acquiring the training data.

Buying used copies of books, scanning them, and training on them is fine.

Rainbows End was prescient in many ways.

amradio1989 ◴[] No.45145884[source]
I think the jury is still out on how fair use applies to AI. Fair use was not designed for what we have now.

I could read a book, but it's highly unlikely I could regurgitate it, much less months or years later. An LLM, however, can. While we can say "training is like reading", it's also not like reading at all due to permanent perfect recall.

Not only does an LLM have perfect recall, it also has the ability to distribute plagiarized ideas at a scale no human can. There are a lot of questions to be answered about where fair use starts/ends for these LLM products.

dns_snek ◴[] No.45147413[source]
Fair use wasn't designed for AI, but AI doesn't change the motivations and goals behind copyright. We should be returning to the roots - why do we have copyright in the first place, what were the goals and the intent behind it, and how does AI affect them?

The way this technology is being used clearly violates the intent behind copyright law: it undermines the law's goals and results in the harm that it was designed to prevent. I believe that doing this without extensive public discussion and consensus is anti-democratic.

We always end up discussing concrete implementation details of how copyright is currently enforced, never the concept itself. Is there a good word for this? Reification?

1. godelski ◴[] No.45147500[source]

  >  but AI doesn't change the motivations and goals behind copyright
That's the point they're making
2. dns_snek ◴[] No.45147557[source]
The person I responded to? Yes, I'm agreeing with them, just adding my own thoughts. Maybe I could've worded that better :)