989 points | acomjean | 1 comment
markasoftware No.45143475
They also agreed to destroy the pirated books. I wonder how large a portion of their training data comes from these shadow libraries, and whether AI labs in countries that have made it clear they won't enforce anti-piracy laws against AI companies will gain a substantial advantage by continuing to use shadow libraries.
replies(2): >>45143610 #>>45144912 #
1. gpm No.45144912
Before this lawsuit, and before serving any public models, they had already replaced this data set with one they made by scanning purchased books. Destroying a data set they aren't even using should have approximately zero effect.