
989 points | acomjean | 1 comment
aeon_ai No.45143392
To be very clear on this point - this is not related to model training.

For the fair use assessment, it’s important to understand that the training itself is fair use; the issue at hand is the pirating of the books, which is what Anthropic “whoopsied” into when acquiring the training data.

Buying used copies of books, scanning them, and training on them is fine.

Rainbows End was prescient in many ways.

ants_everywhere No.45143731
I wonder what Aaron Swartz would think if he lived to see the era of libgen.
klntsky No.45143762
He died (2013) after libgen was created (2008).
arcanemachiner No.45143787
Yeah, but did he die before anybody actually knew about it?
edgineer No.45145157
I knew about Library Genesis by 2012. It was at least 10 TiB in size by then, IIRC. Given the amount of Russian-language content, I got the impression it was more popular in that sphere, but it was an impressive collection for anyone and not especially secret.
1. h2zizzle No.45146186
To be fair, he might have been rather preoccupied at that time.