
398 points pyman | 10 comments
1. platunit10 ◴[] No.44492696[source]
Every time an article like this surfaces, it always seems like the majority of tech folks believe that training AI on copyrighted material is NOT fair use, but the legal industry disagrees.

Which of the following are true?

(a) the legal industry is susceptible to influence and corruption

(b) engineers don't understand how to legally interpret legal text

(c) AI tech is new, and judges aren't technically qualified to decide these scenarios

The most likely option is (c), as we've seen this pattern many times before.

replies(9): >>44492721 #>>44492755 #>>44492782 #>>44492783 #>>44492932 #>>44493290 #>>44493664 #>>44494318 #>>44494973 #
2. rockemsockem ◴[] No.44492721[source]
Idk, I think most people in tech I talk to IRL think it is fair use?

I think the overly liberal, non-tech crowd has become really vocal on HN as of late and your sample is likely biased by these people.

3. 827a ◴[] No.44492755[source]
Armchair commentators, including myself, tend to be imprecise about whether something is illegal versus whether it should be illegal. Sometimes that's due to a misunderstanding of the law, or an over-estimation of the court's authority, or an over-estimation of our legislature's productivity, or just because we're making conversation and like talking.
4. CaptainFever ◴[] No.44492782[source]
> Every time an article like this surfaces, it always seems like the majority of tech folks believe that training AI on copyrighted material is NOT fair use

Where are you getting your data from? My conclusions are the exact opposite.

(Also, aren't judges by definition the only ones qualified to declare whether it actually is fair use? You could make a case that it shouldn't be fair use, but that's different from it not being fair use.)

5. redcobra762 ◴[] No.44492783[source]
It's not likely you've actually gotten the opinion of the "majority of tech folks", just the most outspoken ones, and only in specific bubbles you belong to.
6. OkayPhysicist ◴[] No.44492932[source]
There's a lot of conflation of "should/shouldn't" and "is/isn't". The comments by tech folk you're alluding to mostly think that it "shouldn't" be fair use, out of concern about the societal consequences, whereas judges are looking at it and saying that it "is" fair use, based on the existing law.

Any reasonable reading of the current state of fair use doctrine makes it obvious that the process between Harry Potter and the Sorcerer's Stone and "A computer program that outputs responses to user prompts about a variety of topics" is wildly transformative, and thus the usage of the copyrighted material is probably covered by fair use.

7. kube-system ◴[] No.44493290[source]
I know for sure (b) is true. Way too many people on technical forums read legal texts as if the process to interpret laws is akin to a compiler generating a binary.
8. standardUser ◴[] No.44493664[source]
I don't understand at all the resistance to training LLMs on any and all materials available. Then again, I've always viewed piracy as compatible with markets and a democratizing force upon them. I thought (wrongly?) that this was the widespread progressive/leftist perspective, to err on the side of access to information.
9. freshtake ◴[] No.44494318[source]
If I allegedly train off of your training, which was trained off of copyrighted content under fair use, we're good right?

Just asking for a friend who's into this sort of thing.

10. mrguyorama ◴[] No.44494973[source]
Seeing as (a) is true in the US Supreme Court, it's probably at least as true in the lower courts.