
728 points freetonik | 2 comments
neilv No.44976959
There is also IP taint when using "AI". We're just pretending that there's not.

If someone came to you and said "good news: I memorized the code of all the open source projects in this space, and can regurgitate it on command", you would be smart to ban them from working on code at your company.

But with "AI", we make up a bunch of rationalizations. ("I'm doing AI agentic generative AI workflow boilerplate 10x gettin it done AI did I say AI yet!")

And we pretend the person never said that they're just loosely laundering GPL and other code in a way that rightly would be existentially toxic to an IP-based company.

replies(6): >>44976975 #>>44977217 #>>44977317 #>>44980292 #>>44980599 #>>44980775 #
ineedasername No.44977317
Courts (at least in the US) have already ruled that use of ingested data for training is transformative. There are lots of details still to figure out, but the genie is out of the bottle.

Sure, rethinking IP laws to align with a societal desire that generating IP continue to be a viable economic work product is a big hill to climb, but that is what’s necessary.

replies(9): >>44977525 #>>44978041 #>>44978412 #>>44978589 #>>44979766 #>>44979930 #>>44979934 #>>44980167 #>>44980236 #
jhanschoo No.44979934
An AI model's output can be transformative, but you can be unlucky enough that the LLM memorized its source and gave you that data back unchanged.
replies(1): >>44983110 #
1. martin-t No.44983110
I don't see why verbatim or not should matter at all.

How complex does a mechanical transformation have to be to not be considered plagiarism, copyright infringement or parasitism?

If somebody writes a GPL-licensed program, is it enough to change all variable and function names to get rid of those pesky users' rights? Do you have to change the order of functions? Do you have to convert it to a different language? Surely nobody would claim c2rust is transformative even though the resulting code can be wildly different if you apply enough mechanical transformations.

All LLMs do is make the mechanical transformations 1) probabilistic 2) opaque 3) all at once 4) using multiple projects as a source.
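
To make the renaming case concrete, here is a minimal Python sketch. It is only an illustration, not a legal test and not a real plagiarism detector; `fingerprint`, `ORIGINAL`, `RENAMED` and `DIFFERENT` are hypothetical names made up for this example. It replaces every identifier with a positional placeholder and compares the resulting syntax trees, so two programs that differ only in variable and function names come out identical.

```python
import ast


class _Normalize(ast.NodeTransformer):
    """Replace every identifier with a placeholder based on first appearance."""

    def __init__(self):
        self.names = {}

    def _alias(self, name):
        # The first distinct name seen becomes id0, the next id1, and so on.
        return self.names.setdefault(name, f"id{len(self.names)}")

    def visit_FunctionDef(self, node):
        node.name = self._alias(node.name)
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._alias(node.arg)
        return node

    def visit_Name(self, node):
        # Note: this also renames builtins such as `sum`; fine for a sketch.
        node.id = self._alias(node.id)
        return node


def fingerprint(source):
    """Return a name-independent dump of the program's syntax tree."""
    tree = _Normalize().visit(ast.parse(source))
    return ast.dump(tree)


ORIGINAL = """
def total(prices, tax):
    result = 0
    for p in prices:
        result += p
    return result * (1 + tax)
"""

# Same program with every identifier changed.
RENAMED = """
def f(xs, r):
    acc = 0
    for x in xs:
        acc += x
    return acc * (1 + r)
"""

# Same names as ORIGINAL, but a genuinely different structure.
DIFFERENT = """
def total(prices, tax):
    return sum(prices) + tax
"""

print(fingerprint(ORIGINAL) == fingerprint(RENAMED))
print(fingerprint(ORIGINAL) == fingerprint(DIFFERENT))
```

Renaming alone leaves the structure untouched, which is the sense in which it is a purely mechanical transformation. Reordering functions or translating to another language would defeat this particular check while being no less mechanical.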

replies(1): >>44986639 #
2. jhanschoo No.44986639
> How complex does a mechanical transformation have to be to not be considered plagiarism, copyright infringement or parasitism?

Legally speaking, this varies from domain to domain. But consider, for example, extracting facts from several biology textbooks and then delivering those facts to the user in the characteristic ChatGPT tone, which is distinguishable from the style of each source textbook. You can then be fairly confident that courts will not find that you have infringed copyright.