
728 points freetonik | 2 comments
neilv ◴[] No.44976959[source]
There is also IP taint when using "AI". We're just pretending that there's not.

If someone came to you and said "good news: I memorized the code of all the open source projects in this space, and can regurgitate it on command", you would be smart to ban them from working on code at your company.

But with "AI", we make up a bunch of rationalizations. ("I'm doing AI agentic generative AI workflow boilerplate 10x gettin it done AI did I say AI yet!")

And we pretend the person never said it, while they loosely launder GPL and other code in a way that would rightly be existentially toxic to an IP-based company.

replies(6): >>44976975 #>>44977217 #>>44977317 #>>44980292 #>>44980599 #>>44980775 #
ineedasername ◴[] No.44977317[source]
Courts (at least in the US) have already ruled that using ingested data for training is transformative. There are lots of details still to figure out, but the genie is out of the bottle.

Sure, it's a big hill to climb to rethink IP law so it aligns with the societal desire that generating IP remain a viable economic work product, but that is what's necessary.

replies(9): >>44977525 #>>44978041 #>>44978412 #>>44978589 #>>44979766 #>>44979930 #>>44979934 #>>44980167 #>>44980236 #
BobbyTables2 ◴[] No.44980236[source]
I’m curious … So “transformative” is not necessarily “derivative”?

Seems to me the training of AI is not radically different from compression algorithms building up a dictionary and compressing data against it.

Yet nobody calls JPEG compression “transformative”.

Could one do lossy compression over billions of copyrighted images to “train” a dictionary?
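That's roughly how dictionary coders already work: you can "train" a shared dictionary on sample data and then compress new data as references into it. A minimal sketch in Python, using zlib's preset-dictionary feature (the sample strings are made up for illustration):

    import zlib

    # Crude "training": build a shared dictionary from sample material.
    samples = [b"the quick brown fox jumps over the lazy dog",
               b"the lazy dog sleeps while the quick fox runs"]
    zdict = b" ".join(samples)

    # Compress new data against the learned dictionary.
    comp = zlib.compressobj(level=9, zdict=zdict)
    payload = b"the quick dog jumps over the lazy fox"
    packed = comp.compress(payload) + comp.flush()

    # Decompression only works if you hold the same dictionary,
    # i.e. the statistics learned from the source material.
    decomp = zlib.decompressobj(zdict=zdict)
    assert decomp.decompress(packed) == payload
    print(len(payload), "->", len(packed), "bytes")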

replies(2): >>44980566 #>>44981498 #
1. ipaddr ◴[] No.44981498[source]
A compression algorithm doesn't transform the data; it stores it in a different format. Storing a story in a .txt file vs. a Word file doesn't transform the data.

An LLM is looking at the shape of words and ideas at scale and using that to provide answers.
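To be concrete about "a different format": a lossless codec is bit-for-bit invertible, which is the property that comparison leans on. A quick sketch with Python's zlib (the sample text is a stand-in):

    import zlib

    original = b"Call me Ishmael. Some years ago, never mind how long precisely..."
    packed = zlib.compress(original, level=9)   # stored in a different format
    restored = zlib.decompress(packed)          # the exact original bytes come back

    assert restored == original                 # lossless: nothing is lost or replaced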

replies(1): >>44984885 #
2. const_cast ◴[] No.44984885[source]
No, a compression algorithm does transform the data, particularly a lossy one. The pixels stored in the output are not in the input; they're new pixels. That's why you can't un-compress a JPEG. It's a new image that just happens to look like the original. But it might not even look like it; some JPEGs are so deep-fried that they become their own form of art. This is very popular in meme culture.
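You can see that directly: push an image through a JPEG round trip and the decoded pixel values differ from what went in. A sketch with Pillow, assuming it's installed (the gradient image is just a stand-in for a real photo):

    import io
    from PIL import Image

    # Stand-in source image: a simple gradient instead of a real photo.
    src = Image.new("RGB", (64, 64))
    src.putdata([(x * 4, y * 4, 128) for y in range(64) for x in range(64)])

    # Round-trip through JPEG at low quality.
    buf = io.BytesIO()
    src.save(buf, format="JPEG", quality=20)
    decoded = Image.open(io.BytesIO(buf.getvalue()))

    # The decoded pixels are newly synthesized approximations, not the originals.
    diffs = sum(1 for a, b in zip(src.getdata(), decoded.getdata()) if a != b)
    print(f"{diffs} of {64 * 64} pixels differ after one JPEG round trip")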

The only difference, really, is that we know how the JPEG algorithm works. If I wanted to, I could painstakingly make a JPEG by hand. We don't know how LLMs work.