
728 points by freetonik | 2 comments
neilv ◴[] No.44976959[source]
There is also IP taint when using "AI". We're just pretending that there's not.

If someone came to you and said "good news: I memorized the code of all the open source projects in this space, and can regurgitate it on command", you would be smart to ban them from working on code at your company.

But with "AI", we make up a bunch of rationalizations. ("I'm doing AI agentic generative AI workflow boilerplate 10x gettin it done AI did I say AI yet!")

And we pretend the person never said that they're just loosely laundering GPL and other code in a way that rightly would be existentially toxic to an IP-based company.

replies(6): >>44976975 #>>44977217 #>>44977317 #>>44980292 #>>44980599 #>>44980775 #
ineedasername ◴[] No.44977317[source]
Courts (at least in the US) have already ruled that use of ingested data for training is transformative. There are lots of details to figure out, but the genie is out of the bottle.

Sure, rethinking IP law so that producing IP remains a viable economic work product is a big hill to climb, but that is what’s necessary.

replies(9): >>44977525 #>>44978041 #>>44978412 #>>44978589 #>>44979766 #>>44979930 #>>44979934 #>>44980167 #>>44980236 #
bsder ◴[] No.44978412[source]
> Courts (at least in the US) have already ruled that use of ingested data for training is transformative.

If you have code that happens to be identical to someone else's code, or that implements someone's proprietary algorithm, you're going to lose in court even if you claim an "AI" gave it to you.

AI is training on private GitHub repos and coughing them up. I've had it regurgitate a very well written piece of code for a particular computational geometry algorithm. It presented perfect, idiomatic Python with perfect tests that caught all the degenerate cases. That was obviously proprietary code--no amount of searching turned up anything even remotely close (which is why I asked the AI in the first place).
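
That code isn't mine to share, but as a rough sketch of the kind of degenerate cases I mean (not the actual algorithm in question): think of a 2D segment-intersection test, where the general case is easy and the hard part is collinear overlap and touching endpoints.

    # Illustrative sketch only, not the regurgitated code: a 2D segment
    # intersection test where the degenerate cases (collinear points,
    # touching endpoints) are the part naive implementations get wrong.

    def orient(a, b, c):
        # Cross product of (b - a) and (c - a): >0 left turn, <0 right turn, 0 collinear.
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

    def on_segment(p, a, b):
        # For a point p already known to be collinear with segment ab:
        # does it lie within the segment's bounding box?
        return (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
                and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))

    def segments_intersect(p1, p2, p3, p4):
        # True if segment p1p2 intersects segment p3p4, including degenerate cases.
        d1, d2 = orient(p3, p4, p1), orient(p3, p4, p2)
        d3, d4 = orient(p1, p2, p3), orient(p1, p2, p4)
        # General position: each segment's endpoints straddle the other segment.
        if ((d1 > 0 and d2 < 0) or (d1 < 0 and d2 > 0)) and \
           ((d3 > 0 and d4 < 0) or (d3 < 0 and d4 > 0)):
            return True
        # Degenerate cases: a collinear endpoint lying on the other segment.
        if d1 == 0 and on_segment(p1, p3, p4): return True
        if d2 == 0 and on_segment(p2, p3, p4): return True
        if d3 == 0 and on_segment(p3, p1, p2): return True
        if d4 == 0 and on_segment(p4, p1, p2): return True
        return False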

replies(5): >>44979018 #>>44979022 #>>44979146 #>>44979821 #>>44979900 #
ineedasername ◴[] No.44979018[source]
>If you have code that happens to be identical to someone else's code, or that implements someone's proprietary algorithm, you're going to lose in court even if you claim an "AI" gave it to you.

Not for a dozen lines here or there, even if they could be found and identified in a massive code base. That’s like quoting a paragraph from one book in another: not infringing.

For the second half of your comment, it sounds like you’re saying the results were too good to have come from an AI--that’s a bit “no true Scotsman”, at least without more detail. But implementing an algorithm, even a complex one, is very much something an LLM can do. An algorithm is essentially natural language that is much better defined and scoped, and LLMs already do a reasonable job of translating natural language into programming languages; implementing an algorithm is a narrow subset of that task with better-defined context and syntax.

replies(2): >>44979801 #>>44980120 #
ozfive ◴[] No.44979801[source]
What happens when company A implements algorithm X based on AI output, company B does the same, and company A then claims the code is proprietary and takes company B to court?
replies(2): >>44979970 #>>44980026 #
1. andreasmetsala ◴[] No.44979970[source]
What has happened when the same thing happens without AI involved?
replies(1): >>44981006 #
2. ozfive ◴[] No.44981006[source]
Yep, it’s not a brand-new problem. I just wonder if AI is going to turbocharge the odds of these disputes popping up.