
489 points todsacerdoti | 4 comments
benlivengood No.44383064
Open source and libre/free software are particularly vulnerable to a future where AI-generated code is ruled to be either infringing or public domain.

In the former case, disentangling AI edits from human edits could tie a project up in legal proceedings for years, and projects rarely have the funding to fight a copyright suit. Specifically, code that is AI-generated and subsequently modified or incorporated into the rest of the codebase would raise the question of whether the later human edits are themselves non-fair-use derivative works.

In the latter case, the license restrictions no longer apply to portions of the codebase, raising similar questions about derived code. A project that is only 98% OSS/FS-licensed suddenly has much less leverage in takedowns against companies abusing the license terms, since it would have to prove that infringers are definitely using the human-authored, licensed code rather than the public-domain portions.

Proprietary software is only mildly harmed in either case; it would require speculative copyright owners to disassemble their binaries and try to make the case that AI-generated code infringed without being able to see the codebase itself. And plenty of proprietary software has public domain code in it already.

replies(8): >>44383156 #>>44383218 #>>44383229 #>>44384184 #>>44385081 #>>44385229 #>>44386155 #>>44387156 #
1. graemep No.44385229
Proprietary source code would not usually end up training LLMs. Unless it's leaked, how would an LLM have access to it?

> it would require speculative copyright owners to disassemble their binaries

I wonder whether AI might be a useful tool for making that easier.

If you have evidence then you can get courts to order disclosure or examination of code.

> And plenty of proprietary software has public domain code in it already.

I am pretty sure there is a significant amount of proprietary code that has FOSS code in it, against license terms (especially GPL and similar).

A lot of proprietary code is now being written using AIs trained on FOSS code, and companies are open about this. It might open an interesting can of worms.

replies(2): >>44385241 #>>44386080 #
2. physicsguy No.44385241
> Unless its leaked

Given the number of people on HN who say they're using e.g. Cursor, OpenAI, etc. through work, and my experience with workplaces saying 'absolutely you can't use it', I suspect a large amount is being leaked.

replies(1): >>44386188 #
3. pmlnr No.44386080
Licence incompatibility alone is enough to cause this problem.
4. graemep No.44386188
I thought most of these did not use users' context and input for training?