Define policy forbidding use of AI code generators

(github.com)

493 points todsacerdoti | 4 comments | 25 Jun 25 23:26 UTC


benlivengood ◴[26 Jun 25 00:21 UTC] No.44383064[source]▶

Open source and libre/free software are particularly vulnerable to a future where AI-generated code is ruled to be either infringing or public domain.

In the former case, disentangling AI edits from human edits could tie a project up in legal proceedings for years, and projects don't have the funding to fight a copyright suit. Specifically, code that is AI-generated and subsequently modified or incorporated into the rest of the code would raise the question of whether the subsequent human edits were non-fair-use derivative works.

In the latter case, the license restrictions no longer apply to portions of the codebase, raising similar issues for derived code. A project that is only 98% OSS/FS licensed suddenly has much less leverage in takedowns against companies abusing the license terms, since it has to prove that infringers are definitely using the human-generated, licensed code.

Proprietary software is only mildly harmed in either case: speculative copyright claimants would have to disassemble the vendor's binaries and try to make the case that AI-generated code infringed, without being able to see the codebase itself. And plenty of proprietary software has public domain code in it already.

replies(8): >>44383156 #>>44383218 #>>44383229 #>>44384184 #>>44385081 #>>44385229 #>>44386155 #>>44387156 #

olalonde ◴[26 Jun 25 13:20 UTC] No.44387156[source]▶

>>44383064 #

Seems like a fake problem. Who would sue QEMU for using AI-generated code? OpenAI? Anthropic?

replies(1): >>44387220 #

ethbr1 ◴[26 Jun 25 13:28 UTC] No.44387220[source]▶

>>44387156 #

Anyone whose code is in a used model's training set.*

This is about future existential tail risk, not current risk.

* Depending on future court decisions in different jurisdictions

replies(1): >>44387393 #

1. olalonde ◴[26 Jun 25 13:48 UTC] No.44387393[source]▶

>>44387220 #

Again, seems so implausible that it's not worth worrying about.

replies(2): >>44387785 #>>44388305 #

2. ethbr1 ◴[26 Jun 25 14:27 UTC] No.44387785[source]▶

>>44387393 (TP) #

Were you around for SCO? https://en.m.wikipedia.org/wiki/Timeline_of_SCO%E2%80%93Linu...

IP disputes aren't trivial, especially for shoestring-funded OSS.

3. consp ◴[26 Jun 25 15:23 UTC] No.44388305[source]▶

>>44387393 (TP) #

It is implausible until it isn't, and QEMU is taking a very cheap and easy step by banning it outright, covering their ass just in case. The threat is low probability but high impact, and thus a valid one to consider.

replies(1): >>44389882 #

4. olalonde ◴[26 Jun 25 18:17 UTC] No.44389882[source]▶

>>44388305 #

I disagree. Open source projects routinely deal with far greater risk, like employees contributing open source code on company time without explicit authorization. Yet they generally allow code from anyone without much verification (some have a contributor agreement but it's based on trust, there's no actual verification). I stand by my 2022 prediction[0]: no one will get sued for using LLM-generated code.

[0] https://news.ycombinator.com/item?id=31849027
