Define policy forbidding use of AI code generators

1. sysmax ◴[26 Jun 25 01:58 UTC] No.44383569[source]▶

I wish people would make distinction regarding the size/scope of the AI-generated parts. Like with video copyright laws, where a 5-second clip from a copyrighted movie is usually considered fair use and not frowned upon.

Because for projects like QEMU, current AI models can actually do mind-boggling stuff. You can give it a PDF describing an instruction set, and it will generate you wrapper classes for emulating particular instructions. Then you can give it one class like this and a few paragraphs from the datasheet, and it will spit out unit tests checking that your class works as the CPU vendor describes.

Like, you can get from 0% to 100% test coverage several orders of magnitude faster than doing it by hand. Or refactoring, where you want to add support for a particular memory virtualization trick, and you need to update 100 instruction classes based on straight-forward, but not 100% formal rule. A human developer would be pulling their hairs out, while an LLM will do it faster than you can get a coffee.

replies(3): >>44383596 #>>44383598 #>>44384005 #

2. echelon ◴[26 Jun 25 02:03 UTC] No.44383596[source]▶

>>44383569 (TP) #

Qemu can make the choice to stay in the "stone age" if they want. Contributors who prefer AI assistance can spend their time elsewhere.

It might actually be prudent for some (perhaps many foundational) OSS projects to reject AI until the full legal case law precedent has been established. If they begin taking contributions and we find out later that courts find this is in violation of some third party's copyright (as shocking as that outcome may seem), that puts these projects in jeopardy. And they certainly do not have the funding or bandwidth to avoid litigation. Or to handle a complete rollback to pre-AI background states.

3. 762236 ◴[26 Jun 25 02:03 UTC] No.44383598[source]▶

>>44383569 (TP) #

It sounds like you're saying someone could rewrite Qemu on their own, with the help of AI. That would be pretty funny.

replies(1): >>44384101 #

4. halostatue ◴[26 Jun 25 03:44 UTC] No.44384005[source]▶

>>44383569 (TP) #

Not all jurisdictions are the US, and not all jurisdictions allow fair use, but instead have specific fair dealing laws. Not all jurisdictions have fair dealing laws, meaning that every use has to be cleared.

There are simple algorithms that everyone will implement the same way down to the variable names, but aside from those fairly rare exceptions, there's no "maximum number of lines" metric to describe how much code is "fair use" regardless of the licence of the code "fair use"d in your scenario.

Depending on the context, even in the US that 5-second clip would not pass fair use doctrine muster. If I made a new film cut entirely from five second clips of different movies and tried a fair use doctrine defence, I would likely never see the outside of a courtroom for the rest of my life. If I tried to do so with licensing, I would probably pay more than it cost to make all those movies.

Look up the decisions over the last two decades over sampling (there are albums from the late 80s and 90s — when sampling was relatively new — which will never see another pressing or release because of these decisions). The musicians and producers who chose the samples thought they would be covered by fair use.

5. mrheosuper ◴[26 Jun 25 04:06 UTC] No.44384101[source]▶

>>44383598 #

Given enough time, a monkey randomly types on typewriter can rewrite QEMU.