Copyright in its current form is ridiculous, but I support some (much-pared-back) version of copyright that limits rights further, expands fair use, repeals the DMCA, and reduces the copyright term to something on the order of 15-20 years (perhaps with a renewal option as with patents).
I've released a lot of software under the GPL, and the GPL in its current form couldn't exist without copyright.
What copyright should do is protect individual creators, not corporations. And it should protect them even if their work is mixed through complex statistical algorithms such as LLMs.
LLMs wouldn't be possible without _trillions_ of hours of work by the people who wrote the books, code, music, etc. that they are trained on. The _millions_ of hours of work spent on the training algorithm itself, the chat interface, the scraping scripts, etc. are barely a drop in the bucket.
There is zero reason the people who spent mere millions of hours of work should get all the reward without giving anything to the rest of the world, who put in trillions of hours.
Your point remains, but the problem of the division of responsibility and financial credit doesn't go away with that alone. Do you know if the OpenAI lawsuits have laid this out?
It can be as simple as "you cannot train on someone's work for commercial uses without a license". It can be as complex as setting up some sort of model like Spotify's, based on the number of times the LLM references those works in what it generates. The devil's in the details, but the problem itself isn't new.
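To make the Spotify-style option concrete, here is a rough sketch of a pro-rata split. Everything in it is hypothetical: the function name `split_royalties`, the idea of a fixed payout pool in cents, and above all the per-work reference counts, which it simply assumes someone can measure (arguably the hard part).

```python
def split_royalties(pool_cents: int, reference_counts: dict[str, int]) -> dict[str, int]:
    """Split a payout pool across works in proportion to how often each was referenced.

    Hypothetical sketch: assumes reference counts per work are already known.
    """
    total = sum(reference_counts.values())
    if total == 0:
        # Nothing was referenced, so nothing is paid out.
        return {work: 0 for work in reference_counts}

    # Integer pro-rata share for each work, rounded down.
    payouts = {
        work: pool_cents * count // total
        for work, count in reference_counts.items()
    }

    # Hand out the cents lost to rounding, largest remainder first,
    # so the payouts always sum to the full pool.
    leftover = pool_cents - sum(payouts.values())
    by_remainder = sorted(
        reference_counts,
        key=lambda work: (pool_cents * reference_counts[work]) % total,
        reverse=True,
    )
    for work in by_remainder[:leftover]:
        payouts[work] += 1
    return payouts


if __name__ == "__main__":
    print(split_royalties(1000, {"novel_a": 3, "repo_b": 1}))
```

The arithmetic is the easy bit; deciding what counts as a "reference" and who audits the counts is where the details get devilish.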
>Dividing equal share based on inputs would require the company to potentially expose proprietary information.
I find this defense ironic, given that a lot of this debate revolves around defining copyright infringement. The works being trained on are infringed upon, but the worry is that we might reveal too many details about the tech used to siphon up all this IP? Tragic.
>Do you know if the OpenAI lawsuits have laid this out?
IANAL, but my understanding is that the high-profile cases lean more towards "you can't train on this" litigation than the "how do we set up a payment model" sort. If that's correct, we're pretty far from considering that.