I realize there's a whole legal quagmire here involved with intellectual "property" and what counts as "derivative work", but that's a whole separate (and dubiously useful) part of the law.
AI companies will get bailed out like the auto industry was - they won't be hurt at all.
It’s quite clear. It’s easy to opt out. They’re making everyone go through it.
It doesn’t reach your threshold of having everyone sign a contract or something, but then again no other online service makes people sign contracts.
> should be considered a serious criminal offense.
On what grounds? They’re showing people the terms. It’s clear enough. People have to accept the terms. We’ve all been accepting terms for software and signing up for things online for decades.
If you can use all of the content of Stack Overflow to create a “derivative work” that replaces Stack Overflow, causing it to lose tons of revenue, is it really a derivative work?
I’m pretty sure solution sites like Chegg don’t include the actual questions for that reason. The solutions to the questions are derivative, but the questions aren’t.
(and as diggan said, the web isn't the only source they use anyway. who knows what they're buying from data brokers.)
And when talking specifically about AI, one could argue that learning from interactions is a common aspect of intelligence, so a casual user who does not understand the details of LLMs would expect it anyway. Also, the fact that LLMs (and other neural networks) have distinct training and inference phases seems more like an implementation detail.
Privacy makes sense, treating data like property does not.
The users did provide the data, which is a good point. But there’s a reason SO was so useful to developers and Quora was not. That same quality made it a perfect feeding ground for hungry LLMs.
Then again, I’m just guessing that big models are trained on SO. Maybe that’s not true.