I realize there's a whole legal quagmire here involved with intellectual "property" and what counts as "derivative work", but that's a whole separate (and dubiously useful) part of the law.
If you can use all of the content of stack overflow to create a “derivative work” that replaces stack overflow, and causes it to lose tons of revenue, is it really a derivative work?
I’m pretty sure solution sites like chegg don’t include the actual questions for that reason. The solutions to the questions are derivative, but the questions aren’t.
Privacy makes sense, treating data like property does not.
The users did provide the data, which is a good point. But there’s a reason SO was so useful to developers and quora was not. It also made it a perfect feeding ground for hungry LLMs.
Then again I’m just guessing that big models are trained on SO. Maybe that’s not true