But unlike the hundreds of data brokers that also want your data, they already have an operational funnel of data that you voluntarily hand them every day. All they need is a dark-pattern ToS change and to manage the minor PR issue. People will forget about this in a week.
If they had done this in a more measured way, they might have been able to separate human from AI content, for example by striking licensing deals with publishers.
Instead, they couldn't wait: they just took it all to be first, and now the well is poisoned for everyone.
I've seen zero evidence that anything of the sort is occurring, or that, if it were, it's due to what you claim. I'd be highly interested in research suggesting either is occurring, however.
To AI companies, data is even more of a gold mine than it is to adtech companies. It is existentially important.
The truly evil behavior will emerge at the intersection of these two industries. I'm sure Google and Facebook are already using data from one to power the other, even if it's currently behind closed doors. I can hardly wait for the use cases these geniuses will think of once this is publicly acceptable and in widespread use by all companies.
The claim was made that the models are "suffering", at this exact moment, because they have been recursively feeding on their own output.
I want evidence the current models are "suffering" right now, and I want further evidence that suggests this suffering is due to recursive data ingestion.
A year-old article with no relevance to today, talking about hypothetical indiscriminate gorging on recursively generated data, is not evidence of either of the things I asked for.
> what's the latest year of data you're trained on
> ChatGPT said: My training goes up to April 2023.
There's a reason they're not willing to update the training corpus even with GPT-5.
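For what it's worth, that exchange is easy to reproduce programmatically. A minimal sketch using the OpenAI Python SDK, assuming an API key is set in the environment and "gpt-4o" as a stand-in model name (the reply is only the model's self-reported cutoff, not ground truth):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    resp = client.chat.completions.create(
        model="gpt-4o",  # stand-in model name; swap for whatever you're testing
        messages=[
            {"role": "user", "content": "What's the latest year of data you're trained on?"}
        ],
    )
    print(resp.choices[0].message.content)  # prints the model's self-reported cutoff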
> A year-old article with no relevance to today
The current models are based on training data that is even older, so I guess you should disregard them too if you're judging things purely by age.