I check when I start using any new service. The cynical assumption that everything's being shared leads to shrugging it off and making no attempt to look for settings.
It only takes a moment to go into settings -> privacy and look.
They already used data without permission on a much larger scale, so isn't it a perfectly logical assumption that they would continue doing so with their users?
Training on everything you can publicly scrape from the internet is a very different thing from training on data that your users submit directly to your service.
What if you ask it for medical or legal advice? What if you turn on Gmail integration? Should I now be able to generate your conversations with the right prompt?
xAI trains Grok on both public data (Tweets) and non-public data (Conversations with Grok) by default. [0]
> Grok.com Data Controls for Training Grok: For the Grok.com website, you can go to Settings, Data, and then “Improve the Model” to select whether your content is used for model training.
Meta trains its AI on things posted to Meta's products, which are not as "public" as Tweets on X, because users expect these to be shared only with their networks. They do not use DMs, but they do use posts to Instagram/Facebook/etc. [1]
> We use information that is publicly available online and licensed information. We also use information shared on Meta Products. This information could be things like posts or photos and their captions. We do not use the content of your private messages with friends and family to train our AIs unless you or someone in the chat chooses to share those messages with our AIs.
OpenAI uses conversations for training data by default. [2]
> When you use our services for individuals such as ChatGPT, Codex, and Sora, we may use your content to train our models.
> You can opt out of training through our privacy portal by clicking on “do not train on my content.” To turn off training for your ChatGPT conversations and Codex tasks, follow the instructions in our Data Controls FAQ. Once you opt out, new conversations will not be used to train our models.
[1] https://www.facebook.com/privacy/genai/
[2] https://help.openai.com/en/articles/5722486-how-your-data-is...