

747 points | porridgeraisin | 12 comments
psychoslave (No.45062941)
What a surprise: a big corp collected a large amount of personal data under certain promises, and now reveals it will actually exploit it in a completely unrelated manner.
raldi (No.45063239)
“These updates will apply only to new or resumed chats and coding sessions.”

https://www.anthropic.com/news/updates-to-our-consumer-terms

1. benterix (No.45063343)
What kind of guarantee do we have this is true?

Meta downloaded copyrighted content and trained their models on it, OpenAI did the same.

Uber developed Greyball to cheat the officials and break the law.

Tesla deletes accident data and reports to the authorities they don't have it.

So forgive me if I have zero trust in whatever these companies say.

2. komali2 (No.45063418)
> What kind of guarantee do we have this is true?

None. And even if it's the nicest goody two shoes company in the history of capitalism, the NSA will have your data and then there'll be a breach and then Russian cyber criminals will have it too.

At this point I'm with you on the zero trust: we should be shouting loud and clear to everyone that if you put data into a web browser or app, that data will at some point be sold for profit without any say from you.

3. scrollaway (No.45063536)
You have no more guarantees that this is true than you had before that they didn’t do it in the first place.

If you don’t take companies at their word, you need to be consistent about it.

4. Aurornis (No.45063639)
> Meta downloaded copyrighted content and trained their models on it, OpenAI did the same

Where did these companies claim they didn’t do this?

Even websites can be covered by copyright. It has always been known that they trained on copyrighted content. The output is considered derivative and therefore it’s not illegal.

5. pixl97 (No.45063734)
I mean, you're really underselling where your data is going to be taken from. Browsers and apps are just the start: your TV is selling your data, your car is selling your data, the places you shop are selling your data.
6. komali2 (No.45063808)
Reading this comment gave me a flash of vertigo as I realized how deep down the rabbit hole of "crazy dude that only pays in cash" I'd fallen.

I don't own a car and only take public transit or bike. I fill my transit card with cash. I buy food with cash at the morning farmer's market. My TV isn't connected to the Internet; it's connected to a Raspberry Pi, which is connected to my home lab running Jellyfin and YouTube archiving software. I de-Googled and use an old used phone with FOSS apps.

It's all happened so gradually I didn't even realize how far I'd gone!

7. Thorrez (No.45063846)
We're having this discussion on an article about Anthropic changing their privacy policy. If you don't believe Anthropic will follow their privacy policy, then a change to the privacy policy should mean nothing to you.
8. jsnell (No.45063974)
If it were a lie, why take the PR hit of admitting they're starting to train on user data while lying about the specifics? It'd be much simpler to just lie and say they don't train on user data at all.

If your threat model is to unconditionally not trust the companies, what they're saying is irrelevant. Which is fair enough, you probably should not be using a service you don't trust at all. But there's not much of a discussion to be had when you can just assert that everything they say is a lie.

> Meta downloaded copyrighted content and trained their models on it, OpenAI did the same.

> Uber developed Greyball to cheat the officials and break the law.

These seem like randomly chosen generic grievances, not examples of companies making promises in their privacy policy (or similar) and breaking them. Am I missing some connection?

9. ravishi (No.45064157)
It's all PR. Some people won't read the details and will just assume it trains on all data. Some people might complain, and they'll be told it was a bug or a minor slip. And after a few months, nobody will remember it was ever different; some might vaguely remember them saying something about it at some point.
10. benterix (No.45073537)
Well, yes and no: compared to the previous ToS, it gives them more plausible deniability ("oh, this particular piece just ended up in the training set by accident") if they get caught.
11. benterix (No.45073554)
> These seem like randomly chosen generic grievances, not examples of companies making promises in their privacy policy (or similar) and breaking them. Am I missing some connection?

My point is that whenever we send our data to a third party, we can assume it could be abused, either unintentionally (through a hack, a mistake, etc.) or intentionally, because these companies are corrupted to the core and, as these random examples show, have a very relaxed attitude toward obeying the law in general.

12. benterix (No.45073587)
> The output is considered derivative and therefore it's not illegal.

Well, this is what they claim. In practice, it's untrue on several levels. First, earlier OpenAI models were able to quote sources verbatim, and they were later crippled so they wouldn't. Second, there have been several lawsuits against OpenAI, and not all of them have concluded. And finally, if courts decide what they did was legal, that would mean anyone could legally download and use a copy of Libgen (part of "Books3"), whereas courts around the world are doing the opposite and blocking access to Libgen country by country. So unless you apply a double standard, something is not right here. Even the Meta employees torrenting Libgen knew that, so let's not pretend we buy this rhetoric.