747 points porridgeraisin | 2 comments
superposeur ◴[] No.45064455[source]
Everyone seems to be unsurprised by this move, but I’m genuinely shocked. What a shoot-yourself-in-the-foot business decision. Google, evil though it be, doesn’t post the text of your Gmail messages in its search results, because who would consider using Gmail after that? This is the LLM equivalent. Am I missing something?
replies(7): >>45064592 #>>45064626 #>>45064638 #>>45064681 #>>45064737 #>>45064752 #>>45065348 #
KoolKat23 ◴[] No.45064592[source]
This data is useful for reinforcement learning. All the others do it.

And most importantly, you can just opt out.

replies(3): >>45064613 #>>45064705 #>>45064753 #
1. superposeur ◴[] No.45064705[source]
Ok, to be clear, let’s say I’m dumb and accidentally go with the default (I get the color of the opt-out button wrong or something). It's as if there were a “publish my private emails to the internet” default-on button in email. Then I use it to edit a rec letter for student X, with my signature Y. (Yes, I know this is dumb, and I try changing names when editing, but I'm sure some actual names may slip through.) A few months later the next model is released, trained on that data. Student X asks Claude what Y would write in a rec letter about X. Such a button is a “wings stay on / wings fall off” button on a plane.
replies(1): >>45064868 #
2. franga2000 ◴[] No.45064868[source]
You're severely overestimating the ability of the model to recall a single, mostly uninteresting item from its billions of input documents.