
747 points porridgeraisin | 19 comments
ljosifov ◴[] No.45064773[source]
Excellent. What were they waiting for up to now?? I thought they already trained on my data. I assume they train, and even hope that they train, even when they say they don't. People who want to be data privacy maximalists - fine, don't use their data. But there are people out there (myself included) who are on the opposite end of the spectrum, and we are mostly ignored by the companies. Companies just assume people only ever want to deny them their data.

It annoys me greatly that I have no tick box on Google to tell them "go and adapt the models I use on my Gmail, Photos, Maps etc." I don't want Google to ever be mistaken about where I live - I have told them 100 times already.

This idea that "no one wants to share their data" is just assumed, and permeates everything. Like soft-ball interviews that a popular science communicator did with DeepMind folks working in medicine: every question was prefixed by litany of caveats that were all about 1) assumed aversion of people to sharing their data 2) horrors and disasters that are to befall us should we share the data. I have not suffered any horrors. I'm not aware of any major disasters. I'm aware of major advances in medicine in my lifetime. Ultimately the process does involve controlled data collection and experimentation. Looks a good deal to me tbh. I go out of my way to tick all the NHS boxes too, to "use my data as you see fit". It's an uphill struggle. The defaults are always "deny everything". Tick boxes never go away, there is no master checkbox "use any and all of my data and never ask me again" to tick.

replies(16): >>45064814 #>>45064872 #>>45064877 #>>45064889 #>>45064911 #>>45064921 #>>45064967 #>>45064974 #>>45064988 #>>45065001 #>>45065005 #>>45065065 #>>45065128 #>>45065333 #>>45065457 #>>45065554 #
1. 12ian34 ◴[] No.45064814[source]
not remotely worried about leaks, hacks, or sinister usage of your data?
replies(3): >>45064920 #>>45065057 #>>45072864 #
2. londons_explore ◴[] No.45064920[source]
I would far prefer the service use my data to work better and take a few privacy risks.

People die all the time from cancer or car accidents. People very rarely die from data leaks.

Some countries like Sweden make people's private financial data public information - and yet their people seem happier than ever. Perhaps privacy isn't as important as we think for a good society.

replies(6): >>45065000 #>>45065055 #>>45065141 #>>45065772 #>>45065823 #>>45066321 #
3. soiltype ◴[] No.45065000[source]
public/private isn't a binary, it's a spectrum. we Americans mostly sit in the shithole middle ground where our data is widely disseminated among private, for-profit actors, for the explicit purpose of being used to manipulate us, but it's mostly not available to us, creating an asymmetric power balance.
replies(1): >>45066547 #
4. 12ian34 ◴[] No.45065055[source]
Sweden is a very poor example: all that is public is personal taxable income. That's it. You're comparing apples to oranges. And how are your home address and AI chatbot history going to cure cancer?
5. ljosifov ◴[] No.45065057[source]
If they leaked bank account numbers, or private keys, I would be worried. That has not happened in the past.

About myself personally - my name and surname are googleable, I'm on the open electoral register so my address is not a secret, my company information is also open in the companies register, and I have a personal website I have put up willingly, where I share information about myself. Training models on my data doesn't seem riskier than that.

Yeah, I know I'd be safer if I were completely dark, opaque to the world. I like the openness, though. I also think my life has been enriched in infinitely many ways by people sharing parts of their lives with me via their data. So it would be mildly sociopathic of me if I didn't do something similar back to the world, to some extent.

replies(2): >>45065103 #>>45068227 #
6. 12ian34 ◴[] No.45065103[source]
So you are projecting sociopathy onto those who choose to keep their lives more private than you do? Like you said, basic personal details are essentially public knowledge anyway. Where do you personally draw the line on what should be private?
replies(1): >>45065529 #
7. Gud ◴[] No.45065141[source]
That financial data is very limited. Would it be just as acceptable if these companies knew where and what you purchased?
8. ljosifov ◴[] No.45065529{3}[source]
Not at all - on the contrary, I chose my words carefully ("mildly sociopathic OF ME") so as to avoid casting shade on others. Saying "this is how I feel" precisely to preclude judging others. Everyone makes their own choices, and that's fine.

Boundaries - yes, sure, they exist. I don't have my photo albums open to the world. I don't share info about family and friends - I know people by default don't want information about them shared, and I try to respect that. I don't share anything on Facebook, for example, where plenty of people do.

At the same time, I find the obstacles to data sharing codified in UK law frustrating. With the UK NHS: 1) I can't email my GP to pass information back and forth - the GP withholds their email contact; 2) the private hospital that did my MRI scan made me jump through 10 hoops before sharing my own data with me; 3) blood test scheduling can't tell me that scheduling for a date failed - apparently it's too much for them to have my email address on record; 4) I can't volunteer my data to benefit R&D in the NHS ("here are my lab work reports, 100 GB of my DNA paid for by myself, my medical histories - take them all in, use them as you please..."). In all cases, vague mutterings of "data protection... GDPR..." have been relayed back as "reasons". I take it that's mostly B/S - they could work around it if they wanted to. But there is a kernel of truth: it's easier for them not to try to share, so the law is used as a fig leaf (in the worst case, an alibi for laziness).

I'm for having the power to share, or not share, what I want. With Google, I do want them to know about me and use that for my (and their) benefit. With the UK gov (trying to break encryption), I don't want them to be able to read my WhatsApps. I detest the UK gov for effectively forcing me (by forcing the online pharmacy) to take photos of myself (face, figure) in order to buy Wegovy online earlier today.

replies(1): >>45066579 #
9. ◴[] No.45065772[source]
10. nojs ◴[] No.45065823[source]
Would you be comfortable posting all of this information here, right now? Your name, address, email address, search history, ChatGPT history, emails, …

If not, why?

11. ljosifov ◴[] No.45066321[source]
In the past I have found the obstacles to data sharing codified in UK law frustrating. I'm reasonably sure some people have died because of this who would not have died otherwise - if only they could have communicated with the NHS similarly (email, WhatsApp) to how they communicate in their private and professional lives.

Within the UK NHS and UK private hospital care, these are my personal experiences.

1) I can't email my GP to pass information back and forth. The GP withholds their email contact, so I can't email them e.g. pictures of scans or lab work reports. In theory they should already have those on their side; in practice they rarely do. The exchange of information goes sms -> web link -> web form -> submit - for one single turn. There will be multiple turns. Most people just give up.

2) The private hospital that did my MRI scan made me jump through 10 hoops before sending me a link so I could download my MRI videos and pictures. Most people would have given up. There were several forks in the process which, in retrospect, could have delayed the download even more.

3) Blood test scheduling can't tell me that a blood test scheduled for a date failed. Apparently it's somewhere between too much and impossible for them to have my email address on record and email me back that the test was scheduled, or that the scheduling failed and I should re-run the process.

4) I would like to volunteer my data to benefit R&D in the NHS. I'm a user of medical services. I'm cognisant that all of those services help, and that the process of establishing them relied on people unknown to me sharing very sensitive personal information. If it weren't for those people, I would be way worse off. I'd like to do the same, and be able to tell the UK NHS: "here are my lab work reports, 100 GB of my DNA paid for by myself, my medical histories - take them all in, use them as you please."

In all cases, vague mutterings of "data protection... GDPR..." have been relayed back as "reasons". I take it that's mostly B/S. Yes, there are obstacles, but the staff could work around them if they wanted to. However, there is a kernel of truth: it's easier for them not to try to share - it's less work and less risk - so the laws are used as a fig leaf (in the worst case, an alibi for laziness).

12. ljosifov ◴[] No.45066547{3}[source]
I agree with your stance there. Further: the conventional opinion about the power imbalance that comes from the information imbalance (state/business know a lot about me; I know little about them) is that we citizens and consumers should reduce our "information surface" towards them, and address the imbalance that way. But.

There exists another, often unmentioned option. And that option is for state/business to open up, to increase their "information surface" towards us, their citizens/consumers. That would also rebalance information (and, one hopes, power). Every time it's actually measured how much value we put on our privacy - when we have to weigh privacy against convenience and other gains from more data sharing - the revealed preference is close to zero. We put the value of our privacy at close to zero, despite forever saying otherwise (that we value privacy very, very much; it seems "it ain't so").

So the option of state/business revealing more data to us citizens/consumers is actually more realistic. Yes, there is extra work on the part of state/business to open their data to us, but it's worth it. The more advanced the society, the more coordination it needs to achieve the right cooperation-competition balance in the interactions between ever greater numbers of people.

There is an old book "Data For the People" by an early AI pioneer and Amazon CTO Andreas Weigend. Afaics it well describes the world we live in, and also are likely to live even more in the future.

13. 12ian34 ◴[] No.45066579{4}[source]
Thanks for this considered response. I find it difficult to disagree with anything you said in this particular comment :) However, I do think each instance you mention here is quite different from the topic at hand, the big tech data machine. Additionally, I think I would rather have our UK level of privacy around healthcare data than the commercialised free-for-all in the US. One counterpoint could be that Palantir got a significant amount of UK NHS data.
replies(1): >>45067008 #
14. ljosifov ◴[] No.45067008{5}[source]
Thanks for the consideration. Yeah, the US and UK are different in that respect. I got the impression that the US ends up with the worst deal on both ends: organisations that could help you are denied your data, while the most unscrupulous organisations, most bent on doing their worst with your data, get almost free access to it.

For the UK - I'm reasonably sure some people have died because of the difficulties in sharing their data who would not have died otherwise. "Otherwise" being: if they could have communicated with the NHS and shared their data (via email, WhatsApp etc.) similarly to how they communicate and share data in their private and professional lives.

People at a personal level have a fairly reasonable stance, in how they behave, when it comes to sharing their data. They are surprisingly subtle in their cost-benefit analysis. It's only when they answer surveys, or talk in public, that they are less than entirely truthful. We know this b/c their revealed preferences are at odds with what they say they value, and how much they say they value it.

15. int_19h ◴[] No.45068227[source]
LLMs can and do sometimes regurgitate parts of training data verbatim - this has been demonstrated many times on things ranging from Wikipedia articles to code snippets. Yes, it is not particularly likely for that damning private email of yours to be memorized, but if you throw a dataset with millions of private emails onto a model, it will almost certainly memorize some of them, and nobody knows what exact sequence of input tokens might trigger it to recite.
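
(A minimal sketch of the kind of extraction probe described here, assuming a Hugging Face causal LM; the model name and the "suspect" email line are hypothetical stand-ins, not anything from this thread:)

    # Naive verbatim-memorization probe: give the model the first half of a
    # string suspected to be in its training data, decode greedily, and
    # check whether the continuation reproduces the second half.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"  # stand-in; substitute the model under test
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Hypothetical private string; the published extraction attacks try
    # huge numbers of candidate prefixes, not just one.
    suspect = "Dear Dr. Smith, my test results from 12 March show that"
    prefix, target = suspect[: len(suspect) // 2], suspect[len(suspect) // 2 :]

    inputs = tokenizer(prefix, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=32, do_sample=False)
    continuation = tokenizer.decode(output[0][inputs["input_ids"].shape[1] :])
    print("regurgitated" if continuation.lstrip().startswith(target.strip())
          else "no exact match")

No single probe proves much either way; the point is only that greedy decoding from the right prefix can surface a memorized continuation verbatim.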
replies(1): >>45072902 #
16. ljosifov ◴[] No.45072864[source]
I'm worried; it's not like I don't care. For example, I'm worried that Google is such a ginormous target that at some point their Gmail will be broken. At the same time, there are benefits to sharing data. There are benefits to me in Google using the information it has on me to make my life easier. In this case, I judge that Gemini training on my data is a low extra risk for me, compared to all the other risks I take doing things in public - including writing this on public forums, as you do too.

In general, I find the ongoing public scare about sharing data to be antithetical to the original spirit of the Net, which was all about sharing data. Originally, we were delighted to connect with perfect strangers on the other side of the world, whom we would never have gotten to communicate with otherwise. I accept there might have been an element of self-selection there that aided that view: the people one communicated with, although maybe from a different culture, would be from a similar niche sub-culture of people messing with computers, looking forward to communication and having a favourable view of it.

replies(1): >>45073180 #
17. ljosifov ◴[] No.45072902{3}[source]
That's a consideration, for sure. But given that LLMs have no ground truth - everything is controlled hallucination - then if the LLM tells you an imperfect version of my email or chat, you can never be sure whether what it told you is true or not. So maybe you don't gain that much extra knowledge about me. For example, you can reasonably guess I'm typing this on the computer, and having coffee too. So if you ask the LLM "tell me a trivial story" and it comes back with "one morning, LJ was typing HN replies on the computer while having his morning coffee" - did you learn that much about me that you didn't know or couldn't guess before?
18. 12ian34 ◴[] No.45073180[source]
> the ongoing public scare about sharing data

I think this might be a bit of a social bubble thing - it isn't a forefront concern for the vast majority of people.

replies(1): >>45073297 #
19. ljosifov ◴[] No.45073297{3}[source]
I think you are correct there - the majority of the public don't care. They just try to get on with their daily business and act the best they can under the circumstances. So we click "Accept" on any popup banner to make it go away, accept "All cookies" 100 times every day, and use Google mail/map/photos/drive, all of which involves giving away data, even if in words we say we don't want to give it. So yes, obviously the public by necessity act in a rational way, doing cost-benefit analysis. Meanwhile, a cadre of privacy obsessives have made my life worse by lobbying and having their bad ideas codified in UK law. I wrote about my experience with the UK medical systems here: https://news.ycombinator.com/item?id=45066321