I find it odd that they keep insisting on this to the point that it's the very first example. I'm willing to bet 90% of users don't use genmoji and the 10% who have used it on occasion mostly do it for the lulz at how bizarre the whole thing is.
It seems to me that they don't really have a vision for Apple Intelligence, or at least not a compelling one.
I especially don't want it natively on my phone or MacBook unless it's opt-in. The opt-out approach is so frustrating.
The way I read this, there's no discovery mechanism here, so Apple has to guess a priori which prompts will be popular. How do they know what queries to send?
tl;dr: Privacy protections seem personal, but not collective:
- For short genmoji prompts, respond with false positives so large numbers are required (rough sketch after this list)
- For longer writing, generate texts and match their embedding signatures with opted-in samples
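The first bullet is essentially randomized response, the classic local-differential-privacy trick. A minimal sketch of the idea, with made-up parameters (this is not Apple's documented protocol):

```python
import math
import random

# Minimal sketch of randomized response: each device tells the truth only
# with a calibrated probability, so any single answer is plausibly deniable
# and only aggregates over many users are meaningful.
def noisy_report(used_prompt: bool, epsilon: float = 1.0) -> bool:
    p_truth = (math.exp(epsilon) - 1) / (math.exp(epsilon) + 1)
    if random.random() < p_truth:
        return used_prompt          # answer honestly
    return random.random() < 0.5    # otherwise answer with a coin flip

def estimate_true_rate(reports: list[bool], epsilon: float = 1.0) -> float:
    # De-bias the aggregate; only meaningful over a large population.
    p_truth = (math.exp(epsilon) - 1) / (math.exp(epsilon) + 1)
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) / 2) / p_truth
```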
In other words, personal privacy is preserved, but one could likely still distinguish populations, if not industries and use cases: social media users vs. students vs. marketers, conservatives vs. progressives, etc. These categories have meaning because they carry useful associations: marketers are more likely to do x, conservatives y, etc. And that information is very valuable, unless it's widely known.
No one likes being personally targeted: it's weird to get ads for something you just searched for. But it might also be problematic for society to have groups characterized, particularly to the extent that the facts are non-obvious (e.g., if marketers decide within a minute vs. developers taking days). To the extent the information is valuable, it's more valuable if it's private and limited (i.e., it preserves the information asymmetry), which means the collectors of that information have an incentive to keep it private.
So even if Apple broadly has the best of intentions, this data collection creates a moral hazard: a valuable resource that enterprising people can tap. It adds nothing to Apple's bottom line, but it could be someone's life's work and salary.
Could it be mitigated by a commitment to publish all their conclusions? (hmm: but the analyses are often borderline insignificant) Not clear.
Bottom line for me: I'm now less worried about losing personal privacy than about technologies for characterizing and manipulating groups of consumers or voters. But it's impossible for Apple to do quality assessment at scale -- and thus to maintain their product excellence -- without doing exactly that: characterizing users in aggregate.
Oy!
If they hadn’t saddled themselves with the privacy promises, or if OpenAI were willing to uphold those same promises, then I bet Siri would’ve been wholly replaced by ChatGPT by now.
And they have a dedicated app for participating in clinical studies: https://www.apple.com/ios/research-app/
It is opt-in, but opting in is just a single checkbox:
https://user-images.githubusercontent.com/3705482/142927547-...
https://research.google/blog/improving-gboard-language-model...
Later in the article, for a different (but similar) feature:
> To curate a representative set of synthetic emails, we start by creating a large set of synthetic messages on a variety of topics... We then derive a representation, called an embedding, of each synthetic message that captures some of the key dimensions of the message like language, topic, and length. These embeddings are then sent to a small number of user devices that have opted in to Device Analytics.
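Mechanically, I read that as something like the following on-device step (a rough sketch; the similarity metric, voting, and noise step are my guesses, not Apple's spec):

```python
import numpy as np

# Rough sketch of on-device matching against the synthetic-message embeddings
# Apple sends down; only noised, aggregated votes would ever leave the device.
def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def vote_for_closest_synthetic(user_embeddings: list[np.ndarray],
                               synthetic_embeddings: list[np.ndarray]) -> list[int]:
    # For each of the user's real messages (embedded locally), find the most
    # similar synthetic message and count a vote for it.
    votes = [0] * len(synthetic_embeddings)
    for u in user_embeddings:
        best = max(range(len(synthetic_embeddings)),
                   key=lambda i: cosine(u, synthetic_embeddings[i]))
        votes[best] += 1
    return votes  # differential-privacy noise would be added before upload
```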
It's crazy to think Apple is constantly asking my iPhone if I ever write emails similar to emails about tennis lessons (their example). This feels like the least efficient way to understand users in this context. Especially considering they host an email server!
No need to be so dismissive. Anyway, I do agree those three examples you provided are good ones, and they have made a big difference in healthcare.
I'm still unclear on how you create that initial set of class labels used to generate the random seed texts, and how sensitive the method is to that initial corpus.
This is pretty important, because these systems aren't so robust that you can just assume everything is working without review. (See, for example, this paper [3].) Apple should at least document what kinds of data are being collected, and precisely how the collection process works.
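To make the question concrete, I'd guess the seed step looks something like this (purely illustrative; the labels and prompt wording are my invention, not anything Apple has published), which is why sensitivity to that initial list matters:

```python
# Hypothetical "class labels -> seed texts" step; any bias in this label
# list propagates directly into the synthetic corpus that gets embedded
# and sent to opted-in devices.
TOPIC_LABELS = ["tennis lessons", "travel itinerary", "invoice reminder"]

def seed_prompts(labels: list[str], per_label: int = 100) -> list[str]:
    return [
        f"Write a short email about {label} (variation {i})."
        for label in labels
        for i in range(per_label)
    ]
```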
[1] https://static.googleusercontent.com/media/research.google.c...
[2] https://www.apple.com/privacy/docs/Differential_Privacy_Over...
[3] https://arxiv.org/pdf/1709.02753
Edit: I guess I'm wrong, apologies.