There would be so much to write about what I've seen. I've thought of making a blog post. I use mitmproxy to check on sketchy apps and to learn in general.
The information sent out is fascinating. I knew extensive telemetry is pretty much the norm these days, but it's another thing to see it with your own eyes. My exercise has also made the typical "yes, we collect data/telemetry, but it's anonymized/secured/etc. and deleted after X days so no worries" sound very hollow; even if a company acts in good faith by its own rules, how am I supposed to trust the other 1000 companies who also do data collection? If someone hacked my mitmproxy itself and downloaded all the payloads it collected, they would probably know me better than I do.
Random examples off the top of my head from mitmproxy (when I say "chatty" I mean they talk a lot to a server somewhere):
I had the GitHub Copilot neovim plugin. I didn't realize how chatty it was until I did this (although I wasn't surprised either; obviously completions are sent out to a server, but it also has your usual telemetry + A/B test experiment stuff). I had wanted to ditch that service for a long time, so after seeing this I finally did and replaced it with a local setup, since open stuff has mostly caught up. Also, it's not actually open source, I think? I had no idea (I thought it would just be a simple wrapper to call into some APIs, but: no PRs, no issues, code has blobs of .wasm and .node: https://github.com/github/copilot.vim)
Firefox telemetry, if it's turned on, is concerningly detailed to me. I think I might be completely identifiable from some of the payloads if someone decided to really have a go at analyzing them. Also I find it funny that one of the JSON fields says "telemetry is off". Telemetry is actually on in the menu (I leave it on on purpose to see stuff like this); the JSON just says off for some reason. I'm not sure whether that telemetry is meant to be non-identifiable in the first place, though.
Unity-made software (also mentioned in the article) sends out a Unity payload at start-up that looks similar to the one in the article, although I didn't take a deeper look myself.
The author mentioned the battery: I also noticed that a lot of mobile apps are interested in the battery level. I hadn't connected the dots on why, but the article mentions Uber's 4% battery surcharge, and now it makes a bit more sense.
One app that has been on HN at least once with a high score starts sending out analytics before you've consented to any terms and conditions. One of the fields is your computer hostname (one of my computers had my real name in its hostname... it does not anymore). Usually web pages say "by downloading you accept terms and conditions", but this one only presented that text after you launch the app, before you get to the main portion. I never clicked it (still haven't), but I let the app idle in the background so I could snoop on its behavior.
Video games: The ones I've tried mostly don't do anything too interesting. But I haven't tried any crappy mobile games, for example. One Unity game on the laptop, Bloons TD 6, sends out analytics at every menu click, and a finished game sends a summary. It is the "chattiest" game so far, although it seems limited to what the game actually needs to do (it has an online aspect). The payloads had more detailed info on my game stats though, they should add those to the game UI ;)
Apple updates don't work through mitmproxy (won't trust the certificates). Neither do many mobile apps (none of the banking ones did, now I know what a mitm attack would look like to my bank app).
Some requests have a boatload of HTTP headers. I've thought of writing a mitmproxy module to make a top 10 list. Of what I've seen, I think some Google services might be at the top. (I think Google has also developed new HTTP tech; is that so they can set even more cookies more efficiently? ;)
I think anything Microsoft-tied may be the chattiest programs overall on my laptop. But I haven't done stats or anything like that.
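For the top 10 header list idea above, here is a rough sketch of what such a mitmproxy addon might look like. I haven't run this against real traffic, so treat it as a starting point: the class name HeaderTopList and the file name header_top.py are made up for illustration, and it assumes a recent mitmproxy where addons are plain classes with request() and done() hooks and where messages from the standard logging module show up in the event log. It ranks hosts by the largest number of header fields seen on any single request.

```python
import heapq
import logging


class HeaderTopList:
    """Hypothetical addon: track the most header-heavy request per host."""

    def __init__(self, size=10):
        self.size = size
        # host -> largest header count seen on a single request to that host
        self.max_headers = {}

    def request(self, flow):
        # mitmproxy calls this hook once for each client request.
        host = flow.request.pretty_host
        # .fields holds every (name, value) pair, so a header that is
        # repeated (for example several Cookie lines) counts each time.
        count = len(flow.request.headers.fields)
        if count > self.max_headers.get(host, 0):
            self.max_headers[host] = count

    def top(self):
        # Highest counts first, at most self.size entries.
        return heapq.nlargest(
            self.size, self.max_headers.items(), key=lambda kv: kv[1]
        )

    def done(self):
        # Called when mitmproxy shuts down; print the ranking once.
        for host, count in self.top():
            logging.info("%4d headers  %s", count, host)


addons = [HeaderTopList()]
```

It would be loaded with something like `mitmdump -s header_top.py`. This only counts request headers; counting response headers as well would mean adding a response() hook along the same lines.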
Aside from mitmproxy, I'm learning security/cryptography (managed to find real-world vulnerabilities, although frankly very boring ones so far...), Ghidra, some low-level seccomp() stuff, qemu user emulation, things of that nature to get some skills in this space. Still need to learn: the legal side of things (ToSes like to say 'no reverse engineering'), and how not to get into trouble if you reverse engineer something and someone doesn't like it. I haven't dared to report some things, and I've avoided poking at some APIs or even mentioning them, because I don't know enough yet about how to cover my ass.
Modern computing privacy and security is a mess.
I've worked a good part of my career at a DSP company (it would be in the box that says "Criteo" on it in the author's article). So I have some idea of what data companies in that space have.