I used o3 to profile myself from my saved Pocket links

1. saeedesmaili ◴[07 Jul 25 14:09 UTC] No.44490550[source]▶

After reading this I realized I also have an archive of my pocket account (4200 items), so tried the same prompt with o3, gemini 2.5 pro, and opus 4:

- chatgpt UI didn't allow me to submit the input, saying it's too large, even though it was around 80k tokens, less than o3's 200k context size.

- gemini 2.5 pro: worked fine for the personality and interest related parts of the profile, but it got the age range, job role, location, and parental status wrong.

- opus 4: nailed it and did a more impressive job, accurately predicted my base city (amsterdam), age range, relationship status, but didn't include anything about if I'm a parent or not.

Both gemini and opus failed in predicting my role, probably understandably. Although I'm a data scientist, I read a lot about software engineering practices because I like writing software and since I don't have the opportunity at work to do this kind of work, I code for personal projects, so I need to learn a lot about system design, etc. Both models thought I'm a software engineer.

Overall it was a nice experiment. Something I noticed is both models mentioned photography as my main hobby, but if they had access to my youtube watch history, they'd confidently say it's tennis. For topics and interests that we usually watch videos about rather than read articles about, it would be interesting to combine the youtube watch history with this pocket archive data (although it would be challenging to get that data).

replies(9): >>44490818 #>>44490825 #>>44491013 #>>44491019 #>>44492764 #>>44493027 #>>44495207 #>>44499820 #>>44501925 #

2. greenavocado ◴[07 Jul 25 14:36 UTC] No.44490818[source]▶

>>44490550 (TP) #

You need to use an iterative refinement pyramid of prompts. Use a cheap model to condense the majority of the raw data in chunks, then increasingly stronger and more expensive models over increasingly larger sets of those chunks until you are able to reach the level of summarization you desire.
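A minimal sketch of that pyramid, assuming each model is wrapped in a plain text-in, text-out function. The names `pyramid_summarize`, `tiers`, and `fan_in` are hypothetical, and no real model API is called here; you would plug in your own client calls as the summarizers.

```python
from typing import Callable, List, Sequence

# A summarizer is any function that takes text and returns shorter text.
# In practice each one would wrap a call to a different model (assumption).
Summarizer = Callable[[str], str]


def chunk(items: Sequence[str], size: int) -> List[List[str]]:
    """Split items into consecutive groups of at most `size` elements."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def pyramid_summarize(
    items: Sequence[str],
    tiers: Sequence[Summarizer],
    fan_in: int = 50,
) -> str:
    """Condense items level by level, using one summarizer per level.

    tiers[0] should be the cheapest model, since it sees all the raw data.
    Later tiers merge groups of earlier summaries, so each call covers a
    larger share of the original input. The last tier is reused if more
    levels are needed to get down to a single summary.
    """
    if not items:
        return ""
    if not tiers:
        raise ValueError("need at least one summarizer")
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2 so every level shrinks")

    level = list(items)
    depth = 0
    while True:
        summarize = tiers[min(depth, len(tiers) - 1)]
        level = [summarize("\n".join(group)) for group in chunk(level, fan_in)]
        depth += 1
        if len(level) == 1:
            return level[0]
```

With 4200 saved links and `fan_in=50`, the cheap tier would make 84 calls, the next tier 2, and the final tier 1. Whether the cheap tier keeps enough detail for the later ones is something you would have to check on your own data.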

3. tgtweak ◴[07 Jul 25 14:37 UTC] No.44490825[source]▶

>>44490550 (TP) #

I think a reasoning/thinking-heavy model would do better at piecing together the various data points than an agentic model. Would be interested to see how o3 does with the context summarized.

replies(1): >>44493222 #

4. tehlike ◴[07 Jul 25 14:52 UTC] No.44491013[source]▶

>>44490550 (TP) #

You should take this as a sign, and shoot for SWE jobs - given your interest.

What you do at work today doesn't mean you can't switch to a related ladder.

replies(2): >>44491635 #>>44493295 #

5. juliendorra ◴[07 Jul 25 14:53 UTC] No.44491019[source]▶

>>44490550 (TP) #

You should be able to use Google Takeout to get all of your YouTube data, including your watch history.

This article is a nice example of someone using it:

> When I downloaded all my YouTube data, I’ve noticed an interesting file included. That file was named watch-history and it contained a list of all the videos I’ve ever watched.

https://blog.viktomas.com/posts/youtube-usage/

Of course, for Europeans, companies are legally obliged to give you access, but I think Google Takeout works worldwide?

replies(3): >>44491293 #>>44498307 #>>44499423 #

6. jazzyjackson ◴[07 Jul 25 15:19 UTC] No.44491293[source]▶

>>44491019 #

Yes, I've done this in the USA. Pretty neat. I have it on my todo list to parse it and find all the music videos I've watched 3 or more times so I can archive them.
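A rough sketch of that "watched 3 or more times" pass. It assumes the history has already been loaded into a list of dicts with `titleUrl` and `title` keys; those key names are an assumption about the export, so adjust them to whatever your file actually contains. It counts repeats only and makes no attempt to tell music videos from anything else.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def repeat_watches(
    records: Iterable[Dict[str, str]], min_count: int = 3
) -> List[Tuple[str, str, int]]:
    """Return (url, title, count) for videos watched at least min_count times.

    Results are ordered from most to least watched.
    """
    counts: Counter = Counter()
    titles: Dict[str, str] = {}
    for record in records:
        url = record.get("titleUrl")  # assumed key name
        if not url:
            # Skip entries without a link (e.g. videos that no longer exist).
            continue
        counts[url] += 1
        # Remember the first title seen for this URL.
        titles.setdefault(url, record.get("title", url))
    return [
        (url, titles[url], n)
        for url, n in counts.most_common()
        if n >= min_count
    ]
```

Writing the resulting URLs to a text file, one per line, gives a list in the format yt-dlp's `--batch-file` option reads.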

replies(1): >>44491975 #

7. justusthane ◴[07 Jul 25 15:53 UTC] No.44491635[source]▶

>>44491013 #

Sometimes it’s nice for hobbies to remain hobbies

replies(4): >>44493877 #>>44495336 #>>44495561 #>>44497804 #

8. toomuchtodo ◴[07 Jul 25 16:27 UTC] No.44491975{3}[source]▶

>>44491293 #

https://archive.zhimingwang.org/blog/2014-11-05-list-youtube... might be of use along with https://github.com/yt-dlp/yt-dlp, might just grab it all and prune later due to rot and availability issues over time within YT.

9. LoganDark ◴[07 Jul 25 17:38 UTC] No.44492764[source]▶

>>44490550 (TP) #

> Both models thought I'm a software engineer.

You probably still are, even if that's not your career path :)

10. larve ◴[07 Jul 25 18:01 UTC] No.44493027[source]▶

>>44490550 (TP) #

re o3: you can zip the file, upload it, and it will use python and grep and the shell to inspect it. I have yet to try using it with a sqlite db, but that's how i do things locally with agents.

replies(1): >>44493330 #

11. saeedesmaili ◴[07 Jul 25 18:21 UTC] No.44493222[source]▶

>>44490825 #

Agreed, that's why I used reasoning models (gemini 2.5 pro and opus 4 with extended thinking enabled).

12. smt88 ◴[07 Jul 25 18:29 UTC] No.44493295[source]▶

>>44491013 #

I love reading about cooking but I'd hate to become a cook

13. saeedesmaili ◴[07 Jul 25 18:33 UTC] No.44493330[source]▶

>>44493027 #

The author mentions that they didn't get a high-quality response by doing that. Adding the text to the model's context makes all the information available for it to use.

14. formerphotoj ◴[07 Jul 25 19:36 UTC] No.44493877{3}[source]▶

>>44491635 #

Exactly this. The need to make money from a thing may well eliminate the value one derives from the thing, and even add negatives such as stress, etc.

replies(1): >>44512461 #

15. datpuz ◴[07 Jul 25 22:20 UTC] No.44495207[source]▶

>>44490550 (TP) #

Reading 80k tokens requires more than 80k tokens due to overhead
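As back-of-the-envelope arithmetic, the overhead can be sketched like this. The default numbers for `prompt_overhead` and `reserved_output` are made-up placeholders, not limits of any real product, and the 4-characters-per-token figure is only a common rule of thumb for English text, not a tokenizer.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (heuristic, not a tokenizer)."""
    return math.ceil(len(text) / chars_per_token)


def total_request_tokens(
    input_tokens: int,
    prompt_overhead: int = 2_000,    # instructions, system prompt (placeholder)
    reserved_output: int = 20_000,   # room for reasoning and the answer (placeholder)
) -> int:
    """Tokens a request needs once overhead is added on top of the raw input."""
    return input_tokens + prompt_overhead + reserved_output


def fits_in_context(input_tokens: int, context_window: int, **overhead: int) -> bool:
    """True if the input plus overhead fits inside the given context window."""
    return total_request_tokens(input_tokens, **overhead) <= context_window
```

Under these placeholder numbers an 80k-token input needs 102k tokens, which fits a 200k window but not a 100k one. The limit a web UI actually enforces may be different from the model's nominal context size.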

16. cortesoft ◴[07 Jul 25 22:43 UTC] No.44495336{3}[source]▶

>>44491635 #

I believed this, which is what made me avoid computer science in college; I wanted to avoid ruining my favorite hobby.

After a few years post graduation, where I wasn't sure what I wanted to do and I floundered to find a career, I decided to give software development a try, and risk ruining my favorite hobby.

Definitely the best decision I could have made. Now people pay me a lot of money to do the thing I love to do the most... what's not to love? 20 years later, it is still my favorite hobby, and they keep paying me to do it.

replies(4): >>44495559 #>>44495618 #>>44495730 #>>44512595 #

17. p1necone ◴[07 Jul 25 23:21 UTC] No.44495559{4}[source]▶

>>44495336 #

I think it heavily depends on who you're working for.

If they get out of the way and let you do the thing you love how you want to do it you'll get good results for you and them.

If they treat you like a cog in a machine and assume they need to carrot and stick you into doing things because you might not really want to be there, you'll be miserable.

replies(1): >>44503509 #

18. sea-gold ◴[07 Jul 25 23:22 UTC] No.44495561{3}[source]▶

>>44491635 #

https://english.stackexchange.com/questions/25225/ways-to-ru...

19. justusthane ◴[07 Jul 25 23:31 UTC] No.44495618{4}[source]▶

>>44495336 #

Sure, of course. Sometimes it works out to follow your passion into a career. I was objecting to the apparent premise that that’s _always_ what you should do.

20. 8n4vidtmkvmk ◴[07 Jul 25 23:53 UTC] No.44495730{4}[source]▶

>>44495336 #

My first software job I enjoyed. My 2nd/current job I enjoy everything except the actual work. Too much bureaucracy, but it hasn't ruined my love for the craft yet. Oh well, I'm building some other skills I didn't know I had in me.

21. abrookewood ◴[08 Jul 25 07:12 UTC] No.44497804{3}[source]▶

>>44491635 #

100%. I am absolutely certain that I do not have a viable career as a professional surfer ... no matter how much I wish it wasn't true.

replies(1): >>44512579 #

22. viraptor ◴[08 Jul 25 08:59 UTC] No.44498307[source]▶

>>44491019 #

It is available and it can be surprisingly large. I've somehow accumulated multiple GB of data from YT alone. Which feels a bit absurd - there's bound to be lots of waste there.

23. yubblegum ◴[08 Jul 25 12:43 UTC] No.44499423[source]▶

>>44491019 #

This can give a false sense of what Google (Alphabet) actually knows about you. That above is Google playing the game of 'ok, here is what we know of your activities on youtube when logged in!'

But Google and the rest of the "advertising" (euphemism for surveillance) industry track and create "profiles" based on a basket of data points, from ip/MAC address to the rest of their bag of tricks.

replies(1): >>44500827 #

24. alexnorton ◴[08 Jul 25 13:33 UTC] No.44499820[source]▶

>>44490550 (TP) #

I was able to give this a try on every YouTube video I've ever watched by exporting the history from Google Takeout:

https://takeout.google.com/settings/takeout/custom/youtube?p...

And then a combination of pup and jq to parse the video titles from the HTML file:

  cat watch-history.html \
    | pup '.outer-cell .mdl-grid .content-cell:nth-child(2) json{}' \
    | jq -r '.[] .children[0] | select(.tag != "br") | select(.text | startswith("https://www.youtube.com/watch?v=") | not) | .text' \
    > videos.txt
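If pup and jq aren't installed, a standard-library Python sketch can get at roughly the same titles by a different route. Instead of the CSS classes used above, it keys on links whose `href` starts with the watch URL prefix, and it skips entries whose link text is just the URL, as the jq filter does. The class and function names are hypothetical, and it has only been reasoned through against the structure implied by the pipeline above, so treat it as a starting point.

```python
from html.parser import HTMLParser
from typing import List, Optional

WATCH_PREFIX = "https://www.youtube.com/watch?v="


class WatchTitleParser(HTMLParser):
    """Collect the link text of every <a> that points at a watch URL."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.titles: List[str] = []
        # Holds text fragments while inside a matching <a>; None otherwise.
        self._buffer: Optional[List[str]] = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.startswith(WATCH_PREFIX):
                self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._buffer is not None:
            title = "".join(self._buffer).strip()
            # Entries whose text is just the URL carry no title; skip them.
            if title and not title.startswith(WATCH_PREFIX):
                self.titles.append(title)
            self._buffer = None


def extract_titles(html: str) -> List[str]:
    """Return video titles in document order."""
    parser = WatchTitleParser()
    parser.feed(html)
    parser.close()
    return parser.titles
```

Usage would be along the lines of `extract_titles(open("watch-history.html", encoding="utf-8").read())`, then writing the list to `videos.txt` one title per line.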

25. dietr1ch ◴[08 Jul 25 15:17 UTC] No.44500827{3}[source]▶

>>44499423 #

Internally at Google, a toy tool to peek into your own personal advertisement profile was released and taken down within a week or two because it was creepily knowledgeable about you.

replies(1): >>44503885 #

26. UrineSqueegee ◴[08 Jul 25 17:07 UTC] No.44501925[source]▶

>>44490550 (TP) #

o3 on the webui has a tiny context as do all the models

27. cortesoft ◴[08 Jul 25 20:00 UTC] No.44503509{5}[source]▶

>>44495559 #

I have worked at a few places, in many different positions, over an 18-year career so far.

I have enjoyed the programming part of all the jobs. I don’t really care what the problem is, I just like using computers to solve problems.

28. ariwilson ◴[08 Jul 25 20:40 UTC] No.44503885{4}[source]▶

>>44500827 #

when?

replies(1): >>44510840 #

29. dietr1ch ◴[09 Jul 25 15:00 UTC] No.44510840{5}[source]▶

>>44503885 #

Probably sometime around 2018 or 2019, I don't recall, but it was before the covid lockdown

30. tehlike ◴[09 Jul 25 17:08 UTC] No.44512461{4}[source]▶

>>44493877 #

Not really. I do software both as a hobby, and as a career.

31. tehlike ◴[09 Jul 25 17:17 UTC] No.44512579{4}[source]▶

>>44497804 #

Eh. Software engineers are in demand, and surfers decidedly are not.

32. tehlike ◴[09 Jul 25 17:18 UTC] No.44512595{4}[source]▶

>>44495336 #

It was my hobby. Then I did computer science, and now I'm at a faang, make more money in a year than my parents in their lifetime probably.