I spent a weekend building something similar for my Gmail account (which Google keeps nagging me about being 90% full). It's fascinating to be able to classify 65k+ emails (surprise: more than half are garbage), as well as summarize and trace the nature of communication between specific senders/recipients. It took about 50 hours on a dual RTX 3090 setup running Qwen 3.
My original goal was to prune the account by deleting all the useless stuff and keeping just the unique, personal, valuable communications. But the other day I had an insight that convinced me the safer / smarter thing to do in the current landscape is the opposite: remove any personal, valuable, memorable items, and leave Google (and whoever else is scraping these repositories) with the useless flotsam of newsletters, updates, subscription receipts, etc.
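To make the flip concrete, here is a rough sketch of the selection step once every message has a label. The label names and the `select_for_deletion` helper are made up for illustration; they are not the actual categories or code from my run, and the classifier itself is assumed to have already produced the `(message_id, label)` pairs.

```python
# Hypothetical garbage categories; the real label set is an assumption here.
GARBAGE = {"newsletter", "update", "receipt"}


def select_for_deletion(labeled, invert=False):
    """Return the message ids that should be removed from the account.

    labeled: iterable of (message_id, label) pairs produced by the classifier.
    invert=False: original plan, delete the garbage and keep personal mail.
    invert=True: revised plan, delete personal mail and leave the garbage.
    """
    # A message is deleted when its "is garbage" status differs from `invert`.
    return [mid for mid, label in labeled if (label in GARBAGE) != invert]


labeled = [("a1", "newsletter"), ("a2", "personal"), ("a3", "receipt")]
original_plan = select_for_deletion(labeled)               # garbage goes
revised_plan = select_for_deletion(labeled, invert=True)   # personal mail goes
```

The only difference between the two plans is a single boolean, which is part of why changing my mind was cheap: the expensive part (labeling everything) is the same either way.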
replies(2):