
429 points pabs3 | 1 comments
lisper ◴[] No.43472957[source]
I've been running my own spam filter for many years now based on this super-simple heuristic: my filter looks at my outgoing mail, and any incoming mail from an address I've sent mail to, or with a subject that has appeared in my outgoing mail (possibly with a "re:" prefix), is marked as non-spam. Everything else goes in spam, and any spam message from an address I've never received mail from before is marked as unread. I get hundreds of spams per day, but only about a dozen from new addresses. It takes me about ten seconds to scan them for non-spam cold calls, which are extremely rare. The other source of false positives is things like subscription confirmations, but because I know to expect those, I look for them right away, when they are still at the top of the spam folder.

I put this initial system in place expecting to have to augment it later with a more traditional content-based filter, but this simple heuristic works so well I've never felt the need to implement that additional step.
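
For anyone who wants to try this, here is a minimal Python sketch of the heuristic as I understand it from the description above. It is not the actual filter: the class name, the `Message` type, and the three labels (`"ham"`, `"spam-new"`, `"spam-seen"`) are made up for illustration, and a real version would have to parse actual mail headers and persist its state.

```python
import re
from dataclasses import dataclass


@dataclass
class Message:
    sender: str
    subject: str


# Strips one or more leading "re:" prefixes, case-insensitively.
_REPLY_PREFIX = re.compile(r"^\s*(re:\s*)+", re.IGNORECASE)


def normalize_subject(subject: str) -> str:
    """Lower-case the subject and drop any leading "re:" prefixes."""
    return _REPLY_PREFIX.sub("", subject).strip().lower()


class OutgoingAllowlistFilter:
    """Hypothetical sketch: allow mail that matches something I sent."""

    def __init__(self) -> None:
        self.sent_to: set[str] = set()        # addresses I have mailed
        self.sent_subjects: set[str] = set()  # subjects I have used
        self.seen_senders: set[str] = set()   # addresses I have received from

    def record_outgoing(self, recipient: str, subject: str) -> None:
        self.sent_to.add(recipient.lower())
        normalized = normalize_subject(subject)
        if normalized:  # an empty subject would match far too much
            self.sent_subjects.add(normalized)

    def classify(self, msg: Message) -> str:
        sender = msg.sender.lower()
        first_time = sender not in self.seen_senders
        self.seen_senders.add(sender)

        if sender in self.sent_to:
            return "ham"
        if normalize_subject(msg.subject) in self.sent_subjects:
            return "ham"
        # Everything else is spam; new senders are flagged (left unread)
        # so they can be skimmed for legitimate cold calls.
        return "spam-new" if first_time else "spam-seen"
```

Usage would look something like calling `record_outgoing()` for every sent message and `classify()` for every received one. Note there is no content analysis anywhere, which is the whole point.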

replies(3): >>43473299 #>>43473466 #>>43473483 #
EGreg ◴[] No.43473299[source]
Someone posted advice on X that really helped me clean up my inbox:

Add a filter looking for the word "Unsubscribe" and automatically put matching messages in a "Promotional" category or something similar. Also apply the filter to existing emails, and let it run for a minute.

Try it now! And comment if it reduced your inbox to like 2% of what it was :)

replies(2): >>43473401 #>>43473518 #
lisper ◴[] No.43473518[source]
I tried that a long time ago. The problem was that it produced a lot of false positives for me, because I subscribe to a lot of Google Groups.
replies(1): >>43474782 #
EGreg ◴[] No.43474782[source]
Can you also add a negative condition, i.e. X but not Y?
replies(1): >>43475420 #
1. lisper ◴[] No.43475420[source]
Of course. But the problem is that the more complicated you make your filtering logic, the harder it becomes to maintain. I was constantly discovering new exceptions to my ever-more-complicated rules, which is why I eventually gave up on that whole approach.
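
To make that concrete, here is a rough Python sketch of what an "X but not Y" rule looks like. It is only an illustration: the function name and the `exempt_list_suffixes` parameter are hypothetical, and it assumes that mailing-list traffic can be recognized by a List-ID header value ending in `googlegroups.com`, which I have not verified for every case.

```python
def is_promotional(
    body: str,
    list_id: str = "",
    exempt_list_suffixes: tuple[str, ...] = ("googlegroups.com",),
) -> bool:
    """Return True if the message matches "Unsubscribe" (X) but does not
    come from an exempted mailing list (not Y).

    list_id is the raw List-ID header value, or "" if the header is absent.
    """
    has_unsubscribe = "unsubscribe" in body.lower()

    # Assumed header shape: "<group-name.googlegroups.com>"
    cleaned = list_id.strip().strip("<>").lower()
    is_exempt = any(cleaned.endswith(suffix) for suffix in exempt_list_suffixes)

    return has_unsubscribe and not is_exempt
```

Every newly discovered exception means another entry in `exempt_list_suffixes` or another condition in the function, and that is exactly where the maintenance burden comes from.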