
Sampling with SQL

(blog.moertel.com)
175 points thunderbong | 1 comments
tmoertel ◴[] No.41899091[source]
Hey! I'm tickled to see this on HN. I'm the author. If you have any questions, just ask. I'll do my best to answer them here.
replies(2): >>41904092 #>>41904525 #
indoordin0saur ◴[] No.41904092[source]
Thank you! The thing I find tricky here is choosing the weight. I think maybe one obvious way you would want to weight samples would be for recency. E.g. if I have a table of user login events then I care about seeing more of the ones that happened recently but still want to see some of the older ones. Would the algorithm still work if I converted a `created_at` timestamp to epoch time and used that? Or would I want to normalize it in some way?
replies(2): >>41904836 #>>41905441 #
1. swasheck ◴[] No.41904836[source]
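Depending on what you’re analyzing, an exponential moving average (EMA) could give you a good weighting method: its weights decay exponentially with age, so recent events count more while older ones still get a nonzero chance.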
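
To make the recency question concrete, here is a rough Python sketch, not anything taken from the article. It assumes weighted sampling without replacement is done by giving each row the key `-ln(1 - u) / weight` and keeping the rows with the smallest keys. The helper names (`recency_weight`, `weighted_sample`), the 30-day half-life, and the synthetic login events are all hypothetical and only there for illustration. It compares raw epoch time as a weight against an EMA-style exponentially decayed weight.

```python
import math
import random

def recency_weight(age_seconds, half_life_seconds):
    """Weight that halves every `half_life_seconds` of age (exponential decay)."""
    return 0.5 ** (age_seconds / half_life_seconds)

def weighted_sample(items, weights, n, rng=random):
    """Weighted sample without replacement: keep the n smallest -ln(1-u)/w keys."""
    keyed = [(-math.log(1.0 - rng.random()) / w, item)
             for item, w in zip(items, weights) if w > 0]
    keyed.sort(key=lambda pair: pair[0])
    return [item for _, item in keyed[:n]]

# Hypothetical login events: one per day over the past 365 days.
now = 1_730_000_000  # arbitrary "current" epoch time, for illustration only
DAY = 86_400
created_at = [now - d * DAY for d in range(365)]

# Raw epoch time as a weight: the oldest and newest events differ by under 2%,
# so the sample would be close to uniform.
raw_ratio = created_at[-1] / created_at[0]

# Decayed weights with an assumed 30-day half-life favor recent events
# while still giving older ones a small chance.
weights = [recency_weight(now - t, 30 * DAY) for t in created_at]
sample = weighted_sample(created_at, weights, 20, random.Random(0))
```

The half-life is the knob to tune: a shorter one concentrates the sample on the newest events, a longer one moves it back toward uniform. In SQL the same weight could presumably be computed from the event's age in an expression and fed to whatever weighted-sampling query is already in use.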