Sampling with SQL

(blog.moertel.com)
175 points thunderbong | 1 comment
1. apwheele No.41902802
Very nice. Another pro tip for folks: you can set the weights to get approximate stratified sampling. Say group A has 100,000 rows and group B has 10,000 rows, and you want each group to make up approximately the same proportion of the resulting sample. You would set the weight for each A row to 1/100,000 and for each B row to 1/10,000, so that each group's weights sum to 1.
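
To make the weighting concrete, here is a minimal Python sketch of the idea rather than SQL. It assumes the linked post's approach of sorting rows by an exponential key, -ln(u)/weight, and keeping the smallest k; the function name `weighted_sample`, the sample size of 1,000, and the fixed seed are my own illustrative choices, not anything from the post.

```python
import math
import random
from collections import Counter

def weighted_sample(rows, weight_of, k, rng):
    """Weighted sample without replacement: keep the k smallest exponential keys."""
    # 1.0 - rng.random() lies in (0, 1], so the logarithm is always defined.
    keyed = [(-math.log(1.0 - rng.random()) / weight_of(row), row) for row in rows]
    keyed.sort(key=lambda pair: pair[0])
    return [row for _, row in keyed[:k]]

rng = random.Random(0)  # fixed seed, only so the sketch is repeatable

# Hypothetical data: group A has 100,000 rows, group B has 10,000 rows.
rows = [("A", i) for i in range(100_000)] + [("B", i) for i in range(10_000)]
group_size = Counter(group for group, _ in rows)

# Each row's weight is 1 / (size of its group), so every group's weights sum to 1.
sample = weighted_sample(rows, lambda row: 1.0 / group_size[row[0]], 1_000, rng)
sample_counts = Counter(group for group, _ in sample)
print(sample_counts)
```

The two counts should come out roughly equal, but only roughly: the split varies from run to run, which is why this is approximate stratification.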

If you want exact counts per group, I think you would need to rank the rows within each group, using RANK with PARTITION BY, and keep the top N from each.
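
A rough Python sketch of that exact-count variant, under my own assumptions: give every row a random key, order rows by that key within each group (the role PARTITION BY plus a ranking function would play in SQL), and keep the first `per_group` rows of each group. The name `stratified_sample` and the count of 500 per group are hypothetical.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(rows, group_of, per_group, rng):
    """Take exactly per_group rows from each group, chosen uniformly at random."""
    by_group = defaultdict(list)
    for row in rows:
        # Attach a random key to each row, partitioned by its group.
        by_group[group_of(row)].append((rng.random(), row))
    picked = []
    for keyed in by_group.values():
        keyed.sort(key=lambda pair: pair[0])  # rank rows inside the partition
        picked.extend(row for _, row in keyed[:per_group])  # keep rank <= per_group
    return picked

rng = random.Random(0)  # fixed seed, only so the sketch is repeatable

# Hypothetical data: group A has 100,000 rows, group B has 10,000 rows.
rows = [("A", i) for i in range(100_000)] + [("B", i) for i in range(10_000)]

exact_sample = stratified_sample(rows, lambda row: row[0], 500, rng)
exact_counts = Counter(group for group, _ in exact_sample)
print(exact_counts)
```

Unlike the weighted version, the per-group counts here are fixed by construction. A group with fewer than `per_group` rows would simply contribute all of its rows.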