←back to thread

Bloom Filters by Example

(llimllib.github.io)
243 points ibobev | 1 comments | | HN request time: 0.617s | source
Show context
verytrivial ◴[] No.44413866[source]
The overlap in concepts required to understand Bloom filters, sets and hash tables is about 95% IMHO. A set is a hash table used for membership tests where you only care about the key, not the value. And a Bloom filter is just a set that exploits the fact that many-to-one hashing 'compresses' the key-space with collisions. It deliberately uses a very collide-y hash function. If a specific key was ever hashed, you WILL get a hit, but there might be other keys that produced the same hash. It's a feature, not a bug.
replies(3): >>44414062 #>>44414176 #>>44416360 #
1. cherrycherry98 ◴[] No.44414176[source]
Glad to know I'm not alone in my mental modeling of Bloom filters as just hash tables that only track the buckets which have data but not the actual data itself.