←back to thread

1764 points fatihky | 1 comments | | HN request time: 0.341s | source
Show context
buttershakes ◴[] No.12701447[source]
I had almost exactly this same engineering test when google interviewed me in 2006. It was terrible, and left a bad taste in my mouth. Given the complexity of the work I was doing at the time, the entire thing seemed ridiculous.
replies(2): >>12701579 #>>12701983 #
TheCapn ◴[] No.12701579[source]
The inode question gave me flashbacks to my interview with Amazon. They wanted me to explain what a hash function is. I kept giving answers for about 3 minutes explaining hashing, common algorithms, reasons to use it and places it applies.

Recruiter: "I was looking for you to say it's a fingerprint"

So I guess I was wrong, because despite explaining them in decent detail, I didn't use the one keyword.

replies(6): >>12701662 #>>12701711 #>>12701878 #>>12701993 #>>12702618 #>>12702696 #
dekhn ◴[] No.12701662[source]
only some hash functions are fingerprints. Other hash functions are the exact opposite of a fingerprint.
replies(1): >>12702099 #
lucb1e ◴[] No.12702099[source]
I know some hash functions are not meant to create unique fingerprints (they're to pick a bucket to put/look in), but what do you mean by the opposite of a fingerprint?
replies(1): >>12702209 #
dekhn ◴[] No.12702209[source]
The goal of a fingerprint hash is to convert an input space of "large" values to an output space where the output space values are much shorter- typically fixed size and two similar input values have effectively random outputs (without spending the CPU cycles to implement a cryptographic hash). This permits a wide range of optimizations (a document can be fingerprinted, and looked up by its fingerprint, to see if it's a cache, has associated data, etc).

A non-fingerprinting hash function - well, one example of one- is something that for similar inputs, produces the same output (similarity being defined by some distance metric). See, for example: https://en.wikipedia.org/wiki/Locality-sensitive_hashing and https://en.wikipedia.org/wiki/MinHash

Confusingingly, many functions that are used for similarity detection are called Fingerprints, (https://en.wikipedia.org/wiki/Acoustic_fingerprint), but I consider that a distinct use of the term.

replies(2): >>12703128 #>>12703627 #
lucb1e ◴[] No.12703128[source]
Thanks for the elaborate answer! I knew about things like acoustic fingerprints but not that people use the term "hash function" to describe something that indicates similarity.

Could something like Hamming distance be called a hash function too? It's not mentioned on its Wikipedia page.

replies(2): >>12703301 #>>12703504 #
1. dekhn ◴[] No.12703504[source]
it's worth reading and understanding the ideas behind LSH and hamming distance. there are some... fundamental mathematical relationshps there that are still being ... hashed out.