←back to thread

104 points thunderbong | 9 comments | | HN request time: 0.416s | source | bottom
1. palsecam ◴[] No.42171316[source]
See also: “Meow hash” by the legendary Casey Muratori, a fast non-cryptographic hash function: https://github.com/cmuratori/meow_hash

> It’s the fastest hash function we know of, and we have benchmarked all the ones we could find. On modern Intel x64 CPUs, it hashes 16 bytes per cycle single-threaded. This means in cache it can hash at a rate of 64 gigabytes per second on a 4.2gHz machine. Out of cache, it hashes at whatever speed your main memory bus can provide to a single core, since that is usually the limiting factor on modern x64 CPUs.

> It has also now been tuned to be the fastest hash on small inputs, too. Despite the fact that it is a full 128-bit hash, it still outperforms “fast” 64-bit hashes across all input sizes.

https://archive.is/CQOVm (originally https://mollyrocket.com/meowhash)

Discussion on HN: https://news.ycombinator.com/item?id=29038813 & https://news.ycombinator.com/item?id=18262627

replies(3): >>42171727 #>>42172847 #>>42173408 #
2. Sesse__ ◴[] No.42171727[source]
Not sure why people keep claiming it's the “fastest” when it's far down the list on the SMHasher benchmark list:

https://rurban.github.io/smhasher/doc/table.html

In particular, rapidhash is three times as fast for small inputs and is portable (unlike meow_hash). Plus meow fails one of the SMHasher quality tests.

replies(2): >>42171802 #>>42183422 #
3. palsecam ◴[] No.42171802[source]
Right, Meow is not the best when it comes to small inputs.

It shines for large inputs (it’s far up the SMHasher list you linked in this case).

It offers a good compromise; it’s not bad (good, even) for both large and small inputs.

Granted, rapidhash is quite good on both fronts, too.

replies(1): >>42172274 #
4. Sesse__ ◴[] No.42172274{3}[source]
Well, it's not even the best AES-NI hash on the list (for large nor small inputs), and it's not passing the quality tests, so why bother?
5. zamalek ◴[] No.42172847[source]
From the archive link (blog post published 2018):

> To our surprise, we found a lack of published, well-optimized, large-data hash functions.

Murmur was released in 2008. Murmur3 in 2011. Release dates on higher quality functions are not easy to find, but I am sure that there were more around.

This type of thing is why I take Casey's claims with a huge dose of salt.

replies(1): >>42183415 #
6. ur-whale ◴[] No.42173408[source]
> See also: “Meow hash”

Meow hash may very well be fast on large data, but it completely fails the "small code" criteria, which is one of the clearly stated design goal of Chibi.

Generally speaking, the "code as small as possible" is IMO a design constraint that is very often under-estimated or even ignored for these type of algorithms, including the crypto hard ones.

I personally find the "code fits on half a page" metric very important: beyond satisfying some very personal sense of aesthetics, it buys you quite a lot of nice properties:

   - fits in head
   - inlines like a dream
   - easy to integrate
   - easy to do header only
   - easy to security audit
   - pretty hard to introduce bugs
   - nothing up my sleeve numbers pretty much obvious
For these reasons, I used to love the TEA (tiny encryption algorithm) [1] before cryptanalysis unfortunately showed it to be weak. Its successors (XXTEA, etc...) are already almost too big for my taste.

The speck cipher [2] is quite nice in that regard, in spite of its rather untrustworthy provenance.

[1] https://en.wikipedia.org/wiki/Tiny_Encryption_Algorithm

[2] https://en.wikipedia.org/wiki/Speck_(cipher)

7. Sesse__ ◴[] No.42183415[source]
> This type of thing is why I take Casey's claims with a huge dose of salt.

My impression in general is that Casey is a fairly decent engineer who has somehow been elevated into godhood by a subset of users.

8. nwellnhof ◴[] No.42183422[source]
I wouldn't consider a hash function that relies on 64-bit multiplication truly portable. Its performance will suffer considerably on a 32-bit system. Note that this applies to both Rapidhash and ChibiHash.
replies(1): >>42184161 #
9. Sesse__ ◴[] No.42184161{3}[source]
Well, even without taking into account that 32-bit systems are increasingly becoming extinct, it's less catastrophic than one might think. You get four muls instead of one and then a little carry propagation; I guess less if you have 32x32 -> 64 muls available? (I haven't thought deeply about it.) Certainly anything that relies on AES-NI would be much worse. But I guess your best shot in Murmur3, then.