←back to thread

688 points samwho | 1 comments | | HN request time: 0.674s | source
Show context
ryeats ◴[] No.45019141[source]
O(1) in many cases involves a hashing function which is a non-trivial but constant cost. For smaller values of N it can be outperformed in terms of wall clock time by n^2 worst case algorithms.
replies(2): >>45019209 #>>45025666 #
1. namibj ◴[] No.45025666[source]
Arguably this is just a fairly poor way of thinking for quite many practical applications. Notably, memory access latency(/inverse throughput) is roughly between O(log(n)) [ https://news.ycombinator.com/item?id=12385472 ] and O(n^(1/2)) [ https://news.ycombinator.com/item?id=12383275 ](the latter aka O(sqrt(n))).

For example, in applications where the sorted form can be readily maintained, a decent B+-tree tends to massively outperform a hashmap as soon as you get the streamed/non-indexed side (of what's in a way a lookup join) to opportunistically batch items:

as when you sort your lookups you can use exponential forward search (compare at exponentially increasing distances from the cursor; once you're past the target, run binary search between this now upper bounds and the previous probe point as lower bound) in your index for each next key to reduce the per-lookup cost to be logarithmic in distance from the last-looked-up key (asymptotically always better than single isolated lookups; in practice tends to cap out at 2x the cost in pathological cases if you respect page locality of B+-tree structures). If you aren't ignorant of your LRU cache set during such, you'll get by with overall significantly fewer memory accesses to e.g. fresh DRAM pages (let alone NAND pages) than with a hashmap setup.

I've severely sped up a C++ program by replacing an `std::unordered_map` with a `std::vector` (iirc technically a pair of; for SoA purposes) by realizing that I could collect the items unordered; sort the structure; then use binary search instead of hashmap lookups. The function that I modified ran something like 3x faster as a result, and that's without anything like vectorization-friendly structures or such.