
460 points by pieterr | 1 comment
__turbobrew__ No.42159121
It’s interesting: SICP and many other “classic” texts talk about designing programs, but these days I think the much more important skill is designing systems.

I don’t know if distributed systems are considered part of “Computer Science,” but they are a much more common problem that I see needing to be solved.

I try to write systems in the simplest way possible, then use observability tools to figure out where the design is deficient, and only then pull out a data structure or some other “computer sciency” thing to solve that specific problem. It turns out that big-O notation and runtime complexity don’t matter the majority of the time, and you can solve most problems with arrays and fast CPUs. Even when you do have runtime problems, you should profile the program to find the hot spots rather than guess.
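
To make that concrete, here’s a minimal sketch (sizes and thresholds are made up for illustration): for a small working set, an O(n) scan over a contiguous vector often beats an O(log n) std::set, and only a measurement like this, or a profiler, tells you which one wins on your workload.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <numeric>
#include <set>
#include <vector>

int main() {
    const int N = 64;              // small working set (illustrative)
    const int lookups = 1'000'000;

    std::vector<int> vec(N);
    std::iota(vec.begin(), vec.end(), 0);
    std::set<int> tree(vec.begin(), vec.end());

    auto time_it = [&](const char* name, auto&& contains) {
        auto t0 = std::chrono::steady_clock::now();
        long hits = 0;
        for (int i = 0; i < lookups; ++i)
            hits += contains(i % (2 * N));   // ~50% hit rate
        auto t1 = std::chrono::steady_clock::now();
        long long us = std::chrono::duration_cast<
            std::chrono::microseconds>(t1 - t0).count();
        std::printf("%s: hits=%ld, %lld us\n", name, hits, us);
    };

    // O(n) scan over contiguous memory: one cache-friendly streaming pass.
    time_it("vector scan", [&](int k) {
        return std::find(vec.begin(), vec.end(), k) != vec.end();
    });
    // O(log n) search over heap-allocated nodes: pointer chasing.
    time_it("std::set   ", [&](int k) { return tree.count(k) != 0; });
}
```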

What computer science doesn’t teach you is how memory caching works in CPUs. Your fancy graph algorithm may have good runtime complexity, but if it completely hoses the CPU cache, you may have been able to go faster with an array and good cache usage.
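
A hedged illustration of the cache point (the matrix size is arbitrary): identical O(n^2) work over one contiguous buffer, traversed once sequentially and once with a large stride. On typical hardware the strided pass is several times slower despite the identical complexity.

```cpp
#include <chrono>
#include <cstdio>
#include <vector>

int main() {
    const int N = 4096;
    std::vector<int> m(static_cast<size_t>(N) * N, 1);  // row-major N x N matrix

    auto sum_with = [&](const char* name, auto&& index) {
        auto t0 = std::chrono::steady_clock::now();
        long long s = 0;
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                s += m[index(i, j)];
        auto t1 = std::chrono::steady_clock::now();
        long long ms = std::chrono::duration_cast<
            std::chrono::milliseconds>(t1 - t0).count();
        std::printf("%s: sum=%lld, %lld ms\n", name, s, ms);
    };

    // Sequential: every element of each fetched cache line gets used.
    sum_with("row order   ", [&](int i, int j) {
        return static_cast<size_t>(i) * N + j;
    });
    // Strided: one element used per cache line fetched, same big-O.
    sum_with("column order", [&](int i, int j) {
        return static_cast<size_t>(j) * N + i;
    });
}
```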

The much more common problems I face are fault tolerance, correctness of distributed locks and queues, and system scalability.
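
As one example of the correctness issues in that space, here’s a sketch of fencing tokens, a standard defense against a client writing under an expired lock lease. All names here are illustrative, not any real lock service’s API; a real system would get tokens from something like ZooKeeper or etcd and enforce the check atomically on the storage side.

```cpp
#include <cstdint>
#include <cstdio>

// Lock service hands out a lease plus a monotonically increasing token.
struct LockService {
    uint64_t next_token = 0;
    uint64_t acquire() { return ++next_token; }
};

// The protected resource enforces ordering: a write carrying a token
// older than the newest one seen is rejected.
struct Storage {
    uint64_t highest_seen = 0;
    bool write(uint64_t token, const char* data) {
        if (token < highest_seen) {
            std::printf("rejected stale token %llu\n",
                        (unsigned long long)token);
            return false;
        }
        highest_seen = token;
        std::printf("accepted token %llu: %s\n",
                    (unsigned long long)token, data);
        return true;
    }
};

int main() {
    LockService lock;
    Storage storage;
    uint64_t a = lock.acquire();         // client A takes the lock...
    uint64_t b = lock.acquire();         // ...its lease expires; B takes it
    storage.write(b, "B's update");      // accepted
    storage.write(a, "A's late write");  // A wakes from a long pause: rejected
}
```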

Maybe I am just biased because I have a computer/electrical engineering background.

1. 0xDEAFBEAD No.42162797
>What computer science doesn’t teach you is how memory caching works in CPUs. Your fancy graph algorithm may have good runtime complexity, but if it completely hoses the CPU cache, you may have been able to go faster with an array and good cache usage.

Traditionally, the field of databases is largely about solving algorithm problems in the scenario where you have much more data than can fit in memory. Data exists on disk as "pages"; you have a fixed number of "page slots" in RAM. Moving pages from disk to RAM or RAM to disk is slow, so you want to do as little of that as you can. This makes trivial problems interesting -- e.g. there's no notion of a 'join' in classic computer science because it's too trivial to bother naming.
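
A toy sketch of that page-oriented cost model (tiny made-up tables, four keys per page): a block nested-loop join, where the thing being counted and minimized is page reads, not comparisons.

```cpp
#include <cstdio>
#include <vector>

using Page = std::vector<int>;  // one disk "page" of join keys

// Simulated on-disk table: fetching a page is the expensive operation,
// so the cost model counts page I/Os.
static Page read_page(const std::vector<Page>& table, size_t i, long& io) {
    ++io;
    return table[i];
}

int main() {
    std::vector<Page> R = {{1, 2, 3, 4}, {5, 6, 7, 8}};    // outer table
    std::vector<Page> S = {{2, 4, 6, 8}, {10, 12, 1, 5}};  // inner table
    long io = 0;

    // Block nested-loop join: R is read once, S is read once per outer
    // page -- which is why keeping the smaller relation on the outside
    // (or giving it more page slots) matters to a query optimizer.
    for (size_t ri = 0; ri < R.size(); ++ri) {
        Page rp = read_page(R, ri, io);
        for (size_t si = 0; si < S.size(); ++si) {
            Page sp = read_page(S, si, io);
            for (int r : rp)
                for (int s : sp)
                    if (r == s) std::printf("match on key %d\n", r);
        }
    }
    std::printf("total page reads: %ld\n", io);  // 2 + 2*2 = 6 here
}
```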

We're used to thinking of the study of algorithms as a sort of pure essence, but one could argue that algorithmic efficiency is only meaningful in a particular data and hardware context. That's part of what keeps our jobs interesting, I guess -- otherwise algorithm expertise wouldn't be as useful, since you could just apply libraries/cookbook solutions everywhere.