
257 points pg | 1 comment
blinkingled No.2120857
Sounds like there is a scalability issue within MzScheme: it iterates over all of its threads, asking each thread about the sockets it has. As one can tell, once the number of threads and sockets grows, finding which thread to run in user space becomes awfully expensive. As any clever admin would do, a least-invasive fix was applied, limiting the number of connections and threads, with what sounds like immediate results!

I have no idea what MzScheme is, but I am curious why HN is running threads in user space in 2011. The OS kernel knows best which thread to pick to run, and that is a very well tuned, O(1) operation on Linux and Solaris.

replies(3): >>2120924 #>>2120962 #>>2120976 #
svlla No.2120924
not to mention that one thread per connection is, well, extremely outdated.
replies(2): >>2120943 #>>2120991 #
jrockway No.2120991
I don't know much about MzScheme, but it's quite possible that "thread" means "stack", not "OS thread". One context stack per TCP connection is quite sustainable; with Haskell's threads and Perl's coros, I run out of fds long before I'm using any significant amount of memory. (This is somewhere around 30,000 open connections on my un-tweaked Linux desktop. I know I could do a lot more if I tried.)
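(The point at which you "run out of fds" is just the per-process RLIMIT_NOFILE, which varies by system. A minimal sketch of checking and raising it, using Python's standard `resource` module; the exact numbers printed are system-dependent, not taken from the comment above:)

```python
import resource

# Query the per-process open-file-descriptor limit (RLIMIT_NOFILE).
# Each TCP connection consumes one fd, so the soft limit caps how many
# concurrent connections you can hold, regardless of available memory.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft fd limit: {soft}, hard fd limit: {hard}")

# An unprivileged process may raise its soft limit up to the hard limit,
# e.g. before accepting tens of thousands of connections.
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```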

The issue, in the case of HN, is with O(n) IO watchers. Most sockets are idle most of the time, so you really want an algorithm that is O(n) over active sockets, not O(n) over active and inactive sockets. You typically have so few active fds at any time that the n is really tiny, making massively scalable network servers trivial to write. But you also have a lot of connections at any one time, so if you are O(n) over active and inactive fds, then you are going to have performance issues. Basically, you don't want to pay for connections that aren't doing anything.

Fortunately, we have the technology; epoll on Linux, kqueue on BSDs, /dev/poll on Solaris. You just need to use an event loop, so it does all the hard stuff for you (and so you don't have to worry about the OS differences). Hacking a proper event loop into MzScheme may be hard, but it's absolutely necessary for writing scalable network servers. Handling 10k+ open connections is trivial with today's technology. And, all the cool kids are doing it (node.js, GHC, etc.).
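(The readiness-based loop described above can be sketched in a few lines. This is not MzScheme's internals; it's a minimal echo-server sketch using Python's standard `selectors` module, which wraps exactly the APIs named here: epoll on Linux, kqueue on the BSDs, /dev/poll on Solaris. The callback names `on_accept`/`on_read` are made up for illustration.)

```python
import selectors
import socket

# DefaultSelector picks the best readiness API on the platform
# (epoll / kqueue / /dev/poll), falling back to poll or select.
sel = selectors.DefaultSelector()

def on_accept(listener):
    conn, _ = listener.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, on_read)

def on_read(conn):
    data = conn.recv(4096)
    if data:
        conn.sendall(data)      # trivial echo
    else:
        sel.unregister(conn)    # peer closed; stop watching this fd
        conn.close()

def dispatch_once(timeout=1.0):
    # select() returns only the *ready* fds, so each pass costs
    # O(active sockets), not O(all registered sockets). Idle
    # connections sit in the kernel's interest set for free.
    for key, _ in sel.select(timeout):
        key.data(key.fileobj)   # the data slot holds the callback
```

Registering the listening socket with `sel.register(listener, selectors.EVENT_READ, on_accept)` and calling `dispatch_once()` in a loop is the whole server; 10k idle connections add nothing to the per-pass cost.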

replies(1): >>2121074 #
swannodette No.2121074
My understanding is that MzScheme / Racket has a proper event loop.
replies(1): >>2121973 #
jrockway No.2121973
Yeah, I have no idea. All I know is that proper threads do not bloat anything, and that proper IO watchers are not O(n) over inactive connections.