Why HN was slow and how Rtm fixed it

(ycombinator.com)

257 points pg | 3 comments | 19 Jan 11 18:41 UTC


mmaunder ◴[19 Jan 11 22:16 UTC] No.2121495[source]▶

>>2120756 (OP) #

"In 7 seconds, a hundred or more connections accumulate. So the server ends up with hundreds of threads, most of them probably waiting for input (waiting for the HTTP request). MzScheme can be inefficient when there are 100s of threads waiting for input -- when it wants to find a thread to run, it asks the O/S kernel about each socket in turn to see if any input is ready, and that's a lot of asking per thread switch if there are lots of threads. So the server is able to complete fewer requests per second when there is a big backlog, which lets more backlog accumulate, and perhaps it takes a long time for the server to recover."

I may have misunderstood, but it sounds like you have MzScheme facing the open internet? Try putting nginx (or another epoll/kqueue-based server) in front of MzScheme. It will handle the thousands of connections that are waiting for IO in a single thread, with very little incremental CPU load. Then when nginx reverse proxies to MzScheme, each request completes very quickly because it's local, which means you need far fewer threads in your app server. That means less memory and less of the other overhead that comes with a high thread count.
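To illustrate the readiness model being suggested, here is a minimal sketch in Python, assuming the standard `selectors` module as a stand-in for what an epoll/kqueue-based front end does internally. The function name `ready_indices` and the use of local socket pairs in place of real client connections are assumptions made for the illustration; this is not how nginx or MzScheme is implemented.

```python
import selectors
import socket


def ready_indices(n_connections, active):
    """Open n_connections local socket pairs, send a request on the ones
    listed in `active`, and return the indices that a single select()
    call reports as readable."""
    # DefaultSelector picks the best mechanism available
    # (epoll on Linux, kqueue on BSD/macOS, select elsewhere).
    sel = selectors.DefaultSelector()
    pairs = []
    try:
        for i in range(n_connections):
            client, server = socket.socketpair()
            server.setblocking(False)
            sel.register(server, selectors.EVENT_READ, data=i)
            pairs.append((client, server))

        # Only a few "clients" actually send something.
        for i in active:
            pairs[i][0].sendall(b"GET / HTTP/1.0\r\n\r\n")

        # One call covers every registered socket and returns only the
        # ones with input, instead of asking about each socket in turn.
        events = sel.select(timeout=1.0)
        return sorted(key.data for key, _mask in events)
    finally:
        sel.close()
        for client, server in pairs:
            client.close()
            server.close()
```

Calling `ready_indices(100, [3, 42, 77])` keeps 100 connections open in one thread and gets back only the three that have a request waiting, which is the property that keeps a large idle backlog cheap.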

An additional advantage is that you can enable keepalive again (it looks like you have it disabled right now), which makes things faster for first-time visitors. It also makes it slightly faster for us regulars because the conditional GETs we do for the GIFs and CSS won't have to reestablish connections. Fewer connections established also means you give your OS a break, with fewer SYN/SYN-ACK/ACK TCP handshakes.

Someone mentioned below that reverse proxies won't work for HN. They mean that caching won't work - but a reverse proxy like nginx that doesn't cache but handles high concurrency efficiently should give you a huge perf improvement.

PS: I'd love to help implement this for free. I run a 600 req/sec site using nginx reverse proxying to Apache.

replies(6): >>2121641 #>>2121644 #>>2122343 #>>2122679 #>>2125682 #>>2126225 #

sedachv ◴[19 Jan 11 22:57 UTC] No.2121644[source]▶

>>2121495 #

Or I don't know, use continuations in a place that's actually appropriate? John Fremlin showed that even with horrible CPS rewriting and epoll you can get way better throughput in SBCL (TPD2) than nginx. MzScheme comes with native continuations. It's not hard to call out to epoll.

Instead everyone in the Lisp community (pg included) is still enamored with using continuations to produce ugly URLs and unmaintainable web applications.

replies(2): >>2121816 #>>2122083 #

pg ◴[19 Jan 11 23:46 UTC] No.2121816[source]▶

>>2121644 #

> Instead everyone in the Lisp community (pg included) is still enamored with using continuations to produce ugly URLs and unmaintainable web applications.

If you read the source of HN, you'll see that it doesn't actually use continuations.

I find the source of HN very clear. Have you read it? Is there a specific part you found so complicated as to be unmaintainable?

replies(2): >>2121859 #>>2121983 #

axod ◴[19 Jan 11 23:56 UTC] No.2121859[source]▶

>>2121816 #

> If you read the source of HN, you'll see that it doesn't actually use continuations.

> It had to be some dialect of Lisp with continuations, which meant Scheme, and MzScheme seemed the best.

(From further down the page).

I'm confused. What needs continuations?

replies(2): >>2121863 #>>2121867 #

dauphin ◴[19 Jan 11 23:58 UTC] No.2121863[source]▶

>>2121859 #

Errors/exceptions, for one, are implemented using continuations.

replies(1): >>2121874 #

1. axod ◴[20 Jan 11 00:02 UTC] No.2121874[source]▶

>>2121863 #

Sounds terribly inefficient to me, but what do I know -shrug-

replies(2): >>2121906 #>>2121970 #

2. jrockway ◴[20 Jan 11 00:34 UTC] No.2121970[source]▶

>>2121874 (TP) #

All control flow is a subset of continuations. The stack is a continuation (calling a function is call-with-current-continuation, return is just calling the "current continuation"), loops are continuations (with the non-local control flow, like break/last/redo/etc.), exceptions are continuations (like functions, but returning to the frame with the error handler), etc. Continuations are the general solution to things that are normally treated as different. So continuations are just as efficient (or inefficient) as calling functions or throwing exceptions.
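A small sketch of that idea in continuation-passing style, written in Python rather than Scheme. The names `safe_div`, `average`, `ret`, and `err` are made up for the example: `ret` plays the role of "return" and `err` plays the role of "throw to the frame with the error handler". Python has no first-class continuations, so this only imitates them with ordinary closures.

```python
def safe_div(a, b, ret, err):
    """Divide in continuation-passing style.

    `ret` is the normal continuation (what "return" would do);
    `err` is the handler continuation (what "raise" would do).
    """
    if b == 0:
        return err("division by zero")
    return ret(a / b)


def average(total, count, ret, err):
    # Calling a function means handing it "what to do next" as `ret`.
    # The error continuation is passed through untouched, like a stack
    # frame that has no handler of its own.
    return safe_div(total, count, lambda q: ret(round(q, 2)), err)


results = []


def on_error(msg):
    results.append("error: " + msg)


average(10, 4, results.append, on_error)  # normal path
average(10, 0, results.append, on_error)  # error path skips average's lambda
```

After running this, `results` holds `2.5` followed by `"error: division by zero"`: the normal return and the exception are both just calls to a function representing the rest of the computation.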

In a web app context, though, it's kind of silly to keep a stack around to handle something like clicking a link that returns the contents of database row foo. People do this, call it continuations, and then run into problems. The problem is not continuations, the problem is that you are treating HTTP as a session, not as a series of request/responses. (The opposite of this style is REST.)
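The contrast can be sketched roughly as follows. Everything here is hypothetical (the `pending` table, the `token` parameter, the URL shapes) and is not taken from the HN source; it only illustrates the difference between a link that depends on state saved on the server and a link that carries everything the request needs.

```python
import itertools

_next_id = itertools.count(1)
pending = {}  # hypothetical server-side table of saved closures


def link_session_style(row_id):
    """Session style: stash a closure on the server, hand out an opaque URL."""
    token = str(next(_next_id))
    pending[token] = lambda: "contents of row " + row_id
    return "/x?token=" + token


def handle_session_style(url):
    token = url.split("token=")[1]
    # The saved closure is consumed on use; it would also be lost on a
    # server restart or when the table is trimmed.
    handler = pending.pop(token, None)
    return handler() if handler else "unknown or expired link"


def link_rest_style(row_id):
    """REST style: the URL itself names the resource."""
    return "/row/" + row_id


def handle_rest_style(url):
    # No server-side state is consulted; any request with this URL works.
    return "contents of row " + url.rsplit("/", 1)[1]
```

With these definitions, following the session-style link a second time yields "unknown or expired link", while the REST-style link returns the row contents on every request.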

replies(2): >>2122010 #>>2122390 #

3. sedachv ◴[20 Jan 11 02:48 UTC] No.2122390[source]▶

>>2121970 #

In theory yes, in practice you need to reify the stack (even for one-shot continuations). Clinger, Hartheimer and Ost have a really good survey paper of the different ways to do that:

http://www.scribd.com/doc/47221367/Clinger-Implementation-St...
