    257 points pg | 11 comments
    1. blinkingled ◴[] No.2120857[source]
    Sounds like there is a scalability issue within MzScheme: it iterates over all of its threads, asking each thread about the sockets it has. Once the number of threads and sockets grows, finding which thread to run in user space becomes awfully expensive. As any clever admin would, they applied the least invasive fix, limiting the number of connections and threads, with what sounds like immediate results!

    I have no idea what MzScheme is, but I am curious why HN is running threads in user space in 2011. The OS kernel knows best which thread to pick to run, and that is a very well tuned, O(1) operation on Linux and Solaris.
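
    To make the cost concrete, here is a minimal sketch of the kind of scan described above. It is an illustration only, not MzScheme's actual scheduler: `naive_scan_cost`, `sockets_per_thread`, and `ready` are hypothetical names, and the "sockets" are plain integers.

```python
def naive_scan_cost(sockets_per_thread, ready):
    """Ask every thread about its sockets; return (runnable thread ids, checks made).

    Hypothetical model: sockets_per_thread[i] lists the sockets owned by
    thread i, and ready is the set of sockets that currently have data.
    """
    checks = 0
    runnable = []
    for thread_id, socks in enumerate(sockets_per_thread):
        for sock in socks:
            checks += 1  # one "is this socket ready?" question per socket
            if sock in ready:
                runnable.append(thread_id)
                break  # this thread can run; stop asking about its sockets
    return runnable, checks
```

    With 1000 threads holding one socket each and only two sockets ready, the scan still asks 1000 questions to find two runnable threads, and it repeats that work on every scheduling pass.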

    replies(3): >>2120924 #>>2120962 #>>2120976 #
    2. svlla ◴[] No.2120924[source]
    not to mention that one thread per connection is, well, extremely outdated.
    replies(2): >>2120943 #>>2120991 #
    3. klochner ◴[] No.2120962[source]
    I think this whitepaper covers the PLT web server bundled with MzScheme (now 'Racket'):

    http://www.cs.brown.edu/~sk/Publications/Papers/Published/kh...

    replies(1): >>2121154 #
    4. metageek ◴[] No.2120976[source]
    MzScheme is an implementation of Scheme (a dialect of Lisp); it implements its own threading. This is not uncommon for languages which support (or used to support) many OSes, with many different versions of threading: it's easier to write it yourself, once, than to maintain N+1 OS-specific versions.

    Of course, these days, N+1 is probably 2, since everything except Windows supports pthreads.

    replies(1): >>2121330 #
    5. jrockway ◴[] No.2120991[source]
    I don't know much about MzScheme, but it's quite possible that "thread" means "stack", not "OS thread". One context stack per TCP connection is quite sustainable; with Haskell's threads and Perl's coros, I run out of fds long before I'm using any significant amount of memory. (This is somewhere around 30,000 open connections on my un-tweaked Linux desktop. I know I could do a lot more if I tried.)

    The issue, in the case of HN, is with O(n) IO watchers. Most sockets are idle most of the time, so you really want an algorithm that is O(n) over active sockets, not O(n) over active and inactive sockets. You typically have so few active fds at any time that the n is really tiny, making massively scalable network servers trivial to write. But you also have a lot of connections at any one time, so if you are O(n) over active and inactive fds, then you are going to have performance issues. Basically, you don't want to pay for connections that aren't doing anything.

    Fortunately, we have the technology: epoll on Linux, kqueue on BSDs, /dev/poll on Solaris. You just need to use an event loop, which does all the hard stuff for you (so you don't have to worry about the OS differences). Hacking a proper event loop into MzScheme may be hard, but it's absolutely necessary for writing scalable network servers. Handling 10k+ open connections is trivial with today's technology. And all the cool kids are doing it (node.js, GHC, etc.).
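
    As a rough sketch of the idea (in Python, not MzScheme, and not how HN's server works), the standard `selectors` module wraps epoll, kqueue, or /dev/poll where the OS provides them. The helper `ready_connection_ids` and the socket pairs standing in for client connections are assumptions made up for this example.

```python
import selectors
import socket


def ready_connection_ids(num_idle, num_active):
    """Register idle and active connections; return the ids the selector reports."""
    # DefaultSelector picks epoll, kqueue, or /dev/poll when available and
    # falls back to poll/select otherwise (the fallbacks still scan every fd).
    sel = selectors.DefaultSelector()
    pairs = []
    try:
        for conn_id in range(num_idle + num_active):
            server_side, client_side = socket.socketpair()
            server_side.setblocking(False)
            pairs.append((server_side, client_side))
            sel.register(server_side, selectors.EVENT_READ, data=conn_id)

        # Only the last num_active clients send anything; the rest stay idle.
        for _server_side, client_side in pairs[num_idle:]:
            client_side.send(b"ping")

        # The selector hands back only the ready sockets; idle ones never appear.
        events = sel.select(timeout=1.0)
        return sorted(key.data for key, _mask in events)
    finally:
        sel.close()
        for server_side, client_side in pairs:
            server_side.close()
            client_side.close()
```

    Calling `ready_connection_ids(100, 3)` registers 103 connections, but the loop that would handle requests only ever sees the three that actually have data waiting.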

    replies(1): >>2121074 #
    6. swannodette ◴[] No.2121074{3}[source]
    My understanding is that MzScheme / Racket has a proper event loop.
    replies(1): >>2121973 #
    7. idoh ◴[] No.2121154[source]
    Hacker News uses a web server written in Arc.
    8. blinkingled ◴[] No.2121330[source]
    If it's just a porting issue, there is Pthreads-win32, which worked well enough the last time I used it a few years ago.
    replies(1): >>2121350 #
    9. metageek ◴[] No.2121350{3}[source]
    I don't know the details. I suspect the answer is that the threading support was written when pthread support was less common, and the MzScheme developers haven't been sufficiently interested in rewriting it.
    replies(1): >>2134146 #
    10. jrockway ◴[] No.2121973{4}[source]
    Yeah, I have no idea. All I know is that proper threads do not bloat anything, and that proper IO watchers are not O(n) over inactive connections.
    11. elibarzilay ◴[] No.2134146{4}[source]
    Racket now includes a new facility, "futures", which are essentially lightweight OS-level threads. There's also another, "places", which are more separated, heavier threads (closer to a new process), but that one is not enabled by default.