67 points anon6362 | 14 comments

    alexdns ◴[] No.45074520[source]
    It was considered innovative when it was first shared here eight years ago.
    replies(1): >>45074700 #
    nurumaik ◴[] No.45074700[source]
    Anything more innovative happened since (honestly curious)?
    replies(4): >>45075146 #>>45075479 #>>45075495 #>>45077234 #
    1. js4ever ◴[] No.45075146[source]
    I don't think so, but my guess is raw performance rarely matters in the real world.

    I once explored this, hitting around 125K RPS per core on Node.js. Then I realized it was pointless: the moment you add any real work (database calls, file I/O, etc.), throughput drops below 10K RPS.
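    The gap between those two numbers falls out of Little's law (a back-of-envelope sketch with illustrative latencies, not measurements from the comment above): sustainable throughput is roughly in-flight requests divided by per-request latency.

    ```python
    def max_rps(concurrency: int, latency_s: float) -> float:
        """Little's law: sustainable throughput = in-flight requests / per-request latency."""
        return concurrency / latency_s

    # A trivial handler costing ~8 microseconds of CPU per request:
    # even one request in flight per core sustains ~125,000 RPS.
    print(max_rps(1, 8e-6))    # -> ~125,000

    # Add a 1 ms database round-trip with 10 requests in flight:
    # throughput collapses to ~10,000 RPS regardless of how fast the network stack is.
    print(max_rps(10, 1e-3))   # -> ~10,000
    ```

    The concurrency and latency figures here are assumed for illustration; the point is that once per-request latency is dominated by external work, the network stack stops being the limiting term.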

    replies(3): >>45075358 #>>45075454 #>>45075994 #
    2. antoinealb ◴[] No.45075358[source]
    The goal of this kind of system is not to replace the application server. It is intended to work on the data plane, where you do simple operations but do them many times per second. Think things like load balancers, cache servers, routers, security appliances, etc. In this space Kernel Bypass is still very much the norm if you want to get an efficient system.
    replies(2): >>45075829 #>>45076472 #
    3. jandrewrogers ◴[] No.45075454[source]
    Storage and databases don’t have to be that slow; that’s just architecture. I have database servers doing 10M RPS each, which absolutely will stress the network.

    We just do the networking bits a bit differently now. DPDK was a product of its time.

    replies(1): >>45086643 #
    4. eqvinox ◴[] No.45075829[source]
    > In this space Kernel Bypass is still very much the norm if you want to get an efficient system.

    Unless you can get an ASIC to do it, in which case the ASIC is massively preferable; the power savings alone generally¹ end the discussion. (= remove most routers from the list; also some security appliances and load balancers.)

    ¹ exceptions confirm the rule, i.e. small/boutique setups

    replies(1): >>45077150 #
    5. rivetfasten ◴[] No.45075994[source]
    It's always a matter of chasing the bottleneck. It's fair to say that network isn't the bottleneck for most applications. Heuristically, if you're willing to take on the performance impacts of a GC'd language you're probably already not the target audience.

    Zero copy is the important part for applications that need to saturate the NIC. For example Netflix integrated encryption into the FreeBSD kernel so they could use sendfile for zero-copy transfers from SSD (in the case of very popular titles) to a TLS stream. Otherwise they would have had two extra copies of every block of video just to encrypt it.
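    The mechanism here is the sendfile() syscall: bytes move from the page cache into the socket buffer entirely inside the kernel, never surfacing in user space. A minimal Linux sketch (my illustration, not Netflix's actual stack, which additionally does TLS in the kernel):

    ```python
    # Zero-copy file-to-socket transfer via os.sendfile (Linux).
    # The payload never passes through a user-space buffer on the sending side.
    import os
    import socket
    import tempfile

    payload = b"video-segment-bytes" * 100  # stand-in for a block of video

    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        left, right = socket.socketpair()
        # os.sendfile(out_fd, in_fd, offset, count): the kernel copies straight
        # from the page cache into the socket buffer -- no read()/send() pair,
        # so the two extra user-space copies the comment mentions never happen.
        sent = os.sendfile(left.fileno(), f.fileno(), 0, len(payload))
        left.close()
        received = right.recv(len(payload), socket.MSG_WAITALL)
        right.close()
    ```

    A socketpair stands in for a TCP connection to keep the sketch self-contained; with kernel TLS, encryption slots into the same in-kernel path, which is the point of the FreeBSD work described above.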

    Note however that their actual streaming stack is very different from the application stack. The constraint isn't strictly technical: ISP colocation space is expensive, so they need to have the most juiced machines they can possibly fit in the rack to control costs.

    There's an obvious appeal to accomplishing zero-copy by pushing network functionality into user space instead of application functionality into kernel space, so the DPDK evolution is natural.

    replies(1): >>45077821 #
    6. baruch ◴[] No.45076472[source]
    We build storage systems and use DPDK in the application; when the network IS the bottleneck, it is worth it. Saturating two or three 400 Gbps NICs is possible with DPDK and the right architecture, which makes the network the bottleneck.
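    To put 400 Gbps in perspective (my arithmetic, not a figure from the comment above): even with full-size 1500-byte frames, saturating one such NIC means on the order of 33 million packets per second, leaving only tens of nanoseconds of processing budget per packet.

    ```python
    def pkt_budget_ns(link_gbps: float, frame_bytes: int) -> float:
        """Per-packet time budget, in nanoseconds, to keep a link saturated."""
        pps = link_gbps * 1e9 / (frame_bytes * 8)  # packets per second at line rate
        return 1e9 / pps                           # nanoseconds per packet

    # One 400 Gbps NIC, 1500-byte frames: ~33 Mpps, ~30 ns per packet.
    print(round(pkt_budget_ns(400, 1500), 1))   # -> 30.0
    # Minimum-size 64-byte frames shrink the budget to ~1.3 ns.
    print(round(pkt_budget_ns(400, 64), 2))
    ```

    At ~30 ns per packet there is no room for a syscall per packet, which is why kernel bypass (or batched kernel interfaces) becomes a requirement rather than an optimization at these rates.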
    7. gonzopancho ◴[] No.45077150{3}[source]
    ASICs require years to develop and aren’t flexible once deployed.
    replies(2): >>45077634 #>>45078128 #
    8. nsteel ◴[] No.45077634{4}[source]
    Even the ones supporting things like P4?
    9. pclmulqdq ◴[] No.45077821[source]
    TCP is generally zero-copy now. Zero-copy with io_uring is also possible.

    AF_XDP is also another way to do high-performance networking in the kernel, and it's not bad.

    DPDK still has a ~30% advantage over an optimized kernel-space application, but it comes with a huge maintenance burden. A lot of people reach for it, though, without optimizing kernel interfaces first.
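    One example of the kind of kernel-interface optimization meant here (my illustration, unrelated to any specific benchmark): receiving into a preallocated buffer with recv_into() instead of recv(), which avoids a per-call allocation and an extra user-space copy on the ordinary socket path.

    ```python
    # Squeezing the normal kernel socket path before reaching for DPDK:
    # recv_into() fills a caller-owned buffer, so no new bytes object is
    # allocated and copied on every receive. (A sketch, not a benchmark.)
    import socket

    left, right = socket.socketpair()
    left.sendall(b"x" * 4096)

    buf = bytearray(4096)        # reused across calls in a real server
    view = memoryview(buf)       # sliceable without copying
    got = 0
    while got < 4096:
        got += right.recv_into(view[got:])

    left.close()
    right.close()
    ```

    Techniques like this, plus batching via io_uring or AF_XDP as mentioned above, often recover much of the gap before a full kernel-bypass rewrite is justified.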

    10. eqvinox ◴[] No.45078128{4}[source]
    You don't develop an ASIC to run a router with, you buy one off the shelf. And the function of a router doesn't exactly change day by day (or even year by year).
    replies(2): >>45086222 #>>45091677 #
    11. ZephyrP ◴[] No.45086222{5}[source]
    Change keeps coming, even when the wire format of a protocol has ossified. I've spent years in security and router performance at Cisco, wrote a respectable fraction of the flagship's L3 and L2-L3 (tun) firewall. I merged a patch on this tried-and-true firewall just this year; it's now deployed.

    As vendors are eager to remind us, custom silicon to accelerate everything between L1 to L7 exists. That said, it is still the case in 2025 that the "fast path" data-plane will end up passing either nothing or everything in a flow to the "slow path" control-plane, where the most significant silicon is less 'ASIC' and more 'aarch64'.

    This is all to say that the GP's comments are broadly correct.

    replies(1): >>45091662 #
    12. js4ever ◴[] No.45086643[source]
    What DB engine is it? What hardware?
    13. ◴[] No.45091662{6}[source]
    14. nsteel ◴[] No.45091677{5}[source]
    14. nsteel ◴[] No.45091677{5}[source]
    My colleagues are always writing new features for our edge and core router ASICs released more than 10 years ago. They ship new software versions multiple times a year. It is highly specialised work, and the customer requesting the feature has to be big enough to make it worthwhile, but our silicon is flexible enough to avoid off-loading to slow CPUs in many cases. You get what you pay for.