
306 points | carlos-menezes | 1 comment
cletus (No.41891721)
At Google, I worked on a pure JS Speedtest. At the time, Ookla was still Flash-based, so it wouldn't work on Chromebooks. That was a problem for installers who needed to verify an installation. I learned a lot about how TCP (I realize QUIC is UDP) responds to various factors.

I look at this article and consider the result pretty much as expected. Why? Because it pushes flow control out of the kernel (and possibly network adapters) into userspace. TCP has flow control and sequencing built in; QUIC makes you manage that yourself (sort of).

Now there can be good reasons to do that. TCP congestion control is famously out-of-date with modern connection speeds, leading to newer algorithms like BBR [1], but it comes at a cost.
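
(Aside, not from the parent comment: on Linux an application can opt an individual socket into BBR, assuming the kernel has the tcp_bbr module available. A minimal sketch:)

```python
import socket

# Illustrative only: select BBR for one socket on Linux.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# TCP_CONGESTION picks the congestion control algorithm for this socket,
# overriding the system default (net.ipv4.tcp_congestion_control).
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"bbr")
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16))  # b'bbr\x00...'
```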

But here's my biggest takeaway from all that and it's something so rarely accounted for in network testing, testing Web applications and so on: latency.

Anyone who lives in Asia or Australia should relate to this. 100ms RTT latency can be devastating. It can take something that is completely responsive and make it utterly unusable. It limits the bandwidth a connection can sustain (because of window sizing) and makes it less responsive to errors and congestion control efforts (both up and down).
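
(To make the window effect concrete, some back-of-the-envelope arithmetic; the numbers are purely illustrative, not measurements from the article:)

```python
# One window's worth of data per round trip is the ceiling on throughput.
def max_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6

# A 64 KiB window at 10 ms RTT vs. 100 ms RTT:
print(max_throughput_mbps(64 * 1024, 10))   # ~52 Mbit/s
print(max_throughput_mbps(64 * 1024, 100))  # ~5.2 Mbit/s

# Filling a 1 Gbit/s link at 100 ms RTT needs ~12.5 MB in flight:
print(1e9 / 8 * 0.100 / 1e6, "MB")          # 12.5
```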

I would strongly urge anyone testing a network or Web application to run tests where they randomly add 100ms to the latency [2].
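
(On Linux, one way to do that is netem, as in [2]. A rough sketch that shells out to tc; it assumes root and an interface named eth0, both of which you'd adjust:)

```python
import subprocess

def add_latency(iface: str = "eth0", delay_ms: int = 100) -> None:
    # Attach a netem qdisc that delays every outgoing packet.
    subprocess.run(
        ["tc", "qdisc", "add", "dev", iface, "root", "netem",
         "delay", f"{delay_ms}ms"],
        check=True,
    )

def remove_latency(iface: str = "eth0") -> None:
    subprocess.run(
        ["tc", "qdisc", "del", "dev", iface, "root", "netem"],
        check=True,
    )
```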

My point in bringing this up is that the overhead of QUIC may not practically matter, because your effective bandwidth over a single TCP connection (or QUIC stream) may be MUCH lower than your actual raw bandwidth. Put another way, 45% extra data may still be a win, because managing your own congestion control might give you higher effective speed between the two parties.

[1]: https://atoonk.medium.com/tcp-bbr-exploring-tcp-congestion-c...

[2]: https://bencane.com/simulating-network-latency-for-testing-i...

ec109685 (No.41891766)
For reasonably long downloads (long enough for the algorithm to calibrate), why don't congestion algorithms increase the number of in-flight packets high enough that bandwidth is fully utilized, even over high-latency connections?

It seems like it should never be the case that two parallel downloads perform better than a single one to the same host.
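
(One piece of the answer, with illustrative numbers that are mine, not from this thread: the sender's effective window is the minimum of the congestion window, the receiver's advertised window, and the local send buffer, so a capped send buffer bounds a single connection no matter how aggressive the congestion algorithm is:)

```python
def effective_window(cwnd: int, rwnd: int, sndbuf: int) -> int:
    # The sender can only keep the smallest of these in flight.
    return min(cwnd, rwnd, sndbuf)

# With a 2 MB send-buffer cap (as in the reply below) and a 100 ms RTT:
window = effective_window(cwnd=16_000_000, rwnd=8_000_000, sndbuf=2_000_000)
print(window * 8 / 0.1 / 1e6, "Mbit/s")  # ~160 Mbit/s ceiling per connection
# Two parallel downloads each get their own window, hence roughly 2x.
```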

toast0 (No.41892726)
I've run a big webserver that served decent-sized APK/other app downloads (and a bunch of small files and whatnot). I had to cap the maximum outgoing window to keep overall memory within limits.

IIRC, servers had 64 GB of RAM and sendbufs were capped at 2 MB. I was also dealing with a kernel deficiency that would leave the sendbuf allocated if the client disappeared in LAST_ACK. (This stems from a gap in the state description in the 1981 RFC, written before my birth.)
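
(For illustration only: the 2 MB figure is from the comment above, but the code and arithmetic are mine. One way to impose such a cap is per-socket via SO_SNDBUF; a server tuning net.ipv4.tcp_wmem system-wide gets a similar ceiling while keeping autotuning:)

```python
import socket

MAX_SNDBUF = 2 * 1024 * 1024  # 2 MB per connection

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Setting SO_SNDBUF explicitly disables send-buffer autotuning for this
# socket; on Linux the kernel also doubles the value and clamps it to
# net.core.wmem_max.
srv.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, MAX_SNDBUF)

# Why the cap matters: 2 MB * 32,768 busy connections ≈ 64 GB, i.e. the
# whole machine's RAM going to send buffers alone.
```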

dan-robertson (No.41894769)
I wonder if there's some way to reduce this server-side memory requirement. I thought that was part of the point of sendfile, but I might be mistaken. Unfortunately sendfile isn't so suitable nowadays because of TLS. But maybe if you could do TLS offload in the kernel and then use sendfile, the OS could get by with less memory for sendbufs.
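
(A rough sketch of the sendfile path, plain TCP only; under TLS you'd need kernel TLS offload before sendfile applies, as suggested above. It avoids holding file data in userspace, though the kernel still queues up to a send buffer's worth per connection:)

```python
import os
import socket

def serve_file(conn: socket.socket, path: str) -> None:
    # os.sendfile copies file pages to the socket inside the kernel, so the
    # process never buffers the payload itself; kernel socket memory is
    # still bounded by the connection's send buffer.
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        offset = 0
        while offset < size:
            sent = os.sendfile(conn.fileno(), f.fileno(), offset, size - offset)
            if sent == 0:
                break
            offset += sent
```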