130 points by luu | 1 comment | source
fwlr ◴[] No.40715677[source]

    In mathematical queueing theory, Little's law (also result, theorem, lemma, or formula[1][2]) is a theorem by John Little which states that the long-term average number L of customers in a stationary system is equal to the long-term average effective arrival rate λ multiplied by the average time W that a customer spends in the system. Expressed algebraically the law is

    L = λW

    The relationship is not influenced by the arrival process distribution, the service distribution, the service order, or practically anything else. In most queuing systems, service time is the bottleneck that creates the queue.
https://en.m.wikipedia.org/wiki/Little%27s_law

An extremely useful law to remember. You’d be surprised how much bullshit it can detect!
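
For instance (numbers made up for illustration): a service that handles 200 requests per second, with each request spending 50 ms in the system on average, must have 200 × 0.05 = 10 requests in flight on average. If someone claims that same service never has more than a couple of requests in flight, at least one of their numbers is wrong. The sanity check is one multiplication:

    # Made-up numbers, just to show the L = lambda * W sanity check.
    arrival_rate = 200.0    # lambda: requests arriving per second
    time_in_system = 0.05   # W: average seconds a request spends in the system

    avg_in_system = arrival_rate * time_in_system  # L = lambda * W
    print(avg_in_system)    # -> 10.0 requests in flight on average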

replies(1): >>40716969 #
jethkl ◴[] No.40716969[source]
> An extremely useful law to remember....

Would you be willing to provide an example where you applied Little's Law?

replies(3): >>40717899 #>>40718014 #>>40718328 #
fwlr ◴[] No.40718328[source]
Sure! In a professional capacity: our customers were dissatisfied with how long a particular workload job[0] took and with how flaky it was, and I was party to the discussions on how to solve this problem. Knowing Little's Law allowed me to dissent from the prevailing opinion that we should customise our job processing queue to prioritise these jobs[1], arguing instead that we should provision more resources (i.e. beefier servers).
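
To sketch the resource argument with a toy model (this is not the actual system: assume a single-server FIFO queue with Poisson arrivals and exponential service times, simulated via the Lindley recursion): a faster server shrinks the average time in system W, and by L = λW it shrinks the pile of jobs sitting in the system by the same factor, whereas re-ordering the same queue mostly shifts waiting from one class of job to another.

    import random

    def simulate_fifo(arrival_rate, service_rate, n_jobs=200_000, seed=1):
        """Average time in system for a single-server FIFO queue with
        Poisson arrivals and exponential service (Lindley recursion)."""
        rng = random.Random(seed)
        wait = 0.0      # queueing delay seen by the current job
        total = 0.0     # accumulated time-in-system over all jobs
        for _ in range(n_jobs):
            service = rng.expovariate(service_rate)
            total += wait + service
            # Lindley recursion: next wait = max(0, this wait + service - interarrival gap)
            wait = max(0.0, wait + service - rng.expovariate(arrival_rate))
        return total / n_jobs

    lam = 0.9                  # made-up arrival rate: 0.9 jobs per second
    for mu in (1.0, 2.0):      # baseline service rate vs. a beefier server
        W = simulate_fifo(lam, mu)
        print(f"mu={mu}: W ~ {W:.2f}s, L = lambda*W ~ {lam * W:.2f} jobs in system")

With these made-up rates, doubling the service rate takes the system from roughly nine jobs in flight to under one; and since the jobs in this model are interchangeable, re-ordering them without looking at their sizes wouldn't change those averages at all.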

The decision was still made to alter the priority. The change went into production and was found to unacceptably degrade the performance of other jobs. Thankfully, one of the engineers whom I had convinced that "processing time is the only factor that matters" had spent all their time optimizing the heavy task, to the point where it was no longer a heavy task, thus saving the day.

0. The job was some kind of "export CSV" form, and it somehow involved both 'traversing dozens of highly normalised tables' and 'digging into JSON blobs stored as text'.

1. I specifically remember one of the arguments: if you have three heavy tasks A, B, and C, the best case is "in parallel", which takes max(A, B, C) time, whereas the worst case is "sequential", which takes (A) + (B + A) + (C + B + A) time; our current priority approximated the "sequential" scenario, and the priority change would instead approximate the "parallel" scenario. I use scare quotes because I felt it was a resource issue (the "sequential" pattern was a byproduct of the most common way a heavy task got enough resources, which was an earlier heavy task finishing and freeing up a lot of resources).
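
Putting made-up numbers on that argument: with A = B = C = 10 minutes, the "parallel" case has everything done after max(10, 10, 10) = 10 minutes, while the "sequential" case has the three tasks finishing at 10, 20, and 30 minutes, i.e. (A) + (B + A) + (C + B + A) = 60 minutes of summed time-in-system.

    # Hypothetical durations, just to make the footnote's comparison concrete.
    A = B = C = 10                           # minutes
    parallel = max(A, B, C)                  # everything done at minute 10
    sequential = A + (B + A) + (C + B + A)   # completions at 10, 20, 30 -> 60 total
    print(parallel, sequential)              # -> 10 60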

replies(3): >>40718518 #>>40718630 #>>40719799 #
jethkl ◴[] No.40718518[source]
Thank you! -- and thank you to the others who shared!