The waiting time paradox: why is my bus always late? (2018)

(jakevdp.github.io)

xnorswap ◴[20 Aug 24 15:50 UTC] No.41301135[source]▶

My favourite corollary of this is that even if you win the lottery jackpot, then you win less than the average lottery winner.

Average Jackpot prize is JackpotPool/Average winners.

Average Jackpot prize given you win is JackpotPool/(1+Average winners).

The number of expected other winners on the date you win is the same as the average number of winners. Your winning ticket doesn't affect the average number of winners.

This is similar to the class size paradox: there are more winners when the prize is poorly split, so the average observed jackpot prize is less than the average jackpot prize averaged over events.

replies(6): >>41301213 #>>41302432 #>>41302595 #>>41303568 #>>41304369 #>>41305006 #

pif ◴[20 Aug 24 15:58 UTC] No.41301213[source]▶

>>41301135 #

> The number of expected other winners on the date you win is the same as the average number of winners.

Sorry, but no! The total number of expected winners (including you) is the same as the average number of winners.

replies(3): >>41301245 #>>41301876 #>>41302414 #

cortesoft ◴[20 Aug 24 17:12 UTC] No.41301876[source]▶

>>41301213 #

To understand why your totally understandable conclusion is wrong, it helps me to think about what it means to determine the average number of other winners when I win.

The reason is similar to the Monty Hall Problem (https://en.wikipedia.org/wiki/Monty_Hall_problem)

To understand, let's think about the simplest representation of this problem... 2 people playing the lottery, each with a 50/50 chance to win.

So, we can map out all the possible combinations:

A wins (50%) and B wins (50%) - 25% of the time

A wins (50%) and B loses (50%) - 25% of the time

A loses (50%) and B wins (50%) - 25% of the time

A loses (50%) and B loses (50%) - 25% of the time

So we have 4 even outcomes, so to figure out the average number of winners, we just add up the total number of winners in all the situations and divide by 4... so two winners in the first scenario, plus one winner in scenario 2, plus one winner in scenario 3, and zero winners in scenario 4, for 4 total winners in all situations... divide that by 4, and we see we have an average of 1 winner per scenario.

This makes sense... a 50/50 chance of winning with 2 people leads to an average of 1 winner per draw.

Now let's see what happens if we check for situations where player A wins; in our example, that is the first two scenarios. We throw out scenarios 3 and 4, since player A loses in those two scenarios.

So scenario one has 2 winners (A + B) while scenario two has 1 winner (just A)... so in two (even probability) outcomes where A is a winner, we have a total of 3 winners... divide that 3 by the two scenarios, and we get an average of 1.5 winners per scenario where A is a winner.

Why does this happen? In this simple example it is easy to see why... we removed the 1/4 chance where we have ZERO winners, which was bringing down the average.

This same thing happens no matter how many players and what the odds are... by selecting only the scenarios where a specific player wins, we are removing all the possible outcomes where zero people win.
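The four scenarios above can be enumerated directly. This is only a sketch of the two-player, 50/50 example from this comment; the names `outcomes` and `average_winners` are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Players A and B each win independently with probability 1/2,
# so the four (a_wins, b_wins) outcomes are equally likely.
outcomes = list(product([True, False], repeat=2))


def average_winners(scenarios):
    """Average number of winners over a list of equally likely scenarios."""
    return Fraction(sum(a + b for a, b in scenarios), len(scenarios))


# Average over all four scenarios.
all_draws = average_winners(outcomes)

# Average over only the scenarios where A wins (scenarios 1 and 2).
a_wins_only = average_winners([o for o in outcomes if o[0]])

print(all_draws)    # 1
print(a_wins_only)  # 3/2
```

Filtering on "A wins" drops both the zero-winner scenario and the "only B wins" scenario, which is why the average moves from 1 to 1.5.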

replies(3): >>41302605 #>>41302670 #>>41303389 #

kgwgk ◴[20 Aug 24 19:51 UTC] No.41303389[source]▶

>>41301876 #

Maybe pif's comment

"The total number of expected winners (including you) is the same as the average number of winners"

means

"The total number of expected winners (including you) is the same as the average number of winners when there is at least one winner"

All the possible outcomes where zero people win are irrelevant when it comes to the calculation of how much "the average lottery winner" wins.

replies(1): >>41304155 #

cortesoft ◴[20 Aug 24 21:15 UTC] No.41304155[source]▶

>>41303389 #

I misspoke a bit when saying it is ALL because of the case where zero people win.

It still holds for non-zero cases, too.

Since whether any individual wins is independent of other people winning, selecting only the situations where you win doesn't change the odds of other people winning; it simply adds a 100% chance of you winning. So it has all the same combinations of winners, plus you.

I don't have time right now to type out a more full explanation, but I hope this somewhat makes sense given my previous comment.

replies(1): >>41304259 #

1. kgwgk ◴[20 Aug 24 21:27 UTC] No.41304259[source]▶

>>41304155 #

> So it has all the same combination of winners, plus you.

And the same is true when you condition on having at least one winner. One winner doesn't change the odds of other people winning.

[edit: this may not be correct, never mind "In your example it doesn't matter whether you condition on A winning, on B winning or on at least one of A and B winning."]

replies(1): >>41304302 #

2. cortesoft ◴[20 Aug 24 21:32 UTC] No.41304302[source]▶

>>41304259 (TP) #

Right, one winner doesn't change the odds... but we are choosing to throw out all the scenarios where that winner doesn't win, which DOES change the overall odds distribution. We are changing our selection criteria.

replies(1): >>41304333 #

3. kgwgk ◴[20 Aug 24 21:35 UTC] No.41304333[source]▶

>>41304302 #

I think my previous comment was wrong. Anyway, the point is that the original claim

"if you win the lottery jackpot, then you win less than the average lottery winner"

seems wrong unless the winnings of "the average lottery winner" are defined in a quite unnatural way.

In your example the average lottery winner wins 3/4 of the jackpot. Half the winners take it all, the other half have to split it with someone else.
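A quick check of the 3/4 figure, as a sketch under the same two-player, 50/50 assumptions: list the share of the jackpot received by every winner across the four equally likely outcomes, then average over winners rather than over draws. The variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Same setup as the two-player example: four equally likely outcomes.
outcomes = list(product([True, False], repeat=2))

shares = []  # one entry per winner: the fraction of the jackpot they receive
for a_wins, b_wins in outcomes:
    winners = a_wins + b_wins
    if winners:  # draws with no winner contribute no winners to average over
        shares.extend([Fraction(1, winners)] * winners)

average_share = sum(shares) / len(shares)

print(len(shares))     # 4 winners in total across the four outcomes
print(average_share)   # 3/4
```

Two of the four winners take the whole jackpot and two take half each, giving (1 + 1 + 1/2 + 1/2) / 4 = 3/4.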

replies(1): >>41306717 #

4. cortesoft ◴[21 Aug 24 04:12 UTC] No.41306717{3}[source]▶

>>41304333 #

No, it is still the case that “if you win the lottery jackpot, then you will win less than the average lottery winner”. Let me see if I can explain in another way that might make this more clear… the example of only 2 people actually confuses the issue.

So in our example with 50% chance of winning, we know the average number of winners will be n/2, where n is the number of players. This means that the average lottery winner will win prize_pool / (n/2).

Now, let’s say we know I won. That means the average number of other winners is going to be (n-1) / 2. If you add in the known winner (me), we would have an average of 1 + (n-1)/2 winners… meaning the prize per person when I win is going to be prize_pool / (1 + (n-1)/2).

You can clearly see that the prize per winner will be smaller when you know I am a winner. If it isn’t clear, just sub in 10 for n and solve it… the average winner will get prize_pool / (10/2) or prize_pool / 5. When I win, the average winner will get prize_pool / (1 + (10-1)/2), or prize_pool / 5.5. You can see that when I win, the average is lower.

This of course works whenever you start with the assumption that a particular person wins… you are turning the 1/2 chance for that person into a 100% chance, which increases the overall average number of winners.
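The two winner counts used above (5 and 5.5 for n = 10) can be reproduced from the binomial distribution. This sketch only checks those expected counts, assuming independent entrants who each win with probability 1/2; the helper `expected_winners` is a hypothetical name, and the code does not settle how "the average lottery winner" should be defined.

```python
from fractions import Fraction
from math import comb

n = 10
p = Fraction(1, 2)  # assumed chance that any one entrant wins


def expected_winners(players, p):
    """Expected number of winners among `players` independent entrants."""
    return sum(
        k * comb(players, k) * p**k * (1 - p) ** (players - k)
        for k in range(players + 1)
    )


# Averaged over all draws: n * p.
unconditional = expected_winners(n, p)

# Given that I win: me, plus the expected winners among the other n - 1.
given_i_win = 1 + expected_winners(n - 1, p)

print(unconditional)  # 5
print(given_i_win)    # 11/2
```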

replies(2): >>41307224 #>>41307528 #

5. kgwgk ◴[21 Aug 24 06:04 UTC] No.41307224{4}[source]▶

>>41306717 #

> This means that the average lottery winner will win prize_pool / (n/2).

Does it?

Take the case n=2. Run the lottery a few times and take all the winners.

Half the winners win prize_pool.

Half the winners win prize_pool/2.

How do you define “average lottery winner” so the average lottery winner will win prize_pool / (n/2) = prize_pool ?

6. kgwgk ◴[21 Aug 24 06:57 UTC] No.41307528{4}[source]▶

>>41306717 #

Taking the n=10 case, because you think n=2 is confusing.

> the average winner will get prize_pool / (10/2) or prize_pool / 5.

No, the average winner will get expected_prize_pool / expected_number_of_winners.

If 5 is the number of winners averaged over all draws - including those without winners - the (average) pot they share also has to take into account draws without winners.

The average prize shared in this case is not prize_pool, it’s 1023/1024 times prize_pool.
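The 1023/1024 factor is the probability that at least one of the 10 entrants wins. A minimal sketch, assuming each entrant wins independently with probability 1/2 and the pool is only paid out when there is a winner; the last line applies the expected_prize_pool / expected_number_of_winners formula from this comment.

```python
from fractions import Fraction

n = 10
p = Fraction(1, 2)  # assumed chance that any one entrant wins

# The pool is paid out only when at least one entrant wins.
p_no_winner = (1 - p) ** n
p_some_winner = 1 - p_no_winner

# Expected payout as a fraction of prize_pool, and expected number of winners.
expected_pool_fraction = p_some_winner
expected_number_of_winners = n * p

# Average share per winner, as a fraction of prize_pool.
average_share = expected_pool_fraction / expected_number_of_winners

print(p_some_winner)  # 1023/1024
print(average_share)  # 1023/5120
```

With n = 2 the same formula gives (3/4) / 1 = 3/4, matching the two-player figure quoted earlier in the thread.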
