219 points skadamat | 45 comments
1. xnorswap ◴[] No.41301135[source]
My favourite corollary of this is that even if you win the lottery jackpot, then you win less than the average lottery winner.

Average Jackpot prize is JackpotPool/Average winners.

Average Jackpot prize given you win is JackpotPool/(1+Average winners).

The number of expected other winners on the date you win is the same as the average number of winners. Your winning ticket doesn't affect the average number of winners.

This is similar to the classroom paradox where there are more winners when the prize is poorly split, so the average observed jackpot prize is less than the average jackpot prize averaged over events.

replies(6): >>41301213 #>>41302432 #>>41302595 #>>41303568 #>>41304369 #>>41305006 #
2. pif ◴[] No.41301213[source]
> The number of expected other winners on the date you win is the same as the average number of winners.

Sorry, but no! The total number of expected winners (including you) is the same as the average number of winners.

replies(3): >>41301245 #>>41301876 #>>41302414 #
3. xnorswap ◴[] No.41301245[source]
No, it's 1+average number of winners.

If the odds of winning are 1 in 14 million, and 28 million tickets are sold, then you expect there to be 2 winners.

If you look at your ticket and see you've won the lottery, then the odds of winning are still 1 in 14 million for everyone else, and out of the 27,999,999 other tickets sold, you expect 2 other winners, and now expect 3 winners total, given you have won.

replies(5): >>41301374 #>>41301753 #>>41302155 #>>41302202 #>>41302412 #
4. xnorswap ◴[] No.41301374{3}[source]
For anyone unconvinced, let's simulate this.

Instead of 1 in 14 million, we'll just do 1 in 2, and 8 players.

So we'll check how many bits are set in the average random byte:

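    // LINQPad-style snippet (relies on System and System.Linq being imported).
    void Main()
    {
        // Each byte is one draw; each of its 8 bits is a player with a 1-in-2 chance.
        byte[] buffer = new byte[256 * 1024];
        Random.Shared.NextBytes(buffer);
        // Average number of set bits (winners) per byte, using the portable PopCount.
        var avg = buffer.Average(b => System.Numerics.BitOperations.PopCount((uint)b));
        Console.WriteLine(avg);
    }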
Okay, bounces around 3.998 to 4.001, seems normal.

Now let's check how many bits are set in the average random byte given that the low bit is 1 (i.e. player 1 has won!)

    void Main()
    {
        byte[] buffer = new byte[256 * 1024];
        Random.Shared.NextBytes(buffer);
        var avg = buffer
            // Keep only bytes whose low bit is set, i.e. draws where player 1 won.
            .Where(b => (b & 0x01) == 0x01)
            .Average(b => System.Numerics.BitOperations.PopCount((uint)b));
        Console.WriteLine(avg);
    }
Now ~=4.500

Which is 1+3.5

In this case, the result is 1 plus the average over the 7 other players, so averaging over 7 others rather than 8 makes a significant difference.

If we simulate with millions of players, you'll see that removing 1 person from the pool makes essentially no difference.
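
The same check can be done exactly instead of by sampling. Here is a minimal Python sketch (the function names are made up for illustration), assuming each of n players wins independently with probability p. It enumerates every win/lose outcome, so it is only practical for small n:

```python
from fractions import Fraction
from itertools import product


def expected_winners(n, p):
    """Exact E[number of winners] by enumerating all 2**n win/lose outcomes."""
    total = Fraction(0)
    for outcome in product([0, 1], repeat=n):
        k = sum(outcome)
        total += k * p**k * (1 - p) ** (n - k)
    return total


def expected_winners_given_first_wins(n, p):
    """Exact E[number of winners | player 0 wins], by the same enumeration."""
    num = Fraction(0)
    den = Fraction(0)
    for outcome in product([0, 1], repeat=n):
        if outcome[0] != 1:
            continue  # discard outcomes where player 0 loses
        k = sum(outcome)
        prob = p**k * (1 - p) ** (n - k)
        num += k * prob
        den += prob
    return num / den


half = Fraction(1, 2)
print(expected_winners(8, half))                   # 4
print(expected_winners_given_first_wins(8, half))  # 9/2
```

Under the independence assumption the conditional value is 1 + (n - 1) * p, which for 28 million tickets at 1 in 14 million is within a rounding error of 3.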

5. melenaboija ◴[] No.41301753{3}[source]
You are making an assumption that you did not explain before, which makes this confusing: that you don’t know the number of winning tickets in advance and that the number of winning tickets affects the prize amount. Not all lotteries work like that.
replies(1): >>41302765 #
6. cortesoft ◴[] No.41301876[source]
To understand why your totally understandable conclusion is wrong, it helps me to think about what it means to determine the average number of other winners when I win.

The reason is similar to the Monty Hall Problem (https://en.wikipedia.org/wiki/Monty_Hall_problem)

To understand, let's think about the simplest representation of this problem... 2 people playing the lottery, and a 50/50 chance to win.

So, we can map out all the possible combinations:

A wins (50%) and B wins (50%) - 25% of the time

A wins (50%) and B loses (50%) - 25% of the time

A loses (50%) and B wins (50%) - 25% of the time

A loses (50%) and B loses (50%) - 25% of the time

So we have 4 even outcomes, so to figure out the average number of winners, we just add up the total number of winners in all the situations and divide by 4... so two winners in the first scenario, plus one winner in scenario 2, plus one winner in scenario 3, and zero winners in scenario 4, for 4 total winners in all situations... divide that by 4, and we see we have an average of 1 winner per scenario.

This makes sense... a 50/50 chance of winning with 2 people leads to an average of 1 winner per draw.

Now let's see what happens if we check for situations where player A wins; in our example, that is the first two scenarios. We throw out scenarios 3 and 4, since player A loses in those two scenarios.

So scenario one has 2 winners (A + B) while scenario two has 1 winner (just A)... so in two (even probability) outcomes where A is a winner, we have a total of 3 winners... divide that 3 by the two scenarios, and we get an average of 1.5 winners per scenario where A is a winner.

Why does this happen? In this simple example it is easy to see why... we removed the 1/4 chance where we have ZERO winners, which was bringing down the average.

This same thing happens no matter how many players and what the odds are... by selecting only the scenarios where a specific player wins, we are removing all the possible outcomes where zero people win.

replies(3): >>41302605 #>>41302670 #>>41303389 #
7. adastra22 ◴[] No.41302155{3}[source]
This is a variant of the Monty Hall problem, and the trick that makes it unintuitive is that you’ve snuck in the conditional of assuming you have already won.

If you have a ticket and haven’t checked that it is winning, you should expect two winners, regardless of whether you end up being one of them.

If you play the lottery trillions of times (always with the same odds, for simplicity) and build up a frequentist sample of winning events, you will on average be part of a pool of two winners in those instances where you win.

You snuck in the assumption that the first ticket checked (yours) is a winner, which screws up the statistics.

replies(1): >>41302539 #
8. burnished ◴[] No.41302202{3}[source]
I think you forgot to mention the condition 'given that you already have a ticket', and whatever justifications are required to assume that two more winning tickets will be present (if each ticket has independent odds of being a winner then you end up with a distribution of other tickets yeah?). Otherwise your premise doesn't quite lead to your conclusion.
replies(1): >>41302628 #
9. jncfhnb ◴[] No.41302412{3}[source]
That doesn’t work very well when the average number of winners is much less than 1. The math might work out that the “expected value” is more than one winner but in a realistic lottery you should expect to be the only winner.
replies(2): >>41302487 #>>41302508 #
10. mitthrowaway2 ◴[] No.41302414[source]
You're both right! This is where the subjective Bayesian framework helps clarify things. The passive-voice term "expected winners" leaves ambiguous a key idea: Expected by whom?

The number of winners you expect depends on what information you have, namely, whether or not you know that you are holding a winning lottery ticket!

11. mecsred ◴[] No.41302432[source]
> Your winning ticket doesn't affect the average number of winners.

I think this is a good hint that the conclusion isn't true. Just think about what it would mean if this were true for a sample of lotto winners. For a winner, if they win, their average number of winners is higher than the global average. Repeat this logic for each individual winner... And every winner wins with a higher number of winners than the average. Which is clearly impossible.

It would be true if you were guaranteed to win, since that's the assumption you have conditioned the probability on, but that's not a lottery then. If you want to get the actual expected value across all samples you need to take a weighted sum including the expected value when you don't win.

replies(1): >>41302584 #
12. TylerE ◴[] No.41302487{4}[source]
I'm not sure if this is true, as large jackpots see a higher than average number of tickets sold.
replies(1): >>41302830 #
13. bluGill ◴[] No.41302508{4}[source]
Every lottery I know of has many winners. One big winner but many who match only one number, and so win a tiny amount.
replies(1): >>41302614 #
14. xnorswap ◴[] No.41302539{4}[source]
I haven't "snuck in" anything, I very explicitly stated that it's the conditional expectation I'm talking about.

Expected(Number of winners | Pop size N and You win) = 1 + Expected(Number winners | Pop size N-1)

For small p, large N, that's ~1 + Expected(number of winners).
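
Spelled out as a short derivation, assuming each of the N tickets wins independently with probability p, and writing K for the number of winners and W_i for the event that ticket i wins (these symbols are introduced here for illustration only):

```latex
\begin{aligned}
\mathbb{E}[K \mid W_1]
  &= 1 + \mathbb{E}\Big[\textstyle\sum_{i=2}^{N} \mathbf{1}_{W_i} \,\Big|\, W_1\Big]
   = 1 + (N-1)\,p, \\
\mathbb{E}[K] &= N p, \\
\mathbb{E}[K \mid W_1] - \mathbb{E}[K] &= 1 - p \approx 1 \quad \text{for small } p .
\end{aligned}
```

The second equality in the first line uses independence: knowing that ticket 1 won says nothing about tickets 2 to N.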

replies(1): >>41303124 #
15. xnorswap ◴[] No.41302584[source]
> Which is clearly impossible.

It's just like the average pupil being in a larger than average class size, it's not impossible!

Take a situation where you have 499 lotteries with zero winners, and 1 lottery with 1000 winners.

There are on average 2 winners per lottery.

From the perspective of all the winners, there was an average of 1000 winners.

That's the very basis of the paradox in the article.

Now, in that case, the lottery would be investigated for fraud. But the paradox plays out in a much gentler sense.
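
The 499-and-1 example can be checked in a few lines. This is only a sketch of the two ways of averaging, with variable names chosen for illustration:

```python
# Winner counts for 500 draws: 499 draws with no winner, one draw with 1000.
winners_per_draw = [0] * 499 + [1000]

# Average over draws: what an outside observer sees per lottery.
per_draw_average = sum(winners_per_draw) / len(winners_per_draw)

# Average over winners: each winner reports the size of their own draw,
# so a draw with k winners is counted k times.
per_winner_average = sum(k * k for k in winners_per_draw) / sum(winners_per_draw)

print(per_draw_average)    # 2.0
print(per_winner_average)  # 1000.0
```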

replies(1): >>41303746 #
16. kgwgk ◴[] No.41302595[source]
> Average Jackpot prize is JackpotPool/Average winners.

> Average Jackpot prize given you win is JackpotPool/(1+Average winners).

That doesn't make a lot of sense.

Maybe you mean that most winners get less than the average prize.

Let's say that there is $1m jackpot and there could be one, two, three or four winners (with equal probability).

To simplify the calculation, let's say that each outcome happens once.

The average prize is $400k (4 x $1m / (1+2+3+4)).

A winner has 40% probability of getting just $250k and 30% probability of getting $333k.

----

Edit: Or maybe you tried to say something like the following but didn't get it right because "average winners" means different things when you win and when you don't.

> Average Jackpot prize is JackpotPool/Average winners when there are one or more winners

> Average Jackpot prize given you win is JackpotPool/(1+Average winners when there are zero or more winners).

replies(1): >>41302733 #
17. xnorswap ◴[] No.41302605{3}[source]
You said my conclusion was wrong (edit: Apologies, I confused the nesting level here), then proved it correct by calculating the expected number of winners given you win as 3/2.
replies(1): >>41302690 #
18. TylerE ◴[] No.41302614{5}[source]
Winners in this case means jackpot winners. This is especially relevant as unlike partial winners, the jackpot is shared, not duplicated.
19. xnorswap ◴[] No.41302628{4}[source]
Right, I didn't spell out the format of the lottery. I'm assuming a "pick X numbers from Y" format, rather than a raffle style lottery.

This allows for multiple independent winners.

20. FabHK ◴[] No.41302670{3}[source]
Nice. And say the lottery jackpot is a constant $6; then the average winnings per winner are $3 (case 1), $6 (case 2) or $6 (case 3), each equally likely (case 4 is not applicable), so $5.

However, if A wins, A wins either $3 (case 1) or $6 (case 2), so A's expected winnings are $4.5, which is indeed < $5, as GGP asserted.

replies(1): >>41303435 #
21. FabHK ◴[] No.41302690{4}[source]
(I think cortesoft was responding to pif, whom you also responded to, thus agreeing with you.)
replies(1): >>41302783 #
22. xnorswap ◴[] No.41302733[source]
The key here is that you don't care what happens when you don't win, you don't care how much other people win.

What you care about, is the expected amount you win, given that you have a winning ticket.

Let's say there are N players, and let's say anyone has a 1 in X independent chance to win.

If you don't buy a ticket, there are N/X expected winners.

If you do buy a ticket, it doesn't affect whether other people win or not.

There are still N/X expected other winners.

Your participation doesn't reduce the expected number of people, who are not yourself, that will win.

This isn't a Monty hall problem, because Monty Hall introduced new information.

Buying a ticket doesn't introduce new information.

With Prob of (X-1)/X, you lose, and go home unhappy.

With Prob of 1/X, you win. And now there are 1 + N/X expected winners.

Your buying a ticket therefore increased the overall expected number of winners by 1/X. That is correct.

Conditioned on you winning, there are 1 + N/X expected winners.

Conditioned on you losing, there are N/X expected winners.

replies(1): >>41302817 #
23. xnorswap ◴[] No.41302765{4}[source]
I'm essentially assuming the UK lottery (from 96-200x) rules, where players:

Pick 6 numbers from 49, so odds are independent, and roughly 1 in 14m.

The prize jackpot is determined from a set percentage from the ticket sales, and is shared between jackpot winners.

The size of each winner's jackpot prize is therefore determined by total sales and by how many winners there are.

I'm also assuming your individual ticket contribution doesn't materially affect either the prize pool or the number of people playing. For a large N, small p, this holds true.

24. xnorswap ◴[] No.41302783{5}[source]
Ah, thank you, navigating the nesting on here is difficult sometimes and this has proven a very contentious topic!
25. kgwgk ◴[] No.41302817{3}[source]
Conditioned on you winning, there are 1 + N/X expected winners. The number of winners is always larger than zero [edit: the following is not correct "and the average prize is calculated dividing the jackpot by the 1 + N/X expected winners"].

Conditioned on you losing, there are N/X expected winners. The number of winners can be zero and the average prize cannot be calculated dividing the jackpot by the N/X expected winners [edit: not that you could before...]. [edit: the following is not correct "You have to divide the jackpot by a higher number: the expected winners conditional on having at least one."] No winner, no prize.

---

Let's say there are 2 people playing and the probability of winning is 50%. The number of expected winners is 1. If the jackpot is $1000 the "average lottery winner" doesn't get $1000.

Three outcomes are possible, with the following probabilities:

  1/4 zero winners
  1/2 one winner gets $1000
  1/4 two winners get $500 each
The "average lottery winner" gets less than $1000. The "average lottery winner" gets $750. (Imagine that a lot of draws have happened: for each split jackpot there were two jackpots going to a single winner. All in all, half the winners got $1000 and the other half got $500.)

Consider now that you are one of the two players and you win. The other person will either win (you get $500) or not (you get $1000) with the same probability. Your expected prize? $750

What a coincidence!

replies(1): >>41306172 #
26. jncfhnb ◴[] No.41302830{5}[source]
It is true. The average power ball does not have a winner.
27. kgwgk ◴[] No.41303124{5}[source]
The main problem with your argument is that the "average lottery winner" doesn't win JackpotPool/Average winners.
28. kgwgk ◴[] No.41303389{3}[source]
Maybe pif's comment

"The total number of expected winners (including you) is the same as the average number of winners"

means

"The total number of expected winners (including you) is the same as the average number of winners when there is at least one winner"

All the possible outcomes where zero people win are irrelevant when it comes to the calculation of how much "the average lottery winner" wins.

replies(1): >>41304155 #
29. kgwgk ◴[] No.41303435{4}[source]
The "average lottery winner" also wins $4.5 though. (The original claim was that "if you win the lottery jackpot, then you win less than the average lottery winner".)

If there are 100 draws with a $6 jackpot 25 will have no winners, 50 will have one ($6) winner and 25 will have two ($3 each) winners.

100 winners in total - half won $6 and half won $3.
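
A sketch of the three averages being compared in this subthread, for 2 players with a 50% chance each and a $6 jackpot. The variable names are illustrative, and it assumes the jackpot is only paid out when somebody wins:

```python
from fractions import Fraction
from itertools import product

JACKPOT = 6
# (A wins, B wins); all four outcomes are equally likely.
outcomes = list(product([False, True], repeat=2))

# 1) Average prize per winning draw: mean of JACKPOT / k over draws with k >= 1.
prizes_per_draw = [Fraction(JACKPOT, sum(o)) for o in outcomes if sum(o) > 0]
per_draw = sum(prizes_per_draw) / len(prizes_per_draw)

# 2) Average prize per winner: total paid out divided by total number of winners.
total_paid = sum(JACKPOT for o in outcomes if sum(o) > 0)
total_winners = sum(sum(o) for o in outcomes)
per_winner = Fraction(total_paid, total_winners)

# 3) A's expected prize given that A wins.
a_prizes = [Fraction(JACKPOT, sum(o)) for o in outcomes if o[0]]
given_a_wins = sum(a_prizes) / len(a_prizes)

print(per_draw, per_winner, given_a_wins)  # 5 9/2 9/2
```

The $5 figure averages over draws, while the two $4.50 figures average over winners, which is why they agree with each other and not with the first.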

replies(1): >>41310193 #
30. GuB-42 ◴[] No.41303568[source]
That's true if you are cheating, for example by knowing the numbers in advance, guaranteeing a win. The cheater is the "+1" in your argument, an extra player with a 100% win rate.

But if you are not, and pick a random time where you win, on average, you will win as much as the average lottery winner.

For the classroom paradox to work, you have to take the average prize per draw after splitting, not the average prize per winner.

For example, if there are 9 winners in the first draw and 1 in the second, then there are 5 winners on average, so the average prize is 1/5. If you are one of the winners, there is a 9/10 chance you are among the 9 and only win 1/9, which is less than average, but there is also a 1/10 chance of winning the full prize, which is much better than average. If you take a weighted average of these (9/10*1/9+1/10*1) you get 1/5, back to the average prize. The average individual prize per draw is (1/9+1)/2=5/9, but it is kind of a meaningless number.

Another way to see it is that most of the time you will win less than average, but the few times you win more, you will win big. But isn't that what lotteries are all about?

31. mecsred ◴[] No.41303746{3}[source]
Well, I simulated it and the numbers seem to agree with you. The example is interesting. I still have trouble seeing why my original reasoning doesn't hold though. I'll give an example, if anyone can clear up the issue that would be appreciated.

1/10 odds, 10 entrants, one winner expected on average.

Given a particular winner: Expect: 0.9 + 1 winners

Given the same particular loser: Expect: 0.9 winners

Over all cases we see: 0.1(1.9) + 0.9(0.9) = 1 winners

Checks out, but if the numbers are correct then any winner should be able to calculate the higher average and be right knowing only that there is at least one winner. So in cases where there is at least one winner: P(winners>=1)=1-(9/10)^10=~65% The expectation should work out to 1.9. The rest of the time we expect zero winners. However if I use those numbers I get an overall expected number of winners as 1.237, which has increased the overall number of winners across all cases. In order for that number to work out to one, the expected winners when there is at least one winner would have to be ~1.535. Which suggests that the expected outcome is different depending on if you check your own ticket, or someone else's, even if you see the same thing?

Am I just not on for math today? I thought the solution to the paradox would be that the higher expectation discounts outcomes with zero winners.
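
One way to see where the two numbers come from is to compute both conditional expectations exactly. This is a sketch assuming 10 independent players with a 1-in-10 chance each; `binomial_pmf` is a helper written for this example. "At least one winner" and "this particular ticket won" are different conditions: a draw with k winners is k times as likely to include any given ticket, so the second condition tilts towards draws with many winners.

```python
from math import comb


def binomial_pmf(n, p, k):
    """P(exactly k winners among n independent players with win probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.1

# E[winners | at least one winner]: drop the k = 0 term and renormalise.
p_at_least_one = 1 - binomial_pmf(n, p, 0)
given_some_winner = (
    sum(k * binomial_pmf(n, p, k) for k in range(1, n + 1)) / p_at_least_one
)

# E[winners | one particular player wins]: that player plus the other n - 1.
given_this_player_wins = 1 + sum(k * binomial_pmf(n - 1, p, k) for k in range(n))

print(round(p_at_least_one, 3))          # 0.651
print(round(given_some_winner, 3))       # 1.535
print(round(given_this_player_wins, 3))  # 1.9
```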

32. cortesoft ◴[] No.41304155{4}[source]
I misspoke a bit when saying it is ALL because of the case where zero people win.

It still holds for non-zero cases, too.

Since whether any individual wins is independent of other people winning, selecting only the situations where you win doesn't change the odds of other people winning, it simply adds a 100% chance of you winning. So it has all the same combination of winners, plus you.

I don't have time right now to type out a more full explanation, but I hope this somewhat makes sense given my previous comment.

replies(1): >>41304259 #
33. kgwgk ◴[] No.41304259{5}[source]
> So it has all the same combination of winners, plus you.

And the same is true when you condition on having at least one winner. One winner doesn't change the odds of other people winning.

[edit: this may not be correct, never mind "In your example it doesn't matter whether you condition on A winning, on B winning or on at least one of A and B winning."]

replies(1): >>41304302 #
34. cortesoft ◴[] No.41304302{6}[source]
Right, one winner doesn't change the odds... but we are choosing to throw out all the scenarios where that winner doesn't win, which DOES change the overall odds distribution. We are changing our selection criteria.
replies(1): >>41304333 #
35. kgwgk ◴[] No.41304333{7}[source]
I think my previous comment was wrong. Anyway, the point is that the original claim

"if you win the lottery jackpot, then you win less than the average lottery winner"

seems wrong unless the winnings of "the average lottery winner" are defined in a quite unnatural way.

In your example the average lottery winner wins 3/4 of the jackpot. Half the winners take it all, the other half have to split it with someone else.

replies(1): >>41306717 #
36. nonameiguess ◴[] No.41304369[source]
This is (technically) wrong, but not for the reasons I've seen others give so far. Your reasoning is basically fine, but your definition of an average jackpot prize is not. If we have k lottery winners and we denote each individual prize as n_i, then the average prize is sum(n_1 ... n_k) / k. It's pretty easy to see that number cannot possibly be larger than all individual n_i and thus it cannot be the case that "you" won less than the average prize for all possible yous. Some winners win less than average and some win more, or they all win exactly the same amount.

On the other hand, your analytically computed expected winning is indeed less than an analytically computed expected average prize, when conditioned on the fact that you won, because you are more likely than not to be in a lottery that has more winners than the average lottery. This is mathematically the same phenomenon as the thing where the perceived average class size if you sample random students is greater than the actual average class size, because more students will be in the larger classes. This doesn't mean every class is larger than the average class, which is not possible. It just means that if you randomly select a student, you have a better than 50/50 chance of selecting someone in a larger than average class.

37. ruuda ◴[] No.41305006[source]
Another one is that your friends on average have more friends than you. (Because you are more likely to be friends with people who have many friends than with people who have few friends.)
38. pessimizer ◴[] No.41306172{4}[source]
> Consider now that you are one of the two players

This assumes that when you decide to buy a lottery ticket, you get to prevent someone else from buying one. If you decide to buy a lottery ticket, now there are three players.

replies(2): >>41307270 #>>41307281 #
39. cortesoft ◴[] No.41306717{8}[source]
No, it is still the case that “if you win the lottery jackpot, then you will win less than the average lottery winner”. Let me see if I can explain in another way that might make this more clear… the example of only 2 people actually confuses the issue.

So in our example with 50% chance of winning, we know the average number of winners will be n/2, where n is the number of players. This means that the average lottery winner will win prize_pool / (n/2).

Now, let’s say we know I won. That means the average number of other winners is going to be (n-1) / 2. If you add in the known winner (me), we would have an average of 1 + (n-1)/2 winners… meaning the prize per person when I win is going to be prize_pool / (1 + (n-1)/2).

You can clearly see that the prize per winner will be smaller when you know I am a winner. If it isn’t clear, just sub in 10 for n and solve it… the average winner will get prize_pool / (10/2) or prize_pool / 5. When I win, the average winner will get prize_pool / (1 + (10-1)/2), or prize_pool / 5.5. You can see that when I win, the average is lower.

This of course works whenever you start with the assumption that a particular person wins… you are turning the 1/2 chance for that person into a 100% chance, which increases the overall average number of winners.

replies(2): >>41307224 #>>41307528 #
40. kgwgk ◴[] No.41307224{9}[source]
> This means that the average lottery winner will win prize_pool / (n/2).

Does it?

Take the case n=2. Run the lottery a few times and take all the winners.

Half the winners win prize_pool.

Half the winners win prize_pool/2.

How do you define “average lottery winner” so the average lottery winner will win prize_pool / (n/2) = prize_pool ?

41. ◴[] No.41307270{5}[source]
42. kgwgk ◴[] No.41307281{5}[source]
Not sure what your point is, but the original claim was about a constant (or constant-enough) number of players.

> I'm also assuming your individual ticket contribution doesn't materially affect either the prize pool or the number of people playing. For a large N, small p, this holds true.

43. kgwgk ◴[] No.41307528{9}[source]
Taking the n=10 case, because you think n=2 is confusing.

> the average winner will get prize_pool / (10/2) or prize_pool / 5.

No, the average winner will get expected_prize_pool / expected_number_of_winners.

If 5 is the number of winners averaged over all draws - including those without winners - the (average) pot they share has also to take into account draws without winners.

The average prize shared in this case is not prize_pool, it’s 1023/1024 times prize_pool.
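
A sketch that checks this for n=10 and a 50% chance, with the prize pool normalised to 1 and assuming it is only paid out when somebody wins (`pmf` is a helper defined just for this example):

```python
from fractions import Fraction
from math import comb

n, p = 10, Fraction(1, 2)


def pmf(m, k):
    """P(exactly k winners among m independent players)."""
    return comb(m, k) * p**k * (1 - p) ** (m - k)


# Per-winner average: expected payout per draw / expected winners per draw.
expected_payout = 1 - pmf(n, 0)
expected_winners = n * p
average_winner_share = expected_payout / expected_winners

# My expected share given that I win: 1 / (1 + k), where k counts winners
# among the other n - 1 players.
my_share_given_win = sum(pmf(n - 1, k) * Fraction(1, 1 + k) for k in range(n))

print(expected_payout)       # 1023/1024
print(average_winner_share)  # 1023/5120
print(my_share_given_win)    # 1023/5120
```

Both shares come out the same, a little under prize_pool / 5, because dividing by the expected number of winners is not the same as taking the expected value of the split.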

44. FabHK ◴[] No.41310193{5}[source]
You are correct. So, the statement should not be "if you win the lottery jackpot, then you win less than the average lottery winner", but "if you win the lottery jackpot, then you win less than lottery winners win on average"?
replies(1): >>41310574 #
45. kgwgk ◴[] No.41310574{6}[source]
When you win the lottery you win on average what lottery winners win on average when they win the lottery.

One may find ways to define things differently so something is less than something else but I’m not sure what’s the point in doing so.