pmarreck ◴[] No.45155251[source]
Why not just simulate a real shuffle?

Just "cut" in a random location (rotate the deck a random amount), then split the deck roughly in half (add a bit of random variation), then "flip them together" back to front by alternately taking 1 or 2 (randomly, add a small chance of 3, so maybe 50% 1, 40% 2 and 10% 3) from each side till there are no cards left to shuffle. Then repeat 8 times or so (there's a certain minimum number of times that ensures good randomness)

replies(3): >>45155382 #>>45155506 #>>45156321 #
kerkeslager ◴[] No.45155382[source]
> Why not just simulate a real shuffle?

If you are asking why your proposed shuffling method is insecure: I don't know, and that's exactly why I would never use it.

Asking "why not do X?" is entirely not paranoid enough for security. If you want to propose an algorithm, start with trying to prove the security claims of the algorithm. In this case, you'd need to prove that your algorithm creates a permutation that is indistinguishable from random. If you can't prove it, it's highly probable that your algorithm doesn't create a random permutation and someone will figure out how to break it.

I'll point out that we already have proven shuffling algorithms[1] and they're obviously faster than what you've proposed. So the simple answer to your question is "Because it's unproven and slower than proven algorithms."

[1] https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle
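
For reference, Fisher-Yates is only a few lines; a minimal Python sketch, using secrets.randbelow so the bounded index draw is unbiased:

    import secrets

    def fisher_yates(deck):
        deck = list(deck)
        for i in range(len(deck) - 1, 0, -1):
            # secrets.randbelow(i + 1) is uniform on [0, i] with no modulo bias.
            j = secrets.randbelow(i + 1)
            deck[i], deck[j] = deck[j], deck[i]
        return deck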

replies(1): >>45155421 #
pmarreck ◴[] No.45155421[source]
From what I understand, the quality of a Fisher-Yates shuffle depends entirely on the quality of the random source, and on not biasing the index selection by taking a random value modulo a divisor that doesn't evenly divide its range. It actually says that right in the article.
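
That modulo bias is avoidable with rejection sampling; a sketch of an unbiased bounded draw built from raw random bits (this is essentially what Python's secrets.randbelow already does for you):

    import secrets

    def unbiased_below(n):
        # Uniform integer in [0, n): draw just enough bits, reject out-of-range values.
        nbits = max(1, (n - 1).bit_length())
        while True:
            x = secrets.randbits(nbits)
            if x < n:
                return x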

My method may not suffer as much from those drawbacks, but you're right: without testing it thoroughly there's no way to know, and it shouldn't be relied upon instead of Fisher-Yates.

EDIT: My intuition was more or less correct. Doing it the way I described serves to "smear" any bias across the deck. Fascinating ChatGPT convo on it: https://chatgpt.com/share/68bd103f-9188-8004-8cbc-86693a0d87...

Turns out that an even easier way is just to assign 128 bits of randomness to each card and sort them by that. Apparently it also degrades gracefully when the random source is biased.
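
A sketch of that sort-by-random-key approach (with 128 bits per card, a key collision among 52 cards is astronomically unlikely, though not impossible):

    import secrets

    def shuffle_by_random_key(deck):
        # Tag each card with 128 random bits, then order the cards by their tags.
        keyed = [(secrets.randbits(128), card) for card in deck]
        keyed.sort(key=lambda pair: pair[0])
        return [card for _, card in keyed]

Strictly speaking this is only uniform conditional on all the keys being distinct, which is what the collision handling in the replies below is about.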

replies(3): >>45155555 #>>45156002 #>>45207546 #
oneshtein ◴[] No.45156002[source]
> Turns out that an even easier way is just to assign 128 bits of randomness

52! is roughly 2^226.

You cannot address all 2^226 permutations with a 2^128 value generated from a 2^64 seed.

replies(2): >>45156669 #>>45158430 #
rcxdude ◴[] No.45156669[source]
It's 128 bits per card. That's vastly more than 226 bits in total.
replies(1): >>45160082 #
amluto ◴[] No.45160082[source]
You could use something like 12 bits of randomness per card (a very rough approximation of log_2(n^2)) to get the probability of reusing a number down to a manageable level, check whether you've reused a number (which is basically free once you've sorted), and then repeat the whole process if you did.
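
A sketch of that check-and-retry idea (the 12-bit default and the function name are just illustrative):

    import secrets

    def shuffle_retry(deck, bits=12):
        # Sort by short random keys; if any key was reused, throw everything away and redraw.
        while True:
            keys = [secrets.randbits(bits) for _ in deck]
            if len(set(keys)) == len(keys):  # no reused number, so the order is unbiased
                keyed = sorted(zip(keys, deck), key=lambda pair: pair[0])
                return [card for _, card in keyed]

With 12 bits and 52 cards the birthday-style collision probability is around a third, so you only expect a retry or two per shuffle.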

Or you could have a lazily calculated, infinite-precision random number per card and use more like 6 bits per card in expectation. Other than using more (and annoyingly variable) memory, this may well be faster than a properly unbiased Fisher-Yates.
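
A sketch of that lazy, "infinite-precision" key idea: each card's key is a random bit stream, materialized only as far as comparisons require (class and function names are illustrative):

    import secrets

    class LazyKey:
        def __init__(self):
            self.bits = []

        def _bit(self, i):
            # Generate random bits on demand, up to index i.
            while len(self.bits) <= i:
                self.bits.append(secrets.randbits(1))
            return self.bits[i]

        def __lt__(self, other):
            # Compare bit streams lexicographically; terminates with probability 1.
            i = 0
            while True:
                a, b = self._bit(i), other._bit(i)
                if a != b:
                    return a < b
                i += 1

    def shuffle_lazy(deck):
        keyed = [(LazyKey(), card) for card in deck]
        keyed.sort(key=lambda pair: pair[0])
        return [card for _, card in keyed]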

Or you could assign a few bits per card, sort, and then recurse on each group of cards that sorted into the same bucket.
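
And a sketch of that bucket-and-recurse variant: assign a few random bits per card, order by bucket, and recursively shuffle any bucket that ended up with more than one card (bits=3 is an arbitrary choice):

    import secrets

    def shuffle_buckets(cards, bits=3):
        if len(cards) <= 1:
            return list(cards)
        buckets = {}
        for card in cards:
            # Each card draws a fresh short key; ties are resolved recursively with new keys.
            buckets.setdefault(secrets.randbits(bits), []).append(card)
        result = []
        for key in sorted(buckets):
            result.extend(shuffle_buckets(buckets[key], bits))
        return result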

In summary, there are lots of genuinely unbiased solutions (assuming a perfect binary RNG), and they all boil down to something roughly equivalent to rejection sampling.