
97 points indigodaddy | 3 comments
pmarreck ◴[] No.45155251[source]
Why not just simulate a real shuffle?

Just "cut" at a random location (rotate the deck by a random amount), then split the deck roughly in half (with a bit of random variation), then "flip them together" back to front by alternately taking 1 or 2 cards (randomly, with a small chance of 3: say 50% take 1, 40% take 2, 10% take 3) from each side until there are no cards left to shuffle. Then repeat 8 times or so (there's a certain minimum number of riffles that ensures good randomness).
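A rough Python sketch of the procedure described above (the ±3 split variation and the 50/40/10 take probabilities are just the illustrative numbers from the comment, not a vetted design):

```python
import random

def riffle_shuffle(deck, rounds=8):
    """Simulate the shuffle described above: cut, split, riffle, repeat."""
    deck = list(deck)
    for _ in range(rounds):
        # "Cut": rotate the deck by a random amount.
        cut = random.randrange(len(deck))
        deck = deck[cut:] + deck[:cut]
        # Split roughly in half, with a bit of random variation.
        mid = len(deck) // 2 + random.randint(-3, 3)
        left, right = deck[:mid], deck[mid:]
        # Riffle: alternately take 1, 2, or 3 cards (50/40/10) from each side.
        merged = []
        while left or right:
            for pile in (left, right):
                n = random.choices([1, 2, 3], weights=[5, 4, 1])[0]
                merged.extend(pile[:n])
                del pile[:n]
        deck = merged
    return deck
```

Note this uses `random`, which is not cryptographically secure; it only illustrates the mechanics being proposed.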

replies(3): >>45155382 #>>45155506 #>>45156321 #
kerkeslager ◴[] No.45155382[source]
> Why not just simulate a real shuffle?

If you are asking why your proposed shuffling method is insecure: I don't know, and that's why I would never use it.

Asking "why not do X?" is entirely not paranoid enough for security. If you want to propose an algorithm, start with trying to prove the security claims of the algorithm. In this case, you'd need to prove that your algorithm creates a permutation that is indistinguishable from random. If you can't prove it, it's highly probable that your algorithm doesn't create a random permutation and someone will figure out how to break it.

I'll point out that we already have proven shuffling algorithms[1] and they're obviously faster than what you've proposed. So the simple answer to your question is "Because it's unproven and slower than proven algorithms."

[1] https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle
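For reference, the standard in-place (Durstenfeld) form of Fisher-Yates, here drawing indices from Python's `secrets` CSPRNG:

```python
import secrets

def fisher_yates(deck):
    """Durstenfeld's in-place Fisher-Yates shuffle using a CSPRNG.

    secrets.randbelow draws from the OS CSPRNG and already handles
    the modulo-bias problem internally via rejection sampling.
    """
    deck = list(deck)
    for i in range(len(deck) - 1, 0, -1):
        j = secrets.randbelow(i + 1)  # uniform in [0, i]
        deck[i], deck[j] = deck[j], deck[i]
    return deck
```

Each of the n-1 swaps picks uniformly from the remaining prefix, which is what makes every permutation equally likely given a uniform source.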

replies(1): >>45155421 #
pmarreck ◴[] No.45155421[source]
From what I understand, the quality of the randomness of Fisher-Yates depends entirely on the quality of the random source AND that you didn't bias it by using modulo with a non-evenly-dividing divisor. It actually says that right in the article.

My method may not suffer as much from those drawbacks, but you're right, without testing it thoroughly, there's no way to know and it should not be relied upon instead of F-Y.

EDIT: My intuition was correct more or less. Doing it the way I described serves to "smear" any bias across the deck. Fascinating chatgpt convo on it: https://chatgpt.com/share/68bd103f-9188-8004-8cbc-86693a0d87...

Turns out that an even easier way is just to assign 128 bits of randomness to each card and sort them by that. Degrades gracefully from biased sources, apparently.
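The sort-by-random-key idea looks like this in Python (the "degrades gracefully from biased sources" claim comes from the linked ChatGPT conversation and isn't verified here; what is true is that with 128-bit keys a collision among 52 cards is astronomically unlikely):

```python
import secrets

def shuffle_by_random_keys(deck):
    """Tag each card with an independent 128-bit random key, sort on the keys.

    Given a uniform key source and no key collisions, every ordering of the
    cards is equally likely, so the result is a uniform random permutation.
    """
    keyed = [(secrets.randbits(128), card) for card in deck]
    keyed.sort(key=lambda pair: pair[0])  # sort on the key only, never the card
    return [card for _, card in keyed]
```

Sorting on the key alone (rather than the whole tuple) matters: it avoids ever comparing cards themselves if two keys were to tie.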

replies(3): >>45155555 #>>45156002 #>>45207546 #
kerkeslager ◴[] No.45155555[source]
> From what I understand, the quality of the randomness of Fisher-Yates depends entirely on the quality of the random source AND that you didn't bias it by using modulo with a non-evenly-dividing divisor. It actually says that right in the article.

Yes.

Pretty much every modern language ships with a secure PRNG (which probably just calls /dev/urandom). A poker site probably has enough throughput to want to not block while waiting for /dev/urandom to build up entropy, so they might do something faster, but /dev/urandom is probably secure, it just might be slower than a big poker site needs.

The non-evenly-dividing divisor thing is a bit trickier, which is why standard libraries implement Fisher-Yates for you. But the solution is basically:

Using your PRNG, generate the number of bits you need. So if you need a number 0-51, generate 6 bits. 6 bits can hold 0-63. If you get a number in the range 52-63, discard that and generate a new 6-bit number with the PRNG.

If you need a number in an awkward range like 0-35, you'll need to discard a lot of numbers, but keep in mind this may still be faster than modular division, which is pretty dang slow.
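That rejection-sampling step might look like this in Python (using `secrets.randbits` as the bit source; `secrets.randbelow` already does essentially this for you internally):

```python
import secrets

def unbiased_below(n):
    """Uniform integer in [0, n) via rejection sampling, as described above."""
    k = (n - 1).bit_length()      # fewest bits that can hold n-1 (6 for n=52)
    while True:
        x = secrets.randbits(k)   # uniform over [0, 2**k)
        if x < n:                 # keep in-range draws, discard the rest
            return x
```

For n=52 this discards 12 of every 64 draws on average; for an awkward range like 0-35 it discards 28 of 64, but each retry is just a handful of random bits.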

> My method may not suffer as much from those drawbacks, but you're right, without testing it thoroughly, there's no way to know and it should not be relied upon instead of F-Y.

That's very much not what I said.

"Testing it thoroughly" is not adequate. Start by proving the algorithm is correct. If you don't have a proof the algorithm works there's no reason to even start implementing it so there's nothing to test.

> EDIT: My intuition was correct more or less. Doing it the way I described serves to "smear" any bias across the deck. Fascinating chatgpt convo on it: https://chatgpt.com/share/68bd103f-9188-8004-8cbc-86693a0d87...

Jesus Christ, no. If you still believe anything ChatGPT says then security is not the field for you.

replies(2): >>45157612 #>>45158395 #
1. pmarreck ◴[] No.45158395[source]
And if you believe nothing ChatGPT says then you simply suffer from the opposite bias. The dilemma is that it is mostly right, most of the time, isn't it? Almost like... people? I mean, one person responded with an error that ChatGPT almost certainly wouldn't have made: https://news.ycombinator.com/item?id=45158430

If ChatGPT says things that are empirically provable, then it's not wrong, is it?

replies(2): >>45159921 #>>45207239 #
2. tripletao ◴[] No.45159921[source]
"Mostly right" is the most insidious option, because the frequent true statements or implications will lull you into believing the infrequent false ones.

For Fisher-Yates, if the RNG is good then the output permutation is independent of the input permutation. That's not true for your proposed algorithm, and neither you nor the LLM has analyzed it to determine how quickly the sensitivity to the input permutation decays. That sensitivity will have complicated structure, so it's a nontrivial problem in cryptanalysis.

The problem where the low bits of old, not cryptographically secure PRNGs were particularly non-random is real. That would be a problem for any algorithm using it though--imagine if your riffle took rand() % 2. The resolution (use the high bits, not the low bits) is the same for any algorithm. It's possible that your proposed algorithm is less sensitive because the much greater number of RNG calls makes it somehow average out, but neither you nor the LLM has analyzed that, because the residual structure is again a cryptanalysis problem.

Any potentially adversarial application should use a CSPRNG, moving the cryptographic security into that existing, well-studied block. So the problem you're trying to solve doesn't exist in practice. The LLM can probably be prompted to tell you that; but since you stated that the new shuffling algorithm was your idea, it worked to find reasons it was a good one. I don't think this is a good use of LLMs, but if you're going to then you have to avoid any suggestion that the new idea is yours, to avoid triggering the sycophancy.

3. kerkeslager ◴[] No.45207239[source]
> And if you believe nothing ChatGPT says then you simply suffer from the opposite bias.

No, you're making a logical error. It's not that I believe the opposite of what ChatGPT says, it's that I don't believe what ChatGPT says based on ChatGPT saying it alone. If ChatGPT says your algorithm is right, it's not that I automatically believe your algorithm is wrong, it's that I believe I still don't know it's right because ChatGPT is shit at being right.

> The dilemma is that it is mostly right, most of the time, isn't it? Almost like... people?

I mean, yeah. Which is why I don't believe things just because some random person says them, either.

I tend to assign confidence intervals to beliefs in my head, and even an expert saying something doesn't make me super confident in it. I tend to believe things strongly if I can understand the hard evidence for them, not because any person, even an expert, says it.

With security, we have tens of thousands of examples of unproven security assertions being proven wrong later with disastrous effect, so the standard of proof is very high, and ChatGPT saying it is absolutely nowhere near close enough to proof. The only acceptable proof with a security algorithm is mathematical proof.

> If ChatGPT says things that are empirically provable, then it's not wrong, is it?

ChatGPT also says things that are empirically disprovable, so basically you always have to go to empirical proof to tell whether what ChatGPT has said is true, making ChatGPT next-to-useless if accuracy is at all a concern. Just go to empirical proof and skip the pointless ChatGPT step.

And accuracy really, really is a very big concern in secure random shuffling algorithms, so you really really need to empirically prove that they work which you really really have not done because you stopped thinking when you trusted ChatGPT. Your shuffling algorithm is not proven, and very likely is not secure.