
97 points indigodaddy | 6 comments
pmarreck ◴[] No.45155251[source]
Why not just simulate a real shuffle?

Just "cut" in a random location (rotate the deck a random amount), then split the deck roughly in half (add a bit of random variation), then "flip them together" back to front by alternately taking 1 or 2 (randomly, add a small chance of 3, so maybe 50% 1, 40% 2 and 10% 3) from each side till there are no cards left to shuffle. Then repeat 8 times or so (there's a certain minimum number of times that ensures good randomness)

replies(3): >>45155382 #>>45155506 #>>45156321 #
kerkeslager ◴[] No.45155382[source]
> Why not just simulate a real shuffle?

If you're asking why your proposed shuffling method is insecure: I don't know, and that's exactly why I would never use it.

Asking "why not do X?" is entirely not paranoid enough for security. If you want to propose an algorithm, start with trying to prove the security claims of the algorithm. In this case, you'd need to prove that your algorithm creates a permutation that is indistinguishable from random. If you can't prove it, it's highly probable that your algorithm doesn't create a random permutation and someone will figure out how to break it.

I'll point out that we already have proven shuffling algorithms[1] and they're obviously faster than what you've proposed. So the simple answer to your question is "Because it's unproven and slower than proven algorithms."

[1] https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle
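For reference, a minimal Fisher-Yates sketch in Python (using secrets.randbelow, which is backed by the OS CSPRNG; the function name is illustrative):

    import secrets

    def fisher_yates(deck):
        deck = list(deck)
        for i in range(len(deck) - 1, 0, -1):
            # secrets.randbelow(i + 1) is uniform on 0..i, with no modulo bias.
            j = secrets.randbelow(i + 1)
            deck[i], deck[j] = deck[j], deck[i]
        return deck

    print(fisher_yates(range(52)))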

replies(1): >>45155421 #
pmarreck ◴[] No.45155421[source]
From what I understand, the quality of a Fisher-Yates shuffle depends entirely on the quality of the random source, and on not introducing bias by reducing the random values with a modulo whose divisor doesn't evenly divide the generator's range. It actually says that right in the article.
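To make the modulo point concrete (a toy illustration, not from the thread): reducing a uniform byte mod 52 over-represents the low residues, because 256 is not a multiple of 52.

    from collections import Counter

    # Push every possible byte value (0..255) through "% 52".
    counts = Counter(b % 52 for b in range(256))
    print(counts[0], counts[51])  # 5 vs 4: residues 0..47 occur 5 times, 48..51 only 4 times

Drawing again whenever the value lands in the uneven tail (rejection sampling) removes the bias, which is roughly what secrets.randbelow does internally.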

My method may not suffer as much from those drawbacks, but you're right: without testing it thoroughly there's no way to know, and it shouldn't be relied on instead of F-Y.

EDIT: My intuition was more or less correct. Doing it the way I described serves to "smear" any bias across the deck. Fascinating ChatGPT convo on it: https://chatgpt.com/share/68bd103f-9188-8004-8cbc-86693a0d87...

Turns out that an even easier way is just to assign 128 bits of randomness to each card and sort the cards by those values. It apparently degrades gracefully with biased sources.
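A sketch of that sort-by-random-key approach (Python; the function name is illustrative, and the collision retry is shown for completeness even though a repeat among 52 keys of 128 bits each is astronomically unlikely):

    import secrets

    def shuffle_by_random_keys(deck, bits=128):
        while True:
            keys = [secrets.randbits(bits) for _ in deck]
            if len(set(keys)) == len(keys):  # no ties, so the ordering is well defined
                order = sorted(zip(keys, deck), key=lambda kv: kv[0])
                return [card for _, card in order]

    print(shuffle_by_random_keys(list(range(52))))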

replies(3): >>45155555 #>>45156002 #>>45207546 #
1. oneshtein ◴[] No.45156002[source]
> Turns out that an even easier way is just to assign 128 bits of randomness

52! is roughly 2^226.

You cannot reach all ~2^226 permutations with a 128-bit value generated from a 64-bit seed.
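The 2^226 figure checks out (a quick sanity check in Python):

    import math

    print(math.log2(math.factorial(52)))  # ~225.58, so 52! is just under 2^226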

replies(2): >>45156669 #>>45158430 #
2. rcxdude ◴[] No.45156669[source]
It's 128 bits per card, 6656 bits in total. That's vastly more than 226 bits.
replies(1): >>45160082 #
3. pmarreck ◴[] No.45158430[source]
With all due respect, I don't think you understand. The 128 bits per card are only there to sort the 52 cards into a random-enough order, and they provide far more than enough entropy and key space for that.

We're not trying to enumerate all possible decks, just to sort the 52 cards into one uniquely randomized deck. The more bits we assign per card, the closer any error or bias in the source goes to nil. (In contrast, consider using only 4 bits per card.)

Since everyone's bashing my use of ChatGPT: I don't think ChatGPT5 would have made that categorical error.

replies(1): >>45161676 #
4. amluto ◴[] No.45160082[source]
You could use something like 12 bits of randomness per card (a rough approximation of log_2(n^2) for n = 52) to get the probability of reusing a number down to a manageable level, check whether you've reused a number (which is basically free once you've sorted), and then repeat the whole process if you did.

Or you could have a lazily calculated infinite-precision random number per card and use more like 6 bits per card in expectation. Other than using more (and annoyingly variable) memory, this may well be faster than a properly unbiased Fisher-Yates.

Or you could assign a few bits per card, sort, and then recurse on each group of cards that sorted into the same bucket.

In summary, there are lots of genuinely unbiased solutions (assuming a perfect binary RNG), and they all boil down to something roughly equivalent to rejection sampling.
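For instance, a sketch of the last variant (a few bits per card, sort into buckets, recurse on tied buckets with fresh bits). The function name and bit count are illustrative; the recursion stays unbiased given a perfect RNG because tied cards never reuse old randomness:

    import secrets

    def bucket_shuffle(deck, bits=4):
        if len(deck) <= 1:
            return list(deck)
        buckets = {}
        for card in deck:
            # Each card draws a fresh small random key.
            buckets.setdefault(secrets.randbits(bits), []).append(card)
        out = []
        for key in sorted(buckets):
            group = buckets[key]
            # Cards that collided on a key are ordered by recursing with new bits.
            out.extend(bucket_shuffle(group, bits) if len(group) > 1 else group)
        return out

    print(bucket_shuffle(list(range(52))))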

5. oneshtein ◴[] No.45161676[source]
128 bits x 52 cards is 6656 bits, roughly 30x more than 226, so it will work IF those bits are independent. If a CSPRNG such as AES-128 is used, it has only a 128-bit seed, which cannot cover all ~2^226 permutations. And if the PRNG sequence is reused to generate more shuffles, the next shuffle can be predicted from the previous one.

A CSPRNG like AES-256 with a 256-bit random seed is safe.

A CSPRNG like AES-128 with a 128-bit (or smaller) random seed is not.

replies(1): >>45165388 #
6. tripletao ◴[] No.45165388{3}[source]
> If PRNG sequence is reused to generate more shuffles, then the next shuffle can be predicted from the previous one.

That's theoretically true by a counting argument, but with a CSPRNG the only way to compute that next shuffle is roughly as expensive as brute-forcing the key. 128 bits is enough to make that infeasible, though 256 may be preferable for the usual abundance of caution.

The number of permutations doesn't matter here, because an attack becomes computationally infeasible long before the counting argument makes it impossible. It's also possible for a practical attack to exist even when the output can't be predicted exactly, since partial statistical information is still valuable.