←back to thread

68 points der_gopher | 1 comments | | HN request time: 0s | source
Show context
sedatk ◴[] No.46211578[source]
Whenever ULID comes up, I need to remind that it has a sequential ID generation mode in its spec which is prone to conflicts on multi-threads, processes or hosts which kills the purpose of a "universal" identifier. If you need a sequential ID, just use an integer, preferably one that's autoincremented by the database.

It's best to stick to UUIDv7 because of such quirks of ULID.

replies(6): >>46212131 #>>46212144 #>>46212658 #>>46212724 #>>46213359 #>>46213609 #
listenallyall ◴[] No.46212144[source]
Under what circumstances is it prone to conflicts? On separate threads/hosts/processes, id's created within the same millisecond would be differentiated by the 80 bits of randomness (more than UUID v7).
replies(2): >>46212224 #>>46212294 #
jasonwatkinspdx ◴[] No.46212294[source]
No, ULID has a "monotonic" feature, where if it detects the same millisecond timestamp in back to back calls, it just increments the 80 bit "random" portion. This means it has convoying behavior. If two machines are generating ids independently and happen to choose initial random positions near each other, the probability of collision is much higher than the basic birthday bound.

I think this "sort of monotonic but not really" is the worst of both to be honest. It tempts you to treat it like an invariant when it isn't.

If you want monotonicity with independent generation, just use a composite key that's a lamport clock and a random nonce. Or if you want to be even more snazzy use Hybrid Logical Clocks or similar.

replies(2): >>46212754 #>>46212879 #
listenallyall ◴[] No.46212879[source]
I dont think this holds up. Define "near each other" in an 80-bit random space. Further, the likelihood of a potential conflict is offset by the fact that you have far fewer "initial random positions" instead of every single element defining its own random position. And the extra random bits (over UUID7) reduce conflict possibilities by orders of magnitude.

I concede I'm no mathematician and I could be wrong here, but your analysis feels similar to assuming 10-11-12-13-14-15 is less likely to be a winning lottery ticket because the odds against consecutive numbers are so massive.

replies(1): >>46212926 #
jasonwatkinspdx ◴[] No.46212926[source]
No, the calculation is straightforward and I'm not making the fallacious assumption you say there at the end about a magical lottery ticket number.

My basic point is the probability of collision is lower than the birthday bound, there's no need for this, and as comments in this thread make clear people are not understanding this limitation even exists with the specification.

replies(1): >>46213274 #
1. listenallyall ◴[] No.46213274{3}[source]
> the calculation is straightforward

Ok then, make it easy - your requirement is to independently pick 4 numbers from the range 0 to 9, without resulting in any duplicates. Which is more likely to be successful:

- pick 4 random digits independently

- pick a random digit, which will be appended by the next digit as pick #2 (i.e. if you pick 5, then 6 will automatically be your second digit, if you pick 9, 0 will be your second digit). Then pick once more on the same terms.

The math here is easy: scenario 1 you have 0.9 x 0.8 x 0.7 = 0.504 likelihood of success. Scenario 2 it's simply 0.7.