230 points craigkerstiens | 11 comments

fngjdflmdflg ◴[] No.42576248[source]
>The Postgres patch solves the problem by repurposing 12 bits of the UUID’s random component to increase the precision of the timestamp down to nanosecond granularity [...]

>It makes a repeated UUID between processes more likely, but there’s still 62 bits of randomness left to make use of, so collisions remain vastly unlikely.

Does it? Even though the number of random bits has decreased, the window in which such a duplicate can occur has also shrunk, to a single nanosecond.

replies(3): >>42576319 #>>42576330 #>>42577456 #
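
For reference, a minimal sketch of the layout the quoted patch describes: a UUIDv7-style value whose 12-bit rand_a field carries sub-millisecond precision, leaving 62 random bits in rand_b. The packing below is only illustrative and assumes the sub-millisecond fraction is simply scaled into 12 bits; the actual Postgres patch may encode it differently.

    import os
    import time

    def uuidv7_ns_sketch() -> str:
        # Illustrative UUIDv7 layout (RFC 9562): 48-bit unix ms timestamp,
        # 4-bit version, 12-bit rand_a (here reused for sub-ms precision),
        # 2-bit variant, 62-bit rand_b.
        ns = time.time_ns()
        ms = ns // 1_000_000
        sub_ms = (ns % 1_000_000) * 4096 // 1_000_000   # scale 0..999_999 ns into 12 bits
        rand_b = int.from_bytes(os.urandom(8), "big") & ((1 << 62) - 1)

        value = (ms & ((1 << 48) - 1)) << 80   # unix_ts_ms
        value |= 0x7 << 76                     # version 7
        value |= sub_ms << 64                  # rand_a repurposed for sub-ms bits
        value |= 0b10 << 62                    # variant
        value |= rand_b                        # remaining 62 random bits
        return f"{value:032x}"

    print(uuidv7_ns_sketch())
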
1. londons_explore ◴[] No.42576330[source]
I could imagine that certain nanoseconds might be vastly more likely than other nanoseconds.

For example, imagine you have a router that sends network packets out at the start of each microsecond, synced to wall time.

Or the OS scheduler always wakes processes up on a millisecond timer tick or some polling loop.

Now, when those packets are received and processed by a postgres server, the time to do that is probably fairly consistent, meaning most records probably get created at roughly X nanoseconds past the microsecond.

replies(1): >>42576494 #
2. UltraSane ◴[] No.42576494[source]
But shift just one nanosecond earlier or later and you get another set of 4.611 billion billion possible random IDs. I think random variations in buffer depths and CPU speeds will easily introduce hundreds of nanoseconds of timing variation. Syncing any two things to better than 1 nanosecond is incredibly hard and doesn't happen by accident.
replies(3): >>42576585 #>>42577223 #>>42593338 #
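
For the arithmetic: 62 remaining random bits give 2^62 distinct values per nanosecond bucket, which is where the 4.611 billion billion figure comes from.

    print(f"{2**62:,}")   # 4,611,686,018,427,387,904 -> roughly 4.611 billion billion
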
3. zamadatix ◴[] No.42576585[source]
The important part is that event timing isn't going to be as random as the actual random source. The chances of an actual collision remain low, but the distribution of events over time is a weaker (in relative terms) source of random bits than a proper random source, which won't have obvious bias at all.
replies(1): >>42577252 #
4. mlyle ◴[] No.42577223[source]
We're not talking about nanoseconds of real time; we're talking about nanoseconds as measured by the CPU doing the processing. Nanoseconds are not likely to be a uniform variate.
replies(1): >>42577302 #
5. UltraSane ◴[] No.42577252{3}[source]
I am sure there is bias but 1 nanosecond is an incredibly narrow window. It really would be an interesting experiment to evaluate the optimal balance of bits for timestamp and for random value. What about hostname and even process ID? Snowflake IDs are 63 bits long with 41 bits as a millisecond timestamp, 10 bits as a machine ID, and 12 bits as a sequential counter.
replies(2): >>42578585 #>>42579410 #
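
For comparison, a sketch of the Snowflake layout described above (41-bit millisecond timestamp, 10-bit machine ID, 12-bit per-millisecond sequence counter); the epoch constant and function name here are illustrative, not any particular library's API.

    import time

    SNOWFLAKE_EPOCH_MS = 1288834974657   # Twitter's custom epoch; illustrative choice

    def snowflake_sketch(machine_id: int, sequence: int) -> int:
        # 41-bit ms timestamp | 10-bit machine ID | 12-bit sequence = 63 bits
        assert 0 <= machine_id < 1024 and 0 <= sequence < 4096
        ms = int(time.time() * 1000) - SNOWFLAKE_EPOCH_MS
        return ((ms & ((1 << 41) - 1)) << 22) | (machine_id << 12) | sequence

    print(snowflake_sketch(machine_id=7, sequence=0))
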
6. UltraSane ◴[] No.42577302{3}[source]
Yes, and they are also not likely to be so non-uniform that more than 4.611 billion billion events all happen in one nanosecond.
replies(1): >>42577523 #
7. mlyle ◴[] No.42577523{4}[source]
Note it's not that number, but roughly the square root of that number, that matters.

And they might be quite non-uniform. If the scheduler tick and the nanosecond clock are synchronous, you could end up with a few thousand popular values instead of a billion.

It's not a real concern today, and probably won't be a real concern in 10 years, but it's also not so far-fetched that nobody needs to think about it.

replies(1): >>42577649 #
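
The square-root point is the birthday bound. A rough estimate, assuming n IDs land in the same timestamp value and the remaining 62 bits are uniform:

    def collision_probability(n: int, random_bits: int = 62) -> float:
        # Birthday approximation: P(collision) ~ n*(n-1) / (2 * 2^bits)
        # for n IDs sharing one timestamp bucket.
        return n * (n - 1) / (2 * 2**random_bits)

    for n in (1_000, 1_000_000, 2**31):
        print(f"{n:>13,} IDs in one bucket -> p ~ {collision_probability(n):.3e}")

At around 2^31 (~2 billion) IDs in a single bucket a collision becomes likely, which is why concentrating events into a few thousand popular timestamp values matters more than the raw 2^62 figure.
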
8. UltraSane ◴[] No.42577649{5}[source]
Good point about the square root of the random part. I guess that is why the 63-bit Snowflake ID uses a sequential counter.
9. WorldMaker ◴[] No.42578585{4}[source]
Similarly, for direct comparison, ULID has a 48-bit timestamp, also at millisecond granularity, and 80 random bits.

Also to compare: the ULID spec's technique for monotonicity is to take a single random value and then increment its lowest bits, trading random entropy for direct "nearness", one value after another. The rand_a approach, by contrast, effectively uses the most significant bits while keeping more random entropy.
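
A simplified sketch of the ULID monotonicity rule described above: within the same millisecond, the previous random component is incremented rather than redrawn (overflow handling and Crockford base32 encoding omitted).

    import os
    import time

    _last_ms = -1
    _last_rand = 0

    def ulid_sketch() -> int:
        # 48-bit ms timestamp + 80 random bits; within the same millisecond,
        # increment the previous random value instead of drawing a fresh one.
        global _last_ms, _last_rand
        ms = time.time_ns() // 1_000_000
        if ms == _last_ms:
            _last_rand += 1
        else:
            _last_ms, _last_rand = ms, int.from_bytes(os.urandom(10), "big")
        return (ms << 80) | _last_rand

    print(hex(ulid_sketch()))
    print(hex(ulid_sketch()))   # same millisecond -> previous value + 1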

10. zamadatix ◴[] No.42579410{4}[source]
I suppose that would depend entirely on how you measure what optimal is. Optimal randomness is 128 bits from the best random source and 0 bits from anything else, like time. Optimal "just random enough for my use case but no more, so I can fit other information in the value" depends entirely on the requirements of your use case (more specifically, not just "for databases" but "for my database to... on the hardware... in which the access is... on the presumed growth..." and so on). For a generic "good enough" choice, 12 bits is probably as reasonable as any.
replies(1): >>42582283 #
11. deepsun ◴[] No.42593338[source]
The system doesn't really give you 1-nanosecond precision. It varies by OS and hardware; from what I remember you may get something like 641-nanosecond resolution. The values are in nanoseconds, yes, but you never get the "next nanosecond"; in other words, every value you get is the same mod 641. On other systems you may get resolution around 40,000 nanoseconds, or even worse on SPECTRE/MELTDOWN-protected systems.
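
The effective granularity is easy to probe empirically; a quick sketch that samples the wall clock and reports the smallest nonzero step it sees (results vary by OS, hardware, and kernel mitigations; clock_getres is POSIX-only).

    import time

    # Smallest nonzero step between consecutive wall-clock reads approximates
    # the effective timestamp granularity on this OS/hardware.
    samples = [time.time_ns() for _ in range(100_000)]
    steps = {b - a for a, b in zip(samples, samples[1:]) if b != a}
    print("smallest observed step:", min(steps), "ns")

    # POSIX systems also expose the advertised clock resolution directly.
    if hasattr(time, "clock_getres"):
        print("advertised resolution:", time.clock_getres(time.CLOCK_REALTIME), "s")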