    230 points craigkerstiens | 17 comments
    1. urronglol ◴[] No.42576069[source]
    What is a v7 UUID? Why do we need more than 1. a UUID from a random seed and 2. one derived from that and a timestamp (orderable)?
    replies(6): >>42576084 #>>42576085 #>>42576155 #>>42576193 #>>42576274 #>>42576349 #
    2. n2d4 ◴[] No.42576084[source]
    UUID v7 is the latter, whereas v4 is the former.

    All the other versions are somewhat legacy, and you shouldn't use them in new systems (besides v8, which is "custom format UUID", if you need that.)

    replies(1): >>42576174 #
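
To make the v4/v7 distinction concrete, here is a minimal sketch in Python. It assumes the RFC 9562 bit layout for v7 (48-bit Unix millisecond timestamp, 4 version bits, 12 random bits, 2 variant bits, 62 random bits). The function name `uuid7_sketch` and its `unix_ms` parameter are made up for illustration; this is not a library API, and real implementations may add monotonic counters that this sketch omits.

```python
import secrets
import time
import uuid


def uuid7_sketch(unix_ms=None):
    """Build a v7-style UUID: timestamp in the high bits, randomness below."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    value = (unix_ms & ((1 << 48) - 1)) << 80  # 48-bit millisecond timestamp
    value |= 0x7 << 76                         # version nibble = 7
    value |= secrets.randbits(12) << 64        # 12 random bits
    value |= 0b10 << 62                        # RFC variant bits
    value |= secrets.randbits(62)              # 62 random bits
    return uuid.UUID(int=value)


random_id = uuid.uuid4()      # v4: random only, no useful ordering
ordered_id = uuid7_sketch()   # v7-style: sorts by creation time
```

Because the timestamp occupies the most significant bits, IDs created in different milliseconds sort in creation order, which is the "orderable" property asked about above.
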
    3. cube2222 ◴[] No.42576085[source]
    UUID v7 is what you numbered #2.

    For the others, it’s best to read up on Wikipedia[0]. I believe they all have their unique use-cases and tradeoffs.

    E.g. including information about which node of the system generated an ID.

    [0]: https://en.m.wikipedia.org/wiki/Universally_unique_identifie...

    4. mind-blight ◴[] No.42576155[source]
    A deterministic UUID based on a hash of the input bits is also very useful (UUIDv5). I've used that for deduping records from multiple sources.
    5. elehack ◴[] No.42576174[source]
    UUID v5 is quite useful if you want to deterministically convert external identifiers into UUIDs: define a namespace UUID for each potential identifier source (to keep them separate), then use that to derive a v5 UUID from the external identifier. It's very useful for idempotent data imports.
    replies(1): >>42577132 #
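
The namespace-per-source approach described above can be sketched with Python's standard `uuid.uuid5`. The namespace names (`crm.example.invalid`, `billing.example.invalid`), the helper functions, and the dict used as a stand-in for a database are all hypothetical, chosen only for illustration.

```python
import uuid

# One namespace UUID per external source (hypothetical names).
CRM_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "crm.example.invalid")
BILLING_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "billing.example.invalid")


def record_id(namespace, external_id):
    """Derive the same UUID every time for the same (source, external id)."""
    return uuid.uuid5(namespace, external_id)


def import_records(store, namespace, rows):
    """Upsert rows keyed by their derived UUID; re-running adds no duplicates."""
    for external_id, payload in rows:
        store[record_id(namespace, external_id)] = payload


store = {}
rows = [("cust-1", {"name": "Ada"}), ("cust-2", {"name": "Grace"})]
import_records(store, CRM_NAMESPACE, rows)
import_records(store, CRM_NAMESPACE, rows)  # second run is a no-op
```

The same external identifier under a different namespace yields a different UUID, which is what keeps the sources separate.
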
    6. chimpontherun ◴[] No.42576193[source]
    As is usual in many areas of human endeavor, newcomers to the field tend to criticize design decisions that were made before them, only to re-invent what was already invented.

    Sometimes it leads to improvements in the field, via rejection of the accumulated legacy crud, or just simply affording a new perspective. Most other times it's well-intentioned but low-effort noise.

    I, personally, do it myself. This is how I learn.

    7. purerandomness ◴[] No.42576419[source]
    It comes off as a low-effort question that seems to try to evoke a reply from a peer, while it's the kind of question that is best answered by an LLM, Google, or Wikipedia.
    replies(1): >>42576927 #
    8. a3w ◴[] No.42576927{3}[source]
    Well, an LLM could answer it or write total bullshit. But yes, Wikipedia or other quick internet research will generally help. And exactly here, it will tell you that there are competing standards since we have competing use cases.
    replies(1): >>42577002 #
    9. kraftman ◴[] No.42577002{4}[source]
    For simple questions like this with unambiguous answers, it is statistically very unlikely that you'll get a bullshit answer from an LLM.
    10. jandrewrogers ◴[] No.42577132{3}[source]
    Both UUIDv3 and UUIDv5 are prohibited for some use cases in some countries (including the US), which is something to be aware of. Unfortunately, no one has created an updated standard UUID that uses a hash function that is not broken. While useful, it is not always an option.
    replies(1): >>42585785 #
    11. treve ◴[] No.42578276[source]
    I didn't downvote you, but the terseness made it immediately come off to me as a kind of criticism, e.g.: "Why would we ever need it". That may not have been your intent, but if it was a genuine question, form matters.
    replies(1): >>42579530 #
    12. urronglol ◴[] No.42579530{3}[source]
    Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.

    Comments should get more thoughtful and substantive, not less, as a topic gets more divisive.

    When disagreeing, please reply to the argument instead of calling names. "That is idiotic; 1 + 1 is 2, not 3" can be shortened to "1 + 1 is 2, not 3."

    Please don't fulminate. Please don't sneer, including at the rest of the community.

    Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.

    replies(2): >>42579865 #>>42586819 #
    13. mtmail ◴[] No.42579865{4}[source]
    From the same guidelines: "Please don't comment about the voting on comments. It never does any good, and it makes boring reading." treve gave insight into their thought process when they read your initial comment, and I had the same reaction. Neither treve nor I downvoted it.
    14. kbolino ◴[] No.42585785{4}[source]
    Could you provide an example of such a prohibition? I've never heard of that before.

    I doubt that the quality of the hash function is the real issue. The problem with MD5 and SHA1 is that it's easy (for MD5) and technically possible (for SHA1) to generate collisions. That makes them broken for enforcing message integrity. But a UUID is not an integrity check. Both MD5 and SHA1 are still very good as non-cryptographic hash functions. While a hash-based UUID provides obfuscation, it isn't really a security mechanism.

    Even the existence of UUIDv5 feels like a knee-jerk reaction from when MD5 was "bad" but SHA1 was still "good". No hash function will protect you against de-obfuscation of low-entropy inputs. I can feed your social security number through SHA3-512 but it's not going to make it any less guessable than if I fed it through MD5.
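
A minimal sketch of the low-entropy point, assuming the attacker knows (or can guess) the namespace and that the hidden input is a 4-digit code. The namespace name, the "secret" value, and the helper `recover_input` are all hypothetical.

```python
import uuid

# Hypothetical namespace, assumed to be known to the attacker.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "ids.example.invalid")


def recover_input(target, candidates):
    """Hash every candidate and return the one that matches the target UUID."""
    for candidate in candidates:
        if uuid.uuid5(NAMESPACE, candidate) == target:
            return candidate
    return None


secret = "4821"                             # low-entropy input
obfuscated = uuid.uuid5(NAMESPACE, secret)  # what gets stored or exposed
all_codes = (f"{n:04d}" for n in range(10_000))
recovered = recover_input(obfuscated, all_codes)
```

Swapping SHA-1 for a stronger hash would not change the outcome here: the search space is only 10,000 values, so enumeration succeeds regardless of the hash function.
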

    Moreover, a UUID only has 122 bits of usable space. Even if we defined a new SHA2- or SHA3-based UUID version, it's still going to have to truncate the hash output to less than half of its full size. This significantly alters the security properties of the hash function, though I'm not sure if much cryptanalysis has been done on the shorter forms to see if they're more practically breakable yet.
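
To show where the 122 bits come from, here is a sketch of a hypothetical SHA-256-based, name-based UUID packed into the v8 "custom format" slot mentioned earlier in the thread. This is an illustration only, not a standardized name-based version; the function name `sha256_uuid_sketch` and the choice to hash `namespace.bytes + name` (mirroring how v5 builds its input) are assumptions.

```python
import hashlib
import uuid


def sha256_uuid_sketch(namespace, name):
    """Truncate SHA-256 to 128 bits, then overwrite 6 bits with version/variant."""
    digest = hashlib.sha256(namespace.bytes + name.encode("utf-8")).digest()
    value = int.from_bytes(digest[:16], "big")  # keep 128 of the 256 bits
    value &= ~(0xF << 76)                       # clear the 4 version bits
    value |= 0x8 << 76                          # version nibble = 8 (custom)
    value &= ~(0x3 << 62)                       # clear the 2 variant bits
    value |= 0b10 << 62                         # RFC variant
    return uuid.UUID(int=value)


example = sha256_uuid_sketch(uuid.NAMESPACE_DNS, "example.invalid")
```

Of the 128 bits, 4 are fixed by the version and 2 by the variant, leaving 122 bits of hash output, which is less than half of SHA-256's 256-bit digest.
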

    There is one area where the collision resistance of the hash function could be a concern, though. If all of the inputs to the hash are under the control of a potential attacker, then maliciously constructed data could produce the same UUID. I still wouldn't think this would be a major issue, since most databases will fail to insert a duplicate key, but it might allow for various denial of service attacks. This still feels like quite a niche risk, though, and very circumstance-dependent.

    replies(1): >>42593244 #
    15. treve ◴[] No.42586819{4}[source]
    I'm not arguing with you, but I'm giving you feedback on your communication style, which, even with this comment, is still completely lacking.
    16. jandrewrogers ◴[] No.42593244{5}[source]
    Systems where a sophisticated attacker may engineer collisions are precisely why UUIDv3/5 are prohibited. SHA1 is deemed broken by some government authorities and not to be used in any critical systems, including as a UUID (this is where I’ve seen it expressly prohibited). The entire point of UUIDs in many systems is that collisions should be impossible; system integrity is predicated on it. Many systems exist in a presumptively adversarial environment.

    Similarly, UUIDv4 is also prohibited in many contexts because people using weak entropy sources has been a recurring problem in real systems. It isn’t a theoretical issue, it has actually happened repeatedly. Decentralized generation of UUIDv4 is not trusted because humans struggle to implement it correctly, causing collisions where none are expected.

    There are also contexts where probabilistic collision resistance is disallowed because collision probabilities, while low, are high enough to be theoretically plausible. Most people aren’t working on systems this large yet.
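
For a rough sense of scale, the standard birthday approximation p ≈ 1 − exp(−n(n−1)/(2N)) can be evaluated for N = 2^122, the number of distinct v4 values. This sketch assumes a perfect entropy source, which is exactly the assumption the previous paragraph says often fails in practice; the function name is made up for illustration.

```python
import math

RANDOM_BITS = 122  # random bits in a v4 UUID


def collision_probability(n, bits=RANDOM_BITS):
    """Approximate chance of at least one collision among n random IDs."""
    space = 2 ** bits
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-n * (n - 1) / (2 * space))


for count in (10**9, 10**12, 10**15, 2**61):
    print(f"{count:.3e} IDs -> p = {collision_probability(count):.3e}")
```

The probability grows roughly with the square of the number of IDs, so it stays negligible for most systems but approaches even odds near 2^61 IDs.
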

    Ironically, there are many reasonable ways to construct secure 128-bit identity values, but the standards don’t define one. Some flavor of deterministic generation + encryption is not uncommon, but it is also non-standard.

    That said, many companies unavoidably have a mix of standard and non-standard UUIDs internally. To mitigate collisions, they have to transform those UUIDs into something else UUID-like, at which point it is pretty much guaranteed to be non-standard. Not ideal but that is the world we live in.

    replies(1): >>42602563 #
    17. kbolino ◴[] No.42602563{6}[source]
    Ok, that makes sense. As far as I can tell, even truncated to "just" 122 bits, there's still no known way to generate a SHA-256 collision, so the MD5/SHA1 versions are comparatively vulnerable vs. a hypothetical SHA-256 UUID version. However, it's starting to feel like UUIDs may not be long enough in general to meet the need for secure, distributed ID generation.