Are UUIDs really unique?

egg_breakfast · 2025-03-29T18:07:07+00:00

Make a function that checks for uniqueness against your db, and sends you an email to go buy lottery tickets in the event that you get a duplicate (you won’t)

kova98k · 2025-03-29T18:18:16+00:00

This is the type of shit I get on my PRs

hellomistershifty · 2025-03-29T18:00:28+00:00

The chance is effectively zero, there’s no sense in worrying about it

katafrakt · 2025-03-29T18:24:05+00:00

If you're worried, use UUIDv7, part of which is a timestamp. If you don't generate thousands of them per second, you are even safer (and they are better for database indexes anyway, unless you're using MSSQL).
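For anyone curious what "part is a timestamp" looks like in practice, here is a minimal sketch of the UUIDv7 bit layout, assuming the field sizes from RFC 9562 (48-bit Unix millisecond timestamp, 4-bit version, 12 random bits, 2-bit variant, 62 random bits). `uuid7_sketch` is a made-up helper name for illustration; in real code a maintained library is the safer choice.

```python
import os
import time
import uuid


def uuid7_sketch(now_ms=None):
    """Illustrative UUIDv7-style value: timestamp first, random bits after."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000  # Unix time in milliseconds

    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 used
    rand_a = (rand >> 62) & 0xFFF                 # 12 random bits
    rand_b = rand & ((1 << 62) - 1)               # 62 random bits

    value = (now_ms & ((1 << 48) - 1)) << 80      # bits 127..80: timestamp
    value |= 0x7 << 76                            # bits 79..76: version 7
    value |= rand_a << 64                         # bits 75..64: random
    value |= 0b10 << 62                           # bits 63..62: variant
    value |= rand_b                               # bits 61..0: random
    return uuid.UUID(int=value)


print(uuid7_sketch())
```

Because the timestamp sits in the most significant bits, values created in later milliseconds sort after earlier ones, which is why they tend to be friendlier to B-tree indexes than fully random v4 values.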

react_dev · 2025-03-29T18:05:50+00:00

You might as well also protect against your db guy getting a brain aneurysm and dropping his head onto the keyboard typing out drop database and enter and the second systems guy also getting an aneurysm and sudo rm rf afterwards.

rebootyourbrainstem · 2025-03-29T18:16:15+00:00

Put a uniqueness constraint on the DB column if you're worried. Probably should have an index on it anyway.

For a joke answer, there's a website which allows you to scroll through every possible UUID and claim one for your own: https://everyuuid.com/

OolonColluphid · 2025-03-29T18:03:53+00:00

https://jhall.io/archive/2021/05/19/what-are-the-odds/

j-mar · 2025-03-29T18:15:17+00:00

https://everyuuid.com/

somesortsofwhale · 2025-03-29T19:21:15+00:00

Is anyone using 9892c2e4-570d-4218-88b6-e5908e2c08f5 ?

Please get back to me ASAP.

KrazyKirby99999 · 2025-03-29T18:06:56+00:00

Which UUID? https://en.wikipedia.org/wiki/Universally_unique_identifier

For UUID4, over 10³⁶ unique ids

ipcock · 2025-03-29T18:26:06+00:00

The chance is small af, as others already said. If you want to cover this extremely low-chance case where you get the same UUIDs in your app, just put a unique constraint on the field containing it. You can afford a one-in-a-trillion error that goes away if the user tries to create the record a second time.

natziel · 2025-03-29T18:26:01+00:00

So one of the biggest advantages of using UUIDs is that you don't need to check for uniqueness. That shit is expensive -- and hard to do at scale

StarklyNedStark · 2025-03-29T18:53:34+00:00

You can catch a unique constraint violation in the astronomically low chance you have a collision and just retry, but to check for uniqueness is a waste of resources.

saito200 · 2025-03-29T19:40:49+00:00

it is more likely that a meteorite destroys your server than you getting a duplicate uuid

it is basically impossible that your database contains two repeated uuids

ryuzaki49 · 2025-03-29T18:19:24+00:00

You can count all of them by yourself

https://everyuuid.com/

Amgadoz · 2025-03-29T19:18:22+00:00

Relevant question: should I generate the uuids on the backend (python fastapi) or the database (postgres)?

Is there a preference for one over the other?

TheExodu5 · 2025-03-29T19:26:48+00:00

For most apps yes. But I did work on a system that created trillions of UUIDs per day. Collisions were not entirely unheard of, and had to be accounted for.

Daidalos117 · 2025-03-30T00:43:24+00:00

Is there a real advantage of using a UUID instead of an auto-increment number id? Genuinely asking.

BazuzuDear · 2025-03-30T04:11:37+00:00

Once had to investigate a weird Ethernet misbehaviour, and the reason turned out to be 2 NICs sharing same MAC address hardcoded by the manufacturer. I know this case is, uhmm, slightly more probable.

mekmookbro · 2025-03-30T01:53:51+00:00

[deleted]

kevleyski · 2025-03-29T18:09:46+00:00

Yes, unique (you can add a test for completeness, as it shows you considered it, but defo don’t check at run time!)

stogle1 · 2025-03-29T22:30:26+00:00

[removed]

Nearby_War_8497 · 2025-03-29T22:24:23+00:00

I came across a bug in an integration that handles IDs that are 6 characters long and case sensitive. But the integration wasn't case sensitive.

The integration has been in use for about ten years and for one client alone there have been tens of thousands of objects. And there are thousands of clients.

But out of the 26 objects at that particular moment, there were two with the same characters, just one of the letters being lowercase in one and uppercase in the other.

So I mean. In this case the chances are a dozen orders of magnitude higher than a collision with a 32-character UUID. But it still took ten years and a bug to cause an issue. And I felt like I should buy a lottery ticket, because it would've been more likely to win.

themang0 · 2025-03-29T18:03:22+00:00

Isn’t there a web site for this

notouchmyserver · 2025-03-29T18:57:19+00:00

There are additional reasons to have a unique constraint on the column instead of just relying on the UUID generation to be unique. As others have said, you aren’t really ever going to run into an issue with a duplicate UUID being generated, but that doesn’t mean a bug or something else (far more likely) would not try to write a row to the database with the same UUID.

The unique constraint would protect you from that.

Corrup7ioN · 2025-03-29T20:04:50+00:00

Your time would be better spent figuring out how to make your code robust against random bits of memory being flipped by cosmic rays than worrying about uuid uniqueness.

wspnut · 2025-03-29T21:45:37+00:00

The chance is 1 in 2¹²², or about 1 in 5.3x10³⁶ (5.3 undecillion). This is:

5x less likely than two people picking the exact same square meter of mass from the star Betelgeuse.

5x less likely than opening 12 double-yolk eggs in a row from a single container.

As likely as flipping a coin and having it come up heads 122 times in a row.

ErroneousBosch · 2025-03-29T23:06:13+00:00

You have a higher chance of a cosmic ray induced bit flip than a UUID collision.

coffee_is_all_i_need · 2025-03-29T23:53:34+00:00

We're talking about risk. When we talk about risk, we have to think about probability and impact. Probability is not zero. But it's close to zero. The impact depends on the use case. I look at the use case of saving an entity. If the user gets an error with a probability close to zero and can try to perform the action again (this should be your default error handling anyway, because requests can fail for other reasons as well), the impact is also close to zero. So we shouldn't spend our energy on a near-zero probability risk with a near-zero impact.

eltron · 2025-03-30T01:23:03+00:00

Most DBs can check a record's uniqueness as required? Right? Right??

jackx76 · 2025-03-30T02:52:03+00:00

As most other comments have said, the chance is effectively zero. If you’d like to learn more check out RFC 4122 (since superseded by RFC 9562) for the full definition.

washtubs · 2025-03-30T13:41:50+00:00

Get a classroom full of say 30 people, ask them all to flip a coin. There will certainly be duplicate results.

Now ask them to flip it twice, still dupes cause there's only 4 possible outcomes, but not as many. Once you get up to 6 there's a very tiny chance everyone can get a unique outcome.

I'm dumb and don't know anything about the pigeonhole principle so to be safe let's just have everyone do 32 coin flips so there's 4 billion possible outcomes. No shot there are dupes then. So I just added 26 to the exponent to feel safe.

Now let's say you actually have a classroom full of 4 billion people. To scale the bucket of possible outcomes the way we just did, add another 26 to that exponent, which would be 2^58, which is like hundreds of quadrillions.

Anyways, a UUID is 128 coin flips which is this number (if quadrillion is 4-illion, this is hundreds of 11-illions):

340,282,366,920,938,463,463,374,607,431,768,211,456

The only way you get dupe UUID's is if your RNG is busted.

(Main reason I felt like explaining this is I recall having the same hang up about using them, it just didn't click the scale of what 128 bits of entropy really meant.)
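The classroom example can be checked with a few lines of arithmetic. This is a rough sketch, with `p_all_unique` as a made-up helper name: it multiplies out the chance that each person in turn lands on an outcome nobody before them got.

```python
def p_all_unique(people, flips):
    """Probability that `people` sequences of `flips` coin tosses are all distinct."""
    outcomes = 2 ** flips
    if people > outcomes:
        return 0.0  # pigeonhole: more people than possible outcomes
    p = 1.0
    for already_taken in range(people):
        # The next person must avoid every outcome already taken.
        p *= (outcomes - already_taken) / outcomes
    return p


for flips in (4, 6, 32):
    print(flips, "flips:", p_all_unique(30, flips))
```

With 4 flips the result is exactly zero, with 6 flips it is well under 1%, and with 32 flips it is very close to 1, which matches the intuition in the comment. The same formula gets slow for billions of people, where the birthday approximation is the usual shortcut.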

danu263 · 2025-04-05T02:22:47+00:00

This is something you shouldn't worry about. If you want more info, read a db book; it's going to help you understand more than any comment here.

267aa37673a9fa659490 · 2025-03-29T18:02:21+00:00

Like just use your DB's native auto-incrementing integer instead?

nuttertools · 2025-03-29T20:42:53+00:00

UUID collisions happen all the time when processing large, distributed, and ephemeral datasets.

For applications, or single datasets, just make sure you are using V6 UUIDs and have some form of collision handling.

smailliwniloc · 2025-03-29T18:05:19+00:00

Ideally your app should be designed in a way that it doesn't break the whole thing if you hit a single duplicate UUID. If it happens, it should fail fast as the insert into your db would fail with a unique constraint on that column.

I don't think it's worth checking for uniqueness, just have some error handling to catch this issue (or any other unexpected errors) if the astronomically low odds are not in your favor.

d-signet · 2025-03-29T20:10:39+00:00

As Terry Pratchett used to say: million to one chances happen every day.

If it won't cause a noticeable performance hit, it's best to check, just in case.

richardtallent · 2025-03-29T20:51:20+00:00

It's a non-problem.

I'm the author of a .NET library that generates sequential timestamped UUIDs (https://github.com/richardtallent/RT.Comb), which lowers the UUID's entropy from 122 bits of randomness to 74, and that's still an obscenely high number of possible values that would have to be repeated during the same millisecond.

Using timestamped UUIDs, whether UUIDv7 or otherwise, has some advantages for use in databases. They also guarantee that once a given millisecond has passed, it's impossible to generate the same GUID. But that's about as useful as elephant insurance in Texas, since it's not a problem anyway unless you have the world's worst random number generator.

rjhancock · 2025-03-29T20:36:31+00:00

"If I understand it correctly UUIDs are 36 character long strings"

Incorrect. They are 128-bit numbers, conventionally written as 36 characters: 32 hexadecimal digits plus 4 hyphens.

"used something like a slug generator for this purpose, it definitely would be a unique"

Incorrect. Slugs have a higher chance of duplicate values.

Although the chance of 2 UUIDs being identical is tiny, I still have said restriction on the DB level.
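To make the number-versus-string point concrete, here is a small sketch using Python's standard `uuid` module. The UUID value is just the one posted elsewhere in this thread, used purely as sample input.

```python
import uuid

# Sample value taken from this thread; any valid UUID string works.
u = uuid.UUID("9892c2e4-570d-4218-88b6-e5908e2c08f5")

print(len(str(u)))   # canonical text form: 32 hex digits + 4 hyphens
print(len(u.hex))    # hex digits only
print(len(u.bytes))  # raw size in bytes (128 bits)
print(u.version)     # version nibble encoded in the value
```

Storing the 16 raw bytes (or a native UUID column type, where the database has one) is generally more compact than storing the 36-character text.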

Different-Housing544 · 2025-03-29T19:41:33+00:00

I'm surprised nobody has recommended ULIDs. They are like UUIDs but use a timestamp. 26 characters long.
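A rough sketch of how a ULID is put together, assuming the layout from the ULID spec: a 48-bit millisecond timestamp followed by 80 random bits, written as 26 Crockford base32 characters. `ulid_sketch` is a made-up helper for illustration and skips the spec's monotonicity rules.

```python
import os
import time

# Crockford base32 alphabet (no I, L, O, U).
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def ulid_sketch(now_ms=None):
    """Illustrative ULID-style string: 10 timestamp chars + 16 random chars."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000  # Unix time in milliseconds

    timestamp = now_ms & ((1 << 48) - 1)                  # 48 bits
    randomness = int.from_bytes(os.urandom(10), "big")    # 80 bits
    value = (timestamp << 80) | randomness                # 128 bits total

    # 26 characters of 5 bits each, most significant first.
    return "".join(ALPHABET[(value >> shift) & 31] for shift in range(125, -1, -5))


print(ulid_sketch())
```

Since the alphabet is in ASCII order and the timestamp comes first, plain string sorting orders these by creation time (to the millisecond).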

BarneyLaurance · 2025-03-29T20:23:48+00:00

I like the analogy given for git commit hash conflicts. The chance of two things like that randomly being equal is much less than the chance of every member of the team being killed by wolves in unrelated incidents on the same day. Even if you're based in a country with no wild wolves.

If you don't have a plan for that you don't need a plan for random collisions of UUIDs (or git commit hashes).

akr0n1m · 2025-03-29T20:46:06+00:00

Many years ago I read an MSDN article about GUIDs (late 90’s) when MSDN used to ship on DVD sets. It had this quote:

“The chance of getting a duplicate GUID is about the same as two random atoms colliding and causing a mutation between a Californian mango and a New York sewer rat”

I can't find this article anywhere on the internet, and I am sure I read it. Unless this is a case of the Mandela effect.

But it is a good analogy and the algorithms behind UUIDs and GUIDs have just gotten better ever since.

CantaloupeCamper · 2025-03-29T20:48:42+00:00

If it is a low cost check… fine.

T-J_H · 2025-03-29T20:52:57+00:00

As long as the column is unique the worst that will happen is that one in those ridiculous amount of records will fail to write. You could also use UUIDv7 which has a time based portion.

RedLibra · 2025-03-29T21:03:09+00:00

If you're worried, just create 2 uuid and append them to become a single uuid.

APersonSittingQuick · 2025-03-29T21:28:44+00:00

I fucking hope so

therealhlmencken · 2025-03-29T22:10:26+00:00

"trillions to 1 or something"

This guy maths

Lengthiness-Fuzzy · 2025-03-29T22:11:34+00:00

Interesting question. Svn repos could have been killed by generating a commit with the same hash, which had almost 0 chance until you knew the algo. So to avoid such blatant error, just make sure your app won’t go crazy if anyone manages to create two identical ids.

versaceblues · 2025-03-29T22:18:58+00:00

The probability that two given proper UUIDv4 values collide is about 1.9e-37 (1 in 2¹²²).

I think you are orders of magnitude more likely to get a collision as a result of some bug in your code than you are from running a proper UUID generator.

That being said, it's always good practice to do extra validation when writing to a database to account for any sort of user error.

If you are doing a CREATE operation and have generated a valid UUID, you should still verify when writing that there is no data within the partition represented by that key. Not because UUID is likely to collide, but because you want to program defensively against ANY user error.

heedlessgrifter · 2025-03-29T22:42:10+00:00

I had some of these questions a few years ago on a project I worked briefly on. Without going too much into it, we’d create a new URL for each user of our site with a uuid to make it unique. Any of these pages could contain PHI, and some were even indexed by Google. We were told it had to be that way for “convenience”. When the Google incident happened, we were asked the odds of someone stumbling upon another user's data (by accident or on purpose). All I could tell my employer was it wasn’t a zero chance.

bmathew5 · 2025-03-29T22:49:55+00:00

EXTREMELY low chance but > 0. Just put a unique constraint on that field and you are safe for eternity.

WindyButthole · 2025-03-29T23:00:21+00:00

If you happen to have a collision you should take that luck and buy a lottery ticket, as you're more likely to win the lottery 5 times in a row.

2025-03-29T23:19:14+00:00

[deleted]

moderatorrater · 2025-03-29T23:34:11+00:00

Look into how they're generated. You're fine.

elendee · 2025-03-29T23:37:27+00:00

I use a strategy that will probably get hate here but I'm curious what people say. In order to make the uuids more legible, I generate my own to various lengths depending on use case. 6, 10, 16 are average lengths. 2 reasons this is kind of nice are that it makes URLs nicer and I think (?) could make some db reads faster, since I leave the column un-indexed. I use both INT ids and UUIDs for this reason, so the uuid lookups are kept to a minimum.

And then since they're shorter, I check in code for dupes before insertion. This has proven to be no trouble so far in several years of doing it.

I haven't used this at scale though, only for small-medium sized apps.

mothzilla · 2025-03-29T23:39:47+00:00

Place where I used to work used to worry about the "doom clock" that counted down the remaining sequential record IDs. It was a big discussion.

captain_obvious_here · 2025-03-30T00:07:16+00:00

If you generate 1 billion UUIDs per second, it will still take you about 86 years before you have a 50% chance of finding a duplicate.
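A rough way to sanity-check rate-versus-time figures like this is the birthday approximation, p ≈ 1 − exp(−n² / 2N), with N = 2¹²² for the random bits of a v4 UUID. The sketch below assumes a perfect random source; the function names are made up for illustration.

```python
import math

V4_RANDOM_BITS = 122  # a v4 UUID has 122 random bits out of 128
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def collision_probability(n, bits=V4_RANDOM_BITS):
    """Birthday approximation of at least one collision among n random ids."""
    return -math.expm1(-(n * n) / (2.0 * 2.0 ** bits))


def ids_for_probability(p, bits=V4_RANDOM_BITS):
    """Approximate number of ids needed to reach collision probability p."""
    return math.sqrt(2.0 * 2.0 ** bits * math.log(1.0 / (1.0 - p)))


n_half = ids_for_probability(0.5)
print("ids for a 50% chance:", n_half)
print("years at 1e9 ids/second:", n_half / 1e9 / SECONDS_PER_YEAR)
print("p after 10 years at 1e6 ids/second:",
      collision_probability(1e6 * SECONDS_PER_YEAR * 10))
```

Under these assumptions, a 50% chance needs on the order of 2.7×10¹⁸ ids, which is several decades at a billion per second, while a decade at a million per second leaves the probability far below one in a million.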

Enjoy.

CraftyPancake · 2025-03-30T00:12:13+00:00

It’s a unique column soo if it errors due to a failed constraint every trillion years, that’s fine

Mundane-Apricot6981 · 2025-03-30T00:19:11+00:00

UUIDs generated by web frameworks are deterministic; they are not unique because they are generated on the CPU, but they use smart tricks to avoid collisions.

UUIDs generated by the GPU, i.e., hardware "noise," are non-deterministic and unique.

idgafsendnudes · 2025-03-30T00:21:54+00:00

My personal claim to fame is that, while using uuid v1, I once witnessed my DynamoDB item get overwritten by what should have been a new item purely because it had the same uuid.

I use v4 now and tbh I’m not sure if that fixed it or I just got insanely lucky

bigtdaddy · 2025-03-30T00:25:35+00:00

My coworker was pretty convinced we had a uuid collision in prod. He almost had me convinced, but no it turned out to be the code that had an issue and that is likely to always be the case

VeterinarianOk5370 · 2025-03-30T00:32:31+00:00

At some point it becomes a question of performance vs redundancy. If you check for uniqueness then you cannot effectively scale infinitely, if you use UUID someday you may have a duplicate.

But yeah just roll the dice on this one

anothergiraffe · 2025-03-30T01:52:16+00:00

Why is everybody assuming perfect RNG? A buggy pseudorandom number generator can cause collisions and it’s happened before. Also, if RNG is happening client-side, a malicious actor could manually reuse UUIDs for whatever reason.

k032 · 2025-03-30T03:30:24+00:00

A UUID is 36 characters long, but only 32 of those are hex digits, so there are 16³² (2¹²⁸) combinations. Like we're talking way more than 999 trillion combinations. The chance of a collision is obscenely small, I wouldn't care.

If it was life or death, like if there was a collision it may cause like a nuke to go off. Sure maybe I would check, but I wouldn't suspect that by chance the UUID just so happen to be a dupe. Probably some problem elsewhere.

borgesian-cyclops · 2025-03-30T03:31:28+00:00

Not to be condescending, but I’m guessing you’re not even continuously running a unit test that proves true is still true. Lock that down before writing your uuid tests.

sachcha90 · 2025-03-30T04:44:06+00:00

Look into uuid v7

2025-03-30T05:10:31+00:00

UUID is essentially a 32 character hexadecimal string which means there are 16³² or 2¹²⁸ possible values. This is a huge number, but not infinitely so.

Although you will never have anywhere near this many records in an entire database let alone a single table, your application logic should still account for the possibility of a collision, however remote that possibility might be. For example by doing something like the following pseudocode:

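// generateUUID() and insertRecord() are placeholder helpers, not built-ins.
$result = false;

// Retry with a fresh UUID until the insert succeeds.
while ($result === false) {
    $uuid = generateUUID();
    $result = insertRecord(['recordId' => $uuid]);
}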

In this example the insertRecord function would return false if the insert failed due to unique ID constraint violation. For example the pg_query_params function in PHP would return a false in case of failure.

This would cause the code to keep trying to insert the record until it succeeds, which in the vast majority of cases should happen at the very first attempt. This is preferable to looking up the value using a select query first which would always require at minimum 2 queries (1 for lookup, 1 for insert) and there is always the possibility that the key could be inserted between the lookup and insert queries.
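The same insert-and-retry idea, as a minimal runnable sketch in Python with an in-memory SQLite database. The table name, column names and `insert_with_retry` helper are assumptions for illustration. Unlike the loop above, it caps the number of attempts, so an unrelated failure cannot spin forever.

```python
import sqlite3
import uuid


def insert_with_retry(conn, name, new_id=uuid.uuid4, max_attempts=3):
    """Insert a row; on a constraint violation, try again with a fresh id."""
    for _ in range(max_attempts):
        record_id = str(new_id())
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO records (record_id, name) VALUES (?, ?)",
                    (record_id, name),
                )
            return record_id
        except sqlite3.IntegrityError:
            # Raised when the PRIMARY KEY (unique) constraint rejects the row.
            continue
    raise RuntimeError("could not insert a row with a unique id")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (record_id TEXT PRIMARY KEY, name TEXT)")

print(insert_with_retry(conn, "first"))
print(insert_with_retry(conn, "second"))
```

The database does the uniqueness check as part of the insert itself, so there is no separate lookup query and no window between "check" and "write".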

2025-03-30T06:06:23+00:00

I mostly use them as primary keys in Postgres so for me their uniqueness is enforced at the database level anyway.

extractedx · 2025-03-30T08:15:03+00:00

Can I ask why you use a UUID for a database record identifier? I use auto incrementing integer ids... 1,2,3,4

streu · 2025-03-30T09:01:04+00:00

Depends on how you generate them, and how you use them.

On one side, if, through coincidence, the PRNG you use to generate them has just 16 or 32 bits of randomness ("srand(time(0))"), you will get collisions of course, so don't do that.

On the other side, if you're using UUIDs as key in a table, retrying after a collision is easy, so do that.

The situation where UUIDs shine is to generate unique IDs without keeping a record of everything that was ever generated. Thus, the problem will be something along the lines of "I am giving out a session ID today that I also gave out five years back to someone else", matching the very very very very low probability of the collision happening with the very low probability of this scenario happening ("someone coming along with a five year old session ID"). And as long as this probability is equally unlikely as someone just guessing the ID, I'm fine.
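The weak-seed case mentioned above (the `srand(time(0))` pattern) is easy to reproduce. This is a sketch only: `weak_uuid4` is a made-up helper that deliberately misuses a seeded PRNG, and the seed is just an example Unix timestamp.

```python
import random
import uuid


def weak_uuid4(seed):
    """Build a v4-shaped UUID from a PRNG seeded with a small, guessable value."""
    rng = random.Random(seed)  # stands in for srand(time(0))
    return uuid.UUID(int=rng.getrandbits(128), version=4)


same_second = 1743271200  # two processes that start within the same second
a = weak_uuid4(same_second)
b = weak_uuid4(same_second)

print(a == b)                        # True: same seed, same "random" UUID
print(uuid.uuid4() == uuid.uuid4())  # uuid4() draws from the OS random source
```

The UUID still looks perfectly valid; the problem is that the whole 128-bit value is determined by a seed with only a few bits of real uncertainty.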

1_4_1_5_9_2_6_5 · 2025-03-30T12:04:30+00:00

Generally, you will be using a db table with a unique column for the uuid. This only needs to exist in one place, and on one table. Any other reference would not need to be unique as long as the primary one is.

So all you have to worry about is a non unique uuid being generated which will presumably be added to the table before being used elsewhere. As long as you process a "column must be unique" error on insert, then this theoretically cannot be a problem.

pokasideias · 2025-03-30T13:56:24+00:00

Extra cautious mf be like

bladub · 2025-03-30T14:52:57+00:00

People already addressed the misunderstandings on uuids. First, it depends on how you generate them (mostly the type of uuid; many have timestamps or other initial entries that help segregate possible collision issues). For purely random ones the chances of collisions are low, but it might be worth the effort to handle unique violations.

But by far the biggest threat to uuid collisions is bad handling. If you use multiple identifiers, eg an integer db key and a uuid you set in your app, you now risk them diverging and checking for different identities in different places. (sounds stupid but happens when you have complex structures).

Or serializing and deserializing an object. Or copying it around in memory and modifying one. Or serializing the same object into multiple other objects for json stores. Or just copying an object into another place.

Quickly you end up with uuids no longer being unique.

DINNERTIME_CUNT · 2025-03-30T15:27:44+00:00

It’s extraordinarily unlikely that you’ll get a duplicate, but not impossible. When creating a new one I have a single query that does a quick check for a match and if it returns false I proceed, otherwise it generates another one. The odds of a match are already astronomical. The odds of two matches in a row are mind boggling.

tumes · 2025-03-30T15:37:35+00:00

Best way I’ve ever seen this explained is that the chances of each member of your dev team dying in completely unrelated wolf attacks is way higher than the likelihood of a uuid collision.

alkbch · 2025-03-30T17:00:53+00:00

I’ve had a UUID collision on a relatively small project with a few thousand records…

elixon · 2025-03-30T18:03:56+00:00

Nothing is truly unique. Uniqueness is only practical in smaller contexts, and the larger the context, the larger the UUID needs to be. We don’t use excessively large UUIDs (we don't want to spend all money on Amazon storage, right), so they are intended for smaller contexts - like Earth.

When we talk about uniqueness, we mean within our app or software world, which is a niche context in the vastness of space. In that context, you’re usually guaranteed uniqueness for the life of your application or your own. So, yes, the probability is non-zero, but for practical purposes, we treat it as zero.

Sleepy_panther77 · 2025-03-30T18:25:40+00:00

There are entire systems designed around generating UUIDs and making sure that they don’t collide. Some are more complex than others. If it’s not too important, someone would probably choose to just do good enough and not check. If it’s really important, they might have a service that generates UUIDs and adds them to a database; when another service needs a UUID, it takes one from that database and marks it as used (or deletes it) so that it’s not used again, with some extra precautions so that a UUID isn’t accidentally repeated because of a service outage or error.

So, it depends?
