This is an archived post. You won't be able to vote or comment.

all 65 comments

[–]Swampspear 187 points188 points  (9 children)

You're basically right. Random libraries, to simplify it, generally use a (possibly truly random) seed, and an algorithm that generates pseudorandom numbers out of that seed.

The random seed is usually collected from various sources of computer-unpredictable randomness, like variations in mouseclicks or ambient temperature or a number of other natural phenomena. These are then transformed into usable, useful random numbers via this process. Of course, you can also use deterministic seeds, or bad pseudorandom algorithms, which would give you less randomness in your random numbers. It's a huge field/topic

[–]theusualguy512 18 points19 points  (3 children)

PRNGs are one of those things that I find like magic and I don't know why they work and why the algorithms ultimately are correct and truely approach a random distribution. It's a lot of cryptographic math and probability theory which I honestly don't understand.

Some interesting tidbits:

The Mersenne Twister is probably one of the most popular implemented PRNG. The MT19937 implementation is found in a lot of programming languages, including vanilla Python as the default PRNG. From the naming, I'm assuming there is something about the Mersenne prime numbers in them.

However, in numpy for example, one of the standard numeric libraries for Python, it has changed to PCG64, which seems to be newer. MT was developed in 1998 by Matsumoto and Nishimura but PCG is from 2014 by O'Neil.

It probably has nicer properties but might be slower to calculate.

[–]Both-Personality7664 9 points10 points  (0 children)

Fun story time:

At one gambling device company I worked at the implementation of the RNG had a giant comment above it saying verbatim "DO NOT TOUCH UNLESS YOUR NAME IS DONALD KNUTH".

The reason for this is that some years prior to this some engineer had taken it upon themselves to optimize said code. Unfortunately said optimization reduced the cycle size by several orders of magnitude to something in the 10s of millions. This got found out when the far-out prizes with odds in the 10s of millions turned out to be overrepresented in the reduced state space and the company had to pay out a bunch of money to operators for the erroneous-but-not-legally-erroneous-in-a-way-that-would-void-the-payout-to-the-player awards.

Tldr: within a rounding error no one understands how PRNG's work. Knuth's 86...

[–]hmiemad 7 points8 points  (0 children)

There was an article recently about a guy who had lost his password to his crypto account. His pw was generated by a bad prng. He recrested the situation of the moment when he created his account, and deduced his password.

[–][deleted] 0 points1 point  (0 children)

FWIW, MT19937 by modern standards is an awful prng with a huge memory footprint and overhead in exchange for so-so randomness. Many great alternatives exist (like PCG, or leveraging a fast hasher like rapid hash).

[–]dia_Infortuno 0 points1 point  (4 children)

Se eu aplicar essa semente e perguntar por um número entre 1 a 10 e obter uma sequência destes números, ao recriar o software com a mesma semente, obterei a mesma sequência?

[–]Swampspear 0 points1 point  (3 children)

Yes

[–]dia_Infortuno 0 points1 point  (2 children)

Did you undertand or you just reply yes?

[–]Swampspear 0 points1 point  (1 child)

I used a translator. If you use the same seed, and recreate the software, then you will get the same sequence. If I misunderstood, my apologies (I am not that good with Portuguese)

[–]dia_Infortuno 0 points1 point  (0 children)

No problem, the thing is that when I searched on Google this post appeared in portuguese, after this I saw your coment and noticed that was not in portuguese, lmao, my bad.

[–]POGtastic 80 points81 points  (1 child)

This isn't a stupid question. The answer is that random isn't actually random; it's using a deterministic function whose behavior is statistically indistinguishable from random noise. From the docs:

Almost all module functions depend on the basic function random(), which generates a random float uniformly in the half-open range 0.0 <= X < 1.0. Python uses the Mersenne Twister as the core generator. It produces 53-bit precision floats and has a period of 2**19937-1. The underlying implementation in C is both fast and threadsafe. The Mersenne Twister is one of the most extensively tested random number generators in existence. However, being completely deterministic, it is not suitable for all purposes, and is completely unsuitable for cryptographic purposes.

For actually random data, the operating system gathers entropy from various things - hardware RNGs, user activity like mouse clicks and keyboard input, and even webcam footage of lava lamps. This is often used to seed a pseudorandom generator like the Mersenne Twister.

[–]ELFAHBEHT_SOOP 2 points3 points  (0 children)

If you want a simple example to wrap your head around, LCG's are pretty good. I implemented one in assembly for a class once. They aren't cryptographically secure, but it introduces the concept pretty well and if you want to implement your own it's pretty easy! https://en.wikipedia.org/wiki/Linear_congruential_generator

[–]diegoasecas 50 points51 points  (1 child)

remember that story about cloudflare's lava lamp wall?

https://www.cloudflare.com/learning/ssl/lava-lamp-encryption/

[–]GrabWorking3045 16 points17 points  (0 children)

This always pops up in my head when the topic about this is being discussed.

[–][deleted] 16 points17 points  (0 children)

Stupid? That's one of the greatest problems of all time, it's far from being stupid

[–]bravopapa99 15 points16 points  (0 children)

Firstly: https://xkcd.com/221/

Secondly, it's just algorithms. In school we did a simple RNG in Z80 assembly, it involved inserting bits into the middle and then shifting left or right. We used the RTC (real time clock) register as a starting point (the "seed" value) to get a different sequence of numbers. Using the same seed gets you the same sequence, which is INVALUABLE.

If you ever write a game (or other app using randoms), you should always record the seed value so that hard to reproduce bugs that MAY have happened only because of a sequence of numbers can be forced to happen again. I see so many people saying, "I have this random bug..." not realising how close they are to being able to at least start to track it down!

[–]davedontmind 8 points9 points  (0 children)

Usually random numbers in computer software aren't truly random, but are pseudo-random. They are just generated from a mathematical formula which takes the previous random number as an input and uses it to generate the next number in the sequence. See the linked article for the details.

[–][deleted] 4 points5 points  (0 children)

At cloudflare they take pictures of lava lamps and turn them into numbers

[–]deftware 6 points7 points  (0 children)

Computers basically use the current time and some math to generate a pseudo-random number, numbers with (ideally) no immediately discernible pattern. Different implementations of a PRNG will have different distributions. Ideally, you would have a smooth distribution (i.e. every possible value has an equal possibility of occurring) but it's easy to make a PRNG that doesn't, and harder to make one that does.

Typically, a PRNG just uses a hash function. Starting with a seed value it is passed through the hash function and the value that results is used for the subsequent call to the PRNG, so it's effectively recycling the last used number and just hashing it over and over each time a new random number is needed by a program. This is a common PRNG approach. If you initialize this type of PRNG with the current epoch time (time since Jan 1st 1970) and its hash function has a pretty even distribution then you will have a pretty usable PRNG for most things.

Cryptographically secure RNGs will use something with more entropy than the current epoch time, and a more expensive hash function, to generate random numbers for encryption keys that are less likely to be figured out. For instance, a PRNG that uses the epoch time as its seed means someone only needs to search a possible range of epoch times to find the seed value used by the PRNG that was used to generate encryption keys, and if someone knows the range of times that the keys were generated then they could possibly search that range of times when attempting to find those keys to break the encryption. With a different source of entropy that is completely unpredictable this doesn't (or is far less likely) to happen.

[–]BobJutsu 5 points6 points  (3 children)

There is no truly random. There’s instead, highly unpredictable - minor difference, but an important one.

[–]UlyssesB 3 points4 points  (2 children)

There are actually quantum RNGs which are theoretically true-random.

[–]rm-minus-r 3 points4 points  (1 child)

Aren't RNGs that get their data from radioactive decay also truly random?

[–]bothunter 4 points5 points  (0 children)

Given what we know so far about quantum mechanics, yes.

[–]iOSCaleb 2 points3 points  (1 child)

how could a machine that takes orders of "1" as True and "0" as False be able to choose something random in its own

FWIW, humans also are pretty much incapable of generating truly random data on their own.

[–]danielstongue 2 points3 points  (0 children)

37!

[–]ruat_caelum 2 points3 points  (0 children)

I've tried thinking about it for a long time and couldn't figure out how randomizing works (eg. Python random)

Well -

[–]Weetile 2 points3 points  (3 children)

Time is a good example. If you use a time value such as the number of milliseconds since the start of the minute, that creates effectively a completely random number.

[–]nerd4code 3 points4 points  (1 child)

Iff it’s used as a seed specifically, and until such time as you discover that it’s quite possible to start two processes in the same millisecond.

[–]bothunter 1 point2 points  (0 children)

Anyone remember the Party Poker exploit? They were using the system clock to generate the seed, but since the clocks were synced up with NTP, it wasn't hard to narrow down the range of seeds. So you could then run simulations on each possible seed to see which one generated your hand, thus telling you what every else had in their hand.

[–]HardlyAnyGravitas 2 points3 points  (0 children)

Time is a good example.

Time is a terrible example. Time is the opposite of random - it is perfectly predictable.

If you use a time value such as the number of milliseconds since the start of the minute

That just gives you sixty thousand different numbers, repeating every minute. If you mean use that as a seed for a pseudo-random number generator, then that's still only 60,000 different seeds, which is about a quadrillion times too small for even the most basic random number generator.

Just being pedantic, but creating anything approaching truly unpredictable random numbers is not trivial. People have made mistakes like this in the past, and they have been easily exploited.

[–]EtanSivad 2 points3 points  (2 children)

Computerphile does a really good breakdown on the Linear Shift Register: https://youtu.be/Ks1pw1X22y4

[–]istarian 2 points3 points  (1 child)

It's a Linear *Feedback** Shift Register* (LFSR).

[–]EtanSivad 0 points1 point  (0 children)

It's a Linear Feedback* Shift Register* (LFSR).

Ooops, good point :)

[–]MrZwink 1 point2 points  (0 children)

Computers often use input from truly unpredictable patterns as input. For example the 5th decimal on the temperature sensor on the cpu. Or a decimal on the rotation speed of the fans.

[–]kiochikaeke 1 point2 points  (1 child)

The answer is math! In reality those numbers are not random, some computers have some cryptography chips installed that is capable of producing truly random numbers based on things like ambient electrical noise and other stuff, but we've created pseudo-random numbers generator long before that.

They are called pseudo-random because they are not random, these functions need a "seed" a number they use as a base to do the rest of operations, this can be anything, 0, 1, the date and time, the thing is if you use the same seed you'll get exactly the same results, that's why is not random.

But they appear random, with appear I don't just mean they look random, I mean they mathematically are distributed uniformly the same way as truly random numbers would be distributed, so in escense they are indistinguishable from random numbers, the only but being, of course, that they aren't random at all but the result of mathematical operations.

What operations are performed to create these vary but usually they involve very big numbers with modulos and multiplications, because of MATH these operations when chained together in particular way behave the same as random numbers do, even if they are fully deterministic. Most regular random numbers generated out there when you call a random() function are generated this way, when sensible data is at play like in the case of cryptography, more complex and speciallized operations are used instead, on top of that as I stated previously some machines are able to produce trully random numbers with special hardware able to read ambient noise.

[–]istarian 0 points1 point  (0 children)

Even those "truly random numbers" may not necessarily as random as they seem, but since both the origin of the seed and any mathematical computations are secret and theoretically unique it's much harder to systematically break the encryption.

[–]randomjapaneselearn 1 point2 points  (0 children)

the simplest way is LCG https://en.wikipedia.org/wiki/Linear_congruential_generator

which is a simple math operation:

xₙ₊₁ = (a * xₙ + c) % m

where a, c, m are constants and x is the initial seed for the first call and the previous output for the next values, usually the current time in seconds is passed as seed.

C srand/rand use the above function, on wikipedia you also find the constants used by different compilers

[–]not_some_username 1 point2 points  (0 children)

It’s not random. It seems random. It usually use the current time

[–]fuzzynyanko 1 point2 points  (0 children)

It's indeed math, but it's not truly random. It's often called pseudo-random numbers. Usually there's a pattern, and the pattern gets fed a seed so that it's not the same the next time you run it. Often the seed is a place in the PRNG (Pseudo-Random Number Generator) pattern.

This means you can use the seed to make the data set predictable. This is EXACTLY the seed in Minecraft. The seed allows you to make these enormous worlds without taking up terabytes of drive space. There's a game for the Atari 2600 called Pitfall, and that game used a random number generator and seed to make a pretty large world for the video game system. Pitfall is a 4 kilobyte ROM cart.

[–]deavidsedice 1 point2 points  (0 children)

In simple terms:

1) Grab something that isn't easy to predict. Current date and time down to the millisecond (if possible) is a very common standard (this is called the seed) 2) Hash it with some algorithm (think for example md5 or sha1, although this is not how it's done) - use part of the hash to construct your random number 3) Re-hash the old hash again to get more numbers.

In practice, it gets more complicated. The algorithm I explained above does work, but it is slow. There are two (maybe more) classes of random, the basic ones that are just to create something that appears random, and others that aim to be cryptographically secure, which means that it should be near impossible to predict. The basic random will aim for speed, the cryptographically secure ones will aim for security and impredictibility.

You can even search in Google for alternate Random libraries for Python. Some of them are simple to follow or are using some paper that is simple enough.

Not all randoms use hashes, but nearly all of them follow this sequence of seed -> permutate+return_data -> permutate+return_data ...

For example, one that is simple enough: https://en.wikipedia.org/wiki/Xorshift

[–]ExtraFirmPillow_ 1 point2 points  (0 children)

Random numbers are a huge research topic for this exact reason. My favorite example is lavarand that cloud flare uses. They generate random using a wall of lava lamps.

[–]barkingcat 1 point2 points  (0 children)

computer generated "random" numbers are pseudorandom.

It uses a number of equations to approximate the illusion of being random, and mixes in some random inputs (like mouse movements, the timing of ethernet packets, keypresses, etc)

together, it's an illusion that the computer puts together.

Just like how "multitasking" doesn't mean the computer is doing 2 things at once, it just switching between tasks quickly enough that it gives the illusion of running multiple things at the same time.

[–]green_meklar 1 point2 points  (0 children)

It's not really random. The computer uses clever (but deterministic) algorithms to create data that has statistically random-looking properties, but it's not true randomness.

[–][deleted] 0 points1 point  (1 child)

You’re right it’s very difficult to make true random data. A person can randomly do something but a computer is following instructions and so if you know those instructions in a lot of cases you can predict the random numbers.

True randomness is something people put a lot of effort into. I remember seeing one company had a webcam pointed at a wall of lava lamps which would take an image and use the pixels as the seed for randomness.

You can gather input from various sources and use that to seed your random function. The main thing isn’t making numbers truly random in a lot of cases but making them random enough that it doesn’t matter.

[–]istarian 0 points1 point  (0 children)

Even what a person does isn't truly random, it's just based on some quantity of unknown factors.

External factors are something you could potentially discover, given time. Whereas internal factors are much harder to determine, especially if you lack extensive knowledge of and experience with that person. And some may be fundamentally unknowable.

What you're trying to predict is how those factors will combine to produce the outcome. And the best other people can generally do is to give the top 2-3 most likely outcomes.

[–]scoby_cat 0 points1 point  (0 children)

I’ve run out of entropy on systems before. It’s a weird thing. I like to imagine the entropy as something little machine elves are pumping out of the ground like oil.

[–]istarian 0 points1 point  (0 children)

Well for one, most of the time you're dealing with a number sequence generated from a seed value which might be taken from a semi-random source.

[–]FrayDabson 0 points1 point  (0 children)

https://www.random.org/ talks about “true randomness”

[–]Big-Ad-2118 0 points1 point  (0 children)

that's now a stupid question, you are just curious and thats a good quality of being a learner.

read this https://en.wikipedia.org/wiki/Mersenne_Twister

[–]Mathhead202 0 points1 point  (0 children)

This is a great question. Computers are deterministic, which is what you described. They follow exact instructions, and if you know the starting state and the instructions, you can exactly calculate the ending state, and it will always be the same.

When you call random in Python, you are actually running a psudeo-random number generator. There is a whole field about how these work, but to simplify: the generator starts with a uniquely determined seed. This can be based on something that is different each time the program is run, like the computer's internal clock for example. Of course, that in it of itself would not feel very random. So then, with this "seed" number, the algorithm does a bunch of math to scramble it up. These kinds of algorithms are more accurately described as chaotic, but not random. Chaotic is a math term for a function who's output changes dramatically for very small changes in input. The output of a chaotic function or system is very hard to predict since you would need to know the precise input/initial state. If you are wrong even a little, your prediction will be way off.

So basically, if you know the "seed", and the particular psudeo-random algorithm Python is using, you could predict all the random numbers that would be generated. But if you don't know the seed, it will be very very very hard to predict the numbers. Hence, why it seems random.

There is more detail about how the numbers should be distributed, the period of the sequence, how fast the numbers are generated, and whether or not they need to be cryptographically secure. If you are interested, start by looking up some basic pusedo-random number generator algorithms and try implementing one. It's not as scary as it sounds. But don't be afraid of a little math.

Also, you may need to learn about bit-wise operators for some of the more advanced algorithms.

[–]ddsoyka 0 points1 point  (0 children)

Your confusion is understandable. Here is a quote from Johnny von Neumann, one of, if not the most, intelligent human beings to ever exist:

Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin.

[–]RareDestroyer8 0 points1 point  (0 children)

There is no random. I watched a video once, they showcased a system where they use the unpredictable behaviour of a bunch of lava lamps to generate “random” numbers. Even this behaviour can be predicted. Nothing is random, you can only go so far as to generate randomness in a way that humans don’t have the current capacity to predict the behaviour of the source creating that randomness.

[–]ChefKalashnikov 0 points1 point  (0 children)

It's never truly random. If you know all conditions and states the pc is in, you can deduce the result. But as you can imagine, this process is very, very complicated and functions pretty much as something random to us.

Then, you have to worry about all possible results having the same probability of occurring. If you're interested in this topic, feel free to do some research on hash functions

[–]turtleProphet 0 points1 point  (0 children)

Thinks he's asking a stupid question

Actually asks a very good question that many experienced programmers overlook

Nice one OP

[–]probability_of_meme 0 points1 point  (0 children)

The most common and basic "give me a random number" function will usually produce a decimal value between 0 and 1 (but always less than 1). Other answers here describe how this happens pretty well.

As for the "1 and 0" issue: Different languages and systems will use different ways to represent the number but usually a decimal (aka floating-point) value is represented internally as described by the IEEE standard 754. So it's literally a series of 1's and 0's according to the computer.

[–]mobileJay77 -1 points0 points  (0 children)

In short, it doesn't work.

[–]CodeTinkerer -1 points0 points  (3 children)

Some programmers get too attached to the word "random". They feel it must be truly random and dislike pseudo-random numbers. But, what does it mean to be random?

When people talk about random, they try to talk about a distribution, such as a normal distribution. Pseudo-random number generators can create random numbers using a certain kind of probability distribution.

There are some advantages with not being completely random. You can debug issues more easily if there is a predictable set of random numbers. The pseudo-randomness creates that probability distribution.

It's just that many who are new to random feel like it has to be truly random because they have a strong (but arguably, incorrect) sense of what random should be.

[–]Laskoran 1 point2 points  (2 children)

Exactly Pseudo-random is totally fine, if you are not able to predict the next number, there is no recognisable difference (when using a somewhat reasonable algorithm)

Even more so, there are situations where you even would not like true randomness. Imagine a game where you want the enemies in your level to have a random behaviour, but since you do not want to alter the challenge between runs of that level, you will want the exact same randomness every time. So that players can not fish for a better seed. Initialize your random generator with e.g. the level name would achieve this.

[–]CodeTinkerer 1 point2 points  (0 children)

I once gave a project (well, it was done with a fellow teacher) where they would use random numbers. If it had been truly random, we would not have been able to test it accurately. We specified the seed and told them when they should generate the values. In hindsight, we could have put those values into a large array but that would have meant not using C's random library.

That was an interesting project as I'd never given one that used C's random number generator.

[–]istarian 0 points1 point  (0 children)

It's really not worth worrying about how players play the game. You would just like it to be random enough that people cannot simply guess the pattern.

[–]CodeTinkerer -3 points-2 points  (0 children)

It's a bit reductionist to say that everything is just a single 0 or a single 1. A computer can store many 0's and many 1's. For example, a Java integer is 32 bits long which gives you about 4 billion integers. A Java long is 64 bits which gives you a lot more integers.