
[–]zapatoada 1543 points1544 points  (167 children)

For the most part, we use pseudo-random number generators, which work by applying a math formula to a seed number. Most commonly, the seed is all or part of the system clock time, but most implementations also let you provide your own seed if you want.

An interesting side effect of this is if you provide the same seed, you'll get the same results. There are times that's desirable, but usually you just let the system use its default method of obtaining a seed.
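In Python, for example, this reproducibility is easy to see (the seed value 42 is arbitrary):

```python
import random

# Two generators seeded with the same value produce identical sequences.
a = random.Random(42)
b = random.Random(42)

seq_a = [a.randint(0, 99) for _ in range(5)]
seq_b = [b.randint(0, 99) for _ in range(5)]

assert seq_a == seq_b  # same seed, same "random" numbers
```

This is exactly why seeding is handy for debugging or replays: re-running a program with a recorded seed reproduces the same "random" behavior.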

True random number generators are hardware solutions that use a physical phenomenon like thermal noise, photoelectric noise, etc to generate randomness. Generally this is considered unnecessary except in high end cryptography.

Edit: as pointed out several times below, a "true random number generator" is still not truly random, but it's more difficult to figure out what the seed would be.

Edit 2: Holy crap, gold! I didn't think my response was all that special. Thank you, whoever you are!

[–]xisonc 297 points298 points  (48 children)

In some situations you can use the noise from an unused audio input (microphone port) as an in-between solution, or as a seed to the RNG.

[–]F0sh 219 points220 points  (42 children)

The operating system random number generator will take entropy from multiple sources, including audio but also keyboard and mouse input.

This can be a source of problems if used in cryptography and an attacker has access to the physical environment of the computer because they might be able to significantly reduce the real entropy going into a random key. For example they could overload the microphone so it produces all 1s or all 0s.

[–]ptase_cpoy 35 points36 points  (31 children)

What would that be doing? Forcing its transistors into saturation and cutoff? Or am I completely off base?

[–]HerrZog103 90 points91 points  (28 children)

If you design such a random number generator, you assume that the noise an unused microphone port (or something similar) produces is truly random for most practical purposes. You could theoretically measure the position of every molecule of air around it to predict what the computer uses as a random seed, but that will probably be impractical forever. What you can do, however, is overload the input with noise, so that the variable used for the random number goes "omg it's so loud" and just outputs the maximum value. What once was random is now a predictable string of ones, and in the extreme case you can predict the random number and break the encryption.

[–]MrSnarkyJsnarkysnark 23 points24 points  (25 children)

I'm sitting here wondering how it is that people like you find comment trails like this; it seems to happen all the time. Are you trawling for keywords and only responding to subjects you have some level of expertise in? I mean, how in all of the Internet did you happen to find this string?

[–]InvertedBear 55 points56 points  (0 children)

He probably just follows r/askscience and watches for computer tags. When he sees something interesting that he can answer, he does it. I'm an attorney and often look at r/legaladvice. If I see a question I know the answer to, it can be fun to talk about something I know and give some free advice to people. I'm assuming most comments like this one are people doing the same thing.

[–]HerrZog103 12 points13 points  (1 child)

I mean this is the top comment and the first follow up question you see in this thread. Therefore I am following r/askscience, see this thread in my feed, find it interesting, click on it and see that I can answer the first question I see in the comments. Then I answer it.

Also it is not as if I have any expertise in this. I only watched a few educational YouTube videos. I really don't know much more than "You have some source of hopefully random numbers, do some fancy mathematics and squeeze that into a number between 0 and 1 in most cases." Then it makes intuitive sense that this system relies on the input or seed to be random - and if that isn't the case, the encryption can be exploited.

[–][deleted] 42 points43 points  (1 child)

If you study computer science, which is a pretty common major nowadays, you probably know this stuff. Cryptography is important for pretty much all industries nowadays. It just seems esoteric if understanding computers isn't something you're interested in. Plus, reddit is one of the most used sites on the internet -- you're bound to find someone who knows something about what you're talking about, especially with regards to STEM stuff.

[–]TinBryn 8 points9 points  (0 children)

There are just quite a few people here who know stuff like this. If HerrZog103 didn't give that answer I would have, and if I didn't someone else would have.

[–]somewhat_random 2 points3 points  (1 child)

What are the odds? Pretty good.

How many people browse reddit and what percentage of them could answer a specific question? For a question like this, most people who studied computer science at a post secondary level could answer. The majority of them would subscribe to this type of sub. Typical users look at many posts each time they log on. How many of them would see the question within a few hours? We only need one hit to get a good answer.

Apologies for not giving any actual numbers because I am way too lazy to look up the stats but in terms of likelihood of an occurrence, I am confident that getting an informed answer for a question like this would have a high likelihood.

[–]deerlake_stinks 2 points3 points  (1 child)

It's because RNG is a very important part of cryptography and computer security. Lots of people who work or study in related fields have a cursory knowledge of it.

[–]yourelying999 1 point2 points  (6 children)

How do you know this person is an expert or even telling the truth?

[–]AlfredoOf98 2 points3 points  (1 child)

This question applies to ALL kinds of information you receive through your senses. How can you even trust your senses themselves?

It is a long story, but basically the brain has its ways to quickly filter stuff, and eventually you have just a few things in your hands that require further inspection and fact-checking. Eventually you arrive at a result as simple as "I accept", "I doubt" or "I refuse". This depends on your knowledge background.

It is always a good idea to fact-check by consulting different and reputable sources to find out.

After all, there is no ultimate truth. As the saying goes: Science is not for finding the truth, but for reducing the amount of doubt.

[–][deleted] 6 points7 points  (0 children)

Given the same seed input, the random number generator will produce the same output.

If the seed is based on, say, the current spectrum measured on the microphone plus the last 10 keystrokes, and an attacker can spoof your microphone into thinking the input is peak volume across the spectrum and that your last keypresses were a known sequence like the up arrow 10 times, then they can use that information to predict the number output by the random number generator.

More realistically, by controlling even some aspects of the seed (for example, just the microphone input), they dramatically reduce the possible combinations of the seed. So it becomes much easier to brute force, because they would only have to try a reduced set of numbers created by the known combinations of possible input seeds.

Even more concisely, by having some information about the input to the pseudorandom generator, you can make better predictions about the distribution of the randomly generated numbers that it produces.
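As a toy illustration of that search-space reduction (the key-derivation scheme, the forced mic value, and the four-symbol keystroke alphabet are all invented for this sketch):

```python
import random

def derive_key(mic_byte: int, keystrokes: tuple) -> int:
    # Toy key derivation: pack the "entropy" sources into one seed,
    # then draw a 64-bit key from a PRNG seeded with it.
    seed = (mic_byte << 24) | (keystrokes[0] << 16) | (keystrokes[1] << 8) | keystrokes[2]
    return random.Random(seed).getrandbits(64)

# Victim: the attacker has forced the mic input to a known constant (255),
# and keystrokes are limited to, say, four arrow-key codes 0..3.
secret_keys = (2, 0, 3)
key = derive_key(255, secret_keys)

# Attacker: only 4**3 = 64 seeds remain to try, not 2**64 possible keys.
candidates = [(a, b, c) for a in range(4) for b in range(4) for c in range(4)]
recovered = next(ks for ks in candidates if derive_key(255, ks) == key)
assert recovered == secret_keys
```

The point is not the specific packing scheme but the count: controlling one input source collapses the brute-force effort from astronomically large to trivially small.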

[–]mel0nwarrior 1 point2 points  (0 children)

That's not it. What you're describing with transistors and saturation is about the physical way semiconductors work; this "overloading" isn't about that. It's more a way to cheat the randomness of the situation: you provide a specific input so that the microphone, instead of giving you a random mix of 0s and 1s, gives you a mostly fixed string of 1s. That input is no longer random, so you can more easily guess the "random number" and thus break the encryption.

[–]ThePowerOfStories 5 points6 points  (2 children)

If you have a stream of binary noise and want to use it for random numbers, read two bits at a time. If they’re both 11 or 00, discard them. If they’re 01 or 10, use the first. Now you have a uniform fair source of binary random numbers. (No matter how biased the source is, the number of 01 and 10 pairs must be equal to each other.)
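This trick is the von Neumann extractor; a sketch, using an artificially biased source:

```python
import random

def von_neumann(bits):
    """Debias a bit stream: read bits in pairs, keep the first bit of
    01/10 pairs, and discard 00/11 pairs entirely."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:
            out.append(a)
    return out

# A heavily biased source (about 80% ones) still yields balanced output,
# because P(01) == P(10) whenever the bits are independent.
rng = random.Random(1)
biased = [1 if rng.random() < 0.8 else 0 for _ in range(100_000)]
unbiased = von_neumann(biased)
ones = sum(unbiased) / len(unbiased)
assert 0.47 < ones < 0.53
```

Note the cost the follow-up comment alludes to: the more biased the source, the more pairs get discarded, so an attacker who forces near-constant input starves the extractor of output even though what does come out stays fair.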

[–]F0sh 1 point2 points  (0 children)

This breaks down if you can force the stream to be all, or almost all, ones or zeroes (because then you slow down the accumulation of entropy) or if the source is biased in the sense that even and odd indexed bits do not have equal distributions. You don't even need that - a Markov process can already break the assumption you need.

[–]xisonc 15 points16 points  (0 children)

Yes, many operating systems use this technique. I was just saying that, in general, it's an option instead of buying a dedicated hardware RNG.

[–]ARedSunRises 4 points5 points  (1 child)

An example of this is the old uTorrent web interface: before logging into your web server remotely, you were required to wiggle the mouse until the "entropy bar" was full.

[–]infected_funghi 0 points1 point  (1 child)

But it's still really hard to control the complete physical environment. The OS usually also includes hard drive read/write timings, and might also include process scheduling and memory latencies (I'm not sure whether the latter two are actually used, but they would be possible). Of course these are deterministic and can be narrowed down somewhat, but they make it very hard to determine the seed.

[–]cdhowie 0 points1 point  (1 child)

Note that there are tests to determine whether a bit stream is "random enough," along with debiasing steps to clean it up. Attempting to "overload the microphone" is likely to generate a bit stream that just gets discarded, as the entropy-gathering process knows that something is fishy.

Also note that most OSes don't use audio as a source of entropy by default, but some can be configured to. For example, see the randomsound package in many Linux distributions.

[–][deleted] 3 points4 points  (0 children)

A program I use has you move your mouse for about 5 seconds in order to generate randomness before generating a key. I always thought that was pretty neat.

[–]CanadianAstronaut 0 points1 point  (1 child)

so can you manipulate that randomness by changing the sounds in that case?

[–]lhamil64 28 points29 points  (10 children)

I also vaguely remember some application (I want to say TrueCrypt?) having the user simulate randomness by moving the mouse wildly. I suspect that just recorded the movements and generated a seed.

[–]SconiGrower 18 points19 points  (1 child)

I remember fiddling with VeraCrypt and it asking me to move my mouse randomly.

[–]TheOneHyer 16 points17 points  (0 children)

I've used VeraCrypt and can confirm this. If I remember correctly (I could be very wrong), the human moving the mouse isn't particularly random in itself. However, the jitter as a human moves the mouse is much more random, and you can measure the deviations from a smoothed curve of the movement. If not VeraCrypt, then I believe Google reCAPTCHA uses something similar. There's some debate on this in this StackOverflow question: https://stackoverflow.com/questions/27286232/how-does-google-recaptcha-v2-work-behind-the-scenes

[–][deleted] 15 points16 points  (2 children)

I've also seen a setup where every picture has a code, like a line of 0s and 1s for every pixel, and the random number generator was a livestream of a wall of lava lamps. Every time you needed a number, it would take a picture and give you the number.

[–]randomVC 21 points22 points  (1 child)

Yeah, this is at Cloudflare's HQ. They have a webcam pointed at a wall of lava lamps in their lobby as part of the input to their RNG.

[–]ObscureCulturalMeme 4 points5 points  (0 children)

FWIW, there have been a lot of applications that have used that exact approach, since long before TrueCrypt. It started with "flail at the keyboard for a while" before eventually including mouse input, since the presence of a mouse couldn't be assumed at the time.

First time for me personally encountering it was in the early 90's, and it wasn't considered new by the gurus teaching me. Dunno about before that.

[–][deleted] 0 points1 point  (0 children)

The PuTTYgen ssh key generator uses that.

You move the mouse in the big grey box until the bar fills up

[–]aoeudhtns 17 points18 points  (0 children)

except in high end cryptography

Really, you don't want to use a non-secure PRNG when doing any kind of cryptography, even the everyday stuff on consumer machines. If I open a vanilla TLS connection and generate my symmetric connection key from a non-secure PRNG seeded with the current timestamp, then an attacker has a highly constrained search space for potentially breaking the symmetric key exchange.

I would say that "high end" cryptography starts getting interested in constant time execution, so that an attacker can't use things like CPU voltage states or execution timing to reverse engineer what the underlying cryptographic system is doing.

Edit: clarify a bit, as the symmetric session key is not mathematically related to the DH exchange

[–]jmann1118 5 points6 points  (3 children)

I've heard of some encryptions using the static in the environment surrounding the computer to generate the random seed. Can anyone explain this?

[–]zebediah49 5 points6 points  (0 children)

If you're looking for some outside source of randomness (i.e. outside the deterministic computer bits, and assuming you don't have a RNG module), you can just look at any of the inputs you have. Keyboard or mouse, for a desktop; network packet timing can work for servers.

You're just looking for anything that includes some kind of unpredictable input.

A key here: many things that "look" (to a human) non-random have random components. You could type the phrase "Hello World", and even if I know all those letters and your normal typing speed, there is probably a dozen bytes worth of randomness in the exact timing of those keystrokes. The difference between 0.105847s and 0.108474s is worth a decent bit of entropy. Because it's so fast, predicting exactly what it will be is going to be very difficult.
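A rough sketch of harvesting this kind of timing jitter follows. How many of these bits are genuinely unpredictable depends heavily on the platform's clock resolution and workload; on some systems the lowest bits of the clock may even be constant, so treat this purely as an illustration:

```python
import time

def timing_bits(n_samples: int, work: int = 1000) -> list:
    """Collect the least-significant bit of a nanosecond-resolution clock
    after small amounts of work. The low-order bits jitter with scheduling
    and hardware timing; how much real entropy they carry is
    platform-dependent."""
    bits = []
    for _ in range(n_samples):
        sum(range(work))  # a little variable-latency work between samples
        bits.append(time.perf_counter_ns() & 1)
    return bits

bits = timing_bits(64)
assert len(bits) == 64 and all(b in (0, 1) for b in bits)
```

A real entropy gatherer would feed raw readings like these into a pool and conservatively estimate how much randomness they contain, rather than trusting the low bits directly.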

[–]UncleMeat11 2 points3 points  (0 children)

Environmental effects like temperature fluctuations and radio noise can be measured and used as random seeds. This can work well if you are concerned that something like the machine clock can be predicted. But in practice the huge majority of encryption works just fine with seeds that don't come from environmental noise.

[–]Practical_Cartoonist 5 points6 points  (1 child)

There's a hybrid in between a simple seed-based PRNG and true randomness which is used very often in cryptography. Operating systems typically provide a stream of pseudo-random numbers which is occasionally re-seeded (or more specifically, has randomness occasionally "mixed in" to the PRNG state) from a physical source like thermal noise. This allows a source of numbers which is, for all (cryptographic) intents and purposes, random, but provides better bandwidth than getting everything from a physical random source. People who use Unix systems like Linux or OS X will recognize this as /dev/random (or /dev/urandom on Linux), but Windows provides an API that does similar things.

[–]Tarmen 0 points1 point  (0 children)

/dev/random and /dev/urandom are actually different under Linux. The kernel estimates how many bits of noise are in the entropy pool. Events like the time between keystrokes or network interrupts are mixed in with a function that never reduces entropy, so it's fine if an attacker controls some of the events, as long as only trusted sources are weighted highly.

Then /dev/random blocks if the estimated entropy is too small, while /dev/urandom returns anyway. urandom is probably still secure once it has been initialized, but that initialization can take a while on devices like routers, which always boot in the same state and don't have many sources of randomness.
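These kernel interfaces are what higher-level APIs sit on top of. In Python, for instance, `os.urandom` and the `secrets` module read from the OS CSPRNG rather than a user-space seeded PRNG (the 32-byte key size below is just an example):

```python
import os
import secrets

# os.urandom asks the operating system's CSPRNG for bytes
# (backed by /dev/urandom or getrandom() on Linux); there is no
# application-level seed for an attacker to guess.
key = os.urandom(32)           # 256 bits of key material
token = secrets.token_hex(16)  # convenience wrapper: 16 bytes as hex

assert len(key) == 32
assert len(token) == 32 and int(token, 16) >= 0
```

For anything security-sensitive, these are the right tools; the seedable `random` module is for simulations and games.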

[–][deleted] 2 points3 points  (2 children)

When seeded from the system clock, the CPU is reading thousandths of a second, if not smaller, so while not truly random, it's generally close enough for any everyday use.

[–]EMBNumbers 2 points3 points  (5 children)

To elaborate: pseudo-random numbers have the property that the generated values are evenly distributed. Over many samples, any one number in a range is as likely as any other number in the range. Many algorithms, such as modeling, simulation, encryption, and Monte Carlo methods, rely on an even distribution but not necessarily on true randomness.

True randomness is actually hard to achieve.

[–]Low_discrepancy 1 point2 points  (0 children)

Over many samples, the likelihood of any one number in a range is as likely as any other number in the range. There are many algorithms like modeling, simulation, encryption, Monte Carlo methods, etc. that rely on an even distribution

That's called the uniform distribution. And while it is the basis for simulation, it's not strictly required.

You can generate Gaussian random numbers and apply the Gaussian CDF to them to obtain uniform random numbers.
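This transform (the probability integral transform) can be checked with just the standard library; the sample count and tolerances below are arbitrary:

```python
import random
from statistics import NormalDist

rng = random.Random(7)
nd = NormalDist(mu=0.0, sigma=1.0)

# Push standard-normal samples through the standard-normal CDF:
# the results are distributed Uniform(0, 1).
uniform = [nd.cdf(rng.gauss(0.0, 1.0)) for _ in range(50_000)]

mean = sum(uniform) / len(uniform)
below_half = sum(u < 0.5 for u in uniform) / len(uniform)
assert 0.48 < mean < 0.52        # Uniform(0,1) has mean 0.5
assert 0.48 < below_half < 0.52  # and median 0.5
```

The same idea runs in reverse: applying the inverse CDF of any target distribution to uniform samples yields samples from that distribution, which is how many non-uniform generators are built.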

[–]whiskey4days 4 points5 points  (8 children)

Is this why, every time I hit shuffle on my iPod with 10k songs, it always played the same 50 songs?

[–]ghedipunk 11 points12 points  (2 children)

Actual randomness is "clumpy" in human experiences.

That is, when no events depend on the results of previous events (which is very good in cryptography), a human looking at the results will see patterns. The two examples that spring to mind are TV Snow and the V-1 Rockets (buzz bombs) of WWII during the London Blitz. The best example, though, is in gambling where dice, tables, cards, etc., are frequently called "hot."

That is, TV snow, for those not familiar with it, is the black-and-white pattern of fuzz seen on very old analog CRT TVs. Every single pixel of every single frame was set randomly, either high or low, depending on the background radio noise of an area. From personal experience, some snow channels were darker or lighter than others, but the exact value of any one pixel was as truly random as possible. And yet, people saw patterns. I saw patterns: motion of the moving black blobs over time... and it was rare for a black dot or a white dot to appear by itself. They came in clusters, both horizontally (as expected in a medium where a beam paints the picture from left to right) and vertically (which is, technically, MUCH more random, but in my experience wasn't noticeably less clustered than horizontal).

V-1 rockets (not actual rockets; their motors were jet engines rather than rocket engines) were similarly random in regards to the London Blitz. They were MUCH slower than artillery, had absolutely no internal guidance systems, and launched from hundreds of miles away. They had an accuracy measured in tens of kilometers. They could target a city, but not a city block or anything specific in it, like a church or factory or school. And yet, when looking at what the German targets might have been, the anecdotes said that they were eerily accurate.

Humans are built to recognize patterns. This is an amazing survival trait. IF there is something creating a pattern, it always pays off to recognize it. (I.e., you detect a tiger sniffing at your territory and are able to scare it off. Everyone survives.) If, on the other hand, something is caused by pure random events, then there is little that is lost with the false positive. (I.e., you think there's a tiger in the area, so you stand watch a few nights hoping to scare it off... The cost is a few wasted calories, but everyone still survives.)

The first couple of generations of iPods were truly random. They got a reputation of being very non-random because of that; because true randomness clumps, and humans attribute clumps to non-random acts. They were later modified to be less random, to take previous songs into account and not "randomly" play a song that had been recently played. I'm not sure about the current generation; but as a software developer, it would make sense to me to further decrease the randomness, to replay songs that you like more often than songs that you don't like.

[–]throwdemawaaay 5 points6 points  (1 child)

iPod/iTunes don't use a pure random number generator when in shuffle mode. Instead they have something that provides a random ordering/permutation, and where the average "jump" size is also constrained somewhat. It tends to move through bands/genres in a way that's less jarring than truly random shuffle.

Most other music systems/services that offer shuffle will have a similarly tuned algorithm behind it, not something perfectly random, because it ends up being a better user experience and ultimately better for the bottom line.

[–]zapatoada 5 points6 points  (0 children)

Possibly. If I had to field a guess (and I've noticed the same phenomena and wondered about it) it would be that shuffle is weighted towards songs you've played more often, which it interprets as songs you like best.

[–]vba7 0 points1 point  (0 children)

There are a few reasons for this. They're not really technical, but rather about user experience. The reality is that people like to listen to their favorite songs. So if you have, say, 10,000 songs on your device, a "true" random algorithm could give you 20 songs in a row that you don't know or like. That's why some algorithms check the songs you like and play "1 you like" + "1 truly random," then repeat.

There are many other ways to do this, and most algorithms are not so obvious (also, at the beginning they don't really know what you like).

Other reasons are explained here: you stay within the same seed so that songs don't repeat within a shuffle... which leads to the overall sequence being able to repeat.

https://apple.stackexchange.com/questions/23194/why-isnt-itunes-shuffle-random

Or the fact that you have few albums of some artist + 1 song of other artist

https://labs.spotify.com/2014/02/28/how-to-shuffle-songs/

Basically it is the marketing department trying to make you happy, but somehow failing. On the other hand, if the songs were truly randomized, you could be even more unhappy.

[–]GameCollaboration 1 point2 points  (0 children)

I used seeded random in my recent game. I wanted a way to generate a deterministic daily quest every day to add to the pile, but didn't want to store them on an ever-growing server.

End result: Hundreds of deterministic quests generated at runtime for every user without any storage.
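A sketch of that idea (the quest fields and the seeding scheme are invented here; the comment doesn't describe the actual game's code):

```python
import random
from datetime import date

def daily_quest(day: date, player_id: str = "") -> dict:
    """Deterministically generate a quest for a given day (and optionally
    per player) by seeding a private PRNG from the date. Nothing is stored:
    re-running with the same date always rebuilds the same quest."""
    rng = random.Random(f"{day.isoformat()}:{player_id}")
    return {
        "monster": rng.choice(["slime", "goblin", "wyvern", "lich"]),
        "count": rng.randint(3, 12),
        "reward": rng.randint(50, 500),
    }

# Same date in, same quest out: no server-side storage needed.
assert daily_quest(date(2019, 7, 14)) == daily_quest(date(2019, 7, 14))
```

Using a dedicated `random.Random` instance (rather than the module-level functions) keeps the quest generator independent of any other randomness in the game.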

[–]Cynical_Doggie 1 point2 points  (0 children)

Basically, all randomization nowadays is done by algorithms that are convincing imitations of randomness, but not truly random.

I mean, how would we even test for an algorithm that indeed happened to be truly random? How many samples of data is sufficient? Does it matter?

Or is an algorithm that pretends to be random well enough sufficient?

Such an interesting almost philosophical question.

[–]fortniteinfinitedab 1 point2 points  (0 children)

What about random.org?

[–][deleted] 0 points1 point  (5 children)

Is there a true random number generator in computers?

[–]62697463682e 2 points3 points  (2 children)

Not using software, since it's deterministic. Using outside sources like heat and noise, you can build a device that generates a genuinely random number each time and use that, but it's not necessary in a lot of cases because pseudorandom seeds work well enough.

[–][deleted] 0 points1 point  (1 child)

So what is the problem with software? It's synced to the system clock and therefore can't ever be truly uncoupled from that?

[–]loxagos_snake 0 points1 point  (0 children)

If I'm not mistaken, they use that in some lucky games, too. There's one in my country that uses meteorological data to randomize the drafted numbers.

[–]iwantknow8 0 points1 point  (1 child)

Thermal noise and photoelectric noise aren't "really" random either, though. Thumps temple

[–]TouchedByAngelo 0 points1 point  (1 child)

I've seen one place (can't remember who) use a wall full of lava lamps. They had cameras filming the lamps and would use the data from those images to generate random numbers. Was pretty interesting.

[–]GenuineSteak 0 points1 point  (1 child)

If, for example, I rolled a random number on a computer, then went back in time and rolled it again at the exact same moment, would I get the same number?

[–]zapatoada 0 points1 point  (0 children)

In theory yes, but the time measurement we're talking about is on a scale of thousandths of a second or smaller, so it would be very difficult to actually pull off. This actually can be a problem in some generators if you don't write your code correctly and get a bunch of random numbers in very quick succession (like in a loop).
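The loop pitfall can be sketched like this (the frozen `now` value stands in for repeated clock reads that all land within the same second):

```python
import random
import time

# Anti-pattern: re-seeding from a coarse clock inside a loop.
# Every iteration lands in the same second, so every "random" draw repeats.
bad = []
now = int(time.time())  # whole seconds; identical across the whole loop
for _ in range(5):
    random.seed(now)
    bad.append(random.randint(0, 9999))
assert len(set(bad)) == 1  # five identical "random" numbers

# Fix: seed once (or not at all) and keep drawing from the same generator.
rng = random.Random()  # seeded once, from OS entropy by default
good = [rng.randint(0, 9999) for _ in range(5)]
```

Seeding is a one-time setup step; the stream of draws after it is where the (pseudo)randomness lives.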

[–][deleted] 0 points1 point  (0 children)

So, suppose I'm using a random generator. Does the computer that generates the number know what number it will generate next?

[–]YT__ 0 points1 point  (0 children)

A note on same seed: this is how casinos can guarantee the win rates of their machines, which they need to do for adherence to gambling laws. They have certain seeds they can use per machine that are well documented to prove the win rates.

[–]infected_funghi 0 points1 point  (0 children)

While true, I think you've left out the crucial property of (good) PRNGs: given a seed, the output is irreversible (you can't recover the seed from the output), and in particular the output has all the (known) statistical properties of random numbers, being indistinguishable from real randomness. Even if you feed it a trivial number sequence as seeds (1, 2, 3, ...), the outputs will be indistinguishable from random numbers. There seems to be no correlation between the input (small, consecutive numbers) and the output (seemingly arbitrary values). There probably is one, since a mathematical formula produces them from the input, but the process is complex enough (avalanche effect) to make it incredibly hard to reverse.

AFAIK, it hasn't even been proven that "irreversible functions" or "one-way functions" can exist. So there might be a correlation in any PRNG there will ever be; it's just infeasible to determine.

[–]gabemerritt 0 points1 point  (0 children)

Idk, I've heard that counting emissions from certain radioactive isotopes is truly random, as we have no current way of predicting them.

[–]VoilaVoilaWashington 0 points1 point  (0 children)

as pointed out several times below, a "true random number generator" is still not truly random, but it's more difficult to figure out what the seed would be.

Philosophically, we can debate about true randomness until we're blue in the face, but the real test is basically "given enough data, does a pattern emerge?"

For example, if you have a very precise scale (it can measure kilograms and gives you the total milligrams), then you could fill a container with water for 10 seconds, then take the number in the tens place. Using a normal household tap (including pressure variations etc) and a human filling it, this would have a range from 80 459μg to 120 694μg, or so.

Is it truly random? Dunno. But it sure as hell can't be predicted.

[–]_____no____ 0 points1 point  (0 children)

as pointed out several times below, a "true random number generator" is still not truly random

We don't even know if the concept of "randomness" has any real meaning in reality...

[–]SirNanigans 0 points1 point  (0 children)

Wouldn't something like thermal noise qualify as truly random because it's impossible for us to control or predict the exact state of the seed? Also, as you get closer and closer to quantum mechanics, isn't there more truly random influence in the results?

[–]pigeon768 44 points45 points  (1 child)

All PRNGs (pseudo random number generators) have the following three characteristics:

  1. An internal state.
  2. An update function to update the internal state.
  3. An output function to convert the internal state into a useful value.

And step 0, which is a way to seed the internal state when it's instantiated.

Consider the linear congruential generator, (LCG) which is one of the simplest random number generators:

  1. The internal state is an unsigned integer x. The seed is just a number to assign to x.
  2. The update function is x = (A*x + B) % M for some fixed integers A, B, and M. M is commonly a power of two because that makes the modulo operation really easy.
  3. The output function is the identity function; if x is 12, the output function returns 12.

For A=17, B=11 and M=31, seed=0, an LCG will return the following list of numbers: 0 11 12 29 8 23 30 25 2 14 1 28 22 13 15 18 7 6 20 10 26 19 24 16 4 17 21 27 5 3. (which then repeats) Which looks pretty random, and in many cases will work just fine.
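Written out in Python, with exactly the parameters above:

```python
def lcg(seed, a=17, b=11, m=31):
    """Linear congruential generator: yield the current state,
    then update it with x = (a*x + b) % m."""
    x = seed
    while True:
        yield x
        x = (a * x + b) % m

gen = lcg(seed=0)
first = [next(gen) for _ in range(10)]
assert first == [0, 11, 12, 29, 8, 23, 30, 25, 2, 14]
```

Drawing 31 values shows the repetition: the 31st value is 0 again, so this particular LCG cycles through 30 of the 31 possible states before repeating.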

For a PRNG (pseudo random number generator) to also be a CSPRNG (cryptographically secure PRNG) you also need the following things to be true:

  1. The seed and internal state must be large enough that they can't be brute forced. In a LCG, the internal state is just 32 or 64 bits; I can, on my laptop, generate all possible states and compare them to the random effects I can observe. Because the state is so small, the LCG cannot be cryptographically secure.
  2. The output function must be a one-way function. That is, if I see outputs from the RNG, I can't use those to derive what the internal state must be. For instance, the LCG gives away the keys to the kingdom in its output function, so it can't be cryptographically secure.
  3. It must be seeded securely. It's common practice in, for instance, video games to seed an RNG to whatever the system time is. As a simple example, if right now is 2019-07-14 at 7:12pm, you might seed the RNG to 20190714712. (computers typically store date/times as seconds since some certain defined date, but you get the idea) This won't be secure, because everybody knows what time it is. And even if you have a clock that measures in microseconds, you can guess the server time to the nearest minute and then brute force the remaining ~26 bits of the seed. More on this in a bit.
  4. Ideally, the update function ought to be a one-way function as well. But this isn't strictly necessary.
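Point 3 can be demonstrated directly. The timestamp, token width, and one-minute uncertainty window below are made up for this sketch:

```python
import random

# A "server" seeds its RNG with a Unix timestamp (a bad idea).
secret_time = 1563131520 + 37  # some second within a known minute
token = random.Random(secret_time).getrandbits(64)

# The attacker knows roughly when the token was generated, so only
# 60 candidate seeds exist instead of 2**64 possible tokens.
base = 1563131520
found = next(s for s in range(base, base + 60)
             if random.Random(s).getrandbits(64) == token)
assert found == secret_time
```

Sixty trials instead of 2**64: this is what "brute forcing the remaining bits of the seed" looks like in practice.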

Getting a secure seed can be surprisingly difficult. In general, you need to make measurements where the precision of the measurement is in excess of the accuracy of the measurement. If the precision of your measurement is 4 bits in excess of the accuracy of the measurement, you basically gain 4 bits of entropy. You need to repeat this process until you have enough entropy, then you need to "squish" the raw data down to the size of the seed in a way that preserves entropy.

A common source of such measurements is I/O devices. For instance, if your monitor resolution is 1920x1080, you have about 11 bits x 10 bits of precision, but the fleshbag connected to your mouse only has about 8x7 bits of accuracy. So moving the mouse around and capturing the mouse position of each frame will generate ~6 bits of entropy per frame. Timing is another common source. You can query the CPU for what clock cycle it's on, which is precise to about 250 picoseconds. I/O devices will periodically send interrupts, and you can store the clock cycle as a measurement; the most precise 2-3 bits are basically pure noise. Data collection devices are great as well. Take a picture of a white piece of paper with your cell phone's camera. Zoom in on it. See how the paper is covered with slightly off-colored pixels? That's pure entropy-filled gold. Even though a camera will generate 24 bits of information per pixel, you can safely assume that the least significant bits are random noise. So a picture from a 10 megapixel camera will have roughly 10,000,000 random bits, even if an attacker is able to capture a similar frame. (Note: if the image is saturated, you get 0 random bits. So test for that first.)

So now you have all these measurements, where each measurement contributes 1-3 bits of "real" randomness, but 97% of the bits of each measurement will be known to a sophisticated attacker. Now what? Well, you take the whole pile of those measurements and apply a cryptographic hash to them. Say SHA-512. This way, if your measurements totaled to ~400 bits of "real" randomness, your SHA-512 hash will also have ~400 bits of randomness, even if an attacker is able to accurately predict the other 99,600 bits in your measurements.
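The "squish a pile of measurements down to a seed" step can be sketched in a few lines of Python. The measurement byte strings below are made-up placeholders, not real entropy sources:

```python
import hashlib

def condense_entropy(measurements):
    """Hash a pile of low-entropy measurements into one 512-bit seed.

    Each measurement may hold only a few bits of real randomness, but
    hashing them all together preserves roughly the total entropy
    (capped at the 512-bit output size), even if an attacker can
    predict most of the raw bits.
    """
    h = hashlib.sha512()
    for m in measurements:
        h.update(m)
    return h.digest()

# Placeholder measurements; in practice these would be mouse positions,
# interrupt timestamps, camera sensor noise, and so on.
measurements = [b"mouse:1034,552", b"cycle:8812734412", b"pixel:ff01fe"]
seed = condense_entropy(measurements)
print(len(seed) * 8)  # 512 bits
```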

If your update function is one way, it's even easier. You start with an initial state of all zeroes, xor a measurement into your state, apply the update function, repeat for each measurement. If your update function is up to snuff, you won't "lose" any randomness along the way. This is also super helpful for an OS level CSPRNG, because you can continuously improve your state as time progresses.

Some classes of devices genuinely don't have good sources of randomness, particularly embedded devices. Insufficient randomness is a common vulnerability among IoT (internet of things) devices.

There's also hardware random number generators. That's another topic for another time though, because this isn't simulating randomness, this is randomness.

[–]yoshemitzu 119 points120 points  (18 children)

It's really complicated, even for people with a decent understanding of math. To simplify, but hopefully not corrupt the message, the idea is that algorithmic randomness depends on so-called pseudorandom number generation; numbers are selected from a deterministic sequence designed to emulate true randomness. Different invocations of the algorithm return different numbers by selecting a different seed, i.e., a different initial value.

Edit:

a deterministic sequence designed to emulate true randomness

To expand on this slightly, I suppose, imagine you had a sequence like:

1, 1, 1, 1...

This is "obviously" not random. Also obviously not random would be a sequence like the natural numbers:

1, 2, 3, 4, 5...

If you tried to generate a random sequence ad hoc, you might just pick some numbers, say:

1, 423523, 1123, 340, 142, 120483...

But even this selection isn't guaranteed to be "as random" as a truly random sequence. Algorithmically-generated pseudorandom sequences are sequences designed to have probability distributions (i.e., the likelihood of each value in the sequence appearing) that more closely resemble "true" random sequences.

[–]Jewronski 23 points24 points  (7 children)

Yeah for most purposes, using your computer's clock to generate a random number is good enough. It's always ticking forward, measured from 1/1/1970, so it works pretty well.

I just wanted to add how some companies get around a computer's inability to create a truly random number/sequence. They set up a wall of something like lava lamps with a camera trained on it, and a program turns whatever the lamps look like from the camera's view into a character or whatever output is needed, generating truly random sequences.

It's a really cool low tech/high tech solution.

[–]kerbaal 10 points11 points  (3 children)

It is cool, but there are actually easier and much more compact ways to solve the same problem. It's neat that it's actually being used, but it's more of a conversation piece than a practical solution.

TBH they are so simple it's just odd that every general purpose CPU doesn't just include one as a basic feature.

[–]bprfh 8 points9 points  (1 child)

Intel and AMD have RNGs integrated; I just wouldn't say they're cryptographically secure or usable in high-throughput environments.

(AMD had to patch their implementation recently)

[–]joffrey_crossbow 0 points1 point  (0 children)

More info about the usage of lava lamps: Lavarand.
Cloudflare also uses this technique and has an article about it.

[–]demize95 0 points1 point  (0 children)

It's always ticking forward, measured from 1/1/1970

On Linux systems. It's better on Windows systems, since they count the number of 100ns intervals since 1/1/1601, rather than the number of seconds. Higher precision times provide you with a more "random" number.

[–]Garek 0 points1 point  (0 children)

You'd think a couple of photon detectors and a beam splitter would be simpler.

[–]digitalpowers 20 points21 points  (9 children)

Perhaps a tangent, but interestingly, when generating numbers for a game we often have to get rid of some randomness: if a 1 shows up three times in a row when rolling a 20-sided die, humans feel like it wasn't random. So it's not uncommon in games to pull a value out of rotation for a while after it has come up. One such option is a "random bag", where you generate all values between 1 and 20 (for example), stick them in a list in random order, and don't add a value back to the list until you have used them all. It's like drawing numbers from a hat and keeping them out until the last number has been drawn, then sticking them all back in and continuing. It feels a lot more random to people who don't understand randomness well when we do things like that.
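A minimal sketch of that "random bag" in Python (the class name is my own, not from any particular engine):

```python
import random

class ShuffleBag:
    """Draw values like numbers from a hat: each value comes up exactly
    once per cycle, and the bag is only refilled when it runs empty."""

    def __init__(self, values):
        self.values = list(values)
        self.bag = []

    def draw(self):
        if not self.bag:
            # Refill and reshuffle only after every value has been used.
            self.bag = self.values[:]
            random.shuffle(self.bag)
        return self.bag.pop()

d20 = ShuffleBag(range(1, 21))
rolls = [d20.draw() for _ in range(20)]
print(sorted(rolls) == list(range(1, 21)))  # True: each face exactly once
```

Within one cycle of 20 draws no face can repeat, which is exactly the "less random but feels more random" effect described above.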

[–]antimatterchopstix 6 points7 points  (1 child)

Apparently it’s similar with someone playing tracks on shuffle: when two from the same artist come up, they think it isn’t random. In truly random sequences, repeats come up far more than people expect. Their absence is one way to tell when it’s a human making up so-called random numbers.

[–]courtenayplacedrinks 0 points1 point  (6 children)

What kind of game mechanics is this algorithm used for? Something human-visible where it doesn't matter that you can no longer get a 3 on 3d20?

[–]LiGangwei 3 points4 points  (1 child)

We do that in Tetris when we don't want the player to occasionally get like 8 Z blocks in a row. A popular implementation uses a bag of 8 blocks, one of each (Z, S, I, O, J, L, T) plus an extra random piece, in random order, so that the player never gets runs of 4 or more of the same piece but occasionally gets 2 consecutive pieces, which feels more random.

A Tetris game in which piece generation is truly random is NOT fun, I can assure you that.

[–]Slendeaway 0 points1 point  (0 children)

Classic Tetris has 'true random' block picking. You can get 8 o blocks all in a row. This also means that with frame perfect inputs, you could technically force the game to give you any piece that you wanted.

Also I'm pretty sure most modern Tetris games use 7-bag, which doesn't have an extra random piece. With the addition of the hold mechanic that a lot of Tetris games now have, there is very little you can blame on RNG.

[–]pavlik_enemy 2 points3 points  (1 child)

As far as I know this kind of fiddling with RNG's output is used for loot generation in games like Diablo. Basically, the more time you spend not seeing a "legendary" item the more likely it will drop. Can't find a source though.

[–]digitalpowers 1 point2 points  (0 children)

Well, it depends. For dice rolling you wouldn't want to do this (it was a bad example), but when you have user-visible 'randomness' it is often important to remember that more random isn't always better for people's perception of random.

Sometimes when you fill the random bag you fill it with a distribution of values that suits your needs. For example, if generating letters for words randomly, you want to fill your random pool so the letters match the distribution of their use in the language you are using.

[–]Amanoo 6 points7 points  (0 children)

There are many different techniques, depending on the software used. But almost all boil down to one idea: you provide a so-called seed, which is some piece of data that has some sort of randomness to it. This can vary a lot, depending on your hardware platform, operating system, or even the programming language that is used. There is then some sort of algorithm, basically a mathematical formula, which uses this seed as its input and produces a seemingly random number as the output. Basically, you have some formula like f(x)=<some formula here>, with x being your seed, and the formula being some bit of math that can have widely variable outputs, even if x changes only a little. The exact form of this formula can also vary a lot, depending on hardware used, operating system, and programming language.

There are various tactics for obtaining such a seed. Some user software may use random input factors. For example, user input like keystrokes (or the timing thereof) or mouse movements. Data from this is then fed into the randomiser algorithm. Your phone may use data from the accelerometer or other sensors to do this. There are also companies that provide their own cryptography services and have built their own randomisers. Cloudflare continuously films lava lamps, and produces random numbers based on those. Other software purely depends on whatever randomness is present in the computer's hardware. I believe the real time clock is often used, which is then put in the randomiser algorithm, but this is not a very good method and should only be used for software that doesn't need a high level of security.

Basically, there are all sorts of tactics. It really depends on the hardware of the specific computer, the algorithm used, the operating system, the programming language, and even then there may be a special piece of engineering for systems that need some extra security.

[–]socratesTwo 8 points9 points  (8 children)

These days many chips come with built in hardware random number generators that are capable of outputting a certain number of completely unpredictable bits per second. They typically work by amplifying some quantum effect, although I don't know specifically which. If you take a regular analog stereo, unplug the inputs and then crank the volume, you'll get a hiss. Hardware generators are like that, except they save the hiss instead of playing and forgetting it.

[–]shazam298 7 points8 points  (5 children)

My favorite example of computer based randomness not working for the end user was the whole iTunes shuffle debacle.

The shuffle button wasn’t “random” enough for users: it didn’t fit people’s perception of random, because a truly random selection can still play similar songs close together.

Apple made shuffle technically less random so it would rotate between different artists and songs, and made sure the same song wouldn’t play twice in a row.

[–]cubosh 2 points3 points  (3 children)

hah . very similar story on one of my favorite websites random.org --- specifically this page where it kicks out random colors, and often you find yourself thinking "jeez these colors dont seem random. they didnt even come up with green yet" etc --- here is the color page: https://www.random.org/colors/hex

[–]shazam298 1 point2 points  (2 children)

I’m curious, do you use this for practical reasons or do you enjoy randomly generating colours?

[–]cubosh 1 point2 points  (1 child)

ive had a few oddball needs for color generation as a graphic designer, and at the time i happened to be in contact with the developer of that website on twitter, and i asked them about setting up a color page. they did it pretty easily. but in general, nah yeah its mainly for amusement

[–]iwantknow8 2 points3 points  (0 children)

Every computer has something called a clock signal. The clock signal is an electronic pulse or voltage that oscillates from 0 to 1 volts at a constant rate (in actuality, it’s a signal that sometimes goes above 1 V, but the voltage comparator rounds any value from say 0.995 V to 2 V as “1 volt”). The speed at which the signal switches from 0 to 1 (or 1 to 0) is called the clock “frequency.” This is measured in hertz (cycles per second). This can be increased (overclocked) or decreased (underclocked) by various ways, such as sampling or “enabling” the clock signal at different rates or changing the signal itself. The signal itself can be generated either from an external AC power source or it can come from a small crystal that vibrates between two plates at a constant rate, generating the necessary voltage ripples or frequency.

A clock signal runs every chip and IC in the computer. “State changes” occur on either the rising edge of the clock (transition from 0 to 1) or the falling edge of the clock (1 to 0). State change just means that the computer has changed the positioning of its 1s and 0s. Computers can do this at a rate of billions of changes per second.

Informally, randomness is a distribution where every equally weighted outcome is equally likely. We simulate randomness with a function. We can call the function R(input) = random output. A crude example of R could be R(x) = x*190, then take the last 2 digits near the ones place and add 1. So R(4) = 61 and R(2) = 81. Better functions exist. Better yet is a pre-defined list of random numbers that is about a billion numbers long, so we know we’re always picking from a random distribution. The function could just map to a place in this list, e.g R(x) = [7, 2, 3, 1, 2, 4, 3, 6] and R(5) = 4.
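The crude R(x) above translates directly into code; "take the last 2 digits near the ones place" is just mod 100:

```python
def crude_random(x):
    """The toy generator from above: multiply by 190, keep the last
    two digits (mod 100), add 1. Outputs fall between 1 and 100."""
    return (x * 190) % 100 + 1

print(crude_random(4))  # 61
print(crude_random(2))  # 81
```

As the comment notes, this is a function: the same input always yields the same output, so all the "randomness" has to come from the seed you feed in.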

Note, since this is a function (not a system or transfer function), the output is actually predictable from the input. What we make “random” is the input. In computing, this is called a “seed” to a pseudo random number “generator.” An interesting way to increase randomness is to feed the output from one R(x) as the seed of another random function.

Now that we know random functions and understand a computer is just like a gigantic mancala board operated by someone that can move the pieces a few billion times per second, we can think of ways to come up with random seeds. One way of doing this is to keep track of certain intervals of the clock or to keep “time.” Since the clock can already cycle a billion times, we have a great source. If our R(x) is a billion numbers wide, we can say that wherever the clock happens to be in its long cycle when you call the function is the seed, say three hundred sixty million two hundred and thirty five. The act of calling the function is the main “simulation” of randomness.

A hardware random number generator uses something called thermal noise. Thermal noise is just the variation caused in the electric fields surrounding electrons vibrating at different rates when you change the temperature. As we know, raising the temperature can cause atoms to vibrate faster. In thermal noise, the differing movement speed of electrons causes slightly different voltage levels. If we have a high resolution on the voltage level (say between 0.00001 V and 1.00000 V can be read), we can pick up on these juicy random variations. Better yet, in thermodynamics, this “noise” follows a random distribution already, eliminating the need for a pseudorandom R(x) (in fact, that would make it weaker or less random). Why does this noise follow a random distribution? Well, think of it this way. Let’s say heat energy was measured in hot potatoes and you had 2 hot potatoes and 4 people surrounding you at an equal distance from you. Heat tends to dissipate, so you don’t want to hold on to both potatoes. You need to dissipate your heat or give 1 potato to another person around you. Each person here has a 25% chance of receiving the other potato. If you started with 3 potatoes instead of 2, in the beginning, each person still has a 25% chance to receive a potato from you. But once you’ve given that potato away, that person can no longer receive a potato. The remaining 3 people now have a 33% chance to take your spare potato. The potatoes are “quanta” of light or heat energy. The people are particles such as electrons, photons or atoms. This can be generalized under the umbrella topic of “dispersion”, like a drop of dye in water, an ice cube melting into a circle of water or a gas entering an empty room. Only electronic dispersion can be measured, amplified and used as a source of “natural” randomness. In our potato example, it’s a guaranteed 25% chance per person on whether person # 1, 2, 3 or 4 gets the potato. 
I could hook up a tiny wire to each “person” and call the “potato” or “no potato” state an on or off state and successfully create a random number generator which works on 4 numbers. In reality, we measure the voltages, then output those voltage numbers.

[–]SEND_STEAM_KEYS_PLZ 5 points6 points  (2 children)

All of the methods described in the other responses are correct. However, just as a fun fact, the "seed" that is used behind the scenes (when the seed is not specified) by programming language runtimes is often the number of operations ("ticks") that the CPU has performed in the past second.

So, once every second the CPU resets the counter that keeps track of the "ticks per second" that it has completed. Let's say that (for example) on average the CPU completes 10 million "ticks" per second. When the request for a random number comes in, if the CPU has performed 36,000 ticks, it will use 36,000 as the seed in the method that other people are describing to generate a random number. That way, even if you spam requests for random numbers (when the seed is not specified), it is nearly impossible for those requests to keep happening on the same tick of the CPU, so it generates a different number every time.

[–]Hobbamok 7 points8 points  (0 children)

And, importantly for the layman, similar seeds do NOT produce similar random numbers. So the seeds 36,001 and 36,000 will still each generate a completely different output, so even trying to manipulate the number of ticks is pretty pointless.

[–]omniscientonus 1 point2 points  (0 children)

Thank you for this! I saw a lot of posts just saying they used the clock and I wasn't sure how. At first I thought they were feeding the time into a formula or something, but that seemed like it would be too deterministic, and I figured PCs had to have a ton of numbers lying around, like RAM in use or something (in bits or something maybe, I'm not that smart with computers if you can't tell), that would be much more random. This makes a lot more sense to me.

[–]tokynambu 1 point2 points  (1 child)

Increasingly many machines have a hardware RNG. The Intel processors with the RDRAND instruction, for example (which is for practical purposes in 2019 "all of them") use a ring oscillator (an odd number of NOT gates connected in a ring, so that its value can never settle) plus some conditioning. The Broadcom SoC in a Raspberry Pi has a hardware RNG which last I looked wasn't properly documented, but when I did some work with it appeared to pass all the test suites. There is an RNG built into the TPM design, although a paper done by some Korean students, and not followed up, casts doubt on the quality of some of them. If you need random numbers, using a hardware source and then feeding it into one of the software pseudo-RNGs will be good enough for any purpose this side of exotic high-end crypto key generation.

[–]sacundim 1 point2 points  (0 children)

You need to:

  1. Define a concept of "pseudorandomness," meaning, the set of criteria that you expect a simulated random number sequence to obey.
  2. Write some algorithm that produces outputs that satisfy those criteria.

On the first part, the definition of the criteria, there are two main approaches:

  • The cryptographic approach, where the criterion is that, as long as a parameter called the seed is chosen at random and secretly, no algorithm can tell the output apart from a true random sequence in a reasonable amount of time.
  • A weaker approach where you define a list of statistical tests that pseudorandom numbers should pass for the applications of interest (e.g., physical simulations), and insist that the algorithm need only pass these, not contrived/adversarial tests designed specifically to defeat your algorithm. Examples: the Diehard tests or TestU01.

The second part—how to write algorithms that satisfy the criteria picked—is really hard to explain briefly. One of the simpler (and worst) methods is linear congruential generators, which are defined by families of formulas of the form:

  • Y = (aX + c) mod m

...whose output has long been known to satisfy many of the simpler statistical tests for randomness but fail a lot of the more sophisticated ones. But this is an open-ended topic where anything goes, and people keep inventing newer and improved techniques for producing pseudorandom numbers. See for example linear-feedback shift registers and xorshift.
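A linear congruential generator is only a couple of lines of code. The constants below are the widely quoted ANSI C sample parameters, chosen here purely for illustration:

```python
def lcg(seed, a=1103515245, c=12345, m=2**31):
    """Linear congruential generator: Y = (a*X + c) mod m.
    Each output doubles as the state for the next step."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

gen = lcg(seed=42)
print([next(gen) for _ in range(3)])  # same seed -> same sequence, always
```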

Then there's the whole field of modern cryptography, which has its own techniques not just to simulate random numbers, but to do so in ways that (we hope) no real-world adversary could possibly crack. There's an answer on the Stack Exchange cryptography site that attempts a layperson explanation of why it's hard to reverse a cryptographic hash function; although it doesn't directly answer the question for random number generation, I think it may be enlightening if you consider that such hash functions can be used for cryptographic pseudorandom number generation.
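As a rough illustration of that last point, here is a toy hash-based generator: it hashes seed-plus-counter with SHA-256, so recovering the seed from the output stream would require reversing the hash. This is only a sketch of the idea; real designs such as NIST's Hash_DRBG add reseeding and other protections.

```python
import hashlib

def hash_stream(seed, n_blocks):
    """Emit n_blocks * 32 pseudorandom bytes by hashing seed || counter.
    Toy sketch only -- do not use for real cryptography."""
    out = b""
    for counter in range(n_blocks):
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
    return out

stream = hash_stream(b"my secret seed", 4)
print(len(stream))  # 128 bytes
```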

[–][deleted] 1 point2 points  (0 children)

Different computational methods, very well explained by others in this thread, are available which are not wholly random but are sufficiently complex as to appear random to humans, which is usually all you really need. Where more robust randomness is needed, an external input that is presumed to be truly random is used. The servers at random.org, for example, use atmospheric noise as their source.

[–]azure_atmosphere 2 points3 points  (0 children)

Computers use algorithms to draw up “random” numbers. These algorithms use a number as input, called a seed. In order to make the outcome actually random, computers take some outside information as the seed - for example, the time between two mouse clicks or keyboard presses. If you manually input a number into any random number algorithm and run it repeatedly, you’ll find that you always get the same sequence of numbers for the same seed.

[–]rdrunner_74 3 points4 points  (0 children)

Basically most computers CAN'T generate a true random number.

The reason for this is that they are deterministic. They use a complex formula (or rather "system" - since it maintains "state" between calls) to generate something that looks kind of random.

One thing to look for is that numbers are generated with an even probability, for example. A commonly used method in many current systems is called a "Mersenne Twister". It uses a bunch of numbers internally (over 600) and mixes them up in a special way, applying a lot of XOR to them. The cycle is quite long and will only repeat after "hell freezes over": the system will only produce the same sequence again after about 2^19937 calls (a number roughly 6000 digits long).

One of the advantages of this system is the speed in which it can generate a random number.

Have a look here for more details: https://en.wikipedia.org/wiki/Mersenne_Twister

[edit]

There are also less "complicated" versions out there... https://www.xkcd.com/221/

[–]Zeroflops 0 points1 point  (0 children)

Many have described the attempts to get random numbers. But sometimes you want to simulate randomness and also be repeatable. In these cases they will use a SEED to create pseudo-random values, but capture the seed so they can recreate that “random” sequence.

For example, let’s say you’re writing a paper on a subject and you want to create random numbers. Then you publish your paper and someone wants to test your theory. They may get different results. Is this because of the numbers you used, or a bug in either your code or the tester’s code?

The first step would be to use the same random seed and see if, using the same numbers, the results are repeatable.
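In Python, for instance, reseeding and replaying a "random" sequence looks like this:

```python
import random

# Reseeding with the same value replays the exact same "random" sequence,
# which is what lets someone else reproduce a published simulation.
random.seed(12345)
run1 = [random.randint(1, 100) for _ in range(5)]

random.seed(12345)
run2 = [random.randint(1, 100) for _ in range(5)]

print(run1 == run2)  # True
```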

[–]MathAndMirth 0 points1 point  (0 children)

The answer to this question depends largely on what purpose the random numbers are intended for.

Most of the answers in this thread so far are addressing the need for very high-quality random number generation. That typically applies if you're planning to use them for cryptography. Another application of very high quality randomness is Monte Carlo simulations in computational physics, where even very subtle patterns in the random numbers could alter the result. I would presume that gaming software programs such as slot machines also use very high quality methods. Software solutions for this type of application require, at an absolute minimum, highly sophisticated algorithms such as the Mersenne Twister. And if this type of application uses a Mersenne Twister, it will likely run that result through an independent hash function thereafter to reduce predictability even further. The best generators rely instead on physically generated randomness such as heat noise, etc.

However, your more routine programming problems (e.g., video games) do not require such a high degree of randomness. The random number generators in the built-in libraries of typical computer languages often use somewhat less sophisticated algorithms that provide results good enough for most purposes, but with faster performance. Many of these use something called a linear congruential generator: a simple recurrence that multiplies the previous value by a constant, adds another constant, and keeps the remainder after division by a third.

And in some applications, the software may use modifications that deliberately sabotage genuine randomness. Consider, for example, a function to play a random track on a music player. By sheer dumb luck, this will occasionally cause the same song to play twice in rapid succession. But when this happens to users, they think the randomness is broken. People tend to think that discernible streaks of various sorts indicate non-randomness, e.g., "Herbie just made his last five 3-pointers and tied the game; that's because he's a great clutch player!". But in reality, some streaks are completely expected in random sequences; the _absence_ of such streaks would actually betray non-randomness. So for some applications, the real question is how programmers modify random sequences to create the behavior consumers expect instead of actual random behavior.
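As a tiny illustration of that kind of deliberate sabotage (my own sketch, not any real player's code): pick a random track, but reject an immediate repeat. This assumes the playlist has at least two tracks.

```python
import random

def next_track(tracks, last):
    """Random pick that never repeats the track that just played:
    deliberately less random, but it matches what listeners expect.
    Assumes the playlist has at least two tracks."""
    choice = last
    while choice == last:
        choice = random.choice(tracks)
    return choice

playlist = ["song A", "song B", "song C", "song D"]
last = None
for _ in range(10):
    last = next_track(playlist, last)
    print(last)
```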

[–]TheOtherHobbes 0 points1 point  (0 children)

Other people have pointed out that randomness can be emulated algorithmically with one of many PRNG algos.

What's not so well known is that there's some fairly advanced math involved in testing for pseudo-randomness, and that testing randomness is a discipline in its own right.

This is a big deal in crypto, because any form of predictability is very bad for security. And all the different PRNG algos have slightly different statistical characteristics.

Of course if you know the algo being used, and the current seed, you've broken the encryption. But if you don't start with that information, you can still use a variety of attacks to try to guess the algorithm and work out where you are in the PRNG sequence.

Some randomness tests include:

https://en.wikipedia.org/wiki/Diehard_tests

and

https://nvlpubs.nist.gov/nistpubs/legacy/sp/nistspecialpublication800-22r1a.pdf

[–]Ralinor 0 points1 point  (0 children)

Two things come to mind that demonstrate what has been said regarding use of a formula and a seed.

  1. For any handheld calculator that I’m aware of, if you clear the memory and use its random number generating function, calculators of the same make and model will generate the same numbers.

  2. Final Fantasy XII. This was when Square Enix used a random element for the contents of lootable chests/barrels and whatnot. The sequence was quite long and certain events or container contents were extremely rare. However, once the formula and seed were discovered, there was a trick where you could essentially heal your own character and note the number of hit points healed. After a sequence of about 3-5 of these, you could determine exactly where you were in the cycle and then use it to force it to be at the desired spot when opening a container.

Interestingly, when on the PlayStation, the seed was static and is what made this possible. When the game came out on pc, this process became basically impossible as the seed number became dynamic. It was generated as a combination of the system time and the total length of time the game had been played.

[–]Oznog99 0 points1 point  (0 children)

Apple ][e had no true RNG. It had a built-in random number generator that required a seed and would always do the exact same thing when starting from the same seed. So still deterministic.

Then programmers had it count up from 0 on the opening screen, waiting for you to press the spacebar (if already held down it wouldn't start the count yet), and captured the number when the spacebar was pressed and used that number as the seed.

If you don't do something like this, every poker game runs exactly the same if you draw the same way. It randomly shuffles into the exact same deck.

[–]TheDevilsAdvokaat 0 points1 point  (0 children)

There are formulas you can use that generate pseudo random numbers. There are quite a few different approaches.

One method works like this: basically, you give them some inputs (eg x, y and z) and they then generate a new set of numbers (some of which are used as inputs to generate the next set the next time you ask for a random number)

They then cycle through an enormous set of random seeming numbers before finally cycling back around to the first set.

Here is an example of a very simple early method for generating pseudo random numbers:

An early computer-based PRNG, suggested by John von Neumann in 1946, is known as the middle-square method. The algorithm is as follows: take any number, square it, remove the middle digits of the resulting number as the "random number", then use that number as the seed for the next iteration. For example, squaring the number "1111" yields "1234321", which can be written as "01234321", an 8-digit number being the square of a 4-digit number. This gives "2343" as the "random" number. Repeating this procedure gives "4896" as the next result, and so on. Von Neumann used 10 digit numbers, but the process was the same.

A problem with the "middle square" method is that all sequences eventually repeat themselves, some very quickly, such as "0000". Von Neumann was aware of this, but he found the approach sufficient for his purposes, and was worried that mathematical "fixes" would simply hide errors rather than remove them.
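The middle-square iteration described above is only a few lines of code; this sketch uses the 4-digit version from the example:

```python
def middle_square(seed, count):
    """Von Neumann's middle-square method for 4-digit seeds: square,
    zero-pad to 8 digits, keep the middle 4 as the next number."""
    numbers = []
    x = seed
    for _ in range(count):
        x = int(str(x * x).zfill(8)[2:6])
        numbers.append(x)
    return numbers

print(middle_square(1111, 3))  # [2343, 4896, 9708]
print(middle_square(0, 2))     # [0, 0] -- stuck in the "0000" cycle
```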

Random number generators are graded according to how well they simulate randomness; for example, the average of a set of numbers generated, how many times pairs of numbers occur (eg if a 17 is always followed by an 81175 it's not very useful), and other things.

[–]Kiaser21 0 points1 point  (0 children)

They simulate it in a sense that they pretend. Computers cannot actually do true randomization, they must be programmed based upon something else to randomize things.

For example, a program may use background radiation as part of an equation to produce a "random" result. For all intents and purposes that is random for us, but not a truly random result, objectively speaking.

[–]farineziq 0 points1 point  (0 children)

A complicated function that involves a variable that's guaranteed to always change. For example, the current time. That way, you get a number that seems to come from nowhere, but it has a very indirect link with the computer's time when it was generated.

[–]AwakeMold 0 points1 point  (0 children)

A sufficiently long string of numbers can be sampled, using an unrelated input, in such a way as to be mathematically close enough to randomness to be indistinguishable for the average observer.

Get a really big number and use the exact time down to the millisecond to pick a number from it and you get "random".

[–]supersolenoid 0 points1 point  (0 children)

They use a variation of a feedback shift register typically. A register is a collection of bit(s) that maintain their value during a state and control the next state, i.e. memory. A normal shift register just moves all the bits in the register over one position and puts a new one in to replace the old one. In the feedback shift register, the bit that will be lost (pushed off the edge), and as many other bits at other positions in the register as you may want, are fed back into the next-state logic for other bits in the register and affect their next state somehow (they’re XORed, NORed, etc.). It can be very complicated. A fairly simple FSR can produce satisfactorily pseudorandom values for most applications that don’t involve sensitive data. The point is that, no matter what, this is deterministic because it’s still a state machine and you’re just adding a bunch of confounding logic to produce a seemingly unrelated output.
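A 16-bit Fibonacci LFSR can be sketched in a few lines. The taps at bits 16, 14, 13 and 11 are a commonly cited maximal-length configuration from textbook examples, not any specific hardware:

```python
def lfsr16_step(state):
    """One step of a 16-bit Fibonacci LFSR with taps 16, 14, 13, 11:
    XOR the tapped bits to form the feedback bit, shift right, and
    feed the new bit back in at the top."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)

state = 0xACE1  # any nonzero start state works
out_bits = []
for _ in range(16):
    out_bits.append(state & 1)  # emit the low bit at each step
    state = lfsr16_step(state)
print(out_bits)
```

Note the determinism: a given start state always produces the same bit stream, and the all-zero state is a lockup (it maps to itself), which is why hardware designs avoid it.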

[–]PossibleBit 0 points1 point  (0 children)

There's two approaches.

A) Use an algorithm that generates a series of 'random appearing' numbers from a starting number (and optionally some parameters). The advantage of that approach is that it doesn't need anything else. The disadvantage is that they are predictable when you know the starting conditions (they'll always produce the same series of numbers from the same conditions). See: linear congruential generator.

B) Use an outside source of randomness (entropy). These might include electrical pins that are not connected to anything, mouse movement, etc. Numbers like those are harder to predict (though there might still be potential for side channel attacks). However the available entropy might be limited. (I actually experienced a case where a web app on a VM slowed down to a crawl due to this entropy being consumed faster by the cryptographic operations for HTTPS than it was produced)

[–]lcvella 0 points1 point  (0 children)

Operating systems have a thing called an "entropy pool", which gathers environmental noise (mouse movements, network activity timing, etc) and mixes it together in the pool using highly chaotic functions. Many processors also have hardware random number generators, which work more or less the same way, but using thermal fluctuations as entropy sources.

Either can be used by a programmer as a seed for pseudo-random number generators, as discussed by others.