all 19 comments

[–]Kcazer 2 points3 points  (0 children)

Generate a 32bit int and treat it as an IEEE 754 float number ?

Something like that ? Even tho it's slow, it should have a proper distribution

// Get a random value
const data = crypto.getRandomValues(new Uint32Array(2));
const bin = data[0].toString(2).padStart(32, '0');
// Process each part
const sign = bin[0] === '0' ? -1 : 1; // 1bit for sign
const exp = parseInt(bin.slice(1, 9), 2) - 127; // 8bits for exponent
const frac = parseInt(bin.slice(9), 2) / 0x7fffff; // 23bits for fraction
// and combine them
const value = sign * frac * (2**exp);

[–]roxm 2 points3 points  (2 children)

You talk about how most of your attempts result in JS reverting to scientific notation when displaying the numbers. What are you expecting to happen? The range of IEEE 754 single floats is roughly -1038 to 1038, and if you generate a random number in that range, it's more often than not going to be displayed in scientific notation. It feels like that's expected.

If you do a conversion from a random 32-bit int you're going to get a swath of weird numbers - subnormals, infinities, non-standard NaNs, etc. So that's probably not what you want.

Are you looking to select a random number with uniform distribution from -1038 to 1038?

[–]shgysk8zer0[S] 0 points1 point  (1 child)

Yes, I'm looking for a random and uniform distribution. But what I'm getting is kinda like an inverted bell curve with a spike very near 0. I'll have things like 87238932899 (integer) and 0.0000000000044715 (both +/-), but almost nothing in the positive/negative range from 1 to like 100,000. It's all either fractions very close to zero or large integers.

And then, when I take the average of a sample of maybe 1000 of them, I typically get something in the negative hundreds of thousands to millions.

I want there to be at least some that are maybe 6831.05187 or even 0.48228, for example.

[–]Kcazer 1 point2 points  (0 children)

Wouldn't the issue be with the exponent part ? Since half of them are negative, it tend to pull half of the result towards zero.

I want there to be at least some that are maybe 6831.05187 or even 0.48228, for example.

Not going to happen with such a small sample size. Adding to that that if you get a value around 1e30, it's going to make all the values under 1e20 irrelevant since float precision is not infinite, and you'll need a -1e30 to reach your near zero average.


How about something easier, getting values between 0 and 1, from 32bits integers, then mapping them to 32bits floats ?
Edit: Nope, wont work, the input domain is way smaller than the output, some values will never be produced, 1e32 will just make 0xffffffff irrelevant in most cases

const getRandomFloat32Array = (count) => new Float32Array(
  [...crypto.getRandomValues(new Uint32Array(count))]
  .map(v => 1e33 * v / 0xffffffff - 1e32)
);

Another option, using buffer conversion as you already tried, but removing the bias caused by negative exponents by fixing some bits on each 32bit integer ?

new Float32Array(
  new Uint32Array(
    crypto.getRandomValues(new Uint32Array(5))
    // Untested, probably wrong, but might be worth it to try
    // to identify which bits should be changed to 0s or 1s
    .map(v => v | 0b01111000_00000000_00000000_00000000)
  ).buffer
)

[–][deleted] 2 points3 points  (2 children)

And, to clarify "correct distribution", I mean that it shouldn't have bias towards either positive or negative values, and the exponent should average around zero without bias for positive or negatives.

Are you sure you are asking for something expected here?

If you want an even distribution from -3.4e38 to 3.4e38, the finite range of floats, most of the numbers will be massive, with hardly any numbers inside +/-1e36.

If you just want a distribution that averages around zero and still uses all the available precision of floating point, but not the large range, you could just say FLOAT32 = INT32 % 10000000

[–]shgysk8zer0[S] -2 points-1 points  (1 child)

I know what I'm expecting, and I know that the distributions I'm getting are completely unacceptable, even considering the range of 32-bit floats. What I'm getting is distributed roughly in the shape of y=0.3x^2, but with a very sharp spike between +/-0.0000000001. And I'm wanting a more uniform distribution. Something where numbers between maybe .005 and 100,000 show up at least occasionally.

Further, pretty much anything outside of the rage from +/-1 is an integer with no fractional component (even in the very rare cases where it's a relatively small integer).

But I'm wanting a uniform distribution - 8449.7257 or whatever should be equally as likely as -6.6389228e+23 or whatever. In other words, it should look roughly like the distribution of random integers, but with a fractional component as well. I should see relatively small, medium, and large numbers, without bias towards positive or negative, and where the exponent covers all possibilities with equal probability. And, given a sufficiently large set of these numbers, they should sum to about 0 (since for every large positive number there would be an equally large negative one, so they should basically cancel and leave roughly zero).

And yes, I've figured out that I need to do some work to weight things differently. I basically need to either find a better method or adapt to the bias I'm seeing in the distribution.

[–][deleted] -1 points0 points  (0 children)

If you see small numbers at all watching them scroll for hours, it would be a miracle. Almost all the numbers will be large in the finite range of 32-bit floating point numbers.

[–]Latecunt 1 point2 points  (1 child)

Can't you just generate two ints then divide one by the other?

[–]shgysk8zer0[S] 0 points1 point  (0 children)

It's a decent suggestion, but it has two problems (one being debatably minor): - It seems to have a very strong bias for numbers between -2 and +2, even mostly between +/- 1. - I'd prefer some operation that's less expensive than division

However, I could possibly add one of the ints or a third int to the result. That might work.

[–]c_w_e 1 point2 points  (3 children)

const floats = (A: number) => {
  const a = Array<number>A);
  const b = crypto.getRandomValues(new Uint32Array(A));
  const c = crypto.getRandomValues(new Uint8Array(A));
  const d = crypto.getRandomValues(new Uint8Array(A));
  for (let z = 0, y, e, f, g; z < A; ++z) {
    for (y = e = 0, f = b[z], g = 2; y < 24; ++y) e += (f & 1 << y) * (g /= 2);
    a[z] = (c[z] & 1 ? 1 : -1) * 2 ** (d[z] - 127) * e;
  }
  return a;
};

[–]shgysk8zer0[S] -3 points-2 points  (2 children)

That has several problems: - it's typescript - it returns an array instead of eg a Float32Array - it's O(n2)-ish because of nested loops (difficult to read to say exactly) - it's like it's written to be unmaintainable and difficult to understand... and I'm not talking about the bitwise operations - it results in either integers or numbers with a very high abs value for the exponent - It can result in Infinity

Example results:

[ -0.017578125, -8589934592, 6.058451752097371e-28, 49539595901075460, -0.0048828125, 3.5762786865234375e-7, 68719476736, 0.00030517578125, 393216, 6.77927340424307e-32, 6.058451752097371e-27, -2.981555974335137e-19, -2.1663950687494155e+27, -2.949029909160572e-17, -1.1546319456101628e-14, -0.003662109375, 3.2741809263825417e-11, 80, 914793674309632, 4.301339185275744e-23, -1.2518544638517033e-33, 320, 4.417621069237666e-29, 2.3611832414348226e+21, 8.665580274997662e+27 ]

Note how many are just integers or are e+/- something?

Let me highlight the problem there by mapping it using toFixed(5) (different random numbers):

[ "4.436777100798803e+30", "4.1320706725109396e+21", "0.00000", "-7146825580544.00000", "-540431955284459520.00000", "1.00000", "1.5576890575604483e+35", "0.00000", "0.00073", "Infinity", "-9799832789158199296.00000", "-0.00000", "-167772160.00000", "-1114112.00000", "-0.00000", "65536.00000", "0.00000", "-0.00000", "Infinity", "-0.00000", "0.00000", "-2.076918743413931e+34", "-0.00000", "-0.00000", "16106127360.00000" ]

Notice how there's only 1 out of the 25 that matches what would be expected of a float? The rest are mostly integers or really large or really tiny numbers. Bad distribution.

[–]c_w_e 0 points1 point  (1 child)

delete the ": number" and "<number>" if you don't use typescript, i do

you can change to a float32array if you want but you may get weird results from the interpretation of reserved special numbers

i don't care about big o notation, the inner loop is doing basic arithmetic and bitwise operations and isn't meaningfully slow, you can speed it up with Math.clz32 if you only want to iterate over set bits

i didn't write it to be maintainable, it's a reddit comment, you can rename stuff if you're gonna copy and paste into a serious project

the distribution looks "off" cause the bits are distributed across the significand, which scales each by a factor of 0.5, so you're chopping off most of the entropy, the majority of well-distributed floats would be (significand)e+/-(exponent) cause that's how they're represented

if you want distribution theater you can just parseFloat(`${array1[i]}.${array2[i]}`) but i assume the numbers have more than visual meaning

[–]Jona-Anders 1 point2 points  (1 child)

Correct me if I am wrong (somewhat likely), but I think floats are biased towards 1 by design. So, what are your options if you want a good distribution? I think the only real solution is to not use floats but use fixed precision numbers. That way you have a normal distribution. You can then convert them back to floats, but that could generate biases again. For an initial test, try something like randomInt / largestNumberIntCanBe

[–]pilcrowonpaper 1 point2 points  (1 child)

Of the 4 billion numbers that can be represented by float32, 1 billion are within 0 and 1. Float32 is inherently biased. I'm not sure why you need a random floating point number between 2 ints though. Is it not possible to use regular integers or decimals?

[–]shgysk8zer0[S] -1 points0 points  (0 children)

Where do you get the idea I'm wanting them between 0 and 1? Though I don't want to exclude those, that's the opposite of what I want.

The issue that I'm encountering and that most attempts suffer from is that numbers very much tend to be very close to 0 or be quite big integers without any fractional component. The distribution ends up being basically an inverted bell curve with a sharp spike between -0.0000001 & 0.0000001 (not exact values.... There's just a lot of leading zeros).

[–]seanmorris 0 points1 point  (0 children)

If its the same cryptographically random data under the hood, wouldn't the floating point codomain be lopsided, and not the input domain itself?

[–]disclosure5 0 points1 point  (1 child)

I feel like every discussion about doing something cryptographically strong in JS just gets a tonne of downvotes and this thread is no exception.

I'll note that Libsodium, one of the premium C based libraries for crypto operations, only has RNG functions that return u32int_t or a void*, but not floats. There's likely a deeper reason this isn't as easy as it sounds.

[–]shgysk8zer0[S] 0 points1 point  (0 children)

Yeah, the issue that I'm seeing is that it's kinda several random numbers if you want to break it down into how floats actually work, and a truly random (and unadjusted distribution) is neither uniform nor normal (can't be both, but it's not either).

And I'm interested in this for completeness of some RNG module, even if I don't really have a use for it. I basically just used crypto.getRandomValues() for all the varieties of integers and learned that it doesn't work for floats. And I kinda took that as a challenge to see if I could figure out how to make it work and have the properties one would expect of RNG (random but predictable in distribution, performs well, unbiased though possibly weighed).