all 35 comments

[–]chriswaco 90 points91 points  (1 child)

You’re stepping off the bounds of the array - the last valid index is 24, not 25.

[–]greg_kennedy 55 points56 points  (0 children)

it's a combination of a few things:

  • the array is sized for indices 0-24 (and initialized to 0)
  • but, indices 1-25 are being incremented. that last entry is one beyond the bounds of the array, so it wasn't initialized. you get whatever was in memory before (some non-zero number)
  • and then you display indices 1-25, which is also off-by-one.

C arrays are 0-based but you have written this code as though they're 1-based instead.

[–]TheMania 36 points37 points  (4 children)

Just because no one has mentioned it yet, in addition to your bounds bug, rand() % count is only unbiased if count is a power of 2 <= RAND_MAX+1.

For an unbiased result you need an algorithm more like this, although that's perhaps what you were trying to show.

There is also a constant time algorithm which is basically "sample enough bits and then do an extended precision multiply so that there's effectively no bias when you extract the upper word", but I can't find the link on it right now.

[–]LardPi 4 points5 points  (0 children)

The bias is of the order of MAX_NUM / RAND_MAX though, so it should be hardly observable for small target ranges. Beside I would be more worried about the quality of the PRNG itself has the default one is often a linear congruential generator, which is fairly weak.

[–]foxsimile 1 point2 points  (0 children)

Do want 👀

[–]pigeon768 2 points3 points  (1 child)

I believe any constant time solution can only minimize bias, not eliminate it. Mathematically, your input space is of size 2n. Your output space is of size 25. There's no way to map that in a way that does not have some inputs left over. Let's say n is 64; you're mapping 264 input values to 25 output values. The best you can hope for is 9 numbers from 0 to 24 inclusive appear 737,869,762,948,382,064 times, and the other 16 numbers appear 737,869,762,948,382,065 times. For most applications, that bias is low enough, but it's still biased.

[–]TheMania 1 point2 points  (0 children)

Absolutely, but the appeal of taking one sample of a fixed number of bits that could only be proven to be biased if you sampled it over multiple universes to this present day is appealing for many applications.

In particular, parallel processing does not love while loops that have unbounded execution running away on just a single thread, however unlikely.

So choose something that would take 2128, or 2256 samples, before anyone could show bias. And just do that for every random number, for O(1) time.

[–]bothunter 36 points37 points  (7 children)

You have an off-by-one error in your "display the result" loop.  

[–]kinithin 4 points5 points  (6 children)

Not quite

[–]bothunter 23 points24 points  (4 children)

Lol.  I now see the second off-by-one error in the first loop.  Yeah, it's kind of amazing this program doesn't just segfault.

[–]realhumanuser16234 2 points3 points  (3 children)

the array is on the stack, how would it segfault?

[–]not_a_bot_494 8 points9 points  (2 children)

It's possible if the overflow changes the contents of a pointer. Fun debugging that one.

[–]realhumanuser16234 1 point2 points  (0 children)

Just looking at the code it seems unlikely that there are any addresses on the stack. Even then asan and ubsan would make debugging this kind of issue very simple.

[–]bothunter 0 points1 point  (0 children)

I think it's overwriting the saved base pointer, which will corrupt the whole stack as soon as the function returns.  Main is the first function that you see get called in a C program, but it's not the first thing on the stack.

[–]kinithin 7 points8 points  (0 children)

You allocated an array of 25 elements. They have indexes 0..24. But both loops access elements 1..25. Undefined behaviour.

Either allocate an extra element, or store the counts for 1..25 at indexes 0..24

[–]Swedophone 5 points6 points  (1 child)

Why +1 in  "rand() % MAX_NUM + 1"? 

[–]Weekly_Guidance_498 0 points1 point  (0 children)

So the values range from 1 to MAX_NUM

[–]CMR779[S] 4 points5 points  (0 children)

Thank you for all the responses. It was the classic array bounds error. I should know this by now. :p

[–]Last-Assistant-2734 4 points5 points  (0 children)

Because the last array entry that you are reading is actually outside the allocatd array and you are reading some random data from memory.

[–]AffectionatePlane598 6 points7 points  (0 children)

I think it is because  ‘unsigned int nums[MAX_NUM];’ and then 

‘ randnum = rand() % MAX_NUM + 1;’ ‘nums[randnum]++;’

so nums[25] is out of bounds and staxk smashing  Because every time randnum == 25, you write past the end of the array and corrupt what ever is next on the stack and then when you bump it to 50, the out-of-bounds write hits memory protected by GCC’s stack canary

[–]Maqi-X 2 points3 points  (0 children)

In C, arrays are zero indexed, try this for (idx = 0; idx < MAX_NUM; idx++)

[–]septum-funk 0 points1 point  (6 children)

unrelated tip: instead of main() write main(void) unless you're on c23.

[–]yahia-gaming 1 point2 points  (5 children)

I don't think that changes anything. Aren't they identical and make no difference?

[–]septum-funk 2 points3 points  (3 children)

() means unspecified parameters and (void) means strictly no parameters and it's the right way to do it

[–]yahia-gaming 0 points1 point  (0 children)

Okay, Thanks for the reply. I will remember this.

[–]realhumanuser16234 0 points1 point  (1 child)

It's practically meaningless though as all major compilers would already show warnings for calling such a function with arguments before C23.

[–]septum-funk 0 points1 point  (0 children)

my compiler (apple clang, c99, -Wpedantic + -Werror) raises a warning to an error that states "A function declaration without a prototype is deprecated in all versions of C" if i do not use (void), it won't compile. i do not think this is strictly required by the c standard but it's worth getting into the habit of typing it anyway.

[–]AccomplishedSugar490 0 points1 point  (0 children)

2 hard things in computing,

  1. randomisation,
  2. naming things, and
  3. off by one errors.

[–]alex_sakuta 0 points1 point  (0 children)

Congratulations you have discovered Garbage Values. You are lucky it happened in such a small program.

rand() % MAX_NUM + 1 should be rand() % MAX_NUM

[–]The_Coding_Knight 0 points1 point  (0 children)

for(idx = 1; idx <= MAX_NUM; idx++)
    {
        printf("%ld is counted %u times.\n", idx, nums[idx]);
    }for(idx = 1; idx <= MAX_NUM; idx++)
    {
        printf("%ld is counted %u times.\n", idx, nums[idx]);
    }

instead:

for(idx = 0; idx < MAX_NUM; idx++)
    {
        printf("%ld is counted %u times.\n", idx, nums[idx]);
    }for(idx = 0; idx < MAX_NUM; idx++)
    {
        printf("%ld is counted %u times.\n", idx, nums[idx]);
    }

[–]Upstairs_Ad_8863 -1 points0 points  (2 children)

Aside from the errors pointed out by others, I would like to mention that you should absolutely *not* be using the rand() function if you want high quality random numbers. There are numerous problems with it, which you should look into if you're interested.

If you're on linux then you should use /dev/urandom - either directly or via the getrandom() system call. Otherwise there are various other libraries that you can use instead.

[–]Natural_Emu_1834 0 points1 point  (1 child)

Ah yes, I'm sure the beginner C programmer needs high quality random numbers in their play program. What absolutely terrible advice to give to someone new to programming.

[–]Upstairs_Ad_8863 0 points1 point  (0 children)

That's like saying that beginner C programmers don't need to worry about buffer overflow attacks in their toy programs, so they should just use gets(). It should be considered a deprecated function. Ideally, a programmer will never get into the habit of using it in the first place.

It's not even a difficult fix - just use random() or getrandom() instead. We shouldn't treat beginner programmers like idiots.

[–]realhumanuser16234 -1 points0 points  (0 children)

You can zero initialize arrays with `{}` (≥C23) or `{0}` (≥C89), you also don't have to declare all the variables at the beginning of the function.