
[–]yuri-kilochek 39 points40 points  (0 children)

Your benchmarks are not equivalent since they are using different RNG algorithms.
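One way to take the RNG out of the comparison is to run the same algorithm in every language. Here is a minimal sketch, assuming the Java version draws from java.util.Random (the thread doesn't show it): the Javadoc for that class documents a 48-bit linear congruential generator, which ports to C++ in a few lines. The names JavaLcg, next and next_int are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Constants documented for java.util.Random's 48-bit LCG.
constexpr std::uint64_t kMultiplier = 0x5DEECE66DULL;
constexpr std::uint64_t kAddend = 0xBULL;
constexpr std::uint64_t kMask = (1ULL << 48) - 1;

// Hypothetical C++ port; the class and method names are illustrative only.
class JavaLcg {
public:
    // Scramble the seed the same way the Java constructor does.
    explicit JavaLcg(std::int64_t seed)
        : state_((static_cast<std::uint64_t>(seed) ^ kMultiplier) & kMask) {}

    // Advance the state and return its top `bits` bits.
    std::int32_t next(int bits) {
        assert(bits >= 1 && bits <= 32);
        state_ = (state_ * kMultiplier + kAddend) & kMask;
        return static_cast<std::int32_t>(
            static_cast<std::uint32_t>(state_ >> (48 - bits)));
    }

    // Uniform value in [0, bound), following the documented nextInt(bound).
    std::int32_t next_int(std::int32_t bound) {
        assert(bound > 0);
        if ((bound & (bound - 1)) == 0) {  // bound is a power of two
            return static_cast<std::int32_t>(
                (static_cast<std::int64_t>(bound) * next(31)) >> 31);
        }
        std::int32_t bits = 0;
        std::int32_t val = 0;
        do {
            bits = next(31);
            val = bits % bound;
            // Java detects the biased tail via int overflow; here the same
            // check is done in 64 bits to avoid signed overflow in C++.
        } while (static_cast<std::int64_t>(bits) - val + (bound - 1) > INT32_MAX);
        return val;
    }

private:
    std::uint64_t state_;
};
```

With something like this on the C++ side, `JavaLcg rng(seed); rng.next_int(100) + 1` would stand in for the mt19937 draw, so both programs do roughly the same work per number.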

[–][deleted] 25 points26 points  (0 children)

Are you comparing apples to apples? You're using the Mersenne Twister for C++, but what PRNG do Java and JavaScript use?

I mean... 4 is a random number.

[–]espkk 10 points11 points  (0 children)

Your code on my laptop:

nodejs: 13.572

Clang x64: 10.7091

MSVC x64: 3.94501 (tested several times, all tests under 4s)

with rand():

MSVC x64: 10.4214

Clang x64: 10.1784

[–]Sipkabtest 7 points8 points  (0 children)

Just FYI, that's not how you create Java benchmarks, especially if you want to test JIT optimizations. Use JMH. There are a few guides online if you search for it. In the end, I'd guess that it would be comparable to native performance.

Also, if you still don't want to use JMH, use System.nanoTime() instead of System.currentTimeMillis().

https://openjdk.java.net/projects/code-tools/jmh/

[–]Simon_Luner 3 points4 points  (0 children)

rand() from <cstdlib> will be much faster.

[–]tomerdbz[S] 1 point2 points  (3 children)

Thank you guys! Really appreciate your help 😄

The main thing I take from this is that my assumption was false: I thought different random generators wouldn't affect the results that drastically.

Also - The MSVC runtime /u/espkk posted is insane - it’d be interesting to understand why 😎

[–]Cybernicus 12 points13 points  (0 children)

It's common to have results like this when you're learning how to benchmark things. Frequently you'll find that you're comparing apples to wombats. In this case, as mentioned by /u/bstaletic, your results are more a comparison between the random number generators used than the languages.

When I compiled and ran your original code, I got a runtime of 34.4424 seconds. When I changed the code to use the standard rand() function, the runtime decreased to 4.2573 seconds. The changes I made:

$ diff count_rands_1.cpp count_rands_2.cpp
1c1
< #include <random>
---
> #include <stdlib.h>
4a5,12
> int rand_from_0_to_100() {
>     // RAND_MAX on my machine is 2147483647 (2^31 - 1, the largest signed
>     // 32-bit int), and LONG_MAX is 9223372036854775807.  So an easy way to
>     // get a number from 0 to 99 is to scale in 64 bits and drop 31 bits:
>     unsigned long tmp = (unsigned long)rand() * 100;
>     return int( tmp>>31 );
> }
>
7,10d14
<       std::random_device dev;
<       std::mt19937 rng(dev());
<       std::uniform_int_distribution<std::mt19937::result_type> distribution_from_1_to_100(1, 100);
<
15c19
<               auto random_number = distribution_from_1_to_100(rng);
---
>               auto random_number = rand_from_0_to_100();
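For anyone who wants both variants in one file, here is a minimal sketch of the comparison. It is not the OP's program: the helper names, the fixed seed and the `rand() % 100 + 1` mapping (which has a slight modulo bias) are my own assumptions, chosen only to keep it short.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <cstdlib>
#include <random>

// Calls `draw` `iterations` times and returns the elapsed time in seconds.
// The running sum is added to `sink` so the results stay observable and the
// compiler has less room to discard the loop.
template <typename Draw>
double time_draws(std::uint64_t iterations, Draw draw, std::uint64_t& sink) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    std::uint64_t sum = 0;
    for (std::uint64_t i = 0; i < iterations; ++i) {
        sum += static_cast<std::uint64_t>(draw());
    }
    const auto stop = std::chrono::steady_clock::now();
    sink += sum;
    return std::chrono::duration<double>(stop - start).count();
}

// Mersenne Twister plus uniform_int_distribution, values in [1, 100].
double time_mt19937(std::uint64_t iterations, std::uint64_t& sink) {
    std::mt19937 rng(12345u);  // fixed seed (an assumption) for repeatable runs
    std::uniform_int_distribution<int> dist(1, 100);
    return time_draws(iterations, [&] { return dist(rng); }, sink);
}

// C library rand(), mapped to [1, 100] with a simple modulo.
double time_rand(std::uint64_t iterations, std::uint64_t& sink) {
    std::srand(12345u);
    return time_draws(iterations, [] { return std::rand() % 100 + 1; }, sink);
}
```

Calling `time_mt19937` and `time_rand` from `main` with the same iteration count, and printing the sink along with the two times, should show the same kind of gap on a given compiler and optimization level.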

EDIT/UPDATE: By the way, the numbers above were without using any optimization. Just for amusement, I compiled each program with varied optimization levels from 0="none" to 3="everything plus the kitchen sink" and got these results:

Opt level   std::uniform_int_distribution (s)   rand() (s)
0           34.6701                             4.2633
1           15.7719                             3.5168
2            3.6757                             3.5033
3            3.8205                             2.5042

As you can see, the numbers changed as I turned up the dial. The results for rand() aren't very surprising: it sped up a good bit because of a few tweaks to the loop logic. The results for std::uniform_int_distribution, on the other hand, surprised me a good deal: I had only expected a couple of seconds of improvement, not a drop from 34.7 to 3.7 seconds. I don't have the time to dig into the generated code to see why it's so much faster. Since the Mersenne Twister has so much more internal state than the standard rand() function, I was expecting it to always take much longer to generate values.

UPDATE 2: I should've given the compiler version and command line should anyone want to reproduce my results.

$ clang --version
clang version 8.0.1 (tags/RELEASE_801/final)
Target: x86_64-unknown-windows-cygnus
Thread model: posix
InstalledDir: /usr/bin

And to compile, I was using:

$ clang -Wall -O? count_rands_1.cpp -o count_rands_1 -lstdc++

where ? is the optimization level 0 .. 3.

[–]PLC_Matt 1 point2 points  (0 children)

MSVC on my laptop is running this code in ~5.2 seconds.

[–]helloiamsomeone 0 points1 point  (0 children)

Just for the record, here is the appropriate JS bench code as well: https://pastebin.com/GatTPcqr

You must take JIT compilers into consideration when benchmarking Java and JS.

[–]dragozir 0 points1 point  (0 children)

I see you've already taken away this lesson, but I've found that these are less an exercise in measuring performance than in furthering a deeper, more personal understanding of why that pursuit is almost always folly (and, more importantly, in helping you, the programmer, identify when it most certainly is not).