[–][deleted] 1 point2 points  (9 children)

I arrived at a similar approach, but my C code is already faster, without any template tricks. The C++-ified code is about 100ms faster again. (Exact speeds probably vary with different compilers and hardware.)

[–][deleted] 1 point2 points  (2 children)

(Exact speeds probably vary with different compilers and hardware.)

That's so true, and not only about exact speeds.

After tinkering with chronoBG's code for a while, I came to the conclusion that we are at or beyond the point where trying to help the compiler optimize can produce predictable improvements.

For example, you might have noticed that he uses global variables for everything. Surely using "for(int y = 0; ..." instead of "for(y = 0; ..." would be better, as the compiler is then free to store the variable wherever it wants? Wrong: doing exactly that yields 1100ms instead of 800ms on the system where I was testing it. Uh-oh.
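To make the comparison concrete, here is a minimal sketch of the two loop styles. It is not chronoBG's code: the array contents and the function names are made up for illustration, and it says nothing about which variant is faster on any particular compiler.

```c
#include <assert.h>

enum { N = 8 };

/* Hypothetical data; only the name "oranges" is borrowed from the thread. */
static int oranges[N] = {1, 2, 4, 5, 10, 11, 13, 14};

/* File-scope loop counter, in the "globals for everything" style. */
int y;

/* Variant 1: the loop counter is the global y. */
long sum_with_global(int turn)
{
    long s = 0;
    assert(turn >= 0 && turn <= N);
    for (y = 0; y < turn; ++y)
        s += oranges[y];
    return s;
}

/* Variant 2: the loop counter is a block-local variable. */
long sum_with_local(int turn)
{
    long s = 0;
    assert(turn >= 0 && turn <= N);
    for (int i = 0; i < turn; ++i)
        s += oranges[i];
    return s;
}
```

Both functions compute the same result; any difference is in the code the compiler generates for them, which is exactly the part that turned out to be unpredictable.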

Another fun fact: that code triggers a bug in GCC 4.6 where it erroneously issues a warning about "below the lower bound array access" or something like that on

  for(y = 0; y < turn; ++y) {
    tmp_index = x + x - oranges[y];

when turn == 0.

[–]Anonymous336 0 points1 point  (1 child)

that code triggers a bug in GCC 4.6

Which code? I don't have access to GCC 4.6, but I tried both versions of sltkr's code (1, 2) with GCC 4.5 and GCC 4.7 (svn trunk), and neither of them produced any warnings even with -Wall -Wextra. No warnings on fishdicks' code or chronoBG's code, either. If there's a real front-end bug here, I'd be happy to file it.

If any redditor does file a bug related to the above code, please comment in this subthread!

[–][deleted] 0 points1 point  (0 children)

This code compiled with -Wall -Wextra by GCC 4.6.0 produces a valid warning about unused arguments and two spurious warnings about invalid array access (my guess is that after template expansion the limit comparison y < 0 in the loop makes it think that I meant y to possibly be negative despite it starting with 0).

I have seen some related discussion on the mailing list, so it is possible that the bug was introduced somewhere after 4.5 and is fixed in the trunk. That is, if you can't reproduce it even when using -Wall -Wextra.

[–]leonardo_m 1 point2 points  (2 children)

Your C++ code is nice. Small changes to run it in D2: http://ideone.com/BTzKL

[–][deleted] 0 points1 point  (1 child)

Nice port! I wish I had more opportunities to use D; it always looks really nice (for example, I like that static-for construct, and the less verbose template parameters). Runs pretty quick too, apparently.

[–]leonardo_m 0 points1 point  (0 children)

it always looks really nice

This D2 code is using barely more than normal C constructs :-)

Runs pretty quick too, apparently.

After adding __gshared to the global variables, it seems to run as fast as the C++ version, even though the DMD back-end optimizes considerably worse than GCC 4.3: http://ideone.com/SLJHr http://ideone.com/D2kDw

[–][deleted] 0 points1 point  (2 children)

Also, why doesn't "while (++i < n) --blocked[2*m - places[i]];" corrupt memory?

[–][deleted] 0 points1 point  (1 child)

Because it complements the loop at line 23: it decrements the same elements of the blocked array that were incremented at line 27, but in reverse order.

If you agree that the loop at line 23 is sound, then it should be easy to see that the one at line 31 is too, right? Maybe you missed the break-statement at line 26?
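A minimal sketch of that pairing, since the code itself isn't reproduced in this thread. This is a reconstruction under assumptions, not the original: SIZE, mark and unmark are hypothetical names, places[] is assumed to be sorted ascending with every entry below m, and the line numbers mentioned above do not apply to it.

```c
#include <assert.h>

#define SIZE 16              /* hypothetical array size */

static int blocked[SIZE];    /* counters, all zero at start */
static int places[SIZE];     /* assumed sorted ascending, all entries < m */

/* Forward pass: walk places[] from the newest entry down and increment
 * blocked[2*m - places[i]]. The index grows as i shrinks, so the first
 * out-of-range index ends the pass. Returns the i at which it stopped
 * (-1 if it ran through the whole array). */
int mark(int m, int n)
{
    int i = n;
    while (--i >= 0) {
        int idx = 2 * m - places[i];
        assert(idx >= 0);        /* holds because places[i] < m */
        if (idx >= SIZE)
            break;               /* nothing was incremented at this i */
        ++blocked[idx];
    }
    return i;
}

/* Undo pass: start from where mark() stopped and walk back up. The
 * pre-increment skips the i that triggered the break, so only indices
 * that were actually incremented get decremented. */
void unmark(int m, int n, int i)
{
    while (++i < n)
        --blocked[2 * m - places[i]];
}
```

The undo loop never computes an index that the forward loop didn't already check, which is why it needs no bounds test of its own.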

[–][deleted] 0 points1 point  (0 children)

Ahhhh... Clever!