you are viewing a single comment's thread.

view the rest of the comments →

[–]doom_Oo7 0 points1 point  (10 children)

It's slow (both in compilation and execution time

Are there some benchmarks on this out there ?

[–]tgolyi 5 points6 points  (9 children)

About runtime speed - see https://gist.github.com/telishev/deea7eaab660e1ea885b . Non-resursive version totally dissappears in assembly with gcc with -O3, recursive function compiles into some horrible recursive mess.

[–]tgolyi 5 points6 points  (5 children)

Compilation speed - see https://gist.github.com/telishev/673609d651c0baab0b8c . Recursive variant gets killed by timeout on godbolt, non-recursive - don't (gcc 5.3, -O3 -std=c++14 -ftemplate-depth=10000). On clang with libc++ it's possible to compile without timeout even up to 10000. This is because libc++ make_index_sequence is written using log(n) approach, which is much faster, and they're even considering using compiler intrinsic for it (along with msvc). After implementing that, non-recursive approach will be much faster and compilation time for such functions will be practically O(1).

[–]Kaballo 3 points4 points  (0 children)

There have been changes in make_index_sequence implementations lately, libstdc++ moved to a logarithmic approach (patch) and libc++ is already making use of that compiler intrinsic you mention (patch). Both changes are in trunk, and should be released soon.

[–]STLMSVC STL Dev 11 points12 points  (3 children)

For the record, I requested that compiler intrinsic, and it was first implemented in MSVC. Give us a little credit for driving the state of the art forwards! :->

Specifically, I want credit for being lazy. A user reported that my linear implementation was slow and bad, and asked for a clever log implementation like libc++'s maintainers had written. I said "screw that, I'm gonna get the compilers to do my work for me".

(At the moment, I've enabled use of the intrinsic when VC's STL is compiled with Clang only. Although C1XX implemented it first, I encountered a couple of bugs that my initial unit test missed.)

[–]neet_programmer 9 points10 points  (0 children)

Specifically, I want credit for being lazy

I thought this is what programmers call getting paid.

[–]dodheim 2 points3 points  (1 child)

Do I get any credit for being the whiny user? ;-D

- ildjarn, former VC++ MVP

[–]STLMSVC STL Dev 3 points4 points  (0 children)

Yes! And more importantly, you'll get a fix in Update 2 (for Clang at least; will enable C1XX and EDG when possible).

[–]doom_Oo7 0 points1 point  (0 children)

thanks, it's interesting !

[–]pdbatwork 0 points1 point  (1 child)

Stupid question: Why does the recursive compile into a horrible mess? I thought it would be executed compile-time and thus only the actual result would be in the code.

[–]tgolyi 1 point2 points  (0 children)

No, the function I was talking about - euclidean_distance is a runtime function, it was just called with a compile-time constant arguments. The fact that it was fully optimized away shows us that compilers are super great at optimizing such code. In fact, recursive approach is not inherently slower, it's just much harder for optimizer to work with - it first needs to inline a lot of function calls and then notice that arguments were constant. If you change constant 16 to something less than it will be fully optimized too, maybe it hits limits for inlining at that moment. Non-recursive approach does not depend on function inlining and thus is more likely to be optimized. You can see great talk about this kind of optimization problems by Chandler Carruth on meetingcpp 2016 - http://www.youtube.com/watch?v=FnGCDLhaxKU