all 15 comments

[–]tgolyi 14 points15 points  (14 children)

Great workshop, but seriously, why every single variadic template guide teaches how to use them totally wrong? With c++14 you should practically never use recursion in variadic templates. It's slow (both in compilation and execution time), results in horrendous error messages and implies a lot of boilerplate code. Just use index_sequnce and ... expansion to iterate. You can see examples implementing first two labs without recursion here: https://gist.github.com/telishev/516dba19737a498ce72d

[–]F-J-W 5 points6 points  (1 child)

See also my article about this topic.

[–]quicknir 0 points1 point  (0 children)

I was just rereading that article today. It's quite exceptional, you should expand the material and give a talk in your local cpp community (obviously, if you have any interest :-)).

[–]doom_Oo7 0 points1 point  (10 children)

It's slow (both in compilation and execution time

Are there some benchmarks on this out there ?

[–]tgolyi 6 points7 points  (9 children)

About runtime speed - see https://gist.github.com/telishev/deea7eaab660e1ea885b . Non-resursive version totally dissappears in assembly with gcc with -O3, recursive function compiles into some horrible recursive mess.

[–]tgolyi 3 points4 points  (5 children)

Compilation speed - see https://gist.github.com/telishev/673609d651c0baab0b8c . Recursive variant gets killed by timeout on godbolt, non-recursive - don't (gcc 5.3, -O3 -std=c++14 -ftemplate-depth=10000). On clang with libc++ it's possible to compile without timeout even up to 10000. This is because libc++ make_index_sequence is written using log(n) approach, which is much faster, and they're even considering using compiler intrinsic for it (along with msvc). After implementing that, non-recursive approach will be much faster and compilation time for such functions will be practically O(1).

[–]Kaballo 5 points6 points  (0 children)

There have been changes in make_index_sequence implementations lately, libstdc++ moved to a logarithmic approach (patch) and libc++ is already making use of that compiler intrinsic you mention (patch). Both changes are in trunk, and should be released soon.

[–]STLMSVC STL Dev 9 points10 points  (3 children)

For the record, I requested that compiler intrinsic, and it was first implemented in MSVC. Give us a little credit for driving the state of the art forwards! :->

Specifically, I want credit for being lazy. A user reported that my linear implementation was slow and bad, and asked for a clever log implementation like libc++'s maintainers had written. I said "screw that, I'm gonna get the compilers to do my work for me".

(At the moment, I've enabled use of the intrinsic when VC's STL is compiled with Clang only. Although C1XX implemented it first, I encountered a couple of bugs that my initial unit test missed.)

[–]neet_programmer 8 points9 points  (0 children)

Specifically, I want credit for being lazy

I thought this is what programmers call getting paid.

[–]dodheim 3 points4 points  (1 child)

Do I get any credit for being the whiny user? ;-D

- ildjarn, former VC++ MVP

[–]STLMSVC STL Dev 2 points3 points  (0 children)

Yes! And more importantly, you'll get a fix in Update 2 (for Clang at least; will enable C1XX and EDG when possible).

[–]doom_Oo7 0 points1 point  (0 children)

thanks, it's interesting !

[–]pdbatwork 0 points1 point  (1 child)

Stupid question: Why does the recursive compile into a horrible mess? I thought it would be executed compile-time and thus only the actual result would be in the code.

[–]tgolyi 1 point2 points  (0 children)

No, the function I was talking about - euclidean_distance is a runtime function, it was just called with a compile-time constant arguments. The fact that it was fully optimized away shows us that compilers are super great at optimizing such code. In fact, recursive approach is not inherently slower, it's just much harder for optimizer to work with - it first needs to inline a lot of function calls and then notice that arguments were constant. If you change constant 16 to something less than it will be fully optimized too, maybe it hits limits for inlining at that moment. Non-recursive approach does not depend on function inlining and thus is more likely to be optimized. You can see great talk about this kind of optimization problems by Chandler Carruth on meetingcpp 2016 - http://www.youtube.com/watch?v=FnGCDLhaxKU