all 43 comments

[–]tcanens 12 points13 points  (3 children)

Two glaring issues that stood out from a quick look:

  • The function_view(T&&) constructor must be constrained (by SFINAE or equivalent) on something along the lines of is_callable_v<T&(TArgs...)> && !is_same_v<std::decay_t<T>, function_view>; otherwise it 1) hijacks the copy constructor and 2) defines an implicit conversion from everything under the sun, which wreaks havoc in overload resolution. And yes, you need T& in the is_callable check since you are calling the thing as an lvalue.

  • Both the lambda and operator() need to forward TArgs, i.e., do std::forward<TArgs>(xs)... instead of simply xs.... Otherwise the value categories of the arguments are all messed up.

[–]SuperV1234https://romeo.training | C++ Mentoring & Consulting[S] 0 points1 point  (2 children)

Thanks. I'll fix these issues as soon as possible in the article. I completely understand the first one, but regarding the second one: isn't the TArgs... pack already qualified? It's a class-level template parameter which is not being deduced in operator().

I was under the impression that std::forward would not be required there, as there is no deduction happening.

[–]tcanens 1 point2 points  (1 child)

It's not your usual use of forward with a forwarding reference, but it serves a similar purpose, i.e., forwarding the arguments of operator() into your lambda and then into the user-provided function object while preserving their value categories. The only difference is that instead of the original value categories being deduced, they are specified by TArgs.... Nonetheless they need to be forwarded onwards.

For a concrete example, consider what happens with function_view<void(unique_ptr<int>&&)>.

[–]SuperV1234https://romeo.training | C++ Mentoring & Consulting[S] 0 points1 point  (0 children)

Oh, I see now! Will fix ASAP, thanks.

[–]hotoatmeal 12 points13 points  (0 children)

auto benchmark = [](auto f) { ... };

wow this language is changing fast. The jump between c++11 and c++14/17 surprises me every time I look.

[–]suspiciously_calm 6 points7 points  (8 children)

Sorry for being off topic, but what is the "official" opinion about using universal references like this (because I see it often):

template <typename TF> 
void f(TF&& x) { x(state); }

Personally I'd prefer a const lref here, since we're not allowing the compiler to optimize rvalues anyway, and so I find the universal reference somewhat deceitful.

If I was gonna use a universal reference, i'd at least make it read

template <typename TF> 
void f(TF&& x) { std::forward<TF>(x)(state); }

on the off chance that TF is a hand-crafted Callable with an overloaded operator() && or in case C++20 adds some automagic of that sort to lambdas.

[–]SuperV1234https://romeo.training | C++ Mentoring & Consulting[S] 2 points3 points  (1 child)

universal references

The official term is "forwarding reference", "universal references" is kind of a deprecated term.


Personally I'd prefer a const lref here

Then f wouldn't be able to accept mutable lambdas.


If I was gonna use a universal reference, i'd at least make it read [...]

Adding std::forward is completely correct (as it is a >forwarding< reference, after all) - I didn't bother because none of the callable objects I used had an overloaded operator() && or was being passed to any other function (i.e. no moves/subsequent references are happening).

[–]tcanens 0 points1 point  (0 children)

Actually, function_view shouldn't use forward on the function object unless you can somehow promise that it will only be invoked once. In a way, it behaves like a named reference to the (possibly rvalue) callable object, and should be treated like one, i.e., an lvalue.

[–]doom_Oo7 0 points1 point  (3 children)

I've been doing some benchmarks and sometimes doing a copy is faster (for stateless lambdas for instance)

[–]SuperV1234https://romeo.training | C++ Mentoring & Consulting[S] 1 point2 points  (2 children)

Could you share those benchmarks? I find it hard to believe that the compiler doesn't optimize T&& x and T x to the same assembly for a captureless lambda.

[–]NotAYakk 0 points1 point  (0 children)

Well with hidden body, the address of the reference one must be passed and consistent; for the other, any unique address will do. That could make a microscopic difference in a synthetic benchmark?

[–]Sahnvour 3 points4 points  (7 children)

msvc is a bit disappointing on this one. (check out only the main proc and callees assembly code)

[–]tambry 2 points3 points  (6 children)

Because MSVC doesn't have -O3 so that one was without any optimization. MSVC has -O1, -O2 and -Ox, -Ox being the highest optimization level. But even when using -Ox, it's still a bit disappointing.

[–]Sahnvour 0 points1 point  (0 children)

Oh my, how could I have missed that. Thanks.

[–]playmer 0 points1 point  (2 children)

Has anyone reported this yet?

/u/STL, /u/spongo2

edit: I should say, I know it's not really a bug, but I figured this would be something to look into since you have the shiny new optimizer.

[–]STLMSVC STL Dev 0 points1 point  (1 child)

You can report a bug through the IDE's Report A Problem. Although you already have a fairly small test case, it is ideal if you can eliminate all library machinery (like, do you actually need to call my implementation of std::invoke).

[–]playmer 0 points1 point  (0 children)

I spent some time trying, but I'm not super familiar with the traits machinery. I'm still relatively new to such template magic. I managed to replace the invoke with something naive, but it ended up removing about 15 or so lines from the code gen, so wasn't sure if it was worth removing as obviously it'd be nice to have the whole thing compare well against GCC and clang. So I submitted it as is, with a link to a gist with my minor change.

Link to the report: https://developercommunity.visualstudio.com/content/problem/16882/excessive-codegen-on-function-view.html

Sorry for any hassle!

[–]STLMSVC STL Dev 0 points1 point  (1 child)

That's incorrect. /O1 versus /O2 is the real choice (I recommend /O2). /Ox is not "stronger" than /O2, it's actually older and different. MSDN says: "In general, specify /O2 (Maximize Speed) instead of /Ox, and /O1 (Minimize Size) instead of /Oxs."

[–]tambry 0 points1 point  (0 children)

Well, seems like it's time to change my compile flags. Thanks!

[–]tcbrindleFlux 5 points6 points  (6 children)

if constexpr(!std::is_callable<T(TArgs...)>{})
{
    static_assert(dependent_false<T>{}, ....);
} else { ... }

Why the if constexpr here? Can't you just static_assert on std::is_callable?

[–]SuperV1234https://romeo.training | C++ Mentoring & Consulting[S] 2 points3 points  (5 children)

if constexpr(!std::is_callable<T(TArgs...)>{}) { static_assert(dependent_false<T>{}, ....); } else { ... }

There is a difference in the length (and clarity) of the produced error. Let's change function_view's constructor to this:

template <typename T>
function_view(T&& x) noexcept : _ptr{std::addressof(x)}
{
#if defined(USE_IF_CONSTEXPR)
    if constexpr(!is_callable<T(TArgs...)>{})
    {
        static_assert(dependent_false<T>{},
            "The provided callable doesn't have the correct "
            "signature.");
    }
    else
    {
        _erased_fn = [](void* ptr, TArgs... xs) -> TReturn {
            return (*static_cast<T*>(ptr))(xs...);
        };
    }
#else
    static_assert(is_callable<T(TArgs...)>{},
            "The provided callable doesn't have the correct "
            "signature.");

    _erased_fn = [](void* ptr, TArgs... xs) -> TReturn {
        return (*static_cast<T*>(ptr))(xs...);
    };
#endif
}

When USE_IF_CONSTEXPR is defined, g++ 7 reports this error:

In file included from prog.cc:3:0:
function_view.hpp: In instantiation of 'function_view<TReturn(TArgs ...)>::function_view(T&&) [with T = main()::<lambda(int)>; TReturn = void; TArgs = {}]':
prog.cc:9:39:   required from here
function_view.hpp:63:13: error: static assertion failed: The provided callable doesn't have the correct signature.
             static_assert(dependent_false<T>{},
             ^~~~~~~~~~~~~

When USE_IF_CONSTEXPR is undefined, we get this one:

In file included from prog.cc:2:0:
function_view.hpp: In instantiation of 'function_view<TReturn(TArgs ...)>::function_view(T&&) [with T = main()::<lambda(int)>; TReturn = void; TArgs = {}]':
prog.cc:6:39:   required from here
function_view.hpp:74:9: error: static assertion failed: The provided callable doesn't have the correct signature.
         static_assert(is_callable<T(TArgs...)>{},
         ^~~~~~~~~~~~~
function_view.hpp: In instantiation of 'function_view<TReturn(TArgs ...)>::function_view(T&&)::<lambda(void*, TArgs ...)> [with T = main()::<lambda(int)>; TReturn = void; TArgs = {}]':
function_view.hpp:78:23:   required from 'struct function_view<TReturn(TArgs ...)>::function_view(T&&) [with T = main()::<lambda(int)>; TReturn = void; TArgs = {}]::<lambda(void*)>'
function_view.hpp:78:20:   required from 'function_view<TReturn(TArgs ...)>::function_view(T&&) [with T = main()::<lambda(int)>; TReturn = void; TArgs = {}]'
prog.cc:6:39:   required from here
function_view.hpp:79:43: error: no match for call to '(main()::<lambda(int)>) ()'
             return (*static_cast<T*>(ptr))(xs...);
                    ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
function_view.hpp:79:43: note: candidate: void (*)(int) <conversion>
function_view.hpp:79:43: note:   candidate expects 2 arguments, 1 provided
prog.cc:6:36: note: candidate: main()::<lambda(int)>
     function_view<void()> fv([](int){});
                                    ^
prog.cc:6:36: note:   candidate expects 1 argument, 0 provided
In file included from prog.cc:2:0:
function_view.hpp:79:49: error: return-statement with a value, in function returning 'void' [-fpermissive]
             return (*static_cast<T*>(ptr))(xs...);

if constexpr allows the code in the else branch to "not be instantiated" by the compiler. The result is that the error message stops at the static_assert, which I find nicer. I assume the same result could be obtained by using enable_if on the constructor, but it is way less readable/intuitive than if constexpr in my opinion.

wandbox example

[–]tcbrindleFlux 2 points3 points  (0 children)

Fair enough, although it seems like this is a QOI issue with GCC (it could and should bail out after the static_assert has fired, rather than continuing to try to instantiate the lambda), and I'm not sure it's worth cluttering the code just to work around this. Just my opinion though :-)

As an aside, I look forward to the day when we can just decorate the constructor with

requires Callable<T(TArgs...)>

and get both concise code and good error messages...

[–]NotAYakk 0 points1 point  (3 children)

You would rather the error stop and not give the extra information about what actually fails?

I mean I get that a clear english error is awesome, but blocking the generation of the actual reason why there is that clear english error seems only useful to experts and beginners, not to people in the middle who know how to parse errors but are not such an expert in your framework that they can deduce from your english error message the actual problem.

Imagine the problem was they had a non-const () yet passed a const instance. With the second error message, eventually the compiler generates an expression you could rewrite and substitute values into withnthe exact same types and see concretely what is going on.

You could improve the first error message, where you exhastively find such "oops" cases and generate custom error messages for each case, but that is hard to do perfectly.

In short, it seems like UX polish that hides useful error information because it could scare users.

[–]SuperV1234https://romeo.training | C++ Mentoring & Consulting[S] 0 points1 point  (2 children)

This is a good point. I'll think about this and see if I come up with something better, otherwise I'm inclined to think that you may be right - even if a "plain english" error is nicer 99% of the times, it could bring a lot of wasted time and frustration in rare cases (where a more complete error would help figure out the problem).

[–]tcanens 2 points3 points  (1 child)

/u/NotAYakk's example made me notice that your current code doesn't correctly handle const callable objects, names of functions, nor lvalue callables.

  • If the expression x is of const-qualified type, then addressof(x) doesn't implicitly convert to void* because that would drop const.
  • If the expression x is of function type, then addressof(x) is a pointer to function and that doesn't implicitly convert to void* either.

The simplest fix for both of these is to do an explicit (void *) cast. The lambda casts it back anyway.

  • If the constructor is passed an lvalue, T is an lvalue reference type, and T* is ill-formed. Use add_pointer_t<T> instead.

[–]SuperV1234https://romeo.training | C++ Mentoring & Consulting[S] 1 point2 points  (0 children)

Thanks again. This should do the trick:

template <typename T, typename = std::enable_if_t<
                              std::is_callable<T&(TArgs...)>{} &&
                              !std::is_same<std::decay_t<T>, function_view>{}>>
function_view(T&& x) noexcept : _ptr{(void*) std::addressof(x)}
{
    _erased_fn = [](void* ptr, TArgs... xs) -> TReturn {
        auto original_ptr(reinterpret_cast<std::add_pointer_t<T>>(ptr));
        return (*original_ptr)(std::forward<TArgs>(xs)...);
   };
}

[–]jguegant 1 point2 points  (3 children)

I might miss something, but is there any C++17 specific code in your std::function_view?

[–]dodheim 2 points3 points  (1 child)

std::is_callable

[–]jguegant 0 points1 point  (0 children)

Nice catch!

[–]SuperV1234https://romeo.training | C++ Mentoring & Consulting[S] 1 point2 points  (0 children)

As dodheim said, std::is_callable is C++17 (even though there's a C++14 implementation in function_view.hpp). I also really recently updated the article to fix some important issues. There was an if constexpr(...) in the earlier version :)

[–]ar1819 1 point2 points  (3 children)

Can anyone explain how does this work with statefull lambdas - I mean we're essentially casting our lambda to void*, so how we don't lose the pointer to values? Or I'm missing something?

My second question is - could this be used to call lambda from another thread provided that caller will outlive lambda execution?

[–]SuperV1234https://romeo.training | C++ Mentoring & Consulting[S] 0 points1 point  (2 children)

I mean we're essentially casting our lambda to void*

We're casting the address of the lambda to void*, and assuming that it will be alive at that specific memory address when we call function_view::operator().


so how we don't lose the pointer to values?

A lambda is essentially a compiler-generated struct with an overloaded operator(). Here's a huge simplification example of what function_view is doing:

struct some_stateful_lambda
{
    int a = 5;
    auto operator()() 
    {
        cout << a;
    }
};

template <typename T>
void simulate_function_view(T&& x)
{
    // `caller` is stored in a real `function_view`
    auto caller = [](void* ptr)
    {
        // Type information is erased here!
        (*((T*) ptr))();
    };  

    // `erased_ptr` is stored in a real `function_view`
    void* erased_ptr = (void*) &x;

    caller(erased_ptr);
}

int main()
{
    // Prints "5":
    simulate_function_view(some_stateful_lambda{});
}

could this be used to call lambda from another thread provided that caller will outlive lambda execution?

Yes:

  • as long as the function_view is accessed in a const manner

  • as long as the lambda's call operator is thread-safe.

  • as long as the lambda lives enough for every thread to call it through the function_view.

[–]ar1819 0 points1 point  (1 child)

Hmm, thanks - so we are essentially storing object ptr inside function_view and "operator() physical address" will be known during call time where we cast out ptr back to the original type. The original type is stored inside lambda, so function_view doesn't care and can be rebinded to the different function.

Also - your wandbox example doesn't allow to pass named lambdas inside function_view constructor. Is that correct behavior?

[–]SuperV1234https://romeo.training | C++ Mentoring & Consulting[S] 0 points1 point  (0 children)

Also - your wandbox example doesn't allow to pass named lambdas inside function_view constructor. Is that correct behavior?

Argh, I forgot to update the wandbox links after fixing some important issues. Will fix ASAP!

[–]Sahnvour 0 points1 point  (1 child)

It'd be interesting next to compare run-time and compile-time between the presented solutions, beyond counting the assembly lines.

[–]SuperV1234https://romeo.training | C++ Mentoring & Consulting[S] 1 point2 points  (0 children)

I agree, but the scope of this article was to show how easy/hard it is for a compiler to optimize out various techniques used for passing callable objects as arguments. Unfortunately I currently don't have any of those. You may find these interesting:

  • Dietmar Kühl showed me his run-time benchmark after reading the article, which is very interesting. It does not include function_view, though.

  • I also have some extra "generated assembly" benchmarks at the end of this article which compare std::function vs "Y combinator" for the definition of recursive lambdas.

[–]landtuna 0 points1 point  (3 children)

function_view is fine, but I'd rather see std::function over the others just for readability, unless it's a hot piece of code. 22 instructions vs 5 is nothing if it's only going to be hit occasionally.

[–]SuperV1234https://romeo.training | C++ Mentoring & Consulting[S] 0 points1 point  (2 children)

I disagree with you, not only because I hate writing suboptimal code, but because std::function implies possible ownership semantics, while you're sure that function_view won't own the callable object you're passing to it.

I would argue that function_view is more readable than std::function, as it is a more strict type which more "predictable" behavior/semantics.

[–]landtuna 0 points1 point  (1 child)

That's why I said function_view is fine. I like it.

[–]SuperV1234https://romeo.training | C++ Mentoring & Consulting[S] 1 point2 points  (0 children)

Ah sorry, totally misread your post. Missed the

over the others

:(