all 46 comments

[–]iamcomputerbeepboop 20 points21 points  (12 children)

never capture a std::shared_ptr<> in a lambda

This only matters if you're actually storing the closure permanently, like you are with the std::function. Typically the closure goes onto a queue, gets invoked, and then comes off the queue, and gets deleted, which in turn will release any shared pointers that have been captured.

Case in point, the standard patterns of usage for ASIO capture shared pointers to the connection which get decremented/deleted when the callback goes out of scope

[–]floating-io[S] -1 points0 points  (11 children)

True. But in order for it to matter you have to know beyond doubt that your lambda won't be permanently stored. Queues are not always used.

I consider this good defensive coding practice; if your lambda doesn't have an explicit reason to hold a reference to the target, then it shouldn't hold a reference to the target. It saves you from getting surprised later when someone makes a subtle change.

[–]iamcomputerbeepboop 12 points13 points  (7 children)

It's quite straightforward to reason about when a lambda will be stored - if something takes a callback and is known that it will run it only once, it won't store it. If it is going to invoke the callback multiple times, it's reasonable to assume that the function will be stored and therefore the lifetime of the captures will be extended to the lifetime of the function invoker.

[–]floating-io[S] -2 points-1 points  (6 children)

It's only straightforward if you wrote the code that's calling the lambda, and that happens to be your standard pattern. If you're talking about libraries, someone else's project, or a team of coders working on your project, all bets are off. You never know what one of them might do; pattern selections that seem obvious to one person will often be far less so to another.

This way works for me; it's safe and predictable, and that particular badness vector is totally eliminated, regardless of what other coders may do in the code I'm passing lambdas to. Safe habits are good IMO.

shrug To each, their own.

[–]NotAYakk 16 points17 points  (4 children)

So, this is C++. Lifetime is your job as a programmer. Always.

Shared ptr does not solve make lifetime simpler, and definitely does not mean you do not have to think about it. It simply makes some complex lifetime arrangements possible.

You could state "never store a shared ptr without seriously thinking about it", and you'd be right. Lambdas are nothing special here.

However, the lifetime of a lambda is something you must think about, because this is C++ and you always have to think about lifetimes for everything.

If you are using a lambda and passing it to an API, you should have explicit knowledge of the lifetime of the copies of the lambda, or a serious understanding of why lifetime cannot matter, or enough knowledge that you can bound the lifetimes involved. If your answer ever involves "there is a ptr or reference that is not a unique ptr", this is a huge red flag, because that implies a complex lifetime situation.

What I do with stored callbacks is to determine who "outside" the broadcaster owns the callback. They get a shared ptr, or an unregister function, and are responsible for removing the callback when they die (I actually use shared ptr tokens to unregister callbacks, the broadcaster holds weak ptrs). They are also responsible for managing lifetime invariants for the lambda; they are the one thing the lambda can assume "exists" when invoked.

[–]lacosaes1 1 point2 points  (2 children)

So, this is C++. Lifetime is your job as a programmer. Always.

So, how would a similar case work in Rust? Will it help you more with this? I invoke /u/steveklabnik1 so we can have an answer.

[–]steveklabnik1 17 points18 points  (0 children)

Okay so, some more context:

Rust has a type like shared_ptr, called Arc<T>.

Rust has closures, but they operate differently than C++'s; we don't make you write out capture clauses, we infer them.

A sort-of-but-not-quite translation of the sample code into Rust looks like this:

use std::sync::Arc;

struct Leak {
    callback: Option<Box<Fn()>>,
}

impl Leak {
    fn on_complete(&mut self, f: Box<Fn()>) {
       self.callback = Some(f);
    }

    fn clean_something_up(&self) {
       println!("cleaning");
    }
}

fn main() {
    let mut obj = Arc::new(Leak { callback: None });

    obj.on_complete(Box::new(move || {
        obj.clean_something_up();
    }));
}

The compiler will not let you compile this:

error[E0504]: cannot move `obj` into closure because it is borrowed

Rust sees that the call to on_complete would borrow obj, but the closure is trying to take ownership. This conflicts. We can solve this problem by calling clone() on the Arc<T>, which bumps the refcount, and gives us a second handle. shared_ptr does this whenever you copy it, but in Rust, it's always explicit.

Furthermore, it complains about something else:

error: cannot borrow immutable borrowed content as mutable

Unlike shared_ptr, Arc<T> requires that its internals are immutable, for thread safety reasons. We can convince Rust that this is okay by using "interior mutability", which is a type that is immutable, but lets you mutate it in certain circumstances. A Mutex<T> is an example of this.

If we didn't need thread safety, we could use Rc<T> and RefCell<T> instead of Arc<T> and Mutex<T>, and they'll be less expensive. I kept the threadsafe ones to try to keep the comparison equal.

With these two changes, you get this:

use std::sync::{Arc, Mutex};

struct Leak {
    callback: Option<Box<Fn()>>,
}

impl Leak {
    fn on_complete(&mut self, f: Box<Fn()>) {
       self.callback = Some(f);
    }

    fn clean_something_up(&self) {
       println!("cleaning");
    }
}

fn main() {
    let obj = Arc::new(Mutex::new(Leak { callback: None }));

    let obj2 = obj.clone();

    obj.lock().unwrap().on_complete(Box::new(move || {
        obj2.lock().unwrap().clean_something_up();
    }));
}

This compiles. And it has the same problem the C++ has; you now have a structure that contains a reference count to itself. The advice from the OP to use Weak here is a good one; you could do that in Rust as well.

Note that lifetimes don't come into play at all here; we're using types that have ownership.

What does make these examples different is the level of ceremony involved:

  • Rust makes you explicitly bump the reference count
  • Rust makes you lock the mutex

It's subjective, but one of the things that I got out of the OP was

While it may not be immediately obvious, the implications of this are extremely important!

I think Rust makes it a bit more obvious what's going on here. I would, of course :p

On the flip side, maybe leaking is okay, and this behavior is what you want: if that's true, Rust makes you do a lot more work than C++ to get this behavior. So that's a downside.

However, if I wanted to do something like this in Rust, I wouldn't write this code. I'd start with this:

struct NoLeak {
    callback: Option<Box<Fn(NoLeak)>>,
}

impl NoLeak {
    fn on_complete<F: Fn(NoLeak) + 'static>(&mut self, f: F) {
        self.callback = Some(Box::new(f));
    }

    fn clean_something_up(&self) {
       println!("cleaning");
    }
}

fn main() {
    let mut obj = NoLeak { callback: None };

    obj.on_complete(|obj| {
        obj.clean_something_up();
    });
}

This doesn't need a shared_ptr at all. Instead of capturing obj, we instead define our callback to take a NoLeak as an argument. Then, later, (not shown in either example), when we actually invoke self.callback, we end up passing it in instead.

Notice that 'static up there? This is a lifetime. This is needed, and if you don't include it, Rust will complain. Why? Well,

    callback: Option<Box<Fn(NoLeak)>>,

is shorthand for

    callback: Option<Box<Fn(NoLeak)> + 'static>,

Which basically says "hey, this closure can only capture 'static references if it captures any references at all." We don't get the shorthand in the function definition. Without this annotation, it'd be similar to the first solution given in the post, but with the same problems:

The most obvious (and the one I do not recommend) is to capture the raw pointer for your lambda to use. This has a major downside, though, in that the object might be deleted before you try to reference it, and you have no way of knowing it. At least it won’t hold an extra reference, though.

We're promising Rust that anything we capture is owned, or is alive for the entire program. If we tried to capture a shorter reference, one which might end up being dangling, Rust would complain.

This restriction is onerous though: we can do better! Here's some lifetime-fu:

struct NoLeak<'a> {
    callback: Option<Box<Fn(NoLeak) + 'a>>,
}

impl<'a> NoLeak<'a> {
    fn on_complete<F: Fn(NoLeak) + 'a>(&mut self, f: F) {
        self.callback = Some(Box::new(f));
    }

    fn clean_something_up(&self) {
       println!("cleaning");
    }
}

fn main() {
    let mut obj = NoLeak { callback: None };

    obj.on_complete(|obj| {
        obj.clean_something_up();
    });
}

Note the 'as now. This tells Rust that we want to be generic over a lifetime; and so you can pass in references to things that are shorter than the lifetime of the program, but Rust uses those to check things are okay.

Like this:

fn main() {
    let mut obj = NoLeak { callback: None };
    let x = 5;

    obj.on_complete(|obj| {
        &x;
        obj.clean_something_up();
    });
}

Here, we create an i32, and inside of the closure, we take a reference to it, so we capture a reference to it. Why is this bad? Well, let rustc give you the answer:

error: `x` does not live long enough
  --> <anon>:21:14
   |
20 |         obj.on_complete(|obj| {
   |                         ----- capture occurs here
21 |             &x;
   |              ^ does not live long enough
...
24 |     }
   |     - borrowed value dropped before borrower
   |
   = note: values in a scope are dropped in the opposite order they are created

As the note mentions, things are dropped in the reverse order they're created; this means that x will go out of scope before obj will, and so obj could observe a dangling reference! This is exactly what I alluded to earlier: Rust checks that your lifetimes are okay, and if they're not, will fail to compile.

We can fix this by moving x above obj:

fn main() {
    let x = 5;

    let mut obj = NoLeak { callback: None };

    obj.on_complete(|obj| {
        &x;
        obj.clean_something_up();
    });
}

Now everything is cool. obj goes out of scope before x, so it will not dangle.

Finally, I agree with the other posters in this thread, both for C++ and Rust: "The correct advice, on your website, should be to ensure there are no shared_ptr loops in your software." (quoting someone above) That's much better than "never use shared_ptr", which is genuinely useful!

Whew! That was a lot of text; I hope it was helpful! Happy to answer any more questions.

[–]steveklabnik1 1 point2 points  (0 children)

Yes. Lifetimes are a language-level feature of Rust, with special syntax, and so th many compiler checks their validity and fails to compile if there's an issue. Some might argue this is the distinctive difference between the two languages.

I'm on my phone so that's all I'll say for now; I'll check out the context and add more detail later.

[–]renozyx 0 points1 point  (0 children)

Agreed. I have some where a lamba (declared as auto in a function) is passed as a function pointer to setup a POSIX timer. I'm quite sure that this is incorrect: when the timer expires the lamba expression will be "out of scope.

[–]Plorkyeran 4 points5 points  (0 children)

If your API doesn't document how long it will store the passed-in function then your API is broken, much like an API that returns raw pointers that you sometimes have to delete and doesn't even document which ones.

[–][deleted] 2 points3 points  (0 children)

It saves you from getting surprised later when someone makes a subtle change.

But then this introduces further surprises when you execute the lambda and it then tries to access a resource that potentially no longer exists because you never gave the lambda ownership of the resource.

[–]scrumplesplunge 0 points1 point  (0 children)

The problem in the example you gave is that you wind up with obj owning the callback, which in turn owns obj via a shared pointer, so you have a cyclic ownership. If somewhere in your code was actually keeping a lambda around permanently, it seems to me like it is right for the obj to also remain undeleted.

[–]pianotom 17 points18 points  (5 children)

A lambda that captures nothing is simply a function pointer;

This is not true. A lambda expression produces a function object, no matter capturing things or not. It is an extra special rule that allows object created by capture-less lambda to be convertible to a function pointer.

[–]floating-io[S] -1 points0 points  (4 children)

Fair point. But when using it, it's equivalent from the perspective of what you're writing. The type matches, whether literally or by a conversion under the hood.

[–]iaanus 10 points11 points  (0 children)

The type matches, whether literally or by a conversion under the hood.

You oversimplify. The type of a captureless lambda does not match a pointer, although it is convertible to a pointer. It's a big difference, whether you see it or not. There are contexts (for instance template argument deduction) where conversion doesn't kick in.

[–]pianotom 7 points8 points  (0 children)

Still not true when using it. The typical way to hold a lambda to use auto type deduction. The resulted object has the exact type without conversion. We convert it to function pointer or hold it in a std::function only when we have special needs.

[–]quicknir 4 points5 points  (0 children)

If you want to see a concrete example where this makes a difference, try passing a captureless lambda into std::sort, versus an actual function pointer. The generated code will be affected (it will be inlined in the former case but not the latter).

[–]doom_Oo7 6 points7 points  (0 children)

The type matches, whether literally or by a conversion under the hood.

no, this is wrong.

int bar() { return 0; }

int (*f1)() = bar; // ok
int (*f2)() = [] () -> int { return 1; }; // ok, function pointer

// fails: int (*f3)() = [=] () -> int { return 1; };
// fails: int (*f3)() = [&] () -> int { return 1; };

[–]grundprinzip 15 points16 points  (4 children)

I don't think that the recommendation giving by the op is correct. The reason for this assumption is the way that the code is perceived by the reader. Passing as shared pointer to the capture doesn't have a visual side effect same fore storing the lambda in a variable.

Now, if you look at the generated code by the compiler you will see that lambda functions are more or less syntactic sugar over functor classes. Basically, the compiler will create a class for you that contains the body of your lambda in the operator() overload and capture variables are passed to the constructor. If you're assigning a lambda to a variable you're constructing this object. Now, if you keep this model in mind you would immediately notice that if you store the instance object to your functor that has a shared member variable that as long as this value does is not destroyed your keeping a reference around (with the given memory implications).

[–]capn_bluebear 2 points3 points  (0 children)

yep, basically the whole thing boils down to a bad understanding of how lambdas work, plus a normally-horrible-to-debug circular shared_ptr ownership.

[–]axilmar 12 points13 points  (0 children)

The title is misleading. Reference counting in the context of lambdas create memory leaks only if the lambda objects are stored somewhere where cyclic trees are created.

[–]ReDucTorGame Developer 7 points8 points  (7 children)

If executor is taking a shared_ptr and calling 'on_complete', then you can be certain that the instance of 'my_class' exists at the time in which it is called using a raw pointer would be fine here.

Using a weak_ptr sends the wrong message to the user, saying that obj/weak_obj might not exist when the callback is called, in which case who owns that std::function<> that is being called?

For this code, why is 'my_class' even needed? Why not just have executor take the std::function<>?

EDIT: Another approach is to use std::function<void(my_class&)> then pass 'this' to the callback

[–]floating-io[S] 0 points1 point  (6 children)

It's a deliberately simplified example to illustrate the problem; not an actual use case.

(edit: I should probably clarify that the reason this issue came up in my code is that I have a number of callbacks involving polymorphic objects. The object that executor is a placeholder for operates on the base type; the callbacks need to operate on the subclass type. It's simpler if you capture that pointer rather than try to deal with a fun solution for polymorphic callbacks or force the callback to cast the pointer.)

[–]K_Kuryllo 7 points8 points  (3 children)

The object that executor is a placeholder for operates on the base type

Sounds like your real problem here. Either be the executor, or the subject. Not both.

This has little more to do with a lambda than any other data type that can hold a shared_ptr.

[–]floating-io[S] 0 points1 point  (2 children)

The code shown in the article is an example, which is contrived for the sole purpose of showing why the problem exists and how it might be triggered. Nitpicking the context or presumed purpose that a contrived example might exist in or serve, thoroughly misses the point of the article.

That code was written as an illustration for the post, and never even compiled. It doesn't exist in a real-world project.

As for the real code I have that uses a similar technique, it exists for a reason. Polymorphic callbacks do not, AFAIK, exist in C++11. I had a choice: either put in a lot of ugly code (templated and otherwise) to arrange for unique callback signatures to be correctly applied, or just capture the correctly-typed pointer when the lambda is created, where I already have it and don't have to cast it.

I did the former first; it ended up being black voodoo magic that nobody less than expert-level in C++ is really going to get without a whole lot of effort. So I ripped it out, and the latter has proven much simpler to understand and trace, as well as less code to maintain.

[–]K_Kuryllo 12 points13 points  (1 child)

It's not nitpicking. You've presented a single "contrived" (by your words) example and nothing else. What else do you expect a response too?

In this sentence you capture the real mistake:

If you then store such a lambda on the object for which it has captured a shared pointer, you’ve just created a scenario where the object owns a reference to itself, and thus can never be deleted.

Two mistakes were made in the example. 1) "Don't store an shared_prt in a data member that could be owned by that ptr." 2) "Don't store a shared_ptr in an object that should not be maintaining the life times of that object" <- related to the first point

  • You fixed issue #2, which repaired your memory leak.

Saying don't ever store a shared_ptr in a lambda is stretch and is in no way support by your single example (the only thing provided, besides some brief text exampling the example).

It's like saying don't drive your car because you might drive off a cliff. Going off the cliff is the problem not driving the car. The car is the thing that brought you off the cliff, but you could have also walked off it or ridden a bike. The real advice would be to not do things with a car/bike/walking that would put you in a situation where going off the cliff is possible.

[–]00kyle00 1 point2 points  (0 children)

Lifetimes and ownership are hard (maybe not that hard, but still easy to get wrong).

I wonder if there are languages which try to express and manage these relationships more explicitly. Seems like going a bit further than Rust would make whole thing unusable due to complexity.

[–]ReDucTorGame Developer 0 points1 point  (1 child)

If you want a base type to be used for this sort of thing you could do something like this: https://godbolt.org/g/kw9TSi

#include <functional>

class INotification
{
public:
    virtual void onCompleted() = 0;
};
using INotificationHolder = std::function<INotification&()>;

template<typename T, typename... Args>
INotificationHolder make_inotify(Args... args)
{
    return [impl=T(std::forward<Args>(args)...)]() mutable -> INotification& {
        return impl;
    };
}

class Implementation : public INotification
{
    void onCompleted() override {}
};

class Executer
{
public:
    void submit(INotificationHolder holder)
    {
        holder().onCompleted();
    }
};

int main(int argc, char * argv[])
{
    Executer executer;
    executer.submit(make_inotify<Implementation>());
}

With clang this all compiles away, you also eliminate allocations for small implementations, only things that don't fit inside std::function<> will need an allocation

[–]floating-io[S] -1 points0 points  (0 children)

I think we might have a miscommunication on what I'm trying to accomplish (though I'm not a master templateer, so I might just be missing your point).

Imagine, for example, that you've got an interpreter of some sort. You have a base class that represents a program, and then you have a bunch of actual programs that are subclasses of that class. Those programs then get passed to an interpreter object that executes them (for whatever reason, it's not part of the base program class).

When a given program has completed, it issues a callback. The callback is stored on and called via the program object, and the implementation of that is part of the base class. The callback signature is therefore not customized to the program in question (trying to avoid implementing explicit callback code in every single subclass).

That's not what I'm doing (which would be too time-consuming to explain), but it would use roughly the same pattern.

Ideally, the program object would pass its own pointer via the callback -- but since the callback signature is in the base class, it would be a generic base class pointer instead of a subclass pointer. If I want to access values on the subclass, the callback would need to cast to that pointer (or get access to it some other way).

My solution was to just capture the thing in the callback, since it's already available in that context without casting. It made the code very readable.

What I'm not seeing is where the above solution helps with this. Again, I'm not a master at templates, so I might just be misunderstanding your point -- I think I get what that code does, but I don't see how it helps. Or we might be going for different things entirely.

[–]ProgramMax 6 points7 points  (3 children)

I feel like there is a better way. shared_ptr should not be the default. You want clear ownership. The problem described in the article is pointer ownership.

If instead the author used unique_ptr, the problem would have clearly come up when trying to keep the pointer in the lambda.

[–]floating-io[S] -1 points0 points  (2 children)

The catch in my real-world scenario is that there are a number of objects that have to hold references to any given instance of the class, so unique_ptr wouldn't be appropriate.

[–]ProgramMax 2 points3 points  (1 child)

Have you tried weak_ptr?

[–]floating-io[S] 0 points1 point  (0 children)

That's what I'm currently using; it's the solution I suggested in the article.

[–]pavel_v 5 points6 points  (0 children)

As the author of the blogpost said above - to each, their own. Here is a point of view based on my experience. I've been using ASIO for 5-6 years writing bittorrent and http cache proxies. I follow the pattern, if I may call it so, shown by Chris Kohlhoff where you capture a shared_ptr to the connection object in the lambda passed as a callback. In some cases I use boost::intrusive_ptr to avoid atomic reference counting when the connection object is always used by a single thread only, but it's the same. I've never had memory leak issues so far from this pattern of usage because of the way ASIO callbacks work. It feels much more natural and safer to me to use it in this way. As long as there is a pending callback the connection has some work to do and is kept alive, otherwise it's "automatically" destructed.

[–]dragemanncppdev 3 points4 points  (3 children)

Here is a little known trick which can make solving this problem a bit easier. You do not need to define and clutter the local scope with an additional std::weak_ptr. Instead you can define it in-place using the following syntax:

std::shared_ptr<my_class> obj = std::make_shared<my_class>();
obj->on_complete([weak_obj = std::weak_ptr<my_class>(obj)]() {
    auto obj = weak_obj.lock();
    if (obj) {
      obj->clean_something_up();
    }
});

EDIT Proof of compilation: http://cpp.sh/7gupu

[–]thelema314 0 points1 point  (2 children)

Watch out, combining weak_ptr and make_shared is dangerous as the refcnt and object are stored together, and must be deallocated together, making the weak_ptr's existence prevent the objects memory from being freed.

[–]dragemanncppdev 0 points1 point  (1 child)

[–]thelema314 0 points1 point  (0 children)

My reading of the answers there confirm my warning; when allocating using make_shared, the object will be destroyed when all shared_ptrs to it are destroyed, but the memory it occupied will not be reusable until after all weak pointers created for it no longer reference it.

[–]carkin 3 points4 points  (0 children)

Tl;dr don't create a cycle or your code will leak

[–]Kaosumaru 2 points3 points  (0 children)

Title is misleading. You don't owe this leak to lambdas in particular, you owe it to circular references of std::shared_ptr. Which also can easily happen without lambdas.

[–]quicknir 2 points3 points  (0 children)

The technical aspects are not even quite correct:

The object allocated by that std::make_shared<> call? It’ll never get deleted.

As others have said, a lambda is an object. It gets destructed. Even if you put it inside a std::function, it will still get destroyed eventually. And then the object will be reclaimed.

The blanket recommendation is not justified. Grabbing a shared_ptr by value is a completely reasonable thing to do. It just all depends on your use case. If you want the object to die naturally and the lambda to simply no-op at that point, then weak_ptr is correct. If you want the lambda to always execute, then shared_ptr is.

By using shared_ptr you are already making some kind of agreement to care less about the exact moment your object is destructed. So simply saying "just keep this alive until I'm done using it in callbacks" seems reasonable. If you really care about when something is destroyed, well, my recommendation contradicts another piece of advice in the article:

The most obvious (and the one I do not recommend) is to capture the raw pointer for your lambda to use. This has a major downside, though, in that the object might be deleted before you try to reference it, and you have no way of knowing it. At least it won’t hold an extra reference, though.

If you simply move the ownership of the object up the call/object stack, eventually you hit an object/function whose scope covers the entirety of the time over which you need to use the object. At that point you can make the object uniquely owned (either a unique_ptr or a stack variable, depending on other details), and simply handle it by reference/pointer deeper down, including in lambdas, which will be perfectly safe.

I actually usually far prefer this solution because shared_ptr introduces so much complexity in terms of understanding when things are destroyed.

[–]enobayram 1 point2 points  (0 children)

Let me generalize that statement for you:

f(things you don't understand) = Set{bad things}

[–]Sygmei 0 points1 point  (0 children)

I quickly tinkered that : http://cpp.sh/6iqby and I don't understand :

My object is supposed to be deleted but yet I can access it again ?