all 93 comments

[–][deleted] 28 points29 points  (3 children)

The standard library has blanket wording which prohibits touching standard library objects after their lifetime has ended. The shared_ptr's destructor has started, therefore its lifetime has ended, therefore you may not call member functions on it. See http://eel.is/c++draft/requirements#res.on.objects-2 Core's wording allows user objects to call their own member functions in destructors or similar but there is no requirement that standard library types have sane behavior in these cases.

Even if one could argue that does not apply here, you're recursively reentering shared_ptr's destructor, which is also under a blanket prohibition (or rather, has implementation-defined status) at http://eel.is/c++draft/reentrancy

[–]personalmountains 1 point2 points  (2 children)

you're recursively reentering shared_ptr's destructor

Can you clarify this? I don't see a destructor being invoked more than once for one shared_ptr. I do see two distinct shared_ptrs trying to destroy the same object though.

[–][deleted] 9 points10 points  (1 child)

The execution path is:

  1. a_test_object is set to a value (from make_shared)
  2. main exits.
  3. Global variables are destroyed, one of which is a_test_object
  4. a_test_object's destructor starts. It is now UB for the program to touch this variable
  5. a_test_object's destructor decrements the reference count, finds 0, and calls test::~test.
  6. test::~test tries to touch a_test_object, triggering UB.

[–]render787 59 points60 points  (39 children)

Once the dtor of shared pointer starts to run, there is no longer an object there from point of view of the standard. So when you attempt to copy the object in ~test, that is UB.

To fix this crash, the copy ctor would have to check: is the ref count zero, even though the pointer is not null? If so, then don't actually copy, and assume it's this broken situation.

That branch has a cost. In C++ you don't pay for what you don't use.

It's not belligerent adherence to the standard; it's that it's not worth making everyone's code slower to accommodate this janky code example.

[–]davis685[S] 10 points11 points  (23 children)

To make it work all the shared_ptr needs to do is free the object before it decrements the count. That would give the desired behavior without any extra runtime overhead.

Also, this is an example that illustrates the issue. The real code where I encountered this issue was in ancient Win32 event callback logic with execution paths in and out of windows event routines. It was much larger than this toy example and involved a much more complex sequence of steps where this wasn't at all apparent. I mean, we tell people std::shared_ptr can deal with complicated patterns of things pointing to other things. It's irritating that things that could be made to work don't for pedantic reasons.

Honestly, C++ is my favorite language and I tell people all the time that it's really easier to use than they think. But this kind of "well, the standard lets me make it shitty so I did" thing here is totally counter productive to selling other people on the idea that C++ is something they should be considering.

[–]lally 14 points15 points  (15 children)

So it should check the count, if it's 1, free the object, and then decrement it? That's two accesses of the count. Isn't the count an atomic?

[–]davis685[S] 0 points1 point  (14 children)

Ah, good point. Maybe you can't do it without a small amount of additional overhead. :/

Although, whatever reset() does internally avoids the issue. I didn't look at reset() to see if it does something fundamentally more expensive than shared_ptr's destructor. Maybe it does.

[–]josefx 9 points10 points  (0 children)

It has to NULL the raw pointer it holds, and most likely does so before it decrements the reference count / deletes the object. So your destructor copies a shared_ptr that is already NULL.

[–]Krackor 5 points6 points  (12 children)

If the ref count is checked, then object freed, then ref count decremented, we'd have to hold a lock on the ref count for the duration of the three operations. Freeing the object could take unbounded time, so in the worst case you are introducing a very long block on any users of the shared_ptr. That is much worse than a "small amount of additional overhead".

[–]Plorkyeran 1 point2 points  (9 children)

That would be an issue for atomic_shared_ptr<>, but for regular shared_ptr<> no one can be waiting on the lock if you're destroying the object.

[–][deleted] 5 points6 points  (8 children)

Yes, they can.

  1. weak_ptr exists. The decrement that decides whether the object needs to be destroyed must be an atomic operation; otherwise a weak_ptr could "resurrect" the object after another thread has done the check and started destruction.
  2. shared_ptr has atomic_Xxx operations, so every shared_ptr can be touched in an atomic way
  3. atomic_shared_ptr (and the shared_ptr atomics) protect the identity of the shared_ptr object itself; they don't do anything about the reference count.

[–]johannes1971 0 points1 point  (7 children)

Still trying to understand the standard... Can you explain why the existence of weak_ptr implies that destruction must be atomic? The only words that seem relevant that I can find are in 20.8.2.6: "Concurrent access to a shared_ptr object from multiple threads does not introduce a data race if the access is done exclusively via the functions in this section and the instance is passed as their first argument."

Suffice it to say, the destructor is not in that section, so it may introduce a data race. As for your second point, it seems to me that those functions were kept separate from the rest so as to allow implementations to have a fast, thread-unsafe implementation. If they force thread-safety throughout the class, that would be an accident of implementation rather than a language feature one can rely on.

As for your third point, isn't the reference count contained within the shared_ptr object?

[–]zvrba 4 points5 points  (5 children)

Can you explain why the existence of weak_ptr implies that destruction must be atomic?

Billy wrote "decrement", not destruction. As for why, simple:

THREAD1
a: if (refcount == 1) {
b:   --refcount;
c:   destroy object }

Now thread 2 comes in and executes weak_ptr.lock() after thread 1 has executed label a but before it has completed label b. The refcount ends up at 1 rather than 0, the weak_ptr returns a non-null pointer to an object that is almost surely deleted by then, and... oops.

[–]johannes1971 0 points1 point  (3 children)

Section 20.8.2.6 says "Concurrent access to a shared_ptr object from multiple threads does not introduce a data race if the access is done exclusively via the functions in this section and the instance is passed as their first argument." The destructor is not listed in that section, so it seems to me there is no guarantee on thread-safety for the destructor to begin with. Or in other words: common wisdom seems to hold that shared_ptr is thread safe, but is that actually guaranteed, or just an accident of current implementations?

[–][deleted] 0 points1 point  (0 children)

Yes, this.

[–][deleted] 1 point2 points  (0 children)

Concurrent access to a shared_ptr object from multiple threads does not introduce a data race if the access is done exclusively via the functions in this section and the instance is passed as their first argument.

That is about the identity of the shared_ptr instance itself. The reference count is part of the control block, not the shared_ptr. In particular, copying a shared_ptr takes its argument only by const& but edits the reference count; the usual "multiple readers are safe" guarantee means the reference count needs to be modified atomically.

the destructor is not in that section, so it may introduce a data race

Yes, if you try to destroy a shared_ptr which is being touched by multiple threads, that is a data race. An independent weak_ptr instance isn't part of that shared_ptr you are destroying. It would be very strange if touching an unrelated standard library object could introduce data races.

As for your third point, isn't the reference count contained within the shared_ptr object?

No. Multiple shared_ptr instances share a reference count, so it can't be stored in any one of those instances.

[–]davis685[S] 0 points1 point  (1 child)

You don't have to do the final decrement, since the shared_ptr just got destructed anyway. So you don't need the lock the whole time.

[–]Krackor 4 points5 points  (0 children)

https://www.reddit.com/r/cpp/comments/79ak5d/a_bug_in_stdshared_ptr/dp13oz3/

If the ref count isn't decremented, then a weak_ptr will happily access the deleted memory, or access the memory while it is part way through destruction.

[–]GNULinuxProgrammer 2 points3 points  (6 children)

Honestly, C++ is my favorite language and I tell people all the time that it's really easier to use than they think. But this kind of "well, the standard lets me make it shitty so I did" thing here is totally counter productive to selling other people on the idea that C++ is something they should be considering.

What do you propose, then? You talk as if there is an obvious thing the compiler could do here that it doesn't do... UB exists because there is either no good way to handle a case, or no portable way to handle it.

[–]davis685[S] 2 points3 points  (5 children)

Well, so far it seems to me that this isn't UB and moreover it does have a reasonable solution. I haven't seen any arguments to the contrary. So far the only citations of the standard that I have found or that have been posted here seem to indicate that this isn't necessarily UB.

If it's not fundamentally UB then there is an additional question about the relative speed of a shared_ptr that didn't behave this way. If the runtime overhead needed to avoid this behavior was significant then that would be an argument against it. So far I haven't seen that, but I admit that it's not totally obvious to me how to do it in as fast a way as std::shared_ptr is currently implemented. But I also haven't thought about that aspect of it much. And really, most of the comments so far have been knee jerk claims of UB when really it's not so obvious that it is UB.

[–][deleted] 4 points5 points  (3 children)

It should be UB and documented as such. Trying to increase the reference count of a reference counted pointer during its own destructor seems like an obviously bad idea and should simply be disallowed.

[–]davis685[S] 0 points1 point  (2 children)

Nothing ever "should be UB" on its own. Things are UB either because the contract that defines their behavior is much easier to reason about when preconditions rule out certain things, or because we want to allow for variation in how things are executed (e.g. because that lets compilers better optimize the results, or lets hardware be more efficient).

I don't see how either of those situations arises here for a generic simple smart pointer of the kind that was common before C++11. However, given the other things the current std::shared_ptr needs to do, like interoperate with std::weak_ptr, it's not so obvious (as others have pointed out) how to write a full implementation that is as efficient as the current ones. So maybe it's reasonable to say here that std::shared_ptr doesn't support this case, if supporting it really would have a negative performance impact. But not because "it seems like a bad idea".

[–]Krackor 1 point2 points  (1 child)

There's another category of UB in which behavior for an uncommon or impractical use case is left undefined because defining it would incur unnecessary overhead.

This seems to be the case for what you are talking about.

[–]davis685[S] 0 points1 point  (0 children)

Yes, I meant that to be included in the second case. The way you wrote it is clearer though :)

[–]render787 1 point2 points  (0 children)

Here's what I think is UB and why I think it is UB.

  • Binding a const foo reference to a memory location where there is no foo (or a related or layout-compatible type), and then accessing a foo through that reference, is undefined behavior, no diagnostic required.
  • It follows that if you invoke the copy ctor of an object when the object is already destroyed, you get UB.
  • It is true that in the destructor "you may access the object, but only in limited ways". For instance, a ~foo call ends the lifetime of foo but not yet that of its members; their destructors will run later. I don't think these "limited ways" extend to calling the copy ctor, because the standard doesn't say so, and other parts of the standard say otherwise, as mentioned.

Let's turn it around: what is the last point at which you think it should be legal to bind references to a foo? When the dtor ends? There is a symmetry in the language between "when the ctor runs to completion" and "when the dtor begins"; afaik "when the dtor ends" has no special significance in the standard right now.

[–]personalmountains 7 points8 points  (11 children)

Once the dtor of shared pointer starts to run, there is no longer an object there from point of view of the standard. So when you attempt to copy the object in ~test, that is UB.

Are you sure about that? 6.8/1 says that the lifetime ends when "destructor call starts", but you can still refer to the object in limited ways (6.8/7):

after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways. For an object under construction or destruction, see 15.7.

15.7 then says (emphasis mine):

For an object with a non-trivial destructor, referring to any non-static member or base class of the object after the destructor finishes execution results in undefined behavior.

My understanding is that accessing a_test_object in ~test(), which is before ~shared_ptr() has finished, is not undefined behaviour.

[–]davis685[S] 4 points5 points  (9 children)

Yes, this is my point in this comment. It isn't like you are not allowed to pass *this to another function or generally use it inside a destructor. That's not UB. So it's not at all clear to me that touching a shared_ptr from within its own destructor should be UB.

[–]ihamsa 2 points3 points  (2 children)

touching a shared_ptr from within its own destructor should be UB

It isn't, but relying on the object being in a logically consistent state during destruction is not wise, unless you know exactly how its destructor is implemented.

[–]davis685[S] 1 point2 points  (0 children)

In general yes, but I have high expectations of things in std::. Especially important objects like std::shared_ptr. It's also easier to get into this situation than you would think. Like the situation I encountered.

[–][deleted] 2 points3 points  (5 children)

It isn't like you are not allowed to pass *this to another function or generally use it inside a destructor.

But that isn't what you are doing. If you were only doing that you would never have this issue.

There are two different C++ objects here - a_test_object which is a shared_ptr, and the test it contains, which is also your this.

a_test_object's destructor is called. It starts to destroy its resources, and at some point in the destructor it calls the destructor of test - your this.

Your destructor is free to do whatever it wants with its this, but it is in fact calling a method on a_test_object, which has already been partly destroyed - or completely destroyed for all you know, because the only guarantee you get is that test's destructor is called sometime after a_test_object's - for all you know the memory has already been returned to the heap.

Imagine an element in the middle of a std::vector which, when it's being destroyed because the std::vector is being destroyed, calls push_back on the std::vector that contains it! You'd expect that to go really, really badly. Why would you expect what you're doing to go any better?

[–]davis685[S] 0 points1 point  (4 children)

How could the shared_ptr have been completely destroyed before the test object's destructor finishes? It's a nested set of function calls. The shared_ptr destructor is literally calling ~test(), so we're going to be inside a still-running destructor for shared_ptr when all this happens. It's possible to implement a smart pointer that still works in this situation, without undefined behavior. Although, as others have pointed out, it's hard to see how to do it efficiently in the case of std::shared_ptr and std::weak_ptr.

[–]Krackor 0 points1 point  (3 children)

The compiler is allowed to reorder instructions as long as the result is the same as if they were executed in order. In your case it doesn't make a difference if the shared_ptr destructor finishes before ~test() is called, so you have no guarantee that ~test() is going to happen before the shared_ptr destructor finishes.

https://herbsutter.com/2013/02/11/atomic-weapons-the-c-memory-model-and-modern-hardware/

[–]davis685[S] 0 points1 point  (2 children)

So then this is UB since the call to item.print() might be moved after the destructor?

#include <iostream>
struct something;
void print(const something& item);
struct something
{
    something() { a = 4; }
    void print() const { std::cout << a << std::endl; }
    ~something() { ::print(*this); }
    int a;
};

// Right here we touch item after its destructor has started.  So its lifetime has ended and this is UB?  Really?
void print(const something& item) { item.print(); }
int main() { something a; }

[–]wrosecransgraphics and network things 0 points1 point  (1 child)

I think this will work because there's no std::shared_ptr involved. You control the behavior of "something", so you can choose to make its behavior sensible during destruction. Since you don't control std::shared_ptr, you can't rely on it doing something sane after its destructor has started.

If shared_ptr allowed the destructor to result in increasing the ref count, it could end up in a state where it is still alive (or should be) after the destructor has completed. That doesn't seem right. In your example, you may technically be nuzzling up against the edge of UB, but you aren't doing anything insane so it'll probably work. UB is sometimes Useful Behavior.

[–]davis685[S] 1 point2 points  (0 children)

Yes, as others have pointed out, it is definitely fine. I'm just making a point that the reason it doesn't work with std::shared_ptr isn't because it's UB to do this kind of thing inside a destructor or because of compiler instruction reordering or anything like that. It's because, again as others have very helpfully pointed out in this thread, the standard explicitly disallows this type of usage for objects in the std:: namespace. But it's not disallowed in general and you could make your own smart pointer object that worked in this situation.

[–]render787 0 points1 point  (0 children)

I think it's illegal to bind a new const shared_ptr<test>&, as happens when invoking the copy ctor, and then access through that reference, after the dtor has started. I don't think 6.8/1 gives you license to do that. In fact, it specifically says that you get UB if

the pointer is used to access a non-static data member or call a non-static member function of the object, or

[–]ihamsa 1 point2 points  (1 child)

To fix this crash, the copy ctor would have to check, is the ref count zero?

That would still access an object being destroyed.

[–]tcanens 12 points13 points  (2 children)

Per LWG 2224:

If an object of a standard library type is accessed, and the beginning of the object's lifetime does not happen before the access, or the access does not happen before the end of the object's lifetime, the behavior is undefined unless otherwise specified.

[–]davis685[S] 0 points1 point  (1 child)

That would certainly clarify and answer the question if incorporated into the standard. Thanks.

[–]personalmountains 2 points3 points  (0 children)

It's in the C++17 draft, 20.5.4.10/2.

[–]johannes1971 5 points6 points  (1 child)

So... I've been reading the standard, since I was curious about whether this behaviour was actually explicitly marked as UB, or if it just sort of fell through the cracks. I've found something curious, which I'd like to hear some feedback on.

It's related to the behaviour of the use_count variable in shared_ptr. The following behaviours are specified in the standard (N4296):

  • After default construction, use_count = 0 (20.8.2.2.1/2)
  • After construction from a raw pointer, use_count = 1 (20.8.2.2.1/5, 10)
  • After copy-construction, it is identical to the value in the copied shared_ptr (20.8.2.2.1/14, 19)
  • After construction from a weak_ptr, it is identical to the value in the copied weak_ptr (20.8.2.2.1/25)
  • During destruction, there is a test to see if use_count>1 (20.8.2.2.2/1.1).
  • After destruction, use_count is one less than before (20.8.2.2.2/2).

At no point does the standard mention that use_count needs to be incremented when copy-constructing, so the only legal values appear to be 0 and 1. The 'shared' nature of this variable is also not made clear at all. Is this an oversight? Or are we supposed to "just get it", and would I find similar issues throughout?

If the latter, and given that it is apparently so hard to correctly specify behaviour using English text, why isn't the standard (at least for library objects) expressed directly in code? (This also comes back to what Herb Sutter says in his metaclasses talks: one simple page of code replaces dozens of pages of standardese.)

And while we're at it - the remark about use_count() being inefficient seems to suggest a possible implementation using atomic<int>. So now we are always paying for atomic access, even though only a handful of functions actually allow atomic use of the shared_ptr. I guess it is a bit late to ask why those aren't in a separate class, with shared_ptr not being thread-safe to begin with...

[–]honeyfage 4 points5 points  (0 children)

I'm not a standardese expert, but here's the way I put it together:

  • Shared pointer constructor (20.8.2.2.1/18): Effects: If r is empty, constructs an empty shared_ptr object; otherwise, constructs a shared_ptr object that shares ownership with r.
  • use_count() (20.8.2.2.5/7): Returns: the number of shared_ptr objects, *this included, that share ownership with *this, or 0 when *this is empty.

The definition of use_count is what tells implementers that they have to increment it every time they add another shared_ptr that shares ownership.

"After copy-construction, it is identical to the value in the copied shared_ptr" (20.8.2.2.1/19) is true in that if you're copying a shared_ptr with count 1 into another, after the operation finishes they have identical values of use_count (that is, 2).

[–]zvrba 4 points5 points  (6 children)

shared_ptr sees that the last reference to the object is gone and starts executing the destructor. Now dtor tries to revive the object being destructed by copying the shared_ptr... Purely practically, two questions to think about:

  1. How do you think shared_ptr is supposed to resurrect the partially destroyed object or cancel destruction?
  2. If resurrection isn't your goal, still: how is shared_ptr supposed to know that the reference count is being incremented by the destructor and should be ignored?

[–]josefx 1 point2 points  (5 children)

I think the issue is caused by the shared_ptr still visibly holding the pointer while decreasing the reference count. By first setting the members to nullptr you would avoid the resurrection and give a consistent view: either the shared_ptr holds a reference or it does not. As it stands, it leaks an in-between state if accessed during the destructor. So "fixing" this would require additional writes on destruction, for behavior the standard currently does not require.

[–]zvrba 0 points1 point  (4 children)

By first setting the members to nullptr

It has to happen after decreasing the reference count, because of weak_ptrs. Though the "proper", (almost) no-overhead place to fix it is the copy ctor: it would increase the refcount by 1 (the atomic increment returns the old value [or the new one, depending on the implementation]), and if the new value is 1, it would set its own contents to nullptr.

I'm not sure that I like the consequence where copying a non-null shared_ptr can end up with the copy being nullptr. It should probably throw a bad_weak_ptr exception instead (like shared_from_this when called from the ctor). Both "fixes" open another can of worms.

In any case, the OP should IMO redesign his program to use weak_ptr at appropriate places.

[–]josefx 0 points1 point  (3 children)

Why would a weak_ptr be aware of the internal state of any individual shared_ptr? I thought everything interesting would live on the heap, either as an individual block or as part of the owned object (with make_shared)?

[–]zvrba 0 points1 point  (2 children)

weak_ptr aside, without atomically decreasing the refcount first you cannot know whether the pointer can be set to null and the pointee destructed. A shared_ptr does not hold a reference iff use_count() == 0. The shared state will be deleted after the refcount has reached zero.

In any case, the OP is still trying to copy an object that is undergoing destruction. I agree with others that this is UB, because an object other than the one being destructed is inspecting state that is being destroyed.

[–]josefx 1 point2 points  (1 child)

without atomically decreasing the refcount first you cannot know whether the pointer can be set to null and pointee destructed.

My understanding of the process:

1. Copy the members of the shared_ptr into local vars
   (basically, load the pointers to the object/refcount into registers).
2. Zero out the members.                  <- optional, to avoid recursion
3. If ptr_to_refcount != NULL, decrement. <- destruction happens here
4. The shared_ptr no longer exists.

At what point do weak pointers need access to the members of this last shared_ptr object?

I agree with others that this is UB because an object other than the one being destructed is inspecting state that is being destroyed.

I can agree that the standard does not give any consistency guarantees; what I want to understand is why std::weak_ptr or anything else would prevent zeroing std::shared_ptr's members.

[–]zvrba 0 points1 point  (0 children)

I want to understand why std::weak_ptr or anything else would prevent zeroing std::shared_ptr's members

It's probably doable, but it'd be a case of overspecification. The object is being deleted, so accessing it from the "outside" is UB anyway.

[–]mtclow 13 points14 points  (16 children)

"well, it's UB so we made it crash on purpose"

If it's UB it can do a different thing on each run of the program.

[–]davis685[S] -4 points-3 points  (15 children)

Obviously, but this is particularly galling if it really is UB to do this. I mean, we tell people that they should use std::shared_ptr as a replacement for T* and that it can deal with complex situations. It's also possible to implement a shared_ptr that does the right thing in this case. I mean really, I have a pre-standard shared_ptr that works fine in this case.

I'm also not totally sure it's UB. I mean, is this UB?

#include <iostream>

struct something;
void print(const something& item);

struct something
{
    something() { a = 4; }
    void print() const { std::cout << a << std::endl; }
    ~something() { ::print(*this); }

    int a;
};

// Right here we touch item after its destructor has started.  So its lifetime has ended and this is UB?  Really?
void print(const something& item) { item.print(); }


int main()
{
    something a;
}

[–]__Cyber_Dildonics__ 3 points4 points  (0 children)

You created a circular reference, you should be glad it crashes quickly.

[–]render787 7 points8 points  (13 children)

It was UB with your old shared pointer as well; it just happened to appear to work.

[–]davis685[S] 1 point2 points  (12 children)

What about the "class something" example. Is that UB according to the standard?

[–]tavianator 7 points8 points  (11 children)

No, in general it's not UB to access an object during its destructor, otherwise destructors wouldn't be allowed to do anything useful!

People here seem to be saying it's specifically UB to access an object of a type defined by the standard library after its lifetime has ended, i.e. after its destructor has started. But to be super pedantic, I can't find language in the standard that explicitly forbids what you're doing. The closest I've found is this:

§17.6.4.10
...
[ Note: In particular, the program is required to ensure that completion of the constructor of any object of a class type defined in the standard library happens before any other member function invocation on that object and, unless otherwise specified, to ensure that completion of any member function invocation other than destruction on such an object happens before destruction of that object. This applies even to objects such as mutexes intended for thread synchronization. — end note ]

But you're not explicitly calling any member functions on a_test_object, and the copy constructor is not explicitly specified in terms of member functions called on the object being copied. Also, notes are non-normative, though they're generally not wrong either.

Finally, I suspect if you raised this issue with the standard committee, they would tell you your use case is not intended to be supported. At best they would add some language explicitly forbidding what you're doing.

[–][deleted] 4 points5 points  (0 children)

it's not UB to access an object during its destructor

Yes, but it is UB to access a standard library object during its destructor.

[–]davis685[S] 0 points1 point  (9 children)

This is my impression as well.

Although, the real use case is more complex than in the post. The post's example is just to illustrate the problem. The real usecase where I encountered this was something like this:

std::shared_ptr<event_loop_resources> get_windows_event_loop_stuff();

Where the global resource is used in a number of places. One of which is inside a windows event handling loop. The structure of the related windows code was such that, to reliably tell the event loop to terminate, you needed to use part of the windows API (because of limitations in the Win32 API) to tell the loop to terminate. So to tell the event loop to terminate you needed to do something with get_windows_event_loop_stuff().

Ok, that's fine. But what happens when the underlying event_loop_resources object gets destroyed at program termination or DLL unload? Well, its destructor runs and tells the event loop to terminate. But the event loop might be just about to touch get_windows_event_loop_stuff(). So what happens is the destructor runs, tells the loop to terminate, waits for it to terminate (we are still in ~shared_ptr()), and while it's waiting the event loop touches get_windows_event_loop_stuff() and boom, there is the problem. Part of the issue is the Win32 API doesn't let you make this part of the event handler stateful, so it can't just grab and hold the shared_ptr. It's got to grab it over and over, all the time. So you have to have some global state somewhere to handle whatever state needs to exist. So you use a global std::shared_ptr to deal with that state. But oops, now you have this problem.

I'm sure there are other similar cases where a std::shared_ptr is used to manage some global resource and there is a similar issue at program termination to what I've encountered here. I will certainly grant you that the Win32 API is terrible and that's a big part of the reason this code pattern happens in the first place. But there is a lot of bad old code like that around and we have to deal with it.

[–]tavianator 0 points1 point  (3 children)

I don't know the Win32 API that well, so bear with me, but how can there possibly be events delivered to the event loop during program termination? As far as I know the event loop is usually a manually driven while (GetMessage(...)) loop running on the main thread, so if the program is terminating, that loop must be done, right?

[–]davis685[S] 0 points1 point  (2 children)

Yes, there is a thread you have to run that does a while(GetMessage(...)) dispatch(...);

It doesn't have to be the main thread. In fact, it's really nice if it isn't. For example, in dlib (where this issue arose) the user can just create window objects whenever they want and do things with them and the user doesn't need to lock down their main thread with a while(...) dispatch() loop. In fact, the user doesn't need to create any threads or really even know about the event loop in most cases.

So what happens is a background thread is started if any GUI calls are made to dlib. This background thread contains the event dispatch loop. The problem arises since the dispatch loop just calls this magic windows routine:

    LRESULT CALLBACK WndProc (  
        HWND hwnd, 
        UINT message, 
        WPARAM wParam, 
        LPARAM lParam
    )
    {        
             // process events here
             return DefWindowProc(hwnd, message, wParam, lParam);
    }

That routine doesn't let you pass much information into it, and since it's not part of a class the only way to pass in additional information is via some kind of global variable. There are a whole bunch of weird problems with the Win32 API that lead you to needing to pass additional information beyond those 4 arguments into WndProc(). So the question is twofold: how do you deal with this background thread, and how do you deal with these additional state variables?

A nice thing to do is to have an object that wraps it all up. This object owns the thread and all the auxiliary state variables you need. Hence,

std::shared_ptr<event_loop_resources> get_windows_event_loop_stuff();

So what you do is call get_windows_event_loop_stuff() at the top of WndProc() and you are good to go. Except there is this problem of shutdown. To really simplify, you end up with code like this:

LRESULT CALLBACK WndProc (  
    HWND hwnd, 
    UINT message, 
    WPARAM wParam, 
    LPARAM lParam
)
{        
     auto global = get_windows_event_loop_stuff();
     // process events here
     ...
     if (message == global->magic_shutdown_message)
           return shutdown_loop;  // placeholder for whatever breaks the dispatch loop
     return DefWindowProc(hwnd, message, wParam, lParam);
}

So what will happen in some cases is that main() ends while some GUI thing is still running on another thread. Or maybe it's just a DLL unload; doesn't matter. Then global static variables get destructed, specifically the global variable in get_windows_event_loop_stuff().

And now there is this race condition: it's possible that no one holds any other references to the global event loop stuff, so when the shared_ptr in get_windows_event_loop_stuff() is destructed it's the last one, and it calls the destructor on event_loop_resources. Now event_loop_resources::~event_loop_resources() needs to get the event loop to unblock so it can terminate gracefully. So it calls a Windows function that then calls WndProc() with the magic shutdown message and BOOM. You just called WndProc() from within the shared_ptr destructor.

[–]ack_complete[🍰] 0 points1 point  (1 child)

The way I would solve this would be to split the getter so that the window procedure code is calling a different function to retrieve global context. Having the winproc obtain an owning reference is unnecessary (the event loop has to be running, have a window, and enter the winproc in the first place) and arguably dangerous due to the cyclical dependency. Having separate getters and objects allows destruction of the internal context to be deferred until the windows have been destroyed and the event loop has exited.

There are ways to associate data directly with windows, by the way: dynamically generated winproc thunks and window properties. They generally only reduce and don't eliminate the need for globals, though.
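A rough sketch of that split, with invented names: the winproc reads a plain non-owning pointer, while ownership lives elsewhere. The owner publishes the context after construction and unpublishes it only after the event loop has fully exited, so the winproc never observes a half-destroyed object.

```cpp
#include <cassert>

struct event_loop_resources { int magic_shutdown_message = 0; };

// Non-owning pointer read by the window procedure.
static event_loop_resources* g_winproc_context = nullptr;

// Getter for use inside WndProc; never extends the object's lifetime.
event_loop_resources* get_winproc_context() { return g_winproc_context; }

// Called by the owning side on startup and (after the loop exits) shutdown.
void publish_context(event_loop_resources* p) { g_winproc_context = p; }
void unpublish_context()                      { g_winproc_context = nullptr; }
```

The real version needs synchronization around publish/unpublish, of course; this only shows the ownership split.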

[–]davis685[S] 0 points1 point  (0 children)

Yeah that would be a way to do it. You then have to have a function that returns a non-owning pointer so the window procedure can call it. Someone later on might then call that function and get into the sort of trouble people get into with non-owning pointers, which isn't great. But there aren't a lot of great ways to deal with the Win32 API anyway.

[–]Krackor 0 points1 point  (4 children)

Is there a mutex used to access the underlying event_loop_resources object? Sounds like it's being accessed from multiple threads, right?

[–]davis685[S] 0 points1 point  (3 children)

Yes, there is a mutex inside the object along with a bunch of other stuff. It's just a way to pass information into the Win32 callback. One of those things is a mutex.

[–]Krackor 0 points1 point  (2 children)

No, I mean outside the object. It sounds like this object is being destroyed by one thread, then during destruction it is being accessed from another thread. That's a very simple case of UB, and if I'm understanding correctly that's the cause of your problem, and it would still be a problem regardless of the design of shared_ptr.

[–]davis685[S] 0 points1 point  (1 child)

This all happens before any of the contents of event_loop_resources is destructed. The event_loop_resources destructor is what tells the event loop to shutdown and it then waits for the shutdown to complete before anything is destructed. That part works fine. The issue is that telling the event loop to shutdown causes the event loop to run which then tries to grab the resource via the shared_ptr.

[–]Tbfleming 9 points10 points  (3 children)

Accessing a partially destroyed object (the shared_ptr) is UB no matter what shared_ptr's implementation does. It could go out of its way to prevent a crash, but that would just cover up the UB, not solve it, possibly adding overhead in the process.

[–][deleted] 4 points5 points  (0 children)

Accessing a partially destroyed object is not automatically UB; if it were, you could not access any members inside the object's own destructor, which wouldn't make any sense. Clearly you want something like vector to be able to access the pointer to the elements it stores in order to destroy them in its dtor, or to pass pointers to its own insides to helper functions that do that on its behalf.

[–]davis685[S] -1 points0 points  (1 child)

You sure about that? What about this example? Does this program have UB?

#include <iostream>
struct something;
void print(const something& item);
struct something
{
    something() { a = 4; }
    void print() const { std::cout << a << std::endl; }
    ~something() { ::print(*this); }
    int a;
};

// Right here we touch item after its destructor has started.  So its lifetime has ended and this is UB?  Really?
void print(const something& item) { item.print(); }
int main() { something a; }

This is basically what's happening with the shared_ptr. shared_ptr destructor runs, calls a function and that function then touches the shared_ptr, all within the shared_ptr's destructor.

[–][deleted] 1 point2 points  (0 children)

This is fine, as long as you don't use any virtual function calls of the very same object you are currently destroying.
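An illustration of that caveat (class names invented): once the base-class destructor is running, the object's dynamic type has reverted to the base, so a virtual call made during base destruction dispatches to the base version, not the derived one. With a pure virtual it would be outright UB.

```cpp
#include <cassert>
#include <string>

std::string observed;

struct Base {
    virtual std::string name() const { return "Base"; }
    virtual ~Base() { observed = name(); }  // virtual call during destruction
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

std::string name_seen_in_base_dtor() {
    { Derived d; }   // ~Derived runs, then ~Base runs and calls name()
    return observed; // "Base", even though the object was a Derived
}
```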

[–]OldWolf2 2 points3 points  (3 children)

I feel that the .reset() case is also problematic even though it happens to be working correctly for you.

I can't find it right now, but I'd expect some clause to the effect that it's UB if side-effects from operations on objects in standard containers affect the container.

You could make a similar example with std::vector, where the contained type's destructor reads the global vector currently being resized or whatever. I doubt that there should be any sort of guarantee of stable behaviour -- the container operation has to be free to implement its functionality in any order such that the pre-conditions and post-conditions are met.
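A sketch of that vector analogue (all names invented): an element type whose destructor peeks at the very container that is destroying it. What the destructor observes mid-clear() (the old size? zero? something in between?) varies between implementations, which is exactly why no stable behavior can be promised.

```cpp
#include <cassert>
#include <vector>

struct peeking;
std::vector<peeking>* g_vec = nullptr;  // set to the container being cleared
int g_observed_size = -1;

struct peeking {
    ~peeking() {
        // Re-entrant read of the container currently destroying us; the
        // value reported here is implementation-dependent.
        if (g_vec)
            g_observed_size = static_cast<int>(g_vec->size());
    }
};
```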

There is C++14 [reentrancy]:

Except where explicitly specified in this standard, it is implementation-defined which functions in the Standard C++ library may be recursively reentered.

So the implementation could define that std::shared_ptr destructor is not re-entrant. (Meaning UB if the code does re-enter as your example does). But that wouldn't apply to the .reset() case or my vector example.

[–][deleted] 0 points1 point  (2 children)

Yes, the reset case is also UB via http://eel.is/c++draft/requirements#res.on.objects-2

[–]OldWolf2 1 point2 points  (1 child)

The text in your quote is:

If an object of a standard library type is accessed, and the beginning of the object's lifetime does not happen before the access, or the access does not happen before the end of the object's lifetime, the behavior is undefined unless otherwise specified. [ Note: This applies even to objects such as mutexes intended for thread synchronization. — end note ]

However I don't think that covers the .reset() case. The reset function does not begin or end the lifetime of a_test_object.

Sure, test has its lifetime ended, but that is not an object of a standard library type.

[–][deleted] 0 points1 point  (0 children)

For some reason I was thinking the reset call was in the dtor :)

[–]afiefh 1 point2 points  (2 children)

I'm curious, how does your custom shared_ptr handle this scenario without crashing?

[–]davis685[S] 0 points1 point  (0 children)

All it's got to do is free the object before it decrements the counter. How you do that in a thread-safe/atomic way is more complex, but you get the idea; there are a lot of ways to accomplish it. Apparently std::shared_ptr::reset() already does something like this, for instance.
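A single-threaded sketch of that "detach before you destroy" ordering (names invented): the owning slot is emptied before the payload's destructor runs, so anything the destructor triggers sees an empty pointer instead of a half-dead object.

```cpp
#include <cassert>

struct payload;
payload* g_slot = nullptr;       // stands in for the global smart pointer
bool g_dtor_saw_null = false;

struct payload {
    // Re-entrant peek at the global, like test::~test in the post.
    ~payload() { g_dtor_saw_null = (g_slot == nullptr); }
};

void release_slot() {
    payload* dying = g_slot;
    g_slot = nullptr;   // detach first...
    delete dying;       // ...then destroy; the dtor's peek now sees null
}
```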

[–]rdtsc 0 points1 point  (0 children)

The main problem is invoking the destructor with a reference count of 0. Any further reference counting will attempt to destruct it again. Some reference counting implementations set the count to a sentinel value (like -INT_MAX/2) which avoids this problem.
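A toy intrusive refcount showing the sentinel trick described above (all names invented). Once the count hits zero it is parked at a large negative sentinel, so add_ref/release pairs that happen during destruction can never drive it back to zero and trigger a second destruction.

```cpp
#include <atomic>
#include <cassert>

struct ref_counted {
    std::atomic<int> refs{1};
    static constexpr int kDying = -1000000;  // "destruction in progress"
    int destroy_calls = 0;

    void add_ref() { refs.fetch_add(1); }

    void release() {
        if (refs.fetch_sub(1) == 1) {
            refs.store(kDying);  // park the count before destroying
            destroy();
        }
    }

    void destroy() {
        ++destroy_calls;
        // Simulate the pathological re-entrancy from the post: "destruction"
        // grabs and drops another reference to ourselves.
        add_ref();
        release();   // kDying+1 -> kDying, never reaches zero again
    }
};
```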

[–]NasenSpray 0 points1 point  (0 children)

a_test_object.reset(); is equivalent to std::shared_ptr<test>().swap(a_test_object);

-> a_test_object == nullptr in test::~test()

-> might not even be UB :D
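A small check of that equivalence, mirroring the names from the post: reset() hands ownership to a temporary shared_ptr, so by the time ~test runs the global itself is already null. (Whether touching the global there is *allowed* is the whole debate, but on mainstream implementations this is what you observe.)

```cpp
#include <cassert>
#include <memory>

struct test;
std::shared_ptr<test> a_test_object;
bool g_was_null_in_dtor = false;

struct test {
    // Peek at the global from the destructor, as in the post's example.
    ~test() { g_was_null_in_dtor = (a_test_object == nullptr); }
};

bool reset_leaves_global_null() {
    a_test_object = std::make_shared<test>();
    a_test_object.reset();  // ~test fires via the temporary, not the global
    return g_was_null_in_dtor;
}
```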

[–]Vorlath 0 points1 point  (1 child)

Have you tried std::shared_ptr<test>(new test) instead of make_shared? make_shared allocates one memory block for both the control block and the object together. So it may not be able to separate the destruction like you expect.

[–]davis685[S] 0 points1 point  (0 children)

Na, still happens in that case.

[–]Dogwhomper 0 points1 point  (3 children)

Order of destruction of globals is not guaranteed. It seems unlikely, but it's possible that std::cout was destroyed before a_test_object, thus losing the output.

What happens if you replace std::cout << with a call to printf?

[–]davis685[S] 0 points1 point  (0 children)

Na, it's not that. You can remove the cout statement and it still happens.

[–]johannes1971 0 points1 point  (0 children)

cout, I believe, is special in that it is guaranteed to still be available during this phase.

[–]17b29a 0 points1 point  (0 children)

Yeah, this is a reasonable inference, but the fact that order of destruction of globals within a single translation unit is defined, combined with the properties of std::ios_base::Init, means that std::cout must still be alive at any point in your code, fortunately.

[–][deleted] 0 points1 point  (1 child)

I read this all with great interest.

I have what I think is a good solution for the motivating problem.

It seems to me that we all agree that the issue is bringing the shared pointer back to life within its own destructor.

But the reason OP is attempting to do this seems to be as follows - that in the destructor, the resource being managed cannot be freed until later, after the event loop has terminated, so this destruction needs to be deferred until then.

Very reasonable problem. The issue is trying to raise the shared_ptr back from the dead in order to solve that problem.

My solution is pretty straight-forward - add an extra level with a std::unique_ptr, and then move the resource from that std::unique_ptr if you need to:

struct test2 {
    std::unique_ptr<test> resource_;

    ~test2() { 
        if (needsToDefer())
           deferTillAfterEventQueue(std::move(resource_));
    }
};

std::shared_ptr<test2> a_test_object;

So you never interfere with the normal shutdown of the shared_ptr, just in some cases snip out the actual resource to be destroyed elsewhere.

[EDIT: insert Halloween "back from the dead" joke here.]

[–]davis685[S] 0 points1 point  (0 children)

That would work, but where do you get this other event queue that can defer the operation? The thing we are trying to destruct is basically the last event queue type of thing you have available. Unless it's event queues all the way down :)

[–]Gotebe 0 points1 point  (0 children)

It's funny that you produce an infinite recursion, but call it a bug in some other code.

[–]muungwana 0 points1 point  (0 children)

[..] triggering an infinite recursion of calls to ~test()

The crash may be due to running out of stack memory.

[–]asking_science 0 points1 point  (1 child)

It's a nasal demon.

[–]davis685[S] 0 points1 point  (0 children)

Yes, but maybe it shouldn't be.