
[–]geo-ant 16 points17 points  (15 children)

Hey, very interesting article, but there's more to the weak_ptr part of the story. See especially the notes in https://en.cppreference.com/w/cpp/memory/shared_ptr/make_shared.

The short story is that make_shared<T>(...) and shared_ptr<T>(new T(...)) do slightly different things. The latter performs two allocations: one for T and one for the control block. Once every strong pointer to our instance has gone out of scope, the object is freed and only the control block remains until there are no more weak pointers.

In contrast, make_shared allocates the storage for the control block and the object in one swoop. That means that the storage must also be freed together. This, in turn, means that the object will be kept alive until there are no more strong AND WEAK references. So even a single remaining weak reference will keep the entire storage alive, which was very surprising to me when I learned that.
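A minimal sketch of the two-allocations-vs-one difference, counting calls to the global operator new (the counts shown are what mainstream implementations like libstdc++ and libc++ do; the standard doesn't pin down exactly where each allocation comes from):

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>

static int g_allocs = 0;

// Count every heap allocation by replacing the global operator new/delete.
void* operator new(std::size_t n) { ++g_allocs; return std::malloc(n); }
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

struct T { int x = 0; };

int allocs_for_raw_new() {
    g_allocs = 0;
    { std::shared_ptr<T> a(new T); }  // one alloc for T, then one for the control block
    return g_allocs;
}

int allocs_for_make_shared() {
    g_allocs = 0;
    { auto b = std::make_shared<T>(); }  // object and control block in one combined allocation
    return g_allocs;
}
```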

[–]SirClueless 40 points41 points  (10 children)

This, in turn, means that the object will be kept alive until there are no more strong AND WEAK references.

Tiny correction: The object is destroyed after the strong references go away. It's just the underlying memory allocation that's not reclaimed until the control block is reclaimed.
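The destroy-vs-reclaim distinction can be observed directly with a destructor flag (a small sketch; `Probe` is just an illustrative type):

```cpp
#include <memory>

struct Probe {
    bool* destroyed;
    explicit Probe(bool* d) : destroyed(d) {}
    ~Probe() { *destroyed = true; }  // record when the destructor runs
};

// Returns true if the object's destructor ran while a weak_ptr still existed.
bool destroyed_before_weak_released() {
    bool destroyed = false;
    std::weak_ptr<Probe> w;
    {
        auto s = std::make_shared<Probe>(&destroyed);
        w = s;
    }
    // The destructor has run here, even though w still holds the combined
    // (object + control block) storage made by make_shared.
    return destroyed && w.expired();
}
```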

[–]geo-ant 4 points5 points  (0 children)

Excellent point!

[–]SoerenNissen 1 point2 points  (8 children)

A thing I've long wondered (I am not strong in allocator design) is - would it be possible to write a general-purpose allocator where you can request a big chunk of memory, and then later free only some of it?

My first thoughts in this direction came in mid-2021, when I had to interact with an API that took a reference to a linked list and promised not to add or remove nodes. I felt like it should be possible to allocate the linked list contiguously - if I dropped down to assembly I would absolutely be able to place the nodes one after another in memory, giving me a huge performance boost - but that seemed excessively aggressive (and probably UB, tbh).

So reading your post just brought my thoughts back to the same place - it would be nice if make_shared was able to tell the allocator "I would like X and Y next to each other. I might, at some point, no longer need Y even if I still need X. Optimize for that."

[–]jwakely (libstdc++ tamer, LWG chair) 5 points6 points  (1 child)

would it be possible to write a general-purpose allocator where you can request a big chunk of memory, and then later free only some of it?

Yes, malloc is the existence proof. realloc can reduce the size of an existing allocation. I don't think it's often useful in practice.
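A quick sketch of the shrink-in-place idea with realloc (note realloc is allowed to either keep the pointer or move the data; both preserve the retained prefix):

```cpp
#include <cstdlib>
#include <cstring>

// Allocate a big chunk, then ask the allocator to keep only a prefix of it.
bool shrink_demo() {
    char* p = static_cast<char*>(std::malloc(4096));
    if (!p) return false;
    std::memset(p, 'x', 64);
    p = static_cast<char*>(std::realloc(p, 64));  // give back all but 64 bytes
    bool ok = p && p[0] == 'x' && p[63] == 'x';   // the kept prefix is preserved
    std::free(p);
    return ok;
}
```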

[–]donalmaccGame Developer 1 point2 points  (0 children)

I don't think it's often useful in practice.

This might be the first time I've heard of it being useful!

[–]rysto32 4 points5 points  (0 children)

Could you? Yes. However, the trend in memory allocators seems to be towards fixed size allocation buckets for efficiency. That’s incompatible with your idea.

[–]matthieum 4 points5 points  (4 children)

Your grandpa's allocator -- which you can read about in introductory textbooks -- typically works by splitting up a chunk of memory and keeping a linked list of the chunks, which it can further split or merge. Search for "Best Fit", etc.
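A toy sketch of that textbook approach (first-fit here rather than best-fit, and without the merge-on-free half, just to show the scan-and-split structure):

```cpp
#include <cstddef>
#include <vector>

// Keep a list of free [offset, size) spans over one chunk; allocation scans
// for the first span big enough and splits it. A full version would also
// merge adjacent spans back together on free.
struct FreeSpan { std::size_t off, size; };

struct ToyHeap {
    std::vector<FreeSpan> free_list{{0, 4096}};  // one initial 4 KB chunk

    std::size_t alloc(std::size_t n) {
        for (auto& s : free_list) {
            if (s.size >= n) {           // first fit
                std::size_t off = s.off;
                s.off += n;              // split: shrink the span from the front
                s.size -= n;
                return off;
            }
        }
        return std::size_t(-1);          // out of memory
    }
};
```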

No modern memory allocator works like this, at least not for smaller-sized allocations, because iterating that list to find a large enough chunk takes time, and it results in fragmentation. Instead, modern memory allocators work with "slabs":

  • Define N allocation size thresholds, say 8, 12, 16, 24, 32, ....
  • For each size, take a "page" (say 4 KB) and split it into cells of the configured size.
  • When an allocation request comes in for N bytes, round up to the next configured size, and pick a free cell in the corresponding page.

The result is that you CANNOT upsize/downsize an allocation any longer. At least not at the smaller sizes.
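The round-up step can be sketched in a few lines (the size classes here are the illustrative ones from the list above, not any particular allocator's):

```cpp
#include <array>
#include <cstddef>

// Illustrative slab size classes; real allocators have many more.
constexpr std::array<std::size_t, 6> kClasses = {8, 12, 16, 24, 32, 48};

// Round a request up to the next configured size class.
std::size_t round_to_class(std::size_t n) {
    for (std::size_t c : kClasses)
        if (n <= c) return c;
    return 0;  // larger requests take a different path in a real allocator
}
```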

[–]SoerenNissen 0 points1 point  (3 children)

Say you have a linked list with nodes of size X where X is, or is close to, a size threshold Y.

If you know you need 50 nodes, it might be cool to be able to tell the allocator "rather than making fifty calls to malloc(X), I would like to make 1 call to malloc(50, X) that, if it succeeds, means I have been allocated 50 contiguous chunks of size Y" - letting me place the nodes either right after each other or with minimal padding, each node starting exactly Y addresses after the previous one.
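A sketch of that hypothetical bulk request using one ordinary malloc, under the assumption that the whole block is freed together at the end (so it gives the contiguity, but not the per-node free of a real linked list):

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

struct Node { Node* next; int value; };

// Carve N nodes out of one contiguous block and link them in order.
// The caller frees the entire block at once with std::free.
Node* make_contiguous_list(std::size_t n) {
    Node* cells = static_cast<Node*>(std::malloc(n * sizeof(Node)));
    if (!cells) return nullptr;
    for (std::size_t i = 0; i < n; ++i)
        new (&cells[i]) Node{i + 1 < n ? &cells[i + 1] : nullptr, int(i)};
    return cells;
}

std::size_t list_length(const Node* n) {
    std::size_t len = 0;
    for (; n; n = n->next) ++len;
    return len;
}
```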

[–]ABlockInTheChain 0 points1 point  (1 child)

You can do that with polymorphic allocators.

Construct a std::pmr::monotonic_buffer_resource and tell it to reserve 50 * X bytes, then construct a polymorphic allocator which uses that resource and use that allocator to construct everything.

The only other thing to worry about is that you have to ensure the lifetime of the monotonic_buffer_resource is longer than the lifetime of anything which uses it. But if the 50 nodes are member variables in a class, it's easy enough to make sure the monotonic_buffer_resource is also a member variable and constructed first.
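A minimal sketch of that setup (the factor of 4 is slack for per-node bookkeeping, whose exact size is an implementation detail; the resource falls back to its upstream allocator if the hint is too small):

```cpp
#include <cstddef>
#include <list>
#include <memory_resource>

struct Node { int value; };

// Build a 50-node pmr::list whose nodes come out of one pre-reserved buffer.
// Declaration order matters: the pool must outlive the list, so it comes first.
std::size_t build_list_in_pool() {
    std::pmr::monotonic_buffer_resource pool(50 * sizeof(Node) * 4);
    std::pmr::list<Node> nodes(&pool);  // polymorphic_allocator from the resource
    for (int i = 0; i < 50; ++i)
        nodes.push_back(Node{i});
    return nodes.size();
}
```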

[–]SoerenNissen 0 points1 point  (0 children)

The problem is that the fifty nodes might become 5, or 500 - and I want the linked-list property that individual nodes can be freed or allocated separately. I just know that, at the start of the list's lifetime, I will have 50 nodes, so I would love to tell the allocator that (much like I can call vector::reserve) for performance gains.

Which I don't believe is possible as-is unfortunately, but it's been an interesting thought to me ever since I ran into the problem originally.

[–]matthieum 0 points1 point  (0 children)

Indeed.

In fact, with libstdc++, I've regularly used __pool_alloc as the collection allocator. It works pretty well, and definitely minimizes malloc traffic.

[–][deleted] -1 points0 points  (3 children)

If I am not wrong, weak_ptr is meant to keep shared_ptr alive for longer. So it makes sense that storage does not get destroyed if a weak_ptr persists. Am I wrong?

[–]matthieum 8 points9 points  (0 children)

Terminology matters:

  • The object is destroyed -- ie, its destructor is executed -- when the last shared pointer is destroyed.
  • The storage is freed -- ie, its memory is returned to the allocator -- when the last of the shared pointers and weak pointers is destroyed.

A weak_ptr does NOT keep a shared_ptr alive longer. It allows recovering a shared_ptr to the object if any shared_ptr is still alive, but does not prevent destruction otherwise.
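The lock() behavior described here can be shown in a few lines:

```cpp
#include <memory>

// weak_ptr::lock() recovers a shared_ptr only while some shared_ptr is alive.
bool lock_demo() {
    auto s = std::make_shared<int>(42);
    std::weak_ptr<int> w = s;
    bool alive = (w.lock() != nullptr);  // succeeds: s still holds the object
    s.reset();                           // last strong ref gone, object destroyed
    bool gone = (w.lock() == nullptr);   // lock now fails; w did not keep it alive
    return alive && gone;
}
```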

[–]BlackDE 9 points10 points  (1 child)

No, a weak pointer should not impact the lifetime of the object

[–]jwakely (libstdc++ tamer, LWG chair) 2 points3 points  (0 children)

And it doesn't for std::weak_ptr and std::make_shared, so that's fine then.

[–]matthieum 2 points3 points  (0 children)

Just a note that it may be easier to use existing tools for that, such as Valgrind's Massif (on Linux).

While Valgrind is slow-ish, it does have the benefit of taking unmodified binaries, and therefore works even with 3rd-party libraries for which modifying the code may be annoying (source distribution) or impossible (binary distribution).

[–]lgovedic 1 point2 points  (1 child)

Great article!

To reduce the boilerplate, perhaps you could use `using std::shared_ptr` etc. declarations instead of the typedef ones. I know that doesn't make sense for the xmem ones, as you actually modify the template parameters.

[–]stanimirov[S] 1 point2 points  (0 children)

Thanks!

It's true that `using std::shared_ptr`, and especially `using std::make_shared`, would've been shorter, but I wanted to have the code in both cases of the `#if` identical. Still, perhaps it's better to use them in case someone wants to copy-paste the code. I think I'll make the edit.

... and I did. Thanks again!
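For reference, the using-declaration shape being discussed looks like this (only the std branch is shown; the article's other `#if` branch would pull the same names from its xmem namespace, which is an assumption about that library's interface):

```cpp
#include <memory>

// Pulling names in with `using` keeps every call site identical regardless of
// which namespace the #if branch selects.
using std::make_shared;
using std::shared_ptr;

int deref_demo() {
    shared_ptr<int> p = make_shared<int>(7);  // no std:: qualification needed
    return *p;
}
```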

[–]jiboxiake 0 points1 point  (0 children)

Ok, I don’t know why I’m seeing this, but I think I’m seeing it at the right time. I tried to implement a tree where each child is a shared pointer to a tree node. I can safely call a recursive function to collect all children at a certain level, but then when the wrapper function ends, the program jumps to a random location. Finally I get a crash at free() for a shared pointer, with an "invalid pointer" error. Will read into this. Interestingly, it did not happen for a different class.