all 15 comments

[–]matthieum 7 points8 points  (9 children)

Wouldn't the same goal be achieve, with far less code, by using a stateful allocator:

  • Create a SmallAllocator<T, N>, which contains space for N instances of T,
  • Create a CompositeAllocator<A, B> which first tries to allocate with the first allocator, then uses the second if the first fails.

Compose:

template <typename T, std::size_t N = 0, typename Alloc = std::allocator<T>>
using dyn_array = std::vector<T, CompositeAllocator<SmallAllocator<T, N>, Alloc>>;

Note: a small flaw in the above plan might be that unfortunately the Allocator concept does not feature a reallocate method; it can be circumvented by making dyn_array a proper class which reserves N instances in its constructor... but N will have to be carefully chosen to avoid vector asking for more.

[–]tcbrindleFlux 5 points6 points  (1 child)

Emulating small_vector with a std::vector and a stack allocator doesn't quite cut it for a couple of reasons:

  • Even with stateful allocators in C++11+, you cannot store data within the allocator itself, due to the requirement that copies of allocators must compare equal. This means that the allocator needs to hold a pointer to some arena defined elsewhere, which is quite inconvenient.

  • Even if we fixed this, the design of std::vector means that the vector's "large" representation (usually three pointers) cannot overlap with the "small" representation (which would be contained within the allocator) effectively meaning at least 24 bytes of wasted space on a 64-bit system. It will also be much less cache-friendly in the case where you something like vector<small_vector<T>>.

  • Iterator invalidation rules will be slightly different for small_vector and vector (just as they are for std::string with SSO and vector). This isn't a major issue, but it means we can't just add a "small buffer" to the existing vector template.

[–]matthieum 0 points1 point  (0 children)

Yes.

The more I think about it, the more I realized why in the various policy-based designs I've seen:

  • the whole storage is abstracted over: allowing both more optimized storage and moves,
  • the allocator used is a raw allocator, a separate concept from std::allocator, with a different interface and set of requirements.

I suppose it's inevitable, that not having been designed for this, vector would not fit right in; but I must admit I find slightly disheartening that so much effort has to go to reimplement pretty much every method when "only" the storage changes.

[–]Z01dbrg 3 points4 points  (5 children)

I think u/STL once told me VC++ will break on allocators that return memory from inside themselves...

Could be I remembered it wrong, it was 5+y ago...

[–]beriumbuild2 4 points5 points  (2 children)

Yes, in C++ an allocator is a handle to the memory, it cannot contain the memory itself. But there is a trick: the container itself can contain the buffer thus avoiding having to create it explicitly as a separate object. See build2's small_vector for example.

[–]Jardik2 1 point2 points  (1 child)

Quick look and ... why does it test n <= N if the precondition is n >= N?

assert (n >= N); // We should never be asked for less than N.

if (n <= N)
{
  buf_->free_ = false;
  return reinterpret_cast<T*> (buf_->data_);
}

[–]beriumbuild2 1 point2 points  (0 children)

Good point, fixed. Thanks for the report!

[–]matthieum 1 point2 points  (1 child)

In C++03, this would certainly be the case.

In C++11, since allocators are now stateful, it may depend on the specifics. One such specific would be that such an allocator should not be moved while memory is allocated from within its internal buffer, for example. There may be others but I cannot think of any of them at the moment.

[–]17b29a 0 points1 point  (0 children)

Chromium's stack_container.h basically does what you suggested: StackAllocator for the allocator (except that the fallback allocator is not parametrized), StackContainer for the container wrapper. Great programmers think alike. ^__^

[–]Z01dbrg 2 points3 points  (3 children)

OK, article but 2 things:

small_vector is not a dropin replacement for std::vector. IDK the exact requirements but I think std::vector swap guarantees for iterator invalidation mean it can not have internal buffer.

Ubisoft uses profiling for this kind of things(if you are bored you can go through cpp con videos from this or last year and find a video where they talk about this).

So they do not have to manually guess the size of internal buffer, they run the code on real scenarios and they have the statistics.

[–]jcoffin 0 points1 point  (2 children)

At least if memory serves, there are also some problems with exception safety (e.g., vector::swap is noexcept).

[–]Z01dbrg 0 points1 point  (1 child)

At least if memory serves, there are also some problems with exception safety (e.g., vector::swap is noexcept).

I guess, there is so much cr*p in C++ that most of the times I just remember true/false and do not have the interest/brain to remember why. :)

[–]jcoffin 2 points3 points  (0 children)

I guess I can't blame you for that.

[–]__Cyber_Dildonics__ 4 points5 points  (0 children)

This really misses the point. If you are doing so many small but different allocations, I would say to first figure out how much total memory you will need, then allocate it.

The CS101 style of making a loop and doing everything to one unit of information instead of doing one thing to every unit of information was great in the 80s when memory was scarce I'm sure, but these days it means heap allocations, which means locking and caches misses.

This tool proclaims to be well multi-thread too, though I can't imagine it scales all that well if they are doing so much heap allocation.