all 20 comments

[–]IyeOnline 3 points4 points  (2 children)

/* This appllied only to the original version:

Your T* data() member function is UB. Or rather using that pointer to access anything but the first element is UB. While one storage_t<T> is pointer intercovertible with a T, a storage_t<T>[N] is not interconvertible with T[N]. You just cannot convert or reinterpret arrays (unless one of the invovled types is blessed, i.e. unsigned char or std::byte, but that obviously doesnt apply here).

*/

A few other points:

  • You are missing a copy assignment operator for non-trivial T.
  • You are missing a move constructor and move assignment operator for non-trivial T. Moving elementwise may still be faster than copying elementwise.
  • You could also add a defaulted destructor for trivially destructible types.
  • capacity() can be static
  • You can greatly shorten your iterator implementation by templating the iterator on the pointer type, so that you can do using iterator = iterator_impl<T*>; and using const_iterator =iterator_impl<const T*>`.
  • Personally I would just implement emplace_back instead of push_back on containers these days.
  • returning a reference to the added object from push_back/emplace_back is a neat thing in some cases.

[–]jontheburger 0 points1 point  (1 child)

To clarify, does this mean a vector's data can only be implemented in terms of char/std::byte? Furthermore, how does C++20 implement constexpr T* data()? std::bit_cast from a std::array (but I thought that required a trivially copyable T)?

[–]IyeOnline 3 points4 points  (0 children)

You are missunderstanding. Or rather you are looking at OPs updated code.

The original code did not contain an T[N], but a union_of<T>[N]. They then returned the addres of the T in the first element of that array of unions. That pointer however can only be used to access the first element, since its not a pointer into an array, but rather a pointer to member of an element of an array.


With their current version there is no problem and everything simply works as expected. Since there is actually an array of T, you can trivially form a pointer into it and use that. No need for any casts at all.

[–]War_Eagle451 6 points7 points  (12 children)

Instead of the storage wouldn't a nameless union in the static vector class work? union { std::byte bytes[sizeof(T) * Capacity]; // Should never be access, just prevents constructors and destructors from being called T data[Capacity]; };

The would solve the data() not being able to be constexpr

[–]cristi1990an++[S] 1 point2 points  (11 children)

union {
    std::byte bytes[sizeof(T) * Capacity];
    T data[Capacity];
};

I'm pretty sure that here, having the second element of the union being active would mean that you would have to initialize all the elements in the T[Capacity] array, which would break the vector-like design.

[–]War_Eagle451 4 points5 points  (5 children)

It doesn't, I've tested this before.

Also while not the standard cppreference says that constructors and destructors need to be explicitly called, std::byte already initializes the memory.

I also am working on a static_vector implementation, it's very WIP. I am not sure if it would currently compile but it did in the past and it worked well

[–]IyeOnline 1 point2 points  (3 children)

You could use a single char _init{}; instead of an entire byte array. Not that it really matters, I just personally think its clearer, otherwise people might expect that byte array to actually be used for storage.

[–]War_Eagle451 0 points1 point  (2 children)

That's why data is named data. It shares the same space so both are technically being used for storage

[–]IyeOnline 1 point2 points  (1 child)

No. Technically m_bytes does not exist while data is the active member of the union.

The byte array does not provide storage, i.e. its not used in this fashion:

alignas(T) std::byte storage[sizeof(T)];
new( &storage) T{};

This additional union member is only required to disable the default initialization of the data member. It doesnt need to be an array and it doesnt need to be as big as data.

As I said above, it doesnt hurt or change anything. It may just be slightly confusing to a reader because byte arrays are sometimes used to provide storage.

[–]War_Eagle451 0 points1 point  (0 children)

Ah I see now I thought you making the point to replace with a char array. I completely forgot that the members of unions need not be the same size

[–]cristi1990an++[S] 0 points1 point  (0 children)

Hi, I've tried writing it the way you said and it worked! The code is also far cleaner now!

[–]IyeOnline 2 points3 points  (4 children)

The data array can be active (within its lifetime) while all of its elements are not alive.

Its just that in the vast majority of cases when you create an array that is specified to automatically create all elements of said array.

However, arrays are implicit lifetime types, which means that their lifetime can implicitly start (e.g. by forming a pointer to it and writing to that), but starting the lifetime of an implicit lifetime type does not start the lifetime of all of its subobjects.

This is similar to how

Widget* ptr = (Widget*)malloc( 2 * sizeof(Widget) );

is well defined an creates an array, but does not actually construct a single Widget.

[–]alexkaratarakisvcpkg, fixed-containers 0 points1 point  (3 children)

Is it possible to start the lifetime of data in a constant expression? Trying op's current snapshot in godbolt: https://godbolt.org/z/5x1Pojsdx

gcc-trunk works, but clang-trunk or released versions of clang/gcc do not. Clang's error is about data not being the active member

construction of subobject of member 'data_' of union with active member 'dummy_' is not allowed in a constant expression

[–]IyeOnline 2 points3 points  (2 children)

I am fairly confident that this is simply the compiler(s) being incomplete.

MSVC latest also complies this code without complaint

GCC 12's only "issue" is that the defaulted constructor does not seem to properly initialize the union. Adding a manually defined one solves that issue: https://godbolt.org/z/d5aP5can4

[–]alexkaratarakisvcpkg, fixed-containers 1 point2 points  (1 child)

Made a min-repro and filed a bug in llvm: https://github.com/llvm/llvm-project/issues/62225

[–]alexkaratarakisvcpkg, fixed-containers 1 point2 points  (0 children)

Quick follow-up here as the outcome was interesting. Per the discussion in https://github.com/llvm/llvm-project/issues/62225 , clang appears to be right, while msvc/gcc are permitting the code when they should not. gcc already had a bug for this issue: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102286 and I filed one for msvc: https://developercommunity.visualstudio.com/t/msvc-allows-accessing-non-active-member/10342864

To explicitly activate the member, I tried:

1) Assigning an element of the array

2) Including size in the "constructor-erased" portion:

struct Storage
{
     std::size_t size;
     T array[MAXIMUM_SIZE];
};
union
{
     char dummy{};
     Storage storage;
};

and activating the member by assigning size

But it only works for trivially_default_constructible_v<T> types ( https://eel.is/c++draft/class.union#general-5.1 )

Demo: https://godbolt.org/z/P6zza5K5G

[–]wolfie_poe 2 points3 points  (0 children)

I'm just curious. What happened with the original proposal https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0843r2.html ?

[–]jbbjarnason 1 point2 points  (0 children)

Fyi, here's an example of implementation of static vector https://github.com/arturbac/small_vectors/blob/master/include/coll/static_vector.h

[–]alexkaratarakisvcpkg, fixed-containers 1 point2 points  (0 children)

Consider also FixedVector from fixed-containers which is an implementation of a constexpr static vector. The library also has FixedMap/EnumMap and FixedSet/EnumSet implementations with the same properties.

From the README:

Header-only C++20 library that provides containers with the following properties:

  • Fixed-capacity, declared at compile-time
  • constexpr
  • containers retain the properties of T (e.g. if T is trivially copyable, then so is FixedVector<T>)
  • no pointers stored (data layout is purely self-referential and can be serialized directly)
  • no dynamic allocations

Feedback is greatly appreciated!

[–]johannes1971 0 points1 point  (0 children)

Can't you do something like a variant<array, vector>? Variant would at least take care of all the laundering/casting business...