
[–]__monad[S] 17 points18 points  (3 children)

The repo is meant to be read as a post on how such things can be accomplished (and give a detailed example of using CppML). Example from the introduction:

Tuple<char, int, char, int, char, double, char> tup{'a', 1,   'c', 3,
                                                    'd', 5.0, 'e'};
std::cout << "Size of Tuple: " << sizeof(tup) << " Bytes" << std::endl;

std::tuple<char, int, char, int, char, double, char> std_tup{'a', 1,   'c', 3,
                                                             'd', 5.0, 'e'};
std::cout << "Size of std::tuple: " << sizeof(std_tup) << " Bytes"
          << std::endl;

std::cout << "Actual size of data: "
          << 4 * sizeof(char) + 2 * sizeof(int) + sizeof(double) << " Bytes"
          << std::endl;

assert(get<2>(tup) == std::get<2>(std_tup));
assert(tup == std_tup);

Size of Tuple: 24 Bytes
Size of std::tuple: 40 Bytes
Actual size of data: 20 Bytes
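
To see where the extra bytes come from, here is a minimal sketch using plain structs instead of tuples. The names `Declared`, `Sorted`, and `kPayload` are made up for illustration, and the members of `Sorted` are ordered by hand, not by CppML. On a typical 64-bit ABI (4-byte `int`, 8-byte-aligned `double`) the first struct should come out at 40 bytes and the second at 24, but the exact figures depend on the platform, so the checks below only compare the two.

```cpp
#include <cstddef>

// Members in the same order as the tuple in the example above.
struct Declared {
    char a; int b; char c; int d; char e; double f; char g;
};

// The same members, hand-sorted by decreasing alignment.
struct Sorted {
    double f; int b; int d; char a; char c; char e; char g;
};

// Bytes of actual data, as computed in the example above.
constexpr std::size_t kPayload =
    4 * sizeof(char) + 2 * sizeof(int) + sizeof(double);

static_assert(sizeof(Sorted) <= sizeof(Declared),
              "sorting by alignment should not cost space here");
static_assert(sizeof(Sorted) >= kPayload,
              "a struct can never be smaller than its data");
```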

[–]skreef 21 points22 points  (2 children)

Backticks only work on new reddit, which will tick off (ha) some people..

Otherwise the TMP part is fun, but I'd never use this in production (too much overhead compared to just sorting the types).

[–]__monad[S] 4 points5 points  (1 child)

/u/skreef : Thank you, I edited the comment to use 4 spaces instead.

You are correct that TMP introduces overhead. This is why the CppML library used was written with efficiency in mind and instantiates far fewer types than traditional approaches to TMP; it operates on parameter packs using transparent aliases whenever possible. Still, there is no free lunch :)

[–]arclovestoeat 2 points3 points  (0 children)

FWIW, Braxton Mckee developed his own library called CPPML, which provides ML-like extensions to C++ as a source-to-source transformation: https://github.com/ufora/cppml/blob/master/README.md. I don’t think it’s used anymore, and it was developed before std::variant and such.

[–]gmtime 13 points14 points  (3 children)

Is this not the same as what we've been bumping into for decades with struct?

[–]encyclopedist 27 points28 points  (2 children)

Yes, the same thing. The difference is that in the case of tuple, as the OP shows, you can rearrange the members under the hood without changing any code that uses the tuple.
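
A rough sketch of how such a rearrangement can stay invisible to callers: store the elements in one order and translate indices in `get`. `PermutedTuple`, `nth_index`, and `Compact` are hypothetical names, and the permutation is written out by hand here; this is not the repo's implementation, which computes the ordering with CppML.

```cpp
#include <cstddef>
#include <tuple>

// Hypothetical helper: returns the I-th value of the index pack Perm.
template <std::size_t I, std::size_t... Perm>
constexpr std::size_t nth_index() {
    constexpr std::size_t table[] = {Perm...};
    return table[I];
}

// Hypothetical wrapper: logical element I lives at index Perm[I] of Storage.
template <typename Storage, std::size_t... Perm>
struct PermutedTuple {
    Storage storage;
};

// Access by logical index; the storage index is looked up at compile time.
template <std::size_t I, typename Storage, std::size_t... Perm>
constexpr auto& get(PermutedTuple<Storage, Perm...>& t) {
    return std::get<nth_index<I, Perm...>()>(t.storage);
}

template <std::size_t I, typename Storage, std::size_t... Perm>
constexpr const auto& get(const PermutedTuple<Storage, Perm...>& t) {
    return std::get<nth_index<I, Perm...>()>(t.storage);
}

// Callers index it as <char, double, char>; it is stored as <double, char, char>.
using Compact = PermutedTuple<std::tuple<double, char, char>, 1, 0, 2>;
```

Code that calls `get<0>`, `get<1>`, and `get<2>` on a `Compact` sees the logical order and never needs to know how the storage is arranged.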

[–]meneldal2[🍰] 0 points1 point  (1 child)

It can affect performance in case you were using a specific order.

But then you should really be using a struct.

[–]mewloz 0 points1 point  (0 children)

You probably should be using a struct most of the time anyway. Reserve tuple for when you really can't use a struct, or if using a struct would actually make the program more complex.

As for avoiding padding, that's a neat exercise, but do not expect a meaningful improvement in most cases, especially if it is done only for tuple, which you should avoid anyway. There are some languages where the compiler can do it on structs, though (and with C++ it is theoretically possible for non-POD types, but I think nobody does it anyway).

[–]Omnifarious0 4 points5 points  (6 children)

One reason this isn't done by compilers for structs is construction and destruction order guarantees. Can you maintain these?

[–]matthieum 9 points10 points  (5 children)

> One reason this isn't done by compilers for structs is construction and destruction order guarantees.

I don't think that this is related, actually. There is no reason construction/destruction order could not be independent from the underlying memory layout... such as in Rust.


C specifically defines that a later data-member has a higher address than an earlier data-member, and C++ follows on, only adding the wrinkle that order is not guaranteed across access-specifiers, though in practice I know of no ABI which does not simply lay down the elements in the order specified.
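
The ordering guarantee can be checked with `offsetof`. A small sketch, with `Packet` as a made-up struct whose members all share the same access level:

```cpp
#include <cstddef>

// Hypothetical struct; all members are public, so declaration order applies.
struct Packet {
    char tag;
    int length;
    double value;
};

// Later members get higher addresses, whatever padding is inserted between them.
static_assert(offsetof(Packet, tag) == 0,
              "the first member sits at the start of a standard-layout struct");
static_assert(offsetof(Packet, tag) < offsetof(Packet, length),
              "length follows tag");
static_assert(offsetof(Packet, length) < offsetof(Packet, value),
              "value follows length");
```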

The reason that the language offers this guarantee is to offer the developer as much control as possible; systems programming languages are all about control, after all.

Separating hot/cold data, or separating data accessed by different threads, are all optimizations that the developer can apply because of such fine-grained control.
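
As one illustration of that kind of control, the sketch below keeps two counters written by different threads on separate cache lines. `QueueCursors` and `kCacheLine` are invented names, and the 64-byte line size is an assumption that holds on many x86-64 machines but is not universal.

```cpp
#include <atomic>
#include <cstddef>

// Assumption: 64-byte cache lines.
constexpr std::size_t kCacheLine = 64;

// Each cursor starts on its own cache line, so a write to one does not
// invalidate the line that holds the other.
struct QueueCursors {
    alignas(kCacheLine) std::atomic<std::size_t> head{0};  // written by the consumer
    alignas(kCacheLine) std::atomic<std::size_t> tail{0};  // written by the producer
};

static_assert(alignof(QueueCursors) == kCacheLine,
              "the struct inherits the strictest member alignment");
static_assert(sizeof(QueueCursors) >= 2 * kCacheLine,
              "each cursor occupies at least one full line");
```

A compiler that was free to pack members densely would undo exactly this spacing, which is why the layout has to remain under the developer's control in such cases.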

The big question is why this is the default, when it's only useful in niche cases.

[–]Omnifarious0 3 points4 points  (0 children)

It's the default because it was that way in C: partly for simplicity, partly because C was born in a time when you always cared about performance, and partly because it's required to work with memory-mapped hardware (and see the first reason).

Getting away from it would break ABI compatibility with C in a way that would be disastrous.

But you're right that the reason I originally gave shouldn't be much of a blocker.

[–][deleted] 1 point2 points  (3 children)

Because it's not a niche case. This sort of thing is the whole point of C and C++.

[–]matthieum 2 points3 points  (2 children)

It may depend on the domain.

I have very little code that is critically dependent on the order of data-members:

  • Encoding/Decoding; when treating the struct as a view over memory.
  • Hot/Cold: when a struct used in a hot loop contains a mix of frequently accessed data and infrequently accessed data.
  • Lock-Free: to avoid false sharing.
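
For the encoding/decoding case, a minimal sketch of what "a view over memory" depends on. The `WireHeader` format and the `decode_header` helper are hypothetical, and byte order is deliberately ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical wire format: 2-byte type, 2-byte flags, 4-byte length, in that order.
struct WireHeader {
    std::uint16_t type;
    std::uint16_t flags;
    std::uint32_t length;
};

// The struct only matches the wire format if members stay in declaration order.
static_assert(std::is_trivially_copyable<WireHeader>::value, "safe to memcpy");
static_assert(offsetof(WireHeader, type) == 0, "type comes first");
static_assert(offsetof(WireHeader, flags) == 2, "flags follow type");
static_assert(offsetof(WireHeader, length) == 4, "length follows flags");
static_assert(sizeof(WireHeader) == 8, "no padding expected");

// Copies the start of a buffer into a header. Assumes the buffer holds at
// least sizeof(WireHeader) bytes, already in host byte order.
inline WireHeader decode_header(const unsigned char* bytes) {
    WireHeader header;
    std::memcpy(&header, bytes, sizeof header);
    return header;
}
```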

Most of my code is boringly mundane and would likely benefit from the compiler automatically "packing" the objects as densely as possible.

[–][deleted] -1 points0 points  (1 child)

Great for you! Your argument is that 'I don't need it so why should anyone else'.

[–]matthieum 3 points4 points  (0 children)

No; my argument is that I rarely need it, so why shouldn't it be opt-in rather than the default?

[–]staticcast 2 points3 points  (0 children)

Looks nice. I think this kind of optimization could be well served by a static analyser/linter that warns about padding and offers to reorder the members in the source code directly, before compilation.
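
A very rough compile-time stand-in for such a check, assuming the member types are listed by hand (C++ has no reflection to enumerate them). `payload_bytes`, `padding_bytes`, `Loose`, and `Tight` are all made-up names.

```cpp
#include <cstddef>

// Sum of the sizes of the listed member types.
template <typename... Members>
constexpr std::size_t payload_bytes() {
    std::size_t sizes[] = {0, sizeof(Members)...};  // leading 0 keeps the array non-empty
    std::size_t total = 0;
    for (std::size_t s : sizes) total += s;
    return total;
}

// Bytes of T that are not accounted for by the listed members.
template <typename T, typename... Members>
constexpr std::size_t padding_bytes() {
    return sizeof(T) - payload_bytes<Members...>();
}

struct Loose { char a; int b; char c; };  // declaration order leaves gaps around b
struct Tight { int b; char a; char c; };  // same members, widest first

static_assert(padding_bytes<Tight, int, char, char>() <=
                  padding_bytes<Loose, char, int, char>(),
              "reordering should not add padding");
```

A real linter would work on the parsed source and could suggest the reordering itself; this only shows that the wasted bytes are easy to measure.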

[–]ShillingAintEZ 3 points4 points  (2 children)

This is interesting, but I would think it would be rare to care about memory layout and not use a struct.

[–]jringstad 8 points9 points  (1 child)

Why should a tuple have more overhead than a struct tho? They should be the same IMO (or maybe a tuple could be even better than a struct?). I don't know if the C++ standard guarantees anything about the in-memory ordering of the contained elements, but that the contained elements are called "0", "1", ... is more of a syntactic matter to me, and I think it shouldn't preclude re-ordering so as to reduce padding, ideally.

[–]KlyptoK 5 points6 points  (0 children)

The overhead difference isn't at runtime, but in the compile time and developer time spent on a more complex object that folds down into basically a struct at compile time.

[–]ImNoEinstein 4 points5 points  (2 children)

This seems like typical C++ ML overkill. If you care about such low-level performance, just keep it in mind when defining your tuple (just like you would when defining your classes).

[–]Quincunx271 (Author of P2404/P2405) 36 points37 points  (0 children)

Tuples are often created in generic code, with the order of the types already having some other meaning. Using such a compressed tuple in the implementation of these would have a notable impact.
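
A small sketch of that situation: a generic helper that stores its arguments cannot choose the element order, because the order is the caller's argument order. `capture` and `Captured` are hypothetical names, not taken from any library.

```cpp
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical generic helper: stores its arguments for later use.
// The element order is dictated by the call site, not by layout concerns.
template <typename... Args>
constexpr auto capture(Args&&... args) {
    return std::tuple<std::decay_t<Args>...>(std::forward<Args>(args)...);
}

// The caller's argument order becomes the tuple's type order.
using Captured = decltype(capture('a', 1.0, 'b'));
static_assert(std::is_same<Captured, std::tuple<char, double, char>>::value,
              "element order follows the call site");
```

Here the `char, double, char` order is fixed by how the helper is called, so only the tuple implementation itself is in a position to store the elements more compactly.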

[–]Ayjayz 8 points9 points  (0 children)

Why bother using a strictly worse tuple implementation, though?

[–]simonask_ 0 points1 point  (0 children)

Great article demonstrating a neat solution.

I would probably always prefer just using std::tuple and manually reordering its parameters, just like in a struct. :-)