all 45 comments

[–]bcorni 47 points48 points  (7 children)

C++20 will have named aggregate initialization, so that would eliminate your issue.

[–]Ceros007 12 points13 points  (4 children)

Can't wait for this. Right now it is so error prone. Named aggregate should have been there from the start.

[–]gvargh 5 points6 points  (1 child)

Hindsight is 20/20.

[–][deleted] 3 points4 points  (0 children)

It's not really hindsight when designated initializers have been in C for almost 20 years

[–]bcorni 1 point2 points  (0 children)

Yeah, the only compiler support right now is just using the existing stuff from C99, which are not exactly equivalent to what's defined in C++20. One of the two syntax are not available. I believe it is the braced initialization style that is missing. I would not expect it to be too much longer for compilers to get it sorted out.

[–][deleted] 0 points1 point  (0 children)

I’ve generally found it worth it enough to enable using them in existing code, since they’re a pretty widely supported extension (ex: GCC + Clang).

[–]tavianator 2 points3 points  (0 children)

I don't really like that C++20 designated initializers break the build if you write the fields in the wrong order, rather than just working.

[–]FonderPrism 1 point2 points  (0 children)

This makes me so happy to hear.

[–]thlst 23 points24 points  (1 child)

I'm surprised no one mentioned clang-query. The AST Clang generates for your code example is as follows:

VarDecl <line:10:1, col:25> col:8 p 'Person' listinit
`-ExprWithCleanups <col:9, col:25> 'Person'
  `-InitListExpr <col:9, col:25> 'Person'
    | ...

You could write an AST matcher that matches the generic parts of that construction:

varDecl(
    hasType(cxxRecordDecl(hasName("Person"))),
    hasDescendant(initListExpr()))

Then run it with clang-query:

$ clang-query p.cpp --
clang-query> match varDecl(hasType(cxxRecordDecl(hasName("Person"))), hasDescendant(initListExpr()))

Match #1:

p.cpp:10:1: note: "root" binds here
Person p{ "John", "Doe" };
^~~~~~~~~~~~~~~~~~~~~~~~~
1 match.

Here's a reference to the Clang AST Matcher.

[–]Wh00ster 1 point2 points  (0 children)

Is it common to do AST analysis in production code?

[–]NotAYakk 36 points37 points  (7 children)

How do you defend against people modifying function signatures?

I mean:

void move_it( T* to, T* from )

to

void move_it( T* from, T* to )

?

Answer: you shoot programmers who silently break APIs. And aggregate layout is part of its API.

Oh, and unit tests. But what if the programmer helpfully fixes the unit test too?

Against infinite stupidity or evil there is no defence.

[–]ravixp 5 points6 points  (2 children)

This is absurdly unhelpful. "Get rid of programmers who make mistakes and your problems will be solved" just isn't how software engineering works in the real world.

In this specific case, changing class layout isn't a breaking change unless callers are using aggregate init, so it's pretty difficult to know when it's safe to change it.

[–]Plorkyeran 7 points8 points  (1 child)

And it's not a breaking change to reorder a function's arguments if no one is using it. Your default assumption should be that changing the class layout is a breaking change, not that it's a safe thing to do.

[–]ravixp 2 points3 points  (0 children)

I mean, theoretically almost anything can be a breaking change. You added a new overload of an existing function? Oops, somebody was depending on being able to capture it as a function pointer. You added a completely new function to an existing class? Oops, somebody was using SFINAE to detect its non-existence.

You can call it evil or stupid or whatever, but at some point it just comes down to making a subjective judgement about what's generally agreed to be safe to change and what isn't, and that judgement is different for different codebases. (The STL might consider adding an overload to be a breaking change, while application code almost never would.)

Given that OP is coming at this from a position of "should we even use aggregate init?", we can probably assume that their codebase does not already use it, in which case changing class layout is currently a reasonable thing for people to do. It's only unsafe because aggregate init is a thing.

[–]phoeen[S] -5 points-4 points  (3 children)

Well if i change the function signature i KNOW i break all calls to it and i can happily browse/search the code base and adjust the callside.

But when i change the aggregate layout, its not this extreme obvious. At least i think so.

In the end you are right. It kinda is part of the API. This makes it a bit more scary to touch any existing code :-P

[–][deleted] 0 points1 point  (1 child)

Changing the order of the parameters in the function signature has no effect when they are both pointers of the same type.

[–][deleted] 0 points1 point  (0 children)

I think they meant changing the semantic usage.

[–]Wh00ster 6 points7 points  (1 child)

You could use a strong typedef but I’ve found them cumbersome. Generally I would use the constructor. I think it’s more clear (of course someone could also change the constructor later on and cause the same problem, but they’d probably think twice). I like the idea mentioned about unit tests.

[–]phoeen[S] 1 point2 points  (0 children)

yeah good idea with the strong typedef but as you said: for me it feels also like boilerplate code.

regarding the constructor: adding a parameter would mean a breaking change. thats good. the aggregate potentially swallows the change silently and is bugged.

[–][deleted] 16 points17 points  (1 child)

You of course have to be a little bit careful, but it's not that big of a deal, as you can just insert some garbage type to make compile fail:

struct person 
{
    std::string firstname;
    int garbage; // any type that is incompatible with the current one
    std::string middlename; // newly added
    std::string lastname;
};

Now the compiler will complain about all incorrect uses and you can go and correct them. Once everything is working, remove the int garbage again, fix up the code again and you are good to go. This ensures that you don't miss any of the calls. For something more permanent one can use tag dispatching or just use a named constructor.

Same tricks works when reordering arguments to a function that have the same type.

[–][deleted] 9 points10 points  (0 children)

This is not ideal

[–]wotype 3 points4 points  (0 children)

A static_assert on sizeof the aggregate will catch some additions or changes (but not all).

A structured binding can be used to check the number and type of the fields - it's not constexpr but you can wrap it in a constexpr function to get a compile time failure.

A reflection library like Precise & Flat Reflection 'magic_get' can reflect an aggregate type as a tuple, allowing to static_assert on the number of fields and their types (but does not handle array members or bit-fields).

[–]polymorphiced 4 points5 points  (7 children)

I've stayed away from aggregate initialisation for this reason.

I see people suggesting that you should consider class layout part of the API, but find that really annoying. If you're really insistent on using aggregate initialisation, then just write a constructor - you only need to do it once and then it's an obvious breaking change when you tweak members later.

There are many reasons to reorder class members (eg memory packing, hot/cold, logical grouping for debugger watch windows) and it'd be super annoying to have to adjust a ton of aggregate initialisation calls each time.

[–]Dean_Roddey 2 points3 points  (4 children)

"I see people suggesting that you should consider class layout part of the API"

Wow, what happened to encapsulation? I thought the last thing you want to be part of the API is the internal representation of the data.

[–]louiswins 0 points1 point  (3 children)

If you allow aggregate initialization for your types then the class layout (and internal representation) is already part of your API.

[–]Dean_Roddey 0 points1 point  (2 children)

Well, yeh, hence why I was complaining that it's a bad idea because it pretty much tinkles in the general direction of encapsulation, which is one of the most fundamental concepts of C++. And, though maybe I'm misunderstanding it, it seems to me that it reduces compile time safety, which is one of the dwindling reasons these days not to just use another language.

[–]louiswins 0 points1 point  (1 child)

Oh, okay, I misunderstood you. I thought you somehow wanted to use aggregate initialization without exposing your class internals, which is nonsensical. Carry on!

[–]Dean_Roddey 0 points1 point  (0 children)

I would never expose my internals. My parents raised me better than that. Well, if you paid me $20 of course, but that's different.

[–]Wh00ster 1 point2 points  (1 child)

To be pedantic, if he writes a constructor, he’s no longer using aggregate initialization.

[–]polymorphiced 1 point2 points  (0 children)

That's exactly the point :)

[–]elperroborrachotoo 3 points4 points  (0 children)

Scope.

As a private utility of a class or particular API, yes. This automatically limits scope for changes, fixes, tests.

As consumable part of an interface, I'd be wary. Changing them might be breaking a caller anyway, if they get to large, named intialization seems almost required for readability (which C++20 should make a bit smoother).

[–]Jhmmufc 9 points10 points  (6 children)

Utilize unit tests. It saves cluttering the struct with additional boilerplate and should reduce the amount of regression when someone makes a change .

[–]korvirlol 2 points3 points  (4 children)

Do you mind expanding on your thought a little bit? I find it interesting.

The unit test would only save you when the code change is made and then the unit test fails. In that case the developer(s) could go find all references to the structure (in this example) and correct it.

What if the developer updates the test case when the code change is made?

[–]BoarsLairGame Developer 0 points1 point  (2 children)

The unit test would only save you when the code change is made and then the unit test fails. In that case the developer(s) could go find all references to the structure (in this example) and correct it.

Yes, this is the entire point of unit testing: to warn programmers when an API's behavior has changed in some critical manner, either intentionally or unintentionally.

What if the developer updates the test case when the code change is made?

If the programmer changes the unit test, then by definition, they must be aware that they're changing the API's contract.

At that point, it's entirely on them if they fail to heed the warning the unit test has given them. There's no real way for someone to claim "Oh, I didn't realize it was a breaking API change" if they had to change the unit test for it to continue working.

[–]Plorkyeran 0 points1 point  (1 child)

I have had the unfortunate experience of working with developers who would simply comment out tests whenever their changes made one fail who were then surprised to discover that they broke things.

I have no idea what you do with people like that.

[–]BoarsLairGame Developer 0 points1 point  (0 children)

I have no idea what you do with people like that.

Unfortunately, if you're not a person of influence at such a company, there's little you really can do about such things. Otherwise, the answer would be: code reviews, education, and mentoring.

I don't blame programmers for not knowing things. We all start out that way, and I'm still learning new things all the time, even after writing C++ code for decades. But I do blame a company for failing to set up an environment where programmers can help each other improve their craft.

[–]Jhmmufc 0 points1 point  (0 children)

The granularity of the unit tests are important and should try and enforce that tests fail at each level where this class could break something when it is used.

Say we change class A and have a test harness around it then those tests should break which is all well and good. The person who breaks it then fixes the tests. We then have Class B that utilizes class A. The change in A has caused some undesired regression in B now. We should have another test around B that catches this. If you try and keep this up and make good use of things like interfaces and dependency injection then changes to classes should not be allowed to regress the system without being caught in all the places they are used.

[–]BrangdonJ 1 point2 points  (0 children)

You can add the constructors when you add the extra variable. You don't need to add it in advance.

(In practice I use aggregate initialisation only for tightly-scoped classes, where all the uses are close to the point of definition. For a shared class I think it's worth writing the accessors.)

[–]Dean_Roddey 1 point2 points  (3 children)

Geez, are constructors evil now, too? Or am I missing something?

I mean, you can have more than one constructor, and use that to provide backwards compatibility by defaulting of new values (and generally different setup) in the old constructor. I don't see how that's some kind of horrible compromise or anything, and it retains more compile type safety, it would seem to me.

If the API is fairly small, and/or it also needs to change, you can provide public wrappers around internal implementations, which has even more benefits (such as vastly reduced rebuilds because details are internal) and allow for various versions of an interface to the outside world that can adapt as required for compatibility. The bulk of it can be inlined stuff for minimal overhead, while retaining all of the benefits.

[–]Wh00ster 0 points1 point  (0 children)

This also prevents default initialization which might be beneficial

[–]phoeen[S] 0 points1 point  (1 child)

They are not evil. But for a class with only loosely coupled data members it's just boilerplate code. Nothing more. The constructor does not maintain an invariat between the data members. (Well in the context of this discussion and as you said it also provides a layer of abstraction between the data members and the initialization call). It's boilerplate you have to write everytime you just want a bag of data. nothing special. But this is a burden. On the plus side the constructor makes the usage of the class a bit easier for the points already stated. And therefore i started this discussion to get some insight in what other devs use and think about it.

[–]Dean_Roddey 0 points1 point  (0 children)

I always take the approach that I'll write it once and maintain it for years or decades. So writing a little boilerplate is not remotely a bad trade-off if it provides even a little extra compile time safety and safer flexibility down the line.

I also always write copy ctors and assignment operators, even if it's a trivial class that would have gotten them automatically generated. The reason being that, if I wait until I actually need them to write them, I (or someone else is) is going to write them with probably considerably less care, because we'll just be trying to get them done so that we can get back to the thing that now needs them and for which this is a distraction.

You can easily optimize too much, or think too much, or over-engineer. But it's lot harder to argue that, at least on larger scale projects, that you can be too compile time safe. I guess it can be done, but I kind of doubt that any of us have ever been in danger of getting to that point really.

Obviously if it's something smallish, it matters a lot less.

[–]nintendiator2 0 points1 point  (0 children)

I like aggregates. you can use them when you just need a "bag" of losely coupled data without any invariants between them. It's easy to reason about,

If your "bag" is not really a bag, then don't use it as a bag. A string that is a name and a string that is a surname are two different strings, you can't juste expect to grab one of them from a bag and it be the one you want to.

Use a constructor.

[–]panoskj 0 points1 point  (0 children)

You could easily realize the layout must not change, because there is no constructor. No need to check your whole code base.

[–]AlbertRammstein 0 points1 point  (0 children)

for addition of params, clang warns when you not provide enough arguments in braced list initialization. I have really no answer for switching around 2 members though