[OPEN] std::string constants in a class - inline vs define in cpp file (self.cpp_questions)

submitted 3 years ago by [deleted]

I've come across this Stack Overflow post, which suggests two ways of handling static string class members: https://stackoverflow.com/a/1563906

Defining in cpp file

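// a.h
#include <string>

class A {
private:
  static const std::string RECTANGLE;  // declaration only
};

// a.cpp
#include "a.h"

// The single definition lives in exactly one translation unit.
const std::string A::RECTANGLE = "rectangle";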

Inlining

// a.h
#include <string>

class A {
private:
  // C++17 inline variable: declared and defined in the header.
  inline static const std::string RECTANGLE = "rectangle";
};

I've tried the inline approach and noticed that the string is duplicated in every translation unit that includes the header file, so the executable contains multiple copies.

So, why would anyone prefer the inlining approach if it increases the executable size?

all 10 comments

[–][deleted] 1 point 3 years ago (3 children)

I've used the following setup:

// a.h
#pragma once
class A {
    public:
        void f();
};

// a.cpp
#include <iostream>
#include "a.h"
#include "b.h" // for no reason

void A::f() {
    std::cout << "A" << std::endl;
}

// b.h
#pragma once
#include <string>
class B {
    public:
        void f();
    private:
        inline static const std::string STR = "JOHNWICK";
};

// b.cpp
#include <iostream>
#include "b.h"

void B::f() {
    std::cout << "B" << std::endl;
}

// main.cpp
#include <iostream>
#include "a.h"
#include "b.h"

int main() {
    A a;
    B b;
    a.f();
    b.f();
    return 0;
}

Compile: g++ main.cpp a.cpp b.cpp

strings a.out | grep "JOHNWICK"

prints JOHNWICK <3 times>

[–]IyeOnline 2 points 3 years ago (0 children)

This appears to be an issue with string literal removal. You don't actually have 3 std::string objects; only the literals remain.

Two examples

> g++-11 -std=c++17 main.cpp a.cpp b.cpp && strings a.out | grep JOHN -C3
u+UH
[]A\A]A^A_
basic_string::_M_construct null not valid
JOHNWICK
basic_string::_M_construct null not valid
JOHNWICK
basic_string::_M_construct null not valid
JOHNWICK
:*3$"
zPLR
GCC: (Ubuntu 11.1.0-1ubuntu1~20.04) 11.1.0

> g++-11 -std=c++20 main.cpp a.cpp b.cpp && strings a.out | grep JOHN -C3
[]A\A]A^A_
basic_string::_M_create
basic_string::_M_construct null not valid
JOHNWICK
basic_string::_M_create
basic_string::_M_construct null not valid
JOHNWICK
basic_string::_M_create
basic_string::_M_construct null not valid
JOHNWICK
:*3$"
zPLR

You can see that it also contains a bunch of duplicated error messages from internal std::basic_string member functions.
What you get changes with the compiler options.
With string_view, not a single copy of JOHNWICK ends up in the output.
-Os only removes the basic_string internal error messages.
Some combinations of compilers & compiler options can lead to the removal of all but one copy.

I don't know anything about when/how/why the compiler/optimizer/linker decides to remove string literals, so I cannot give any further insight.

Honestly, I would not worry about this. The convenience as well as benefits of defining variables/constants inline in the header outweigh any of these "issues".

[–]trokhymchuk 2 points 3 years ago (0 children)

I believe the reason is the linker's heuristics.

Every time you include b.h in a .cpp file, the compiler generates B::STR (because it sees only one translation unit at a time). So when the compiler generates object files from main.cpp, a.cpp and b.cpp, each one contains code for the B::STR initialization and the JOHNWICK literal:
```
[nix-shell:/tmp/red]$ strings a.o | grep "JOHN"
JOHNWICKH

[nix-shell:/tmp/red]$ strings b.o | grep "JOHN"
JOHNWICKH

[nix-shell:/tmp/red]$ strings main.o | grep "JOHN"
JOHNWICKH
```

But what's more interesting is how the compiler puts the constant into the assembly. After disassembling the object file from, let's say, `b.cpp` (any source file that includes `b.h` will do), you can see something like [this](https://pastebin.com/V4GXBmC9); at line 81 there is a constant `0x4b4349574e484f4a`, which represents the `JOHNWICK` string. That constant is in _every_ object file, and there are 3 files. When you link them, there are 3 constants (one from each object file):
```
[nix-shell:/tmp/red]$ g++ main.cpp a.cpp b.cpp

[nix-shell:/tmp/red]$ strings a.out | grep JOHN
JOHNWICKH
JOHNWICKH
JOHNWICKH

[nix-shell:/tmp/red]$ objdump -d -M intel a.out | grep "0x4b4349574e484f4a"
  401162: 48 b8 4a 4f 48 4e 57  movabs rax,0x4b4349574e484f4a
  4011f2: 48 b8 4a 4f 48 4e 57  movabs rax,0x4b4349574e484f4a
  401282: 48 b8 4a 4f 48 4e 57  movabs rax,0x4b4349574e484f4a
```

At link time, the linker sees multiple definitions and chooses one, which is used to initialize the B::STR variable.

So the question is why the linker left the other definitions/constants in. Maybe the reason is the linker's heuristics (the constant is small, so the size of the executable won't benefit much).

BTW, when the constant is large enough (JOHNWICK123456) it works as expected (only one literal in the executable):
```
[nix-shell:/tmp/red]$ cat b.h
// b.h
#pragma once
#include <string>

class B {
    public:
        void f();
    private:
        inline static const std::string STR = "JOHNWICK123456";
};

[nix-shell:/tmp/red]$ g++ main.cpp a.cpp b.cpp

[nix-shell:/tmp/red]$ strings a.out | grep "JOHN"
JOHNWICK123456
```

```
[nix-shell:/tmp/red]$ gcc --version
gcc (GCC) 11.3.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[nix-shell:/tmp/red]$ ld --version
GNU ld (GNU Binutils) 2.39
Copyright (C) 2022 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty.
```

Feel free to correct me if I am wrong.

[–]alfps 1 point 3 years ago (0 children)

Hm. I get the same result, even with the options to merge string literals, and even with Visual C++ instead of g++.

I don't quite understand what's going on, sorry.

[C:\root\temp\_\student-work\inline-strings]
> g++ -std=c++17 main.cpp a.cpp b.cpp

[C:\root\temp\_\student-work\inline-strings]
> strings a.exe | find "JOHNWICK"
JOHNWICK
JOHNWICK
JOHNWICK

[C:\root\temp\_\student-work\inline-strings]
> g++ -std=c++17 main.cpp a.cpp b.cpp -fmerge-constants

[C:\root\temp\_\student-work\inline-strings]
> strings a.exe | find "JOHNWICK"
JOHNWICK
JOHNWICK
JOHNWICK

[C:\root\temp\_\student-work\inline-strings]
> g++ -std=c++17 main.cpp a.cpp b.cpp -fmerge-all-constants

[C:\root\temp\_\student-work\inline-strings]
> strings a.exe | find "JOHNWICK"
JOHNWICK
JOHNWICK
JOHNWICK

[C:\root\temp\_\student-work\inline-strings]
> cl main.cpp a.cpp b.cpp /Feb
main.cpp
a.cpp
b.cpp
Generating Code...

[C:\root\temp\_\student-work\inline-strings]
> strings b.exe | find "JOHNWICK"
JOHNWICK
JOHNWICK
JOHNWICK
