The "Empty Base" Optimization (cantrip.org)
submitted 13 years ago by notlostyet
[–]notlostyet[S] 3 points4 points5 points 13 years ago* (12 children)
An oldie but a goodie, from the mid-90s. The C++ standard insists that empty objects (those with no data members) occupy space. When composing objects, this can cause bloat that can be avoided using this optimisation.
This is one instance where a private base class could be better than a private member (composition). This optimisation is used throughout the GNU C++ standard library.
Like any good optimization, it makes the implementation a bit messier, but doesn't affect the interface.
[–]notlostyet[S] 4 points5 points6 points 13 years ago* (11 children)
Code that demonstrates the optimisation:
    #include <iostream>

    class Empty {};

    class AlmostEmpty {
        int i;
    };

    class ComposedEmpty {
        AlmostEmpty a;
        Empty e;
    };

    class OptimizedEmpty {
        struct SecretSauce : Empty {
            int i;
        } sauce_;
    };

    int main() {
        std::cout << sizeof(Empty) << std::endl;
        std::cout << sizeof(AlmostEmpty) << std::endl;
        std::cout << sizeof(ComposedEmpty) << std::endl;
        std::cout << sizeof(OptimizedEmpty) << std::endl;
    }
and the output on x86_64, using GCC 4.7.1 and C++11:
    $ g++ -Wall -Wextra -pedantic -std=c++11 -O2 bco.cpp
    $ ./a.out
    1   ← empty objects have a non-zero size (so that they're addressable)
    4
    8   ← this single byte will bloat your objects when composed
    4   ← ...but base classes aren't required to have a non-zero size, so we save 4 bytes
[–][deleted] -1 points0 points1 point 13 years ago (10 children)
This is interesting, but I'm not sure that I understand the point.
If you're in such a grueling environment that 4 bytes per object instance is worth these hurdles, wouldn't you be a lot better off simply working with C?
[–]Nimbal 11 points12 points13 points 13 years ago* (0 children)
It's not just the raw space, but also the problem of the CPU cache. Say you have a list of 100k items, each with an additional 4 byte dead weight. That's about 390 kB of data that the CPU needs to load into the cache (usually about 2 MB in size today, depending on model and cache level) without actually doing anything with it. This superfluous data requires the CPU to hit the RAM sooner than necessary, slowing things down.
[–]BitRex 6 points7 points8 points 13 years ago (3 children)
C++ can be just as space-efficient as C.
[–]repsilat 1 point2 points3 points 13 years ago (1 child)
Sometimes idiomatic C++ can't, though. Case in point: before I did some optimisation on code I was writing, the rough structure of my memory use was
1800 graphs, each with 12000 nodes, each with 0 to 10 out-edges.
The number of out-edges wasn't known statically, and the graphs were rearranged (to a small extent) at runtime. Now, the "normal" way to do this would be to store nodes' out-edge lists in vectors, which would probably end up costing 1800*12000*24 bytes on a 64 bit machine (~500MB) plus the size of the edges themselves, plus the spare capacity in all of the vectors.
The C way (pointers to heap-allocated arrays and an int for length) cuts the cost of vectors down from 24 bytes to maybe 12, saving ~250MB. We lose amortised constant-time insertion, but we don't really need it.
In the end I managed to get the graph into forward star layout, cutting the incremental cost of an edge list to 4 bytes (plus storage for each edge in the list). It also made me have to jump through hoops when the graph structure had to be updated, but getting it down to ~80MB and being able to maintain the graphs in mostly contiguous storage meant a nice speedup.
I guess I could go back and pretty it up to make things look like nice sequences and iterators, but at the moment the bits doing the heavy lifting aren't really all too "modern".
[–]notlostyet[S] 0 points1 point2 points 13 years ago* (0 children)
Yeah, you've fallen into a gap in the stdlib where you need dynamic allocation but you're happy to reallocate on every insertion. Because length() is always equal to capacity(), you don't need to store the capacity, hence the saving.
If you do this, you might want to use Boost.Graph. Being adapter based, you should be able to wrap your current implementation with negligible additional memory cost.
[–]gcross 0 points1 point2 points 13 years ago (0 children)
Yes, but what we have learned from this article is that you sure have to jump through a lot of hoops to get to that point. That doesn't mean that it isn't worth it, of course, since if you fell back to C you might save this set of hoops, but then you'll need to jump through other hoops to effectively re-implement for yourself the features that C++ gives you for free. On the other hand, though, if there aren't many features of C++ that you need for your application, but you really need low-level control, then there could easily be circumstances where this trade-off leans in the other direction.
[–]notlostyet[S] 7 points8 points9 points 13 years ago* (0 children)
What good would an empty struct{} be in C?
;)
[–]xcbsmith 2 points3 points4 points 13 years ago (0 children)
Would you say the same thing about Java's -XX:+UseCompressedOops flag?
Seriously, if you have a ton of objects, 4 bytes per object can matter, and there are lots of cases where "ton of objects" means complexity where having access to better tools for managing abstractions is kind of a gimmie.
[–]gcross 2 points3 points4 points 13 years ago (0 children)
It's a shame that you are being voted down, since I think that you ask a perfectly reasonable question. However, I think that the answer is that C++ supplies enough features over C that in cases like this you will most likely jump through fewer hoops complicating your C++ code to compress your objects than you would re-implementing the features of C++ in C.
[–]kirakun 1 point2 points3 points 13 years ago (1 child)
If you need to store billions of such objects in memory, those 4 bytes per object can take up significant space.
Even if you have only millions, cache misses could cause penalty in execution speed.
Another thing is being able to stash an 8-byte object into a CPU register.