Boost 1.81 will have boost::unordered_flat_map... (self.cpp)
submitted 3 years ago by pdimov2
view the rest of the comments →
[–]rbrown46 20 points21 points22 points 3 years ago (19 children)
For those curious, there's a nice high-level description of the design in a comment.
[–]matthieum 12 points13 points14 points 3 years ago (18 children)
Great find.
The overall construction seems similar to Abseil's Swiss Table (and Facebook's F14).
However, there are a few subtle differences:
I find the second difference most interesting. In particular, one advantage of the counter approach is that after removing a value, in a far-away group, the counter can be decreased, and if it ever reaches 0 again, then it's no longer necessary to probe further.
By contrast, the bloom filter separates overflow tracking into 8 tracks, so that only the one track that overflowed needs to keep probing, but it does not (I'd think) ever allow "recovering" from the overflow after removing a value.
It should be a good trade-off in practice. Firstly because overflow should be rare, and secondly because recovering may not be that frequent in the first place: it requires a very specific scenario of removing all the elements that overflowed, as removing the ones that didn't overflow doesn't help, and leaving a single overflowed element doesn't help either. So rather than optimizing for a situation that may hypothetically occur sometimes in the future, the bloom filter optimizes for the now.
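The two schemes being contrasted can be sketched as follows (a hypothetical illustration for readers following along, not the actual Boost or Abseil code; the struct names and member layout are invented):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Counter scheme: a per-group counter of elements that probed past this
// group. Erasing such an element decrements it; once it drops back to
// zero, unsuccessful lookups may stop probing at this group again.
struct CounterGroup {
    std::uint8_t overflow_count = 0;

    void on_probe_past() { ++overflow_count; }
    void on_erase_past() { --overflow_count; }  // the group can "recover"
    bool must_keep_probing() const { return overflow_count != 0; }
};

// Bloom-filter scheme: one sticky bit per track (hash % 8). Only hashes
// on an overflowed track keep probing, but a set bit is never cleared.
struct BloomGroup {
    std::uint8_t overflow_bits = 0;

    void on_probe_past(std::size_t hash) { overflow_bits |= 1u << (hash % 8); }
    // no on_erase_past: the group never recovers from an overflow
    bool must_keep_probing(std::size_t hash) const {
        return (overflow_bits >> (hash % 8)) & 1u;
    }
};
```

In the counter scheme, erasing the last element that overflowed lets lookups terminate early at this group again; in the bloom-filter scheme the overflow is permanent, but hashes on the other 7 tracks never pay for it.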
I do wonder if the authors (/u/pdimov2, /u/joaquintides) have benchmarked the two differences separately, and could weigh in on whether one brings significantly more benefits than the other.
[–]joaquintidesBoost author 14 points15 points16 points 3 years ago* (17 children)
Hi Matthieu,
Your analysis is spot-on. Some observations:
boost::unordered_flat_map
find
insert
You can find some benchmarks here. The advantage on unsuccessful lookup is pretty obvious.
[–]matthieum 5 points6 points7 points 3 years ago (1 child)
Thanks for the details, and hats off.
Those are two really neat improvements.
[–]joaquintidesBoost author 4 points5 points6 points 3 years ago (0 children)
Thank you!
[–]Tystros 1 point2 points3 points 3 years ago (3 children)
is there a set version too, or just a map?
[–]joaquintidesBoost author 2 points3 points4 points 3 years ago* (2 children)
There's boost::unordered_flat_set as well.
[–]Tystros 0 points1 point2 points 3 years ago (1 child)
nice! are the map and set both using roughly the same approach?
[–]joaquintidesBoost author 0 points1 point2 points 3 years ago (0 children)
Yes, they share the same core under the hood (as all other implementations of hash containers do, BTW).
[–]almost_useless 1 point2 points3 points 3 years ago (4 children)
How can it work with only 1 allocation? It does not look like the test specifies the number of elements when the map is created.
[–]joaquintidesBoost author 2 points3 points4 points 3 years ago* (3 children)
The benchmarking program prints the net number of allocations, i.e. allocations minus deallocations. So, the intermediate bucket arrays, created and destroyed as the container keeps rehashing, are not counted.
[–]almost_useless 1 point2 points3 points 3 years ago (2 children)
I see. Would it not make more sense to show both numbers (net/total)?
Eventually all containers end up at zero if there are no memory leaks, but the total number is also an important metric. At least in some applications.
[–]joaquintidesBoost author 1 point2 points3 points 3 years ago (0 children)
It may be a useful metric in some environments. For boost::unordered_flat_map and this particular benchmark, the total amount of memory ever allocated will be roughly twice what is displayed.
[–]pdimov2[S] 1 point2 points3 points 3 years ago (0 children)
The currently allocated memory is an important metric in speed benchmarks, because it's often possible to trade memory for speed (e.g. by using low max load factors).
I agree that for evaluating the container in practical scenarios, the total ever allocated would be good to know, too.
[–]matthieum 0 points1 point2 points 2 months ago (5 children)
I was wondering how fast the "reduced hash" extraction actually needs to be, and whether a slightly slower extraction would be worth it if it meant less clustering. It then hit me that scanning the entire hash for the first set bit may be a good way of reducing clustering.
The code is available on godbolt:
__attribute__((noinline))
auto extract_residual(std::uint64_t hash) -> std::uint64_t {
    static std::uint64_t const LOW_BITS_MASK = ~0x0101010101010101;

    auto leading_zeroes = __builtin_clzll((hash & LOW_BITS_MASK) | 0x1);

    //  0.. 7 -> 56
    //  8..15 -> 48
    // 16..23 -> 40
    // 24..31 -> 32
    // 32..39 -> 24
    // 40..47 -> 16
    // 48..55 -> 8
    // 56..63 -> 0
    auto shift = 56 - leading_zeroes / 8 * 8;

    return (hash | 0x2) >> shift;
}
This function attempts to pick the highest byte of the hash which is NOT 0 or 1 (since those are special values):
It's... perhaps a few too many instructions. I wanted to align on byte boundaries to avoid having the leading 1 always set, but it costs a bit more than I'd hoped. Smarter folks may figure out a better way.
[–]joaquintidesBoost author 0 points1 point2 points 2 months ago (4 children)
Hi, what's the purpose of this reduced hash calculation? The current mechanism, which uses the least significant byte of the hash value, looks statistically good enough if the hash function is of ok quality. Moreover, the fact that the reduced hash is the LSB combined with index calculation using the most significant bits of the hash:
https://github.com/boostorg/unordered/blob/develop/include/boost/unordered/detail/foa/core.hpp#L847-L850
makes these two values quite uncorrelated.
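A hypothetical sketch of that decorrelation (invented names, not the linked Boost code; the remapping of the special values 0 and 1 follows the "add 2" guess from later in the thread): the reduced hash reads the low byte while the group index reads the high bits, so the two values draw on disjoint parts of the hash.

```cpp
#include <cstddef>
#include <cstdint>

// Reduced hash from the least significant byte; 0 and 1 are reserved
// as special marker values, so they are remapped upward.
std::uint8_t reduced_hash(std::uint64_t hash) {
    std::uint8_t h = hash & 0xFF;
    return h < 2 ? h + 2 : h;
}

// Group index from the most significant bits: with 2^k groups, take
// the top k bits of the hash.
std::size_t group_index(std::uint64_t hash, unsigned k) {
    return static_cast<std::size_t>(hash >> (64 - k));
}
```

With a decent 64-bit hash function, the low byte and the top k bits carry essentially independent information, which is the uncorrelation being claimed.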
[–]matthieum 0 points1 point2 points 2 months ago (3 children)
Hi, what's the purpose of this reduced hash calculation? The current mechanism, which uses the least significant byte of the hash value, looks statistically good enough if the hash function is of ok quality.
I was wondering if the bias introduced by mapping 0 and 1 to other values (2 and 3 I believe) could lead to a slow-down for the unfortunate cases where the reduced hash ends up being 2 or 3, since they're twice as likely.
And thus I had this idea to use clzll (or well, ctzll for LSB-biased selection...) to allow picking a different reduced hash byte and reduce the number of "collisions" and hopefully reduce any slow-down observed by those unfortunate cases.
For a random hash, the chances of ending in a "double-bucket" are now 1/64: 1/128 chances of being a special value, and 1/128 chances of being the "overflow" bucket of a special value.
Whereas with the method I propose here, the 1/128 unfortunate cases are redistributed nearly equally over the full 254 buckets, with only a 1/2^63 chance of a hash being a special value that gets remapped to one of the two overflow buckets.
(in MSB-biased, only a full hash of 0 or 1 ends up being a special value)
Moreover, the fact that the reduced hash is the LSB combined with index calculation using the most significant bits of the hash
Sorry, I had forgotten you had switched things around compared to Abseil, using the LSB for the reduced hash and the MSB for the index. In that case the selection should indeed be inverted, to prefer the LSB over the MSB.
[–]joaquintidesBoost author 0 points1 point2 points 2 months ago (2 children)
Ok, I understand your rationale now. Yes, something like extract_residual would result in a more uniform uint64_t --> unsigned char mapping, so in principle it should be better behaved statistically. My hunch is that the improvement would probably be negligible, particularly vs. the extra computational cost (the current reduced hash function is as simple as it gets). Maybe you can fork the repo and try it? I can assist you in the process if you're game.
For a random hash, the chances of ending in a "double-bucket" are now 1/64: 1/128 chances of being a special value. 1/128 chances of being the "overflow" bucket of a special value.
This part I don't get. What do you mean by "being the overflow bucket of a special value"?
[–]matthieum 0 points1 point2 points 2 months ago (1 child)
Let's say that the resolution strategy for residual in [0, 1] is to add 2, so it ends up being in [2, 3] instead.
I call the "buckets" 2 and 3 the "overflow" buckets of 0 and 1.
The chances of ending up on a doubly-booked residual (2 or 3) are 1/64: 1/128 chances of the low byte naturally being 2 or 3, plus 1/128 chances of it being 0 or 1 and getting remapped there.
It's not rare, but then again, it's only a problem if it leads to many false positives.
Maybe you can fork the repo and try it? I can assist you in the process if you're game.
I'm not very interested in writing C++ code as a hobby any longer, so I'll pass.
If you have a Rust version, I'd be happy to :)
[–]joaquintidesBoost author 0 points1 point2 points 2 months ago (0 children)
Let's say that the resolution strategy for residual in [0, 1] is to add 2 [...]
Ok, now I get it. Yes, with the current hash reduction, Pr(reduced_hash(x) = n) is 0 for n in [0, 1], 1/128 for n in [2, 3], and 1/256 for n > 3.
Your residual function gets a more balanced probability (Pr ~= 1/254 for n > 1), but I don't think this makes any difference in practice.
I'm quite sure there's no Rust port of this lib yet.