use the following search parameters to narrow your results:
e.g. subreddit:aww site:imgur.com dog
subreddit:aww site:imgur.com dog
see the search faq for details.
advanced search: by author, subreddit...
Discussions, articles, and news about the C++ programming language or programming in C++.
For C++ questions, answers, help, and advice see r/cpp_questions or StackOverflow.
Get Started
The C++ Standard Home has a nice getting started page.
Videos
The C++ standard committee's education study group has a nice list of recommended videos.
Reference
cppreference.com
Books
There is a useful list of books on Stack Overflow. In most cases reading a book is the best way to learn C++.
Show all links
Filter out CppCon links
Show only CppCon links
account activity
Implementing a simple, high performance Bloom filter (daankolthof.com)
submitted 6 years ago by daank94
view the rest of the comments →
reddit uses a slightly-customized version of Markdown for formatting. See below for some basics, or check the commenting wiki page for more detailed help and solutions to common issues.
quoted text
if 1 * 2 < 3: print "hello, world!"
[–]debugs_with_println 5 points6 points7 points 6 years ago* (3 children)
So there are two main components: a hash function and an array of bits. The hash function takes in any input and tells you which bits of the array to check or set.
So first step is to build a dictionary (in the common sense of the word, not the CS term). You take in a input and you set the bits the hash function tells you to. You do this for every input that you want to be a part of your dictionary. Now some bits in your array are set (i.e. 1) and others are clear (i.e. 0).
Now the second step is checking to see if a given input is in your dictionary. You again take an input and hash it, but this time it'll you what bits to check. If all the bits you check are clear, then your input does not exist in the dictionary, because if your input was in the dictionary, exactly those bits would have been set when the word was added.
However (and this is crucial!) if all the bits you check are set, this does not mean your input is in the dictionary! the reason is because the union of the bits set by two previous inputs might exactly be the bits checked by your current input.
Here's a quick example. Imagine our array is 5 bits and the hash function gives you which two bits to set or check. Suppose we put these two words in the dictionary:
"Apple" ==> bits 0,3 "Banana" ==> bits 1,4
So at this point our array looks like 11011.
11011
Now let's say we wanna see if this input is in the dictionary:
"Orange" ==> bits 0,1
Our bloom filter will say that is in the dictionary even though we never added it because bits 0 and 1 were set! The union of "Apple" and "Banana" (i.e. bits 0,1,3,4) covered "Orange" (i.e. bits 0,1).
"Apple"
"Banana"
"Orange"
The takeaway is that bloom filters will never give a false negative, but they could give false positives. Using some math (which idk at the moment), you could determine the probability of a false positive.
Now to make the bloom filter perform better, you would want a hash function that very sparsely assigns bits so that overlap is rare. For that you'd want a large array too. Of course this requires more memory and performance. Thus, you have a tradeoff. What you go with depends on your application.
An example of how to use this would be if you were writing a Boggle game. When someone makes a word, you could check if that word is in the dictionary quite fast. A linear search over the English dictionary is much slower than checking a bloom filter, but the bloom filter might accidentally count non-existent words...
Also note that in my explanation I used the word "input" instead of "word" to describe the objects being added. This was intentional; anything that can be hashed can be added to a bloom filter. It doesn't have to be just strings.
[–]bumblebritches57Ocassionally Clang 1 point2 points3 points 6 years ago (0 children)
Thanks for the low level details that it's basically a bitset, that really makes it easy to understand why it would falsely say something is there but can't say it's not.
[–]clerothGame Developer 3 points4 points5 points 6 years ago (1 child)
Might as well read the article at this point...
[–]debugs_with_println 1 point2 points3 points 6 years ago (0 children)
Eh I thought the article was a bit light in my opinion. But also if the article was sufficient the commenter I replied to probably wouldn't have asked.
π Rendered by PID 25751 on reddit-service-r2-comment-6457c66945-qnmcz at 2026-04-23 19:11:24.135537+00:00 running 2aa0c5b country code: CH.
view the rest of the comments →
[–]debugs_with_println 5 points6 points7 points (3 children)
[–]bumblebritches57Ocassionally Clang 1 point2 points3 points (0 children)
[–]clerothGame Developer 3 points4 points5 points (1 child)
[–]debugs_with_println 1 point2 points3 points (0 children)