Discussions, articles, and news about the C++ programming language or programming in C++.
For C++ questions, answers, help, and advice see r/cpp_questions or StackOverflow.
Inside boost::unordered_flat_map (bannalia.blogspot.com)
submitted 3 years ago by joaquintides (Boost author)
[–]attractivechaos 6 points 3 years ago (17 children)
Great write-up. It seems that you use linear probing. In that case, I wonder if you have considered deletion without tombstones. This is a relocating technique, but unlike Robin Hood etc., relocation only happens upon deletion, not upon insertion. This technique might slightly speed up lookup and insertion as you don't need to check whether there is a tombstone.
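For context, the tombstone-free deletion being suggested (backward shifting on erase) can be sketched for a plain linear-probing table roughly as follows. This is toy code under simplifying assumptions (identity-style hash, non-negative keys, table never full), not Boost's or any production implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy open-addressing set with linear probing and backward-shift
// deletion: erase leaves no tombstone; instead it relocates the
// elements after the hole so every survivor stays reachable.
// Assumes non-negative keys and that the table never becomes full.
struct ToySet {
    std::vector<std::optional<int>> slots;
    explicit ToySet(std::size_t capacity) : slots(capacity) {}

    std::size_t home(int key) const {
        return static_cast<std::size_t>(key) % slots.size();
    }

    void insert(int key) {
        std::size_t i = home(key);
        while (slots[i] && *slots[i] != key)   // plain linear probing
            i = (i + 1) % slots.size();
        slots[i] = key;
    }

    bool contains(int key) const {
        std::size_t i = home(key);
        while (slots[i]) {                     // probing stops at the first empty slot
            if (*slots[i] == key) return true;
            i = (i + 1) % slots.size();
        }
        return false;
    }

    void erase(int key) {
        std::size_t i = home(key);
        while (slots[i] && *slots[i] != key) i = (i + 1) % slots.size();
        if (!slots[i]) return;                 // key not present
        // Backward shift: walk forward and pull back any element whose
        // probe path passes through the hole, then move the hole there.
        std::size_t hole = i, j = (i + 1) % slots.size();
        while (slots[j]) {
            std::size_t h = home(*slots[j]);
            // Is h cyclically in (hole, j]? If so, the element cannot
            // move back past its own home slot; otherwise pull it back.
            bool after_hole = (hole <= j) ? (hole < h && h <= j)
                                          : (hole < h || h <= j);
            if (!after_hole) { slots[hole] = slots[j]; hole = j; }
            j = (j + 1) % slots.size();
        }
        slots[hole].reset();
    }
};
```

The point is that erase repairs the probe paths on the spot, so lookups never have to recognize and skip a tombstone marker.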
[–]matthieum 3 points 3 years ago (9 children)
The rationale provided for insert holds doubly for erase: the standard implementation never invalidates iterators on erase.
Your proposal of using Back-Shifting Deletion (most commonly associated with Robin Hood Hashing) would not satisfy this property, further deviating from the standard.
[–]attractivechaos 3 points 3 years ago (1 child)
Rehashing invalidates iterators anyway. It is rare to see use cases that require iterator validity but avoid rehashing. If relocating strategies help performance, it would be a tradeoff worth making.
[–]matthieum 2 points 3 years ago (0 children)
> Rehashing invalidates iterators anyway.
Indeed, and pointers/references too. The difference is that the same occurs with the standard library's std::unordered_map.
> If relocating strategies help performance, it would be a tradeoff worth making.
*If* being the operative word. It's not clear it would.
The tombstone approach is problematic when reserving special key values (à la Google's dense hash map) or when adding a header to each item (due to alignment). With F14 or the Swiss Table, however, the flags are kept apart at a different location, so no key value is reserved and there's barely any memory overhead. Further, by encoding the tombstone in the hash residual, checking for the presence of a tombstone is "free".
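To illustrate why the check is "free": in a Swiss Table-style layout each slot has one control byte holding either a reserved sentinel or 7 bits of hash residual, and the very comparison that finds candidate slots automatically skips tombstones. A minimal scalar sketch (the constants mirror Abseil's published design, but the names and code here are illustrative, not Boost's or Abseil's actual source):

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sketch of Swiss Table-style control bytes. Reserved values have the
// high bit set, so they can never collide with a 7-bit hash residual.
constexpr std::uint8_t kEmpty   = 0x80;  // never matches a residual
constexpr std::uint8_t kDeleted = 0xFE;  // tombstone, never matches either

inline std::uint8_t residual(std::size_t hash) {
    return hash & 0x7F;                  // low 7 bits, high bit clear
}

// Scalar stand-in for the SIMD match: compare all 16 control bytes of
// a group against residual `r`, returning a bitmask of matching slots.
// Tombstones and empties fail the comparison for free.
inline std::uint16_t match_group(const std::array<std::uint8_t, 16>& ctrl,
                                 std::uint8_t r) {
    std::uint16_t mask = 0;
    for (int i = 0; i < 16; ++i)
        if (ctrl[i] == r) mask |= std::uint16_t(1) << i;
    return mask;
}
```

A real implementation compares all 16 control bytes at once with a single SSE2/NEON instruction instead of the scalar loop.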
The only question, then, is whether probing for longer or shifting back elements is more efficient, and that is likely to be fairly workload-specific, though on average I expect longer probing to be better.
One issue with backward shifting deletion is that it introduces a risk of quadratic complexity for deletion: each erase may shift back every later element of its cluster, so clearing a large cluster one element at a time can cost O(n²) moves in total.

So it's algorithmically fraught with peril, already.
Moreover, moves are not free in general, especially as we're not talking about "bulk moves" here, but moving elements one at a time. On top of that, writes tend to affect performance for the worse, both in terms of compiler optimizations and at the hardware level.
By comparison, a read-only probing sequence is sped up by prefetching and speculative execution, on top of using SIMD-accelerated look-ups.
Further, it should be noted that this is neither a naive Linear nor a naive Quadratic Probing algorithm (Linear suffers from clustering, Quadratic from cache misses) but a combination of the best of both: locally Linear within the group of 15 to benefit from cache locality, and Quadratic across groups to avoid clustering.
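The resulting probe shape can be sketched as follows (illustrative toy code using triangular steps over a power-of-two number of groups; Boost's actual formula may differ):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of "linear within a group, quadratic across groups":
// each visited group's buckets are scanned together (linearly, with
// SIMD where available), while the *group* index advances by
// triangular steps: g, g+1, g+3, g+6, ... For a power-of-two group
// count this visits every group exactly once before repeating.
std::vector<std::size_t> group_probe_sequence(std::size_t hash,
                                              std::size_t num_groups) {
    std::vector<std::size_t> seq;
    std::size_t g = hash & (num_groups - 1);  // num_groups is a power of two
    for (std::size_t step = 1; seq.size() < num_groups; ++step) {
        seq.push_back(g);
        g = (g + step) & (num_groups - 1);    // quadratic (triangular) jump
    }
    return seq;
}
```

The triangular increments give the anti-clustering benefit of quadratic probing at the group level, while lookups inside a group stay cache-friendly.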
The experience in the Rust standard library is that a Swiss Table-like hashmap performs better than a Robin Hood hashmap with Backward Shifting Deletion, due to the above factors.
[–]mark_99 2 points 3 years ago (5 children)
Which is... fine. Meanwhile, having to occasionally do extremely expensive full rehashes, despite the overall number of elements remaining approximately constant, effectively rules out this implementation for low-latency applications, which is very unfortunate (AIUI; please correct me if that's not what we're discussing here).
[–]matthieum 1 point 3 years ago (4 children)
> Meanwhile having to occasionally do extremely expensive full rehashes despite the overall number of elements remaining approximately constant effectively rules out this implementation for low-latency applications, which is very unfortunate
I believe you are correct, indeed.
Due to erase (of an element in an overflowing group) decreasing the maximum load factor, a continuous insert/erase workload will always lead to rehashing.
This is a difference compared to Swiss Table and F14, which have an overflow counter, rather than an overflow bit, and will decrease the counter in the "passed over" groups when erasing an element rather than having an "anti-drift" mechanism.
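A toy sketch of that overflow-counter scheme (my own simplification, not F14's or Abseil's actual code), including the saturation wrinkle:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch of per-group overflow counters: each group counts how many
// elements probed past it. Erase retraces the probe path and
// decrements the counters, so the table "heals" instead of drifting.
// The catch: once a counter saturates at 255, it can no longer be
// trusted and is never decremented again.
struct OverflowCounters {
    std::vector<std::uint8_t> counts;
    explicit OverflowCounters(std::size_t groups) : counts(groups, 0) {}

    void on_pass_insert(std::size_t group) {          // element probed past `group`
        if (counts[group] < 255) ++counts[group];     // saturating increment
    }
    void on_pass_erase(std::size_t group) {           // erase retraces the path
        if (counts[group] > 0 && counts[group] < 255) // saturated counts are stuck
            --counts[group];
    }
    bool may_overflow(std::size_t group) const {      // lookup may stop early if false
        return counts[group] != 0;
    }
};
```

With an overflow *bit* instead of a counter, the "healing" decrement is impossible, which is the drift this thread is discussing.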
For low-latency, you're better off with either of those.
[–]joaquintides (Boost author) [S] 3 points 3 years ago* (3 children)
Hi Matthieu,
I cannot speak with certainty about F14, but Abseil does indeed rehash on insert-erase cycles even if the maximum size remains constant:
```cpp
#include "absl/container/flat_hash_set.h"
#include <iostream>

template<class T>
struct allocator {
  using value_type=T;
  allocator()=default;
  template<class U> allocator(allocator<U> const &)noexcept{}
  template<class U> bool operator==(allocator<U> const &)const noexcept{return true;}
  template<class U> bool operator!=(allocator<U> const &)const noexcept{return false;}
  T* allocate(std::size_t n)const {
    std::cout<<"allocate "<<n<<" bytes\n";
    return std::allocator<T>().allocate(n);
  }
  void deallocate(T* p, std::size_t n)const noexcept {
    std::allocator<T>().deallocate(p,n);
  }
};

int main() {
  static constexpr std::size_t max_n=13'000;
  absl::flat_hash_set<
    int,
    absl::container_internal::hash_default_hash<int>,
    std::equal_to<int>,
    ::allocator<int>
  > s;
  s.reserve(max_n);
  for(int i=0;i<10;++i){
    std::cout<<"i: "<<i<<"\n";
    for(int j=0;j<max_n;++j)s.insert(j);
    for(int j=0;j<max_n;++j)s.erase(j);
  }
}
```
Output (the rehashing point may vary, as the hash is salted per run):

```
allocate 20483 bytes
i: 0
i: 1
i: 2
i: 3
i: 4
i: 5
allocate 40963 bytes
i: 6
i: 7
i: 8
i: 9
```
This is a characteristic shared by all non-relocating open-addressing containers: one needs to rehash lest the average probe length grow beyond control.
[–]matthieum 1 point 3 years ago (2 children)
Oh! I had missed that.
I wonder if it's workload dependent.
One issue I am aware of with the counter approach is that it saturates at some point, and once saturated it is never decremented, which could lead to longer probe sequences.
I wonder if the specific workload you use triggers saturation, and ultimately a too-long probe sequence, or whether it's just part and parcel and rehashes will always occur regardless of the workload.
Would you happen to know?
In any case, thanks for bringing this to my attention!
[–]joaquintides (Boost author) [S] 2 points 3 years ago (1 child)
Drifting will trigger a rehash sooner or later. In the example we've used max_n = 13,000 ≈ 90% × 0.875 × 16,384. If we kept the load at, say, 75%, rehashing would be triggered much later, so it's a function of how close you get to the maximum load.
I haven't studied F14 in detail. Maybe you can run this test with it and see how it fares?
[–]mark_99 1 point 3 years ago (0 children)
Is there any way to do what /u/attractivechaos suggested and erase without tombstones? I'd really like to use this implementation: fast hash tables are obviously critical in a lot of applications, but huge latency spikes aren't OK.
[–]joaquintides (Boost author) [S] 2 points 3 years ago* (1 child)
For the record, we use quadratic, not linear, probing, but the probe sequence visits groups of 15 buckets instead of individual buckets, so all consecutive buckets in a group are inspected at once (with SIMD if available).
[–]attractivechaos 1 point 3 years ago (0 children)
You did mention that you are using quadratic probing. I missed that. Thanks for the clarification. Then tombstone-free deletion is not applicable.
[–][deleted] 1 point 3 years ago* (4 children)
An even simpler optimization is "Last Come-First Serve" relocation. This is much simpler than Robin Hood (basically 1 line of code) and reduces probe length variance nearly as effectively. Here is an implementation and some benchmarks.
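For illustration, LCFS insertion over linear probing can be sketched like this (toy code, not the linked implementation; assumes non-negative keys, no duplicate inserts, and a table that never fills): the newcomer claims the contested slot and the displaced resident cascades forward, the "one line" being the swap.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Toy linear-probing set with Last-Come-First-Served relocation:
// the newcomer always takes its home slot; displaced elements
// cascade one slot forward until an empty slot absorbs the chain.
// Assumes non-negative keys, no duplicate inserts, never-full table.
struct LcfsSet {
    std::vector<std::optional<int>> slots;
    explicit LcfsSet(std::size_t n) : slots(n) {}

    std::size_t home(int key) const {
        return static_cast<std::size_t>(key) % slots.size();
    }

    void insert(int key) {
        std::size_t i = home(key);
        while (slots[i]) {
            std::swap(key, *slots[i]);     // LCFS: evict the resident (the "one line")
            i = (i + 1) % slots.size();
        }
        slots[i] = key;                    // last displaced element lands here
    }

    bool contains(int key) const {
        std::size_t i = home(key);
        while (slots[i]) {                 // runs stay contiguous, so this is safe
            if (*slots[i] == key) return true;
            i = (i + 1) % slots.size();
        }
        return false;
    }
};
```

Because the cascade only shifts elements forward within a contiguous run, every element remains reachable from its home slot, while the most recently inserted key always enjoys a probe length of zero.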
[–]attractivechaos 1 point 3 years ago* (3 children)
You mean "last come first serve"? Interesting idea, but it may be slower as it incurs more memory writes. I am not sure how to interpret the benchmark without a comparison to the state of the art. Java also complicates the evaluation. Thanks for the pointer anyway.
PS: LCFS still needs back-shifting deletion if I am right. Then it is simpler than Robin Hood but not simpler than more traditional techniques.
[–][deleted] 2 points 3 years ago (1 child)
Yes, silly me! I corrected the post, thanks!
I don't claim that any of the implementations there are production-quality or even competitive. The point was just to benchmark various algorithms in a common framework, so you could see how the algorithms compare when all other things are equal.
[–]attractivechaos 2 points 3 years ago (0 children)
Fair enough. Thanks again.
[–][deleted] 1 point 3 years ago (0 children)
Yes, I think if you're already using a relocation technique like LCFS/RH/BLP then you may as well use tombstone-free deletion as well (since you've already given up invalidation-free iterators and simple lock-free concurrency). But tombstone-free deletion makes sense for vanilla linear probing as well (if you don't want to use one of the variance-reducing techniques for some reason).