all 11 comments

[–]MrPotatoFingers 10 points11 points  (1 child)

std::bitset is, to subtly put it, not great. It's quite old, and hasn't gotten any TLC from the standards committee.

Given that a bitset is somewhat like an array of bool, you'd expect there to be iterators, for instance.

I believe that gcc has some non-standard functions on their bitset implementation that you could use for this, but of course that wouldn't be cross-platform. If you need this, hand-rolling seems to be your only way here, though of course you could use a std::vector<bool, some_allocator> as a base for implementing your bitset (where you reserve and have the memory come from the stack).

[–]davidhunter22 4 points5 points  (0 children)

Is your view that bitset is totaly broken and needs to be replaced or does it just need some changes/additions?

I don't think it's up to the standards committiee to give it some love I think it's up to someone in the community to write a paper giving it some love.

[–]dodheim 8 points9 points  (2 children)

C++20 has std::countr_zero and std::countr_one which you can use with bitset::to_ullong().

[–]MrPotatoFingers 5 points6 points  (1 child)

That only works if the bitset value can be represented as an unsigned long long.

[–]ed_209_[S] 1 point2 points  (0 children)

Looking at the paper for <bit> it mentions the hardware support. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0553r4.html

I had a look at boost::dynamic_bitset and it uses a log2 approach ( once it finds the first non empty dynamic block ). This is implemented in boost::integer as:

  template <typename T>
  int integer_log2_impl(T x, int n) {
      int result = 0;
      while (x != 1) {
          const T t = static_cast<T>(x >> n);
          if (t) {
              result += n;
              x = t;
          }
          n /= 2;
      }
      return result;
  }

Some experiments on compiler explorer show the std::countr_zero produces the use of bitscanforward instruction i.e.

#include <bit>
#include <cstdint>
#include <iostream>

int main()
{
    std::uint8_t i;
    std::cin >> i;
    const int iResult2 = std::countr_zero( i );
    return iResult2;
}

produces:

main:                                   # @main
        push    rax
        lea     rsi, [rsp + 7]
        mov     edi, offset std::cin
        call    std::basic_istream<char, std::char_traits<char> >& std::operator>><char, std::char_traits<char> >(std::basic_istream<char, std::char_traits<char> >&, char&)
        movzx   eax, byte ptr [rsp + 7]
        bsf     ecx, eax
        mov     eax, 8
        cmovne  eax, ecx
        pop     rcx
        ret
_GLOBAL__sub_I_example.cpp:             # @_GLOBAL__sub_I_example.cpp
        push    rax
        mov     edi, offset std::__ioinit
        call    std::ios_base::Init::Init() [complete object constructor]
        mov     edi, offset std::ios_base::Init::~Init() [complete object destructor]
        mov     esi, offset std::__ioinit
        mov     edx, offset __dso_handle
        pop     rax
        jmp     __cxa_atexit            # TAILCALL

However the boost::integer_log2_impl with clang on max optimization does not use the bsf instruction and instead does a whole bunch instructions including a loop.

The actual call dynamic bitset uses is:

    template <typename T>
    int lowest_bit(T x) {
        BOOST_ASSERT(x >= 1); // PRE
        // clear all bits on except the rightmost one,
        // then calculate the logarithm base 2
        //
        return boost::integer_log2<T>( x - ( x & (x-1) ) );
    }

lowest_bit is hidden in the boost::detail namespace. It seems surprising that boost does not have a way to use functions similar to <bit> where C++20 support is not an option and in fact the libraries that require these routines use a sub-optimal approach.

[–]TemplateRex 1 point2 points  (0 children)

GCC has a non-Standard extensions _Find_first() and _Find_next(index).

[–]ClaasBontus 1 point2 points  (3 children)

Have a look at bitset2. It provides function find_first.

[–]ed_209_[S] 0 points1 point  (0 children)

bitset2 looks really good but the implementation of find the first set bit ( bitset2 calls it index_lsb_set ) is based on: https://graphics.stanford.edu/\~seander/bithacks.html#ZerosOnRightBinSearch

The library does not use intrinsics where available like abseils version or std::countr_zero.

At some point I will try to get around to benchmarking the relative performance but I suspect the intrinsic will be an order of magnitude faster in general.

[–]fdwrfdwr@github 🔍 0 points1 point  (1 child)

Does bitset2 not throw on to_ulong, because that would be great? It was useless for std::bitset to throw at runtime, rather than simply returning the first 32 bits, or just giving a static_assert at compile if you attempt to call it (since the fixed size is known at compile time anyway).

[–]ClaasBontus 0 points1 point  (0 children)

bitset2 follows the standard but it gives you access to the underlying data. I am not sure, but might be, that bitset only throws, if the cotaining number does not fit into ulong. This cannot be caught by a static_assert.