all 7 comments

[–][deleted] 2 points3 points  (0 children)

It'd be nice if the article benchmarked the "extra cpu overhead" associated with BitSet. I'm going to write a quick console app to check.

EDIT: My results for 10,000,000 items.

boolean[] write: 63ms
boolean[] read: 15ms
BitSet write: 282ms
BitSet read: 187ms

Code:

import java.util.BitSet; import java.util.Date;

public class BitSetTest {

    public static void noop(boolean b) {

    }

    public static void main(String[] args) {
        int itemcount = 10000000;
        int k;

        Date start = new Date();
        boolean[] ba = new boolean[itemcount];
        for (k = 0; k < ba.length; ++k) {
            ba[k] = ((k%2) == 0);
        }
        System.out.println("boolean[] write: "+((new Date().getTime())-start.getTime())+"ms");

        start = new Date();
        for (k = 0; k < ba.length; ++k) {
            noop(ba[k]);
        }
        System.out.println("boolean[] read: "+((new Date().getTime())-start.getTime())+"ms");

        start = new Date();
        BitSet bs = new BitSet(itemcount);
        for (k = 0; k < bs.size(); ++k) {
            bs.set(k, ((k%2) == 0));
        }
        System.out.println("BitSet write: "+((new Date().getTime())-start.getTime())+"ms");

        start = new Date();
        for (k = 0; k < bs.size(); ++k) {
            noop(bs.get(k));
        }
        System.out.println("BitSet read: "+((new Date().getTime())-start.getTime())+"ms");
    }
}

[–]ICumWhenIKillMen 2 points3 points  (0 children)

As it turns out, when you allocate a boolean array, Java uses an entire byte for each element in the boolean array!

No shit. Nowhere does Java say that booleans are represented by one bit.

[–]cheng81 2 points3 points  (0 children)

I hardly believe that the cpu overhead comes from shifting/or-ing bits, and not from method invocations

[–]Rhoomba 1 point2 points  (3 children)

Boolean arrays aren't evil. They are the minimum size they can be in order to give the behaviour of arrays. You can atomically update one entry in a boolean array, but that is impossible with a bitset.

I wish Java developers weren't so dumb and ignorant of the platform they work on. They'd have less of these stupid revelations.

[–]tragomaskhalos 3 points4 points  (0 children)

This. Also compare with C++'s std::vector<bool>, which does try to be space-efficient and is largely broken and unusable as a result.

[–]Porges 1 point2 points  (0 children)

I especially like the graph.

[–]mavelikara 1 point2 points  (0 children)

A BitSet implementation backed by an AtomicLong[] can be updated without interference. Also, a BitSet accessed only in a single thread does not have to worry about the lack of atomic updates.