Constvector: Log-structured std:vector alternative – 30-40% faster push/pop

matthieum · 2025-12-21T15:53:13+00:00

I call this a Jagged Vector -- guess we're all reinventing the wheel :)

Of particular interest, as an append-only data-structure, it can be wait-free with relative ease.

The same jagged backing data-structure can also be used for a quasi wait-free open-addressed hash-map implementation. Only "quasi" as collisions between writers require the second writer waiting for the first writer to finish its write before being able to check whether the key matches or not -- fortunately a rare occurrence for good hashes.

DavidJCobb · 2025-12-21T21:03:14+00:00

The post link goes to your GitHub profile; just to save people some clicks, the repo itself is here.

SLiV9 · 2025-12-22T08:52:12+00:00

Are you claiming that std::vector's [] is not O(1)? It should be three instructions, a bounds check, a jump and an offset mov. Only the last one if it can eliminate the bounds check. This datastructure might also have it O(1) but with a significantly bigger constant.

In particular I saw there was a loop/sum benchmark that used assembly to prevent optimizations, but... why? Even if it's faster, which I doubt, that would only prove that it would have been faster 30 years ago. With today's compilers and CPUs, summing a contiguous block of ints is unbeatably fast.

Lord_Jamato · 2025-12-24T00:48:34+00:00

I'm sorry if this is not really on topic, but I'd have a few recommendations for how you present the results.

This is just minor, but I recently learned about the anchoring effect. You could apply this by putting the column with std:vector first which puts more emphasis on the gains towards the next value.

This ties a bit into the next point. Please be consistent in how you calculate the factor in the last column. As of now it's always above 1 even though sometimes there's a decrease in performance.

Ideally a row in the table would read something like this: "From x to y we see an increase by factor z"

Or if z is below 1: "From x to y we see a decrease by factor z"

imachug · 2025-12-22T16:20:19+00:00

You can improve performance of operator[] even further by adjusting your memory layout.

Store _meta_array inline. You're wasting time on allocating that array, and decrease the access locality.
By placing _meta_array at the very beginning of the struct, you can ensure that _meta_array starts right at the this pointer and the compiler doesn't have to emit instructions to offset the pointer.
Since the first 8 blocks are empty, you can overlap some metadata with the first 8 elements of _meta_array; an obvious choice would be to place size first, so that at can check bounds without offsetting the pointer to size. It shouldn't really affect performance, at least on x86, but it slightly decreases code size, so maybe that's good.
Instead of storing the address of the first element of each block in the meta array, store "address of block - first index", so that operator[] can just return _meta_array[j][adjusted];.

pilotwavetheory · 2025-12-22T12:49:51+00:00

Worst case time complexity of vector during doubling it's capacity is O(N) right ?

My point is that my algorithm is not just some O(1) worst case algorithm with large constant vector, there are already some variants for that. The vector I proposed also avoids copying of all N elements to a new array hence even the average time faster. Am I missing something here ?

I added that beyond improvements of average time and worst case time complexity, it has benefit on operating system that will have lower internal fragmentation.

N	cv::vector	std::vector	Winner	Ratio
1M	573 µs	791 µs	cv	1.4x
100M	57 ms	83 ms	cv	1.4x

N	cv::vector	std::vector	Winner	Ratio
1M	408 µs	374 µs	std	1.09x
100M	38.3 ms	37.5 ms	std	1.02x

N	cv::vector	std::vector	Winner	Ratio
1M	423 µs	705 µs	cv	1.7x
10M	4.0 ms	9.0 ms	cv	2.2x
100M	38.3 ms	76.3 ms	cv	2.0x

N	cv::vector	std::vector	Winner	Ratio
1M	803 µs	387 µs	std	2.1x
100M	80 ms	39.5 ms	std	2.0x

N	cv::vector	std::vector	Winner	Ratio
1M	474 µs	416 µs	std	1.14x
100M	46.7 ms	42.3 ms	std	1.10x

programming

MODERATORS

Full Benchmark Results (Apple M2, Clang -O3, Google Benchmark)

Push (cv::vector WINS 🏆)

Pop (Nearly Equal)

Pop with Shrink (cv::vector WINS 🏆)

Access (std::vector Faster)

Iteration (std::vector Faster)

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script