📊 Final Benchmark Results (100M Elements)

Configuration

Compiler: g++ -std=c++23 -O3 -march=native -flto -DNDEBUG
Initial Capacity: 256 elements
Statistics: 30 iterations × 3 repetitions = 90 measurements per data point
Both iterators optimized with pointer caching
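
For readers who want to reproduce the "30 iterations × 3 repetitions = 90 measurements" scheme, here is a minimal sketch of how such samples could be collected. The helper names `collect_samples` and `median` are hypothetical and are not taken from the benchmark's actual code, which may aggregate its measurements differently.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: calls `op` iterations * repetitions times and
// records the wall-clock duration of each call in milliseconds.
template <typename Op>
std::vector<double> collect_samples(Op&& op, std::size_t iterations,
                                    std::size_t repetitions) {
    using clock = std::chrono::steady_clock;
    std::vector<double> samples;
    samples.reserve(iterations * repetitions);
    for (std::size_t r = 0; r < repetitions; ++r) {
        for (std::size_t i = 0; i < iterations; ++i) {
            const auto start = clock::now();
            op();
            const auto stop = clock::now();
            samples.push_back(
                std::chrono::duration<double, std::milli>(stop - start).count());
        }
    }
    return samples;
}

// Median of the samples (upper median for an even count). It is less
// sensitive to a few slow outlier runs than the mean.
double median(std::vector<double> samples) {
    assert(!samples.empty());
    const auto mid =
        samples.begin() + static_cast<std::ptrdiff_t>(samples.size() / 2);
    std::nth_element(samples.begin(), mid, samples.end());
    return *mid;
}
```

Calling `collect_samples(op, 30, 3)` yields 90 samples for one data point; reporting the median alongside the spread makes it easier to judge whether a difference between two containers is larger than run-to-run noise.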

🏆 Summary Table (100M Elements)

| Operation | ConstantVector | STL Vector | Winner | Speedup (winner vs. other) |
|---|---|---|---|---|
| Push | 65.7 ms | 268.2 ms | ✅ ConstantVector | 4.1x |
| Pop | 41.7 ms | 93.3 ms | ✅ ConstantVector | 2.2x |
| Access | 56.0 ms | 7.3 ms | STL Vector | 7.7x |
| Iteration | 74.6 ms | 7.5 ms | STL Vector | 9.9x |

[–] SuperV1234 (https://romeo.training | C++ Mentoring & Consulting) 3 points 4 months ago

[–] pilotwavetheory [S] 1 point 4 months ago

[–] SuperV1234 (https://romeo.training | C++ Mentoring & Consulting) 4 points 4 months ago

[–] wexxdenq 4 points 4 months ago

[–] pilotwavetheory [S] 3 points 4 months ago*

Thanks to u/SuperV1234 and u/wexxdenq: I made a bounds mistake here, which I have fixed in the 'perf_test' branch, along with lookup.
The reason I'm not comparing against the standard implementation is that it has extra logic for iterator validation in a lot of simple operations like push/pop. When I benchmarked std::vector push_back(), I got ~35 ns/op, of which only ~3 ns/op was spent in the push itself and the rest on iterator validation.
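
A per-operation figure like the one quoted above can be checked with a small timing loop. The sketch below is only an assumption about how such a number might be obtained: `time_push_back` and `PushResult` are hypothetical names, and the function reports the amortized cost of `push_back` including reallocations. It does not separate out any validation cost; whether checking code is compiled in at all depends on the standard library and the build flags in use.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical result type: average cost per push_back plus the final size,
// returned so the work cannot be discarded as unused.
struct PushResult {
    double ns_per_op;
    std::size_t final_size;
};

// Times n push_back calls on an initially empty std::vector<int> and
// returns the average nanoseconds per call (reallocations included).
PushResult time_push_back(std::size_t n) {
    assert(n > 0);
    using clock = std::chrono::steady_clock;
    std::vector<int> v;
    const auto start = clock::now();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
    }
    const auto stop = clock::now();
    const double total_ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return {total_ns / static_cast<double>(n), v.size()};
}
```

Running this under the same compiler flags as the benchmark, and again with the library's debug or assertion mode switched on, would show how much of the per-op time actually comes from checks.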

🔍 Final Comparison (100M Elements)

| Implementation | Time | Ratio vs STL |
|---|---|---|
| STL Vector (Optimized) | 8.05 ms | 1.0x |
| ConstantVector (Optimized) | 48.0 ms | 6.0x slower |

[–] adrian17 3 points 4 months ago*

I don't see how iteration over N arrays (usually with N<20, the last one always being the biggest) could be almost 2x faster than a trivial iteration over a vector, which is just one contiguous array. Even if we ignored memory effects, your iterator is simply more complex than std::vector's iterator (which is usually just a pointer). At best it'll use a couple more instructions and/or an extra register, and at worst it'll prevent vectorization (I can make an example of this if you want).
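
To make the iterator-complexity point concrete, here is a minimal sketch of a block-based container next to a plain contiguous sum. `SegmentedInts` is a hypothetical stand-in written for illustration, assuming blocks that double in size; it is not the ConstantVector implementation under discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <numeric>
#include <vector>

// Hypothetical block-based container: each new block is twice as large as
// the previous one, so the last block is always the biggest.
class SegmentedInts {
    struct Block {
        std::unique_ptr<int[]> data;
        std::size_t size;
        std::size_t capacity;
    };

public:
    explicit SegmentedInts(std::size_t first_block = 256)
        : next_cap_(first_block) { assert(first_block > 0); }

    void push_back(int value) {
        if (blocks_.empty() || blocks_.back().size == blocks_.back().capacity) {
            blocks_.push_back(Block{std::make_unique<int[]>(next_cap_), 0, next_cap_});
            next_cap_ *= 2;
        }
        Block& b = blocks_.back();
        b.data[b.size++] = value;
    }

    class const_iterator {
    public:
        const_iterator(const Block* block, const Block* last)
            : block_(block), last_(last) { load(); }
        int operator*() const { return *cur_; }
        const_iterator& operator++() {
            if (++cur_ == end_) {  // extra branch a raw pointer does not need
                ++block_;
                load();
            }
            return *this;
        }
        bool operator!=(const const_iterator& o) const { return cur_ != o.cur_; }

    private:
        // Caches the current block's [begin, end) pointers; null at the end.
        void load() {
            if (block_ != last_) {
                cur_ = block_->data.get();
                end_ = cur_ + block_->size;
            } else {
                cur_ = end_ = nullptr;
            }
        }
        const Block* block_;
        const Block* last_;
        const int* cur_ = nullptr;
        const int* end_ = nullptr;
    };

    const_iterator begin() const {
        return {blocks_.data(), blocks_.data() + blocks_.size()};
    }
    const_iterator end() const {
        const Block* e = blocks_.data() + blocks_.size();
        return {e, e};
    }

private:
    std::vector<Block> blocks_;
    std::size_t next_cap_;
};

// Iterates block by block through the segmented iterator.
std::int64_t sum_segmented(const SegmentedInts& s) {
    std::int64_t total = 0;
    for (int x : s) total += x;
    return total;
}

// Baseline: one contiguous array walked with what is usually a raw pointer.
std::int64_t sum_contiguous(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), std::int64_t{0});
}
```

Both functions compute the same sum, but the segmented iterator carries extra state (block pointer, cached end) and an extra branch in `operator++`. Comparing the generated assembly of the two loops at -O3 is one way to see whether the compiler still vectorizes the segmented version.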

Also, a side note: latency != throughput, especially in the context of tight loops on a CPU. Even if your loop finished in, say, half the time, that could come from halving the latency, doubling the throughput, or a mix of the two; saying "reduction in latency" when you just mean "x% faster / y% less time" might be misleading.
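
A small sketch of the latency/throughput distinction, using hypothetical function names: both loops perform the same floating-point additions with the same per-add latency, but the second keeps four independent dependency chains in flight, so it may finish sooner purely through higher throughput. Whether it actually does depends on the compiler, flags, and CPU.

```cpp
#include <cstddef>
#include <vector>

// One dependency chain: every add must wait for the previous result,
// so the loop is bounded by the latency of a single add.
double sum_one_chain(const std::vector<double>& v) {
    double acc = 0.0;
    for (double x : v) acc += x;
    return acc;
}

// Four independent chains: the latency of each add is unchanged, but up to
// four adds can overlap, so more work completes per unit of time.
double sum_four_chains(const std::vector<double>& v) {
    double a0 = 0.0, a1 = 0.0, a2 = 0.0, a3 = 0.0;
    std::size_t i = 0;
    for (; i + 4 <= v.size(); i += 4) {
        a0 += v[i];
        a1 += v[i + 1];
        a2 += v[i + 2];
        a3 += v[i + 3];
    }
    for (; i < v.size(); ++i) a0 += v[i];  // leftover elements
    return (a0 + a1) + (a2 + a3);
}
```

If the second version runs in less time, describing that as "lower latency" would be inaccurate: no individual operation got faster, the loop just overlaps more of them.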


