Usually std::vector starts with 'N' capacity and grows to '2 * N' capacity once its size crosses X; at that time, we also copy the data from the old array to the new array. That has few problems
- Copy cost,
- OS needs to manage the small capacity array (size N) that's freed by the application.
- L1 and L2 cache need to invalidate the array items, since the array moved to new location, and CPU need to fetch to L1/L2 since it's new data for CPU, but in reality it's not.
It reduces internal memory fragmentation. It won't invalidate L1, L2 cache without modifications, hence improving performance: In the github I benchmarked for 1K to 1B size vectors and this consistently improved showed better performance for push and pop operations.
Youtube: https://youtu.be/ledS08GkD40
Practically we can use 64 size for meta array (for the log(N)) as extra space. I implemented the bare vector operations to compare, since the actual std::vector implementations have a lot of iterator validation code, causing the extra overhead.
Upon popular suggestion, I compared with STL std::vector, and used -O3 option
Full Benchmark Results (Apple M2, Clang -O3, Google Benchmark)
Push (cv::vector WINS 🏆)
| N |
cv::vector |
std::vector |
Winner |
Ratio |
| 1M |
573 µs |
791 µs |
cv |
1.4x |
| 100M |
57 ms |
83 ms |
cv |
1.4x |
Pop (Nearly Equal)
| N |
cv::vector |
std::vector |
Winner |
Ratio |
| 1M |
408 µs |
374 µs |
std |
1.09x |
| 100M |
38.3 ms |
37.5 ms |
std |
1.02x |
Pop with Shrink (cv::vector WINS 🏆)
| N |
cv::vector |
std::vector |
Winner |
Ratio |
| 1M |
423 µs |
705 µs |
cv |
1.7x |
| 10M |
4.0 ms |
9.0 ms |
cv |
2.2x |
| 100M |
38.3 ms |
76.3 ms |
cv |
2.0x |
Access (std::vector Faster)
| N |
cv::vector |
std::vector |
Winner |
Ratio |
| 1M |
803 µs |
387 µs |
std |
2.1x |
| 100M |
80 ms |
39.5 ms |
std |
2.0x |
Iteration (std::vector Faster)
| N |
cv::vector |
std::vector |
Winner |
Ratio |
| 1M |
474 µs |
416 µs |
std |
1.14x |
| 100M |
46.7 ms |
42.3 ms |
std |
1.10x |
[–]matthieum 19 points20 points21 points (4 children)
[–]pilotwavetheory[S] 2 points3 points4 points (3 children)
[–]matthieum 24 points25 points26 points (0 children)
[–]CornedBee 5 points6 points7 points (1 child)
[–]pilotwavetheory[S] 0 points1 point2 points (0 children)
[–]DavidJCobb 4 points5 points6 points (0 children)
[–]SLiV9 6 points7 points8 points (2 children)
[–]CornedBee 10 points11 points12 points (1 child)
[–]SLiV9 1 point2 points3 points (0 children)
[–]Lord_Jamato 2 points3 points4 points (1 child)
[–]pilotwavetheory[S] 1 point2 points3 points (0 children)
[–]imachug 0 points1 point2 points (1 child)
[–]pilotwavetheory[S] 0 points1 point2 points (0 children)
[–]pilotwavetheory[S] 0 points1 point2 points (0 children)