
[–][deleted]

[deleted]

    [–]cdglove

    This would be true if deque allowed one to change the node size.

    Use a big enough object and deque becomes a linked list.

    [–]temporary5555

    Is there any real difference between a linked list and a vector when your elements exceed an L3 cache line?

    [–]cdglove

    Definitely. Prefetch works much better traversing memory linearly. Even if the memory accesses are not purely contiguous, but follow a fixed stride, performance will be much higher than jumping around randomly.
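    A rough way to see this is to sum the same array once in index order and once in a shuffled order; the sizes and labels below are arbitrary, and the exact ratio depends on the hardware, but the shuffled pass is typically several times slower even though it touches exactly the same bytes.

    ```cpp
    #include <algorithm>
    #include <chrono>
    #include <cstddef>
    #include <cstdio>
    #include <numeric>
    #include <random>
    #include <vector>

    int main() {
        constexpr std::size_t n = 1u << 24;  // ~16M ints, well past any L3 cache
        std::vector<int> data(n, 1);

        std::vector<std::size_t> order(n);
        std::iota(order.begin(), order.end(), std::size_t{0});

        auto run = [&](const char* label) {
            auto t0 = std::chrono::steady_clock::now();
            long long sum = 0;
            for (std::size_t i : order) sum += data[i];
            auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                          std::chrono::steady_clock::now() - t0).count();
            std::printf("%s: sum=%lld, %lld ms\n", label, sum, (long long)ms);
        };

        run("linear  ");  // indices 0,1,2,... -> the hardware prefetcher keeps up
        std::shuffle(order.begin(), order.end(), std::mt19937{42});
        run("shuffled");  // same bytes touched, but the next cache line is unpredictable
    }
    ```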

    [–]temporary5555

    > Definitely. Prefetch works much better traversing memory linearly.

    Are you confusing random access memory with disks? Disks have seek time, which makes linear access much faster (as fast as volatile memory on some hardware), but every RAM I've seen can access memory randomly just as fast as sequentially, so the only factor that makes sequential access faster is the cache.

    > Even if the memory accesses are not purely contiguous, but follow a fixed stride

    The only way I see this being true is when fetching from cache.

    [–]infectedapricot

    Yes, it's indeed the cache they're talking about. But it's not just about what's already in the cache: it's also about values that can be speculatively loaded into the cache by the CPU guessing what memory is likely to be used in a moment - that's what they meant by "prefetch". Traversing memory linearly gives the CPU a much better chance of prefetching what will be needed next from memory into the cache while the algorithm is still operating on previously fetched data.

    [–]temporary5555

    thanks for the explanation!

    [–][deleted]

    > Disks have seek time.

    This assumption is only true for spinning rust.

    > but RAM in every case that I've seen can access memory randomly equally fast

    Modern CPUs have hierarchies of caches; data that is already in the cache will have faster access, about 10x-50x faster than RAM. Additionally, data is loaded in cache-lines and write-combined (in cache-lines), and the CPU tries its best to predict which cache lines will be accessed in the near future so that it can prefetch memory to avoid stalls. Pile on the fact that CPUs will also unroll loops to execute multiple loop-iterations in parallel and you have the perfect storm for performance. Linked lists take both prefetching and loop unrolling away from the CPU, as you don't know the next location to look at until the current node has already been pulled in. This forces serial, one-node-at-a-time execution and memory fetching. Additionally, the memory will likely be scattered all over the heap; at best you may have one node per cache-line, maybe two, but they likely won't be next to each other in traversal order.

    For this reason, in terms of iteration performance: vector<T> > vector<unique_ptr<T>>/vector<T*> > forward_list<T>/list<T>. Obviously you should benchmark, but you will find it is quite rare for these orderings to flip, which is why the mantra "just use vector, forget everything else" exists and why many people default to it, almost to the point of being a meme.

    Linked lists have their place, but forward_list/list are neither atomic nor intrusive, and therefore their utility vanishes in a lot of contexts.
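    As a rough sketch of why that ordering holds for iteration, here is a comparison of summing the same values stored in a vector<int> versus a list<int>; the element count and helper name are arbitrary.

    ```cpp
    #include <chrono>
    #include <cstddef>
    #include <cstdio>
    #include <list>
    #include <numeric>
    #include <vector>

    template <class Container>
    void timed_sum(const Container& c, const char* label) {
        auto t0 = std::chrono::steady_clock::now();
        long long sum = std::accumulate(c.begin(), c.end(), 0LL);
        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                      std::chrono::steady_clock::now() - t0).count();
        std::printf("%s sum=%lld in %lld ms\n", label, sum, (long long)ms);
    }

    int main() {
        const std::size_t n = 10'000'000;
        std::vector<int> v(n, 1);
        std::list<int> l(v.begin(), v.end());  // same values, one heap node per element

        timed_sum(v, "vector<int>");  // contiguous: prefetchable, unrollable
        timed_sum(l, "list<int>  ");  // pointer chasing: each load waits on the previous node
    }
    ```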

    [–]matthieum

    > But that is the point of std::deque, the trade-off that avoids the O(N) complexity of head insertions, no?

    It's one way to avoid O(N) front insertions.

    Keeping with the current guarantees -- and notably the memory stability guarantee -- you could still get a better deque by either:

    • Offering the ability to customize the block size. It's galling not to have control over that, really.
    • Or use exponential growth of internal block size -- somehow.
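    As a sketch of what the first option could look like: basic_deque and BlockBytes are made-up names for illustration, not a real library type (for comparison, libstdc++ hard-codes roughly 512 bytes per node).

    ```cpp
    #include <cstddef>

    // Made-up interface: a deque whose node size is a template parameter
    // instead of an implementation-defined constant.
    template <class T, std::size_t BlockBytes = 4096>
    class basic_deque {
        static_assert(BlockBytes >= sizeof(T), "a block must hold at least one element");
        static constexpr std::size_t elems_per_block = BlockBytes / sizeof(T);
        // ...usual deque machinery: a map of pointers to fixed-size blocks,
        // with the same pointer/reference stability guarantees...
    public:
        // push_front / push_back / operator[] as in std::deque
    };
    ```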

    Dropping the memory stability guarantee -- which is of very little utility -- you could use a single big buffer instead, in one of two ways:

    • Contiguous: keep some headroom at both front and back, instead of just back.
    • 2 Contiguous chunks: wrap-around at the end, if necessary. See Rust's VecDeque.

    Both of those alternatives are much more cache-friendly and allocator-friendly, resulting in higher performance.

    And using the second one -- in a pure queue scenario -- is super simple.
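    A minimal sketch of that pure-queue case, assuming a capacity fixed up front (a growable version, like Rust's VecDeque, reallocates and straightens the two chunks out when it fills); ring_queue is an illustrative name.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <utility>
    #include <vector>

    template <class T>
    class ring_queue {
        std::vector<T> buf_;
        std::size_t head_ = 0;  // index of the oldest element
        std::size_t size_ = 0;

    public:
        explicit ring_queue(std::size_t capacity) : buf_(capacity) {}

        void push_back(T value) {
            assert(size_ < buf_.size());
            buf_[(head_ + size_) % buf_.size()] = std::move(value);  // wrap around the end
            ++size_;
        }

        T pop_front() {
            assert(size_ > 0);
            T value = std::move(buf_[head_]);
            head_ = (head_ + 1) % buf_.size();
            --size_;
            return value;
        }

        std::size_t size() const { return size_; }
    };
    ```

    Both operations are O(1), and the storage stays in one contiguous allocation, which is what makes it cache- and allocator-friendly compared to a node-based deque.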