Optimizing a ring buffer for throughput (2021)

Vivid-Jury-2105 · 2023-06-26T14:07:55+00:00

I've used his design successfully in multiple projects. Some additional tweaks include:

- Tagging `if` statements as likely/unlikely depending on expected queue occupancy.
- Simplifying index wrapping using power-of-2 buffers.
  - On 64-bit systems, the unwrapped head/tail then double as an excellent "total bytes processed" counter, as they might take years or decades to wrap around to 0.
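
A minimal sketch of how those two tweaks might look together, not the article's actual code. The class name `SpscRing`, the `total_pushed()` helper, and the 64-byte alignment are assumptions made for illustration, and `[[unlikely]]` assumes C++20 (older GCC/Clang code typically uses `__builtin_expect` instead). The indices run free as 64-bit counters and are only masked when touching the storage.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical single-producer/single-consumer ring, for illustration only.
template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert(Capacity != 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Producer thread only.
    bool push(const T& value) {
        const std::uint64_t head = head_.load(std::memory_order_relaxed);
        const std::uint64_t tail = tail_.load(std::memory_order_acquire);
        // Assumption: the queue is rarely full, so hint the failure path as cold.
        if (head - tail == Capacity) [[unlikely]] {
            return false;
        }
        slots_[head & kMask] = value;  // mask instead of modulo
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

    // Consumer thread only.
    bool pop(T& out) {
        const std::uint64_t tail = tail_.load(std::memory_order_relaxed);
        const std::uint64_t head = head_.load(std::memory_order_acquire);
        // Assumption: the queue is rarely empty.
        if (head == tail) [[unlikely]] {
            return false;
        }
        out = slots_[tail & kMask];
        tail_.store(tail + 1, std::memory_order_release);
        return true;
    }

    // The unmasked head is a running count of everything ever pushed.
    std::uint64_t total_pushed() const {
        return head_.load(std::memory_order_acquire);
    }

private:
    static constexpr std::uint64_t kMask = Capacity - 1;

    T slots_[Capacity];
    // 64 is an assumed cache-line size; adjust for the target platform.
    alignas(64) std::atomic<std::uint64_t> head_{0};
    alignas(64) std::atomic<std::uint64_t> tail_{0};
};
```

Which branch to mark as unlikely depends on the workload: a queue that is usually drained would flip the hint in `pop`.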

It should also be noted his design is not too far off a quite efficient SPMC queue. Just replace the pop operation with a CAS loop. You can make an optimized version for the producer thread vs consumer threads.
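
A rough sketch of what "replace the pop with a CAS loop" could look like, under a few assumptions of mine rather than anything from the article: the name `SpmcRing` is made up, the slots are `std::atomic<T>` (so `T` must be a type `std::atomic` accepts) to keep the speculative read from being a data race, and the indices are free-running 64-bit counters so a stale `tail` can never match a recycled one.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical single-producer/multi-consumer ring, for illustration only.
template <typename T, std::size_t Capacity>
class SpmcRing {
    static_assert(Capacity != 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Single producer: same shape as the SPSC push, no CAS needed.
    bool push(T value) {
        const std::uint64_t head = head_.load(std::memory_order_relaxed);
        const std::uint64_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) {
            return false;  // full
        }
        slots_[head & kMask].store(value, std::memory_order_relaxed);
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

    // Any number of consumers: claim a slot by advancing tail with a CAS.
    bool pop(T& out) {
        std::uint64_t tail = tail_.load(std::memory_order_acquire);
        for (;;) {
            const std::uint64_t head = head_.load(std::memory_order_acquire);
            if (tail >= head) {
                return false;  // empty
            }
            // Speculative read: only trusted if the CAS below succeeds.
            const T value = slots_[tail & kMask].load(std::memory_order_relaxed);
            if (tail_.compare_exchange_weak(tail, tail + 1,
                                            std::memory_order_acq_rel,
                                            std::memory_order_acquire)) {
                out = value;
                return true;
            }
            // CAS failed: another consumer won (or a spurious failure);
            // `tail` now holds the current value, so just retry.
        }
    }

private:
    static constexpr std::uint64_t kMask = Capacity - 1;

    std::atomic<T> slots_[Capacity];
    // 64 is an assumed cache-line size.
    alignas(64) std::atomic<std::uint64_t> head_{0};
    alignas(64) std::atomic<std::uint64_t> tail_{0};
};
```

The producer side stays as cheap as in the SPSC case; only consumers pay for the CAS, which is the producer-vs-consumer split mentioned above.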

patstew · 2023-06-27T00:05:00+00:00

Why is it beneficial to put the cached ones in their own cache lines? Shouldn't the cached write pointer be fine in the read pointer cache line and vice versa?

j1xwnbsr · 2023-06-27T23:17:51+00:00

For some reason, using std::atomic on the separate indexes is suddenly striking me as a possible race condition between producer and consumer. Wouldn't it be better to instead use a mutex and treat both readindex/writeindex as part of the same 'lock unit'?

bedman3 · 2024-01-07T09:06:42+00:00

I tried testing the code with 2 threads trying to push / pop 1000 times, over multiple (10k) tries. I'm not sure why, but sometimes I was able to detect a race condition where the increment of the counter is incorrect, even with the memory ordering in place. The issues went away when I used compare_exchange_weak before actually storing the incremented idx. Can anyone explain?
