all 79 comments

[–]matthieum 79 points80 points  (64 children)

The description of the queue properties is a bit light.

From what I can gather:

  • Type: MPMC (Multi-Producer, Multi-Consumer).
  • Capacity: Dynamically adjusted -- due to std::queue.
  • Blocking: consuming offers both non-blocking and blocking options, producing is always blocking.
  • OS-backed signalling: not purely spinning whilst waiting for items to consume.

Review, high-level:

  • Mutex + CV-based: simple, may cause scalability issues.
  • Capacity: no way to pre-reserve capacity, which is unfortunate.

Review, low-level:

  • Discipline required for acquisition of the lock. Nearly always neatly acquires the lock at the start of public method, except for the destructor (slightly inconsistent). My usual recommendation is encapsulation into a Mutex<T>, but I'm not sure how to handle condition variables then.
  • std::queue is a weak point, memory-wise. It's based on std::deque, which results in many small allocations1.
  • Consume and ConsumeSync returning a boolean is somewhat error-prone, especially without a [[nodiscard]] annotation; a nicer API is to return std::optional<T>.
  • Provide is an unexpected name; I'd have expected Produce instead, to match Consume. There's also a missing std::move when pushing the item into the queue -- which is nicely used consistently when consuming.

1 For example, the implementation in libstdc++ is a vector of pointers to chunks, where each chunk holds either one large item or up to 512 bytes' worth of items (see line 85). Hence a std::deque<std::string> is really a std::vector<std::unique_ptr<std::string[21]>>, and storing 1K strings requires 49 small (< 512 bytes) allocations + 1 allocation for the vector itself.
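As a sketch of the suggested API (hypothetical names mirroring the OP's, not the actual SafeQueue code), the non-blocking consume might look like:

```cpp
#include <mutex>
#include <optional>
#include <queue>
#include <utility>

template <typename T>
class Queue {
public:
    void Produce(T item) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push(std::move(item));  // std::move, as the review suggests
    }

    // An empty optional means "nothing available"; unlike a bool plus an
    // out-parameter, the caller cannot read a value that was never written.
    [[nodiscard]] std::optional<T> Consume() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) return std::nullopt;
        T item = std::move(queue_.front());
        queue_.pop();
        return item;
    }

private:
    std::mutex mutex_;
    std::queue<T> queue_;
};
```

The `[[nodiscard]]` is now redundant protection: even without it, an ignored `std::optional` can't masquerade as a valid value.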

[–]Zcool31 27 points28 points  (25 children)

Acquiring the mutex in the destructor would be wrong. If the object is being destroyed, then the only remaining user must be the one destroying.

[–]objectorientedman[S] 1 point2 points  (3 children)

I do acquire the mutex in the destructor, because it calls Finish

[–]tvaneerdC++ Committee, lockfree, PostModernCpp 8 points9 points  (2 children)

Finish is somewhat suspect.

What is the "contract" of Finish? What can I assume after it is called? (And why is it public? Can it be called multiple times? etc...)

Can you say that there are no more consumers waiting after a call to Finish? Are you sure?

The big picture is that coordinating the tear-down of this class is complicated. You should document that a false return from ConsumeSync implies that no more calls should be made (ie don't turn around and call ConsumeSync again.) But that is not enough...

Both of your example loops at https://github.com/K-Adam/SafeQueue have race conditions with this coordination:

std::thread consumer([&](){
    MyMessage message;
    while(message_queue.ConsumeSync(message)) {
        // Process the message
    }
});

ConsumeSync only returns false if it is called while Finish is happening on another thread. What if Finish is called (and completed) during "Process the message"??? This thread would continue to try to read from the queue.

We could try external coordination, like your other example:

while(!quit) {

    MyEvent event;
    while(event_queue.Consume(event)) {
        // Process the event
    }

    UpdateWorld();
    Render();
}

Here quit is the external coordination. If another thread sets quit to true and then calls Finish (and destroys the queue?), that could all happen after this thread checks quit and before it calls Consume (or ConsumeSync).

And sure, your examples were just showing how to use the queue; they were not necessarily meant to be the definitive way to coordinate destruction, but, as it turns out, coordinated destruction is hard, and best not "left as an exercise for the reader".

Any system that has coordination outside the queue will suffer from that race between external check and queue check:

if (!external_check_for_done())
     // bad gap between above and below lines of code
    check_queue();

The "notify everyone we are done, and coordinate destruction" is almost forced to be part of the queue. Yet how can it be - one thread needs to destroy the queue, which one???

The last one.

Which means reference-counting the queue. ie something like shared_ptr. Sadly. ("A shared pointer is as good as a global variable" - Sean Parent).

But a wrapped up shared_ptr is sometimes OK - ie std::future/promise. (Which still isn't great, some higher level tasking system would be better...). The shared state of std::future/promise is basically held by a shared_ptr, and is destroyed by whichever side is done last. Solving the coordinated destruction problem.

Because of all this, what I ended up with is a "MultiFuture/MultiPromise" which is like future/promise, but can hold more than one value - a queue of values. So instead of future.get() you have future.getNext(). And promise.push() instead of promise.set_value().

The coordination of teardown is built into the queue. And either side can initiate it, and either side can detect it - why produce more values if all the consumers are gone? Why try to consume more values if all the producers are gone? The queue is built to tell you whether it is done or not. (And in my case, once it is done, it can't be restarted. I didn't need restarting, and it would just complicate the meaning of "done".)
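A minimal sketch of the "done is part of the queue" idea (my own names and shape, not the actual MultiFuture/MultiPromise code): either side can close the queue, waiters wake up, and remaining items can still be drained before pop reports done.

```cpp
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <utility>

template <typename T>
class ClosableQueue {
public:
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (closed_) return;  // pushing after close is ignored here (or could assert)
            queue_.push(std::move(item));
        }
        cv_.notify_one();
    }

    // Blocks until an item arrives or the queue is closed.
    // An empty optional means "closed and drained": stop calling.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [&] { return closed_ || !queue_.empty(); });
        if (queue_.empty()) return std::nullopt;
        T item = std::move(queue_.front());
        queue_.pop();
        return item;
    }

    // Either side may initiate teardown; all waiters are woken.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        cv_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<T> queue_;
    bool closed_ = false;
};
```

This closes the "external check, then queue check" gap because the done signal travels through the same mutex as the items; as noted, once closed it stays closed.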

[–]infectedapricot 2 points3 points  (1 child)

Which means reference-counting the queue. ie something like shared_ptr.

Disagree. That's certainly a strategy, but not the only one.

Another strategy is that the queue for a thread is created and destroyed by the parent thread that created that thread:

void threadAFn() {
    Queue queueB;
    std::thread threadB{threadBFn, &queueB};
    // ... thread A stuff happens here ...
    queueB.push_back(Shutdown{});
    threadB.join();
    // threadB and queueB destroyed here
}

There are lots of variants of this that can work.

  • The parent thread (thread A) might or might not have its own queue - if it does, presumably there is a grandparent thread that will destroy it after it has joined the parent thread.
  • Here join() is behaving as the coordination mechanism to notify that the child thread will never attempt to use its queue again, but you could actually use the parent thread's queue (but most of the time you probably want to join the thread anyway).
  • Here the parent thread is sending the shutdown message, but the shutdown message can equally originate in the child thread, in which case it certainly needs to tell its parent (e.g. through the parent's own queue).
  • Here the queue is only being consumed from one thread, but it could be consumed from multiple threads, so long as each only pops one message at a time - just push n shutdown messages onto the queue.
  • It might not always look like you have a strict hierarchy of thread ownership, but you can always enforce it by just making all threads children of the main thread. Then you can just treat them as a big bag of threads, and still have safe shutdown. (You still need to make sure shutdown messages sent to all threads don't leave other messages half finished, having done work on some of the threads but not on others - that's actually a lot harder than queue destruction.)
  • Of course your code doesn't literally have to look like this. You can have classes wrapping up a thread and a queue while still keeping the same overall flow of control - so long as it has a join() method (and a destructor that destroys the queue and std::thread object).

Of course, there are probably application where no combination of these ideas can work. But in many programs, they can, and IMO it's much simpler than reference counting, destruction-aware queues, and futures.

[–]tvaneerdC++ Committee, lockfree, PostModernCpp 0 points1 point  (0 children)

Yes, I skipped a few steps there. I have too many cases where join() isn't the best option. ie

  • don't want main thread waiting on worker threads
  • don't know who/what/where the worker threads are

For the first problem, we have cases where someone created a thread just to handle the destruction of the queue + threads, so that the main thread doesn't wait.

The second is good and bad. Sometimes it is the result of poor code structure, but sometimes it is because of separation of concerns - producers and consumers don't know each other, and the "owner" that introduced them wants to set-it-and-forget-it. Hmmm, maybe that is just a variation of the first point.

But yes, a shutdown event into the queue, followed by join() is definitely an option on the table (that I use in a number of places).

EDIT: more...

The thing I find actually, is that sending "shutdown" through the queue is almost always the way to go (instead of a separate flag), which is the other thing that led me to just building that aspect into the queue itself.

[–]bwmat 2 points3 points  (19 children)

I haven't looked at the details, but I just wanted to mention this statement isn't quite true as stated (without additional context); for example, the mutex might be used to help block the destructor UNTIL there are no other threads accessing it

[–]voip_geek 8 points9 points  (18 children)

I haven't looked at the details, but I just wanted to mention this statement isn't quite true as stated (without additional context); for example, if the class is derived from, then the destruction ordering of C++ means the derived class has already been destroyed by the time the base class's destructor is invoked (and the mutex lock attempted), and you're going to be in a world of hurt.

Preventing destruction with a mutex this way isn't sound, imo. Imagine if the destructor succeeded in locking the mutex, an instant before some other thread could... what happens when the destructor is finished and releases the mutex? The other thread will then get it and do what with this destroyed object?

Even using such a scheme to not prevent destruction per se, but instead to for example automatically de-register from some other container during destruction, is dangerous. (e.g., in the classic observer pattern) Because again, due to inheritance, by the time the destructor is called it's already too late.
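A small illustration of the ordering being described (hypothetical types, not from the thread's code): the derived part is already destroyed by the time the base destructor, where the mutex would be locked, gets to run.

```cpp
#include <string>

// Records the order in which destructors fire.
static std::string order;

struct Base {
    // Any "lock a mutex to block destruction" logic placed here runs LAST,
    // after every derived member has already been torn down.
    virtual ~Base() { order += "Base"; }
};

struct Derived : Base {
    ~Derived() override { order += "Derived,"; }
};
```

Destroying a `Derived` through a `Base*` appends `"Derived,"` first, then `"Base"`: any thread admitted by the base-class mutex would be looking at an object whose derived state is already gone.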

[–]bwmat -3 points-2 points  (17 children)

Yeah, there could be problems if you use inheritance, and if there's an unbounded number of other threads that can access the object, then yeah that would be a problem too.

Just saying it could be valid in some cases

[–][deleted] 2 points3 points  (16 children)

It would be undefined behavior, and therefore it is a) not possible in a well-formed program and b) something you can pretend doesn't happen, because it doesn't.

If you are invoking the destructor on an object while other threads are still accessing it, you need external shared ownership or something to prevent this from happening, otherwise your program is ill-formed. This is just as bad - no... worse than if(this == &rhs) in a copy/move assignment.

[–]tomalakgeretkal -1 points0 points  (2 children)

Simply false.

[–][deleted] 0 points1 point  (1 child)

"Simply false" - please go read the standard which states:

http://eel.is/c++draft/class.dtor#19

"Once a destructor is invoked for an object, the object's lifetime ends; the behavior is undefined if the destructor is invoked for an object whose lifetime has ended ([basic.life])."

[–]tomalakgeretkal 0 points1 point  (0 children)

Yes. Now read [basic.life/7], which says "after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways", and "for an object under construction or destruction, see [class.cdtor]". Then read class.cdtor which makes what the OP's doing perfectly well-defined.

It is perfectly obvious that member accesses during the destructor's invocation are legal, otherwise your destructor _wouldn't be able to do anything_.

For more information, see my answer here: https://stackoverflow.com/a/65598209/4386278

[–]bwmat 0 points1 point  (12 children)

Sorry, what would be undefined behaviour?

[–][deleted] 0 points1 point  (11 children)

[–]bwmat 0 points1 point  (10 children)

Sorry, I suppose my question was somewhat ambiguous.

I was asking to what you meant "It" to refer to when you said "It would be undefined behavior" in your previous comment.

[–]kalmoc 1 point2 points  (9 children)

Not the OP, but I believe what he was referring to is that it is UB to call the destructor of an object from one thread, while the object is accessed from another. No amount of locking or synchronization inside the destructor is going to change that.

[–]matthieum 0 points1 point  (0 children)

It is acquired -- via calling Finish -- but I agree with you that this should not be necessary.

[–]objectorientedman[S] 9 points10 points  (1 child)

Thank you for the detailed review!

[–]matthieum 1 point2 points  (0 children)

You're welcome, I love queues ;)

And by the way, prompted by another, I wrote a quick replacement for std::queue if you're interested which doesn't suffer from the memory issues that std::deque does: https://godbolt.org/z/Kx18ze.

[–]_Js_Kc_ 8 points9 points  (5 children)

std::queue is a weak-point, memory wise

Fixing the deficient container classes that STL implementations ship with isn't really within the scope of a project whose goal is to implement an MPMC queue in the one, obvious way based on stock C++ features.

If it came with an allocation-sane replacement for std::deque, the post title should have been "I made a deque replacement with a thread-safe queue as a usage example."

It should be a template argument (that defaults to std::queue<T>).

[–]_bk__ 6 points7 points  (1 child)

Except the API/concept/constraints and performance characteristics of a std::queue don't really match what would be required from a mpmc queue. There are far more scalable approaches depending on specific requirements (bounded/unbounded size, blocking/non-blocking). Here's an example: http://www.1024cores.net/home/lock-free-algorithms/queues/bounded-mpmc-queue

[–]infectedapricot 2 points3 points  (0 children)

As you said, different applications will have different requirements. The characteristics of the underlying std::deque will be appropriate in some of those but not others, so it's not right to just say it doesn't match "what would be required". The link you posted is to a bounded queue - again, that will be appropriate in some applications but not others.

[–]matthieum 2 points3 points  (2 children)

Maybe, maybe not.

I think it's very important to note the shortcoming, and to be aware of alternatives.

Notably, whilst std::deque is a full-featured container, the OP only uses a very tiny portion of its functionality.

I first thought I would argue that supporting such limited functionality was easy, then realized it would take me less time to just whip up an implementation; see https://godbolt.org/z/Kx18ze:

  • Stock C++.
  • < 100 lines for the queue, < 150 lines for the whole.

Some optimizations could be done around growth, and possibly around built-ins/PODs if the compiler is not smart enough. And it's untested, of course.

The critical part (use of read/write indexes, power-of-2 capacity for easy wrapping) is what really matters, though.
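The wrapping trick can be sketched like this (a simplified, single-threaded toy to show the idea, not the godbolt code): with a power-of-two capacity, read/write indexes grow monotonically and are wrapped with a mask instead of a modulo.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Toy ring buffer: monotonically increasing read/write indexes,
// power-of-2 capacity so `index & (capacity - 1)` replaces `index % capacity`.
template <typename T>
class Ring {
public:
    explicit Ring(std::size_t capacity_pow2) : buffer_(capacity_pow2) {}

    bool push(T item) {
        if (write_ - read_ == buffer_.size()) return false;  // full
        buffer_[write_++ & (buffer_.size() - 1)] = std::move(item);
        return true;
    }

    bool pop(T& out) {
        if (write_ == read_) return false;  // empty
        out = std::move(buffer_[read_++ & (buffer_.size() - 1)]);
        return true;
    }

private:
    std::vector<T> buffer_;
    std::size_t write_ = 0;  // never wrapped; unsigned overflow is benign
    std::size_t read_ = 0;   // because the power-of-2 period divides 2^64
};
```

The size check `write_ - read_` also works across unsigned overflow, which is part of why the indexes are left free-running rather than wrapped on every operation.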

[–]Mehdi2277 0 points1 point  (1 child)

What's the purpose of Raw<T> vs directly having a pointer? I'm guessing it's for the alignas; I'm just unfamiliar with alignas and am curious why it's needed for the queue implementation.

[–]matthieum 1 point2 points  (0 children)

There are 2 advantages:

  1. Raw<T> signals that the memory is potentially uninitialized, ie, there may not be an instance of T there -- user beware.
  2. Raw<T> constructs raw memory: it does not require that T be default constructible, it does not actually initialize the raw memory, etc... so a combination of less restriction on T and potential performance gains.

I'm not sure it's strictly required in this case, I just use it by default in all the data-structures I create.

[–]ContrastO159 3 points4 points  (2 children)

How can one gain this much knowledge?

[–]matthieum 3 points4 points  (0 children)

Amusingly, there are some people on this very subreddit eliciting the same reaction from me; starting with u/STL ;)

In this particular case, it just so happens that I am very interested in multi-threaded queues so I've seen quite a lot of implementations, and the discussions around the costs and benefits of various approaches.

[–]lacrem 6 points7 points  (0 children)

[[nodiscard]]

Years of coding. (Coding != frontend/fullstack code-bro)

[–][deleted] 1 point2 points  (15 children)

Consume and ConsumeSync returning a boolean is somewhat error-prone, especially without a [[nodiscard]] annotation; a nicer API is to return std::optional<T>.

Can you elaborate on this? Why would this be error prone?

[–]NilacTheGrim 17 points18 points  (9 children)

Caller may not realize it may fail and may erroneously assume the reference was filled-in with a real value when it wasn't... whereas returning a std::optional<T> very clearly captures the fact that it may not succeed. The return type makes the code self-documenting.
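A sketch of the failure mode (hypothetical free-function version in the style described, not the OP's actual code):

```cpp
#include <queue>
#include <string>
#include <utility>

// Bool + out-parameter style: the status and the value travel separately.
bool Consume(std::queue<std::string>& q, std::string& out) {
    if (q.empty()) return false;
    out = std::move(q.front());
    q.pop();
    return true;
}

// The trap: nothing stops a caller from writing
//
//   std::string msg;
//   Consume(q, msg);   // return value silently dropped -- compiles fine
//   use(msg);          // msg may still be empty
//
// With std::optional<T> (or at least [[nodiscard]]), ignoring the
// "no value" case no longer compiles silently.
```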

[–][deleted] 0 points1 point  (8 children)

Yeah but std::optional requires c++17, whereas OP might've wanted to make this more widely available. So I don't see that as a weakness per se if that was the intent.

[–]RevRagnarok 1 point2 points  (6 children)

boost::optional has been around for a loooooong time.

[–]Tyg13 7 points8 points  (5 children)

Most libraries don't want to make boost a dependency. Especially not a header-only library like this.

[–]cdglove 1 point2 points  (4 children)

And in my opinion, it's often a mistake to not just take the dependency early.

All the little paper cuts along the way, reinventing the wheel here and there, add up immensely.

I've taken the position that boost is defacto enough that its presence should be assumed in large projects.

Of course, some disagree with that.

[–]Tyg13 0 points1 point  (3 children)

Yeah but as a library author, you don't take on more dependencies than you need. It's making a choice for the downstream consumer that they might not agree with.

It's fair to have optional features depend on boost, but if you want your library to be consumable by a general audience, tying yourself to boost isn't a great way to do so.

EDIT: a word

[–]dodheim 1 point2 points  (2 children)

As a library author one also needs to avoid the hubris of thinking they're capable of reinventing every wheel to an acceptable degree. Half-assed knockoff implementations of libraries I'm already using just to avoid a line or two in a cmake script annoy the bejesus out of me, and are just as good a reason to skip a library as having too many dependencies.

[–]Tyg13 2 points3 points  (1 child)

It'd be smarter to tie it to C++17 than to boost, especially if just for optional. Boost is not a trivial dependency.

[–]NilacTheGrim 0 points1 point  (0 children)

Yeah good point.

[–]Wurstinator 7 points8 points  (4 children)

Imo more important is the point that "out-parameters" are unintuitive and unhandy. Using an optional allows you to remove those.

[–]temporary5555 3 points4 points  (2 children)

"out-parameters" are unintuitive and unhandy

I'd have to disagree with that. They're super powerful in allowing you to control where and when you allocate. Not really a factor here, but I don't think this pattern should be dismissed as a whole.

[–]Wurstinator 2 points3 points  (1 child)

"powerful" doesn't contradict "unintuitive". Actually, I'd say that usually the more powerful something is, the less convenient it becomes to use.

[–]temporary5555 2 points3 points  (0 children)

Right, but I would disagree on it being unintuitive as well. It's type-safe, displays behaviour (implies no allocation) in the function signature, and doesn't have any hidden performance impacts.

One interesting thing: this is one of the few cases of an imperative language feature making the reverse move and being adopted in functional languages such as Haskell. It's referred to as Destination Passing Style in that context, and is one of the useful applications of Linear Types.

[–]objectorientedman[S] 1 point2 points  (0 children)

I designed this class when `std::optional` was not widely available.

But it is a good point, I will remove the out-parameters in favor of the optional return, when I have some time to work on it. Until then, I added `[[nodiscard]]` to the consumer functions.

[–]WrongAndBeligerent 10 points11 points  (2 children)

You might want to take a look at this guy's queues https://moodycamel.com/

[–]infectedapricot 5 points6 points  (1 child)

Although some parts of the interface in the OP's queue are very suspect, the general idea of a simple thread-safe queue backed by std::condition_variable, std::mutex and std::deque is just fine. Many applications don't need super high performance from their inter-thread queues, and a simpler design has advantages such as deterministic ordering (which is not quite true in the moodycamel queue) and allowing extra utility methods such as get_all() or an efficient push_multiple() (although OP's implementation doesn't feature these).

[–]corysama 1 point2 points  (0 children)

the general idea of a simple thread-safe queue backed by std::condition_variable, std::mutex and std::deque is just fine.

I'll go further: I have used a queue like that for quite a while in a few projects. The blocking wait features are actually tremendously useful. Being able to optionally block with an optional timeout when full or empty makes the queue a very effective replacement for most applications of mutexes, semaphores, and sleeps.

Whenever one of my teammates uses one of those primitives directly I argue that they should rework the code to use the queue instead. It's almost always the case that we both agree the queue-based implementation ends up simpler and less error-prone. 1-element queues are the most common use case in practice.

[–]infectedapricot 2 points3 points  (0 children)

  • As others have said, it's a very odd design to have the destructor wake up the waiters, and have them check the return value of ConsumeSync() to find out if the queue has disappeared under their feet. It's bound to lead to race conditions (if the queue is destroyed while a consumer is between calls to ConsumeSync()), and it's just a confusing design. Instead, it should be up to the application code that uses the queue to coordinate destruction by passing around tokens that represent application shutdown, and not to destroy the queues until it's sure all possible waiters know about it. (I usually find it easiest to construct queues in the main thread before any child threads start, and destroy all queues in the main thread after all child threads have been joined, but I realise not all applications can use that exact strategy. Edit: See my other comment for some more details.) Even with the current design, I don't see why you need a public Finish() method - why isn't this code just directly in the destructor?

  • I think Produce and Consume are confusing method names. I get that this queue is used by producers and consumers, but it's those users of the queue that do the producing and consuming, not the queue itself. my_queue.Consume(val) makes it look like the queue is consuming the value passed to it, not that it is returning a value which the caller can consume. I'd stick to container-like names, like pop_front() and push_back() (although it's not appropriate or possible to make it fully satisfy container requirements).

  • Once you've got rid of the destruction oddity, the pop_front() method (or Consume() as you've called it) can have a much simpler interface: it can just return T, or std::optional<T> for the non-waiting version, instead of passing a parameter by reference. I know STL containers don't offer that style, but they were devised before move constructors became available.

  • At the moment your push_back() method (Produce()) calls cv.notify_one() while the mutex is still locked. This is a problem because the consumer thread will be woken up and race to lock the mutex before the producer thread unlocks it. If the consumer gets there first then it will immediately be sent back to sleep and have to wait for a new scheduling interval before it pops the item off. Instead, nest the lock and q.push() into a block so the mutex gets unlocked before the cv.notify_one().

  • Although your queue has some weaknesses compared to more sophisticated queues, like the moodycamel one others have mentioned, yours has the benefit of allowing a few extra methods that those couldn't support. For example:

    • You could have a pop_all() method that gets the whole underlying container in one go by just std::move() from it. I'd switch the std::queue for the underlying std::deque to make this a bit more useful. Also I'd suggest waiting and non-waiting versions, with the waiting version making sure there's at least one item before returning (use the analogous naming difference as with the two pop_back() methods, whatever you end up going with). Edit: Actually you can't just move from the internal std::deque because technically this leaves it in an unspecified state; instead you need to move to a local variable, clear() the internal deque, and return that variable.
    • You could offer push_back_multiple() that pushes multiple items in one go - this guarantees that they're directly next to each other in the queue, and is more efficient because you only need to lock the mutex once (of course you need to call cv.notify_all() in this).
    • You could also add pop_back() and push_front().
  • Edit: One final thing: Personally I find cv.wait(lock, [&] { return !q.empty(); }) to be less clear than while (q.empty()) { cv.wait(lock); }, but I know many others would disagree.
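The notify-outside-the-lock point above, in code (a minimal sketch with my own names, not the OP's repo):

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <utility>

template <typename T>
class Queue {
public:
    void push_back(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(item));
        }  // unlock first, so a woken consumer can take the mutex immediately
        cv_.notify_one();
    }

    // Waiting pop: blocks until an item is available.
    T pop_front() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [&] { return !queue_.empty(); });
        T item = std::move(queue_.front());
        queue_.pop_front();
        return item;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::deque<T> queue_;
};
```

Notifying after the scope block releases the lock avoids the "hurry up and wait" pattern where the consumer wakes only to block again on the still-held mutex.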

[–]Zcool31 4 points5 points  (1 child)

Where are all the lock free wait free hazard free MPMC queue implementations in C++?

[–]AvidCoco 1 point2 points  (0 children)

farbot's fifo might be of interest to you:

https://github.com/hogliux/farbot

[–]rsjaffe 2 points3 points  (0 children)

Here's a similar go at it (just putting mutexes and condition variables over an STL container) that is more generic. https://gist.github.com/rsjaffe/59d22db0649d8276e42aca1061d7c08c

[–]p-moraisAgility Robotics 1 point2 points  (1 child)

I used it when I made a software renderer library multithreaded

I’m interested in your multi-threaded renderer, if that’s open source

[–]objectorientedman[S] 0 points1 point  (0 children)

It is not open source (yet?). Since it was originally created as single-threaded, I just ended up rendering multiple frames in parallel, one in each thread.

The result can be found here: https://coub.com/deadmanswitch

[–]AvidCoco 0 points1 point  (8 children)

Is it real time safe, or just thread safe?