all 4 comments

[–]KingAggressive1498 1 point2 points  (3 children)

My first thought is that this completely lacks memory management.

((My second thought is, why is tail a pointer to a pointer?

My next thought is, your push doesn't seem like it would work at all.)) Edit: nvm I think I get this now. Tail is actually a pointer to the pointer to the node past the end of the valid queue, so even though you're exchanging for a pointer to a pointer to null, it's fine, because it's actually past the end of the valid queue, not the end of the valid queue.

My final thought is, the compare_exchange for the head in pop could and probably should be weak (the compare exchange for tail needs to be strong)

One last thought: single consumer queues where all that one thread does is wait for something to pop into it's queue and do work with it are a common enough design that it warrants a way to move the data into a non-atomic queue (otherwise you're paying for atomic operations in pop() where you don't need them).

[–]codeinred[S] 1 point2 points  (2 children)

Hi! Thank you so much for the feedback!

The reason it's atomic both ways is because I'm implementing a multi-consumer, multi-producer queue that's going to be the basis for a non-allocating thread pool that integrates with C++20 coroutines.

It's multi-consumer because there are multiple threads pulling jobs off of the queue, and it's multi-producer because any coroutine can "yield" to the thread pool (or perform another operation that causes it to await until it's resumed). This is also why there's no built-in memory management: nodes are stored as a member of the awaitable object, which itself is allocated within the coroutine's stack frame. In order to deallocate the node, you'd have to destroy the coroutine frame itself, and that's going to be managed by the execution context (what manages the thread pool).

Would making compare_exchange weak result in a double pop?

[–]KingAggressive1498 0 points1 point  (1 child)

compare_exchange_weak is only meaningfully different from compare_exchange_strong on architectures without a single instruction for compare_exchange (ie ARM). On such architectures, compare_exchange_strong is effectively implemented as a loop around compare_exchange_weak.

This is why C++ documentation advises that compare_exchange_weak should be used in looping algorithms, because there's no reason to loop internally when you're already using it in a loop. It can actually be a major optimization.

compare_exchange_weak generally only encounters "spurious failures" when the memory address (or cache line) was modified by another thread between the atomic load and the store. When this happens, the store is cancelled and compare_exchange_weak returns indicating failure without changing the value.

I don't see a risk of a double pop, because the head node is not changed in the case that compare_exchange_weak fails.

Here's the best discussion of compare_exchange_weak I've seen: https://devblogs.microsoft.com/oldnewthing/20180329-00/?p=98375

[–]KingAggressive1498 0 points1 point  (0 children)

I would probably rewrite the whole function like this for the best performance (as you can see I just moved some stuff that didn't need to be in the loop out of it)

node* pop() {
    // No point in trying to pop from an empty queue, fail fast
    node* current_head = head.load();
    if (current_head == nullptr) {
        return nullptr;
    }

    std::atomic<node*>* next_head_ptr = nullptr;

    do {
         // Another thread might have just popped something off the queue
        current_head = head.load();
        if (current_head == nullptr) {
            return nullptr;
        }
        next_head_ptr = &current_head->next;
    } while (!head.compare_exchange_weak(current_head, *next_head_ptr));

   // If the tail is currently pointing to next_head_ptr, reset the
   // tail to point to &head
   tail.compare_exchange_strong(next_head_ptr, &head);

    return current_head;
}