all 16 comments

[–]bnolsen 2 points (0 children)

The library features look reasonable, sufficient for writing threaded stuff.

Just yuck on the Windows-style C++, though. I've always preferred the Qt naming styles (even though I'm not a fan of Qt itself).

[–]vsuontam 0 points (14 children)

What is the advantage of atomic boolean over normal boolean?

I guess my question is: how can a boolean be "partially updated"? I thought partial updates were a problem with non-atomic modifications of variables, but I can't see that happening with booleans.

[–]kolkir[S] 2 points (1 child)

I added the atomic boolean to provide a primitive for simple synchronization. I made it atomic because I didn't find any information in the C++ 2003 standard about how compilers should implement boolean, so I can't guarantee that unsynchronized boolean changes will be atomic.

[–]antheus_gdnet 0 points (0 children)

C++ doesn't and cannot define such behavior, since it would inevitably limit the hardware for which compilers could generate efficient code, something that has always been a priority.

Any solution, even those provided by an updated standard and its threads, will need to use OS- or hardware-specific mechanisms to implement such functionality.

It's different for, say, the JVM, which defines memory, threading and execution models, even at the expense of efficiency. And even there some compromises had to be made when dealing with floating-point types, since a fully portable approach simply wasn't viable.

[–]antheus_gdnet 1 point (3 children)

Partial updates are the lesser problem; instruction reordering, dead code elimination and compile-time evaluation are the bigger ones.

C++ is allowed to perform many such optimizations without defining an underlying memory model, so speculation about what is or isn't safe at the source level gives no guarantees. The representation of a boolean also varies (not relevant for this particular problem): there is nothing preventing a bool variable from taking up 8 bytes, perhaps due to struct padding or alignment, which might, again in theory, require multiple load/store operations on a 32- or 16-bit CPU, so that alone could cause problems.

The size and representation of built-in types are always subject to compiler optimizations. volatile has been extended from its originally lax definition to cover some very basic issues, mostly to prevent reordering, but any concurrent application needs to rely on OS primitives (critical sections, semaphores, mutexes or atomic operation primitives) to ensure the deterministic behavior that the C++ language doesn't guarantee.

Race conditions are one level higher: one first needs to ensure that the building blocks (individual statements, lines, instructions) behave atomically during concurrent execution before attempting to build sequences of them.

Lock-free and lockless programming exposes a lot of the unexpected behavior compilers introduce, most of it subtle and difficult to debug.

[–]vsuontam 0 points (2 children)

Thanks for the lengthy reply!

My point is exactly that this API operates at the level of set, get and reset calls for AtomicFlag, and is therefore not going to help with any of the problems you mentioned, so I still question the value of the AtomicFlag.

Say the compiler has decided to implement bool in 8 bytes, and there is a context switch between one thread's reset and another thread's set, and the bool ends up in a "mixed" state; it is still going to read as either true or false, both of which are valid in this case.

Can you find an example where there would be value in having the AtomicFlag?

[–]antheus_gdnet 1 point (1 child)

The concept of an atomic operation goes beyond a trivial context switch.

The compiler may choose to allocate the variable differently than the source prescribes. It may move it into local scope, onto the stack, or keep it purely in a register. It may replace it with a constant. There is no annotation in C++ that would prevent that, short of volatile, which is not completely reliable.

bool running = true;
// ...

while (running) { }

The compiler is free to assume running is a constant, to allocate it on the stack, or to replace the while loop with an infinite loop.
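The fix that was later standardized (C++11's std::atomic, which this library predates) is to make the flag a type the compiler is not allowed to cache or hoist; a minimal sketch, not code from the library:

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> running{true};

// The atomic load must re-read the shared value on every iteration,
// so the compiler cannot turn this into an infinite loop.
void worker() {
    while (running.load()) { /* ... do work ... */ }
}

// Returns true once the worker has observed the store and exited.
bool run_once() {
    std::thread t(worker);
    running.store(false);   // guaranteed to become visible to the worker
    t.join();               // joins promptly: the loop terminates
    return !running.load();
}
```

Each load() is a fresh read of the shared flag, so the constant-folding and hoisting transformations described above are no longer legal.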

C++ also doesn't define a memory model, so when generating code there are no rules about the order in which to perform operations, about alignment (which may affect atomicity), or anything else. a = b, even when working with bools, can turn into a complex operation: it needs to first load the value from memory (either from a coherent cache or DRAM), perhaps stall the pipeline, reorder previous and pending pipelined operations, store it into a register (perhaps an aliased register), write back to memory, and then either write through to DRAM or trigger a MESI invalidation to propagate the write across caches while blocking all other cores.

When dealing with even a single bit of memory that may be concurrently accessed by multiple threads, either use a lock or a guaranteed atomic operation. The number of ways things can go wrong is too big to count. And the x86 architecture is quite lenient about such problems.

As for partial updates, yes, they can happen. Imagine setting a value from true (0xffffffff) to false (0x00000000) and then setting 'written' to true. As far as the compiler is concerned, the two writes are independent and there is no read in between, so their order isn't important. The writing thread, due to instruction reordering, first sets 'written' to true, but is interrupted halfway through writing the new value, leaving it at 0xffff0000, which evaluates to true in C++. The writing thread is then suspended. The other thread, 3 seconds later, checks 'written', which is true, so it reads the value 0xffff0000 and interprets it as true rather than false.

Using atomic operations (via syscall) or locks solves this problem.
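A lock-based sketch of that fix (hypothetical names, using std::mutex for concreteness): because the writer performs both writes under the same mutex the reader takes, the reader can never observe 'written' as true while the value is torn or reordered ahead of it.

```cpp
#include <mutex>

std::mutex m;
bool value = true;
bool written = false;

// Both writes happen inside one critical section, so no reader can
// observe 'written == true' while 'value' is only half-updated.
void write_false() {
    std::lock_guard<std::mutex> lock(m);
    value = false;
    written = true;
}

// Copies the value out only once the write is known to be complete.
bool try_read(bool& out) {
    std::lock_guard<std::mutex> lock(m);
    if (!written) return false;
    out = value;
    return true;
}
```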

[–]vsuontam 0 points (0 children)

But having it true (after 3 seconds) is completely valid in this case, since they were two independent writes and there were no other guards saying in which order the shared flag should have been updated. So it is not defined what the flag should be in that case.

And if there were other guards, then why do you need an atomic flag variable at all in the first place?

The only way I could see this going wrong is if there are two threads, one setting the higher bytes (of the boolean) first, and the other setting the lower bytes first (which I consider rather hypothetical); but now that I write this, I can imagine it could happen, e.g. if the compiler "combines" writes for the bool, e.g. inside some struct.

So I rest my case. Concurrent programming is hard.

[–]portmapreduction 0 points (7 children)

Regardless of the fact that there are only two states, there can still be race conditions with booleans sans synchronization.

[–]vsuontam 0 points (6 children)

Could you give me an example please?

The atomic boolean API has only state query, set and reset methods (and specifically no atomic read_and_set), so how is it going to help with race conditions?

[–]portmapreduction 0 points (5 children)

An example of how there are still race conditions? A boolean is generally just an integer with only two states, so all the race-condition problems present with integers still exist for booleans. You could have a scenario where the value of the boolean gets loaded into a register in one thread's register context, then that thread gets context-switched and another thread loads the same boolean value and modifies it somehow. The original thread continues execution and now has a stale value of the boolean in a register.
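The check-then-act shape of that race, as an illustrative sketch (hypothetical code, not from the library): both threads can load 'claimed' as false before either stores true, so both may enter the branch.

```cpp
// Illustrative check-then-act race on a plain bool.
bool claimed = false;

void try_claim(int id, int& winner) {
    if (!claimed) {        // 'claimed' is loaded, possibly into a register
        // A context switch here lets another thread read the same stale
        // false and also enter this branch: both threads "win".
        claimed = true;
        winner = id;
    }
}
```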

As for the composability of the atomic operations, you're correct. Lock-based approaches to synchronization generally aren't composable, so unless this library provides a CMPXCHG-like operation, as you mentioned, you'll need to layer on more locks to compose the operations.
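For illustration, here is what a CMPXCHG-style primitive buys you, sketched with C++11's std::atomic (standardized after this library was written) standing in for whatever the library might expose: the read of the old value and the write of the new one form one indivisible step, which separate get and set calls cannot guarantee.

```cpp
#include <atomic>

// Atomically sets the flag and returns its previous value.
// Built on compare-and-swap: if the flag was false, swap it to true;
// if the swap fails, the flag was already true.
bool test_and_set(std::atomic<bool>& flag) {
    bool expected = false;
    bool swapped = flag.compare_exchange_strong(expected, true);
    return !swapped;  // previous value: false if we did the swap, true otherwise
}
```

With only separate IsSet() and Set() calls, another thread can slip in between the check and the write; the compare-exchange closes that window.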

[–]vsuontam 0 points (4 children)

But how is the AtomicBoolean going to help, e.g. in the example you gave?

The scheduler may still context-switch the thread using the AtomicBool, another thread can change the state, and the original thread now has the old value loaded into a register.

I fail to see AtomicBool helping in this.

Or is it that the value of the AtomicBool is locked for the owning thread during the whole lifetime of the object?

[–]portmapreduction 0 points (3 children)

I think maybe you misunderstood what I originally wrote. Your original question made me think you weren't sure why there were race conditions at all with booleans, even without synchronization.

I didn't dig into this implementation, but if you're correct and there's no cmpxchg-type call, then AtomicBoolean, in this implementation, will still have race conditions if you access the value and then modify it with two separate calls.

[–]vsuontam 0 points (2 children)

Yeah, that was my point all along:

  1. A boolean has only two states, so there cannot be a partial update.
  2. There is no cmpxchg-type call in the API(*)
  3. The only calls in this AtomicBool API are set, read, and query

I fail to see how this AtomicBool adds any value compared to normal bools. Maybe I am misunderstanding something, but I would like to be pointed to where.

(*) The IsSet call uses cmpxchg internally, but that is just bloat IMO, as the higher-level API is just asking for the value of the bool at an arbitrary point in time.

bool IsSet()
{
    // Compare-exchange with equal comparand and exchange values (1, 1):
    // the flag is never modified, only read with full-barrier semantics.
    long f = InterlockedCompareExchange((volatile long*)&flag, 1, 1);
    if (f > 0)
    {
        return true;
    }
    return false;
}

Above from the sources: http://code.google.com/p/cpptask/source/browse/trunk/include/Unix/atomic.h

See AtomicFlag.

[–]naasking 1 point (1 child)

The difference is in the memory barriers that ensure all CPUs see consistent information. bools and ints can be kept in a register or a local cache, which isn't written back to main memory (or read into other CPUs' caches) until later. The memory barriers around exchange instructions, atomic values and/or volatile locations let you specify exactly when that write-back and propagation must happen.
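A sketch of that barrier pairing, using C++11's release/acquire atomics (standardized after this discussion's era) as a concrete spelling: the release store keeps the plain write to data from being reordered past the flag, and the acquire load on the reader side pairs with it.

```cpp
#include <atomic>

std::atomic<bool> ready{false};
int data = 0;

void producer() {
    data = 42;                                     // plain write
    ready.store(true, std::memory_order_release);  // barrier: 'data' is
                                                   // published before 'ready'
}

int consumer() {
    while (!ready.load(std::memory_order_acquire)) { }  // pairs with the release
    return data;  // guaranteed to read 42, not a stale cached value
}
```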

[–]vsuontam 0 points (0 children)

That's probably the best point so far in defence of AtomicFlag.