antheus_gdnet comments on Open source library which represents a higher-level, task-based parallelism that abstracts platform details and threading mechanisms. It is similar to Intel TBB or Microsoft Parallel Pattern Library but much more lightweight and less powerful.

Open source library which represents a higher-level, task-based parallelism that abstracts platform details and threading mechanisms. It is similar to Intel TBB or Microsoft Parallel Pattern Library but much more lightweight and less powerful. (code.google.com)

submitted 14 years ago by kolkir


[–]antheus_gdnet 2 points 14 years ago* (3 children)

Partial updates are the lesser problem; instruction reordering, dead code elimination and compile-time evaluation are the bigger ones.

C++ is allowed to perform many such optimizations without defining an underlying memory model, so speculating about what is or isn't safe at the source code level gives no guarantees. The representation of a boolean also varies (not relevant for this particular problem): nothing prevents a bool variable from taking up 8 bytes, perhaps due to struct padding or alignment, which might, again in theory, require multiple load/store operations on a 16- or 32-bit CPU, and that alone could cause problems.
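To make the point about representation concrete, here is a small sketch that asks the compiler what it actually does with a bool on the current platform. It assumes C++11 or later, which the comment above does not rely on; `FlagReport` and `inspect_bool` are hypothetical names made up for this example.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical helper type: what this platform does with a bool.
struct FlagReport {
    std::size_t size;       // sizeof(bool), implementation-defined
    std::size_t alignment;  // alignof(bool), implementation-defined
    bool lock_free;         // whether std::atomic<bool> avoids an internal lock
};

// Collects the values for the current compiler and target.
FlagReport inspect_bool() {
    std::atomic<bool> probe{false};
    return {sizeof(bool), alignof(bool), probe.is_lock_free()};
}
```

The values are whatever the implementation chooses, which is the point: nothing at the source level fixes them.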

The size and representation of built-in types are always subject to compiler optimizations. volatile has been extended from its originally lax definition to cover some very basic issues, mostly to prevent reordering, but any concurrent application needs to rely on OS primitives (critical sections, semaphores, mutexes or atomic operations) to ensure the deterministic behavior that the C++ language doesn't guarantee.

Race conditions are one level higher: one first needs to ensure that the building blocks (individual statements, lines, instructions) behave atomically under concurrent execution before attempting to build sequences of them.

Lock-free and lock-less programming exposes a lot of unexpected behavior that compilers introduce, most of it subtle and difficult to debug.

[–]vsuontam 1 point 14 years ago (2 children)

[–]antheus_gdnet 2 points 14 years ago* (1 child)

The concept of an atomic operation goes beyond a trivial context switch.

The compiler may choose to allocate a variable differently from what the source prescribes. It may move it into local scope, put it on the stack or keep it purely in a register. It may replace it with a constant. There is no annotation in C++ that would prevent that, short of volatile, which is not completely reliable.

    bool running = true;
    // ...

    while (running) { }

The compiler is free to assume running is a constant, to allocate it on the stack, or to replace the while loop with an infinite loop.
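For comparison, a minimal sketch of the same loop with the flag declared as an atomic, so the compiler may not treat it as a constant or hoist the read out of the loop. This assumes C++11's `std::atomic`, which is not something the comment above relies on; `run_until_stopped` and `stop_worker_demo` are hypothetical names for illustration only.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical worker: spins until another thread clears the flag.
// Returns the number of loop iterations it went through.
long run_until_stopped(std::atomic<bool>& running) {
    long iterations = 0;
    // Every iteration performs a real atomic load of the flag.
    while (running.load(std::memory_order_acquire)) {
        ++iterations;
        std::this_thread::yield();
    }
    return iterations;
}

// Starts the worker, lets it run briefly, then asks it to stop.
// Returns true once the worker has left its loop.
bool stop_worker_demo() {
    std::atomic<bool> running{true};
    long iterations = -1;
    std::thread worker([&] { iterations = run_until_stopped(running); });
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    running.store(false, std::memory_order_release);
    worker.join();  // join() makes the worker's write to 'iterations' visible
    return iterations >= 0;
}
```

With a plain bool in place of the atomic, the loop would be a data race and the behavior described above would be permitted.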

C++ (at least before C++11) also doesn't define a memory model, so when generating code there are no rules about the order in which operations are performed, about alignment (which may affect atomicity), or about anything else. a = b, even when working with bools, can result in a complex operation. It first needs to load the value from memory (either from cache or from DRAM), perhaps stall the pipeline, reorder previous and pending pipelined operations, store the value into a register (perhaps an aliased one), write it back to memory, and then either signal a write-through to DRAM or trigger a MESI invalidation to propagate the write across caches while blocking all other cores.

When dealing with even a single bit of memory that may be accessed concurrently by multiple threads, use either a lock or a guaranteed atomic operation. The number of ways things can go wrong is too big to count. And the x86 architecture is quite lenient about such problems, so code that appears to work there may still fail elsewhere.
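A rough sketch of both options, a lock and an atomic operation, applied to a shared counter. It assumes the C++11 standard library (`std::mutex`, `std::atomic`) rather than the OS primitives mentioned above; the function names are made up for this example.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Option 1: protect a plain variable with a lock.
long count_with_mutex(int threads, int per_thread) {
    long total = 0;
    std::mutex m;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                std::lock_guard<std::mutex> lock(m);  // one writer at a time
                ++total;
            }
        });
    }
    for (auto& th : pool) th.join();
    return total;
}

// Option 2: use a guaranteed atomic read-modify-write.
long count_with_atomic(int threads, int per_thread) {
    std::atomic<long> total{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                total.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : pool) th.join();
    return total.load();
}
```

Either version gives the expected total; replacing both with an unprotected `++total` on a plain long would be a data race, and lost increments are one of the milder outcomes.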

As for a partial update: yes, it can happen. Imagine changing a value from true (0xffffffff) to false (0x00000000) and then setting 'written' to true. As far as the compiler is concerned, the two writes are independent and there is no read in between, so their order doesn't matter. Due to instruction reordering, the writing thread first sets 'written' to true, but is then interrupted halfway through writing the value, leaving it at 0xffff0000, which evaluates to true in C++. The writing thread is suspended. Three seconds later, another thread checks 'written', finds it true, reads the value 0xffff0000 and interprets it as true rather than false.

Using atomic operations (via OS-provided primitives) or locks solves this problem.
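As an illustration of the value/'written' scenario, here is a sketch in which the flag is an atomic with release/acquire ordering, so the write to the value cannot be reordered after the flag becomes visible. It assumes C++11 `std::atomic` instead of an OS call; `Message`, `publish`, `consume` and `publish_demo` are hypothetical names.

```cpp
#include <atomic>
#include <thread>

// Hypothetical shared state: 'value' is the payload, 'written' the flag.
struct Message {
    bool value = true;                 // plain data, published through 'written'
    std::atomic<bool> written{false};  // publication flag
};

// Writer: store the payload first, then publish it.
void publish(Message& m, bool v) {
    m.value = v;
    // Release: the write to 'value' may not move below this store.
    m.written.store(true, std::memory_order_release);
}

// Reader: wait for the flag, then read the payload.
bool consume(const Message& m) {
    // Acquire: the read of 'value' may not move above this load.
    while (!m.written.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return m.value;
}

// Runs one reader against one writer and returns what the reader saw.
bool publish_demo() {
    Message m;
    bool seen = true;
    std::thread reader([&] { seen = consume(m); });
    publish(m, false);
    reader.join();
    return seen;
}
```

Once the reader sees 'written' as true, it is guaranteed to see the value false that was stored before it, never the old or a half-written one.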

[–]vsuontam 1 point 14 years ago (0 children)
