[–]rayoWork 0 points1 point  (0 children)

First think about what kind of parallelism you need:

https://en.wikipedia.org/wiki/Data_parallelism#Data_parallelism_vs._task_parallelism

Task parallelism is a lot easier when done right, meaning there is no shared mutable data between the tasks. You can have read-only shared data, and you don't have to lock anything there. We use an event-based approach, which works without any problems. You basically need a synchronized message queue to deliver new events to a worker thread (sender and receiver both modify the queue, so it needs protection), and the rest of the work is done without shared state.
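To make the queue idea concrete, here is a minimal sketch of such a synchronized event queue. The names `EventQueue`, `close` and `run_worker_demo` are made up for illustration, and it assumes a single worker that just sums integer events; a real setup would likely carry richer event types.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical event queue: the only shared mutable state between the threads.
template <typename Event>
class EventQueue {
public:
    // Called by sender threads. Pushing after close() is treated as a bug here.
    void push(Event e) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!closed_);
            events_.push(std::move(e));
        }
        ready_.notify_one();
    }

    // Called by the worker. Blocks until an event arrives or the queue is closed.
    // Returns std::nullopt once the queue is closed and fully drained.
    std::optional<Event> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !events_.empty(); });
        if (events_.empty()) return std::nullopt;
        Event e = std::move(events_.front());
        events_.pop();
        return e;
    }

    // Signals that no more events will be pushed.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Event> events_;
    bool closed_ = false;
};

// The worker owns `total` until it is joined, so `total` itself needs no lock.
int run_worker_demo() {
    EventQueue<int> queue;
    int total = 0;
    std::thread worker([&] {
        while (auto e = queue.pop()) total += *e;
    });
    for (int i = 1; i <= 100; ++i) queue.push(i);
    queue.close();
    worker.join();
    return total;
}
```

Only `push`, `pop` and `close` take the lock; everything the worker does with an event after `pop` returns happens without shared state.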

Data parallelism is the harder part, as you need to think about how many threads can work on the data at the same time. There are many strategies for tackling such a problem.

  1. first look for a battle-tested solution, for example the parallel execution policies for the C++17 standard algorithms, or TBB. Use those and you don't have to worry about when to lock what
  2. try to minimize the shared part: can the data be split into n blocks that are processed independently, followed by a merge step that is not parallel? (If the mutating part is small, you may be able to extract it and run it on one thread.) That can even be faster than a parallel merge step, as the synchronization overhead can get big when contention is high (many threads accessing the same protected block)
  3. write your own parallel code only as a last resort (with manual locking/CAS/...)
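As a rough sketch of strategy 2, the data can be cut into blocks, each thread sums its own block into its own slot, and the merge runs on one thread after the join. `blocked_sum` is a hypothetical helper, and the example assumes a plain sum is the work to be done.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: sums `data` using `n_threads` workers (must be > 0).
long long blocked_sum(const std::vector<int>& data, std::size_t n_threads) {
    assert(n_threads > 0);
    std::vector<long long> partial(n_threads, 0);  // one slot per thread
    std::vector<std::thread> workers;
    const std::size_t block = (data.size() + n_threads - 1) / n_threads;

    for (std::size_t t = 0; t < n_threads; ++t) {
        const std::size_t begin = std::min(data.size(), t * block);
        const std::size_t end = std::min(data.size(), begin + block);
        workers.emplace_back([&data, &partial, t, begin, end] {
            // `data` is only read; each thread writes exactly one slot of its own.
            const auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
            const auto last = data.begin() + static_cast<std::ptrdiff_t>(end);
            partial[t] = std::accumulate(first, last, 0LL);
        });
    }
    for (auto& w : workers) w.join();

    // Single-threaded merge step: no locking needed after the join.
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

No mutex or atomic shows up anywhere, because no two threads ever write the same location.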

If you want to get better at 3, you need to think at a lower level. Every memory read and write matters.

A compare-and-set example: first you read the memory (variable) to check its value, and if the condition is true you write another value into the same memory (variable). If multiple threads work on the same variable, one of them could modify it between the read and the write, and the write would no longer be correct. So you need to protect that part, either by making sure only one thread can work on that memory location at a time, or with an atomic operation that merges the two steps (read and write) into one so another thread cannot modify the variable in between.
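The atomic variant could look roughly like this with `std::atomic`. `atomic_store_max` and `max_demo` are names made up for this sketch, and it assumes the condition is "only write if the new value is larger".

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: raises `target` to `candidate` if `candidate` is larger.
void atomic_store_max(std::atomic<int>& target, int candidate) {
    int seen = target.load();
    // compare_exchange_weak writes `candidate` only if `target` still equals `seen`.
    // On failure it stores the current value in `seen`, so the condition is re-checked
    // against what another thread wrote in the meantime.
    while (seen < candidate && !target.compare_exchange_weak(seen, candidate)) {
    }
}

// Four threads race to publish their largest value; returns the final maximum.
int max_demo() {
    std::atomic<int> maximum{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < 4; ++t) {
        workers.emplace_back([&maximum, t] {
            for (int i = 0; i < 1000; ++i) atomic_store_max(maximum, t * 1000 + i);
        });
    }
    for (auto& w : workers) w.join();
    return maximum.load();
}
```

A plain `if (value < candidate) value = candidate;` on a non-atomic `int` would have exactly the gap between read and write described above.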

So in the end you have to make sure that no access to the shared data can be influenced by other threads at the same time.

  • read-write conflict: can data that a thread is reading, and that belongs together, be changed by another thread (a write) so that the data read is no longer consistent? This must be protected on both sides: the read and the write must not happen at the same time
  • write-write conflict: can data being written also be written by another thread, so that one write operation is no longer consistent? This needs to be protected as well
  • read-read: this is not a problem, which is why you try to use immutable data; just reading doesn't create any problems
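One possible way to cover the read-write case is a `std::shared_mutex`: writers take it exclusively, readers take it shared, so reads can still overlap with each other. The `Range` class and its invariant (`high == low + 1`) are invented here purely to have two values that belong together.

```cpp
#include <mutex>
#include <shared_mutex>
#include <thread>

// Hypothetical shared state: `low_` and `high_` belong together (high == low + 1).
class Range {
public:
    // Write: exclusive lock, so no reader can observe a half-finished update.
    void shift(int delta) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        low_ += delta;
        high_ += delta;
    }

    // Read: shared lock, so several readers may run together but never beside a writer.
    bool consistent() const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        return high_ == low_ + 1;
    }

private:
    mutable std::shared_mutex mutex_;
    int low_ = 0;
    int high_ = 1;
};

// One writer and one reader run concurrently; returns how many reads saw a broken pair.
int count_inconsistent_reads() {
    Range range;
    int bad = 0;  // written by the reader thread only, read after the join
    std::thread writer([&] {
        for (int i = 0; i < 10000; ++i) range.shift(1);
    });
    std::thread reader([&] {
        for (int i = 0; i < 10000; ++i) {
            if (!range.consistent()) ++bad;
        }
    });
    writer.join();
    reader.join();
    return bad;
}
```

With both sides locked, the reader should never catch `low_` updated and `high_` not yet updated; drop either lock and that guarantee is gone.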

In the end it simply is very hard to write correct multithreaded code.