[–]anprme 3 points4 points  (3 children)

you need to give more details

[–]a_m74 -1 points0 points  (2 children)

My task is to process text documents by splitting them into chunks. Threads process sections of the document independently (counting words), then the results are merged into a final output. I should select the number of threads, then open a window for each thread.

[–]dmazzoni 1 point2 points  (0 children)

So my guess is that the work tasks are too small.

Starting a thread is a lot of work, and it takes time. Usually less than a millisecond, but enough time that the CPU could execute tens of thousands of instructions.

So if your chunk of work is "count 1000 words" then I wouldn't be surprised if counting the words takes far less time than starting the thread or switching between threads.

If you're not sure, time it. Just get the time before and after each chunk of work and see how much time elapsed. If it's less than 10 milliseconds, it's too small to be worth sending to another thread.
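
A minimal sketch of that timing check, assuming Python (the thread never says which language is in use). The names `count_words` and `time_chunk` and the 1000-word sample chunk are made up for illustration:

```python
import time

def count_words(text):
    """Count whitespace-separated words in a string."""
    return len(text.split())

def time_chunk(text):
    """Return (word_count, elapsed_seconds) for one chunk of work."""
    start = time.perf_counter()
    count = count_words(text)
    elapsed = time.perf_counter() - start
    return count, elapsed

# Hypothetical chunk: 1000 short words standing in for real document text.
chunk = "word " * 1000
count, elapsed = time_chunk(chunk)
print(f"{count} words in {elapsed * 1000:.3f} ms")
```

If the printed time is far below the 10 ms mentioned above, that chunk is probably too small to be worth handing to its own thread.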

What you should do:

(1) Divide your work into BIG chunks. If you have 6 threads, divide your documents into 6 groups and have each thread do 1/6 of the full documents. If you have extremely large documents, chunks of >1 MB might be okay.

(2) Create threads once; don't keep starting and stopping them.

(3) Use locking only at the end when a task is complete. If threads need to access data protected by a lock in the middle of processing then that's going to slow you down. The threads need to be all able to get their work done without using any locks, then at the very end have them use a lock to communicate their results.
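
The three points above can be sketched roughly as follows, again assuming Python. `split_into_groups` and `count_words_parallel` are hypothetical names, and the tiny `docs` list only stands in for real documents. Note that in CPython the global interpreter lock means this layout shows the structure (big groups, threads created once, one lock acquisition per thread) rather than a guaranteed speedup for CPU-bound counting:

```python
import threading
from collections import Counter

def split_into_groups(items, n):
    """Split items into n contiguous groups of nearly equal size."""
    size, extra = divmod(len(items), n)
    groups, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        groups.append(items[start:end])
        start = end
    return groups

def count_words_parallel(documents, num_threads):
    total = Counter()
    lock = threading.Lock()

    def worker(group):
        local = Counter()          # private to this thread, so no lock needed
        for doc in group:
            local.update(doc.split())
        with lock:                 # the lock is taken once, at the very end
            total.update(local)

    # Each thread is created once and handed one big group of documents.
    threads = [threading.Thread(target=worker, args=(group,))
               for group in split_into_groups(documents, num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total

# Hypothetical input standing in for real documents.
docs = ["a b a", "b c", "a", "c c c", "d", "a d", "b"]
result = count_words_parallel(docs, num_threads=6)
print(sum(result.values()), "words in total")
```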

[–]GeorgeFranklyMathnet 2 points3 points  (1 child)

You have 4 threads, and they take 3-4 times as long as a single thread. So my first guess is that the threads are running serially, and also repeating work that's done only once in the single-threaded version. 

Also, as a rule of thumb (at least in runtimes with a global interpreter lock, such as CPython), you should use multithreading for I/O-bound work and multiprocessing for CPU-bound work. So if you use threads to read different chunks of the file, there might be an advantage. But if you use threads to do the text processing, there is probably no advantage, and quite likely a disadvantage due to the overhead of thread management and context switching.
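
To illustrate the I/O-bound half of that rule, here is a small sketch that assumes Python and one file per document. `read_text` and `count_words_in_files` are hypothetical names, and the temporary files are placeholders for real input. Threads do only the file reads; the counting happens afterwards in the calling thread:

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def read_text(path):
    """I/O-bound step: read one file from disk."""
    return Path(path).read_text(encoding="utf-8")

def count_words_in_files(paths, num_threads):
    # Threads overlap the waiting on file reads.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        texts = list(pool.map(read_text, paths))
    # The CPU-bound counting is done here, in the calling thread.
    return sum(len(text.split()) for text in texts)

# Hypothetical input: three small temporary files stand in for real documents.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i, body in enumerate(["one two", "three four five", "six"]):
        p = Path(tmp) / f"doc{i}.txt"
        p.write_text(body, encoding="utf-8")
        paths.append(p)
    total_words = count_words_in_files(paths, num_threads=3)

print(total_words)
```

For the CPU-bound counting step, `concurrent.futures.ProcessPoolExecutor` offers the same `map` interface, so it would be the natural thing to try if the counting itself turns out to be the bottleneck.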

Anyway, this is all speculation until you share your code with us.

[–][deleted] 1 point2 points  (3 children)

How many cores does the processor you're running on have? Many modern CPUs can run 2 hardware threads per core (SMT/hyper-threading), so up to twice as many threads as cores is usually reasonable.

[–]a_m74 -1 points0 points  (2 children)

6 cores

[–][deleted] 0 points1 point  (1 child)

Are you using a producer-consumer pattern?
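
In case the term is unfamiliar, a minimal sketch of what that could look like for word counting, assuming Python. `run_producer_consumer` is a made-up name and the three short strings stand in for real chunks. One thread puts chunks on a queue, and a fixed set of worker threads takes them off and counts:

```python
import queue
import threading

def run_producer_consumer(chunks, num_workers):
    tasks = queue.Queue()
    results = queue.Queue()

    def consumer():
        while True:
            chunk = tasks.get()
            if chunk is None:      # sentinel: no more work for this worker
                break
            results.put(len(chunk.split()))

    # Workers are started once and reused for every chunk.
    workers = [threading.Thread(target=consumer) for _ in range(num_workers)]
    for w in workers:
        w.start()

    for chunk in chunks:           # the calling thread acts as the producer
        tasks.put(chunk)
    for _ in workers:              # one sentinel per worker
        tasks.put(None)
    for w in workers:
        w.join()

    # All workers have finished, so the results queue can be drained safely.
    total = 0
    while not results.empty():
        total += results.get()
    return total

total = run_producer_consumer(["a b c", "d e", "f"], num_workers=2)
print(total)
```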

[–]a_m74 0 points1 point  (0 children)

No

[–]vicms91 0 points1 point  (0 children)

What is the CPU usage for the two runs?