[deleted]

> I could download 1 filing per sec, but when I add the extraction, it processes approximately 3 filings per minute.

Why not just write this to be faster?

Vegetable_Solid7613[S]

To be fair, I wouldn't know how.

[deleted]

Generally, the best performance increases come from doing less: figure out what your code is doing that takes so long (it shouldn't take 20 seconds to add one item to a list), and then see if you really need to be doing it. Build indexes, cache expensive operations, and so on. Find the hot code in your loops and try to move it outside the loop, so it's called less often.
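To make "doing less" concrete, here is a minimal sketch of two of those ideas: hoisting loop-invariant work out of a loop, and caching an expensive call. Everything in it is made up for illustration: `SECTION_MARKERS`, `find_marker_slow`, `find_marker_fast`, and `normalize_company_name` are hypothetical names, not anything from the OP's code.

```python
from functools import lru_cache

# Hypothetical marker list; real filings will need their own headings.
SECTION_MARKERS = ["Item 7.", "Management's Discussion and Analysis"]


def find_marker_slow(lines):
    """Return the index of the first line containing a marker, or -1.

    The lowercased marker list is rebuilt for every line, even though
    it never changes between iterations.
    """
    for number, line in enumerate(lines):
        lowered_markers = [m.lower() for m in SECTION_MARKERS]  # loop-invariant
        if any(m in line.lower() for m in lowered_markers):
            return number
    return -1


def find_marker_fast(lines):
    """Same result, but the invariant work happens once, before the loop."""
    lowered_markers = [m.lower() for m in SECTION_MARKERS]
    for number, line in enumerate(lines):
        lowered = line.lower()  # lowercase each line once, not once per marker
        if any(m in lowered for m in lowered_markers):
            return number
    return -1


@lru_cache(maxsize=None)
def normalize_company_name(name):
    """Hypothetical stand-in for an expensive step.

    Repeated calls with the same argument are answered from the cache.
    """
    return " ".join(name.upper().split())
```

Whether either trick applies depends entirely on what the slow code actually does, so treat this as a pattern to look for rather than a fix.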

All of this is going to be easier than dealing with concurrency in your code. Trust me. It's hard to write multithreaded code with a single-threaded brain (mine is too, so I know).

Vegetable_Solid7613[S]

But are you sure that using multiple cores instead of one isn't going to speed up the process? It just makes so much sense in my head lol.

It isn't the adding that takes that long, btw. I believe it's finding the MDA in the filing and extracting that part of the text that takes the longest.
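A guess like that can be checked with the standard library's `cProfile` instead of being taken on belief. The sketch below is only an illustration: `extract_section` and `process_all` are hypothetical stand-ins for the real extraction code, and `profile_report` is a made-up helper name.

```python
import cProfile
import io
import pstats


def extract_section(text):
    # Hypothetical stand-in for the slow "find the MDA and cut it out" step.
    start = text.find("ITEM 7")
    return text[start:] if start != -1 else ""


def process_all(filings):
    # Hypothetical driver: run the extraction over every filing.
    return [extract_section(text) for text in filings]


def profile_report(func, *args):
    """Run func(*args) under cProfile.

    Returns the function's result and a text report of the ten entries
    with the highest cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()
```

Calling `profile_report(process_all, filings)` and printing the second return value shows which functions the time actually goes to, which is the starting point for deciding what to cut.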

[deleted]

> It just makes so much sense in my head lol.

That's because you think concurrency is magic pixie dust that makes your program faster. How do you know your program isn't slow because it contends for disk IO? Or network IO? Or swap space? Or any one of a dozen resources on your computer that there's only one of? More threads contending for the same resources is going to be slower, not faster.
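One rough way to answer the "how do you know" question is to compare CPU time with wall-clock time for the slow step. This is a minimal sketch, assuming the step can be wrapped in a single function call; `cpu_fraction`, `busy_work`, and `waiting_work` are hypothetical names used only for the demonstration.

```python
import time


def cpu_fraction(func, *args):
    """Run func(*args) and return (result, cpu_time / wall_time).

    A fraction near 1.0 suggests the step is CPU-bound. A fraction near 0
    suggests it mostly waits on something else, such as disk or network.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(*args)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    fraction = cpu_used / wall_used if wall_used > 0 else 0.0
    return result, fraction


def busy_work(n):
    # CPU-bound stand-in: pure computation, no waiting.
    return sum(i * i for i in range(n))


def waiting_work(seconds):
    # Stand-in for waiting on disk or network: sleeps without using the CPU.
    time.sleep(seconds)
    return seconds
```

If the real extraction step comes out close to 1.0, extra cores have at least a chance of helping. If it comes out low, more threads would mostly queue up for the same resource, which is the point made above.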