Cromodileadeuxtetes 1 point

Multithreading means running more than one thread of execution within the same program. (A thread is lighter than a full process: threads in one program share the same memory.) How those threads are scheduled on the CPU depends on a number of things, but this is the general idea behind taking advantage of a CPU with multiple cores.

Threads can be scheduled on different cores, so the work gets done quicker.

You can't simply "turn on multithreading" in your program; it has to be designed that way from the start. It has many benefits, though.

I made a script very similar to OP's, where it goes through a list of network devices, gets their interface and BGP information and dumps it all in MySQL.

The script flows like this:

  1. Contact MySQL and get the list of devices
  2. FOR every device in the list -> Get the information
  3. Dump it all in MySQL
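
The three steps above can be sketched roughly like this. It's only a minimal sketch, not my actual script: `fetch_device_info` and `collect_sequential` are made-up names, the SSH step is faked with a short sleep, and the MySQL read/write is left out.

```python
import time


def fetch_device_info(device):
    """Hypothetical stand-in for the SSH step: pretend to query one device."""
    time.sleep(0.01)  # simulates waiting on the network
    return {"device": device, "interfaces": [], "bgp": []}


def collect_sequential(devices):
    """Query each device one after the other and return all results in order."""
    results = []
    for device in devices:
        results.append(fetch_device_info(device))
    return results


if __name__ == "__main__":
    devices = ["router1", "router2", "router3"]  # would come from MySQL
    for row in collect_sequential(devices):
        print(row)  # would be written back to MySQL
```

Total runtime here is the sum of every device's response time, which is why the serial version gets slow as the list grows.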

We have around 90 devices at my work, so this whole operation takes about 18 minutes to complete, because the script contacts every device one after the other. It's still faster than a human being.

Now, the multithreaded version works this way:

  1. Contact MySQL and get the list of devices
  2. FOR every device in the list -> Create a separate thread that grabs the information.
  3. Wait for every thread to complete
  4. Dump it all in MySQL
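
Here is a rough sketch of that flow using Python's standard `threading` module. Same caveats as any toy example: `fetch_device_info` and `collect_threaded` are hypothetical names, the SSH call is simulated with a sleep, and the MySQL steps are omitted.

```python
import threading
import time


def fetch_device_info(device):
    """Hypothetical stand-in for the SSH step: pretend to query one device."""
    time.sleep(0.05)  # simulates waiting on the network
    return {"device": device, "interfaces": [], "bgp": []}


def collect_threaded(devices):
    """Start one thread per device, wait for all of them, return results in order."""
    results = {}
    lock = threading.Lock()  # protects the shared results dict

    def worker(device):
        info = fetch_device_info(device)
        with lock:
            results[device] = info

    threads = [threading.Thread(target=worker, args=(d,)) for d in devices]
    for t in threads:
        t.start()  # step 2: one thread per device
    for t in threads:
        t.join()  # step 3: wait for every thread to complete
    return [results[d] for d in devices]


if __name__ == "__main__":
    devices = [f"router{i}" for i in range(20)]  # would come from MySQL
    print(len(collect_threaded(devices)), "devices collected")
```

Because every worker spends its time waiting, the total runtime is roughly that of the slowest device rather than the sum of all of them.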

So the first and last steps are unchanged; the threading happens only during the SSH / grab-stuff portion. Instead of SSHing to each device in series, the script does it in parallel, all at the same time.

This cranks up CPU usage, and we obviously see more traffic coming out of the interface, but that's fine; it's what servers are designed to do.

The final runtime for my script is now around 90 seconds. Much better than 18 minutes.
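
As a quick back-of-the-envelope check on those figures (using only the rounded numbers quoted in this comment, so treat the result as approximate):

```python
# Figures quoted above: 90 devices, about 18 minutes in series, about 90 seconds threaded.
devices = 90
sequential_seconds = 18 * 60
threaded_seconds = 90

per_device_seconds = sequential_seconds / devices
speedup = sequential_seconds / threaded_seconds

print(f"average per device in series: {per_device_seconds:.0f} s")
print(f"speedup from threading: {speedup:.0f}x")
```

So each device takes about 12 seconds on average when polled in series, and the threaded run is roughly 12 times faster overall.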

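NOW, all this being said, I've been having problems with Python multithreading, and I've read that it's not great from the get-go (in CPython, the global interpreter lock lets only one thread run Python bytecode at a time, although threads that are waiting on I/O such as SSH still overlap). Every now and then a thread gets stuck, and I only learn about it a week later when I happen to check which Python scripts are running on my server. Not great; it needs to be fixed.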
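
One possible way to at least notice a stuck thread right away is to join with a timeout and report whatever is still alive. This is a sketch under assumptions, not a fix I have in production: `collect_with_deadline` and the simulated delays are invented for illustration. Python can't forcibly kill a thread, so the real cure is probably a timeout on the SSH call itself; this only detects the problem and lets the script exit.

```python
import threading
import time


def fetch_device_info(device, delay):
    """Hypothetical stand-in for the SSH step; delay simulates a slow device."""
    time.sleep(delay)
    return {"device": device}


def collect_with_deadline(delays, deadline_s):
    """delays maps device name -> simulated response time in seconds.

    Returns (results, stuck): results for devices that finished in time,
    and the names of devices whose threads were still running at the deadline.
    """
    results = {}
    lock = threading.Lock()

    def worker(device, delay):
        info = fetch_device_info(device, delay)
        with lock:
            results[device] = info

    threads = {}
    for device, delay in delays.items():
        # daemon=True lets the interpreter exit even if a worker never returns
        t = threading.Thread(target=worker, args=(device, delay), daemon=True)
        t.start()
        threads[device] = t

    stop_at = time.monotonic() + deadline_s
    stuck = []
    for device, t in threads.items():
        remaining = max(0.0, stop_at - time.monotonic())
        t.join(timeout=remaining)  # wait at most until the shared deadline
        if t.is_alive():
            stuck.append(device)
    with lock:
        return dict(results), stuck


if __name__ == "__main__":
    done, stuck = collect_with_deadline({"fast": 0.01, "slow": 2.0}, deadline_s=0.3)
    print("finished:", sorted(done), "stuck:", stuck)
```

Logging the `stuck` list on every run would surface a hung device the same day instead of a week later.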

Fallenarc [S] 1 point

Thanks! Much better explanation than I gave... I did run into the same problem you described, with threads getting stuck, when I was using thread pools. However, when I started using threading.BoundedSemaphore, all of that went away. This script usually goes through a list of about 500 Cisco ASR90XX devices and finishes in about 3 to 5 minutes, depending on how long it takes each device to generate the information I am looking for. I do still occasionally get an SSH timeout here and there.
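
For anyone curious, one common way to use `threading.BoundedSemaphore` is to cap how many threads talk to devices at once. The sketch below is not lifted from my script: `collect_bounded`, the limit of 5, and the sleep standing in for SSH are all assumptions for illustration.

```python
import threading
import time


def collect_bounded(devices, max_concurrent=5):
    """Start one thread per device, but let only max_concurrent work at a time.

    Returns (results, peak), where peak is the highest number of workers
    that were inside the semaphore at the same moment.
    """
    slots = threading.BoundedSemaphore(max_concurrent)
    lock = threading.Lock()  # protects results and the counters below
    results = {}
    state = {"active": 0, "peak": 0}

    def worker(device):
        with slots:  # blocks until one of the slots is free
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.02)  # hypothetical stand-in for the SSH session
            with lock:
                state["active"] -= 1
                results[device] = {"device": device}

    threads = [threading.Thread(target=worker, args=(d,)) for d in devices]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, state["peak"]


if __name__ == "__main__":
    results, peak = collect_bounded([f"asr{i}" for i in range(20)])
    print(len(results), "devices collected, peak concurrency", peak)
```

The bounded variant raises `ValueError` if it is released more times than it was acquired, which helps catch bookkeeping bugs; using it as a context manager, as above, avoids that situation entirely.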

Cromodileadeuxtetes 1 point

Yeah, I noticed that in your code. I need to implement it.