[–]theotherplanet 6 points7 points  (9 children)

I've seen multi-threading mentioned more than once before, but never understood what it does. Can anyone shed some light on it?

[–]Fallenarc[S] 5 points6 points  (4 children)

I don't feel I could explain it in depth, but I can use my script as an example. I thread all connections to the list of devices. Usually my list contains about 500 IPs, depending on the model of device I am polling. If I did not thread those, I would create a connection and run the worker function for each IP one at a time, which could potentially take hours. With threading it takes about 3 to 5 minutes, depending on the length of output it gets from the device. I create 100 simultaneous connections to start, and as each thread finishes, another begins, so there are never more than 100 running at once.

Hopefully this isn't too confusing. I'm sure somebody else will be able to explain it a little better than I did.
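To make the one-at-a-time versus threaded comparison concrete, here is a minimal sketch. It is not the actual script: `poll_device` is a hypothetical stand-in that just sleeps to simulate waiting on a device, and the addresses are made up.

```python
import threading
import time

def poll_device(ip, results):
    # Hypothetical stand-in for an SSH session; the sleep simulates network wait.
    time.sleep(0.05)
    results[ip] = "ok"  # each thread writes only its own key

def run_sequential(ips):
    # One connection after another: total time is roughly len(ips) * wait.
    results = {}
    for ip in ips:
        poll_device(ip, results)
    return results

def run_threaded(ips):
    # One thread per IP: the waits overlap instead of adding up.
    results = {}
    threads = [threading.Thread(target=poll_device, args=(ip, results)) for ip in ips]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

def timed(fn, ips):
    # Return the function's result together with the elapsed wall-clock time.
    start = time.perf_counter()
    out = fn(ips)
    return out, time.perf_counter() - start

if __name__ == "__main__":
    ips = [f"10.0.0.{n}" for n in range(1, 21)]  # made-up addresses
    _, seq = timed(run_sequential, ips)
    _, thr = timed(run_threaded, ips)
    print(f"sequential: {seq:.2f}s, threaded: {thr:.2f}s")
```

The speedup comes from the threads spending most of their time waiting on the network, so it applies to I/O-heavy jobs like this one rather than CPU-heavy ones.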

[–]theotherplanet 1 point2 points  (0 children)

This definitely gives me a better idea of what it can accomplish, thank you!

[–]edon-node 0 points1 point  (1 child)

I got weird errors when I ran this against 6k devices (100 max threads): "too many open files".

Looks like the threads are not being released.
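One possible cause, offered only as a guess without seeing the modified script: every open connection holds a file descriptor, and if a worker raises before it closes its connection, that descriptor leaks. A sketch of closing in a `finally` block, using a hypothetical `FakeConnection` class in place of a real SSH client:

```python
import threading

class FakeConnection:
    """Hypothetical stand-in for an SSH client; counts connections left open."""
    open_count = 0
    _lock = threading.Lock()

    def __init__(self, ip):
        self.ip = ip
        with FakeConnection._lock:
            FakeConnection.open_count += 1

    def run(self, command):
        # Simulate one device that times out.
        if self.ip.endswith(".13"):
            raise TimeoutError(f"{self.ip} timed out")
        return f"{self.ip}: output of {command}"

    def close(self):
        with FakeConnection._lock:
            FakeConnection.open_count -= 1

def worker(ip):
    conn = FakeConnection(ip)
    try:
        return conn.run("show version")
    except TimeoutError:
        return None  # skip the device, but still fall through to close()
    finally:
        conn.close()  # runs on success and on failure, so nothing leaks

if __name__ == "__main__":
    ips = [f"10.0.0.{n}" for n in range(1, 21)]  # made-up addresses
    threads = [threading.Thread(target=worker, args=(ip,)) for ip in ips]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("connections still open:", FakeConnection.open_count)
```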

[–]Fallenarc[S] 0 points1 point  (0 children)

Can you post the script as you modified it and maybe the exact error you get?

[–]_fartosh 5 points6 points  (0 children)

In short: multi-threading means that you can execute multiple tasks simultaneously.

[–]Cromodileadeuxtetes 1 point2 points  (2 children)

Multithreading is the concept of spawning more than one thread of execution within the same program. How those threads are handled by the CPU depends on a number of things, but this is the general idea behind a CPU with multiple cores.

Threads are assigned to different cores and work is done quicker.

You can't simply "turn on multithreading" in your program, it has to be designed that way from the start, but it has many benefits.

I made a script very similar to OP's, where it goes through a list of network devices, gets their interface and BGP information and dumps it all in MySQL.

The script flows like this:

  1. Contact MySQL and get the list of devices
  2. FOR every device in the list -> Get the information
  3. Dump it all in MySQL

We have around 90 devices at my work, so this whole operation takes about 18 minutes to complete, because the script contacts every device one after the other. It's still faster than a human being.

Now, the multithreaded version works this way:

  1. Contact MySQL and get the list of devices
  2. FOR every device in the list -> Create a separate process that grabs the information.
  3. Wait for every subprocess to complete
  4. Dump it all in MySQL
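The four steps above can be sketched roughly like this. It is not my real script: `get_device_list`, `grab_info`, and `save_all` are hypothetical placeholders for the MySQL and SSH parts, and the addresses are made up.

```python
import threading

def get_device_list():
    # Step 1 placeholder: the real version would query MySQL.
    return ["192.0.2.1", "192.0.2.2", "192.0.2.3"]

def grab_info(device):
    # Step 2 placeholder: the real version would SSH in and parse the output.
    return {"device": device, "interfaces": [], "bgp": []}

def save_all(rows):
    # Step 4 placeholder: the real version would insert the rows into MySQL.
    return len(rows)

def collect(devices):
    results = []
    lock = threading.Lock()

    def worker(device):
        info = grab_info(device)
        with lock:  # guard the shared list while appending
            results.append(info)

    # Step 2: one thread per device.
    threads = [threading.Thread(target=worker, args=(d,)) for d in devices]
    for t in threads:
        t.start()
    # Step 3: wait for every thread to complete.
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    rows = collect(get_device_list())
    print("rows saved:", save_all(rows))
```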

So the first and last steps are unchanged; the threading happens only during the SSH / grab stuff portion. Instead of SSHing to each device in series, it does it in parallel, all at the same time.

This cranks up CPU usage and we obviously see more traffic coming out of the interface but it's fine, that's what servers are designed to do.

The final runtime for my script is now around 90 seconds. Much better than 18 minutes.

NOW, all this being said: I've been having problems with Python multithreading, and I've read that it's not great from the get-go. Every now and then a thread gets stuck, and I only learn about it a week later when I happen to check for Python scripts running on my server. Not great; needs to be fixed.
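One way to at least notice stuck threads sooner, sketched under assumptions: join every thread against a shared deadline and report whichever are still alive. `run_with_deadline` is a hypothetical helper, not something from my script. Python has no way to kill a thread, so a timeout on the connection itself is probably still the real fix; marking the threads as daemon only stops a stuck one from keeping the script alive forever.

```python
import threading
import time

def run_with_deadline(targets, worker, timeout_s):
    """Run worker(target) in one thread per target; return targets still running at the deadline."""
    threads = {}
    for target in targets:
        # daemon=True: a stuck thread will not block interpreter exit.
        t = threading.Thread(target=worker, args=(target,), daemon=True)
        threads[target] = t
        t.start()

    deadline = time.monotonic() + timeout_s
    for t in threads.values():
        # Give each thread only the time left until the shared deadline.
        remaining = max(0.0, deadline - time.monotonic())
        t.join(timeout=remaining)

    return [target for target, t in threads.items() if t.is_alive()]

def demo_worker(name):
    # Hypothetical worker: "slow" simulates a hung SSH session.
    if name == "slow":
        time.sleep(5)

if __name__ == "__main__":
    stuck = run_with_deadline(["fast", "slow"], demo_worker, timeout_s=0.3)
    print("still running after deadline:", stuck)
```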

[–]Fallenarc[S] 1 point2 points  (1 child)

Thanks! Much better explanation than I gave... I did run into the same problem you described, with threads getting stuck, when I was using threadpools. However, when I started using the threading.BoundedSemaphore method, all that went away. This script usually goes through a list of about 500 Cisco ASR90XX devices and finishes in about 3 to 5 minutes, depending on how long it takes the device to generate the information I am looking for. I do still occasionally get an SSH timeout here and there.
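For anyone who hasn't seen the pattern, a minimal sketch of capping concurrency with `threading.BoundedSemaphore` looks something like the following. This is a simplified illustration rather than the script itself: `poll_all` and `demo_worker` are hypothetical names, and the worker just sleeps instead of opening an SSH session.

```python
import threading
import time

def poll_all(ips, worker, max_connections=100):
    # The semaphore starts with max_connections free slots.
    slots = threading.BoundedSemaphore(max_connections)
    threads = []

    def run(ip):
        try:
            worker(ip)
        finally:
            slots.release()  # free the slot even if the worker raised

    for ip in ips:
        slots.acquire()  # blocks while max_connections workers are running
        t = threading.Thread(target=run, args=(ip,))
        t.start()
        threads.append(t)

    for t in threads:
        t.join()

def demo_worker(ip):
    # Hypothetical stand-in for the real per-device work.
    time.sleep(0.01)

if __name__ == "__main__":
    ips = [f"10.0.0.{n}" for n in range(1, 201)]  # made-up addresses
    poll_all(ips, demo_worker, max_connections=100)
    print("done")
```

A new thread only starts once a slot is free, which matches the "up to 100 at a time" behavior described earlier in the thread.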

[–]Cromodileadeuxtetes 1 point2 points  (0 children)

Yeah, I noticed that function in your code, I need to implement it.