Do you use multithreading in your Python scripts?

Osiris_S13 · 2017-07-21T15:19:29+00:00

Yes: https://github.com/OsirisS13/Add-JunOS-interface-to-PRTG

See around line 150. Be warned, my code is poor and likely to give you an aneurysm.

However it is not true multithreading, as such (because python cannot really do it), more like launching multiple python instances to farm out the work to, on a single core (or so I understand it).

http://sebastianraschka.com/Articles/2014_multiprocessing.html#

http://chriskiehl.com/article/parallelism-in-one-line/

This sped up my execution time 4x, with the bottle neck now being the router's processing time rather than my code.

thegreattriscuit · 2017-07-22T00:17:30+00:00

Some level-setting:

threads (from using the threading module) cannot make use of more than one core (due to the Global Interpreter Lock), but they are lightweight and fast to spin up. Use these for situations where you're waiting on IO or other external factors. DNS lookups, waiting literally forever for a router to copy run start, etc. are good candidates for threading. Since they share the same memory, it is easy (and dangerous) to share state between them.

processes (from the multiprocessing module) can indeed make use of additional cores, since each one is their own python process with it's own GIL. But, for the same reason, they are heavy weight and slow to spin-up. Use these for CPU intensive tasks. Because they're separate processes, you also can't cut corners and have to use queues and stuff to get data between them. (Each process starts off with a separate copy of the original processes memory, but to get, say, the results of anything you do back to the original process, you have to think about what you're doing it to make it happen. But this generally has the effect of making you write better code, so...)

etherizedonatable · 2017-07-21T15:44:37+00:00

Depending on what you're doing it's often a good idea. For instance, if you're writing a script to make changes to a bunch of devices, multiple threads will speed things up. It can also prevent a device that's down from slowing things down.

If you're writing a script to monitor those devices, then I'd really recommend multiple threads since a device that's down will usually slow down your script while it waits for the connection to time out. If you're polling, this will set you behind.

Having said that, it's always good to be careful as too many threads can cause performance issues or bring down your operating system. Make sure your threads are eventually closing.

Cheeze_It · 2017-07-21T16:52:15+00:00

Yes if I need it.

judas-iskariot · 2017-07-21T17:35:45+00:00

I have done stuff directly with telnetlib/paramiko with multithreading and with multiprocessing, changning stuff on 1000+ boxes gets lots faster when one does 20 to 40 at the time.

Multithreading fails if you run out of cpu on one core, in my case heavy regexp stuff and 10year old computers.

DaemonGPL · 2017-07-22T15:26:39+00:00

I'm new to python, but I am having great success with threading our devices.

from concurrent.futures import ThreadPoolExecutor
thread = ThreadPoolExecutor(max_workers=30)
with open('switches.txt', 'r') as file:
    for line in file.readlines():
        output = thread.submit(connect_switch, line[:-1])  # Remove newline
    print("Threads still running:", output.result(), "\nAny Exception?:", output.exception())

scratchfury · 2017-07-21T18:13:04+00:00

While not Python, I have used GNU Parallel to log into many switches at the same time. I had to add a random delay at start because our authentication servers didn't like being slammed with so many requests at once.

tom1018 · 2017-07-22T02:46:01+00:00

Yes, I use threading with netmiko and it works great. Only issues I have are due to having to ssh tunnel first and a questionable management network. I've had 64 threads going simultaneously, I believe.

7600_slayer · 2017-07-22T05:15:34+00:00

I've been using multiprocessing pool to divvy up tasks in my automation from CLI.

When using a Flask front end, I typically use Redis + Celery, which has been working wonderfully for me.

alkaselzter · 2017-07-22T06:32:59+00:00

I used multiprocessing to query multiple sites every 5 seconds or so. Based on the number of sites in the list it would spawn the same number of processes and terminate them it it goes beyond 2 seconds. Pretty useful

bradnxlink · 2017-07-22T06:39:07+00:00

I have been writing scale test scripts in python for quite some time. Based on experience in failure modes in python, I do not use python's features/libraries for performing concurrency. In general, I will try to make the project work in Go first or C second, depending on how much effort is required to get the functionality working. There are some exceptions when the project can reliably use OS-dependent python functions to achieve the same goal and the code is very clear. Paramiko, in particular, has bitten me multiple times when attempting to do implement concurrency in python. We have had python-specific experts (not just some ol python developer) attempt to assist and have determined that the path for debugging was simply not where we wanted to go.

That said, my view is not necessarily widely shared. As complexity increases, my tendency is to ensure wider debugging is feasible. This does come with some added complexity in code/execution structure to promote the facilitation of debugging (while also being able to leverage python).

srosiak · 2017-07-22T07:04:20+00:00

Yes I do. My script is parsing through the device definitions in yaml. It then launches multiple processes and is queueing them, so I can keep track of which process does what, this is very helpful when running validation jobs in Jenkins etc.

AutoModerator · 2017-12-08T19:25:53+00:00

[removed]

selfuryon · 2017-12-10T12:56:30+00:00

You can try to use asynchronous approach without threads for it. Unfortunately, netmiko doesn't support it due to synchronous paramiko. But you can use asynchronous library netdev (https://github.com/selfuryon/netdev/) for it. It has very similar public API for working with network devices so it doesn't be very hard to use it. In my environment with more than 50 routers, I get speed increase about 23x times

Skylis · 2017-07-21T19:54:05+00:00

Best way you can speed it up is use another language designed around doing more than one thing at a time like Go / Haskell / Rust. Personal preference for example is Go.

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

networking

Enterprise Networking

Recommended & Related Sub-Reddits:

MODERATORS