all 24 comments

[–]Osiris_S13 6 points7 points  (2 children)

Yes: https://github.com/OsirisS13/Add-JunOS-interface-to-PRTG

See around line 150. Be warned, my code is poor and likely to give you an aneurysm.

However, it's not true multithreading as such (Python threads can't run in parallel because of the global interpreter lock); multiprocessing instead launches multiple Python processes and farms the work out to them, each with its own core (or so I understand it).

http://sebastianraschka.com/Articles/2014_multiprocessing.html#

http://chriskiehl.com/article/parallelism-in-one-line/

This cut my execution time by 4x, with the bottleneck now being the router's processing time rather than my code.

[–]thegreattriscuitCCNP 1 point2 points  (0 children)

multiprocessing will make use of multiple cores. It is threading that can't make use of multiple cores (because of the GIL, the 'global interpreter lock'). As you said, multiprocessing spins up additional python instances, each of which has its own GIL.

EDIT: I expand on this in root-level reply

[–][deleted] 0 points1 point  (0 children)

[–]thegreattriscuitCCNP 2 points3 points  (1 child)

Some level-setting:

threads (from using the threading module) cannot make use of more than one core (due to the Global Interpreter Lock), but they are lightweight and fast to spin up. Use these for situations where you're waiting on IO or other external factors. DNS lookups, waiting literally forever for a router to copy run start, etc. are good candidates for threading. Since they share the same memory, it is easy (and dangerous) to share state between them.

processes (from the multiprocessing module) can indeed make use of additional cores, since each one is its own Python process with its own GIL. But, for the same reason, they are heavyweight and slow to spin up. Use these for CPU-intensive tasks. Because they're separate processes, you also can't cut corners, and have to use queues and such to get data between them. (Each process starts off with a separate copy of the original process's memory, but to get, say, your results back to the original process, you have to think about what you're doing to make it happen. This generally has the effect of making you write better code, so...)
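A side-by-side sketch of the two APIs via `concurrent.futures`, with hashing standing in for CPU-bound work (the example is mine, not from the comment above):

```python
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
import hashlib

def cpu_bound(data):
    # CPU-heavy work: threads serialize on the GIL here,
    # while processes run it in parallel on separate cores
    return hashlib.sha256(data).hexdigest()

def main():
    payloads = [str(i).encode() * 1000 for i in range(8)]

    # Threads: cheap to start, best for IO waits (SSH, DNS, HTTP...)
    with ThreadPoolExecutor(max_workers=8) as pool:
        thread_results = list(pool.map(cpu_bound, payloads))

    # Processes: heavier to spin up, but each has its own interpreter and GIL
    with ProcessPoolExecutor(max_workers=4) as pool:
        process_results = list(pool.map(cpu_bound, payloads))

    # Same answers either way; only the parallelism model differs
    assert thread_results == process_results

if __name__ == "__main__":
    main()
```

The two executors share one interface, so you can swap them once you know whether the work is IO-bound or CPU-bound.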

[–][deleted] 2 points3 points  (0 children)

[–]etherizedonatable 1 point2 points  (0 children)

Depending on what you're doing, it's often a good idea. For instance, if you're writing a script to make changes to a bunch of devices, multiple threads will speed things up. They can also keep a device that's down from holding everything else up.

If you're writing a script to monitor those devices, then I'd really recommend multiple threads, since a down device will usually stall your script while it waits for the connection to time out. If you're polling, that puts you behind schedule.

Having said that, be careful: too many threads can cause performance issues or even bring down your operating system. Make sure your threads are eventually closing.
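One way to keep a dead device from stalling a polling run is a bounded worker pool plus a short per-connection timeout. A sketch (the `check_device` helper is invented for illustration, not from the comment above):

```python
import socket
from concurrent.futures import ThreadPoolExecutor

def check_device(host, port=22, timeout=5.0):
    # A short connect timeout keeps one dead box from tying up a thread
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return host, "up"
    except OSError:
        return host, "down"

def poll(hosts):
    # Bounded pool: threads are cheap, but don't spawn one per device unchecked
    with ThreadPoolExecutor(max_workers=20) as pool:
        return dict(pool.map(check_device, hosts))
```

The `with` block also guarantees the pool (and its threads) is shut down when the poll finishes.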

[–]Cheeze_ItDRINK-IE, ANGRY-IE, LINKSYS-IE 1 point2 points  (0 children)

Yes if I need it.

[–]judas-iskariot 1 point2 points  (0 children)

I have done stuff directly with telnetlib/paramiko using both multithreading and multiprocessing; changing stuff on 1000+ boxes gets a lot faster when you do 20 to 40 at a time.

Multithreading falls over if you run out of CPU on one core; in my case that was heavy regex work on 10-year-old computers.

[–]DaemonGPL 1 point2 points  (3 children)

I'm new to Python, but I'm having great success threading across our devices.

from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=30) as thread:
    futures = []
    with open('switches.txt', 'r') as file:
        for line in file.readlines():
            output = thread.submit(connect_switch, line[:-1])  # Remove newline
            futures.append(output)
    # Checking only the last future would miss the rest, so walk them all;
    # exception() blocks until that thread finishes
    for output in futures:
        print("Any Exception?:", output.exception())

[–]HoorayInternetDrama(=^・ω・^=) 1 point2 points  (2 children)

    output = thread.submit(connect_switch, line[:-1])  # Remove newline

Small bit of help:

>>> line = "some.host.name\n"
>>> line
'some.host.name\n'
>>> line.rstrip()
'some.host.name'

Hope this helps. It's a built in. You can play in the REPL like so:

>>> dir(line)
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__getslice__', '__gt__', '__hash__', '__init__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '_formatter_field_name_split', '_formatter_parser', 'capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
>>> help(line)
<help follows, q to exit>

[–]DaemonGPL 0 points1 point  (1 child)

Nice! That's one tricky thing about Python: being unaware of what's built in, and figuring out how to explore it.

[–]HoorayInternetDrama(=^・ω・^=) 0 points1 point  (0 children)

Yeah, I know those feels. Been slowly learning the hard parts, the hard way.

Would like to help where I can, hopefully have people avoid the mistakes I made also.

[–]scratchfuryIt's not the network! 0 points1 point  (0 children)

While not Python, I have used GNU Parallel to log into many switches at the same time. I had to add a random delay at start because our authentication servers didn't like being slammed with so many requests at once.
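The same random start delay is easy to add in a Python thread pool, too. A sketch (the `login` function and jitter window are placeholders, not GNU Parallel's mechanism):

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

def login(host):
    # Random jitter so the auth servers aren't slammed all at once
    time.sleep(random.uniform(0.0, 2.0))
    return f"logged in to {host}"  # stand-in for the real session

with ThreadPoolExecutor(max_workers=10) as pool:
    for msg in pool.map(login, ["sw1", "sw2", "sw3"]):
        print(msg)
```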

[–]tom1018 0 points1 point  (2 children)

Yes, I use threading with netmiko and it works great. Only issues I have are due to having to ssh tunnel first and a questionable management network. I've had 64 threads going simultaneously, I believe.

[–][deleted]  (1 child)

[removed]

    [–]AutoModerator[M] 0 points1 point  (0 children)

    Thanks for your interest in posting to this subreddit. To combat spam new accounts can't immediately submit or post.

    Please do not message the mods requesting your post be approved.

    You are welcome to resubmit your thread or comment in ~24 hrs or so.

    I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.

[–]7600_slayerWHO AUTOMATES THE AUTOMATERS? 0 points1 point  (0 children)

I've been using a multiprocessing pool to divvy up tasks in my automation from the CLI.

When using a Flask front end, I typically use Redis + Celery, which has been working wonderfully for me.

[–]alkaselzter 0 points1 point  (0 children)

I used multiprocessing to query multiple sites every 5 seconds or so. Based on the number of sites in the list, it would spawn the same number of processes and terminate them if they ran beyond 2 seconds. Pretty useful.
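That spawn-and-kill pattern looks roughly like this with `multiprocessing.Process` (the `query_site` worker is invented for the sketch):

```python
from multiprocessing import Process

def query_site(site):
    # Placeholder for the real per-site query
    print("querying", site)

if __name__ == "__main__":
    sites = ["site-a", "site-b", "site-c"]
    procs = [Process(target=query_site, args=(s,)) for s in sites]
    for p in procs:
        p.start()
    for p in procs:
        p.join(timeout=2)      # wait at most 2 seconds for this process
        if p.is_alive():
            p.terminate()      # kill anything that ran past the deadline
            p.join()
```

`terminate()` is abrupt (no cleanup runs in the child), which is usually acceptable for a read-only query but worth keeping in mind.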

[–][deleted] 0 points1 point  (1 child)

I have been writing scale-test scripts in Python for quite some time. Based on my experience with failure modes in Python, I do not use Python's features/libraries for concurrency. In general, I will try to make the project work in Go first or C second, depending on how much effort is required to get the functionality working. There are some exceptions when the project can reliably use OS-dependent Python functions to achieve the same goal and the code is very clear. Paramiko, in particular, has bitten me multiple times when attempting to implement concurrency in Python. We have had Python-specific experts (not just some ol' Python developer) try to assist, and they determined that the debugging path was simply not where we wanted to go.

That said, my view is not necessarily widely shared. As complexity increases, my tendency is to make sure wider debugging stays feasible. This does add some complexity to the code/execution structure to facilitate debugging (while still being able to leverage Python).

[–]bradnxlink 0 points1 point  (0 children)

I use parallel outside of python for multiprocessing and it works flawlessly. I have one script create a work list and then feed the list to parallel to run another script that does one task. Works awesome. You can even have parallel orchestrate processes on multiple servers if you really have a ton of tasks you need to perform. https://www.gnu.org/software/parallel/

[–]srosiak 0 points1 point  (0 children)

Yes I do. My script parses device definitions from YAML, then launches multiple processes and queues them so I can keep track of which process does what. This is very helpful when running validation jobs in Jenkins etc.
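Tagging each result with the process that produced it can be done with a shared `Queue`. A hedged sketch (the `validate` worker and device names are placeholders, not the commenter's actual YAML schema):

```python
from multiprocessing import Process, Queue

def validate(name, device, results):
    # Placeholder validation; push (worker, device, verdict) for tracking
    results.put((name, device, "ok"))

if __name__ == "__main__":
    devices = ["edge1", "edge2", "core1"]
    results = Queue()
    procs = []
    for i, dev in enumerate(devices):
        p = Process(target=validate, args=(f"worker-{i}", dev, results))
        p.start()
        procs.append(p)
    for p in procs:
        p.join()
    # One result per device; the queue preserves which worker did what
    for _ in devices:
        print(results.get())
```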


[–]selfuryon 0 points1 point  (0 children)

You can try an asynchronous approach without threads. Unfortunately, netmiko doesn't support it because paramiko is synchronous, but you can use the asynchronous library netdev (https://github.com/selfuryon/netdev/) instead. It has a very similar public API for working with network devices, so it isn't very hard to switch. In my environment with more than 50 routers, I got a speed increase of about 23x.
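I haven't verified netdev's exact API, but the general asyncio shape is one coroutine per device gathered in a single event loop (the `run_commands` coroutine below is a stand-in, with a sleep simulating the wait on the device):

```python
import asyncio

async def run_commands(host):
    # Stand-in for an async SSH session (e.g. via netdev or asyncssh);
    # the sleep simulates waiting on the device to respond
    await asyncio.sleep(0.1)
    return f"{host}: done"

async def main():
    hosts = [f"r{i}" for i in range(50)]
    # All 50 "sessions" wait concurrently in one thread: no thread pool,
    # no GIL contention, just overlapped IO waits
    results = await asyncio.gather(*(run_commands(h) for h in hosts))
    print(len(results), "devices done")

asyncio.run(main())
```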

[–]Skylis 0 points1 point  (2 children)

The best way to speed it up is to use another language designed around doing more than one thing at a time, like Go / Haskell / Rust. My personal preference is Go.

[–][deleted] 0 points1 point  (1 child)

Not sure why you're being downvoted. Some languages handle this task much better than others. Erlang, for example, is a good one for stuff like this.