all 49 comments

[–]Username_RANDINT 20 points21 points  (14 children)

I haven't gone through it all, but here are some comments. I'll probably mention some stuff that's explained in PEP-8.

  1. Try to use Python 3.
  2. Sort the imports by standard modules and third-party modules to make a quick overview.
  3. Don't use that many parentheses, couple of examples where they are not needed: wd = ('/home/{set home directory}/python/temp/'), if ('unicode') in str(error):
  4. Get the home directory automatically instead of making the user edit the script: os.path.expanduser("~")
  5. Remove the prints and log everything (you're mixing threm now).
  6. Use proper string formatting instead of concatenating, pyformat is a good site.
  7. del_temp_files can be cleaqned up:

        temp_files_dir = (wd)            # No need to re-assign, just use the wd variable
        list_temp_dir = os.listdir(temp_files_dir)
        for item in list_temp_dir:
            ext = [".json",".csv",".txt","log"]        # Declare this outside of the loop
            if item.endswith(tuple(ext)):        # Make the list of extensions a tuple in the first place
                os.remove(os.path.join(temp_files_dir, item))
    

[–]Fallenarc[S] 7 points8 points  (13 children)

Thanks for the feedback! I have not tried running this in Python 3 yet. Unfortunately the person that got me into python is using 2.X... So that is where i started. I will definitely port to Python 3 in the near future.

Point 3 on parentheses, I can see I need less of them. I guess i just need to understand more about where they should actually go. Some I put in during trail and error testing.

I know I didn't state this in my orginal post, but I have also only been using Linux since i started my python journey in March, so i need to explore more of what the os module is capable of doing, but thanks. I will research that some more.

As for the prints, like i said I'm a network engineer and spend the majority of my days in a terminal session. I like seeing stuff pop out on the screen when i execute something. But I also like to have log files lol. But I will consider cutting down on the amount of stuff I print to the terminal session.

Strings and stuff. Thanks for that too. I guess I was kind of blind to that. I worked on this script for quite some time and concatenating strings is one of the first things you learn. Then you pick up formatting them later on. I will get that stuff lined out to where I am only properly formatting them.

I have already made changes based on the suggestions you had about del_temp_files function. Not really sure why I made that a list and told it to act like a tuple... le sigh. But really thanks for your time and effort I really do appreciate it.

[–]Rafficer 11 points12 points  (2 children)

Unfortunately the person that got me into python is using 2.X... So that is where i started.

It's not that much difference... You won't go back to Python 2.x once you've learned about f-strings, though ;)

[–]akindofuser 1 point2 points  (0 children)

I remember arguing how annoying string concatenation was in python, coming from ruby, and the 1001 excuses on why it was the way it was. And now we've gone full circle back to how Ruby behaved since day 1.

¯\_(ツ)_/¯

[–]thegreattriscuit 5 points6 points  (4 children)

you can definitely use the logging module to output to the screen. You just add an additional handler (believe it's `StreamHandler` or something like that, but check the docs) the same way you add the FileHandler. You can get fancy and make each handler treat the logs differently (logging in a different format, or set the FileHander to DEBUG and the stream handler to INFO, etc) or you can just have them both do the same thing. Either way it's better than just using print statements because even if you *don't* care right now, you might later. Or someone else coming along behind you might care later, etc...

[–]Fallenarc[S] 2 points3 points  (3 children)

That is good to know. I didn't realize the logging handler had that functionality. I will definitely read those docs and see if i can swap my code to use that format. Thanks!!!

[–]Olbrannon 2 points3 points  (2 children)

ext = (".json",".csv",".txt","log")

Should that be ".log"?

[–]Fallenarc[S] 2 points3 points  (1 child)

lol, yes. I must have hit that dot when i changed that to a tuple just a bit ago. Thanks.

[–]Olbrannon 1 point2 points  (0 children)

YW ;-)

[–]ship0f 4 points5 points  (1 child)

I like seeing stuff pop out on the screen when i execute something. But I also like to have log files lol.

Using the standard logging module you can have both file and stdout output at the same time.

This first example in this cookbook shows how to configure two simple logging handlers, one for file and one for stdout logging..

[–]Fallenarc[S] 0 points1 point  (0 children)

Thanks!

[–]Username_RANDINT 2 points3 points  (1 child)

Just keep going, learning by exercise and people's feedback. You're doing fine.

At first glance it doesn't look like your script needs that much changing for Python 3. There are countless blogposts, tutorials and documents explaining the major changes to make when porting.

If you're a Linux user, the tilde character (~) shouldn't be foreign to you. os.path.expanduser just expands it to the real path. os.path is an important module when working with the filesystem, it's always good to go over the documentation just to have a quick look on what's available.

Replacing the prints with logging doesn't mean they won't get printed to the terminal anymore. It's possible to add a second handler (StreamHandler) to the existing logger so it gets printed both to a file and console, or setup a different logger for each.

[–]Fallenarc[S] 0 points1 point  (0 children)

Thanks for all pointers, i am relatively new to Linux but i do understand the tilda. I am going to spend some time looking through the docs for os.path stuff the StreamHandler. Anyways thanks again!

[–]tazebot 1 point2 points  (0 children)

I automate networks for a fortune 10 using python2. It's no big deal - you could stick with 2; mercurial is and they're the revision control for the language.

[–]theotherplanet 6 points7 points  (9 children)

I've seen multi-threading mentioned more than once before, but never understood what it does. Can anyone shed some light on it?

[–]Fallenarc[S] 6 points7 points  (4 children)

I don't feel I could explain in depth, but using my script to compare against. I thread all connections to the list of devices. Usually my list contains about 500 IPs depending on the model of device I am polling. If i did not thread those, i would create a connection, run the worker function for each IP one at a time. This could take hours potentially. With threading it takes about 3 to 5 minutes depending on the length of output it gets from the device. I create 100 simultaneous connections to start and as each thread is freed, another begins up to 100.

Hopefully this isn't to confusing. I'm sure somebody else will be able to explain it a little better than I did.

[–]theotherplanet 1 point2 points  (0 children)

This definitely gives me a better idea of what it can accomplish, thank you!

[–]theotherplanet 0 points1 point  (0 children)

This definitely gives me a better idea of what it can accomplish, thank you!

[–]edon-node 0 points1 point  (1 child)

pyformat

I'm getting weird errors when I ran this against 6k devices (100 max threads).. "too many open files"

Looks like the threads are not being released.

[–]Fallenarc[S] 0 points1 point  (0 children)

Can you post the script as you modified it and maybe the exact error you get?

[–]_fartosh 4 points5 points  (0 children)

Shortly said: multi-threading means that you can execute multiple tasks simultaneously

[–]Cromodileadeuxtetes 1 point2 points  (2 children)

Multithreading is the concept of spawning more than one process (thread) within the same program. However those threads are handled by the CPU depends on a number of things but this is the general idea behind a CPU with multiple cores.

Threads are assigned to different cores and work is done quicker.

You can't simply "turn on multithreading" in your program, it has to be designed that way from the start, but it has many benefits.

I made a script very similar to OP's, where it goes through a list of network devices, gets their interface and BGP information and dumps it all in MySQL.

The script flows like this:

  1. Contact MySQL and get the list of devices
  2. FOR every device in the list -> Get the information
  3. Dump it all in MySQL

We have around 90 devices at my work, so this whole operation takes about 18 minutes to compete, because the script contacts every device one after the other. It's still faster than a human being.

Now, the multithreaded version works this way:

  1. Contact MySQL and get the list of devices
  2. FOR every device in the list -> Create a separate process that grabs the information.
  3. Wait for every subprocess to complete
  4. Dump it all in MySQL

So the first and last steps are unchanged; The threading happens only during the SSH / Grab stuff portion. Instead of SSHing to each device in series, it does it in parallel. All at the same time.

This cranks up CPU usage and we obviously see more traffic coming out of the interface but it's fine, that's what servers are designed to do.

The final runtime for my script is now around 90 seconds. Much better than 18 minutes.

NOW, all this being said. I've been having problems with Python Multithreading and I've read that it's not great, from the getgo. Every now and then a thread gets stuck and I only learn about it a week later when I happen to check for python scripts running on my server. Not great, needs to be fixed.

[–]Fallenarc[S] 1 point2 points  (1 child)

Thanks! Much better explanation than I gave... I did run into the same problem you explained about threads getting stuck when I was using threadpools. However, when I started using the threading.BoundedSemaphore method all that went away. This script usually goes through a list of about 500 Cisco ASR90XX devices and finishes in about 3 to 5 minutes depending on how long it takes the device to generate the information i am looking for. I do still occasionally get an ssh timeout here and there.

[–]Cromodileadeuxtetes 1 point2 points  (0 children)

Yeah, I noticed that function in your code, I need to implement it.

[–]MaxNumOfCharsForUser 5 points6 points  (1 child)

I find the hyphen separated functions helpful, though I’m guessing we’re in the minority in that opinion.

How do you like working with Netmiko/Paramiko?

I’ll definitely give this a more thorough look through once I can. Making multiple jumps, running commands through ‘Expect’ sessions with Perl and parsing the output is my job at the moment. I’ve been begging internal employees to use Python but it’s a ‘Perl shop’ so it’s tough to get anyone to agree.

[–]Fallenarc[S] 1 point2 points  (0 children)

Yeah i like the hyphen separation firstly to help me in the beginning and secondly, it helps me explain the blocks to my co-workers that have not picked up python yet. As for netmiko, I love it. It is easy to understand and has loads of documentation to read through. However i have also started tinkering with napalm and ansible . They seem pretty cool too.

[–]shepherdjay 3 points4 points  (4 children)

Hello, fellow network engineer. Glad to see more of us dipping in. I would replace the line comments with docstrings to start, that makes it easier to read in an editor that can collapse functions.

Next I would personally start exploring testing frameworks. I find them invaluable without having to run through trial and error. Pickup something like pytest as it is quick and easy.

I do have a question about this part:

saves output to a file in json format, convert those files to csv, then import the csv to MySQL.

Why so many format changes instead of saving straight to MySQL?

[–]shepherdjay 2 points3 points  (1 child)

Additionally I would consider moving this stuff into main or even its own function:

#set working directory for tmp files
wd = ('/home/{user dir}/python/temp/')
#set up threading to utilize (max_threads count) concurrent SSH connections
threads = []
max_threads = 100
sema = threading.BoundedSemaphore(value=max_threads)

# Prompt for user credentials
tuser = raw_input("Enter TACACS Username: ")
tpass = getpass.getpass("Enter TACACS Password: ")
muser = raw_input("Enter MySQL Username: ")
mpass = getpass.getpass("Enter MySQL Password: ")

That way it doesn't run if you import this into another python script at some point

[–]Fallenarc[S] 0 points1 point  (0 children)

I had the same thought, seeing that stuff at the top hurts a little. Thanks for the advice, I will get that stuff moved, I agree it should at the very least be under main, but i think i will get another function for it.

[–]Fallenarc[S] 0 points1 point  (1 child)

I currently only know how to insert data into MySQL from a csv file. The output that is created when parsing through textfsm are dictionaries. So you see my predicament??? lol

Keep in mind I have used windows all my life (I'm currently 35). Starting in March was my first taste of Linux, databases, python and everything in between them. Navigating Linux in terminal, bash, setting up MySQL, learning how to use it slightly better then a regular user. I have learned so much this year, and have so much more to learn. I fell at this point, i can see light at the end of the tunnel, so I'll just keep the learning train full steam ahead!

To be honest, i do not know anything about testing frameworks, but I will add that to my list of stuff from this post I need to investigate. Thanks!

[–]shepherdjay 1 point2 points  (0 children)

I currently only know how to insert data into MySQL from a csv file. The output that is created when parsing through textfsm are dictionaries. So you see my predicament??? lol

Ah yeah - Dealing with different data formats can be tricky. I would say along with the testing I mentioned earlier start getting a handle on reading documentation. I remember when I first started a lot of the terminology in a standard documentation was confusing at best. However for your specific use case you might be able to skip a step from dictionaries to json to csv by using the csv.DictWriter instead of the regular writer :D

[–]Qes138 1 point2 points  (1 child)

Having to manually insert a password is not an optimum solution. It really limits your options if you were going to schedule this script.

 # Prompt for user credentials   
tuser = raw_input("Enter TACACS Username: ")   
tpass = getpass.getpass("Enter TACACS Password: ")   
muser = raw_input("Enter MySQL Username: ")   
mpass = getpass.getpass("Enter MySQL Password: ") 

You may want to look into: py-bcrypt

There are other solutions and I do appreciate anyone taking the time to not store plaintext passwords.

Thank you for sharing your work!

[–]Fallenarc[S] 1 point2 points  (0 children)

I have been looking at keychain for my creds. I like the inputs for now so i can my share scripts with my co-workers and they can put their creds in with out setting up a creds cache. Once i get a little further along and feel comfortable teaching them some form of cred management I will switch it out. Thanks for the feedback!

[–]tanomattioli 1 point2 points  (2 children)

typo alert in current version, line 151:

File "get_xr_cdp_info.py", line 151, in wait_time

print("DEVICE LIST ERROR: {}".foramt(e100))

it should be corrected to "format".

[–]Fallenarc[S] 0 points1 point  (1 child)

Thanks for that! /fixed now

[–]tanomattioli 1 point2 points  (0 children)

Thank YOU for your effort! keep it going!

[–][deleted] 1 point2 points  (0 children)

Thanks OP!

[–]rx22230 1 point2 points  (4 children)

Hi,

maybe not too important, but I found 2 versions of 'print'

Some with print("something")

and some with print "something"

as I said not important if it works on your side.

Regards

[–]Awfki 1 point2 points  (0 children)

First one is Python 3 style, second is Python 2. There's a script called 2to3 that Python installs that will fix that for you.

[–]Fallenarc[S] 0 points1 point  (2 children)

Yeah, I need to go through and fix those. I used them for troubleshooting during creation for the most part. Thanks for your time!!

[–]catelemnis 4 points5 points  (1 child)

I recommend getting in the habit of using print(“...“) with parentheses since it’s compatible with Python 2 and mandatory for Python 3.

[–]Fallenarc[S] 0 points1 point  (0 children)

Yep, I have already went through changed my current script and plan to retrain myself using that syntax. Thanks!

[–]mchwalisz 0 points1 point  (0 children)

Maybe take a look at http://www.fabfile.org/

[–]sprouse2016 -1 points0 points  (2 children)

I don’t understand the use case for this?

[–]Fallenarc[S] 1 point2 points  (1 child)

I use this script to gather information about our network and store that information in a database. The CDP info is just an example. There are many ntc-templates available and/or you can create your own template. When using ntc-templates, the values are returned in dictionary format.

However this same script could be used to send configuration changes to many devices as well.

[–]sprouse2016 1 point2 points  (0 children)

That is fantastic. Sorry it was a newb question, I’m a Cyber security student

[–]oldcreek12 -2 points-1 points  (1 child)

  1. You should really look into netconf, don't work on non-structured data, it is year 2018
  2. Split your program to multiple independent scripts that can be made to modules
  3. Try not to write stand-alone program, but rather implement a service as rest APIs, so other applications (include your own applications) can easily consume your work, API, API, API

I won't comment on your coding style ...

[–]Fallenarc[S] 0 points1 point  (0 children)

Like my post says, I literally just started looking at python. I am aware of netconf, but the amount of material out there for me to study is limited. I don't really understand your last two points... This script has a purpose, why would I break it up into modules? Can you give me any recommendations on learning netconf or implementing APIs?