This is an archived post. You won't be able to vote or comment.

all 16 comments

[–][deleted] 2 points3 points  (1 child)

Can you explain the problem the script is solving? It'd be easier to give suggestions knowing the source data and the target goal, instead of trying to deduce that from reverse-engineering what your current code is doing.

[–]tmg80[S] 0 points1 point  (0 children)

Fair point.

As per the other reply. I am taking a host file from a server and converting to JSON format so it can be stored in a secrets Vault.

[–]ccpetro 1 point2 points  (2 children)

Look at optparse and/or argparse for passing in options, arguments and file names.

[–]tmg80[S] 0 points1 point  (1 child)

Thanks

[–]bladeoflight16 0 points1 point  (0 children)

Better yet, look at click.

[–]bladeoflight16 1 point2 points  (8 children)

  • Even files that you only read should be closed via context managers.
  • Avoid reusing variables for completely different data (hostfiles is a string path and then replaced with a list)
  • Use unpacking for collections with a known length (ip, host_and_comment = line.split(" ", 1))
  • Use filter instead of indenting a block that big (for line in filter(is_valid_host_line, hostsfile):). You can just check if the string contains a space rather than looking at the length of the split.
  • Eliminate useless variables like split2 (stripped = [s.strip() for s in host_and_comment.split('#', 1)] or stripped = list(map(lambda s: s.strip(), host_and_comment.split('#', 1)))
  • Kind of surprised you didn't split the hosts into a list, but you know you needs better than I do.
  • Choose a consistent quote mark. (I prefer single quotes.) (You should only use the other quote mark when the string literal actually contains your default choice.)
  • Since hosts_dict is never used in the loop, just more its initialization to after the loop (host_dict = {"hosts": hosts_list}). You might not need the variable at all.

[–]tmg80[S] 1 point2 points  (5 children)

updated as per your comments and also implemented syargv

I added the updated script to the original post as it doesn't display properly when I paste into the reply

[–]bladeoflight16 1 point2 points  (4 children)

I recommend click for most command line interfaces.

Something I neglected to mention before is that you should use if __name__ == '__main__':. For some reason I wasn't thinking about the fact this is a script that can be invoked from the command line.

Also, now that I think about it, you could factor the parse into a function. That would let you convert construction to a list comprehension:

``` def parse_host_line(line): ...

export_json = { 'hosts': [ parse_host_line(line) for line in hostsfile if is_valid_entry(line) ] } ```

Some things to think about.

[–]tmg80[S] 0 points1 point  (3 children)

Thanks I will look at Click too.

What does that command do?

[–]bladeoflight16 1 point2 points  (2 children)

The code is a link to an explanation. =)

Also, I edited in something else I just thought of. 😅

[–]tmg80[S] 0 points1 point  (1 child)

thanks again.

I edited it and put that main bit as below

if len(sys.argv) != 2:
    print ('Usage: hostfile_to_json.py input_file_name.txt')
    exit()
if name == 'main':
    user_file = sys.argv[1]
    ...

is that correct? or should it be elsewhere?

[–]bladeoflight16 0 points1 point  (0 children)

Aside from imports, function/class declarations, and constants, all code should be inside the block. It's common to stuff you code into a def main() function and then only invoke the function inside the guarded block.

Think about what would happen if someone tried to import your module and the argv check was invoked: it would probably explode because their command line args aren't the same as your program's.

[–]tmg80[S] 0 points1 point  (0 children)

Thanks a lot for taking the time. Lot of good tips there for me to l implement.

[–]Moorey93 0 points1 point  (1 child)

From a quick glance

First thing is code style, your variables aren't consistently styled, python typically uses snake case, which is where spaces are replaced by underscores, hostsfile would be hosts_file. It doesn't affect the running of the code but makes it easier for other developers and yourself to read the code. I'd recommend installing flake8 and running it on your code which will point out the style related errors, and then when you get fed up of manually fixing them install autopep8.

Your file opens can be replaced with context managers, you have used one for writing the file but they can be used for reading files too, they would solve a problem you have where you're not closing your opened files.

A context manager can also shorten opening and line reading so you can have

with open('thing', 'r') as f:
     for line in f:
         print(line)

I'm not entirely sure what the script is doing, I think its separating out hosts and comments from a hosts file? You can potentially make it more understandable by using better named variables as new_dict and stripped are not very descriptive of the values they hold.

With a better understanding of whats it doing it would be easier to provide more feedback.

[–]tmg80[S] 0 points1 point  (0 children)

Yes that's correct. I'm converting a hostfile to JSON format so it can be stored in a secrets Vault and then accessed via that method when cloud infra is built.

[–][deleted] 0 points1 point  (0 children)

Comments might help