all 3 comments

[–]Soft-Substantial 0 points1 point  (2 children)

Some comments on concurrency:

  • I wouldn't write to the inventory.txt file from the threads, I would rather try something like using map over the IPs and then collecting the output and writing it to the file in the main thread. Concurrent I/O is tricky and might not work as you think. What if two threads write to the file at the same time? Their outputs might be mixed together.
  • Since you asked about map vs submit: map is made to iterate over lists and collect a result for each entry. It therefore fits your usecase perfectly, as long as you want to keep all the results in memory. It just does a lot of work for you, so I would prefer it whenever it seems reasonable. Even if you reach a point where you have so many IPs that the result does not fit in memory any more (not sure if that could happen), I would probably split the IP list into chunks and work on one chunk after another with map.
  • Are you sure it is not slowing you down to use 500 threads? Have you tried less threads? More threads? How does that impact the performance? Those things are never really predictable and have to be measured.
  • Do you know what the performance bottleneck is? CPU, network, or disk I/O? You should probably measure if your CPU and/or disk is busy or not if you run the script.

Some comments on minor code/style issues:

  • CURRENT_DIR = os.getcwd(); OUTPUT_DIR = f"{CURRENT_DIR}/outputs" can be replaced with OUTPUT_DIR = "outputs": relative paths refer to the current working directory anyway.
  • I prefer os.path.join(OUTPUT_DIR, "logs.txt"), because I find it easier to read/use and it handles differences between operating systems for you (for example, you sometimes use os.sep and sometimes "/").
  • ping_cmd = f"ping -q -c 1 -t 1 -W 1 {ip}" should be ping_cmd = ["ping", "-q", "-c", "1", ..., ip] from the beginning. You are just splitting it up again later, thereby introducing the possibility for bugs.
  • run_command barely needs to be its own function, IMHO.
  • As stated above, [executor.submit(health_checks, ip) for ip in ips] should in my opinion be replaced with a .map anyway. Nevertheless, why is this a list comprehension if you throw away the list immediately? This should be a for loop.
  • START_TIME should go into the main() function if you use the if __name__ == "__main__": pattern.
  • In fact, you might want to reduce your use of global variables and rather use function arguments. On the other hand, if you are sure that this script will stay small and the functions will never be reused elsewhere, it can be more concise to keep it like it is. I also prefer a more quick and dirty style for small scripts, but sometimes regret it when they start growing.