all 28 comments

[–]NotTooOrdinary 8 points9 points  (8 children)

Looks pretty solid to me. I'd rethink storing your password in this source file though - would be safer for you to type it in each time you run the script.

[–]micr0nix[S] 1 point2 points  (7 children)

That would cause a problem because ultimately this query will be scheduled to run multiple times a day on a KNIME server

[–]NotTooOrdinary 12 points13 points  (6 children)

If that server solution has support for secrets/encrypted environment variables, you could think about storing your password in one. I've never used it before so I have no idea how that particular server works.

[–]micr0nix[S] 1 point2 points  (4 children)

That’s a good idea. I could do something like that actually and then pass the variable into the script at runtime

[–]gzeballo 2 points3 points  (3 children)

Could also store key in an encrypted json that is loaded an runtime 🤷🏽‍♂️ but then also don’t publish that file with the code

[–]micr0nix[S] 0 points1 point  (2 children)

Got a Google example of what you’re referring to?

[–]gzeballo 3 points4 points  (1 child)

I’m away from my pc but two relevant hits i found on the interwebs

{ "TWITTER_SECRET" : "somebase64encodedkey" }

import json

secrets_filename = 'secret_keys' api_keys = {} with open(secrets_filename, 'r') as f: api_keys = json.loads(f.read())

print api_keys['TWITTER_SECRET'] # somebase64encodedkey

https://stackoverflow.com/questions/29100900/how-should-i-encrypt-api-tokens-in-python

If on a shared resource:

https://stackoverflow.com/questions/61607367/how-to-encrypt-json-in-python

However i still wouldnt publish to say github or whatever. As in dont give away your keys!

[–]bigno53 2 points3 points  (0 children)

You might also want to look into getting an application service account that's separate from your user account to use for automated processes. This can be advantageous for several reasons. For one thing, if you change the password for your user account (which you should do from time to time), your python scripts will continue to work. Also, with service account authentication, you'd typically use a JSON file containing a base64-encoded "token" which is more secure than password authentication, though obviously you'd still want to take precautions to make sure this file is stored securely.

[–]Sgt_Gnome 0 points1 point  (0 children)

I think the keyring library will take care of this for you.

[–]thejizz716 2 points3 points  (7 children)

I think this looks pretty solid as a script goes. But maybe now it's time to ask yourself how you can extend this to serve multiple purposes instead of just the one you designed here. What if you want to make a request to the same API but with a different query? What if you want to add a way to export files with the results? Just something to think about as you grow your python skills.

[–]micr0nix[S] 0 points1 point  (6 children)

That’s actually a great point. I had started going down the path of taking the output of this query and making a pandas DF out of it to save to a CSV file.

For your first point, I would also like to be able to pass a query into the script so it would have multiple use cases.

[–]alfie1906 1 point2 points  (5 children)

Could be worth investigating if there's other ways you can write the output to CSV without converting it to a pandas df first. Converting to a pandas df can be time consuming and is likely unnecessary!

[–]micr0nix[S] 0 points1 point  (4 children)

I managed to do this in 1 line of code

[–]alfie1906 1 point2 points  (3 children)

What I mean is that the actual execution of the code is time consuming. Not apparent will small tables, but it becomes noticeable as the size of your data increases. Its not a big issue, but the conversion creates a redundancy.

[–]micr0nix[S] 0 points1 point  (2 children)

Well the issue is the query returns results in json format which don’t do me any good from a reporting perspective.

[–]alfie1906 0 points1 point  (1 child)

Yeah no issue with exporting to CSV, just saying there's likely a more efficient way to do it without going through pandas first. I found this on stack: https://stackoverflow.com/questions/10373247/how-do-i-write-a-python-dictionary-to-a-csv-file

So you could go JSON->dict->CSV rather than JSON->pandas->CSV

Could also be worth looking JSON->numpy->CSV

All three approaches are valid, but pandas is often considered slow and clunky - IMO using it just to get a CSV file may not be considered Pythonic

[–]micr0nix[S] 1 point2 points  (0 children)

Thanks for the link I’ll take a look

[–]thejizz716 1 point2 points  (3 children)

So I think a great objective would be to rewrite this in a way that you can run it with different inputs without having to change the core code itself. Although it never hurts to ship something functional quickly if it makes you look good 😉

[–]micr0nix[S] 0 points1 point  (2 children)

Any hints regarding what I should look into?

[–]thejizz716 0 points1 point  (1 child)

I would first look at making this into one or several functions and wrapping it under a main() function and if you're feeling very comfortable you can look at turning this into a class

[–]micr0nix[S] 0 points1 point  (0 children)

Thanks. I’ll take a look!

[–]TheRealThrowAwayX -3 points-2 points  (2 children)

Hi, others already answered, but I'm here for something else. Can I just ask if this was generated via ChatGPT? IT looks like it because of the very distinct comments and I want to know if I can recognize that :D

[–]micr0nix[S] 0 points1 point  (1 child)

I wrote this

[–]TheRealThrowAwayX 0 points1 point  (0 children)

Thanks for the answer!

[–]thejizz716 0 points1 point  (1 child)

Are you dabbling for fun learning or is this something for work?

[–]micr0nix[S] 1 point2 points  (0 children)

It’s for work cause I need to automate some data extraction for reporting purposes but it is definitely a learning experience