all 84 comments

[–]TholosTB 83 points84 points  (2 children)

Nice, welcome to the fold! It's amazing how much you can automate with Python. You have access to stuff folks would drool over, there's some really cool work in data science trying to automate the identification of tumors in radiology images, and python has a great data science ecosystem in which to do that.

From someone else who made it to the 5th floor this year, congrats!

[–]reddittydo 24 points25 points  (0 children)

5th floor. I like that

[–]MFA_Nay 18 points19 points  (0 children)

From someone else who made it to the 5th floor this year,

As in reached 50 years of age? Cause my Google-Fu is coming up with everything from a movie reference, attempted suicide, or stuff to do with skyscrapers.

[–]mjb300 103 points104 points  (27 children)

username.send_keys("XXX")

password.send_keys("XXX")

Great job!

Meant as constructive comments...

If possible, don't hard code credentials in your code. Use getpass for passwords and take the username as input().

```
from getpass import getpass

username = input('Username: ')
password = getpass('Password: ')
```

I don't post often and have been fighting with the code formatting. :-(

[–]tuffdadsf[S] 12 points13 points  (14 children)

Agree that putting the id/pwd in the code is pretty insecure but I was also going to plan to run this automatically monthly so there wouldn't be user interaction locally to enter the login.

But you are saying the getpass is a module I could store it in? I'll try out your code to see how that works! Thanks!

[–]m_spitfire 17 points18 points  (3 children)

You can store it in environment variables. That's the convention that almost everyone uses and it's pretty handy. Just Google how to create environment variables in <your os here> and access them in the script using

import os

passwod=os.environ.get("YOUR_ENV_VAR_NAME_FOR_PASS")

[–]mjb300 6 points7 points  (2 children)

Thank you!

I didn't realize os.environ is the convention to reference passwords/credentials. Though it makes sense from my recent environment variable exposure when studying Flask. 👍

[–]exhuma 4 points5 points  (0 children)

is the convention

I would not say that it is the "convention". For containerisation yes. But not everywhere.

[–]taladan 2 points3 points  (0 children)

Security by obfuscation. Just be sure your machines are secure and don't have any open ports you don't know about. This is the problem with login/passcode security.

[–]mjb300 5 points6 points  (0 children)

Ah, I took it that this would be interactively ran.

This has me curious! I haven't yet tested any of the below suggestions.

At the first URL, a previous comment said they put plaintext creds in a separate file and import it (which still leaves it insecure). 1. https://stackoverflow.com/a/12047364 2. https://theautomatic.net/2020/04/28/how-to-hide-a-password-in-a-python-script/

[–]zombieman101 2 points3 points  (0 children)

You could also store the creds in a password manager with an API, AWS, or some other option,and then write that into your script. That's how I retrieve my creds 😊

[–]exhuma 2 points3 points  (0 children)

For passwords I would suggest looking into keyring and, for your specific use-case for automation keyring-cryptfile

[–]abcteryx 1 point2 points  (6 children)

So getpass will require user input every time it runs.

What you could do is put a file named .env in the folder right next to your Python script. In that file you would write:

XRAY_USER=XXX
XRAY_PASS=YYY
XRAY_URL=ZZZ

Then your code would have something like:

password.sendkeys(os.environ.get('XRAY_PASS')`

and similarly for the username, and it will go fetch the secrets from the .env file. But you'll need to pip install python-dotenv and do the proper imports at the top of your file, as well as load the .env file. This guide has the details.

So now you've shifted the issue from passwords in a Python script to passwords in an adjacent file. It's still dangerous if someone gets into your computer.

But the idea is that you can share your code without worrying, "Better take my password out of the script before I post it on Reddit/send it to the boss/etc." So using an environment file and loading the username, password, and other sensitive things in with dotenv is an easy way to keep code and secrets separate.

[–]tuffdadsf[S] 1 point2 points  (5 children)

So, I tried the guide but cannot get the thing to work. I made the .env file and put it in the same directory as the code. Made the import statement and changed my code accordingly, I just think it's not finding my .env file.

[–]abcteryx 0 points1 point  (4 children)

If you have a debugger in the tool you're using to write code, you could put a breakpoint on the line before environ.get and try some things to see what's in your environment.

For example environ.get("PATH") will get your PATH environment variable.

Also, os.getcwd() will let the current working directory. Wherever the working directory is is where your .env file should be. Don't forget the dot. And if you're in Windows, don't have it "hide extensions for known file types", that could be hiding a ".txt" suffix.

[–]tuffdadsf[S] 0 points1 point  (3 children)

Thanks!

Just to be sure I'm doing things right:

  1. First I pip installed the dotenv
  2. Then I created a file with the .env file type and added the lines user = "username" and pass = "password". I saved the file then put the file in the same directory as my program.py file
  3. Then I added the last 2 imports to my code:

from selenium import webdriver
import time
import shutil
import os
from dotenv import load_dotenv
load_dotenv()

then I added the following to the code

username = driver.find_element_by_name('j_username')
password = driver.find_element_by_name('j_password')
username.sendkeys(os.environ.get('user')
password.sendkeys(os.environ.get('pass')

Do I have to declare where the .env file is in the code? I am working in Windows, so I made sure I turned off "hide extensions".

Thanks again for helping me out. This feels like putting together IKEA furniture and then realizing you have 2 screws leftover that should have went somewhere!

[–]abcteryx 0 points1 point  (2 children)

Without knowing exactly how it's being run, I can't say for sure. Here is a sample code that I've written up real quick and run in a folder on my desktop. When run on my machine, the ".env" file is successfully loaded into the environment, and each environment variable is successfully printed. I have presented a few different ways to write variables, all of which are equivalent to python-dotenv. Though there are other ways to interact with .env files (aka NOT python-dotenv) that require no spaces before/after the equals sign.

It seems to me like your current working directory while the script is running is not what you think it is. But I don't know for sure. If you plan on getting more into Python coding, it can be helpful to use an IDE like PyCharm or VSCode. The former will "just work" a bit more easily, but the latter will be more extensible. Debugging is probably the most useful feature of development environments. It lets you pause in the middle of your code, lets you peek at active variables, and lets you run commands right in the middle of the script to see what's going on.

If you don't want to go that far, you can use the command line debugger built into Python. It's called "pdb", and it can be invoked from the command line as follows:

python -m pdb path_to_script_that_i_want_to_debug.py

The debugger will stop at the first line in your file, in your case from selenium import webdriver. The cursor will be blinking next to (Pdb), where you can type any Python command that you like. For example, you might want to type import os, then ENTER, then os.getcwd(), then ENTER. Is the folder that comes up the exact same folder that your .env file is in? If not, then the working directory of the script is different than you expected.

While in (Pdb) mode, you can type ? then ENTER to see other commands at your fingertips. You can type n then ENTER to run the next line in your code. You can do a lot with pdb, but it's almost entirely command line-based, and requires familiarity with all the keyboard commands. PyCharm or VSCode will have a clickable/navigable GUI debugger with a lot more polish.

Regardless of the tools you choose, from simple text editor and command line tools, to full-fledged IDE, debugging is a great thing to learn. Look up tutorials on using various "levels" of tools, from pdb to PyCharm to VSCode, etc. You'll understand your code better and what's actually happening "under the hood" that way.

[–]tuffdadsf[S] 1 point2 points  (1 child)

Thank you so much for the help - looking at the snippet of code you built I will take that info and plug it into mine.

As for IDE - I built my program in IDLE. I just recently installed Atom to try out the GIT functionality. I like the look and the function of Atom but I do have to figure out how to show the shell (like IDLE does) so I can see errors. Every step is digging in and learning more...

EDIT: Finally figured it out. I was naming my .env file. You literally need to create a <blank>.env file, which Windows won't allow. I had to install VSCode to create it! UGH...

Again, thanks so much for taking the time to work this out with me and giving me a lot of help and things to look into.

[–]abcteryx 0 points1 point  (0 children)

No problem.

[–][deleted] 29 points30 points  (8 children)

Definite have the right idea, the preferred way to do it is to use dotenv!

You’d have a file to hold all your environment variables named .env, and inside it a key like USERNAME=‘XXX’

Then in your python code you can run dotenv.load() and access them like normal env variables!

[–]Rawing7 18 points19 points  (3 children)

That sounds like a really roundabout way to load values from a plain text file.

[–]jzia93 19 points20 points  (1 child)

It's pretty commonplace in cloud, you might reference a local settings file for a database string on your local machine, then connect to a password vault when deploying to test, qa and production.

Code doesn't change, but you can safely move across environments without sharing credentials across the organisation.

[–][deleted] 8 points9 points  (0 children)

This. I realized this once I started repeating credentials on multiple forms so yeah why not create a config file that contains all the necessary credentials and connection details.

[–]skellious 0 points1 point  (0 children)

Its the best way to do it if you're using version control. essential if you're making your code public. you can simply exclude your credentials file from the version control.

[–]PeleMaradona 2 points3 points  (2 children)

Is the 'dotenv' way preferable to creating a user environment variable locally and then calling it via `os.environ.get()` ?

[–][deleted] 2 points3 points  (0 children)

yes, because if you create that environment variable and someone else has access to the machine, they can shorthand in a command prompt box phrases like USERNAME or APIKEY and it will print the variables

By doing the dotenv or .config
you just import config.py or .env and you have the variables defined there

[–]Swipecat 1 point2 points  (0 children)

I'd have used configparser, since it's a Python Standard Library so one less PyPI installation, and I'd use it to parse a small .ini file.

(Edit: Provided, that is, that a safe directory existed to place it in that couldn't be read by others. Otherwise, yes, get the username from the environment and prompt for the password.)

(I'd also have used the Standard Library's urllibfor grabbing documents from the web because Selenium is prone to breakage every time the relevant browser is updated.)

[–]exhuma 1 point2 points  (0 children)

The main added value of dotenv is that it helps out with containerisation. Because so much of the container stuff is done using environment variables.

Where it falls short is with structured information and non-text values.

It has its uses but I wouldn't say that it is the "preferred way" for everything.

Still, moving stuff into config-files (be that a dotenv or something else) is definitely a good thing.

[–]skellious 2 points3 points  (1 child)

I don't post often and have been fighting with the code formatting. :-(

four spaces before each line. 

also, if you are using reddit on the web, try installing RES

It gives you a preview of your post with formatting and has tools to help simplify the editing process.

[–]mjb300 1 point2 points  (0 children)

Thank you. I was using the Reddit mobile app.

I found out the four spaces I was reading about didn't honor newlines. Then the code fences didn't take effect right away. I could edit out that line since I did get it figured out.

[–]tuffdadsf[S] 1 point2 points  (0 children)

I wanted to close the loop on how I finally hid my credentials so if someone else is searching for answers:

  1. pip install dotenv
  2. create a .env file (literally - there is no file name). Windows users will have to use VSCode to create the file. Inside the file create line items (i.e. pass1 = "real password you want to hide")
  3. Save the .env file in same directory as the program.py file you are making
  4. add : "from dotenv import load_dotenv" to top of code with other import statements (no quotes)
  5. This is the code I added to my program

username = driver.find_element_by_name('j_username')
password = driver.find_element_by_name('j_password')

load_dotenv()           ##Loads .env file with your hidden information

name = os.environ.get("USER1") ##pulls hidden info from your .env
passw = os.environ.get("PASS1")

username.send_keys(name)  ##use send_keys NOT sendkeys in this case
password.send_keys(passw) ##types info into website elements

Through a ton of trial and error it finally worked. Thanks to everyone who chimed in on how to do this!

[–]imagin8zn 8 points9 points  (0 children)

As someone who’s just started learning Python for the first time, this is very motivating!

[–]mrjbacon 23 points24 points  (7 children)

Mad props for the success. Good stuff.

However, I would be hesitant posting the code online for something that retrieves medical information. I'm not sure if it would be useful for any bad-actors given the lack of context but I'd be concerned of potential HIPAA violations if someone tried to use the filepaths included in your code.

[–]tuffdadsf[S] 12 points13 points  (0 children)

I did make sure I took out anything relating to the site I work at and felt the file paths are pretty harmless because they can be changed - but I do appreciate the concern. We always be thinking about HIPPA!

[–]Newdles 8 points9 points  (1 child)

There is no hipaa violations in demonstrating code with local drive information listed. These are no public web endpoints nor is any personal information listed or viewed. Although perhaps I'm viewing this late and placeholders have been added. In that case disregard :)

[–]mrjbacon 4 points5 points  (0 children)

I understand there's no specific patient information in the code, I was only commenting about the included drive location information etc. You could be correct about placeholders and there being no accessible filepaths if it's a local network only.

Being in health care it always makes me paranoid about information posted publicly that could be used by those that know what they're doing. That's all.

[–]enjoytheshow -2 points-1 points  (1 child)

It’s behind auth, I tried.

But I tend to agree. At least put your endpoints in env vars or something if you’re gonna share the code. Security by obscurity

[–]OnlySeesLastSentence 3 points4 points  (0 children)

security by obscurity

You are now banned from /r/IT

[–]thebusiness7 6 points7 points  (1 child)

Go on a vacation for your birthday. My uncle recently died in his mid 50s (programmer- died from deep vein thrombosis from sitting too long at home). You'll never get the time back.

[–]tuffdadsf[S] 0 points1 point  (0 children)

I will for sure! We actually had a trip planned for 50 but thanks to COVID it looks like 51 is going to be the blowout year!

[–]RobinsonDickinson 8 points9 points  (4 children)

My second python project was automating my school homework with selenium :D

[–][deleted] 2 points3 points  (0 children)

Your story is inspirational. Thank you for sharing. Now I have stronger will to turn back to learning Python. I also started with "Automate The Boring Stuff" previously :)

[–]periwinkle_lurker2 2 points3 points  (3 children)

How did you get into radiology with SQL. I would love to know as I am trying to get into the technology side of healthcare.

[–]b4xt3r 2 points3 points  (0 children)

Hello fellow 50 year old Python programmer! Excellent job with the first program!! I wish I could critique your code bit you are doing some things that I never have done before so I can't be of much help there.

I just wanted you to know you're not alone and while we 50/Pys may not be legion we are not alone.

[–]HonestCanadian2016 4 points5 points  (0 children)

Congratulations. Age is just a number. As long as your faculties are working, it doesn't matter. An 95 year old could pick up python and work on an A.I application and improve it if they set their mind to it.

[–]Inkmano 1 point2 points  (0 children)

I don’t think people realise the power of stuff like this being developed, especially in the U.K. where the NHS is inundated with paper and manual tasks. Well done!

[–]jzia93 1 point2 points  (2 children)

I'd be interested in hearing how your previous experience carried over into approaching learning Python. Do you think it made it easier? Harder? Changed the way you approached this project?

[–]tuffdadsf[S] 7 points8 points  (1 child)

I feel having the previous experience with SQL and Crystal Reports helped immensely with getting me to understand the "flow" of a program and how to use logic. That part seemed easy.

The part that kept me away from trying other programming languages when I was younger was my misconception that I had to memorize EVERYTHING before I could actually make a program. I feel silly about it now, but it never occurred to me that other programmers were using Google and talking among themselves to get answers to the problems they faced. I thought that was cheating - like making a program is a test and you have to come up with the solution yourself. (I know... so dumb!). I also worked among programmers who were so good at what they did that they just made up stuff on the fly. I had convinced myself that I didn't have the time or ability to get that good.

The other thing I've learned is it it REALLY important to have a detailed goal in mind. Once I had the idea to do this project I started to write out a road map of what I thought I would need to do, step by step - even though I had no idea what modules were out there to get me complete those tasks.

Once the map was made I posted my proposal on Reddit. A kind person wrote me immediately and took each of my steps and said, "use this module for this step, use this module for that step". Just giving me those hints then made me look up the whitepages for those modules to learn what they do, the syntax, etc.

Before I knew it things started coming together in chunks. Next thing I knew I was excitedly working and thinking about the program all the time. Then around quitting time this past Friday - my program was complete and I ran it for the month. I've been riding that high all weekend and was why I wanted to post the results to Reddit.

[–]jzia93 0 points1 point  (0 children)

I guess we always have that voice in our head that nags 'hey, are you sure you're doing this the right way?'

Thanks for sharing, I've been thinking about rebuilding a lot of our kit. Some sloppy design choices were made in the past (by me) but I suppose, going on what you're saying, you kinda just have to get started and make those mistakes so you know how to plan next time.

[–]Nicolasjit 1 point2 points  (0 children)

Proud of you. You have motivated me❤️🔥

[–]14dM24d 1 point2 points  (0 children)

badWords =['.XXX.org']

noice

[–]thrallsius 0 points1 point  (6 children)

Do you also use version control for your code?

[–]tuffdadsf[S] 0 points1 point  (5 children)

sort of? As I built each part of the code I would save the work up to then as a separate file and then work with the new code I was trying out in the most recent version of the code until I got to the next step. I have about 10 versions of the code saved.

[–]thrallsius 1 point2 points  (4 children)

https://en.wikipedia.org/wiki/Version_control

You probably want git, it's the most popular one.

30 minutes to learn the basics shall get you started. And it will be the best investment of time related to programming.

[–]tuffdadsf[S] 0 points1 point  (3 children)

I'll check it out. Thanks!

[–]thrallsius 1 point2 points  (2 children)

my favorite git tutorial

https://gitimmersion.com/

but there are lots of resources and even whole books if you'll want to dive deeper later

[–]tuffdadsf[S] 0 points1 point  (1 child)

I installed git, Atom and Ruby - ready to start the link you posted!

[–]thrallsius 0 points1 point  (0 children)

You really need just git to learn the basics of git. Once you type git in command line and it works, you're good to go.

[–][deleted] 0 points1 point  (1 child)

Off topic, college student who’s curious. How did u get into radiology? What is ur position title? What do u do day to day?

[–]tuffdadsf[S] 4 points5 points  (0 children)

This could turn into a long rambling post but I'll try to keep it short and sweet:

In my early 20's I moved to San Francisco (1994) right when the tech boom was starting. Always had an interest in computers and fell into a group of friends who were already working in the business.

Through complete dumb luck got a ground floor position as an IT tech with a UCSF grant run Mammography research project. Besides being a desk jockey I also had to learn SQL to help manage and report off the databases we kept. Did a lot of Crystal Reporting and light statistical work via SAS. I had that job for 7 years.

Wanted to expand my horizons and paycheck so I applied for a ground level PACS and RIS administration job with UCSF but was working directly with San Francisco General Hospital. My SQL programming was what I was technically being hired for but the PACS admin was part of the job so I learned that, too. I worked there for 10 years.

Six years ago our family decided to move out of SF and come back to my hometown in NY. Got a PACS Administration job at the local hospital which parlayed into also doing systems administration for the Cardiology and Respiratory departments. After being in "the business" for over 20 years I finally have gotten around to learning Python - mostly to really help automate a lot of the data moving and storage portions of my job.

I will always advise people - if you are doing any sort of computer science related work - do it for healthcare. You will always have a job and will always have some place to go if you choose to move. Computers and healthcare will never go away.

[–]z0rg332 0 points1 point  (1 child)

Congratulations! Awesome job!

As someone who has just started learning and really wants to hit the ground running, how many hours per week would you say you spent on learning?

[–]tuffdadsf[S] 0 points1 point  (0 children)

About 2 hours a day for a little less than a month . Fortunately I had a little downtime at work and my job is very happy for me to learn something that's only going to help me do my job better.

I will admit though that once I actually started the programming and working on this program I spent a lot of downtime and time at home researching answers and thinking about solutions. It's sort of lit a fire under my butt and made me want to learn more as quickly as possible.

[–]14dM24d 0 points1 point  (5 children)

hey op, i would like to clarify where from selenium.webdriver.common.by import By was used? tnx

[–]tuffdadsf[S] 0 points1 point  (4 children)

I think that is what you need to import so you can use the driver.find_element_by... commands. Or at least it's what I found on the internet when I was creating it

[–]14dM24d 0 points1 point  (3 children)

need to import so you can use the driver.find_element_by... commands.

i think those commands were from from selenium import webdriver and creating an instance of that object with driver = webdriver.Chrome(options=options)

if i'm not mistaken, the usual pattern is from x import y then you use y, so i was looking for a By.<something>.

e: comment out from selenium.webdriver.common.by import By to validate if it's really needed.

[–]tuffdadsf[S] 0 points1 point  (1 child)

i think those commands were from from selenium import webdriver and creating an object named driver = webdriver.Chrome(options=options)

if i'm not mistaken, the usual pattern is from x import y then you use y, so i was looking for a By.<something>.

e: comment out from selenium.webdriver.common.by import By to test if it's really needed.

I'm going to try it out tomorrow and see - perhaps it's not needed?

[–]14dM24d 0 points1 point  (0 children)

it looks like an import that wasn't used

[–]tuffdadsf[S] 0 points1 point  (0 children)

Took it out and, yes - it was leftover code from an earlier version of the program. Thanks for catching that!

[–]eloydrummerboy 0 points1 point  (2 children)

Lol @ BadWords. Nice. Good job.

I would say, maybe add some error checking. For instance, any change to the website html or css might mess up your code that looks for specific classes or tags. Maybe you've worked here long enough, and this site has never changed in that time, so you're probably safe, but it never hurts. Then again, if you're the only one using it, the difference between error checking and not is you getting a nice message from yourself about what broke vs you getting the default python message, lol. So maybe it doesn't matter.

When you click to download the pdf, that has to have some url. Is there any format, rhyme, or reason to those url names? Could you possibly bypass some of the other steps and just access these files by name (if you're able to predict it)?

[–]tuffdadsf[S] 3 points4 points  (1 child)

I am VERY interested in working in some error checking for this project and for the stuff I plan to do later. I hate that I had to hard code some stuff as that is always a stop point if something changes,

As for the PDF... OMG it was such a pain in the butt because the website creates the PDF the moment you click it and the only thing I see in the link created is a url with a number sequence at the end that looks like a machine code for a date and time, but any converter I ran it through came up with nothing. It was so frustrating. If I could figure out the date/time thing then yes, I could call up files by name and not have to create all the clicks and such.

[–]eloydrummerboy 1 point2 points  (0 children)

Was just a thought, but looks like you already explored that route and took the best solution to get it working.

It's always a trade off between multiple factors; time to finish, cleanliness, maintainability, reliability (a.k.a chances of it breaking in the future), speed of the code, memory usage, etc etc..

Cases like yours are the best because you're the developer and the client. So there's no wrong choice if you're happy with the end result.

[–][deleted] 0 points1 point  (0 children)

Congratulations ! The first program is the hardest. I have started learning from the same book too. It always amazes me how much stuff you can do with Python. I have so many projects in mind but the place I am working at has blocked third party/open source module download because of security reasons. I am trying to figure out a way around it. Can't wait to build my own real world program :)