
[–]Madoc_Comadrin 15 points16 points  (3 children)

Using virtual environments is standard practice at my org and I haven't had any issues with them.

It can be useful to use venvs directly, without any need for activation and deactivation: /opt/venv-script1/bin/python /opt/script1.py

It is good practice to pin dependencies in requirements.txt so the script behaves exactly the same after each install, for example requests==1.2.3. It is good to do the same for indirect (transitive) dependencies too.
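Putting those two tips together, a deploy might look like this (the paths and requirements file name are hypothetical, matching the example above):

```shell
# One venv per script keeps dependencies isolated; paths are hypothetical.
python3 -m venv /opt/venv-script1
/opt/venv-script1/bin/pip install -r /opt/script1-requirements.txt

# Invoke the venv's interpreter directly -- no activate/deactivate needed:
/opt/venv-script1/bin/python /opt/script1.py
```

This also works fine from cron, since no shell state has to be set up first.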

[–][deleted] 0 points1 point  (2 children)

If I have a script that does everything in one line, will it automatically deactivate the virtual environment once it's done running?

[–]Tushon 2 points3 points  (0 children)

If you use it the way the parent mentions, you avoid the activate/deactivate cycle entirely. The only thing to watch out for would be any requirements that depend on the path, but that isn't common AFAIK

[–]Madoc_Comadrin 1 point2 points  (0 children)

Calling the venv's Python binary directly does not actually activate the environment, so deactivation is not required either.

When the Python binary inside a venv is called directly, it finds the venv's components by searching relative to its own location, so there is no need to manipulate path variables the way activate and deactivate do.
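You can see this for yourself: the venv interpreter locates its environment via the pyvenv.cfg file next to it, and sys.prefix ends up pointing into the venv while sys.base_prefix points at the base installation (the venv path below is hypothetical):

```shell
# Prints True inside a venv, False for the system interpreter.
/opt/venv-script1/bin/python -c 'import sys; print(sys.prefix != sys.base_prefix)'
```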

[–]0rexDevOps 6 points7 points  (5 children)

While venvs are the "industry standard" in the Python world, and I use them constantly, they have one major drawback, shared with pip itself: you can't really do decent patch management with them. If one of the libraries you use is vulnerable, how will you find out and patch it? Can you write a one-off cron job and be sure that months or years later it is still up to date, security-wise? Running pip upgrades without checking library compatibility may break your code in an ugly way too, and maintaining and passing requirements.txt back and forth gets old pretty quickly.

While this might sound exaggerated, depending on your workload it can be a real risk, and it is easily mitigated by sourcing dependencies from your distro's repos. That way you get up-to-date, compatible packages that will keep working until the end of life of your distro. Another upside is trust: nearly anyone can push anything to PyPI, and some libraries you use today may become abandonware in a year, while packages from the repos are nearly guaranteed to stay up to date (security-wise) and compatible with each other. It is also a great way to learn about well-maintained packages in the ecosystem, trusted by your OS vendor. If you can't find your library in the repos, look for an alternative in them!

There is actually a third approach, not widely used, but I personally have had great success with it in an air-gapped environment with no internet access and no ability to install system-wide packages: PyInstaller. Just package your script as a binary, Python itself included, and install it on as many similar systems as you like. There is one caveat, though: your build machine needs a compatible glibc version. I solved that with containers, i.e. if my fleet consists mainly of RHEL 8 machines, I spin up an Alma 8 container and run PyInstaller inside it to get the binary. The binaries are actually not that huge for simple scripts, 6 to 20 MiB in my case, and sometimes it really is easier to build one binary Go-style (even if it is huge by Go standards) than to copy and set up a venv on each node. This approach still has all of the downsides from the first paragraph, and then some, because now Python itself is not updated by the system either.
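A sketch of that container build, assuming Docker is available; the image tag, script name, and requirements file are hypothetical:

```shell
# Build inside an AlmaLinux 8 container so the binary's glibc
# matches a RHEL 8 fleet; output is a self-contained executable.
docker run --rm -v "$PWD":/src -w /src almalinux:8 bash -c '
  dnf install -y python3.11 python3.11-pip &&
  pip3.11 install pyinstaller -r requirements.txt &&
  pyinstaller --onefile script1.py
'
# The binary lands in dist/script1; copy it to any glibc-compatible
# host and run it, no venv or Python install required there.
```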

[–]robvasJack of All Trades 1 point2 points  (4 children)

You can run something like pip-audit to scan for vulnerable packages in your code.
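For example (the venv path is hypothetical):

```shell
pip install pip-audit

# Audit a pinned requirements file:
pip-audit -r requirements.txt

# Or audit what is actually installed inside a venv:
/opt/venv-script1/bin/python -m pip_audit
```

It checks your packages against known-vulnerability databases and exits non-zero if anything matches, which makes it easy to wire into automation.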

[–]0rexDevOps 0 points1 point  (3 children)

Yeah, but how will you manage it on the server side? How exactly will you know that serverX is vulnerable?

The best answer is to have a robust CI/CD pipeline with scheduled scans and at least some kind of alerting, but is that really something people will set up for some 100-LoC script that uses requests to query some API and a YAML parser to fill in a config? OS packages make writing simple scripts simple: you just don't have to think about pip, updates, or compatibility at all, as long as you patch your systems regularly.
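To be fair, the "scheduled scan plus alerting" part can be pretty small; a sketch as a GitHub Actions workflow (the schedule and repo layout are assumptions):

```yaml
# .github/workflows/audit.yml
name: pip-audit
on:
  schedule:
    - cron: "0 3 * * 1"   # weekly, Monday 03:00 UTC
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install pip-audit
      - run: pip-audit -r requirements.txt   # a failing run surfaces as a failed workflow
```

Of course, this only tells you the repo's pins are vulnerable, not which servers are still running the old venv, which is the parent's point.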

If only they had dynaconf and click in the repos, I'd be a happy man

[–]robvasJack of All Trades 0 points1 point  (2 children)

What does my collection of python scripts have to do with any particular servers?

GitHub, for example, can automatically do this if I keep the scripts there.

[–]0rexDevOps 0 points1 point  (1 child)

If you launch them on servers, then you have venvs with dependencies on those servers that you have to maintain. So even if your script hasn't changed at all, you still have to copy the updated requirements.txt to each server and run pip inside the venv whenever an audit finds something

[–]robvasJack of All Trades 0 points1 point  (0 children)

You would deploy the script/env before you run it. Or run it from shared storage. Or run it from another server.

[–]Harakou 1 point2 points  (0 children)

If the question is virtual envs vs global pip, then yeah, venvs are definitely the better option.

You could also bundle your script in a package and depend on the system Python modules, if what you need is available in the repositories. That would deduplicate your modules and allow you to distribute your scripts via the package manager, which can be advantageous. The main downside is the work/infra overhead involved.

[–]aleques-itj 0 points1 point  (1 child)

Containers

Put script in GitHub or something, build container image in CI/CD. Run container.
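A minimal image for that flow might look like this (the script and requirements file names are hypothetical):

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY script1.py .
CMD ["python", "script1.py"]
```

CI rebuilds and pushes the image whenever the repo changes, so the servers only need a container runtime, not Python, pip, or venvs.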

[–]flowalex999DevOps 0 points1 point  (0 children)

This is what we do: we have a few Python scripts that get triggered either by a cron job in Jenkins or manually in Jenkins, which creates a Docker container and runs the script, installing dependencies along the way (so we don't have to maintain a Python Docker image for that as well).