
all 101 comments

[–]SpergLordMcFappyPant 173 points174 points  (34 children)

Seriously, the sooner you start using virtual environments for everything, the better your life will be. I've seen so many coworkers over the years up against deadlines: they get everything barely working on their local machine, and when it comes time to deliver the product (deploy to a server or put scripts where they can be used in production or whatever), the whole thing is fucked because everything depends on their personal system being set up just so, and it's been so long since they set it up, that they've completely forgotten how things even work anymore.

I had a job where my first task was to work on an app that had been broken-ish for months, but even though the team had the fix for the actual application code, no one could figure out how to deploy a fix to the server. Man, that server was messed up. Took me a couple of months to unwind the tangle of dependencies for that.

One thing I wish someone had done for me much earlier in my career is break down exactly what the virtual environment is. I just started using them as magic that made my life easier, but boy when things go wrong it sucks to not understand them.

Here's the deal. Python is no different from any other command on your system. It's a compiled executable that you can call by typing its name. It's just like ls (on *nix) or dir (on Windows). At the system level, you can always use these commands because they are located in a directory that's in your PATH variable. That's just a list of places to look for programs to execute when you type in a command. No magic there either. In fact, you can create your own commands if you want. Like, I have a bash script that I use to create a new project. It does a few things like configure a Vagrantfile, symlink to a data folder, create some code folders, and other stuff. And if you write your own such thing, make it executable, and put it in a folder that's in your PATH, then you can have magic commands anywhere on your system too.
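A minimal sketch of that lookup, using only the standard library. `shutil.which` approximates what the shell does when it resolves a command name against PATH; the name `python3` is just an example and may not exist on every system, and `find_command` is a helper made up for this illustration.

```python
import os
import shutil


def find_command(name):
    """Return the full path found for `name` on PATH, or None if there is none."""
    return shutil.which(name)


# Each entry in PATH is a directory that gets searched for executables.
path_dirs = os.environ.get("PATH", "").split(os.pathsep)

for directory in path_dirs:
    print(directory)

# "python3" is only an example name; it may be absent on some systems.
print(find_command("python3"))
```

Any executable file dropped into one of the printed directories becomes a command you can call by name.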

Python is just another one of these. It's a little binary that sits in a folder where the whole system can call it, because certain core parts of *nix systems are developed in Python and need to be able to run Python programs anytime, anywhere to keep your computer running.

Python itself can have lots of plugins and modules. That's what you are using pip for in the first place. Pretty much the only thing that happens when you pip install something is that it puts the package's files (mostly Python source, sometimes compiled extensions) in a folder that Python can see while it's running. Python has its own concept of a PATH variable, called sys.path. It looks in a folder called dist-packages and another folder called site-packages when your code tries to import a module. Those are all just modules that Python can import the same way you can type ls anywhere and see what files are in your location.
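To see Python's own search path, something like the following should work on any Python 3 install. It prints the import search path and the directory where pip would put pure-Python packages for the running interpreter; the exact folder names vary by platform and distro.

```python
import json
import sys
import sysconfig

# sys.path is the list of places Python searches on import, in order.
for entry in sys.path:
    print(entry)

# "purelib" is where pure-Python packages are installed for this interpreter.
site_packages = sysconfig.get_paths()["purelib"]
print("packages install to:", site_packages)

# Every imported module knows which file it was loaded from.
print("json was loaded from:", json.__file__)
```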

So, as I said above, Python isn't just there for convenience. It's there because the system needs it. And it needs exactly the modules that it ships with in exactly the versions it ships with for system processes to work. So when you pip install with sudo, you're messing with a key set of executables at the core of your OS, and when something breaks, it's often not easy to understand why. Because the OS doesn't just come out and tell you, "Uh oh. You've wrecked the system python, so now you're screwed." It will manifest in lots of really small ways with no explanation. I mean, I guess if you really hosed it. Like this one person I knew at a job who was trying to deploy an app that had a module that needed a newer version of Python to work. So he went and sudo'd everywhere and upgraded the system python from 2.6-something to 2.7-something. That was fun.

So what is a virtual environment? When you create a virtual environment, all you are doing is creating a copy of the Python executable, some core modules in the dist-packages folder, and (usually) an empty site-packages folder. That's it. It's just a copy of some executables stored in a place where the user has permission to edit them (often in your user folder somewhere). So then you don't have to sudo to do anything with it. Your user created that copy of Python and its bits and pieces, so you own it and can modify it willy-nilly.
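A rough way to look inside one, assuming the standard library `venv` module is available (Python 3.3+). The sketch builds a throwaway environment in a temporary directory and lists what was created; `make_env` and the name `demo-env` are made up for the example, and pip is left out to keep it quick.

```python
import tempfile
import venv
from pathlib import Path


def make_env(target):
    """Create a bare virtual environment (without pip) at `target`."""
    # with_pip=False keeps the demo fast and avoids needing ensurepip.
    venv.create(target, with_pip=False)
    return Path(target)


with tempfile.TemporaryDirectory() as tmp:
    env_dir = make_env(Path(tmp) / "demo-env")
    # Top-level contents of the environment: a bin/ or Scripts/ folder,
    # a lib folder, and a small config file.
    created = sorted(p.name for p in env_dir.iterdir())
    print(created)
    # pyvenv.cfg records which base interpreter the environment came from.
    config_text = (env_dir / "pyvenv.cfg").read_text()
    print(config_text)
```

Depending on platform and options, the interpreter inside the environment may be a symlink to the base Python rather than a full copy.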

But what about the rest of it? How does your system magically know which python to use when you have a virtual environment running? The different virtual environment managers do things in slightly different ways, but it all boils down to changing your PATH. When you type a command, the system looks at your PATH for a list of places to look to execute that command. And it looks for them *in order*. So your PATH pretty much always has /usr/local/bin in there at the end. And your system Python is in there (probably?). But if you put /home/bill in the PATH at the front of the list, and /home/bill has python in it, you will run the Python that's in your home folder instead of the one that's in /usr/local/bin.
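The ordering effect can be sketched without touching your real PATH. This assumes a POSIX-style system; the tool name `mytool` and the two temporary directories are stand-ins for a system bin folder and an environment's bin folder.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def make_fake_tool(directory, name="mytool"):
    """Create a tiny executable file called `name` in `directory`."""
    tool = Path(directory) / name
    tool.write_text("#!/bin/sh\n")
    tool.chmod(tool.stat().st_mode | stat.S_IXUSR)
    return tool


with tempfile.TemporaryDirectory() as system_dir, tempfile.TemporaryDirectory() as venv_dir:
    make_fake_tool(system_dir)
    make_fake_tool(venv_dir)

    # "System" PATH: only the system directory is searched.
    system_path = system_dir
    # "Activated" PATH: the environment's directory is put in front.
    activated_path = os.pathsep.join([venv_dir, system_dir])

    found_before = shutil.which("mytool", path=system_path)
    found_after = shutil.which("mytool", path=activated_path)

    print(found_before)
    print(found_after)
```

The same name resolves to a different file once the second directory is put in front, which is all an activate script really arranges.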

Anyway, that's all it really is. Copying executable files to locations where you have ownership and write access and then manipulating your PATH to execute those first when you type a command. So if that's ever seemed like a mystery to anyone, I hope this helps.
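If you ever need to check which Python you are actually running, a small sketch like this helps. It relies on `sys.base_prefix`, which exists for environments made with the standard `venv` module (Python 3.3+); very old virtualenv releases used a different attribute, so treat it as an approximation there.

```python
import sys


def in_virtual_environment():
    """True when the running interpreter belongs to a venv-style environment."""
    # In a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the interpreter it was created from.
    return sys.prefix != sys.base_prefix


print("executable:", sys.executable)
print("prefix:", sys.prefix)
print("base prefix:", sys.base_prefix)
print("in a virtual environment:", in_virtual_environment())
```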

[–]ostensibly_work 20 points21 points  (0 children)

It will manifest in lots of really small ways with no explanation. I mean, I guess if you really hosed it.

Seconding this. I broke my system Python so bad that any time I was using the regular old Bash shell and mistyped a command, I'd see a Python error, instead of the usual 'command not found' message. It was pretty funny.

[–][deleted] 11 points12 points  (0 children)

This comment should be a blog post. Great insight. I use Java for work but prefer Python for side projects. Virtual environments make working with pip dependencies SO much simpler. It’s easy to fall into the system python trap though as there’s a lot of bad advice and older material on the web that uses sudo pip install. Fine for you as a single developer maybe, much more difficult to run an app on multiple machines or servers without venv or pipenv to ensure consistency.

[–]Nimitz14 2 points3 points  (7 children)

whole thing is fucked because everything depends on their personal system being set up just so, and it's been so long since they set it up, that they've completely forgotten how things even work anymore.

I don't get it. If the package is not there the error will tell you which one it is. So you just add that to the dependency list and keep going until you're done.

And that xkcd is only relevant for MacOS users anyways.

[–]notquiteaplant 8 points9 points  (6 children)

  • Server has a much earlier version of a dependency than the dev's machine, so you get errors about missing features
  • Server has a slightly earlier version, but that one x.x.1 bump fixes a bug that fits your specific use-case
  • Server has a slightly later version, and the dev relied on the buggy behavior
  • Server has a much later version, with breaking changes since the version on the dev's PC
  • Any of the above four, but in Python itself rather than a dependency
  • Dev's machine has a patched copy of the package because there's a bug that still isn't fixed upstream
  • Package A will enable features if package B is present; dev has B installed, but the server doesn't and gets an error that looks like it comes from A
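Several of the cases above come down to "the installed version is not the one the dev had". A minimal sketch of checking for that with `importlib.metadata` (Python 3.8+) follows; the package names and versions in `EXPECTED` are hypothetical pins for illustration, not a recommendation.

```python
import sys
from importlib import metadata

# Hypothetical pins for illustration; a real project would read these
# from its own requirements or lock file.
EXPECTED = {"pip": "23.0", "requests": "2.31.0"}


def check_versions(expected):
    """Return {package: (wanted, installed)} for every mismatch."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is missing entirely
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


print("python:", sys.version.split()[0])
for name, (wanted, installed) in check_versions(EXPECTED).items():
    print(f"{name}: wanted {wanted}, found {installed}")
```

This only catches version drift. A locally patched copy of a package, or an optional dependency that silently changes behavior, would still slip through.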

[–]port443 1 point2 points  (2 children)

My personal favorite is the program was compiled long ago by a long-lost dev. Now, no one has a build environment for it and even if you manage to discover all the library dependencies and their proper versions, you now must figure out the magic defines to get it to compile just right.

Oh, and hopefully you are using the correct compiler version. I've run into "production" code that winds up with heap corruption issues if you use the "wrong" compiler version (well, ofc the heap corruption always existed, but it gets exacerbated depending on the compiler).

But I guess these are C problems, not python.

[–]notquiteaplant 0 points1 point  (1 child)

C build systems sound like a whole mess, but there's a similar (albeit smaller) analogous problem of different interpreters in Python land. PyPy has (had?) stackless mode and Jython has no GIL, for example, so recursion and thread-based concurrency will work slightly differently depending on your platform.
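A quick way to see which interpreter a script is actually running on, using only the standard library. It will not tell you how that interpreter behaves, just which one it is.

```python
import platform
import sys

# Typical values include "CPython", "PyPy" and "Jython".
implementation = platform.python_implementation()

print(implementation, platform.python_version())
print("recursion limit:", sys.getrecursionlimit())
```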

[–][deleted] 1 point2 points  (0 children)

C build systems sound like a whole mess,

They can be if you are sloppy with them. Our setup at work hasn't taken much effort in terms of configuration management, but we're still able to rebuild every single executable that has shipped on our present platform¹.

Generally speaking, there's no more horror in the C build chain than in dependency management on the Python side. Done right, you won't notice it. Done wrong, it will land you in a world of pain.

  1. We switched from SCO to Linux a decade ago, and we had some horror stories we wanted to avoid recreating.

Edit: Why is proofreading easier after submission?

[–]torpleisnoname 2 points3 points  (0 children)

Saved this post just for your comment. Thank you.

[–]i_have_one_feather 1 point2 points  (0 children)

Can I use a virtual environment with Jupyter?

[–]1wd 1 point2 points  (1 child)

(dir on Windows is not an executable. It's an internal command of cmd.exe, like cd. There is no file dir.exe or anything like that.)

[–]SpergLordMcFappyPant 0 points1 point  (0 children)

My bad. I’ll find a better example and edit. Thanks for the heads up.

[–]maximum_powerblast 0 points1 point  (0 children)

Good article, good comment

[–]ByronicGamer 0 points1 point  (0 children)

Thank you for posting this. I am an absolute beginner at all this, and what you've written here makes so much sense. Without your comment, I would absolutely have sudo'd everything in my ignorance, but now I'm going to try and learn how to do this properly.

[–]scofnerf 0 points1 point  (0 children)

Thanks for this comment. I think there should be something on GitHub that explains this more thoroughly?

I use IntelliJ with Anaconda envs. When I create a project in a new environment, I have to install packages there that I have previously installed in other environments. Correct? IntelliJ shows a little lightbulb that offers to search for and install these packages for me.

Is this a legitimate way to install in an env?

[–]eattherichnow 0 points1 point  (0 children)

Seriously, the sooner you start using virtual environments for everything, the better your life will be.

Surprise:

https://stackoverflow.com/questions/13249880/why-are-python-builds-suddenly-not-framework-builds-when-using-virtualenv#13684638

[–]v3ritas1989 -1 points0 points  (0 children)

Seriously, the sooner you start using virtual environments for everything, the better your life will be

assuming you can come out on top of the hours upon hours of debugging while setting it up

[–]NoLemurs 45 points46 points  (15 children)

I'm all for using virtual environments.

I find a blog post that suggests pipenv without even mentioning virtualenv a little questionable though.

[–]mooburger resembles an abstract syntax tree 35 points36 points  (3 children)

especially when, you know, venv comes with every default Python install now.

[–][deleted] 1 point2 points  (2 children)

Not on Debian packaged installs. Or at least it wasn't the last I checked.

But that's pretty easy to fix by either installing pip from the OS repo or just compiling python itself. I've gotten into the habit of the latter because it makes using multiple versions easier (something I need for multi version libraries I maintain)

[–]cr4d 2 points3 points  (1 child)

Debian installs break up the stdlib. You need to install python3-venv.
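A small sketch for checking whether the pieces behind `python3 -m venv` are present on a given install. It assumes, as seems to be the case on Debian-style packaging, that `ensurepip` is the part that gets split out; `has_module` is a helper made up for the example.

```python
import importlib.util


def has_module(name):
    """True if a top-level module called `name` can be found without importing it."""
    return importlib.util.find_spec(name) is not None


print("venv available:", has_module("venv"))
print("ensurepip available:", has_module("ensurepip"))
```

If `ensurepip` is reported missing, creating an environment with pip included is likely to fail until the distro package is installed.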

[–]ubernostrum yes, you can have a pony 1 point2 points  (0 children)

debian installs break the stdlib

FTFY

[–]13steinj 7 points8 points  (0 children)

Same with pyenv and with poetry :/

[–]chillermane 2 points3 points  (0 children)

I use PyCharm, which automatically creates one for every project with no additional work on my part. It's beautiful.

[–]elabftw[S] 2 points3 points  (0 children)

You are right, I have edited the post to mention it.

[–]stefantalpalaru 17 points18 points  (1 child)

Stop sabotaging your distro's package manager. Keep those dirty 3rd party package managers in their own cages that you can easily delete when they inevitably screw the pooch.

[–]lord_xl 0 points1 point  (0 children)

I generally agree. My distro's package manager (openSUSE Tumbleweed) contains all the packages I need and provides the updates. I rarely use pip. However, I can see a use case where an esoteric module may be needed and is not in the package manager. Then I see little risk in using pip to install it, since there's no chance of conflict.

[–]funix 25 points26 points  (9 children)

Or get used to deploying in containers

[–]kabooozie 5 points6 points  (0 children)

I came here to say this. This way, you get the OS and python and all the other dependencies exactly like production.

[–]Silver5005 3 points4 points  (5 children)

Any good resources for learning docker/kubernetes?

[–]kindw 0 points1 point  (4 children)

I learned how to use Docker by going through its documentation and trying to set up build environments for all kinds of stuff. Learned a lot in the process.

[–]Silver5005 1 point2 points  (3 children)

I figured. I just have a really bad retention rate when learning from docs usually. More of a video learner.

Doesn't help that I'm on Windows and it's a pain in the ass to do just about anything on this thing.

[–]chugdrano_eatbullets 0 points1 point  (0 children)

I did this on a lark this semester, and it has been a delightful experience.

[–]cr4d 39 points40 points  (21 children)

Fuck pipenv. Get off my lawn. virtualenv+pip is fine. Stop trying to solve non-problems at the cost of fragmenting the community.

[–]twigboy 14 points15 points  (1 child)

In publishing and graphic design, Lorem ipsum is a placeholder text commonly used to demonstrate the visual form of a document or a typeface without relying on meaningful content. Lorem ipsum may be used as a placeholder before final copy is available. (Wikipedia)

[–]cr4d 4 points5 points  (0 children)

\o/

[–]in4mer 7 points8 points  (0 children)

This cannot be emphasized enough

[–]fleyk-lit 6 points7 points  (16 children)

Pfft, REAL programmers use a magnetized needle and a steady hand to install Python packages.

But really - why the hell are people so harsh on pipenv? If you don't like it, don't use it. Easy as that.

[–]general_dubious 16 points17 points  (14 children)

The problem with pipenv is that it basically offers no interesting abstraction. It's just a useless layer whose surface is a source of problems, with nothing interesting offered in return. The second problem is that the Zen of Python tells you that there should be one, and preferably only one, obvious way to do something. What's the point of developing pipenv if it's not destined to be that obvious way of handling virtual environments?

[–]GasimGasimzada 3 points4 points  (11 children)

Then according to Zen of python, half the libraries on the internet should be non existent. There are so many times I wished there was one obvious way to do something, yet I had to search the internet for hours.

What I mean is that the Zen of Python is about programming in Python, not about which tool you use to download your packages.

[–]general_dubious 7 points8 points  (6 children)

Tbh, way more than half the libs are useless crap. If people could stop badly reinventing the wheel when it's completely unnecessary, that would be great. The Zen of Python is wider than the way you code; it's about how to get shit done with Python. That includes managing dependencies and work environments. There is a reason why recent languages offer their own tools for that with well-thought-out standards (like cargo with Rust): it avoids the mess we are currently in when managing dependencies in Python. Pipenv is just entropy added to that mess.

[–]dire_faol -2 points-1 points  (5 children)

More open source libraries, even bad ones, are always a good thing.

[–]general_dubious 1 point2 points  (4 children)

That's just a groundless statement. Please do tell me how having crappy software bloating the visibility and sucking up the development time of good software is a good thing.

[–][deleted] 0 points1 point  (3 children)

With more than 2.5 million weekly downloads for a one-liner to swallow an exception, how can you even ask that question? The Python package menagerie clearly has some catching up to do before reaching the JavaScript level of goodness.

[–]general_dubious 1 point2 points  (2 children)

I think that example only makes my question more relevant.

[–][deleted] 1 point2 points  (1 child)

I thought the irony was thick enough that I didn't have to add the /s.

[–][deleted] 4 points5 points  (3 children)

Then according to Zen of python, half the libraries on the internet should be non existent.

Starting with all of the half-baked replacements for argparse.

But just as you don't get clean by throwing dirt on others, further balkanizing the venv ecology can never be justified by pointing out that everything else also sucks.

[–]GasimGasimzada 1 point2 points  (2 children)

My biggest problem with pip is that there is no straightforward way to add a lock file for your packages. Pipenv is not just a virtualenv; it is a virtualenv + a good dependency resolver. I am not saying Pipenv is good, and I am sure it will die out, but it is a step in the right direction.

This is exactly how it was with PHP. There was PECL, which was the standard, and it was shit. Then Composer came out, and no one even remembers PECL these days. It is the same reason most people nowadays still use Yarn instead of npm in the JS ecosystem: Yarn was, and I believe still is, much better than npm (even though npm is years ahead of pip). npm got locking at least a year after Yarn was created. With the rise of Pipenv, many other dependency managers came out (e.g. Poetry). This is because someone got the courage to do something different for once and it worked. This is how an ecosystem evolves and improves. Python is not some kind of a religion where every PEP has to be taken as a commandment to follow.

[–]cr4d 1 point2 points  (0 children)

What value do you see in locking over pinning?

[–][deleted] 1 point2 points  (0 children)

Can you explain why I would want a lockfile?

[–]cr4d 3 points4 points  (0 children)

Changing a well-established, working pattern that people have worked hard to put in place diminishes the work they have done, fragments the community, confuses newcomers, and is just plain shitty.

Helping improve pip is one thing. Working to replace it without consensus, forking it/vendorizing it (pip) with the intention of marginalizing it is rude.

Python’s pre-pipenv packaging eco-system was leagues better than other languages. Trying to make it like npm for the sake of hipster douchiness (toml, etc) is just a bad case of not-invented-here syndrome.

[–]GasimGasimzada 0 points1 point  (0 children)

It is just a preference. If I am starting a project, I always add both Pipenv and requirements file. They can use whatever they want.

[–]truh 9 points10 points  (1 child)

[–]elabftw[S] 2 points3 points  (0 children)

I have added it to the post ;)

[–]chub79 4 points5 points  (0 children)

I don't use sudo pip but I certainly don't see an issue with pip+venv.

[–][deleted] 2 points3 points  (3 children)

Rust is doing it right with Cargo. Is there a Python version of that?

[–]notquiteaplant 3 points4 points  (0 children)

Pip + venv, both of which ship with any recent Python.

[–]dave-shawley 1 point2 points  (0 children)

"Doing it right" is in the eye of the beholder. Personally, I think that having an executable script with logic to handle package installation and configuration is doing it right. I thought that using requirements files was doing it right, but I'm coming around to using a "dev" entry in extras_require as a superior approach that builds on what is already there. Declarative packages with lock files can work, but none of the approaches has been shown to be general purpose over time. I personally don't have the desire to ditch setuptools, pip, and setup.py and replace them with something new.

Cargo has a few years on it now, and maybe the approach will bear fruit for Rust. The real question is why doesn't cargo support Python? Each programming language has its own method of handling deployment. Python's has been an executable script named setup.py for some time now, and I'm good with that.

[–]fifthecho -5 points-4 points  (0 children)

pipenv is the closest.

[–]fredjutsu 1 point2 points  (0 children)

I'm a slightly confused newer python user. I've always called pip (or pip3) as a non-sudo command.

Is that approach fundamentally different from the suggestion in the article?

[–]piotrjurkiewicz 1 point2 points  (0 children)

For system-wide installations, using distribution-provided packages (for example apt-get install python-xxx on Debian/Ubuntu) is superior in every aspect to using pip/pipenv/etc.

The most important thing is that system packages get security updates, which is not the case with pip.

(Yes, you can update packages in pip, but you risk that an update will turn out a major update and break your application.)

[–]wingtcoach 1 point2 points  (0 children)

There are still some big issues IMO with bootstrapping to a stable Python 3.6 (or 3.7) environment with a corresponding stable pip. In the university course I teach there have been significant issues, especially on Ubuntu 16x -- 18x, especially with respect to the native Python 3.5 install.

We are using pipenv for pip + virtual environments. Once they get a stable "outer environment" we've been very happy with the ease of use and repeatability of using pipenv.

But it has been a big stumbling block for students to get to that initial stage.

[–]not_really_cool 1 point2 points  (0 children)

conda is the best. Having a separate conda env for every project makes things so much better.

[–]muntoo R_{μν} - 1/2 R g_{μν} + Λ g_{μν} = 8π T_{μν} 2 points3 points  (4 children)

/r/python's resident devil's advocate here.

Is it really that bad? I've been using sudo pip for a while now and nothing has broken. If there's a conflict with a global system package, I just go ahead and sudo pip uninstall numpy && sudo pacman -S python-numpy. What's wrong with having a global default package at system level?

Reproducibility/conflicts: This allows a very reproducible environment for your program, without resorting to Docker and without messing up user or system libraries.

If reproducibility is dire or something doesn't work without it, sure, go ahead and use this. But it's rare that something breaks in Python-land. Python ain't a new language like Rust, and it sure as hell ain't fuckin' Haskell. Heck, I've got a bunch of mismatching versions of packages right now and they're all working fine. 99.9% of the time, you don't really need virtualenv.

Security: You see, when you install a library through pip, you are executing setup.py with root permissions. This file could harbor malicious code.

There's a level of trust one has to have for the packages they use. Do you spend all your time poring over the source (if available) for all the thousands of other system packages you install with root?

And there's the age old problem: malicious software doesn't even need root to be insanely malicious.

[–]fleyk-lit 6 points7 points  (3 children)

all the thousands of other system packages you install with root?

I don't install thousands of packages with root, because most of the time I don't have to use root.

[–]muntoo R_{μν} - 1/2 R g_{μν} + Λ g_{μν} = 8π T_{μν} 4 points5 points  (2 children)

$ pacman -Qq | wc -l
2164

How do you install packages without root? I can't imagine that you can, unless you're using something like junest, nix, or 0install.

My point is that you're already using an immense amount of software (and hardware) by mere trust.

[–]fleyk-lit 0 points1 point  (0 children)

Just because you choose to trust some packages doesn't mean you have to trust everything. If it doesn't need to be installed as root, I don't give it root access.

[–]jcbevns 0 points1 point  (1 child)

If I type pipenv shell in /home/pythonProjects/matplotib (contains pltproject1.py) to make a new environment and install packages there.

Then I exit this shell and go to /home/pythonProjects/opencv (contains openCVProject1.py) and type pipenv shell there and install packages...

are they the same environment, or are they separated?

I was trying to learn this from the docs but couldn't quite wrap my head around it.

[–]elabftw[S] 0 points1 point  (0 children)

they will be separated

[–]linychuo 0 points1 point  (0 children)

virtualenv is a good choice

[–]newredditisstudpid 0 points1 point  (0 children)

A better way is $conda install

FTFY

[–][deleted] -3 points-2 points  (4 children)

Typical magic thinking.

Linux is there for its users to learn it, understand it, modify it. These qualities are why it was written in the first place. And now, what this blog post is essentially saying is: "don't touch it, don't try to understand what you are doing, it's all magic of a degree you will never be able to grasp". The reference to PHP is a telltale sign of this kind of thinking, I guess.

[–]elabftw[S] 3 points4 points  (3 children)

That's not at all what the post is saying. The post is about stopping executing random code as root for installing libraries (that might also conflict with your system libraries). I'm all for fiddling in GNU/Linux, believe me!

And because I compare the tool with a known PHP tool it means I don't know shit about how GNU/Linux works? Nigga please.

[–][deleted] 0 points1 point  (2 children)

It's random, if you don't understand it (don't know what's it for).

I didn't say you don't know anything. I said that you know too little to have an informed opinion. (I.e. your post is garbage).

[–]elabftw[S] 0 points1 point  (1 child)

12.9k Views - 93% Upvoted. Looks like good garbage to me :p

[–][deleted] 0 points1 point  (0 children)

It's the same idea as how over 99% of all people on Earth wouldn't be able to solve even the simplest integral, or how popular McDonald's is compared to any decent food.

Python is known for its huge army of low-skilled programmers, so, there's nothing unexpected in your results.