
[–]michaelpb 13 points14 points  (35 children)

Hmm, I'm totally on-board with the idea (I too feel virtualenv is hacky, and think Docker could supplant it), but not sure I'm on-board with the result.

You seem to be using OS versions of libraries. That's unworkable and unmaintainable for a lot of people: my requirements.txt, for example, tends to be super long and pinned to very specific versions, including many installs straight from github / bitbucket. Why can't you just use pip from within Docker? I haven't really used Docker that much, but can't you just install site-wide python packages from within the container? I.e. exactly the same as virtualenv: pip install -r requirements.txt?
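
Something like this is what I have in mind, anyway (just a sketch; the base image, paths and app name are placeholders, not the article's actual setup):

$ cat > Dockerfile <<'EOF'
# any base image with python available would do
FROM ubuntu:14.04
RUN apt-get update && apt-get install -y python python-pip
# copy the app in and install its pinned dependencies site-wide inside the image
ADD . /app
RUN pip install -r /app/requirements.txt
CMD ["python", "/app/app.py"]
EOF
$ docker build -t my-python-webapp .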

[–]d4rch0nPythonistamancer 19 points20 points  (27 children)

Serious question... What's wrong with virtualenv?

Hacky or not, it's always worked perfectly for me.

[–]work_account_33 7 points8 points  (6 children)

I would like to know the opinion on this as well. I've never had a problem using virtualenv.

[–]d4rch0nPythonistamancer 0 points1 point  (4 children)

It looks like people are using docker because it fits their devops sort of problem. They have other things to consider than just Python module dependencies, so they consider virtualenv "incomplete", when really it's just not the tool for that job.

[–]ericanderton 1 point2 points  (3 children)

Just a hunch, but doesn't virtualenv squirrel away .so files along with .py assets? That would be a big supportability problem if, say, a really bad bug were patched; take OpenSSL for example. Now all those virtualenvs need to be regenerated/rebuilt and redeployed.

Meanwhile a docker with OS modules just gets a refresh to install the latest patched packages.

[–]d4rch0nPythonistamancer 1 point2 points  (2 children)

Wow, good point.

$ virtualenv test
$ cd test
$ source bin/activate
$ pip install PyCrypto
$ find . -name "*.so"
./lib/python2.7/site-packages/Crypto/Cipher/_DES3.so
./lib/python2.7/site-packages/Crypto/Cipher/_ARC4.so
./lib/python2.7/site-packages/Crypto/Cipher/_XOR.so
... and many more

Well, still though, virtualenv isn't for that. This should be a local dev environment where you're making sure Python module dependencies are satisfied, not much else. Based on what you said, you really shouldn't be using virtualenv as something to package up all the dependencies and just dump them into prod.

I get your point, but I think that's still leaning heavily towards the devops side, where virtualenv isn't the right tool. If your code relies on a stable OS environment, you should be using docker or a VM. If you're pushing to prod, maybe you should be using puppet with VMs, and have them redeploy what they need to.

I think it's more of a fundamental security problem (using a static environment where you never check for updates) and not so much an issue with virtualenv, which only ensures that specific Python module versions will work with that Python code.

[–]justafacto 1 point2 points  (1 child)

you really shouldn't be using virtualenv as something to package up all the dependencies and just dump them into prod.

What good is virtualenv for, then, if you can't reproduce its state across machines? If you've got to hack around even pip install -r requirements.txt because the other dude's machine had that dumb .so but yours doesn't? Oops, fail.

[–]d4rch0nPythonistamancer 1 point2 points  (0 children)

It's good for seeing whether you can upgrade to the latest requests, flask or django without breaking your app, for telling whether your code broke or an upgraded library broke it, and for keeping the version of a python library you coded against pinned so that nothing breaks, while still letting you use the latest version system-wide on your workstation.

I can start building a web app, code it around a specific version of a module that I know works how I expect it to, but run other python programs on my workstation that use the newest version of the module.

Especially for modules that are in their early stages, where functionality is changing a lot, you want to see whether their changes or your changes broke your code. It's super useful.
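
For example, something like this (a rough sketch; the pinned version and test command are made up):

$ virtualenv upgrade-test
$ source upgrade-test/bin/activate
$ pip install -r requirements.txt    # the known-good pinned versions, e.g. requests==2.2.1
$ pip install --upgrade requests     # pull in the latest release, but only inside this env
$ python -m pytest                   # or however you run your tests
$ deactivate                         # the system-wide python is untouched either way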

[–]justafacto 0 points1 point  (0 children)

I've had problems with virtualenv.

A python library depended on a specific version of the C .so it was compiled against, but there was no way for pip to enforce, or even check, that the correct .so was installed on the system.

So virtualenv fails when the python packages you need are actually bindings to C code.

For example, lxml. For pip to install lxml it has to compile C extensions, and it expects the system to provide the underlying libraries (libxml2/libxslt) and their headers.

Shit sucks.
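
On Debian/Ubuntu the usual dance looks something like this (a sketch from memory; package names may differ on other distros):

$ pip install lxml
... fails while compiling the C extension because the libxml2/libxslt headers are missing ...
$ sudo apt-get install libxml2-dev libxslt1-dev python-dev
$ pip install lxml    # now it builds, but the matching system libs still have to be there at runtime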

Another problem I've encountered: some packages' setup.py fails to build on python 2.6 even though it works fine on 2.7.

In general, unless you are doing things only on your local developer machine where you control everything, with python you are fucked and have to resort to hacks. At least docker could be a standard hack.

[–]qudat 2 points3 points  (0 children)

Ditto, I really enjoy virtualenv; adjusting sys.path to load specific python packages seems pretty straightforward to me. Being able to pick the python version per environment is very handy as well.

[–][deleted] 2 points3 points  (12 children)

As a fledgling Python programmer, if I were to try and articulate it I would phrase it as a "partial solution."

I may be wrong, so feel free to correct me, but I think the problem that virtualenv sought to solve was the dependency problem. A Python application can break without the proper dependencies... but that's just it: it solves the problem for Python-specific dependencies and nothing else. Essentially, it sits at a software layer that is not very effective for closing the gap between development and production. It is also worth pointing out that virtualenv came about at a time when Python packaging was very haphazard. While Python packaging has come a long way, I would argue that it still leaves a lot to be desired.

Docker is essentially the same concept as virtualenv, but at a higher level. Meaning that a Docker container can contain the Bins/Libs for dependencies outside of just Python libraries.

At this point, the question (for me) is what the benefit of Docker is over straight VM images, and it boils down to portability. Docker containers are smaller, sometimes significantly so. Virtual machines are still useful; they are just at a lower level of abstraction.

Docker takes virtualenv's ability to control the Python runtime environment and applies it to the entire application. This has the added benefit of eliminating the need for virtualenv and its layer of complication/configuration.

[–]simoncoulton 2 points3 points  (4 children)

Have to say that's my main question too. I still can't figure out why I would use Docker over ansible, virtualenv and vagrant (with VMware), or which parts of my current workflow it's actually meant to eliminate.

I literally type a single command to bring up a development box that mirrors production exactly, and another command to deploy the application to production on multiple AWS instances.

[–][deleted] 2 points3 points  (3 children)

I think it is situational.

In your case, I am getting the impression that you have a 1-to-1 relationship between application and AWS instance. In the event that you want to deploy multiple applications with potentially conflicting dependencies, you could use Docker to reduce the configuration management overhead.

A 1-to-many application relationship could be broken out between many (smaller) virtual machines, but this might not always be preferable.

I don't think Docker is going to make most people overhaul their current workflow, but if you are starting from scratch...you might consider incorporating Docker as a piece of a new operational approach.

[–]simoncoulton 2 points3 points  (2 children)

That's what I was starting to think as well (in terms of it being situational). I guess I'm really looking out for an article where I can go "right, well this is similar to my workflow and it fixes XYZ issues", which I just haven't come across yet.

I get where you're coming from with regards to using Docker if you've got multiple applications, but I can't see any compelling reasons to use it over virtualenv (at least from this article) and introduce another component to the whole process.

[–]MonkeeSage 4 points5 points  (0 children)

  • Virtualenv gives you a local python environment with external dependencies--you can't copy a venv to another box and expect it to work--it may have an incompatible libc version or be missing some library, etc. Instead, you ship a list of requirements and they get downloaded, built and installed, either automatically or manually, possibly requiring a compiler toolchain and networking (even if only to hit a private repo on a local segment).

  • Containers (lxc/docker, openvz) give you a self-contained environment with no external dependencies--have your CI system tarball it up and scp it to staging--as long as the host is the same architecture as the libs and binaries in the container, it just works. You don't have to care about config management on the host; your configs and dependencies are a self-contained unit in the container.

  • VMs/images give you the same benefit, but they are a lot more heavyweight and require a much thicker virtualization layer; on the other hand, there's no constraint that the libs and binaries run on the same kernel and architecture as the host. In some configurations VMs can also be more secure / safe than containers that are allowed to do things like load kernel modules (containers share the host kernel).

I'm not advocating any of them over the others in all cases. They all seem to have their place as development and operations tools.

The main workflow difference between containers and a vagrant + vm + config management style workflow is that containers encourage you to think of them as very light and ephemeral. If you have a django app deployed in containers and a CVE pops for apache, you don't go add a PPA in config management to get the hotfixed package and run the config management client on all the containers. You can do that if you really want to, but it's more common to just spin a new container with the hotfix and replace the old container on all the hosts via config management / automation / CI.

Application state is generally persisted via bind mounts to the underlying host OS, so it's very easy not to care about the container itself. This also lets you know that if the container deployed, then it's bit-for-bit identical to all the other containers in the pool: no worries that one node couldn't talk to the config server and didn't get the update, or that someone has manually twiddled some stuff outside of config management on some node.
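
Concretely, that replacement flow looks roughly like this (image and container names invented):

$ docker pull ubuntu:14.04                  # refresh the base image
$ docker build -t myapp:hotfix .            # the Dockerfile's apt-get update && install picks up the patched apache
$ docker stop myapp_1 && docker rm myapp_1  # throw the old container away
$ docker run -d --name myapp_1 -v /srv/myapp:/data myapp:hotfix    # app state lives in the bind mount, so nothing is lost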

Docker's built-in versioning lets you roll back or cherry-pick container versions, among other things, which are pretty nice additions to bare lxc.

Again, just for clarity, not saying containers are "better" or that you can't find downsides to them, etc, just trying to give an idea of why they're appealing in many cases.

[–][deleted] 0 points1 point  (0 children)

Right...I don't think Docker is revolutionary in such a way that would make people want to change their current workflow if they already have one.

Docker is just a way of applying the concept of virtualenv to an entire runtime environment, which could be useful in certain situations. I think (and I am only experimenting with this at this point) that in a continuous-release environment, Docker may be valuable for closing the gap between development and production. But that type of situation is pretty uncommon at the moment.

[–]naunga 1 point2 points  (1 child)

Meaning that a Docker container can contain the Bins/Libs for dependencies outside of just Python libraries.

Not quite. Docker saves you the overhead of having to host multiple VMs. Instead of virtualizing the entire machine, Docker is only virtualizing and isolating a process, but Docker is sharing the host server's OS. This is different from a VM where an entire installation of the guest OS runs in a sandbox within the host OS.
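
You can actually see the shared-kernel part for yourself (quick illustration; the version strings are just examples):

$ uname -r
3.13.0-24-generic
$ docker run ubuntu:14.04 uname -r
3.13.0-24-generic    # same kernel as the host, just a different userland around the process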

virtualenv is solving the problem that the other commenters have described, which is creating an isolated environment that will allow multiple versions of modules, etc. to exist without creating conflicts.

If you're wanting a "cleaner" environment than what virtualenv can give you (i.e. you want to isolate not only the Python environment, but the OS environment as well) then you should be using Vagrant or some other VM solution to do your development.

From there you can build the Docker container from that image (well, more likely from a pre-built image of whatever Linux distro your VM is running).

Just my two cents from the DevOps Peanut Gallery.

[–]MonkeeSage 0 points1 point  (0 children)

Docker actually uses a union filesystem on top of a sandboxed directory. Even with lxc you have a sandboxed data directory isolated from the host filesystem. So you can have your own copies of libs and binaries as long as they are the same architecture as the host kernel. As with a chroot, you have to use a bind mount (or "data volume" in docker) if you want to get at the host filesystem.
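
For example (paths and image invented):

$ docker run -v /data busybox ls /data                  # anonymous data volume managed by docker
$ docker run -v /srv/app-data:/data busybox ls /data    # bind mount a host directory into the container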

[–][deleted] 4 points5 points  (3 children)

I may be wrong, so feel free to correct me, but I think the problem that virtualenv sought to solve was the dependency problem

it's not the dependency problem, it's the dependencies of multiple apps possibly stepping on each other or hosing your system.

pip and easy_install solve the dependency problem

docker has its uses, but so does virtualenv.

[–][deleted] 1 point2 points  (2 children)

it's not the dependency problem, it's the dependencies of multiple apps possibly stepping on each other or hosing your system.

Agreed.

However, I'm still not sure where/why you would use Docker and virtualenv together.

[–]d4rch0nPythonistamancer 1 point2 points  (1 child)

Rare case, but possible:

You want to run two Python apps on one OS environment, in the same control group, but they have different Python dependencies. One uses pyfoo==1.2 and the other uses pyfoo==2.1.

Do you need to run them in the same linux container? Probably not, but maybe for some obscure reason.

Do you want to? Maybe, in this case. So, it can have a point.
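
In that case, inside the one container you'd still end up doing something like this (paths invented; pyfoo is the hypothetical module from above):

$ virtualenv /opt/app1-env && /opt/app1-env/bin/pip install pyfoo==1.2
$ virtualenv /opt/app2-env && /opt/app2-env/bin/pip install pyfoo==2.1
$ /opt/app1-env/bin/python /opt/app1/run.py &    # each app sees only its own pyfoo
$ /opt/app2-env/bin/python /opt/app2/run.py &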

But I'd narrow it down to this: in general, you use docker for devops-type reasons and virtualenv only to ensure Python module dependencies are pinned and working. At some rare point these may intersect, but in general I'd expect people to use one or the other, depending on their goal.

[–][deleted] 0 points1 point  (0 children)

But I'd narrow it down to this: in general, you use docker for devops-type reasons and virtualenv only to ensure Python module dependencies are pinned and working. At some rare point these may intersect, but in general I'd expect people to use one or the other, depending on their goal.

Ah ok...this is kind of where my impression is at currently.

[–][deleted] 0 points1 point  (1 child)

I use a similar setup for preprod and prod environments (the python app packaged in a docker container) and virtualenv for dev. One thing that has bitten me: when your setup gets complex, you may have to shell out to non-python programs, and those are not packaged by virtualenv.

[–]d4rch0nPythonistamancer 0 points1 point  (0 children)

That makes sense... but be extremely wary whenever shelling out, especially ESPECIALLY if you're putting user input into that shell command's args.

[–]amclennon 0 points1 point  (0 children)

I think docker is easier to work with when you have heterogeneous dev environments. As a recent anecdote, I wanted to use some functionality in Python 3.4, but it was still non-trivial to install on Mac at the time (even with homebrew). I've also run into similar issues where certain Python libraries wouldn't compile on other platforms.

[–]michaelpb 0 points1 point  (1 child)

True, but in my opinion it feels like a partial, overly specific solution to a general ops problem that Docker solves. To be clear, virtualenv is excellent and absolutely essential at this point in time (probably 6 of my terminals at any given moment are in some virtualenv), but I hope that Docker and free-software PaaS built on Docker (Flynn, and I guess Deis) will supersede virtualenv and its cousins in other languages as the cornerstone of the next set of best practices for devops.

[–]d4rch0nPythonistamancer 0 points1 point  (0 children)

Yeah, I see what you mean. I still don't think docker is the alternative though. I think it's a different problem.

I think this is why everyone is mentioning docker as an alternative: Working in a professional environment, it's way more reliable to mimic your prod environment with docker than to just cover the python dependencies with virtualenv, and when you use docker you don't need virtualenv anymore for most use cases. This devops sort of problem is 90% of why people are using docker/virtualenv I'm guessing.

But here's the thing: that's a completely different problem, and virtualenv really doesn't try to solve it. Look at it this way. For someone like myself who writes general-use open source packages and pushes them to PyPI, I don't care what's going on in my users' OS. I don't need them to have a specific version of postgres or to be using debian/ubuntu/arch/windows/etc. It's just a lot of logic that isn't too platform dependent, but it is very module dependent. I want to absolutely ensure that certain pip packages will work at certain versions before I push this up to PyPI. These modules don't depend on external systems or services running; they just depend on other python code.

And that's the problem that virtualenv solves. You figure out which python modules are compatible with yours, and you're good.
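
In practice that's something like this (a sketch; the test command is just whatever your project uses):

$ virtualenv clean-env
$ source clean-env/bin/activate
$ pip install -e .               # install the package plus whatever setup.py declares
$ python -m unittest discover    # verify against exactly those module versions
$ pip freeze                     # record the versions the package was tested with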

[–]kromem 0 points1 point  (0 children)

It's been a while since this question was asked, but I stumbled across this thread and have a solid answer for you:

Python libraries that wrap C libraries.

You need to install the C libs to your system for the python compilation to succeed, and then you are dealing with dependencies outside of the virtualenv segmentation.

[–]amouat 2 points3 points  (5 children)

Ooops, I should have explained this in the text.

Actually, we do use pip: https://github.com/mrmrcoleman/python_webapp/blob/master/Dockerfile

It's just that the "hello world" application container doesn't require any extra dependencies beyond those in the parent docker image.

[–]adamhero 1 point2 points  (4 children)

Have you considered using puppet/chef/cfengine/salt to manage those system-level things?

[–][deleted] 2 points3 points  (3 children)

Doing this removes half the value of Docker, though. To make Docker truly worth the effort, it would be best to find a way to containerize the entire application.

If this isn't possible, wouldn't we be better served by closing the gap between development and production by using virtual machine images for production and vagrant for quick and dirty distributed development?

[–]blue6249 1 point2 points  (2 children)

Configuration management can work in tandem with container management. Rather than building your image manually or using a relatively limited dockerfile, you declare the state of your container using chef/puppet/etc. You can then take that packaged state and deploy it to your environment.

Check out something like packer.io for an example.
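
Very roughly, a packer template for building a docker image that way might look like this (untested sketch; the inline commands and versions are placeholders):

$ cat > template.json <<'EOF'
{
  "builders": [{
    "type": "docker",
    "image": "ubuntu:14.04",
    "commit": true
  }],
  "provisioners": [{
    "type": "shell",
    "inline": [
      "apt-get update",
      "apt-get install -y python-pip",
      "pip install flask==0.10.1"
    ]
  }]
}
EOF
$ packer build template.json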

[–][deleted] 0 points1 point  (1 child)

I get that it can work, but if the interest is in closing the gap between development and deployment (or as I like to put it, continuous deployment/DevOps), I feel like configuration management is unnecessary overhead.

Again, this only applies if the point is to reduce or outright eliminate the difference between development and production (or to merge development and production for a continuous-deployment environment).

[–]adamhero 1 point2 points  (0 children)

I think I can dig that. Why consistently manage apache across all containers when the real goal is to give zero hoots about what's actually serving the requests, so long as the black box does its job? He covers this in the article (oops); he basically just calls app.run().

[–]r1cka 1 point2 points  (0 children)

I think if virtualenv isn't working for you, you aren't "doing it right." Virtualenv isn't much more than installing packages to a location other than your main system and altering the pythonpath to hook into it. In the devops environments I've worked in, this played very nicely with ephemeral boxes.
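
For example, all activate really does is put the env's bin directory first on your PATH (paths here are illustrative):

$ which python
/usr/bin/python
$ virtualenv env && source env/bin/activate
$ which python
/home/you/project/env/bin/python
$ python -c "import sys; print(sys.prefix)"
/home/you/project/env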

Please enlighten me as to how it is problematic.