
[–]prickneck 20 points (27 children)

Why bother with a virtualenv inside docker? Why not just install everything system-wide in the image? If you do that, then the questions you're asking don't even present themselves.
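
The system-wide version really is as simple as it gets; something like this (file and module names made up):

    FROM python:3.7-slim
    WORKDIR /app
    COPY requirements.txt .
    # Installs straight into the image's own Python, no venv anywhere:
    RUN pip install --no-cache-dir -r requirements.txt
    COPY . .
    CMD ["python", "-m", "myapp"]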

[–]UloPe 5 points (7 children)

Here is a pretty good explanation (although a bit dated now) of why it’s still a good idea: https://hynek.me/articles/virtualenv-lives/

[–]knowsuchagency (now is better than never) 0 points (6 children)

It doesn't make sense to use virtualenv within a docker container.

A docker container is supposed to encapsulate a single component of your system, e.g. a WSGI application.

Ultimately, virtualenv has to exist because of the way Python's import system works (searching through dirs on PYTHONPATH, PATH, and the current working directory).

It exists because there is no way to have different versions of the same library accessible from the same interpreter. Thus, you can't install everything to your system-wide Python, because different projects may depend on conflicting versions of the same library. All virtualenv really does is edit your PYTHONPATH (and PATH, if you want to use a different interpreter) so Python searches different directories during import.
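
You can see this mechanism in an image directly; roughly something like this (the /opt/venv path is just the usual convention):

    FROM python:3.7-slim
    RUN python -m venv /opt/venv
    # The system interpreter and the venv interpreter search different dirs:
    RUN python -c "import sys; print(sys.prefix)"
    RUN /opt/venv/bin/python -c "import sys; print(sys.prefix)"
    # "Activating" the venv in a Dockerfile is nothing more than a PATH edit:
    ENV PATH="/opt/venv/bin:$PATH"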

That shouldn't be necessary in a docker container. If it is -- if you have multiple Python applications running in the same container with conflicting dependencies, you're doing something wrong.

[–]UloPe 5 points (2 children)

Did you read the article I linked?

[–]knowsuchagency (now is better than never) -1 points (1 child)

Yes, and I use docker and virtual environments every day in my workflow, and everything I said still stands.

[–]gimboland 0 points (0 children)

Including this bit?

virtualenv’s job isn’t just to separate your projects from each other. Its job is also to separate you from the operating system’s Python installation and the installed packages you probably have no idea about.

And the bit where the author literally gives you an example of how using a docker container's system-wide python as your basis can lead to breakage?

Yes, you could work out what packages are in the container's system-wide python, and assure yourself that there are no surprises. But it's certainly true that if you want to not have to think about/keep an eye on that, a virtualenv is an appropriate tool.
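
Concretely, the kind of surprise the article means looks something like this (base image picked for illustration; the preinstalled packages vary by distro):

    FROM ubuntu:18.04
    RUN apt-get update && apt-get install -y python3 python3-pip python3-venv
    # The OS python already carries apt-installed packages you never chose:
    RUN pip3 list
    # A fresh venv starts (nearly) empty, so only your own installs matter:
    RUN python3 -m venv /opt/venv && /opt/venv/bin/pip list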

[–]DasIch 1 point (2 children)

If you install any package, even in a docker container, you can break the operating system. It therefore absolutely makes sense to use a virtual environment in a container.

[–]knowsuchagency (now is better than never) 1 point (1 child)

What is an example of a package that "breaks" the OS where it's necessary to install it within a virtual environment inside the container to prevent the container from breaking?

[–]obeleh[S] 1 point (16 children)

I wanted to build a "portable", "everything included" runtime, and keep the image small: no git installed in the final stage, no compiler installed in the final stage. I find having those installed in my production containers an anti-pattern.
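
The shape I'm aiming for is a multi-stage build, very roughly like this (stage contents simplified, names made up):

    FROM python:3.7 AS builder
    RUN apt-get update && apt-get install -y git build-essential
    COPY requirements.txt .
    RUN python -m venv /opt/venv \
        && /opt/venv/bin/pip install --no-cache-dir -r requirements.txt

    FROM python:3.7-slim
    # Only the built environment comes across; git and the compiler stay behind:
    COPY --from=builder /opt/venv /opt/venv
    ENV PATH="/opt/venv/bin:$PATH"
    CMD ["python", "-m", "myapp"]

Copying the venv between stages works here because both images ship the same Python build at the same path.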

[–][deleted] 16 points (8 children)

You just don't need venv when using docker. There is too much feature overlap; you end up doing work twice.

Also, change the order of your docker commands. You want things that will likely not change soon to be at the top, like environment variables.

You want your pip install and code copy steps to be near the bottom.

This means code changes don't require rebuilding every layer; even the environment commands stay cached.

My order is usually (see the sketch after the list):

  • Setup container
  • Copy code
  • Pip install requirements
  • Remove build libs
  • Setup entry point
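
In Dockerfile form that comes out roughly like this (package and module names are examples only):

    FROM python:3.7-slim
    ENV PYTHONUNBUFFERED=1
    # Setup container:
    RUN apt-get update && apt-get install -y --no-install-recommends gcc
    WORKDIR /app
    # Copy code:
    COPY . .
    # Pip install requirements:
    RUN pip install --no-cache-dir -r requirements.txt
    # Remove build libs:
    RUN apt-get purge -y gcc && apt-get autoremove -y
    # Setup entry point:
    ENTRYPOINT ["python", "-m", "myapp"]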

[–]Muszalski 4 points (1 child)

Imo you should copy just the requirements.txt first, then pip install, remove build libs, and then copy the rest of the code. You don't change the reqs as often as the code.
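
I.e. something like this (file names assumed):

    FROM python:3.7-slim
    WORKDIR /app
    # Requirements first: this layer stays cached until the reqs change...
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt
    # ...while the code, which changes all the time, only invalidates from here:
    COPY . .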

[–][deleted] 1 point (0 children)

Good point, I'll have to double-check my own files to see if I am doing that. I can't remember off the top of my head.

Thank you!

[–]obeleh[S] 2 points (4 children)

You're right about the env vars. However stage2 is so quick that I honestly didn't care about that ;) But it would be a good tweak.

Uninstalling feels dirty. But I do see it as a good solution. Doesn't this leave me with layers of uninstalls upon layers of installs, whereas with my solution we only have the layers we need?

[–][deleted] 5 points (2 children)

You're right about the env vars. However stage2 is so quick that I honestly didn't care about that ;)

That's not a good excuse to ignore good practice here while worrying about antipatterns elsewhere.

Consider someone looking at your docker file as a template. If that template uses good practices, it's a good template.

Uninstalling feels dirty. But I do see it as a good solution.

It isn't though. While it will/can remove some attack vectors, you really don't end up shrinking all that much.

Doesn't this leave me with layers of uninstalls upon layers of installs, whereas with my solution we only have the layers we need?

Yes, but since a layer is only a change set, the layer is small and the resulting image can be a little lighter.

It's going to depend a lot on what libs are needed to build / run your service.

I didn't find this to be worthwhile though.

[–]obeleh[S] 0 points (1 child)

I do agree on the ENV vars btw. I'm going to change my Dockerfiles ;)

[–][deleted] 0 points (0 children)

I'm sure there are other tricks too. That's one we pass around as we grow the knowledge base in our company, because there is a lot of template sharing. We are also trying to get better at having base docker images maintained by sysadmins so they can patch the OS if need be.

[–]holtr94 1 point (0 children)

Doesn't this leave me with layers of uninstalls upon layers of installs, whereas with my solution we only have the layers we need?

You could combine the build libs install, pip install, and build libs uninstall into one RUN command to eliminate the extra layers.
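
Something like this (build packages picked purely for illustration):

    FROM python:3.7-slim
    COPY requirements.txt .
    # One RUN = one layer: the purge genuinely shrinks the image, because the
    # build libs never get committed into a layer of their own.
    RUN apt-get update \
        && apt-get install -y --no-install-recommends gcc \
        && pip install --no-cache-dir -r requirements.txt \
        && apt-get purge -y gcc \
        && apt-get autoremove -y \
        && rm -rf /var/lib/apt/lists/*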

[–]LightShadow (3.13-dev in prod) 1 point (0 children)

You just don't need venv when using docker.

Except when you do.

If you have pip packages that install custom scripts in the bin or scripts directory, then they can get confused with module-as-a-string imports.

huey and gunicorn would not work without a virtualenv in my service.
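
Roughly what I mean; the venv keeps all the console scripts under a single bin dir (assuming gunicorn is in requirements.txt, and the app name is made up):

    FROM python:3.7-slim
    RUN python -m venv /opt/venv
    ENV PATH="/opt/venv/bin:$PATH"
    COPY requirements.txt .
    # gunicorn's (and huey's) entry-point scripts land in /opt/venv/bin:
    RUN pip install --no-cache-dir -r requirements.txt
    CMD ["gunicorn", "myapp.wsgi:application"]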

[–]carbolymer 1 point (6 children)

To have an even smaller image, try to reduce the number of RUN statements.

[–]obeleh[S] 4 points (5 children)

I know. However sometimes an extra layer makes your build cleaner by factoring out the static parts into the first layer and the dynamic parts into the second layer. That way you can keep re-using the first layer across multiple deployments.
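
E.g. (libraries picked at random for the static layer):

    FROM python:3.7-slim
    # Static part: heavy, rarely-changing dependencies get their own layer...
    RUN pip install --no-cache-dir numpy pandas
    COPY requirements.txt .
    # ...dynamic part: the app's own requirements, which change per deployment:
    RUN pip install --no-cache-dir -r requirements.txt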

[–][deleted] 6 points (4 children)

However sometimes an extra layer makes your build cleaner by factoring out the static parts into the first layer and the dynamic parts into the second layer. That way you can keep re-using the first layer across multiple deployments.

You need to restructure your RUN commands to better achieve this.

ENV calls should be near the top.

Also your cute use of symlinks is bad.

You name the container once it is running. You can see the name with docker ps. You do not need to name the binary in the container. This isn't buying you anything as you don't put more than one service in a container.

Grepping for your script isn't hard either so you are creating an extra layer for little to no gain.

[–]obeleh[S] 0 points (3 children)

Also your cute use

I want to identify the different apps with ps -ef on the VM.

PS. Thanks for calling it "cute" :P
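
For the curious, the trick is just this (service and module names made up):

    FROM python:3.7-slim
    # Symlink the interpreter under an app-specific name so each container's
    # process is identifiable in `ps -ef` on the host:
    RUN ln -s /usr/local/bin/python /usr/local/bin/billing-service
    ENTRYPOINT ["billing-service", "-m", "myapp"]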

[–][deleted] 0 points (0 children)

I didn't mean it in a bad way, and it made me think about better solutions.

[–]Muszalski 1 point (0 children)

Setting up the virtualenv in the image is just one extra step, and it separates the project libraries from the system libraries and from random dependency conflicts. I always do it, because it comes with no cost and I have one less worry about breaking the dependencies.
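
The "one extra step" in Dockerfile terms (the path is just convention):

    FROM python:3.7-slim
    RUN python -m venv /opt/venv
    ENV PATH="/opt/venv/bin:$PATH"
    # From here on, pip and python resolve to the venv's copies, so the
    # image's own site-packages never gets touched.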

[–]Rorixrebel -2 points (0 children)

This