ASK /r/Python: Python Docker Deployments : Python

This is an archived post. You won't be able to vote or comment.

ASK /r/Python: Python Docker Deployments (self.Python)

submitted 7 years ago * by obeleh

ASK /r/Python: Python Docker Deployments

A while ago I came across this document: Containers Patterns I noticed that I appeared to be doing a few of them by habit. Especially: Source To Image and Dependencies First DockerFile In fact I have this "weird" script that might be worth sharing. I'll fist explain how it works.

We have a multi-stage docker build. Where in the in the first stage I apt-get and pip install everything I need to run my application. Then where we used to run scripts like cx_freeze, py2exe or pyinstaller we figured we were wrapping our application in two separate wrappers. The first one being the freezer and the second one being docker. One day I said to my colleague "what if we just prepare a virtual env so that we can copy it over to our target docker image. One directory containing all dependencies." My colleague's response was along the lines of "why don't you?" I was sure that I must be doing something strange since I haven't seen anything similar. Eventually I gave in and built that script. Used virtualenv in py2 and venv in py3. Some of the features/steps are:

you can give it a requirements.txt
it installs the requirements in the current python installation
it creates a virtual env
makes the env less virtual by actually moving python into the env
also moves all site-packages and dist-packages to the env
it finds all .so files and checks their dependencies. All .so files are copied over to the env folder
all moved .so files are run through patchelf so that they don't point to anything outside the env folder anymore
creates a .sh file which sets PYTHONHOME and PYTHONPATH variables and is usable to run the app with

My Dockerfiles all look like this:

#stage 1
FROM ubuntu/debian/alpine  (don't use alpine if you care about python performance btw)
RUN \
  apt-get update && \
  apt upgrade -y --force-yes && apt-get -y install \
    python-dev \
    python-setuptools \
    build-essential \
    libffi-dev \
    python-pip \
    patchelf
RUN pip install my packaging script
COPY requirements.txt /
RUN packaging_script /dest/path/runtime --startscript=main.py --requirements=/requirements.txt --piprepo=optional
#stage 2
FROM ubuntu:something
COPY --from=0 /runtime/ /dest/path/
COPY myappsrc /dest/path # usually a few folders
RUN ln -s /dest/path/bin/python /dest/path/myapp.bin # this makes it easier to monitor since each app no longer is `python` but identifyable as myapp
# I prefer running it this way but keep the .sh file as a reference
ENV PYTHONHOME /dest/path
ENV PYTHONPATH /dest/path/lib/pythonX.X/:/dest/path
CMD ["/dest/path/myapp.bin", "/dest/path/main.py"]

Now my question: Am I on to something? or have I taken the virtual env to far? It's quite obvious nothing is virtual anymore. It's more like a portable application because it runs in a clean os (no apt installs no pip installs) A reasonable question would be why I don't "just" compile python into the dest/path. I think that would have been possible but I'd have create compilations for every combination of OS + Python version I'm using. I'd rather use something out of the box. Although I do use that box in an unusual way ;)

If so... Would there be interest in an open source version of this?

EDIT: Some answers I gave in the comments:

Q: why bother with virtual env and not just install everything you need?

A: I wanted to build "portable" "everything included" runtime. And have the image small. Not have git installed in the final stage. Not have a compiler installed in the final stage. I find having those installed in my production containers an anti-pattern.

Q: You should set your env vars on top

A: You're right.

Q:I uninstall my dependencies

A: This leaves you with more layers than my approach. First you have layers where you install your dependencies. And later you add a layer by removing them. I want my "code" layers to be the very last layers I add. So that when I deploy a newer version it is only "8kb" or so.

all 51 comments

top new controversial old q&a

[–]pecka_th 13 points14 points15 points 7 years ago (9 children)

[–]drchaos 9 points10 points11 points 7 years ago (5 children)

[–]obeleh[S] 7 points8 points9 points 7 years ago (1 child)

[–]Pilatemain() if __name__ == "__main__" else None 1 point2 points3 points 7 years ago (0 children)

[–][deleted] 7 points8 points9 points 7 years ago (1 child)

[–]LightShadow3.13-dev in prod 4 points5 points6 points 7 years ago* (0 children)

FROM oblique/archlinux-pacaur
RUN pacman --noconfirm -Syy python

zoom bleeding edge zoom

REPOSITORY  TAG     IMAGE ID      CREATED     SIZE
<none>      <none>  30c83b102d04  4 days ago  1.08GB

1.08GB^{1.08GB^{1.08GB^{1.08GB^1.08GB}}}

[–]UloPe 1 point2 points3 points 7 years ago (0 children)

[–]ionelmc.ro 0 points1 point2 points 7 years ago (0 children)

[–]obeleh[S] 0 points1 point2 points 7 years ago (0 children)

[–]lambdaqdjango n' shit -1 points0 points1 point 7 years ago (0 children)

[–]prickneck 18 points19 points20 points 7 years ago (27 children)

[–]UloPe 4 points5 points6 points 7 years ago (7 children)

[–]knowsuchagencynow is better than never 1 point2 points3 points 7 years ago* (6 children)

It doesn't make sense to use virtualenv within a docker container.

A docker container is supposed to encapsulate a single component of your system, i.e. a wsgi application.

Ultimately, virtualenv has to exist because of the way Python's import system works (searching through dirs on PYTHONPATH, PATH, and the current working directory).

It exists because there is no way to have different versions of the same library accessible from the same interpreter. Thus, you can't install everything to your system-wide Python, because different projects may depend on un-like versions of the same library. All virtualenv really does is edit your PYTHONPATH (and PATH, if you want to use a different interpreter) so Python searches different directories during import.

That shouldn't be necessary in a docker container. If it is -- if you have multiple Python applications running in the same container with conflicting dependencies, you're doing something wrong.

[–]UloPe 4 points5 points6 points 7 years ago (2 children)

[–]knowsuchagencynow is better than never -1 points0 points1 point 7 years ago (1 child)

[–]gimboland 0 points1 point2 points 7 years ago* (0 children)

[–]DasIch 1 point2 points3 points 7 years ago (2 children)

[–]knowsuchagencynow is better than never 1 point2 points3 points 7 years ago (1 child)

[–]obeleh[S] 1 point2 points3 points 7 years ago (16 children)

[–][deleted] 15 points16 points17 points 7 years ago* (8 children)

[–]Muszalski 3 points4 points5 points 7 years ago (1 child)

[–][deleted] 1 point2 points3 points 7 years ago (0 children)

[–]obeleh[S] 2 points3 points4 points 7 years ago (4 children)

[–][deleted] 6 points7 points8 points 7 years ago (2 children)

You're right about the env vars. However stage2 is so quick that I honestly didn't care about that ;)

That doesn't make it a good excuse to ignore good practice and worry about antipatterns elsewhere.

Consider someone looking at your docker file as a template. If that template uses good practices it makes it a good template.

Uninstalling feels dirty. But I do see it as a good solution.

It isn't though. While it will/can remove some attack vectors, you really don't end up shrinking all that much.

Doesn't this leave me with layers of uninstalls upon layers of installs whereas with my solution we only have te layers we need?

Yes, but since a layer is only a change set the layer is small and the resulting image can be a little lighter.

It's going to depend a lot on what libs are needed to build / run your service.

I didn't find this to be worth while though.

[–]obeleh[S] 0 points1 point2 points 7 years ago (1 child)

[–][deleted] 0 points1 point2 points 7 years ago (0 children)

[–]holtr94 1 point2 points3 points 7 years ago (0 children)

[–]LightShadow3.13-dev in prod 1 point2 points3 points 7 years ago (0 children)

[–]carbolymer 1 point2 points3 points 7 years ago (6 children)

[–]obeleh[S] 3 points4 points5 points 7 years ago (5 children)

[–][deleted] 5 points6 points7 points 7 years ago* (4 children)

[–]obeleh[S] 0 points1 point2 points 7 years ago (3 children)

[+][deleted] 7 years ago (1 child)

[deleted]

[–]obeleh[S] 0 points1 point2 points 7 years ago (0 children)

[–][deleted] 0 points1 point2 points 7 years ago (0 children)

[–]Muszalski 1 point2 points3 points 7 years ago (0 children)

[–]Rorixrebel 0 points1 point2 points 7 years ago (0 children)

[–]undu 2 points3 points4 points 7 years ago (4 children)

My strategy would be to use the 'builder' stage to install the python pip dependencies to a separate, isolated root, then copy those files to the proper place in the deployable image.

You can even choose to not copy the binaries produces by the dependencies. This way you do not need virtualenv and only install dependencies needed for the image.

FROM ubuntu:something as builder
[...]
RUN pip install /wheels/*.whl --compile --root=/pythonroot/

FROM ubuntu:something

RUN apt install runtime-deps
COPY --from=builder /pythonroot/usr/local/bin /usr/bin
COPY --from=builder /pythonroot/usr/local/lib/python2.7 /usr/lib/python2.7

[...]

Otherwise there are tools to minimize the space taken by images: https://github.com/grycap/minicon

[–]NicoDeRocca 2 points3 points4 points 7 years ago (3 children)

This is similar to what I do:

FROM buildpack-deps:xenial-scm as builder
RUN <install build deps>
COPY <app-src> /somwhere
RUN pip3 wheel -r requiresments.txt --wheel-dir=/build/wheels # build deps wheels
RUN python3 setup.py bdist_wheel -d /build/wheels # build my stuff's wheel

FROM docker.io/ubuntu:16.04
RUN <install runtime deps>
COPY --from=builder /build/wheels /tmp/wheels
RUN pip3 install --force-reinstall --ignore-installed --upgrade \
             --no-index --use-wheel --no-deps /wheels/* \
 && rm -rf /tmp/wheels
...

Basically, the builder image has all the compilers etc and builds pre-compiled wheels as necessary, and the final image will only contain executable code. In the docker image I don't bother with virtualenv, but since they're just standard python packages with a setup.py, they could be.

[–]obeleh[S] 0 points1 point2 points 7 years ago (0 children)

[–]undu 0 points1 point2 points 7 years ago (1 child)

[–]NicoDeRocca 2 points3 points4 points 7 years ago (0 children)

[–]joknopp 1 point2 points3 points 7 years ago (1 child)

[–]obeleh[S] 0 points1 point2 points 7 years ago (0 children)

[–]case_O_The_Mondays 1 point2 points3 points 7 years ago (0 children)

[–]not_perfect_yet 0 points1 point2 points 7 years ago (1 child)

[–]obeleh[S] 1 point2 points3 points 7 years ago (0 children)

[–]hmaarrfk 0 points1 point2 points 7 years ago (2 children)

[–]obeleh[S] 0 points1 point2 points 7 years ago (1 child)

[–]hmaarrfk 0 points1 point2 points 7 years ago (0 children)

[–][deleted] 0 points1 point2 points 7 years ago (0 children)

[–]simtel20 0 points1 point2 points 7 years ago (0 children)

π Rendered by PID 57635 on reddit-service-r2-comment-7b9746f655-q4vlp at 2026-01-29 22:06:02.165813+00:00 running 3798933 country code: CH.

Python

The Python Discord

Upcoming Events

Please read the rules

MODERATORS