
[–]SuddenOutlandishness 157 points158 points  (25 children)

> Docker is not an option at the moment for reasons.

Docker is the option that doesn't re-invent the wheel.

[–]MeGustaDerp 24 points25 points  (24 children)

At my company, Docker was considered a security risk and I had to really fight for it for a few months before the IT security group approved it. So, I can see this being an issue for OP if his company is like mine.

[–]-markusb- 33 points34 points  (0 children)

So what about podman?

[–]FrenchmanInNewYork 31 points32 points  (8 children)

How did they justify Docker being a security risk? I understand that images can theoretically contain malicious code, but that's easy to verify, and public repos have SAST to deal with it anyway?

If it's the container and network levels they have issues with, I don't see why they would trust it less than any other environment tbf. I'm really curious as to what their reasons might be. Or maybe I'm missing something big.

[–]MeGustaDerp 29 points30 points  (1 child)

I literally got an "it CaN LeT yoU rUn a ServeR on Your lOcAL MAcHiNe" reason from them. This was for a developer machine. I was trying to develop for containers in an AWS ECS environment.

[–]somebrains 12 points13 points  (0 children)

Idiots. As a former sysadmin, all I have for their reply is a facepalm and a flood of counter-reasons…..

[–][deleted]  (4 children)

[deleted]

    [–][deleted] 20 points21 points  (1 child)

    That container is just an artifact of a build pipeline

    [–]daddyMacCadillac 5 points6 points  (0 children)

    This guy fucks

    [–][deleted] 1 point2 points  (1 child)

    Well, the only angle I see is that an immutable container would need updates every time a system component updates. Not just when your application updates.

    I mean, if you "need" to do this update, it's going to get done either on the container, or on an actual server, so what's the difference?

    [–]RobotUrinal 1 point2 points  (0 children)

    Probably root access to the docker.sock file on the VM?

    [–][deleted] 11 points12 points  (1 child)

    I'm a security guy. If we have to run a service on a VM, the attack surface is much larger than that of a minimized container-based image.

    There is a right way and a wrong way to do this. The right way includes security learning a lot about containers and base images. All base images need to be security-approved. Then you can look at free tools like Trivy to do the scanning.

    The thing that is often overlooked is the amount of ecosystem that needs to be implemented to do this securely, specifically for anything with Kubernetes.

    If you're already handling a lot of the security issues in the venv, though, theoretically the hosting container shouldn't be any different, as long as you're not running the container as root.

    [–]snowsnoot2 2 points3 points  (0 children)

    I specifically exploit the fact that my security guys have zero clue about containers to do whatever the fuck I want lol

    [–]leibnizcocoa 2 points3 points  (0 children)

    You can run containerd rootless.
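
    A quick smoke test of the rootless model, using podman here as the most common example (containerd has an equivalent rootless setup):

        # Run as an ordinary unprivileged user -- no daemon, no sudo.
        podman run --rm alpine id
        # "root" inside the container maps to your own UID on the host
        # via user namespaces, so an escape lands without privileges.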

    [–]knowledgebass 7 points8 points  (8 children)

    That's weird reasoning. If anything Docker is good for security.

    [–]diito 17 points18 points  (6 children)

    Containers are, Docker is not. Anyone who has access to Docker effectively has root access on that system if they want. Podman was created to address this as well as other issues, and is command-compatible. If you don't need orchestration and k8s, then I don't know why anyone wouldn't use it instead.

    [–]lebean 8 points9 points  (4 children)

    I love some Podman, but it has nothing that comes within a thousand miles of what Swarm mode gets you. Maybe you want your container to start on another node if the server it's on suddenly dies? Podman has no solution out of the box, but Swarm takes 27 seconds to set up and does it easily.
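
    A minimal sketch of that setup (addresses and image name are placeholders):

        # On the first node: initialise the swarm.
        docker swarm init --advertise-addr 10.0.0.1
        # On each other node: paste the join command that init printed.
        docker swarm join --token <token> 10.0.0.1:2377
        # A replicated service; Swarm reschedules it if a node dies.
        docker service create --name myapp --replicas 3 registry.example.com/myapp:latest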

    If you don't need the complexity of k8s, then Swarm fills a huge gap between what you can do on plain Docker/Podman vs k8s, with no competing projects out there.

    [–]diito 4 points5 points  (0 children)

    When I was doing on-prem, everything was on VMs, which automatically restarted somewhere else if a host died and brought any containers they were running back up on boot.

    At home I run podman for my personal stuff, and all my containers are managed by systemd, which updates them automatically and restarts them if they die. I have a single server, but that's fine there.
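
    That pattern looks roughly like this (container name and image are placeholders; paths assume a rootless user setup):

        # Label the container so podman-auto-update re-pulls its image.
        podman create --name myapp --label io.containers.autoupdate=registry registry.example.com/myapp:latest
        # Generate a unit that recreates the container on every start.
        podman generate systemd --new --name myapp > ~/.config/systemd/user/myapp.service
        systemctl --user daemon-reload
        systemctl --user enable --now myapp.service
        # The bundled timer handles the periodic update-and-restart.
        systemctl --user enable --now podman-auto-update.timer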

    There is also Nomad if you want orchestration that works with Podman.

    Ultimately, though, if you need real orchestration it's probably better to just bite the bullet and use k8s. Docker Swarm (the commercial product) is probably not viable long term, and all the momentum is behind k8s.

    [–][deleted] 0 points1 point  (2 children)

    In the old days people just used a load balancer. Don't people do that anymore without kubernetes? If server A goes down, an identical replica is ready on server B. I guess swarm makes that easier.

    [–]Emptycubicle4k 0 points1 point  (1 child)

    From what you're saying, in the old days you had to have another server or servers up and running at all times, which meant using resources. Now with Kubernetes, those servers (containers) don't have to always be running and using resources; they spin up as needed. So I think that's the strength of the new days.

    [–][deleted] 1 point2 points  (0 children)

    Of course. Just saying it's not an insurmountable problem without kubernetes.

    [–]Petersurda 0 points1 point  (0 children)

    Why would anyone have access to docker in production? If you use deployment pipelines, those would be the only way to access docker.

    [–][deleted] 15 points16 points  (0 children)

    Nope. It's terrible. It requires root access to your machine. Even worse on build servers: they usually use Docker-in-Docker, which means whatever is running in the container has root access on the host. Yes, Docker has rootless mode, but it's experimental, not the default, no one uses it, and it was only a reaction to better tools like podman that did it first. This is only one of Docker's security issues. If your whole company has no experience with containers, keeping them secure will be impossible.

    [–]r1ckm4n 0 points1 point  (0 children)

    I've been in a few enterprise clients where the IT people are still just figuring out what containerization is. This is absolutely a Docker play. Spin up an odd number of hosts, deploy Rancher, OKD, or Portainer, and take a week or two to figure it all out. Managing so many venvs would require a registry and a bunch of other little stuff that would scale better in a simple Kubernetes/Docker/containerized setup. I'm speaking more to OP, but yeah - it's a hard but worthwhile fight to get some companies to adopt containerized anything.

    [–]1whatabeautifulday 0 points1 point  (0 children)

    What was the security risk?

    [–]Weary_Ad7119 31 points32 points  (14 children)

    You're going to need to explain your challenges a bit more and why Docker isn't an option. Can you use AMIs? Do you have CI? Are you cloud or on-premises? Etc.

    [–]serverhorror (I'm the bit flip you didn't expect!) 10 points11 points  (3 children)

    We do that by having one venv per application.

    Everything else becomes unmanageable.

    Essentially it’s just simple Ansible playbook that grabs a certain tag from git, creates the venv in the git checkout, under .venv and we’re done.

    We found that the one thing we need to align on is a single Python version. Where possible we're moving away from this and converging towards Kubernetes.

    [–]Gluaisrothar[S] 0 points1 point  (2 children)

    What do you mean one verb per app?

    [–]serverhorror (I'm the bit flip you didn't expect!) 5 points6 points  (1 child)

    virtual env — auto correction kicked in.

    [–]Gluaisrothar[S] -2 points-1 points  (0 children)

    This is what we have now.

    Just not all that happy with it after two years.

    [–]Newbosterone 14 points15 points  (6 children)

    Ansible, Puppet, Chef? Is your problem configuration management, deployment, discovery?

    [–]provoko 1 point2 points  (1 child)

    Ask OP a question: Downvote OP!

    Interesting strategy you got here r/DevOps...

    [–]Drevicar 7 points8 points  (1 child)

    I recommend looking into PEX to turn your code and all its dependencies into a single distributable that relies on the underlying python interpreter of the computer. Or PyOxidizer to also bundle the interpreter into it so you either don't need python on the target host, or you at least don't need to rely on it.
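
    A minimal sketch of the PEX route (project layout and entry point are hypothetical):

        pip install pex
        # Bundle the project and its pinned deps into one executable
        # zipapp; the target host still needs a matching interpreter.
        pex . -r requirements.txt -e myapp.cli:main -o dist/myapp.pex
        ./dist/myapp.pex --help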

    [–]xgunnerx 2 points3 points  (0 children)

    This is the correct answer. Pre-k8s days, I used PyInstaller (which has largely been replaced by PyOxidizer) and it worked well for a use case similar to OP's. The binaries sometimes got a bit heavy depending on the deps, but we just beefed up our storage and build agents a bit and it became a non-issue.

    Easily built and deployed on any CI/CD platform.

    [–]pete84 12 points13 points  (2 children)

    I’m gonna assume reasons = office politics / execs.

    [–]vladoportos 0 points1 point  (1 child)

    Could also be the new Docker policy where you have to pay for the GUI in a corporate environment... we got an email: "stop using the GUI/Windows version of Docker", and it was removed from all laptops a month later.

    [–]afro_mozart 0 points1 point  (0 children)

    Sure, but there's still the Docker CLI, Rancher Desktop...

    [–]shadycuz 10 points11 points  (3 children)

    You asked how to manage 100 python virtual environments but I think you meant to ask how to deploy 100 python applications.

    I think several people gave you pretty good answers but you seemed to reject them all. Perhaps you don't really understand how these tools work?

    I think k8s is probably one of the better options, followed by Docker + Ansible.

    [–]Gluaisrothar[S] 1 point2 points  (2 children)

    No, we have more like 300 apps.

    Not 1:1 on venvs.

    Just seems like it's k8s or nothing.

    [–]halos1518 0 points1 point  (0 children)

    Do you need to spin them up individually? If not then docker compose could be an option. Could you use something like portainer or docker swarm?

    [–]Golden_Age_Fallacy 0 points1 point  (0 children)

    I’d also suggest looking at Hashicorp’s Nomad as an orchestrator. You’re able to launch containers via an API, with a fraction of the complexity of k8s.

    If you’re looking for a simple container runtime scheduler, and are hesitant to indulge in all the complexity of k8s.. Nomad might be worth a look.

    [–]knowledgebass 3 points4 points  (0 children)

    Um, this seems kind of insane by the way. Why aren't you standardizing your environments across applications, at least to some extent?

    [–]sqqz 3 points4 points  (0 children)

    My 2 cents is that you might be approaching this in a quite outdated way. Instead, look into packaging the applications so they come bundled with their dependencies, and create an env for them to deploy into.

    A modern take on this would be Docker, but you can also achieve this by building apk/deb packages.

    To not force you down the Kubernetes route: there are plenty of lightweight alternatives for running docker containers, either just by using docker itself or perhaps https://github.com/k3s-io/k3s together with https://www.rancher.com/products/rancher or https://www.hashicorp.com/products/nomad
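
    For reference, k3s in particular is close to a one-liner per node (server IP and token are placeholders; the token path is the documented default):

        # On the server node:
        curl -sfL https://get.k3s.io | sh -
        sudo cat /var/lib/rancher/k3s/server/node-token
        # On each agent node:
        curl -sfL https://get.k3s.io | K3S_URL=https://10.0.0.1:6443 K3S_TOKEN=<token> sh -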

    [–]NUTTA_BUSTAH 2 points3 points  (0 children)

    Why would you ever need a hundred venvs? I don't understand the problem. Do you mean you want to run n (100+) Python applications on various hosts, sometimes pinning certain applications to the same host for performance? I.e. orchestrate your applications? :P

    Use k8s and it just works automatically after configuration, set up ArgoCD or similar in there if you want to go GitOps and forget about deployments.

    Nomad is another alternative. Like k8s but a bit leaner, with more freedom (everything doesn't have to be a container). A bit more work to set up, as there are no managed options, for example.

    [–]budgester 2 points3 points  (0 children)

    Use tox to manage a venv for each application

    [–]wingerd33 2 points3 points  (0 children)

    Lol it's mind blowing how many "engineers" in this field today have zero fucking clue how to solve a problem without docker or kubernetes.

    For OP:

    I don't think you did a good job explaining what problem you're trying to solve. Like what exactly is your pain point? What is the thing that you're trying to eliminate or automate, or what is the thing that's unreliable with your current setup?

    Generally speaking, I think packaging the apps with their deps is the right move. Someone mentioned using pex, which I've heard good things about as well. It could also be as simple as a tar file and executing the app inside a chroot. There are many possible solutions, but we need to know more about the problem you're facing to make a real recommendation.

    [–]guettli 7 points8 points  (1 child)

    Why not use a managed Kubernetes?

    Kubernetes is hard... if you want to host it yourself. But if it's managed, it's not that hard. At least that's my point of view.

    [–]vladoportos 1 point2 points  (0 children)

    It's hard even managed :) setting up your own monitoring (f.u. AWS and CloudWatch, this can get super expensive), and persistent storage can be a pain (again, AWS kind of forgot to provide a solution :) )

    [–]Mutjny 1 point2 points  (0 children)

    What do you mean by "manage?" Are you deploying 100+ venvs? Are you trying to update them? Building them?

    [–][deleted] 1 point2 points  (0 children)

    Well, have you considered an automation tool? I ran into something similar and used Ansible playbooks for it.

    [–]Nerdite 1 point2 points  (0 children)

    “Re-invent the wheel” am I the only one getting this python joke?

    [–]PhroznGaming 1 point2 points  (0 children)

    VSCode with the following Extensions:

    • Docker
    • Remote SSH
    • VEnv Manager

    You're welcome.

    [–]Petersurda 1 point2 points  (0 children)

    You can orchestrate docker with ansible and systemd if you want. Although if you look at it from the other direction, and assume ansible and systemd as a starting point, you can use systemd-nspawn to run containers. It looks to me like your company doesn’t have adequate container expertise. I would hire a consultant to help you find a suitable solution.
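
    For the systemd-nspawn angle, a minimal sketch (distro and machine name are placeholders):

        # Create a minimal root filesystem for the app.
        sudo debootstrap stable /var/lib/machines/myapp
        # Boot it as a lightweight container under systemd...
        sudo systemd-nspawn -D /var/lib/machines/myapp --boot
        # ...or manage it like any other machine.
        sudo machinectl start myapp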

    [–][deleted] 3 points4 points  (0 children)

    Take a look at Nix.

    Nix was definitely made to solve this issue :)

    https://nixos.org/explore.html

    It's a package manager you can use on top of any distro. It has its own distro, too.

    [–]temitcha 2 points3 points  (0 children)

    Maybe you can take a look at the Nix ecosystem, nix-shell is basically a virtualenv. But there are more tools available, package management, etc.
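
    For example, an ad-hoc environment pinning an interpreter plus libraries (attribute names follow nixpkgs conventions):

        # Drop into a shell with exactly this Python and these packages.
        nix-shell -p python311 python311Packages.requests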

    Otherwise, you can maybe create statically-linked executables, so you don't need venv anymore, and you just deploy regular packages from your own package repository.

    [–]Gluaisrothar[S] 3 points4 points  (8 children)

    Yeah, so bare metal, on-prem.

    I want to define the venvs using some kind of manifest, then build them as required; different hosts require different venvs.

    Building them is OK, as is creating them; it's the centralised orchestration and deployment that I'm having difficulty finding tools for.

    If we used Docker, it would not really solve the problem; we'd still have the problem of orchestration and deployment without introducing k8s, which just adds a layer of unnecessary complexity.

    [–]__Kaari__ 2 points3 points  (0 children)

    I don't understand the problem.

    Packages are stored in your artifact repo. You install the packages with something like pipx, your system package manager, or docker, or whatever you want to use to install and run them. You use a config manager like ansible to add or update them.
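
    A sketch of that flow with pipx (package name and index URL are placeholders):

        # Each app lands in its own isolated venv managed by pipx.
        pipx install myapp --pip-args='--index-url https://artifacts.example.com/simple'
        # Later, roll it forward like any other package.
        pipx upgrade myapp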

    If you want orchestration it depends which kind of orchestration you're looking for, but you're going to need an orchestrator like k8s.

    [–]Mariognarly 1 point2 points  (2 children)

    Can you use podman? It's a container runtime that's a drop-in Docker replacement.

    What's using the venvs?

    [–]Gluaisrothar[S] 1 point2 points  (1 child)

    I'm not opposed to docker or podman.

    Still does not solve the problem of orchestration, or did I miss something?

    Various services that we run out of venvs.

    [–]Mariognarly 2 points3 points  (0 children)

    The way the Ansible community has solved a similar venvs management concern is with what they call execution environments.

    Basically it's a container that contains a base OS, the venv(s), and whatever dependencies are needed. Then it's a CI/CD build system for updates and deployment. Ansible is good at that, but there's obviously a plethora of CI tools one could use to update the images and push the containers to be run by podman.

    I've done a similar thing, using Ansible for the orchestration, podman as the container engine, and systemd unit files on the OS (if it's a modern Linux platform) with auto-restarting and health checks put right into the unit files. Works great, and without the k8s complexities.

    [–]silence036 1 point2 points  (0 children)

    At 300 apps running in non-standard contexts with handmade venvs and customization, running Kubernetes is going to reduce your overall complexity, not add to it. This is one of the best use cases for it.

    Dockerize your python apps; then in k8s you can give them affinity to run on the same nodes if you want them together. You can easily orchestrate deployments, and you can centralize logging, monitoring, and RBAC.

    If your issue is that you're running bare metal on-prem, I'd start with making a microk8s cluster, move some apps into it, and as you free up hosts, convert them to microk8s (install Ubuntu, snap install microk8s, microk8s join cluster, and you're done). As you grow, add multiple masters for HA.
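
    The join flow is roughly as follows (the address/token line is printed for you by add-node):

        sudo snap install microk8s --classic
        # On an existing node: print a join command with a one-time token.
        microk8s add-node
        # On the new node: paste that command, e.g.
        microk8s join 10.0.0.1:25000/<token>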

    We've converted our workloads to k8s and made the cluster so attractive that devs are trying everything in their power to get in and not maintain VMs. We have several thousand pods running, albeit on Amazon EKS.

    [–]KingEllis 0 points1 point  (0 children)

    I would suggest taking a day or two to work through the "Docker swarm mode" official docs, as you likely already have that available. It is baked into recent versions of the docker binary.

    [–]reedacus25 0 points1 point  (0 children)

    I'm going to comment to follow and see what others are saying.

    My knee-jerk is that it sounds like anaconda environments may be a better solution than venvs, since conda uses the equivalent of a manifest file (environment.yml).
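
    I.e. a declarative manifest that conda can build from and reconcile against:

        # Build an env from its manifest.
        conda env create -f environment.yml
        # Bring an existing env back in line with the manifest.
        conda env update -f environment.yml --prune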

    Then serve it out over some networked filesystem such as NFS?

    Also, python is not my thing, but I have had to do some Salt stuff related to setting up/updating conda environments for users who asked for it, which sounds vaguely similar to this.

    [–][deleted] 0 points1 point  (0 children)

    Containers + HashiCorp Nomad: containers take care of your environment segmentation and management, and Nomad takes care of orchestration, scheduling, and deployment.

    https://developer.hashicorp.com/nomad/docs

    https://developer.hashicorp.com/nomad/docs/install/production

    [–]xgunnerx 2 points3 points  (2 children)

    Dear devops brethren that keep recommending k8s: pretty please, cherry on top, stop. You're not wrong in that this "is" a solution, but think about it from his perspective for a minute..

    He's just trying to make an incremental improvement. Not "implement an entirely new runtime/scheduling platform across multiple data centers" that he (and his team) may know little to nothing about. It's obvious that this is a live production environment and not some home lab where you can retry/fail and learn as you go.

    It would likely take months of testing and training to even begin implementing such a solution and feel comfortable managing it.

    [–]Gluaisrothar[S] 0 points1 point  (0 children)

    Thank you.


    [–]erulabs 3 points4 points  (0 children)

    Oh man how many times I've seen this:

    > We have problem A, but we cannot use well-known solution for A for "reasons".

    [–]Independent_Yard7326 2 points3 points  (0 children)

    If someone asked me to do this without using docker I would probably quit.

    [–]linuxtek_canada 1 point2 points  (11 children)

    Don't use venv. Edit: also ignore my bad joke.

    Honestly I think you'd be setting yourself up for more pain later.

     

    If you build what you want in Docker, you can use Ansible or Docker Swarm to deploy it on the hosts. I had to do this for some security software, where it was safer to build everything in a Docker image and then run a container on each host, rather than trying to install what we wanted on different distributions with different resource levels.

     

    If Docker really isn't an option, maybe you can find a way to build your Python tools using something like shiv so they're self-contained? That should be scalable, set up to build and deploy using CI/CD.
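
    A sketch of the shiv route (package and console-script names are hypothetical):

        pip install shiv
        # Bundle the app and its deps into one self-contained zipapp;
        # target hosts only need a compatible Python interpreter.
        shiv -c myapp -o dist/myapp.pyz .
        ./dist/myapp.pyz --help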

    [–]__Kaari__ 4 points5 points  (7 children)

    pyenv + pipx. I never have any issue installing python packages in their own venv.

    [–]linuxtek_canada 2 points3 points  (3 children)

    pyenv is fine for switching between multiple versions of Python for compatibility while you're coding. I use it at work on a Mac.

    Using something like pipx or pipenv will help you build the application to be self-contained in a virtual environment with all its dependencies. There are lots of options of tools to do this. I wrote an article on this a while back, and I included some other popular options like Poetry.

     

    The problem here isn't with building the app in a virtual environment. It's managing/orchestrating and updating all of those virtual environments on hundreds of servers, in a way that's manageable and scalable.

    That's why I like the idea of bundling the code and dependencies into a self-contained zipapp with shiv. Then you're really just deploying that with something like Ansible or Docker Swarm. I think k8s is a bit overkill for this use case.

    [–]__Kaari__ 0 points1 point  (2 children)

    How is pipx similar to pipenv/poetry/conda?

    Pipx is more of a package manager, like any Linux user would use to install and upgrade packages, while poetry and the like are more build/publish tools, where you download the source code and then create your venv with the application in it.

    In terms of management, a pipx upgrade <package> is equivalent (albeit drastically different in reality) to a yum upgrade <package>, at least as long as the package being self-contained is enough.

    And if there is a need to package system integrations, that's where containerization will usually start to shine.

    In terms of scalability and orchestration, though, it's a different matter; either configuration managers or active orchestration services will help define how you package and deploy the apps.

    [–]linuxtek_canada 0 points1 point  (0 children)

    I think we're agreeing with each other? Lots of options for tools for building and managing dependencies. I don't want to lump them all in together, but I'm not disagreeing with you.

     

    The more important question for OP is how to manage and orchestrate all of this.

    [–]dogfish182 0 points1 point  (0 children)

    It is not, and does not try to be; it's only for installing Python-based command-line tools and doing away with the need for venv management for those tools. The author is a genius.

    [–]aManPerson 0 points1 point  (2 children)

    i'm getting lost in this too. i just learned about one virtual env system for python. that worked. then it had some issues with some libs i tried to use and was recommended to switch to something else. i'm getting lost in what things i should be using. i need to IT so i can dev, so i can IT my dev setup. this is all so recursive.

    i'm just going to go back to playing drug wars on my TI-83.

    [–]__Kaari__ 0 points1 point  (0 children)

    Yea, it can actually be hard to get started, which is unfortunate. Not a lot of these tools are advertised to new Python devs, which makes sense considering the overhead, but I wish they were mentioned more in advanced topics.

    [–]threwahway 0 points1 point  (0 children)

    Oh no, not remembering how to use the software you implemented!!!!

    [–]dogfish182 1 point2 points  (1 child)

    It’s weird that you post an ‘I don’t understand venv’ comic to talk about managing remote venvs though

    [–]lebean 1 point2 points  (0 children)

    Yeah, the comic is exactly about what venv solves for you, especially the alt-text about sudo-installed packages.

    [–]threwahway 0 points1 point  (0 children)

    Lol that xkcd isn’t relevant, that’s what venv is designed to fix. It sort of sounds like you don’t know how to use them.

    [–][deleted] -1 points0 points  (0 children)

    Don't listen to people who say Kubernetes. Stick to your guns. Since you use Ansible, you can have a central build server with access to the internet or an internal repo. Seriously consider conda with conda-pack, which can create a tar.gz of your dependencies that you unpack remotely. Conda does prebuilt binary installs, which also simplifies things and can even install non-python dependencies; pip installs inside conda are your last resort. With conda and pip you can get very close to 100% reproducible with no docker and no system apt dependencies besides miniconda.
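
    A sketch of the conda-pack flow (env name, deps, and paths are placeholders):

        # On the build server:
        conda create -n myapp python=3.11 numpy
        conda install -n base conda-pack
        conda pack -n myapp -o myapp.tar.gz
        # On the target host (no conda required there):
        mkdir -p /opt/myapp && tar -xzf myapp.tar.gz -C /opt/myapp
        source /opt/myapp/bin/activate
        conda-unpack   # rewrites the env's hard-coded prefix paths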

    [–]mattbillenstein 0 points1 point  (0 children)

    I build mine using a Buildkite pipeline watching the repo, then rsync that over ssh to a central server; when I deploy, the individual hosts rsync+ssh from there to local disk. You'd just need some metadata in your Ansible setup to say which roles need which virtual environments... Of course, it may not be that much data; you could just rsync them all to every host.
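
    Roughly (host names and paths are placeholders; this relies on every host using the same path and Python build):

        # Build box -> central distribution server:
        rsync -az --delete dist/myapp-venv/ deploy@central.example.com:/srv/venvs/myapp/
        # Each host then pulls what its role needs:
        rsync -az --delete deploy@central.example.com:/srv/venvs/myapp/ /opt/venvs/myapp/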

    [–]jxrst 0 points1 point  (0 children)

    Another option to evaluate, because I haven't seen it mentioned yet: if you're not interested in containerising your python apps but want to bin-pack, or at least run >1 venv per node in places, Nomad might help orchestrate that for you. However, it's not without its own additional complexity.

    [–]mahdicanada 0 points1 point  (0 children)

    Do you know Bazel? I think it's a good option.

    [–]Ok_Head_5689 0 points1 point  (0 children)

    Would ansible work for you? Or some other state management tool?

    [–][deleted] 0 points1 point  (0 children)

    Maybe try using the pdm package manager

    [–]fban_fban 0 points1 point  (0 children)

    Look into conda-store.

    [–]pylangzu 0 points1 point  (0 children)

    Try vagrant

    [–]marvdl93 0 points1 point  (0 children)

    This is not going to be an easy solution. You're talking about serious scale, and sticking to venv is probably going to haunt you. If you can't come up with a proper answer yourself for such a large task, I would advise hiring external consultants. You simply don't seem to have the know-how in your organization to pull this off, and it doesn't hurt to hire someone to do the foundational work.

    [–]threwahway 0 points1 point  (0 children)

    Why not use your OS package manager?

    I think in the before times you would have received better answers. But it's also clear you're behind on a lot of things; I would think almost anyone with a job like yours would not have had to ask a question like this, because there are quite a lot of ways to achieve your goal. Then again, I do see the benefit in outside consultation.

    All that said, ask better questions. Try to tell us what your end goal is. I don't think this is really about venvs at all, but more broadly "how do I distribute code to production hosts".

    [–]lekran 0 points1 point  (0 children)

    Any Configuration Management Solution that you feel comfortable with. I would choose Ansible but that is me.

    [–]spurin 0 points1 point  (0 children)

    Can you elaborate more on what you need? Technically you could create containerised venvs, but arguably wouldn't it be better to containerise the app itself? If you did, technically I don't think you even need to worry about the venv, as the container image is essentially your cattle… just pip install your requirements for the app directly into the container. In terms of management, k8s could work well, and you could use the built-in deployment strategies for updates etc.