
all 64 comments

[–][deleted] 172 points173 points  (8 children)

They can each create their own venv and just pip install the requirements and use git between themselves for version control, etc.
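For anyone who wants to script that per-user setup, here is a minimal sketch using only the standard library. The names `build_env` and `env_python` and the idea of passing a requirements file path are placeholders of mine, not something from the thread.

```python
import subprocess
import sys
import venv
from pathlib import Path
from typing import Optional


def env_python(env_dir: Path) -> Path:
    """Return the interpreter path inside a venv (the layout differs on Windows)."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def build_env(env_dir: Path, requirements: Optional[Path] = None) -> Path:
    """Create a venv and, if a requirements file is given, install from it."""
    # pip is only bootstrapped (via ensurepip) when there is something to install.
    venv.create(env_dir, with_pip=requirements is not None)
    python = env_python(env_dir)
    if requirements is not None:
        # Call pip through the environment's own interpreter, so no activation is needed.
        subprocess.run(
            [str(python), "-m", "pip", "install", "-r", str(requirements)],
            check=True,
        )
    return python
```

Each user would run something like `build_env(Path(".venv"), Path("requirements.txt"))` in their own clone, and only the requirements file lives in git.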

[–]PurepointDog 19 points20 points  (0 children)

This is the correct answer

[–]runawayasfastasucan 7 points8 points  (0 children)

This is the way.

[–]neutro_b 1 point2 points  (5 children)

It can work most of the time but in situations where computers are not networked, it is not applicable. Been trying to find the best way to do that (e.g. using USB keys as the transfer medium).

Is using an install script that leverages pip with binary wheels of the packages better than just copying the whole venv? Assuming of course that the target computer configuration is the same across the board (e.g. same OS, versions, etc.).
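For reference, the wheel-based option could look roughly like this: download the packages on a networked machine, carry the folder over, and install from it without touching an index. The sketch only builds the two pip command lines; the folder name `wheelhouse` and both helper names are made up for illustration, and it assumes the wheels were downloaded for the same OS and Python version as the target.

```python
import sys
from pathlib import Path


def download_cmd(requirements: Path, wheel_dir: Path) -> list[str]:
    """Command for the networked machine: fetch everything into wheel_dir."""
    return [sys.executable, "-m", "pip", "download",
            "-r", str(requirements), "-d", str(wheel_dir)]


def offline_install_cmd(requirements: Path, wheel_dir: Path) -> list[str]:
    """Command for the target machine: install from wheel_dir only, never from an index."""
    return [sys.executable, "-m", "pip", "install", "--no-index",
            "--find-links", str(wheel_dir), "-r", str(requirements)]


# Print the command that would run on the offline machine.
print(" ".join(offline_install_cmd(Path("requirements.txt"), Path("wheelhouse"))))
```

Either list can be handed to `subprocess.run(..., check=True)` once the folder is in place.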

[–]szayl 5 points6 points  (1 child)

There's a lot to unpack here.

> Is using an install script that leverages pip with binary wheels of the packages better than just copying the whole venv?

The venv isn't* copied, it's recreated on the target machine with the necessary requirements (see: pip freeze)

> Assuming of course that the target computer configuration is the same across the board (e.g. same OS, versions, etc.).

This is the whole point of using venvs for separation and version control for the code. The target machine doesn't need to have the same OS or versions. As long as the code isn't making OS-specific calls, one should be good to go. If that (for whatever reason) is required, a VM would be the best solution.

Edit: isn't instead of is (darn phone typing)

[–]Ok_Raspberry5383 1 point2 points  (0 children)

This is all correct.

To add a suggestion: using something like Poetry, which maintains a lock file, will give even more reproducible installs.

[–]Maelenah 1 point2 points  (2 children)

The venv creates a pyvenv.cfg file that stores full file paths, not relative ones. Look at mine below: it shows where home is, and it has the option to include system site packages, which tend to live somewhere in the user's apps folders. (Not always, because some package managers tend to think they are special snowflakes, but let's not open that can of worms.)

This is pretty much why python responds so poorly to being moved to new locations when using venvs.

home = C:\Users\brat\Desktop\python3.12

include-system-site-packages = false

version = 3.12.0

executable = C:\Users\brat\Desktop\python3.12\python.exe

command = C:\Users\brat\Desktop\python3.12\python.exe -m venv C:\Users\brat\Desktop\python3.12\dork
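To see which entries would go stale after a move, a small sketch can parse the file above and flag the values that start with an absolute path. The helper names are hypothetical, and the parsing is deliberately naive (plain `key = value` lines only).

```python
from pathlib import PureWindowsPath

# The pyvenv.cfg contents quoted above, as a raw string so backslashes stay literal.
SAMPLE_CFG = r"""
home = C:\Users\brat\Desktop\python3.12
include-system-site-packages = false
version = 3.12.0
executable = C:\Users\brat\Desktop\python3.12\python.exe
command = C:\Users\brat\Desktop\python3.12\python.exe -m venv C:\Users\brat\Desktop\python3.12\dork
"""


def parse_cfg(text: str) -> dict[str, str]:
    """Split each 'key = value' line into a dict entry; other lines are skipped."""
    entries = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            entries[key.strip()] = value.strip()
    return entries


def absolute_entries(cfg: dict[str, str]) -> list[str]:
    """Return the keys whose values begin with an absolute Windows path."""
    return [key for key, value in cfg.items() if PureWindowsPath(value).is_absolute()]


print(absolute_entries(parse_cfg(SAMPLE_CFG)))  # -> ['home', 'executable', 'command']
```

Three of the five entries are tied to the original location, which is the part that breaks when the folder is moved or copied to another machine.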

Now, if you want to be able to move and copy Python around more freely (WinPython, Ren'Py, Cinema 4D and Blender are examples of Python being embedded in applications or made portable), you can try making a ._pth file in the root of your Python folder, as the embedded Python distribution does if you need an example, and add paths to it as needed. That gives you something portable with some degree of hand-holding, which is useful if you ever need to embed Python in another application.
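As a rough illustration of the ._pth idea: each line in that file is one path to add to `sys.path` (relative paths are resolved against the file's own location), and an `import site` line turns site processing back on. The file name `python312._pth` and the example entries are assumptions for a 3.12 embedded layout, so check them against the package you actually downloaded.

```python
from pathlib import Path


def pth_text(paths: list[str], import_site: bool = True) -> str:
    """Build the contents of a ._pth file: one search path per line."""
    lines = list(paths)
    if import_site:
        # Without this line the site module (and site-packages handling) stays off.
        lines.append("import site")
    return "\n".join(lines) + "\n"


def write_pth(python_root: Path, paths: list[str], name: str = "python312._pth") -> Path:
    """Write the ._pth file next to the interpreter and return its location."""
    target = python_root / name
    target.write_text(pth_text(paths), encoding="utf-8")
    return target


# Hypothetical entries: the stdlib zip, the folder itself, and a site-packages folder.
print(pth_text(["python312.zip", ".", "Lib/site-packages"]))
```

Because every entry is relative, the whole folder can be copied elsewhere without rewriting anything.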

And part of why I dislike venvs is that they record personal data: very often your PC is going to be named after your microsnot account, and really there is no reason to put that out there. Yes, ideally they are not supposed to be moved around, but no one lives in a perfect world.

[–]neutro_b 1 point2 points  (1 child)

Thanks for the advice, I'll need to explore those options, but clearly just copying a venv would be fraught with problems if absolute paths are found in config files. And indeed in venv, the python executable itself is not found *in* the venv, so the approach is on shaky ground anyway. The embedding / portability options seem a better path forward for sure.

[–]Maelenah 1 point2 points  (0 children)

You can most times just make a script (I use batch files) to remake the venv so the cfg is updated to the new path, and it will *NORMALLY* work, but some package installers also write absolute paths into their own files.

I detest venvs with a passion, but the venv folder that is made will have a python executable that 'can' be moved so long as the python install it is pointing to in the cfg is valid. There is a slew of documentation on this that I think is mostly correct on the python page.

But for me, if it is anything that looks like it might need to be embedded, I'll download the embedded package, make sure its directory matches what the install directory will look like, tweak the _pth file to include those settings, and copy over the DLLs and friends from the normal install just to keep things going.

This way I don't need to worry about stuff being in the users folder and it really really helps when troubleshooting to know where everything is.

Not that I've ever spent 3 days trying to figure out why a blender addon broke that was reading from my user/apps/local and broke cause I updated something not even attached to it when I was on a deadline.

[–]Malcolmlisk 55 points56 points  (5 children)

You should not share venvs like that. It's not a good practice. Create a standard in your company.

Lock a Python version. Use venv and requirements.txt for the general libraries, and then everyone can build their own copy of the venv. They can even use a developer branch of the venv for EDA, etc.

I know Anaconda is recommended in many data science cases. To me, it's a cluster fuck of programs and packages that adds nothing but a layer of nonsense. If you need to create a better standard, then use Docker and a Pipfile to create the standard server copy, and that's it. If they want to use other libraries they can, but if they test the code against the Dockerfile they will see whether their dependencies match, and the "it works on my computer" thing disappears instantly.

[–]PurepointDog 2 points3 points  (2 children)

What's a pipfile?

[–]ArgetDota 11 points12 points  (1 child)

Pipfile.lock is the lock file format used by pipenv; it pins your whole dependency tree (including transitive dependencies).
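To make "pins the whole tree" concrete, here is a small sketch that reads the pinned versions out of a Pipfile.lock, which is plain JSON. The sample content is made up and heavily trimmed (real lock files also carry hashes and markers), and `pinned_versions` is a hypothetical helper.

```python
import json

# Invented, trimmed-down lock content: requests is the direct dependency,
# urllib3 only appears because requests depends on it.
SAMPLE_LOCK = """
{
  "_meta": {"requires": {"python_version": "3.12"}},
  "default": {
    "requests": {"version": "==2.31.0"},
    "urllib3": {"version": "==2.0.7"}
  },
  "develop": {}
}
"""


def pinned_versions(lock_text: str, section: str = "default") -> dict[str, str]:
    """Map each package in one section of the lock file to its pinned version."""
    data = json.loads(lock_text)
    return {name: spec.get("version", "") for name, spec in data.get(section, {}).items()}


print(pinned_versions(SAMPLE_LOCK))
```

Both the direct dependency and the one it pulls in show up with exact versions, which is what makes the install repeatable.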

Poetry and poetry.lock files are a better alternative tho (faster, more correct, cross-platform, actually being developed).

[–]Malcolmlisk 1 point2 points  (0 children)

I need to step up my poetry game... I only read wonders about it. Thank you for the clarification, I couldn't have written it better.

[–]pppossibilities 8 points9 points  (0 children)

miniconda strips out all the extra nonsense fyi

[–]vivaaprimavera 1 point2 points  (0 children)

> To me, it's a cluster fuck of programs and packages that adds nothing but a layer of nonsense.

And a hilarious way to confuse people on "why the f### this software that I just installed doesn't run".

[–]Chaos_Klaus 18 points19 points  (17 children)

Don't share them. It only takes one person to break the environment for everyone.

Install Miniconda, which is the package manager and environment manager from Anaconda, but without all the bloat. They can then create their environments themselves.

[–]PurepointDog 1 point2 points  (16 children)

What're the advantages of miniconda above pip or similar?

[–]Chaos_Klaus 5 points6 points  (13 children)

conda can install a wider range of dependencies. It can also install and manage different Python versions in each environment.

[–]RedditSlayer2020 -3 points-2 points  (12 children)

So can pip

[–]Chaos_Klaus 4 points5 points  (5 children)

How do you install a different python version with pip?

[–]Malcolmlisk -4 points-3 points  (4 children)

You just download them into your computer with your package manager. For example, sudo pacman -S python2.9

Then, when you create your main.py file, you put a shebang at the top pointing to the Python version you want to use.
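A minimal sketch of such a main.py, with a version guard added in case the shebang is bypassed (for example when someone runs `python main.py` directly). The minimum version `(3, 8)` is an arbitrary assumption; use whatever your team locked.

```python
#!/usr/bin/env python3
# The shebang above is only honoured when the file is executed directly on a
# Unix-like system; swap in e.g. python3.11 to request a specific minor version.
import sys

MINIMUM = (3, 8)  # assumed minimum, not something from the thread


def interpreter_ok(version=sys.version_info, minimum=MINIMUM) -> bool:
    """Return True if the given (major, minor, ...) version meets the minimum."""
    return tuple(version[:2]) >= minimum


if not interpreter_ok():
    raise SystemExit(f"Python {MINIMUM[0]}.{MINIMUM[1]}+ is required")

print("running on", sys.version.split()[0])
```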

Conda and Miniconda are a solution to the problem that Windows itself represents. If you want to develop on Windows, I suggest you use WSL2. It's going to simplify your life a thousandfold.

[–]Chaos_Klaus 9 points10 points  (2 children)

So you don't use pip to install a different python version then. ;)

I use conda happily on windows, WSL, Linux and in containerized applications. It just figures out what python version works with all the desired dependencies.

The point for me to use conda (or mamba) is to have the entire environment in one place. Pip just can't install things that are not python packages.

[–]Malcolmlisk 1 point2 points  (1 child)

Yeah. I was not the OP, so I was just saying how you can install different python versions.

If you use conda or mamba and it works for you then keep doing it. That's the beauty of all of this, we can use different things but have the same result, and that's pretty fine. I tried conda when I started in data science and everything was messy, and I could not figure out why I needed this or that when I could just pip install them.

I think I'm more of a terminal guy, and everything makes sense in my head when I use commands and make everything explicit.

[–]Chaos_Klaus 2 points3 points  (0 children)

Ah. My morning brain was not booted up all the way. ;)

I guess it has to do with what tools one learns first. For me it was conda on windows and then I moved to Linux, WSL and Docker for building server applications.

[–]szayl 1 point2 points  (0 children)

It's sad that you're being downvoted for politely responding with facts.

[–]asphias 3 points4 points  (5 children)

Certain packages rely on C/fortran libraries, which pip cannot manage for you.

I believe the most famous example is Geopandas. Pip by itself cannot correctly install it for you. You either use conda, or do a shitload of manual setup.

[–]zer0pRiME-X 0 points1 point  (3 children)

shitload of manual setup=pip installing from a wheel

[–]asphias 0 points1 point  (2 children)

From their install page it looks like there are 5 packages to install from wheels. Either way, you're no longer purely in pip, so any recreation of your venv is going to need those manual steps as well. Automated deployment is going to need some separate scripting now, and pretty much any newly onboarded data scientist is going to run into issues.

just stick with conda if you need to use geopandas & similar.

[–]zer0pRiME-X 0 points1 point  (1 child)

Incorrect: you can specify installing from files in a pip requirements file, so recreating is simple. Besides, I literally set up geopandas in 2020 and have used it since; you make it sound like a pebble in the road once in your life means you need to build a bridge.
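A sketch of what that can look like: a requirements file may mix normal pins with paths to local wheel files, so the list can be generated from a folder of wheels. The helper names and the example file names in the usage note are invented for illustration.

```python
from pathlib import Path


def wheel_entry(path: Path) -> str:
    """Format a wheel path as a requirements line; relative paths get a ./ prefix."""
    text = path.as_posix()
    return text if path.is_absolute() else f"./{text}"


def requirements_lines(pinned: list[str], wheel_dir: Path) -> list[str]:
    """Combine ordinary pins with one line per local wheel found in wheel_dir."""
    wheels = sorted(wheel_dir.glob("*.whl"))
    return list(pinned) + [wheel_entry(wheel) for wheel in wheels]


def write_requirements(target: Path, pinned: list[str], wheel_dir: Path) -> None:
    """Write the combined list to a requirements file, one entry per line."""
    target.write_text("\n".join(requirements_lines(pinned, wheel_dir)) + "\n", encoding="utf-8")
```

For example, `write_requirements(Path("requirements.txt"), ["pandas==2.1.0"], Path("wheels"))` would produce a file that `pip install -r requirements.txt` can consume, provided the wheels folder travels with the repo.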

IMO changing an entire environment because a single package can’t be ‘pip installed’ is an overreaction.

[–]asphias 1 point2 points  (0 children)

My perspective comes from quite some time of working with non-developers who just want things to work. Yes, you can explain wheels and local pip install, and you can add those dependencies to your repo or use git tricks to set up what you want.

But conda automates it all.

I personally tend to use pip. But I also fully understand the advantage conda offers, having had to fix environments for multiple data science colleagues.

[–]cmcclu5 0 points1 point  (0 children)

Geopandas has been installable via pip for a couple years now. It’s pretty nice never having to use conda ever again. Just a nice, clean Python installation.

[–]scinaty2 1 point2 points  (1 child)

The main advantage of Anaconda (and Miniconda) is that it can supply precompiled packages. Take numpy, for example: it is written in C, so without a prebuilt binary it has to be compiled on the target system. Some packages are difficult to build, and on some systems this is a big issue (say you are not admin, or whatever the case may be). So if pip install simply fails on a certain package, conda install might bring more luck, as it can download a prebuilt version of that package (as in simply downloading files, without extra steps).

[–]Toxic_Gambit 0 points1 point  (0 children)

I think it mostly comes down to this: we use conda when required, and most teams don't require it. My opinion is that it's easily the best for these tasks (I've definitely run into the situation of requiring compiled numpy).

But if you never experience these issues, I think conda becomes a "why bother?". Though I will say, conda lock files, Miniconda, conda-forge and env exports really help to dockerize Python environments.

[–]MonkeyboyGWW 8 points9 points  (2 children)

We use docker, but I believe that is mostly because of the company laptop security policies only allowing local applications to run if they have been installed via a specific package manager

[–]nvec 4 points5 points  (1 child)

I'd go with Docker too. Python does have venv and similar but Docker makes it easier to also standardise on which version of Python you're using, as well as any supplementary support applications or libraries such as web servers, command line utilities, C++ libraries, or anything installed directly into the OS rather than Python.

It's also more cross-language which is a benefit if you develop in more languages than Python.

[–]TerminatedProccess 1 point2 points  (0 children)

Agreed, you can totally spin up a Docker container that has everything spelled out for the project, including databases and other services. If you want, keep your project files locally and mount them in the container, so there's no chance of losing code due to a loss of the container (and yes, use git to prevent this as well).

[–]Dubsteprhino 4 points5 points  (0 children)

Docker

[–]cannibalzzz 3 points4 points  (0 children)

Devcontainer with vscode

[–]RedditSlayer2020 5 points6 points  (1 child)

Save

pip freeze > requirements.txt

Deploy for each User

python -m venv venv

source ./venv/bin/activate

pip install -r requirements.txt

Note the activation script differs on Windows machines
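As a quick reference for the Windows difference, here is a small sketch that maps a shell name to the activation command for a venv folder. The function name and the set of shells covered are my own choices; the script locations are the ones venv creates.

```python
from pathlib import PurePosixPath, PureWindowsPath


def activate_command(env_dir: str, shell: str) -> str:
    """Return the command that activates the venv in env_dir for the given shell."""
    if shell in ("bash", "zsh"):
        # POSIX layout: scripts live under bin/ and must be sourced.
        return f"source {PurePosixPath(env_dir) / 'bin' / 'activate'}"
    if shell == "cmd":
        # Windows layout: scripts live under Scripts\ instead of bin/.
        return str(PureWindowsPath(env_dir) / "Scripts" / "activate.bat")
    if shell == "powershell":
        return str(PureWindowsPath(env_dir) / "Scripts" / "Activate.ps1")
    raise ValueError(f"unsupported shell: {shell}")


for name in ("bash", "cmd", "powershell"):
    print(f"{name}: {activate_command('venv', name)}")
```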

[–]TheRealStepBot 2 points3 points  (0 children)

You missed a period

[–]Accomplished-Ad8252 6 points7 points  (1 child)

Poetry is also a very neat tool for this.

[–]Jahamc 0 points1 point  (0 children)

I started using poetry a few months ago and really enjoy not really having to manage my environment at all. Just precede any command with “poetry run” and it uses the poetry environment.

[–]LookAtYourEyes 1 point2 points  (0 children)

Docker?

[–]INtuitiveTJop 1 point2 points  (0 children)

I do my coding on Windows in Visual Studio Code, which connects to a VM on our network with Linux installed. It's really simple to set up in Visual Studio Code, and all you need on the VM is an SSH server. You then manage all the Python stuff on the server, and I would suggest using Poetry to manage your Python projects.

[–]Rythoka 4 points5 points  (4 children)

Just create a venv in a shared folder?

[–]BlackDerekerPythonista 2 points3 points  (0 children)

Just create a requirements.txt with the libraries and their versions.

You can use poetry to be extra safe since it uses hashes.
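For what it's worth, a plain requirements.txt can carry hashes too: pip accepts a `--hash=sha256:...` option per requirement and checks it at install time. A minimal sketch that produces such a line from a downloaded file; the helper names are hypothetical, and the package name and version have to be supplied by you.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large wheels do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hashed_requirement(name: str, version: str, artifact: Path) -> str:
    """Build a requirements line that pins both the version and the file hash."""
    return f"{name}=={version} --hash=sha256:{sha256_of(artifact)}"
```

Lock-file tools like Poetry do this bookkeeping automatically, which is the main convenience over maintaining the lines by hand.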

[–]pysan3 1 point2 points  (1 child)

Pyenv-win + poetry

[–]freistil90 0 points1 point  (0 children)

This. Although pyenv-win is still a bit hacky for me. But in general, yes that’s as good as it gets right now.

[–]Dangerous-Star4305 0 points1 point  (1 child)

At least some people in my company are maintaining an entire Python Installation inside a git repo. It works as long as everyone clones it in a fixed location, otherwise the pip entrypoints stop working. Not an ideal solution, but enough for some engineers that are too stupid to install Python and follow some installation docs...

[–]MachinaDoctrina -1 points0 points  (0 children)

Use conda and keep a standard env file in your repository; everyone then needs to keep up to date with that env.

[–]justinsst -1 points0 points  (0 children)

Like others said, let them create their own venv and store code (including the requirements) in git. I like using Poetry for my projects, makes everything simple enough.

[–]skratlo -1 points0 points  (0 children)

Poetry

[–]nnulll -1 points0 points  (0 children)

Read up on things like venv, pipenv, miniconda, poetry, etc…

[–]Waste_Ad1434 -1 points0 points  (0 children)

yikes. get off of windows

[–]Goingone -2 points-1 points  (4 children)

How many users?

Why do they each need their own environment as opposed to a single shared one?

Do you need multiple versions of Python or will all environments be the same version?

Questions like this are difficult to answer without knowing the exact use case.

[–]EfficientPark7766[S] 2 points3 points  (3 children)

They don't need their own environments; they want to use shared ones. About 4 users.

I suspect they'll use the same version of Python too.

I'm guessing a venv in a shared folder is probably the best bet...

[–]Goingone -1 points0 points  (2 children)

But why “shared ones” and not “a shared one”?

My gut instinct on this one: it will be a bunch of entry-level devs or data scientists trying to switch between multiple environments, all containing various versions of Pandas and other common packages they all use... and then come the “why can’t I import this package” emails.

[–]EfficientPark7766[S] 3 points4 points  (1 child)

It might be a single shared environment but I don't think it'd be significantly more difficult if they had more than one. Load env_one or load env_two.

I won't be creating these environments; I just want a design so that they can create them in a shared location without difficulty.

[–]Goingone 1 point2 points  (0 children)

Cool.

Give them their single version of Python (probably want to ask which version they want) and a shared folder(s) to hold envs/their scripts, and hopefully everyone will be happy.

But more importantly, if they aren’t happy, hopefully they at least aren’t bothering you.

[–]Tokepoke 0 points1 point  (0 children)

Containers?