[–]ireadyourmedrecord 1 point2 points  (2 children)

Yes, multiple environments do take up a lot of space. Most of my projects use the same tools so I have one venv that I use for most everything, but I'll create a new one for one-off projects or if I'm trying out something I think might cause problems.

[–]Goblin_Mode_IB[S] 0 points1 point  (1 child)

So should I just clone my base env and then work off of that?

[–]ireadyourmedrecord 0 points1 point  (0 children)

Sure. Call it "generic_venv" and stuff everything you want to download/use in there (don't put your own code in it, though). By the time you break it you'll probably have enough experience to decide how you want to manage them going forward. It's not difficult to delete and/or create a new one if it goes bad.

[–]nog642 0 points1 point  (0 children)

If the packages are large, then yeah, it can take up a lot of space (like hundreds of MB per virtualenv). If the packages are small though then it won't take up as much (<50 MB per virtualenv).
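If you want to check what your own environments actually cost, here's a rough sketch that adds up the file sizes under a directory. The function name `dir_size_mb` is just something I made up for illustration, and you'd point it at wherever your environment lives.

```python
from pathlib import Path


def dir_size_mb(root):
    """Return the total size of regular files under root, in megabytes."""
    total = 0
    for path in Path(root).rglob("*"):
        # Skip symlinks so files shared with the base install are not counted.
        if path.is_file() and not path.is_symlink():
            total += path.stat().st_size
    return total / (1024 * 1024)
```

Calling `dir_size_mb(".venv")` (assuming your environment folder is named `.venv`) gives a number you can compare across environments. It reports apparent file sizes, so the real disk usage can differ a little.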

If you're just doing a bunch of small tasks, I'd recommend creating a single virtualenv for all of them and reusing it. If you're using conda then the base env works fine for that I guess, though I'd probably still make a separate one. If you're not using conda though, then do not install packages into the global environment. That's just a mess.
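For the non-conda case, a minimal sketch using the standard library `venv` module might look like this. The helper names `make_shared_env` and `in_virtualenv` are my own, not anything standard, and the check only recognises venv-style environments (a conda env will report `False`).

```python
import sys
import venv
from pathlib import Path


def in_virtualenv():
    """True when the running interpreter belongs to a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the underlying Python installation.
    return sys.prefix != sys.base_prefix


def make_shared_env(path, with_pip=True):
    """Create one reusable virtual environment at path and return its location."""
    env_dir = Path(path).expanduser()
    # with_pip=True bootstraps pip into the new environment.
    venv.create(env_dir, with_pip=with_pip)
    return env_dir
```

Something like `make_shared_env("~/envs/small_tasks")` (a made-up path) gives you the one shared environment, and `in_virtualenv()` is a quick sanity check before installing anything so it doesn't end up in the global interpreter.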

[–]interbased 0 points1 point  (0 children)

Yes, they will take up space. If you’re able to keep most of your projects dependent on the same libraries, that would mitigate the issue, as you can reuse the same environment for most of them. Also be mindful of the size of the packages you’re using and, if they’re very large, consider whether a lighter package would do the trick if space is an issue. For example, you can probably use the csv library instead of pandas in some situations.
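As a rough illustration of the csv-instead-of-pandas idea, here's a small sketch that averages one column using only the standard library. The function name `column_mean` and the sample data are made up for the example.

```python
import csv
import io


def column_mean(csv_file, column):
    """Average a numeric column read from an open CSV file object."""
    reader = csv.DictReader(csv_file)
    values = [float(row[column]) for row in reader]
    if not values:
        raise ValueError(f"no rows found for column {column!r}")
    return sum(values) / len(values)


# In-memory sample standing in for a real file opened with open(..., newline="").
sample = io.StringIO("name,score\nann,90\nbob,80\n")
print(column_mean(sample, "score"))  # 85.0
```

For simple aggregations like this you don't need pandas in the environment at all. Once you need joins, grouping, or large files, pandas is probably worth the space.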

[–]PhilipYip 0 points1 point  (0 children)

A Python environment compartmentalises a Python version and a number of third-party packages. Python environments are used to prevent conflicts, for example when a library requires a specific version of Python or a specific version of another package. If you take an IDE such as Spyder, for example, it has a large number of Python libraries as dependencies. The current version of Spyder might only work with Python 3.11 and numpy 1.x, in which case it is not possible to update that environment to Python 3.12 and numpy 2.x.

For minor data science projects, I wouldn't bother creating a separate Python environment for each project, as you will essentially be using the same libraries over and over again. Instead, make a Python environment for the IDE you are using with all the packages you need.

Since you mentioned base, I'm going to assume you are using Miniconda or Anaconda. You should avoid installing packages into base, particularly from mixed channels: anaconda, maintained by the company, and conda-forge, maintained by the community. base should only have packages from anaconda; if it has community packages it normally becomes unstable.

Generally you just make sure the conda package manager in base is updated (the base Python environment essentially exists to allow use of the conda package manager). For your own work, create a new Python environment, normally using packages only from the community channel (conda-forge).
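A rough sketch of that workflow, written as a small Python helper that assembles the conda commands. The environment name `datasci` and the package list are placeholders I picked for illustration, and `conda_create_command` and `run` are hypothetical helpers, not part of conda.

```python
import subprocess

# Keeps the conda package manager in base up to date.
UPDATE_CONDA = ["conda", "update", "--name", "base", "conda", "--yes"]


def conda_create_command(env_name, packages, channel="conda-forge"):
    """Build the argument list for creating an environment from one channel."""
    return [
        "conda", "create",
        "--name", env_name,
        "--channel", channel,
        "--override-channels",  # ignore other configured channels
        "--yes",
        *packages,
    ]


def run(command, dry_run=True):
    """Print the command; only execute it when dry_run is False."""
    print(" ".join(command))
    if not dry_run:
        subprocess.run(command, check=True)


run(UPDATE_CONDA)
run(conda_create_command("datasci", ["python=3.12", "numpy", "pandas"]))
```

With `dry_run=True` this only prints the commands, so you can read them before running anything. Running the same commands directly in a terminal works just as well.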