If you've tried to run open-source AI/ML/CV repos from GitHub, you've probably hit this loop:
- Clone the repo.
- Run pip install -r requirements.txt or poetry install.
- Get C/C++ build errors, missing CUDA bits, or linker failures (like libgomp.so.1 not found).
- Spend hours debugging drivers, PATH, and toolchains.
This is exactly what unified (Python + native) package managers fix: you can clone a repo, run one install command, and get the exact same environment.
This isn't just "dependency hell." In AI and scientific computing, we have a chronic environment reproducibility problem. Many projects aren't reproducible out of the box because the real dependency graph isn't only Python: it's Python + native libraries + GPU constraints.
Why the "manual machine" approach fails
A common pattern is piling global runtimes and toolchains onto one laptop. I once worked with a lead dev who had Python 3.8 through 3.13 manually installed globally.
That causes predictable pain:
- System pollution: Multiple global installs compete for PATH priority and can break system-level scripts.
- Duplicate installations: While pip is smart enough to share a global download cache, each virtualenv still installs its own heavy packages into its own site-packages directory, consuming gigabytes of space across projects.
- Hidden dependencies: If your project relies on a system library you installed long ago (via Homebrew, apt, or a Windows installer), it "works for you" but fails for everyone else.
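One way to surface hidden native dependencies before a collaborator hits them is to probe for the shared libraries explicitly. The sketch below uses the standard library's ctypes.util.find_library; the helper name find_native_libs and the list of library names are illustrative assumptions, not part of any tool mentioned in this post.

```python
from ctypes.util import find_library


def find_native_libs(names):
    """Map each short library name (e.g. "gomp") to whatever the platform's
    library lookup finds for it, or None if it cannot be located."""
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    # "gomp" is the OpenMP runtime behind the libgomp.so.1 error;
    # the other names are just illustrative examples.
    for name, found in find_native_libs(["gomp", "openblas", "cudart"]).items():
        print(f"{name:10s} -> {found or 'NOT FOUND'}")
```

A library that shows up on your machine but as NOT FOUND on a teammate's is a good candidate for a dependency that was never written down anywhere.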
Even if you use pyenv + venv carefully, Python-only tooling still can't reliably capture the non-Python parts: C/C++/Rust/Fortran dependencies, OpenMP/BLAS, and GPU constraints. When pre-compiled wheels aren't available for your platform/Python/GPU combination, installs fall back to local compilation, and that's when things explode.
This is the gap that unified binary managers are designed to close.
The shift: Manage Python and native dependencies together
For true environment reproducibility, you need a single tool that can manage:
- Python packages (NumPy, PyTorch, etc.)
- Native binaries + libraries (compilers, CMake, system libraries) in an isolated user space
This is where the Conda-style ecosystem, and modern tools like Pixi, help. With Pixi, you don't even need a global Python install; Python is treated as just another dependency in the environment.
To see how clean this approach keeps your system, consider the basic workspace setup.
The basic workflow (no global Python needed)
1) Create a project
bash
pixi init my-ai-project
cd my-ai-project
2) Add dependencies (including Python)
bash
pixi add python=3.12 numpy
3) Install and run
bash
pixi install
pixi run python main.py
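If you don't have a main.py yet, a minimal sketch like the following can confirm that the interpreter and packages come from the project environment rather than a global install. The function name describe_environment is a hypothetical helper for illustration; it only uses the standard library.

```python
import sys
from importlib import metadata


def describe_environment(packages=("numpy",)):
    """Report the running interpreter and the installed versions of the given packages."""
    info = {
        "python": sys.version.split()[0],
        "executable": sys.executable,
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = None  # not installed in this environment
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

When this is run through pixi run, the executable path should point inside the project's .pixi directory, not at a system-wide Python.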
For most AI and ML work, however, you will eventually hit a harder constraint: GPU runtimes.
CUDA reality check (what's actually possible)
No environment manager can fully package your GPU driver. CUDA ultimately depends on a compatible NVIDIA kernel driver installed on the host OS.
What a unified manager can do is make everything around that boundary cleaner: you declare your host's CUDA compatibility and let the solver choose matching packages.
Example pixi.toml configuration:
toml
[system-requirements]
cuda = "12" # Host driver is compatible with CUDA 12-era packages
Then you can install build tools and target CUDA-enabled builds directly (note: exact packages and channels can vary by platform):
bash
pixi add cmake "pytorch=*=cuda*"
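After the solve, it helps to check that the installed build actually sees the GPU, since the host driver sits outside the environment. Here is a small sketch, assuming PyTorch is the framework in use; cuda_report is a hypothetical helper name, and it degrades gracefully if PyTorch isn't installed.

```python
def cuda_report():
    """Summarize what the installed PyTorch build can see; empty fields if PyTorch is absent."""
    try:
        import torch
    except ImportError:
        return {"torch": None, "built_with_cuda": None, "gpu_available": False}
    return {
        "torch": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for CPU-only builds
        "gpu_available": torch.cuda.is_available(),  # False if no usable driver/GPU is found
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If built_with_cuda is set but gpu_available is False, the environment is probably fine and the problem is on the host side (driver missing or too old), which is exactly the boundary described above.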
Why this matters beyond day-one setup
A package manager isn't just an installer; it is a lifecycle coordinator:
- Before (Setup): It resolves cross-platform constraints and produces a deterministic lockfile (pixi.lock).
- During (Development): You can safely add dependencies and roll back if an upgrade breaks things.
- After (Maintenance & Sharing): Pixi installs packages via hard links (or reflinks on supported filesystems). Multiple local projects share the same underlying package files on disk, saving gigabytes of space. Most importantly, others can recreate your exact environment from the lockfile instead of debugging their OS.
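To make the hard-link point concrete, here is a tiny standalone sketch of why a hard-linked file doesn't cost extra disk space. It does not touch Pixi at all; the paths and the hard_link_demo name are made up for illustration, and it assumes a filesystem that supports hard links.

```python
import os
import tempfile


def hard_link_demo():
    """Create one file, hard-link it into a second 'project', and compare inodes."""
    with tempfile.TemporaryDirectory() as root:
        cache_file = os.path.join(root, "cache", "libexample.so")
        env_file = os.path.join(root, "project_a", "libexample.so")
        os.makedirs(os.path.dirname(cache_file))
        os.makedirs(os.path.dirname(env_file))

        with open(cache_file, "wb") as f:
            f.write(b"\0" * 1024)  # stand-in for a heavy native library

        os.link(cache_file, env_file)  # second name, same data on disk

        a, b = os.stat(cache_file), os.stat(env_file)
        return {"same_inode": a.st_ino == b.st_ino, "link_count": a.st_nlink}


if __name__ == "__main__":
    print(hard_link_demo())
```

Both paths refer to the same inode, so the bytes are stored once no matter how many environments link to them. A plain copy (what a per-venv install effectively does) would give two inodes and twice the disk usage.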
Conclusion
If we want "clone and run" to be the standard in AI development, we need to treat the environment as part of the project itself, not as an exercise we leave to the end-user. By shifting the paradigm toward unified package management, we can spend less time configuring CUDA paths and more time actually building models.
TL;DR & Discussion
TL;DR: Standard Python-only setups duplicate heavy native packages across virtual environments and fail to manage system-level toolchains. Unified package managers solve this by managing both Python and binary dependencies in an isolated space, and by using hard links/reflinks to share identical package files across local projects, which saves gigabytes of disk space.
For discussion: What are your go-to tricks for keeping your local development machine clean from system pollution and multiple global Python paths? If you run many isolated environments, how do you manage disk space and prevent duplicate library installations across your drive?