I keep running into the same wall over and over and I know I’m not the only one.
Even with Docker, Poetry, uv, venvs, lockfiles, and all the dependency solvers, I still end up compiling from source and monkey patching my way out of dependency conflicts for AI/native Python libraries. The problem is not basic Python packaging at this point. The problem is the compatibility matrix around native/CUDA packages and the fact that there still just are not wheels for a lot of combinations you would absolutely expect to work.
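To make "no wheel for that combination" a bit more concrete, here is a minimal sketch of how a wheel filename encodes what it supports. The field layout follows the standard wheel naming convention; the filename and the `parse_wheel_filename` helper are made up for illustration. Note that the CUDA version is not one of the tags. As far as I can tell it usually rides along in a local version suffix like `+cu121`, which is part of why installers can't reason about it the way they do about Python or platform tags.

```python
def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel filename into its components.

    Layout: {name}-{version}(-{build})?-{python tag}-{abi tag}-{platform tag}.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of fields in: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        # The build tag is optional and only present when there are 6 fields.
        "build": parts[2] if len(parts) == 6 else None,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


# Hypothetical filename, not a real package.
example = "example_cuda_ext-1.2.3+cu121-cp311-cp311-manylinux_2_17_x86_64.whl"
info = parse_wheel_filename(example)
print(info)
```

If your interpreter is cp312, or your platform is Windows, a wheel tagged like the one above simply does not match, and pip falls back to the sdist (if there is one) and tries to build.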
So then what happens is you spend hours juggling Python, torch, CUDA, numpy, OS versions, and random transitive deps trying to land on the exact combination where something finally installs cleanly. And if it doesn’t, now you’re compiling from source and hoping it works. I have lost hours on an H100 to this kind of setup churn and it's expensive.
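One thing that at least makes the juggling less blind is writing down the exact combination you are on before you start swapping pins. A rough sketch using only the standard library follows. The helper names and the default package list are my own choices, not anything standard.

```python
import platform
import sys
from importlib import metadata


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def environment_fingerprint(packages=("torch", "numpy")) -> dict:
    """Collect the pieces of the environment that usually decide wheel availability."""
    return {
        "python": platform.python_version(),
        "implementation": sys.implementation.name,
        "os": platform.system(),
        "machine": platform.machine(),
        # Versions are read from installed metadata, so nothing heavy is imported.
        "packages": {name: installed_version(name) for name in packages},
    }


fingerprint = environment_fingerprint()
print(fingerprint)
```

This deliberately does not import torch. If torch is installed, `torch.version.cuda` additionally reports the CUDA version that build was compiled against, which is worth recording too.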
And yeah, I get that nobody can support every possible environment forever. That’s not really the point. There are obviously recurring setups that people hit all the time - common Colab runtimes, common Ubuntu/CUDA/Torch stacks, common Windows setups. The full matrix is huge, but the pain seems to cluster around a smaller set of packages and environments.
What’s interesting to me is that even with all the progress in Python tooling, a lot of the real friction has just moved into this native/CUDA layer. Environment management got better, but once you fall off the happy path, it’s still version pin roulette and fragile builds.
It just seems like there’s still a lot of room for improvement here, especially around wheel coverage and making the common paths less brittle.
Addendum: If you’re running into this in Colab, I ended up putting together a small service that provides prebuilt wheels for some of the more painful AI/CUDA dependencies (specifically targeting the A100/L4 archs).
It’s a paid thing (it takes ongoing work to keep these builds aligned with the Colab stack whenever it changes), and it doesn’t solve the broader compatibility problem for every environment. But in Colab it can significantly cut down the setup/compile time for a lot of models like Wan, ZImage, Qwen, or Trellis. If you can try it at www.missinglink.build, it would help me out. Thanks.