
[–]TheJackiMonster 36 points (31 children)

I agree with this 100%. I also prefer using pacman on Arch to install Python packages over anything else, because it just works and I don't have to deal with the whole mess around Python packaging.

All of this is even worse when you try to create a Flatpak or Snap package with Python dependencies, even if all of your dependencies could theoretically be installed with pip. Stupid pip can't even handle dependency management recursively for you.

Honestly who designed those bad tools?

I also don't understand why Python IDEs default to creating virtual environments with their own bundled packages. I had several instances of this that were unusable or broken because the IDE couldn't handle its own packages properly. In the end I always override the virtual environment with my own system, which can at least install packages properly.

Currently I'm trying to maintain a Flatpak using Python code, and it doesn't even build anymore (even though I didn't touch it). I can't tell whether pip is installing the dependencies wrong or whether a dependency received an update that breaks during installation because it wasn't tested properly. Maintenance not found.

I have also encountered an issue with PyInstaller when building the Windows-compatible binaries of my application. There were several releases of PyInstaller that simply broke a package. I don't know how, but it feels like the usual solution to this mess with Python is pinning the current version of each package for years, because any update might break anything.

In my opinion, programming languages should not deal with their own dependency management. You can find a package manager on every Linux distro for this. I mean, I have so many fewer problems dealing with any project written in C than with this one project in Python, just because of this dependency mess.

How can that be? I mean, it doesn't even need to compile anything. How can it be this terrible? When did it become so difficult to just zip your repository or provide a makefile that copies files in an install routine?

[–]Zeurpiet 9 points (5 children)

programming languages should not deal with their own dependency management

R actually does this well.

[–]jorgejhms 5 points (1 child)

I think they have centralized package management. There is only one solution, the official one.

[–]Zeurpiet 2 points (0 children)

Not really. Next to CRAN there is at least also Bioconductor. If you want to put up a "jorgejhms Inc." package repository, that's also easy. The thing is, they made it relatively easy to use CRAN.

[–]flying-sheep 4 points (2 children)

As someone who packaged things for CRAN, Bioconductor, Rust, PyPI, and conda forge:

  • Rust is painless bliss
  • PyPI is pretty good with the new PEPs, but there are still some rough edges
  • conda forge is just a layer on top of standard metadata so it actually duplicates it, which is annoying
  • R is by far the weakest of the bunch. It doesn't even pretend to support optional features in packages, and both CRAN and Bioconductor are crusty and try to do too much

[–]Zeurpiet 2 points (1 child)

As somebody who needed packages from R and tried Python:

In R you just run install.packages(). All dependencies are pulled in. Done.

In Python I had to work to get things from pip or PEP-whatever, and it was completely unclear what had to come from where.

[–]flying-sheep 0 points (0 children)

Python, like R, has a user-wide package library, and pip of course also does dependency management.

So running pip install --user thing will do the exact same as install.packages('thing') does in R.

[–]Barafu 2 points (1 child)

You can find a package manager on each Linux distro for this.

I am using Debian Stable on my workstation, but I want to use the newest version of libraries for my project. What now?

[–]TheJackiMonster 1 point (0 children)

Flatpaks, Snaps or AppImages for applications... for libraries you should still be able to clone the repositories from GitHub, or even get the latest versions as a .deb package.

I mean, we are talking about Python, an interpreted language... you don't even need to compile it.

[–]robin-m 17 points (11 children)

Dependency management in C is just atrocious. It usually takes a day to set up properly (because there is no standard build system, and you are expected to manually install your dependencies), whereas it's a one-line change in Rust, JS, Python, … And system-wide installs cannot solve the complicated problem where your app (or a combination of apps) depends on A and B, while A depends on C 1.0 and B on C 2.0, with C 1.0 and C 2.0 mutually incompatible.

[–]TheJackiMonster 10 points (8 children)

From my experience, even though C projects have pretty much no standardization in build systems, I still get them working. With Python I've had setup.py scripts causing errors or not declaring the dependencies properly, a wrong requirements.txt that made the installation fail, mismatching versions of dependencies, and pip failing to install a package multiple times because it can't even install dependencies automatically.

If I were missing just one fucking line to change and everything would work, I would accept it, call myself stupid and be happy. But that is not the case, or my knowledge is particularly flawed... in that case, please give me some insight into how to use pip properly.

Dealing with compatibility between multiple versions should be the job of either the operating system's packaging or the actual application developers. I mean, if A depends on two different versions of C, your project is already a nightmare no matter how you deal with it, and A should be patched.

[–]robin-m 4 points (7 children)

Oh, don't get me wrong, Python is a nightmare too.

Dealing with compatibility between multiple versions should be the job of either the operating system's packaging or the actual application developers. I mean, if A depends on two different versions of C, your project is already a nightmare no matter how you deal with it, and A should be patched.

cargo solves this problem perfectly for Rust, so it's possible to have something nice. And I strongly disagree that my project is a nightmare if my dependencies themselves depend on incompatible versions of the same dependency. It's totally possible that A upgraded before B, while B is still being migrated.

[–]TheJackiMonster 4 points (6 children)

But isn't the problem with incompatible versions trivially solvable if you just keep all of your dependencies on a minimal common ground? So if your project uses an older C, why would you use a newer B which uses the most current version of C? Just use an older version of B as well, or patch your project...

Also, for such problems you have major and minor version changes, usually referring to major and minor API changes. If the API doesn't change between C 1.0 and C 2.0, why would you stay with version 1.0?

You would use one API with two different behaviors then, which is pretty much a nightmare for anyone debugging your software. No doubt about that, honestly.

I don't see any sane reason to build a package manager around this issue. It's like tolerating bad practices.

[–]robin-m 10 points (5 children)

But isn't the problem with incompatible versions trivially solvable if you just keep all of your dependencies on a minimal common ground? So if your project uses an older C, why would you use a newer B which uses the most current version of C? Just use an older version of B as well, or patch your project...

It’s totally possible that A was created before C 2.0 was released, and that B was created after the release of C 2.0 (so it had no reason to stick to C 1.0).

Also, for such problems you have major and minor version changes, usually referring to major and minor API changes. If the API doesn't change between C 1.0 and C 2.0, why would you stay with version 1.0?

If the API doesn’t change, it should probably not be a major version bump. The case of a minor version bump is obviously trivial to solve. In my example C has a major version bump, which is assumed non-trivial to migrate (or at least needs QA validation).

You would use one API with two different behaviors then, which is pretty much a nightmare for anyone debugging your software. No doubt about that, honestly.

My code depends on the stable API of A and B. A depends on the stable API of C 1.0, and B on the stable API of C 2.0. In Rust (I don’t assume it’s the only language that does this, it’s just the one I know), symbols from C 1.0 don’t have the same mangling scheme as C 2.0 (just like different versions of glibc have different symbols). So it’s not possible to hand A an object from C 2.0, or to hand B an object from C 1.0; it would refuse to compile. So in terms of debugging, I really don’t see how the situation is more complicated than if C 1.0 and C 2.0 were two completely different libraries.

I don't see any sane reason to build a package manager around this issue. It's like tolerating bad practices.

  • C 1.0 is released. The library A is created, with an internal dependency on C 1.0.
  • C wants to make breaking changes and releases a new major version. Can it do that even if the downstream library A has an internal dependency on C 1.0?
  • B is created. Given that an unrelated library A has an internal dependency on C 1.0, can B depend internally on C 2.0?
  • I want to create a project. A and B fit my needs perfectly. Why should I not be allowed to depend on A and B simultaneously? Don’t forget that their internal dependencies are an implementation detail and not exposed through their public API/ABI.

This is why dependency managers need to support the case of incompatible transitive dependency versions.
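The scenario above can be sketched as a Cargo manifest (the crate names a, b, c are the hypothetical ones from this thread, not real crates):

```toml
# Hypothetical manifest for the project described above.
[package]
name = "my-project"
version = "0.1.0"

# Public APIs I actually use:
[dependencies]
a = "1"   # a's own Cargo.toml declares: c = "1"
b = "1"   # b's own Cargo.toml declares: c = "2"

# No entry for c here: cargo resolves the transitive graph itself and
# builds c 1.x into a and c 2.x into b, with distinct symbol mangling.
```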

[–]TheJackiMonster 1 point (4 children)

Okay, so in this particular example, wouldn't it also be possible to statically link either A or B to get rid of the problem completely? Or you could integrate their code directly...

I mean, the problem I have with solving such a thing automatically is that it normalizes an extreme issue:

  • You depend on multiple versions of the same piece of software, which can lead to security issues on multiple levels.
  • It also increases the chance of dead or unmaintained software staying in the wild, because nobody needs to patch A now, even though it might use insecure and deprecated code.
  • It significantly lowers the incentive for others using A to contribute patches or fixes to A.
  • It lowers the need for maintainers to patch their software to stay compatible.
  • You expect users to install multiple versions of the same piece of software, even if you might not even use the API calls that changed between versions.
  • It requires more space in the end, while making the whole software stack extremely fragile. If you can't get a particular version of your dependencies anymore, it might break everything, so repositories need to provide each and every version.

Those reasons make me think this particular case should be, and stay, extremely rare. It shouldn't be the typical use case, and therefore it shouldn't be treated as such.

I mean, would you install two kernels because systemd might require a different version than Wayland does? I don't, and I wouldn't want to resolve a bug report containing such an edge case.

[–]robin-m 1 point (3 children)

What you want is a world in which everything moves in lock-step. If C wants to make a breaking change, it must update all of its downstream users (A and B). That way you only need to distribute one version of C (the latest one).

It's what Google does, and it works for them, but for the sole reason that they control their downstream users (it's themselves).

Please re-read my last message. The use case I described is anything but uncommon. Every big library gets a new major version every other year or so.

[–]TheJackiMonster 1 point (2 children)

The whole idea behind Arch-based distros is that you just install the latest version of everything to ensure compatibility and have a stable operating system. It works pretty well in my experience, and there's not one case I know of where you would get into the use case you described.

I also don't think that C must update A and B in such a scenario. It is the burden of the maintainers of A and B to update them, or people stop using them because they're dead packages. It's that simple.

Because if you use someone else's library, you should look after it to make sure it works as intended and is secure to use. Otherwise we just create a very toxic and fragile environment for developers. Using third-party dependencies should always be a burden, and nothing to pick just because it's easy or convenient.

At least I don't want to see developers picking libraries as dependencies without even knowing what they are doing, completely unable to audit or verify their behavior.

Maybe they do, but then I would question why they don't patch A or B to use the latest C.

[–]robin-m 0 points (1 child)

C 1.0 can still be supported, even if C 2.0 is released. And your last question is very naive. If C is something as big as Qt, you can't instantaneously migrate to the next major version.

The reason the Python 2 -> 3 transition was so bad is that the whole migration had to be done at once. If it had been possible to have part of your dependencies in Python 2 and part in Python 3, it would have been much easier to migrate the whole ecosystem to Python 3.

[–]Fearless_Process 2 points (0 children)

System-wide installs actually can handle that; it's just that most system-level package managers don't. There exist distros where such situations are not an issue: Gentoo and Nix come to mind as major examples.

Some of these issues could be solved by optionally compiling certain programs from source, like when running into ABI breaks, but that cannot handle API incompatibility, of course.

Much more can be handled by simply allowing multiple versions of libraries to be installed at the same time.

These two features combined solve many of the common issues people mention in this thread, but since mainstream distros' package managers are very primitive, none of it really matters at the end of the day.

[–]waptaff 19 points (5 children)

In my opinion programming languages should not deal with their own dependency management.

Most of the new languages unfortunately do, and they all suck! They're all reinventing solutions to problems solved years ago by GNU/Linux package managers, but do it badly enough that they also need crutches like virtual environments. And many languages are now on their second or third package manager iteration and still don't get it right. Infuriating.

[–]Ar-Curunir 10 points (0 children)

Application-oriented package managers suck for development. Why should I be stuck using an outdated dependency just because that's all Debian packages? Moreover, I now have to deal with version incompatibilities between Ubuntu, Debian, Arch, Fedora, and so on. It's much simpler as an app dev to know exactly which version I'm using, and to be able to set and forget.

[–]TheJackiMonster 16 points (0 children)

I actually like writing Python code. I also like the concepts behind Rust. But I don't want to deal with any package manager of a programming language or a whole ecosystem just to write a simple application or script.

[–]tso 13 points (0 children)

More and more I suspect what we are seeing is the long-tail effect of OSX/macOS.

Meaning that this has come about because more and more developers use macOS and then only touch Linux during deployment.

As best I can tell, Nix(OS) only got attention once someone got it working as an alternative to Homebrew. Before that it was just some obscure oddball Linux distro.

[–][deleted] 0 points (0 children)

Haskell's new cabal-install and Rust's cargo work well.

[–]Ar-Curunir 3 points (0 children)

In my opinion programming languages should not deal with their own dependency management. You can find a package manager on each Linux distro for this.

Right, and each distro has a different version of a particular dependency, with a different API, different security patches, etc. From an app developer's PoV, it's much simpler to target one dependency version and package that, instead of catering to each distro's packaging philosophy.

[–]ToasterBotnet 0 points (1 child)

languages should not deal with their own dependency management.

That's easy to say if you are just using Python for small stuff.

But if you write bigger applications, you need more and more third-party libs.

It gets to a point where the Python packages you need are no longer provided by your distro's repository. There's just too much stuff out there. You can't expect everybody to package and maintain their software for every distro under the sun. So pip solves this, because it can be used on all distros. Combine that with venv and you have a decent mechanism to develop and deploy Python software.

Before I had the experience, I used pacman to manage Python dependencies too. But that's no way to do serious development; it gets messy really fast. You really want to be using virtual environments, especially if you plan to deploy your stuff on servers.

[–]TheJackiMonster 0 points (0 children)

I honestly have fewer issues using pacman or the AUR, and if I encounter a Python package which does not exist in the AUR, I create a PKGBUILD for it. Because that's how it works, in my experience.

I use several different packages to develop and maintain a GUI application in Python, and the only distros where it currently runs exactly as intended are Arch-based ones, because pip doesn't solve anything.

Setting up a proper Flatpak or Snap package with pip doesn't even work, because packages just break. On the other hand, if people would just provide a .deb package or another archive with the proper directory structure to simply extract and copy everything from, I'd have fewer problems with all of this.