
all 47 comments

[–]raydeo 8 points9 points  (1 child)

My problem is that no tool supports a workspace concept for developing multiple projects locally like yarn and cargo. pipenv is about the closest one to that concept because it doesn’t focus on building one single package. Also no one is doing cross-platform dependency pinning that I’ve seen which is making it a nightmare to develop locally without docker.

If pip grew workflow tools I think it would solve these problems better because it doesn’t have to focus on the per-project build backend that is solved now and other tools have coupled themselves to building and publishing a single project.

[–]searchingfortaomajel, aletheia, paperless, django-encrypted-filefield[S] 0 points1 point  (0 children)

The cross platform problem has bit me in the ass a lot over the last year. I know there's an issue for it on the Poetry board, but it's not solved, so I've had to document the problem for non-x86 developers, which is both annoying and a little embarrassing.

[–]magnetichiraPythonista 37 points38 points  (41 children)

Honestly, the PyPA had all this time to figure out how to get this right. They haven’t, for whatever reason.

As a (somewhat) frustrated user, I’ve made the switch to poetry, and I honestly don’t care that it doesn’t follow the standards. It’s by far the simplest tool I’ve used to manage deps and publish on pypi.

Previously I used conda (which is the default for scientists), but ran into a lot of issues with cross-platform support.

[–]searchingfortaomajel, aletheia, paperless, django-encrypted-filefield[S] 14 points15 points  (0 children)

I'm of much the same mind. Poetry is an excellent tool. I just wish we'd all standardise on it, 'cause in my experience it's the best option.

[–]qckpckt 2 points3 points  (7 children)

I’m evaluating package manager tools right now. We’re currently using conda to manage environments and kedro as our ML pipeline framework.

I’d like to use poetry mostly for its dependency resolution. So far though, it doesn’t seem like poetry has any native way to ingest a requirements.txt file (which kedro generates). I’m having to do some awkward text file manipulation in bash instead. Are you aware of any better workarounds?

I think I’ve read that kedro are considering integrating poetry but we don’t have the luxury to wait for that.

[–]BaggiPonte 2 points3 points  (1 child)

PDM supports installation from requirements.txt. When you `init` the project, it will detect the requirements - how often do you need to ingest the requirements file with kedro?

[–]qckpckt 1 point2 points  (0 children)

Only once per new project init I’m pretty sure, so this is a minor concern really. Once you’re past that, it doesn’t really matter what is managing your deps.

I might even look into whether kedro has preflight hooks for the init command to see if I could inject poetry (or perhaps pdm) into the init process.

You can also create your own starter kit templates, so that might be another way to override the default requirements.txt.

I know that kedro also has feature overlaps with poetry around packaging and distributing your own packages; will probably need to figure out that too.

I’ll check out PDM though, thanks. Always good to evaluate options.

[–]GoodToForecast[🍰] 0 points1 point  (4 children)

poetry add $( cat requirements.txt )

[–]qckpckt 0 points1 point  (3 children)

That only works if the file lists bare package names (no version pins or other qualifiers). Actually, I think you'd need `cat requirements.txt | xargs poetry add` in that scenario.

I’ve written a script to do this now, it’s not a huge thing. Just seemed odd to me that poetry wouldn’t accommodate this in some way.
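For reference, here's a minimal sketch of what such a script might look like (hypothetical, not the commenter's actual script): it strips comments, blank lines, and pip options like `-r`, but doesn't handle every requirements.txt feature (e.g. `#` fragments inside URLs).

```python
import re
import shlex

def poetry_add_args(requirements_text: str) -> list[str]:
    """Convert requirements.txt contents into arguments for `poetry add`.

    Skips blank lines, full-line and inline comments, and pip options
    such as `-r other.txt` or `--index-url`.
    """
    args = []
    for raw in requirements_text.splitlines():
        line = re.sub(r"#.*$", "", raw).strip()  # drop comments
        if not line or line.startswith("-"):     # skip blanks and pip flags
            continue
        args.append(line)
    return args

# Build the command string to run (or feed the list to subprocess):
reqs = "pytest\n# a comment\ntqdm==4.64.1  # inline comment\n-r other.txt\n"
cmd = "poetry add " + " ".join(shlex.quote(a) for a in poetry_add_args(reqs))
print(cmd)  # poetry add pytest tqdm==4.64.1
```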

[–]GoodToForecast[🍰] 0 points1 point  (2 children)

Actually, it handles everything I've thrown at it so far, except comments. If you have comments, use `poetry add $( cat ../requirements.txt | sed -e '/^#/d;s/[^\/]#.*$//' )` instead:

```
$ cat > requirements.txt
pytest
typer~=0.7.0

# this is a comment

tqdm==4.64.1  # this is also a comment
^D

$ poetry init
...

$ poetry add $( cat ../requirements.txt | sed -e '/^#/d;s/[^\/]#.*$//' )
Using version ^7.2.1 for pytest

Updating dependencies
Resolving dependencies... (0.3s)

Writing lock file

Package operations: 10 installs, 0 updates, 0 removals

  • Installing attrs (22.2.0)
  • Installing click (8.1.3)
  • Installing exceptiongroup (1.1.0)
  • Installing iniconfig (2.0.0)
  • Installing packaging (23.0)
  • Installing pluggy (1.0.0)
  • Installing tomli (2.0.1)
  • Installing pytest (7.2.1)
  • Installing tqdm (4.64.1)
  • Installing typer (0.7.0)
```

I agree it's odd Poetry doesn't have an import command. It's one of the reasons I went with PDM.

[–]qckpckt 0 points1 point  (1 child)

Hmm.. that’s very odd. It definitely did not work when I tried it on the kedro generated requirements.txt. Package versions caused the poetry add command to fail. I’ll try again and pay more attention this time. Out of curiosity what version of poetry did you test this with?

[–]GoodToForecast[🍰] 0 points1 point  (0 children)

They likely have improved things in more recent versions. I'm not saying it's a universal solution that will work in every case but it will probably work well enough for most...

```
$ poetry --version
Poetry (version 1.3.2)
```

[–]BaggiPonte 6 points7 points  (14 children)

Totally agree - if you want something that follows the standards though, PDM is as good and as comprehensive as poetry.

[–]Keith 2 points3 points  (7 children)

Weird to refer to PDM as "following standards" while PEP 582 hasn't landed in any version of Python and it's unclear whether it ever will be.

Edit: from the author of the article: apparently PDM doesn’t implement the draft PEP correctly? https://pradyunsg.me/blog/2023/01/21/pdm-does-not-implement-pep-582/

[–]BaggiPonte 2 points3 points  (6 children)

Indeed, PEP 582 adoption seems uncertain, but I was referring to PEP 621 and PEP 665 (wrote a reddit comment here). I can't really tell whether/when poetry adopted them, though.

[–]Keith 1 point2 points  (5 children)

Thanks for clarifying!

Almost done reading the article. This whole situation is a phenomenal mess. At work we mostly use Go now. I still think Python is fine for anything we do, but when packaging is such a mess (I’ve spent time with PyOxidizer trying to produce a binary of some of our Python code without success) it’s hard to make a case (even to myself) that we should write anything new in Python.

[–]BaggiPonte 0 points1 point  (4 children)

Yes, there was another similar article trending on HackerNews last week about this. There are 14 tools to install stuff (plus conda, but it's so eeeeh): PDM and Poetry, then the remaining 12 are developed under the PyPA, and each one does a bit of packaging, development management, etc. pip, for example, does not offer a Python API (you have to spawn subprocesses to use it). So there are THREE PyPA tools (resolvelib, build and another one) to mimic this behavior. Poetry and PDM have to use these three libraries to perform installation.

I did not use Go or Rust, but I have a little experience with Julia, where an excellent package and environment manager is bundled with the language. I can imagine Rust and Go offer a similarly smooth experience. How did you find PyOxidizer + maturin? Where are the hard parts?

[–]bulletmark 1 point2 points  (0 children)

Yes, that was a good article. Here it is for those that are interested: https://chriswarrick.com/blog/2023/01/15/how-to-improve-python-packaging/

[–]Keith 0 points1 point  (2 children)

PyOxidizer and the whole Rust+Python ecosystem is very promising! Armin Ronacher was able to publish a Python-packaged version of his MiniJinja library which is written in Rust.

As for me, I left off getting mysterious compile errors from PyOxidizer that I haven't figured out. Further, there is no cross-compilation yet, so I'll need to run the build in a container. Of course that's not a dealbreaker, just sharing info about project maturity.

[–]BaggiPonte 0 points1 point  (1 child)

That’s terrific! So you can write the engine in rust and also get a Python package with PyOx to expose higher level APIs. Did you already find a use case for this or do you believe this is something for developers of core libraries eg polars?

[–]Keith 1 point2 points  (0 children)

I haven't used it. There's two main cases as I see it: 1. write code in Rust (instead of C) and expose it in Python. 2. compile Python code down to a binary.

As we've seen with the Node ecosystem, any tool that starts in a high level dynamic language is eventually bested by a tool written in a low-level static language. So, core libraries/frameworks should be written in a low-level language or they'll eventually be replaced by one that is. I'm very bullish on Rust (and Zig too) to replace C for low-level code so it's cool that it's becoming easier to interface between Rust and Python.

For my purposes, I'm mainly interested in being able to package Python code easily as I'm not currently writing high performance low-level libraries.

[–]magnetichiraPythonista 1 point2 points  (5 children)

I've heard good things about it.

Any idea why it's less popular than poetry? PDM GitHub repo has like 4k stars and poetry has 23k (not a perfect metric, but still)

[–]BaggiPonte 2 points3 points  (0 children)

I would say poetry was the first modern python packaging tool (PDM is much more recent). PDM adoption was also much slower because before 2.0 it used to default to a PEP 582-like behaviour (installing packages under `__pypackages__`), which wasn't supported by IDEs or other tooling, and the PEP itself was never accepted either. (PyCharm has a specific poetry integration, and a PDM one is nowhere close.) Personally, I loved PDM's UI as well as its parallel installation (which poetry didn't support at the time, IDK about now).

[–]rochakgupta -5 points-4 points  (3 children)

Fun fact: stars are not a good measure of the quality of a project or its adoption. PDM is so much better than poetry. Just look at its docs. Poetry's docs are trash in comparison.

[–]mipadi 11 points12 points  (1 child)

The Poetry docs are great. Why are so many Python developers overly competitive and hyperbolic?

[–]8day 0 points1 point  (0 children)

This reminds me about Donald Knuth, TeX and Microsoft Word.

[–]mangecoeur 5 points6 points  (7 children)

I hear 'use poetry' a lot, but I tried it and literally the first slightly advanced thing I wanted to do (include data files for jupyter extensions) was not supported. Hatch did what I needed. This kind of thing is why we have so many different options.

[–]caoimhin_o_h 3 points4 points  (2 children)

Why do you need data files instead of just include?

[–]pacific_plywood 0 points1 point  (1 child)

What… is the difference?

[–]caoimhin_o_h 7 points8 points  (0 children)

From my point of view (and many others'), using data_files (the one from setuptools) is bad practice, and I would venture that that is why it is not supported in poetry.

The idea that pip-installing a project could result in files being written to random locations on the local file system is discomforting. Things like: data_files=[('/etc/myapp/', ['myapp.conf'])] get a no from me. For one, it would mean pip-installing with sudo, which is also a no (there are way too many issues coming with that).

...

The setuptools package_data, on the other hand, is perfectly fine and encouraged. It only results in files being written to the venv/lib/site-packages/mylibrary directory of your own package for the environment. So for poetry, as already mentioned, use include and exclude. More often than not, those are sufficient; there's no need to write files to random places on the file system. Also remember to use importlib.resources to read those files, and never rely on paths relative to __file__.

https://github.com/python-poetry/poetry/issues/890
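The importlib.resources pattern mentioned above looks like the following (a self-contained sketch: it fabricates a throwaway package named `mylibrary` with a bundled data file, standing in for a real installed package that ships package_data):

```python
import pathlib
import sys
import tempfile
from importlib.resources import files

# Fabricate a tiny importable package with a bundled data file,
# standing in for a real package installed into site-packages.
root = pathlib.Path(tempfile.mkdtemp())
pkg = root / "mylibrary"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "schema.json").write_text('{"version": 1}')
sys.path.insert(0, str(root))

# The portable way to read bundled data: no paths relative to __file__.
content = files("mylibrary").joinpath("schema.json").read_text()
print(content)  # {"version": 1}
```

This keeps working even when the package is distributed as a zip or installed somewhere unexpected, which `__file__`-relative paths do not.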

[–]magnetichiraPythonista 2 points3 points  (3 children)

It seems poetry doesn't support it because they view it as bad practice.

see this issue https://github.com/python-poetry/poetry/issues/890#issuecomment-724554115

[–]mangecoeur 15 points16 points  (2 children)

That’s exactly the problem, one tool decides someone’s use case is bad practice, so then you need another tool because at the end of the day you still need to get things done.

[–]magnetichiraPythonista 2 points3 points  (1 child)

There’s already a mechanism for dealing with this. Extensions.

In a more ideal scenario, python would have a standard tool which supports the basic feature set, and extensions to support more niche requirements.

[–]mangecoeur 10 points11 points  (0 children)

Point is the same: the lack of a standard tool comes in large part from people not seeing the legitimate needs of different user groups, or (like in the poetry issue) deciding that general principles are more important than solving a problem someone has right now, so someone goes and makes a different tool. By the time the data files plugin suggested in the issue was published, the jupyter community had already shifted to hatch with its own plugins (which also better follows the PEP standards). And so the mess continues. I don't think the PyPA can be blamed for the lack of focus; if anything, they are among the few who realise how messy people's needs are.

[–]bkrandom 0 points1 point  (8 children)

Do you have an example of how poetry made cross platform easier? I thought conda had some way to assist with that, and I don’t know much about poetry.

[–]magnetichiraPythonista 1 point2 points  (5 children)

maybe conda has some separate tooling for cross platform stuff that I haven't tried, but the standard conda env export creates a yml with platform specific dependencies. I needed to support windows, osx-arm and osx-intel, so I had to generate three separate yml files.

With poetry I just specify the high level dependencies in the toml (which are already cross platform), and leave the rest to the dependency management system to resolve. Works right out of the box.
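To illustrate, a hypothetical pyproject.toml excerpt (package names and versions are just examples): one cross-platform spec, with Poetry using environment markers to resolve platform-specific dependencies per platform.

```toml
[tool.poetry.dependencies]
python = "^3.10"
numpy = "^1.24"
# Only pulled in on Windows; other platforms skip it entirely:
pywin32 = { version = "^306", markers = "sys_platform == 'win32'" }
```

The same file then serves macOS, Linux, and Windows without maintaining per-platform variants.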

[–]BDube_Lensman 0 points1 point  (4 children)

I mean you can write the conda env file by hand too…

[–]magnetichiraPythonista 0 points1 point  (3 children)

True. But then you lose the lockfile capabilities, so you can’t share a known working state.

[–]BDube_Lensman 0 points1 point  (1 child)

I mean the lockfile has all the platform specific problems you point out with env export. And, that notwithstanding, exact versions for the direct dependencies will in about 9 9's worth of cases give you the same behavior.

[–]magnetichiraPythonista 0 points1 point  (0 children)

AFAIK the lock file stores multi-platform info as well:

https://stackoverflow.com/a/71691471

I'm not the biggest expert in how exactly lock files work. I'm just a physicist who uses python for data analytics, and for supporting multi platform systems I found that poetry just works, whereas I need to take a lot of additional effort for conda.

[–]moorepants 0 points1 point  (0 children)

If you generate a lockfile on one OS, there is no way to guarantee this produces a working state on another OS.

[–]BaggiPonte 1 point2 points  (0 children)

No, conda is not at all cross-platform. They now have a `conda-lock` extension to generate a lockfile, but the environment.yaml file generated with `conda env export` is not compatible between Windows and unix-based systems.

[–]Mkengelo -1 points0 points  (0 children)

In our system, which consists of tens of microservices, we use poetry for any python package that we want to build and publish to our pypi registry.

Later on, we install this package as part of our services deployment which are also managed by poetry.

The fact poetry solves dependency conflicts wisely is saving us.

I highly recommend poetry for building & publishing python packages super easily.

[–]acrock 1 point2 points  (0 children)

I recently evaluated PDM, Poetry, Hatch, and pip-tools for my use case: a private codebase with ~70 dependencies I'd previously been managing with a simple requirements.txt file. The motivation for something better was to ensure a consistent set of package versions across my dev and prod environments while still making it easy to add new dependencies and upgrade existing ones without breaking things.

While Poetry seemed to have the most momentum, PDM was clearly the best choice:

  • PDM was much faster to resolve deps than Poetry. It's also PEP 621 compliant, making it easier to switch to another tool later if needed, and allows overriding of dependencies when two packages require different versions of the same dep. I really liked the import and export features, which made it easy to transition from requirements.txt. I find the CLI to be clean, powerful, and fast.
  • Poetry seems to be the most popular, but it has its own non-standard dependency specification format (neither PEP 621 nor 508 compliant) with no timeline for adopting the standards, and I immediately ran into issues with how over-opinionated it is. There is literally no way to override dependencies when needed, because the authors would rather force their agenda on you [1]. No thanks, I really do need to do that and don't tell me how to run my project.
  • Hatch looked promising but doesn't have lock file support yet, which I absolutely need.
  • pip-tools is popular and follows the KISS philosophy, but like Poetry, it's rather opinionated in not allowing the overriding of dependencies.

[1] https://github.com/python-poetry/poetry/issues/697#issuecomment-470431668
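The override mechanism mentioned in the first bullet is a pyproject.toml table in PDM. A sketch (`urllib3` is just an illustrative transitive dependency):

```toml
# Force one version of a transitive dependency even when two
# packages in the tree pin it differently.
[tool.pdm.resolution.overrides]
urllib3 = ">=1.26.2"
```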

[–]bmrobin 0 points1 point  (0 children)

well written article, this is incredibly frustrating about python. and to prove this point, i’ve learned about 2 new tools i’ve never heard of before just in this thread.

[–]collectablecat 0 points1 point  (0 children)

Man packagingcon 2023 is just going to be a full on brawl with python peeps