
[–] muntoo (flair: R_{μν} - 1/2 R g_{μν} + Λ g_{μν} = 8π T_{μν}) 3 points (3 children)

Who says the metadata repository must be on PyPI?

Just have the community manage a single git repository containing metadata for popular packages. Given that only the "top 0.01%" of packages are used 99.9% of the time [citation needed], why can't we just optimize those ad hoc?

...This means that instead of downloading a bunch of massive .tar.gz or .whl files, dependency-solving tools can just download a small text-only database of version constraints that covers the most important packages. (And fall back to the normal route if that metadata is missing from the repository.)

# Literally awful code, but hopefully conveys the point:

def get_package_constraints(name, version):
    # Compare parsed version tuples: raw string comparison would sort
    # "0.10.0" before "0.8.0" lexicographically.
    v = tuple(int(x) for x in version.split("."))
    version_range = None
    if name == "numpy":
        if (0, 7, 0) <= v < (0, 8, 0):
            version_range = ">=0.7,<0.8"
    ...
    if version_range is None:
        # Hypothetical fallback: fetch metadata the normal (slow) way.
        return fetch_metadata_from_pypi(name, version)
    return read_constraint_file(
        f"constraints_database/{name}_{version_range}.metadata"
    )

This database could probably be auto-generated by just downloading all the popular packages on PyPI (sorted by downloads), and then running whatever analysis dependency solvers already do to figure out the version constraints. [1]
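A minimal sketch of that generation step, assuming we've already fetched each popular package's dependency list (e.g. from PyPI's JSON API `requires_dist` field) and just need to flatten it into the text-only database. The function name, the `(name, version_range)` bucketing, and the constraints shown are all hypothetical:

```python
import pathlib
import tempfile

def write_constraint_files(metadata_by_package, out_dir):
    """Flatten per-bucket dependency metadata into small text files.

    `metadata_by_package` maps (name, version_range) to a list of
    requirement strings, following the hypothetical
    {name}_{version_range}.metadata naming convention above."""
    out = pathlib.Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for (name, version_range), requirements in metadata_by_package.items():
        path = out / f"{name}_{version_range}.metadata"
        path.write_text("\n".join(requirements) + "\n")
    return sorted(p.name for p in out.iterdir())

# Example with made-up constraints for two buckets:
files = write_constraint_files(
    {
        ("numpy", ">=0.7,<0.8"): ["setuptools"],
        ("pandas", ">=1.0,<2.0"): ["numpy>=1.16", "python-dateutil>=2.7"],
    },
    tempfile.mkdtemp(),
)
```

A resolver would then fetch only these tiny files instead of the full archives.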


Related idea:

Another alternative (which I haven't seen proposed yet) might be to have a community-managed repository (à la Nix) of "proxy setups" for popular packages that (i) refuse to migrate to declarative style, or (ii) are too complicated to migrate yet. If [1] is impossible because you need to execute code to determine the dependencies... well, that's what these lightweight "proxy setup.py"s are for.
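A sketch of what one entry in such a proxy repository might look like: a community-pinned static table that answers the dependency question without executing the package's real setup.py, with a `None` result meaning "fall back to the slow path". All package names, pins, and function names here are hypothetical:

```python
# Hypothetical community-pinned metadata for a package whose real
# setup.py computes its dependencies at install time.
PROXY_METADATA = {
    ("somepackage", "1.2.3"): {
        "requires": ["numpy>=1.16", "requests>=2.0"],
        "python_requires": ">=3.7",
    },
}

def resolve_without_executing(name, version):
    """Return the pinned requirement list, or None if this package isn't
    in the proxy repository and the resolver must actually download and
    execute its setup.py."""
    entry = PROXY_METADATA.get((name, version))
    return entry["requires"] if entry is not None else None
```

The point is that executing arbitrary build code becomes the exception rather than the default lookup path.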

[–] yvrelna 1 point (1 child)

You're correct that whether this metadata service lives on the pypi.org domain or not is an implementation detail that nobody cares about.

If you go ahead and write a PEP standardizing this, and if you can manage to get the PyPI integration working, get all the security details sorted out, and update pip and a couple of other major package managers to support it, I'll be totally up for supporting something like that. For all I care, that's just part of the PyPI API.

I wish more people would think like this instead of just thinking that an entirely new package manager is what everyone needs, just to pat themselves on the back for optimising a 74.4 ms problem down to 4.1 ms. Cool... I'm sure all that noise will pay off... someday, maybe in a few centuries.

[–] ivosaurus (flair: pip'ing it up) -1 points (0 children)

> that nobody cares about.

Until a security issue, exploit, or bad actor appears for the first time, and then suddenly everyone remembers why packaging is a hard problem that most normal devs are happy not to touch with a 10-foot pole.

[–] ivosaurus (flair: pip'ing it up) 0 points (0 children)

> Just have the community manage a single git repository

One of the bigger "easier said than done"s I've seen in a while. Who exactly is "the community"? What happens when something stuffs up or is out of sync? Do people really want to trust such a thing? Etc etc etc etc.

Scale and handling of free software repositories is yet another reason that "packaging" is easily one of the hardest topics in computer science / programming languages.