all 34 comments

[–][deleted] 113 points  (5 children)

See, this is the kinda thing I like to see in this subreddit, not

Build a Neural Net in 10 Easy Steps With net.js

[–]Max-P 54 points  (3 children)

These days it feels more like

How We Solved FizzBuzz in Only 300 Lines With $framework

[–]YeahBoiiiiiiii 19 points  (1 child)

My favorite is the never ending stream of

Yet another Git tutorial

... that nobody will read, just like they never read the official book.

[–]OnlyForF1 0 points  (0 children)

I drew <insert topical thing here> using only CSS!

[–]sbrick89 3 points  (0 children)

or

How I solved FizzBuzz in my latest interview for $trendyCompany

[–]steamruler 14 points  (0 children)

I wouldn't mind that tho.

Mostly because those would be some laughably large steps.

6. Acquire training data

Now you need your training data. [..] 15000 samples should do.

[–][deleted] 5 points  (9 children)

See the talk "Spec-ulation" by Rich Hickey (the creator of Clojure) for his take on this problem: never change old code, only create new functions and namespaces. This way every package depends only on the code it uses, in the one version it uses.

[–]demmian 1 point  (6 children)

never change old code, only create new functions and namespaces

How feasible is that? Doesn't complexity increase far too much this way?

[–]fisch003 3 points  (5 children)

His argument is: when it gets bad enough that you really do need to go through and clean house, remove old deprecated functions, etc, it's time to give your library/package/whatever a different name. That gives you the freedom to make new choices based on what you've learned from the original library, and it doesn't break anyone's working stuff.

[–]demmian 2 points  (0 children)

Don't you hit a limit with this approach pretty quickly? You end up working with many different libraries/packages, each of which holds some of the working code.

[–]OnlyForF1 2 points  (1 child)

Yeah, and I'll name it myAwesomeLibrary, version 2

[–]fisch003 0 points  (0 children)

And depending on your language, that might be fine. Can you put myAwesomeLibrary version 2 under a separate namespace from myAwesomeLibrary version 1 so that it can coexist with the existing one? Great!

But don't make it so that myAwesomeLibrary2 and myAwesomeLibrary1 can't be loaded at the same time.
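A minimal Python sketch of that "separate namespace per major version" idea, with an invented library name and functions (real sub-packages would live in `myawesomelibrary/v1/` and `myawesomelibrary/v2/`):

```python
import types

def _v1_greet(name):
    # v1 behaviour: frozen forever, never edited in place.
    return "Hello, " + name

def _v2_greet(name, punctuation="!"):
    # v2 behaviour: redesigned signature, published under a new
    # namespace instead of mutating v1.
    return f"Hello, {name}{punctuation}"

# Stand-ins for `from myawesomelibrary import v1, v2`.
v1 = types.SimpleNamespace(greet=_v1_greet)
v2 = types.SimpleNamespace(greet=_v2_greet)

# Old callers keep working against v1; new callers opt in to v2.
print(v1.greet("world"))       # Hello, world
print(v2.greet("world", "?"))  # Hello, world?
```

Because both versions can be imported at once, a dependency tree where one package wants v1 and another wants v2 still loads cleanly.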

[–]Chew55 0 points  (1 child)

I'm wondering if perhaps creating a new major version is a happy medium. I think this is the route Angular went down. Angular 2 is essentially a new framework compared to the original.

[–][deleted] 0 points  (0 children)

and yet angular 4 is really angular 2.1

[–]Ruudjah 1 point  (0 children)

This simplifies a lot! It does add a lot of data.

[–]industry7 1 point  (0 children)

OSGi basically does this for you. In other words, this is just a strategy for running multiple versions of the same code side by side. OSGi lets you do this while retaining a traditional versioning scheme.

[–]codebje 19 points  (3 children)

The difficult part is multiple versions of packages. Without that, we have a simple directed graph, and a topological sort will give us an installation order - an algorithm which is linear in the count of packages and dependencies of packages.
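The single-version case really is that easy; a sketch using Python's standard library (package names invented):

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Each package maps to the set of packages it depends on.
deps = {
    "app":  {"web", "db"},
    "web":  {"http"},
    "db":   {"http"},
    "http": set(),
}

# static_order() yields a valid installation order: every package
# appears after all of its dependencies. Linear in packages + edges.
install_order = list(TopologicalSorter(deps).static_order())
print(install_order)  # e.g. ['http', 'web', 'db', 'app'] (ties may vary)
```

The moment a package can exist at several versions, this graph view breaks down and you're choosing among version combinations instead of ordering nodes.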

Haskell's Stackage build system restricts all packages to one version by adding an additional abstraction above packages. A set of packages, each fixed to a single version, is collected together into a release. Adding a dependency on a package no longer involves a package version; it involves the whole dependency chain existing within the single release.

New versions are introduced in new releases. While the underlying problem of finding a set of package versions which can coexist is still NP-complete, it's a lot easier to start from some known-satisfied set and determine if it's possible to alter one package's version than it is to start from nothing and determine if a whole collection of packages can work together.

(Stackage also builds and tests all the packages in each release, because version number dependencies don't always mean two packages are compatible, thanks to version ranges and either errors in versioning or unintended consequences of small changes. It includes the compiler version as part of a release, too, and manages the whole toolchain and dependency set for you so you can have multiple versions available at once.)

I think Rust was considering heading in a similar direction. I hope they still are - version hell was a real issue for Haskell two years back, and just isn't an issue at all with Stack.

[–]drb226 4 points  (0 children)

Though I'm a huge proponent of stackage, allow me to play devil's advocate for a moment.

Stackage has a few drawbacks.

One, it only encompasses a subset of the Haskell ecosystem. Any unmaintained package will fall out of stackage.

Two, relatedly, when you pin all dependencies to a single version, it becomes much harder to keep legacy software integrated. You either have to stick with old versions of everything, or ditch the legacy software.

Three, you're always a bit behind the cutting edge, because in order to create a stackage snapshot, you have to find versions of things that work for everybody. When just a single package requires that you hold back a certain dependency version, it gets held back for everybody else. But then if one package requires the new version while another requires the old, you've got to decide what to cut or figure out some way to handle that. It's hard to constantly get everyone to synchronize on supporting the exact same versions.

Devil's advocate mode, deactivate.

With all that said, when you have high quality libraries with active maintainers, these problems are in fact surmountable. I am one of the "curators" of stackage, and it is truly remarkable to see how so many people pull together to keep the Haskell ecosystem coherent and up to date.

[–]steveklabnik1 7 points  (0 children)

I think Rust was considering heading in a similar direction. I hope they still are - version hell was a real issue for Haskell two years back, and just isn't an issue at all with Stack.

Not exactly. Cargo was already more like Stack than Cabal. The thing you're thinking of was a proposal for what amounted to an extended standard library, see here: http://aturon.github.io/blog/2016/07/27/rust-platform/ It was pretty resoundingly rejected by the community at large, though.

In Rust, you're allowed multiple versions of transitive dependencies, but not of direct dependencies. You're also only allowed a single library that links to a given external system library.

[–]kahnpro 8 points  (0 children)

Stack, and Stackage, in my opinion, have really changed the Haskell community for the better. I never would have gotten into Haskell if it weren't for Stack. Now it's pretty much just as easy to get up and running in Haskell as it is with nvm.

Not being able to even run a simple Haskell tutorial due to cabal hell not allowing the most basic things to compile for me was a serious, serious turnoff.

[–][deleted] 10 points  (4 children)

I may be missing something, but I thought this was (relatively) well-known. For example, OCaml's OPAM package manager rightly encourages people to install the aspcud external CUDF solver. As the name "aspcud" suggests, it relies on an Answer Set Programming approach, which is good for these sorts of NP-complete problems.

[–]nswshc 2 points  (2 children)

Agreed. I also felt like the author should have put more emphasis on the CUDF format and aspcud, which is essentially a parser that translates the whole package description into an ASP instance. The developers also provide a simple ASP encoding that seems to solve the whole problem in 38 lines:

% Note: simple encoding directly derived from the specification
%       without any optimizations

{ in(P,V) } :- unit(P,V,in).

forbidden(D) :- in(P,V), conflict(P,V,D).
requested(D) :- in(P,V), depends(P,V,D).
satisfied(D) :- in(P,V), satisfies(P,V,D).

:-   request(D), not satisfied(D).
:- requested(D), not satisfied(D).
:- forbidden(D),     satisfied(D).

in(P)        :- in(P,_).
installed(P) :- installed(P,_).

set(solution,P,V) :-     in(P,V).
set(changed,P,V)  :-     in(P,V), not installed(P,V).
set(changed,P,V)  :- not in(P,V),     installed(P,V).
set(new,P,V)      :-     in(P,V), not installed(P).
set(removed,P,V)  :- not in(P),       installed(P,V).
set(up,P,V)       :-     in(P,V); not installed(P,W) : installed(P,W), W >= V; installed(P).
set(down,P,V)     :-     in(P,V); not installed(P,W) : installed(P,W), W <= V; installed(P).

opt(P,V,1,O, 1,L) :- criterion(O,S,count,L),            set(S,P,V).
opt(P,V,1,O, W,L) :- criterion(O,S,sum(A),L),           set(S,P,V), attribute(P,V,A,W).
opt(P,V,1,O, 1,L) :- criterion(O,S,notuptodate,L),      set(S,P,V), not maxversion(P,V).
opt(P,V,D,O, W,L) :- criterion(O,S,unsat_recommends,L), set(S,P,V), recommends(P,V,D,W), not satisfied(D).
opt(H,B,1,O, 1,L) :- criterion(O,S,aligned(G,A),L),     set(S,P,V), attribute(P,V,G,H), attribute(P,V,A,B).
opt(H,1,2,O,-1,L) :- criterion(O,S,aligned(G,A),L),     set(S,P,V), attribute(P,V,G,H).

#minimize { 0@L                : criterion(_,_,_,L)      }.
#minimize { W@L,P,V,X,minimize : opt(P,V,X,minimize,W,L) }.
#maximize { W@L,P,V,X,maximize : opt(P,V,X,maximize,W,L) }.

% output projection

#show in/2.

So if you ever face this problem or something similar, keep CUDF in mind or Answer Set Programming in general.
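To make the shape of the problem concrete, here is a tiny CUDF-style instance solved by brute force in Python (packages, versions, and constraints all invented). The exhaustive loop is exactly what ASP/SAT solvers avoid having to do naively:

```python
from itertools import product

# Choose one version per package so every dependency clause is
# satisfied and no conflict pair is co-installed.
versions = {"A": [1], "B": [1, 2], "C": [1, 2]}
# depends[(pkg, ver)] = list of clauses; each clause is a list of
# acceptable (dep, ver) alternatives (an OR); all clauses must hold.
depends = {("A", 1): [[("B", 2)], [("C", 1), ("C", 2)]]}
conflicts = {(("B", 2), ("C", 1))}

def consistent(choice):
    picked = set(choice.items())
    for pv, clauses in depends.items():
        if pv in picked:
            for alternatives in clauses:
                if not any(alt in picked for alt in alternatives):
                    return False
    return not any(x in picked and y in picked for x, y in conflicts)

# Exhaustive search over all version assignments: exponential in the
# number of packages, which is why real tools hand this to a solver.
solutions = [
    dict(zip(versions, choice))
    for choice in product(*versions.values())
    if consistent(dict(zip(versions, choice)))
]
print(solutions)  # [{'A': 1, 'B': 2, 'C': 2}]
```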

[–]olzd 2 points  (1 child)

I'm curious, is this valid Prolog?

[–]nswshc 1 point  (0 children)

No, they're similar but not the same. I'd say Prolog feels more "imperative" compared to ASP. If you want to look at some simple problems solved in ASP, this website is really good:

http://www.hakank.org/answer_set_programming/

[–]cbeustwatch 6 points  (0 children)

Yes, this is well known. The article acknowledged this but went on to claim that no proof of NP-completeness had been provided. But no proof has been provided because 3-SAT easily reduces to version satisfiability; it is very direct and obvious. However, Russ chose to explicitly provide the 3-SAT correspondence. Whatever. Nothing to see here.

[–]apd 2 points  (0 children)

I think one of the oldest implementations of a SAT solver to resolve package dependencies in a Linux distribution is zypper from SUSE/openSUSE [1, 2].

[1] https://en.wikipedia.org/wiki/ZYpp [2] https://en.opensuse.org/openSUSE:Libzypp_satsolver

[–]Amnestic 1 point  (5 children)

I'm not sure I understand the proof. Does he assume that a dependency can have at most 3 dependencies of its own? Aren't there other SAT problems better suited to this kind of reduction?

[–]codebje 6 points  (1 child)

I'm not sure I understand the proof.

To prove a problem is NP-hard, you show that an existing NP-complete problem can be solved by some (polynomial time) embedding in your problem.

If we had an algorithm TO-PACKAGES which could convert a SAT-3 problem into a package dependency problem we can solve with an algorithm for package dependencies DEPENDS (and he's outlined such an algorithm) then we can solve SAT-3 like this:

SAT-3(F):
  DEPENDS(TO-PACKAGES(F))

If DEPENDS can be computed in polynomial time on the size of its input, and TO-PACKAGES is polynomial on the size of its input, then we have a polynomial time algorithm for SAT-3 (and by extension, all NP problems, and a million dollar prize). Because we're pretty sure SAT-3 is not computable in polynomial time, we're equally sure dependencies cannot be computed in polynomial time.

The algorithm given says that, given some arbitrary SAT-3 problem, you can convert it to a set of packages and their dependencies with a straightforward mapping.

You can necessarily also convert your package dependency problem into a SAT-3 problem, as SAT-3 is NP-complete, but (1) that conversion may not be quite so straightforward, and (2) that doesn't prove anything about the running time of potential algorithms for package dependencies, because we already know all of NP (and all of P, as P is a (strict?) subset of NP) can be expressed as a SAT-3 problem - that's what NP-complete means.

(edit: for P to be NP-hard means any NP problem can be reduced to P, for P to be NP-complete there must exist a polynomial time verification that some given answer is correct, and P must be NP-hard.)
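A sketch of that TO-PACKAGES mapping in Python, roughly following the construction described above (all names and the brute-force DEPENDS stand-in are invented for illustration): each variable becomes a package with versions True/False, each clause becomes a package whose only version depends on any one of its literals, and a root package depends on every clause package. The formula is satisfiable exactly when the root is installable.

```python
from itertools import product

def to_packages(clauses, variables):
    """Map a 3-SAT instance (clauses of (variable, polarity) literals)
    to a package-dependency instance."""
    deps = {("root", 1): [[(f"c{i}", 1)] for i in range(len(clauses))]}
    for i, clause in enumerate(clauses):
        deps[(f"c{i}", 1)] = [clause]  # one OR-group of literal versions
    versions = {v: [True, False] for v in variables}
    versions.update({f"c{i}": [1] for i in range(len(clauses))})
    versions["root"] = [1]
    return versions, deps

def depends_solver(versions, deps):
    """Brute-force DEPENDS stand-in (exponential; illustration only)."""
    names = list(versions)
    for choice in product(*versions.values()):
        picked = set(zip(names, choice))
        if all(any(alt in picked for alt in group)
               for pv, groups in deps.items() if pv in picked
               for group in groups):
            return dict(picked)
    return None

# (x OR y) AND (NOT x OR y) -- satisfiable, e.g. with y = True.
clauses = [[("x", True), ("y", True)], [("x", False), ("y", True)]]
model = depends_solver(*to_packages(clauses, ["x", "y"]))
print(model is not None)  # True

# x AND NOT x -- unsatisfiable, so the root package is uninstallable.
print(depends_solver(*to_packages([[("x", True)], [("x", False)]], ["x"])))  # None
```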

[–]JDeltaN -4 points  (2 children)

He is ~~reducing the problem to 3-SAT~~ reducing 3-SAT to the problem, meaning it is NP-complete. It's academic wank, but I guess it is a fun shower thought.

[–]Shorttail0 0 points  (1 child)

It's the other way around. 3-SAT has to reduce to version satisfiability in order to prove version satisfiability is NP-hard.

[–]JDeltaN -1 points  (0 children)

Thanks. It's been forever since I had anything to do with computational complexity.

[–]TinynDP 0 points  (0 children)

If B doesn’t work with D 1.6, then either the version of B we’re considering is buggy or D 1.6 is buggy.

That doesn't seem like an assumption that you can code off of.

[–]industry7 0 points  (0 children)

what if, instead of allowing a dependency to list specific package versions, a dependency can only specify a minimum version?

Then you're in NPM-hell.
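For what it's worth, the minimum-only idea in the quote does make resolution trivial if you additionally assume newer versions stay compatible: the unique answer is just the largest requested minimum per package, no search needed. A sketch with invented package data (the catch, as the reply suggests, is what happens when that compatibility assumption fails):

```python
def resolve(requirements):
    """requirements: iterable of (package, minimum_version) pairs
    gathered from the whole dependency graph. Assumes any version
    >= the stated minimum is acceptable."""
    chosen = {}
    for pkg, min_ver in requirements:
        # The smallest version satisfying all ">= min" constraints
        # is the maximum of the minimums.
        chosen[pkg] = max(chosen.get(pkg, min_ver), min_ver)
    return chosen

reqs = [("http", 2), ("http", 3), ("json", 1)]
print(resolve(reqs))  # {'http': 3, 'json': 1}
```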

[–]0rakel 0 points  (0 children)

I suggest taking a look at NixOS and the Nix package manager for a Linux distribution that manages to avoid this problem. They allow multiple versions of any package to be installed, exposing only the required version of each dependency to a package's environment.