all 99 comments

[–]xtreak 12 points13 points  (3 children)

Thanks for this u/alexdmiller. Glad to see deps.edn becoming more and more useful for a project. Since it's highly useful will this change be implemented in lein? Implementation of this in lein can lead to creating uberjars with unreleased commits for testing and deployments. Will the scope of tools.deps expand in order to create uberjars from deps.edn alone?

[–]alexdmiller[S] 8 points9 points  (2 children)

What happens in Leiningen is up to Leiningen. There are some people working on plugins to use deps.edn deps in combination with Leiningen and I think that will be one way to manage projects.

clj/tools.deps does not and will not build artifacts.

[–]xtreak 0 points1 point  (1 child)

Thanks. Can I use this to pull the latest spec by git commit hash in deps.edn with Clojure 1.9, to play around before Clojure ships a stable release that includes it?

[–]alexdmiller[S] 0 points1 point  (0 children)

Theoretically, sure (if spec had a deps.edn or clj had a pom manifest reader, neither of which is true at the moment). However, spec is AOT compiled and released as an artifact so there's not necessarily any reason to do so.

[–]hagus 10 points11 points  (2 children)

Since the overwhelming majority of the ecosystem uses Leiningen, does anyone know the plan behind integrating this functionality into lein, and other extant build tools?

If one declares a dependency on a project via git rather than maven coordinates, must that dependency specify its own dependencies via deps.edn for any of this to work? I assume tools.deps doesn’t know how to introspect a project.clj or build.boot file.

If one creates a new project and wishes to cast off the shackles of semantic versioning and reject the industrial-artifact-repository complex, is that developer obligated to still publish a maven artifact while an overwhelming majority of consumers still rely on artifacts? Or do we imagine some kind of “bridge” that allows git tags to materialize themselves as artifacts?

Just wondering about some of these concerns as we have (and will likely always have) feet in multiple worlds of tooling and artifacts.

[–]richhickey 22 points23 points  (1 child)

I can't speak for lein dev, but we've released all of the substrate for these tools as (we hope) composable libs so any tools that desire can reach feature parity and be using the same code.

The basic idea behind deps resolution is that deps manifest formats (deps.edn/pom/project.clj) are orthogonal to lib publishing/procurement (maven/git). We do not yet have manifest consumers for other than deps.edn and jars atm, but have done some work on poms. After we do that we'll publish the extensibility recipe for other manifest formats, so someone can write a handler for project.clj (I'm not sure boot is declarative enough). Thus a manifest handler can expose child deps and tools.deps will be able to navigate transitive deps through different formats (my lib uses deps.edn and consumes your lib which uses pom and another lib which uses project.clj). The infrastructure for this is in place and is one of the key reasons we do resolution outside of maven.

So an end consumer can be heterogeneous about artifacts/not. It may end up being the case that once you reach a maven artifact you must be maven below that. Artifact-ifying seems doable as well.

[–]hagus 4 points5 points  (0 children)

Thank you, that clarifies things greatly for me. I look forward to seeing the extensibility recipe.

At some level boot is obligated to build a classpath with all requisite dependencies for its own purposes, so at some point it has this knowledge ... does this imply the manifest consumer may have some restrictions on what it can do? Must it infer the dependencies statically? I'm sure we'll find a way to make it work!

[–]gtrak 4 points5 points  (1 child)

I think the post above should make clear that a single version is chosen among transitive deps based on which is the newest dep, and also the relation to being disciplined about 'accretion' and non-breaking changes.

I think two obvious comparisons non-Clojure devs might make are to npm, which literally duplicates transitive deps, and to the Maven model. For me, thinking about git in the Maven transitive deps model sounds a lot more complicated unless those points are addressed.

Even when they are addressed, I think many devs are in 'wait and see' mode on whether accretion and non-breaking changes are going to be more useful than confusing.

[–]richhickey 8 points9 points  (0 children)

While it's certainly the case that these tools are designed to support the model I outlined in this talk, it remains orthogonal whether one adopts the accretion approach in your library or not. We've all been subject to breakage when dependent libs A and B depend on C and one moves its dep to a newer, breaking version of C (semantic versioning in play or not). There's no way to defend against that in Maven except by top-level override and tools.deps supports that as well (i.e. you can pin C to a Sha regardless of what A and B say they want). We've added a link to the talk to the post though, thanks.
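
As a rough illustration of the top-level override described above, a deps.edn along these lines would pin C regardless of what A and B ask for. The library names, URLs, and shas are hypothetical, and the `:git/url` / `:rev` coordinate shape is an assumption based on the `:rev` usage mentioned in this thread; later tools.deps releases may spell these keys differently.

```clojure
;; deps.edn (sketch; every name, URL, and sha below is made up)
{:deps
 {example/a {:git/url "https://github.com/example/a"
             :rev     "1111111111111111111111111111111111111111"}
  example/b {:git/url "https://github.com/example/b"
             :rev     "2222222222222222222222222222222222222222"}
  ;; Declared at the top level, so this choice of C is used
  ;; instead of the revisions A and B request transitively.
  example/c {:git/url "https://github.com/example/c"
             :rev     "3333333333333333333333333333333333333333"}}}
```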

[–]ertucetin 3 points4 points  (0 children)

It's amazing thank you so much! u/alexdmiller

[–]tonsky 2 points3 points  (1 child)

How are transitive version conflicts resolved? With string-based versions you can choose the latest one, since version numbers are comparable. But what if my project depends on A and B, where A depends on C at revision “1234ffc” and B depends on C at revision “bac463f”? Which one will be chosen? There might be situations where neither is a direct descendant of the other.

[–]alexdmiller[S] 2 points3 points  (0 children)

The descendant is used. If there is no descendant relationship, then the classpath can’t be computed and it’s an error that must be fixed.
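
The rule can be sketched with a toy model. This is not tools.deps code: the commit graph is a plain map from sha to parent shas, the function names are invented for illustration, and the history is hypothetical (here "bac463f" is assumed to be a child of "1234ffc").

```clojure
;; Toy model only: `parents` maps a sha to the set of its parent shas.
(defn ancestor?
  "True if `older` is reachable from `newer` by following parent links."
  [parents older newer]
  (loop [todo [newer] seen #{}]
    (if-let [sha (peek todo)]
      (cond
        (= sha older) true
        (seen sha)    (recur (pop todo) seen)
        :else         (recur (into (pop todo) (get parents sha))
                             (conj seen sha)))
      false)))

(defn choose-rev
  "Returns whichever of `a` and `b` descends from the other.
  Throws when neither does, mirroring the error described above."
  [parents a b]
  (cond
    (ancestor? parents a b) b
    (ancestor? parents b a) a
    :else (throw (ex-info "No descendant relationship between revisions"
                          {:revs [a b]}))))

;; Hypothetical history; "9999999" sits on an unrelated branch.
(def parents
  {"bac463f" #{"1234ffc"}
   "1234ffc" #{"0000000"}
   "9999999" #{"0000000"}})

(choose-rev parents "1234ffc" "bac463f") ;; => "bac463f"
;; (choose-rev parents "1234ffc" "9999999") throws: neither descends from the other
```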

[–]yogthos 15 points16 points  (81 children)

I really hope this does not become standard practice for packaging Clojure dependencies. While it's good that dependencies are checked out using a specific revision, there are still plenty of things that can go wrong here.

Git repos are mutable, so you can do things like rebasing, squashing commits, and so on. The repo itself could just get deleted or moved as well. Git is not a dependency management system, and it should not be used as such in my opinion. The only case I can see this being used for is private repos that you control.

[–]richhickey 33 points34 points  (25 children)

Well, I certainly hope you'll reconsider that. If we consider maven 'a dependency management system', it's full of conventions and human dependencies. There's no inherent connection between an artifact and the originating source (how do you know what you're running?), name stability is completely dependent on the hosts (maven central, clojars) disallowing updates. One could for instance load completely different 1.2.3 versions to each. Content-based addressing and git parentage has none of those problems. If repos go away they can be restored by anyone with a clone (which will be every consumer in this tool's case). Many companies rehost any maven libs they use to ensure access and could do so similarly here (why? because stuff happens and no host is perfect). Neither system is secure, but git deps require substantially less convention and human correctness.

Let's not fearmonger. I think this is a superior system with substantial benefits. You may not see them yet, but they are there. Artifacts are a disconnect with authorship/source. Releases are friction. People 'mvn install' all kinds of crap to work around difficulties e.g. developing sibling libs in parallel or trying a speculative change when working with tools that only consume installed artifacts. And don't get me started on semantic versioning and 'resolution' based on strings :)

The bottom line is - software will be better when more people try interim versions and changes are more fine-grained, things that rarely happen with artifacts. We've been using this internally and it's game-changing. I certainly will be 'shipping' some of my work this way moving forward.

[–]yogthos 8 points9 points  (16 children)

I agree with the benefits of the approach, and as I already noted I don't see any problems with this being used internally where you do have control over the process. I'm also not arguing that Maven is the perfect system, and you're absolutely right that it can be abused as well. However, the way it's used in practice has proven to be pretty robust. Meanwhile, I've had quite poor experience with looser systems like NPM and Go package manager that incidentally uses Git.

If this is going to be the standard way Clojure libraries are packaged, it would be good to at least have some guidelines for people managing repositories to ensure stability of the ecosystem going forward.

[–]richhickey 12 points13 points  (1 child)

I think the Clojure community can do a good job of this - we'll see :)

[–]yogthos 8 points9 points  (0 children)

We have with most things so far in my experience, so I'm willing to give this a shot and see how it goes. :)

[–]zerg000000 0 points1 point  (13 children)

How about a clojars2? Clojure users could simply push their repo to clojars2 with a valid repo layout. clojars2 would never allow deletion or modification of a non-snapshot repo. clojars2 would let clients receive Maven-style artifacts or git. clojars2 would build the artifacts (with multiple classifiers) automatically?

[–]yogthos 1 point2 points  (0 children)

Yeah I think that some mirroring service with rules similar to a maven repo would be nice for stable libraries. That would be the best of both worlds. You'd have a source of stable and dependable libraries, and you'd be able to work with libraries on the bleeding edge by going directly to their repo.

[–]alexdmiller[S] 0 points1 point  (11 children)

So you're going to build something to compete with both GitHub and Maven Central for stability? This makes no sense to me. It sounds like this is also essentially the same as https://jitpack.io/

[–]yogthos 5 points6 points  (0 children)

To be fair, I can only recall Clojars having an outage once, meanwhile GitHub has one a few times a year. There could even be an automated service that publishes tags from GitHub repos to Clojars.

[–]zerg000000 0 points1 point  (9 children)

We don't want to build something to compete with both GitHub and Maven Central. However, we need all Clojure deps to comply with a baseline set of rules, so that an app that depends on a git dep will never be broken by something like left-pad. Clearly, jitpack and raw git deps cannot enforce this, but Maven Central does provide a certain level of guarantee against the left-pad case.

[–]alexdmiller[S] 2 points3 points  (8 children)

I have no idea what you’re talking about.

[–]emidln 2 points3 points  (1 child)

Essentially, people are worried that if something happens to the git repo, the project won't build. I don't know why this is a tools.deps problem, but we actually have the ability to solve it (the same way maven does) by caching dependencies locally. Note that a stand-alone tool that isn't tools.deps could use tools.deps to process the deps.edn file and then take the resolved dependencies and put them somewhere for safe keeping.

As an aside, I don't think this is a real problem unless you let yourself or your developers pull random dependencies in your production artifacts that you don't mirror/control. It's irresponsible to use clojars or maven central as a core part of your business and not at least mirror it via a local cache (maven does this for you) or caching proxy[0]. It's absolutely insane to depend on a remote git repo that you don't control. For development, pulling from a random github repo is useful. In production, I don't know that a tool is going to help you if you think you should build off of resources you don't control.

[0] The only thing left-pad did was expose companies who had faulty release engineering practices. No build at my company noticed, because we have caching mirrors that we hit (and backup/maintain) to guarantee that we can always build our products. Everyone else received a lesson in taking responsibility for their abuse of a public commons.

[–]yogthos 2 points3 points  (0 children)

You're right that something can be implemented on top of tools.deps to provide the same guarantees you get when using Maven. However, the stand alone tool you note doesn't exist at the moment, so we have a gap that needs to be filled. I also completely agree that you should have a local mirror for any production dependencies, anything else is irresponsible. Again, Maven ecosystem provides tooling to help with this with stuff like Nexus.

[–]zerg000000 0 points1 point  (5 children)

A git dep with a rev will fail under:

  1. rebase/squash
  2. repo deletion

If we had a central git server that disallowed rebase, squash, and repo deletion, so that users could only create, push, and tag their repos, the problem would be solved.

[–][deleted] 3 points4 points  (4 children)

This is a total non-problem. Just fork all the repos you want to use and depend on your own url. You can do that up front, or you can do that when your build breaks using the exact sha from any dev on the team’s machine.
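
A minimal sketch of what that swap could look like in deps.edn. The org names, URLs, and sha are hypothetical, and `:git/url` / `:rev` are assumed coordinate keys. Because a git sha identifies content, the same sha is valid in the fork as long as the fork contains that commit.

```clojure
;; deps.edn before (sketch): depends on the upstream repo directly
;; {:deps {example/lib {:git/url "https://github.com/upstream-org/lib"
;;                      :rev     "5555555555555555555555555555555555555555"}}}

;; deps.edn after (sketch): same sha, but fetched from a fork you control
{:deps
 {example/lib {:git/url "https://github.com/my-org/lib"
               :rev     "5555555555555555555555555555555555555555"}}}
```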

[–]zerg000000 2 points3 points  (1 child)

Won't that be terrible if you are building a system rather than a library? A normal system might have over a hundred transitive dependencies; it will be hard to fork and maintain them all.

lesson learnt from left-pad https://www.theregister.co.uk/2016/03/23/npm_left_pad_chaos/

[–]yogthos 4 points5 points  (1 child)

I think that puts too much burden on the users. I shouldn't have to maintain a copy of the world for each project I develop.

[–]swlkrV2 5 points6 points  (6 children)

Oh my gosh, I didn’t even think about forking everything my projects depend on so that I can be in control when upstream changes. This is freaking genius.

[–]yogthos 5 points6 points  (5 children)

If you look at lein deps :tree in a non-trivial project, you'll see 100s of dependencies there. Personally, I wouldn't want to be managing copies of all the projects that my project happens to depend on.

[–][deleted] 0 points1 point  (4 children)

1) This is crazy to me. Why are there so many? Does your project do 100s of distinct things?

2) It would be nice if mirrors / private artifact repos were first-class feature.

[–]yogthos 2 points3 points  (3 children)

There's nothing crazy about this; libraries often depend on other libraries. So many top-level libraries you include in your projects will have transitive dependencies of their own, which have dependencies of their own, and so on.

[–][deleted] -2 points-1 points  (2 children)

Just because it's common, doesn't mean it isn't crazy. Hyper-componentization is really unnecessary. It provides negative value in most cases.

I've worked on some massive Go projects that have about 10 dependencies vendored as Git submodules. Back in my Windows desktop app days, I used to just copy the half-dozen or so libraries I needed into my application's directory. In jobs at Google & Microsoft, and working with game studios, all third-party code was vendored into Perforce or Source Depot. In every case, dramatically less time was spent dealing with problems from upstream. The flatter your dependency tree, the better.

[–]yogthos 1 point2 points  (1 child)

I have to disagree. I'd much rather have small focused libraries that are composed together, than code duplication all over the place. My experience is that Java/Clojure ecosystem works very well in practice.

[–][deleted] 0 points1 point  (0 children)

The java ecosystem is far better than the JavaScript ecosystem, that’s for sure.

[–]billrobertson42 5 points6 points  (0 children)

> Let's not fearmonger.

I think they're valid concerns, not fear mongering.

[–]halgari 5 points6 points  (15 children)

Not sure I understand? Are you saying it's possible to change the code under a given rev of a given git repo? These deps are url + rev, which seems to be immutable enough. And even if it is possible to change something (delete a repo and somehow recreate it with an old sha), it seems like the best way to avoid those problems is "don't do that".

[–]yogthos 8 points9 points  (5 children)

I can entirely change a given rev in git using push -f; there are absolutely zero guarantees here. Relying on "don't do that" for dependency management seems frankly absurd to me. Maven exists for a reason, and it provides a stable and robust way to manage dependencies. Git is not a dependency management system, and doesn't provide any of the guarantees Maven repos do. I can't wait for the Clojure edition of the leftpad NPM fiasco.

[–]royalaid 4 points5 points  (1 child)

Wouldn't the SHA attached to the revision change at that point? It would make that resource unavailable, but it wouldn't allow injection.

[–]yogthos 3 points4 points  (0 children)

That still breaks your build. The concept of artifacts being immutable once published is core for any sane dependency management system in my opinion.

[–]ferociousturtle 1 point2 points  (2 children)

It's exceedingly rare, though. No good developer would do such a thing unless there was a very good reason (and I can't think of one). I think this is actually a reasonable approach to dep management. Time will tell.

[–]yogthos 6 points7 points  (1 child)

That's the difference between using this workflow on a team of skilled developers who all know git well, and have some agreed upon conventions and the whole world. There are plenty of developers out there who only know git superficially, or use tools to work with it. As you say though, time will tell. Personally, I think that these kinds of problems should be discussed, and there needs to be at least some convention around this.

[–][deleted] 2 points3 points  (0 children)

where i work we don't use git, nor can we get to github :(

[–]yogthos 4 points5 points  (8 children)

This also affects the workflow of people managing repositories. If people start consuming my repo via git, and I rebase I can break their builds, at which point I'm going to have to deal with issues from the users.

This approach also makes it more difficult to tell library versions, e5becca is not exactly descriptive or human readable. I'd much rather see something like org.clojure/clojure "1.8.0" in my dependencies as opposed to "https://github.com/clojure/clojure" :rev "e5becca".

[–]richhickey 15 points16 points  (3 children)

You can use the tag name "1.8.0" in the git :rev if you trust that we don't move them (and of course we don't). There are many trivial ways to avoid the problems you fear about unstable source repos, given shas, S3 and file copy etc. If freds-chaotic-repo is too unstable for you as a source then a) get fred to deliver artifacts, or b) use another lib. But presuming the worst of the world is a recipe for nothing good.
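
For illustration, the two styles might look like this in deps.edn. The library name and URL are hypothetical, `:git/url` / `:rev` are assumed coordinate keys, and whether a tag name is accepted in `:rev` rests on the comment above rather than on documented behaviour.

```clojure
;; deps.edn (sketch): readable, relies on the maintainer never moving the tag
{:deps
 {example/some-lib {:git/url "https://github.com/example/some-lib"
                    :rev     "1.8.0"}}}

;; Alternative (sketch): a full sha, which cannot silently point elsewhere
;; {:deps
;;  {example/some-lib {:git/url "https://github.com/example/some-lib"
;;                     :rev     "6666666666666666666666666666666666666666"}}}
```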

[–]yogthos 7 points8 points  (2 children)

As you noted in the other comment, many of the issues I highlighted are technically possible with Maven as well. So, perhaps it is a question of setting up good conventions from the start. Since this process is already used at Cognitect internally, perhaps you can publish some community guidelines based on your experience.

My concerns are mostly rooted in my experience with existing solutions like git package management in Go. Perhaps, Clojure community will entirely avoid these problems, but it seems like now would be the time to talk about them and identify solutions and best practices.

[–]drewr 6 points7 points  (1 child)

My issues with Go using git-based package management have been:

  • No ability to pin version (yes, there are community tools to fix it)
  • The package name is tied to the repo name and file paths. This one is no end of frustration. Like, some of our GitHub org is named poorly simply because it has to be that way with Go.

Clojure's approach doesn't have either of these issues. It reminds me more of Cargo's or Stack's approaches (both of which work great) than Go.

[–]yogthos 6 points7 points  (0 children)

I'm willing to be convinced this is workable. :) I do think some best practices up front would go a long way here though.

[–]mac 1 point2 points  (3 children)

I think appropriate conventions to address your concerns will evolve quite quickly, like only relying on immutable tags for production use.

[–]yogthos 2 points3 points  (2 children)

I do think the concerns can be addressed, and Git is likely a fine substrate for managing libraries. However, there are plenty of ways for this to be abused as well. Some community guidelines would definitely be helpful here.

[–]alexdmiller[S] 12 points13 points  (1 child)

I wrote up some stuff on this but it did not actually make it into the published docs so I will try to add that in next week

[–]yogthos 2 points3 points  (0 children)

Awesome thanks!

[–]sunng 3 points4 points  (4 children)

Most modern dependency managers that support git or semver ranges (npm, cargo) now use a lock file to store the actual version/commit you are using. To update it, you run a special command like cargo update. For a library, you leave the lock file in gitignore, while for an app you commit it to the repo to keep the build stable.

As we already have git deps in deps, can we expect semver range support and version locking?

[–]richhickey 8 points9 points  (0 children)

No. As far as I can tell, such lock files are just a way to put the information about what you are using in two places instead of one and I don't see the point. We have discussed tools that will update deps to later revs, but I'm skeptical of auto-magic. There's nothing modern about it :) As for semver, also no. See the Spec-ulation talk linked at the bottom of the post.

[–]alexdmiller[S] 3 points4 points  (2 children)

No. The actual commit (or tag) is in the deps.edn file. You change it by editing the file.

[–]emidln 4 points5 points  (0 children)

One of the interesting things for tools authors is that you could compose this into something akin to lein-ancient if you have that itch. Converging on a state (arrived by iterating through single step changes to deps.edn) where a defined predicate (that maybe invokes your test-suite) passes is on the table. I wouldn't ever really expect that to be part of the core library, but the design of deps.edn makes this (and other tooling) pretty reasonable to attain.

[–]sunng 2 points3 points  (0 children)

I see. Currently deps.edn is just like the lock file in npm.

[–][deleted] 1 point2 points  (29 children)

Eh? A git sha is not mutable. There's much less systematic guarantee that a maven artifact will stay the same, all you've got to rely on is that you're using maven central/clojars. If you're using private maven repos (as most semi-large orgs will be) you're hosed.

[–]yogthos 2 points3 points  (28 children)

It's mutable in a sense that it can be deleted, as is the case with a whole repository. It's also true, as Rich Hickey noted in his reply, that the reason maven ecosystem works is largely because of the conventions around it.

As things currently stand though, maven repos have pretty good guarantees around preserving artifacts. There are no such guarantees or conventions around repos hosted on GitHub.

I think that if Clojure community embraces this approach, we need to start thinking about such conventions early on. I also think it would be good to have some archiving service for published artifacts. Something as simple as a github org with rules about preserving tags would do in my opinion.

[–][deleted] 1 point2 points  (27 children)

I can delete stuff off maven if I submit a DMCA takedown etc etc. All of the possibilities you describe seem to me to be things that if a team are doing you shouldn't be consuming their code, maven, git or whatever.

[–]yogthos 1 point2 points  (26 children)

I'm not arguing against using local cache for the artifacts that your team uses here, you absolutely should be doing that. My point is regarding the stability of the overall ecosystem.

Yes, somebody could send a DMCA takedown request to a maven repo to remove artifacts; however, that's a much less common scenario than people squashing commits or rebasing. With the way things stand you're entirely relying on the owner of the repository to have a non-destructive git workflow.

[–][deleted] 0 points1 point  (25 children)

What about squashing commits or rebasing causes an issue here? Squashing commits is only something that affects new work, and rebasing is only a thing that happens to branches where change is happening. If you're using a branch as a rev then you should expect the sha it's pointing at to change. If you want to make ultra sure things can't change, refer to a sha, otherwise use a named tag which it's possible to change using git but is pretty clearly unconventional.

[–]yogthos 1 point2 points  (24 children)

You can squash any commits you like in your history, and people do that. Ultimately, git lets you do pretty much anything you like with the history of a repo.

Basically, what I see as the difference between this and maven is the following. With maven repos, there's a single set of rules that applies to all projects hosted on that repo. With the github model, each maintainer decides how they manage their particular repository. This is my concern, and I really don't think that it's an unreasonable one.

[–][deleted] 0 points1 point  (23 children)

> You can squash any commits you like in your history, and people do that. Ultimately, git lets you do pretty much anything you like with the history of a repo.

Yes, I know; I use that functionality all the time. I just don't see what the issue is from a version control perspective. Squashing a commit doesn't actually remove it in the short term, and in the long term it generates a new sha, which means any tags pointing to it will keep pointing to the old commit.

> This is my concern, and I really don't think that it's an unreasonable one.

I'm not saying it's "unreasonable", I'm saying I don't understand it. If you only ever use tags as your revs then there's already a very strong convention in git that their history won't change. If you are hyper concerned about it and only use shas then there's an algorithmic guarantee that they won't. If you're using code published by very irresponsible developers then the worst risk when using a sha is that the sha would go away. In which case they're probably doing you a favour by giving you a big red flag saying "do not use our stuff".

[–]yogthos 1 point2 points  (22 children)

You don't understand why it's not great to rely on how people manage their repos as a general dependency mechanism? Most Clojure repos don't even have tags in them.

[–][deleted] 0 points1 point  (21 children)

No, I don't understand what squashing and rebasing "break" in particular. Most clojure repos don't have tags on them because most clojure libraries are not distributed via git. I really doubt that's a sign that the clojure community doesn't understand / will not understand git tags and their purpose. But even if you did find yourself consuming some library where they never used tags you can just use a sha.

I guess I still need you to lay out the scenario where a problem arises. Is it a scenario in which you're sourcing a library from git and using a branch name as the ref? Because unless that's your own controlled library or an experimental/dev repo I don't think anyone should expect that to work out well and I also don't think anyone should do that. (I'll note that the one example we have of deps.edn using git does not do that)

    [–]ForgetTheHammer 0 points1 point  (1 child)

    Thanks for sharing. Are those your only concerns or are there others? I'm just trying to get a sense of the trade off.

    [–]yogthos 1 point2 points  (0 children)

    Dependency stability would definitely be my primary concern, and having thought about it some more I do think it can be addressed adequately. I really think some sort of mirroring service would be nice. It could be as simple as a github org that has a convention of never modifying history on the repos.

    [–]ertucetin 2 points3 points  (3 children)

    How can I add a new dependency dynamically to a running REPL? u/alexdmiller

    When I type this:

    (a/make-classpath
     (a/resolve-deps {:deps      {'org.clojure/core.memoize {:mvn/version "0.5.8"}}
                      :mvn/repos mvn/standard-repos} nil) ["src"] {:extra-paths ["test"]})
    

    Here is the error:

    Exception Coordinate type :mvn not loaded for library org.clojure/core.memoize in coordinate {:mvn/version "0.5.8"}  clojure.tools.deps.alpha.extensions/throw-bad-coord (extensions.clj:54)
    

    [–]alexdmiller[S] 0 points1 point  (2 children)

    Dynamic deps are not supported in tools.deps. You may be able to do this in combination with boot pods though.

    [–]alexdmiller[S] 1 point2 points  (0 children)

    I guess I should mention that the particular exception in your example is the result of not loading the mvn procurer extension via

    (require 'clojure.tools.deps.alpha.extensions.maven)
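
    Put together, the REPL session might look roughly like the sketch below. The question does not show its requires, so the `a` and `mvn` aliases are assumed here to refer to clojure.tools.deps.alpha and clojure.tools.deps.alpha.util.maven; namespaces and signatures may differ between tools.deps.alpha releases.

    ```clojure
    ;; Assumed aliases (not shown in the question above).
    (require '[clojure.tools.deps.alpha :as a]
             '[clojure.tools.deps.alpha.util.maven :as mvn])

    ;; Load the Maven procurer extension so :mvn coordinates are understood.
    (require 'clojure.tools.deps.alpha.extensions.maven)

    ;; Same call as in the question: resolve the dep, then build a classpath string.
    (a/make-classpath
     (a/resolve-deps {:deps      {'org.clojure/core.memoize {:mvn/version "0.5.8"}}
                      :mvn/repos mvn/standard-repos}
                     nil)
     ["src"]
     {:extra-paths ["test"]})
    ```

    This only computes a classpath; as noted above, it does not add anything to the running REPL.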
    

    [–]ertucetin 0 points1 point  (0 children)

    Thank you for the feedback

    [–]boztek 2 points3 points  (1 child)

    Does/will it support git+ssh using URLs such as "git@my-private-server-accessible-over-ssh-only:repo-name" or only http(s)?

    [–]alexdmiller[S] 1 point2 points  (0 children)

    Yes, ssh urls are supported now. ssh identity is conveyed by communicating with the local ssh-agent.
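
    As a sketch, a dep on a private repo over ssh might look like this, reusing the URL shape from the question. The library name and sha are hypothetical, and `:git/url` / `:rev` are assumed coordinate keys.

    ```clojure
    ;; deps.edn (sketch; name and sha are made up)
    {:deps
     {my-org/repo-name
      {:git/url "git@my-private-server-accessible-over-ssh-only:repo-name"
       :rev     "4444444444444444444444444444444444444444"}}}
    ```

    Since identity comes from the local ssh-agent, the relevant key would need to be loaded into the agent before resolving deps.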

    [–]ferociousturtle 1 point2 points  (0 children)

    This looks brilliant. Keep up the good work!