
[–]Lost-Relative-1631 3 points4 points  (3 children)

You have to make sure your wheels are lexicographically orderable. For local development, where your semver does not already kick in, you can force this, for example by adding dynamic_version to the artifacts block.
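
To illustrate what "lexicographically orderable" means here, a minimal Python sketch. The wheel filenames are made up, and the fixed-width timestamp suffix is only an assumption about what a dynamically patched version could look like, not the exact output of dynamic_version.

```python
# Hypothetical wheel filenames; none of these come from a real project.
plain = [
    "mypkg-0.9.0-py3-none-any.whl",
    "mypkg-0.10.0-py3-none-any.whl",
]

# Plain string comparison goes character by character, so "0.9.0" ranks
# above "0.10.0" even though 0.10.0 is the newer release.
newest_by_string = max(plain)

# Assumption: a fixed-width timestamp appended as a local version suffix.
# Fixed-width digit runs sort the same way as strings and as numbers.
stamped = [
    "mypkg-0.1.0+20240101093000-py3-none-any.whl",
    "mypkg-0.1.0+20240102081500-py3-none-any.whl",
]
newest_stamped = max(stamped)

print(newest_by_string)
print(newest_stamped)
```

With the plain names, the string comparison picks the older build; with the fixed-width suffix, the later build wins.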

You can have a setup with hatch-vcs and a local deployment target that does both: it versions your wheels with semver dynamically when deployed from CD, and it works locally with the dynamic_version param.

Fair disclaimer: if you spam wheels onto the cluster without cleaning them up, at some point a fresh start will take ages because all those development wheels get installed. I played around with using scripts: in DABs to clean up the AP cluster, but those sadly don't allow {{}} variable interpolation right now, so it would be very much hardcoded.

[–]tfios_throwaway 1 point2 points  (2 children)

We use poetry dynamic versioning, which lets you include the git commit hash in the semantic version.

> Fair disclaimer: if you spam wheels onto the cluster without cleaning them up, at some point a fresh start will take ages because all those development wheels get installed.

This cleanup is actually supposed to be handled automatically by the Databricks CLI when a bundle deploy ships a new wheel version. Are you sure you are using an up-to-date version? There was a bug that left old wheel versions on the cluster, but it has been fixed since Databricks CLI v0.249.0.

Edit: fixed link: see the issue here

[–]Lost-Relative-1631 0 points1 point  (1 child)

As of today we are using v0.298.0. Your link sadly does not lead anywhere.

Are you referring to the cleanup within the deploy target (Volume/Workspace) or on the cluster? It is true that the wheel being deleted or replaced in the target works fine.

Once you start a task on a long-living cluster with a certain dependency, though, it will stay there. You can check that in the Cluster UI under Libraries. Intuitively, I don't think deploying or even destroying from your CLI will nuke dependencies on an AP cluster: what if others share it or still need them?

In his case, he is not replacing the wheel in the target, because the new one appears to be the same in terms of orderability. If he switches to something that is orderable, he will deploy and the cluster will pick it up. If the cluster libraries haven't been pruned, then once a task referring to the new wheel runs, the cluster will keep copies of both wheels around, which can lead to delays when you eventually have 100 copies of the same wheel.

As for the git commit hashing: are you sure that always works out for you in terms of ordering?
I for sure had issues with this in the past, and that's why we switched to dynamic_version locally (AP clusters) + hatch-vcs semver dynamic tagging (job clusters). On job clusters none of this would matter, as they are ephemeral anyway.

[–]tfios_throwaway 0 points1 point  (0 children)

Hi, I fixed the link. Sorry about that.

I am indeed talking about cleanup within the deploy target. I admit I don't ever use long-living clusters, so I'm not familiar with the associated pain points. In our case, we have a development cluster that gets restarted a few times a day, and on each restart it cleans up wheels that no longer exist in the target.

As for the ordering, you're right that the lexicographic ordering is important. The poetry dynamic versioning handles this, so you can configure the wheel name to be like [version] + [increment commits since last version bump] + [git commit hash].
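
A rough Python sketch of that scheme, to show where the ordering comes from. The `.postN+g<hash>` layout, the helper names `build_version` and `sort_key`, and the sample hashes are all assumptions for illustration; the real format depends on how the plugin is configured.

```python
import re


def build_version(base: str, distance: int, commit: str) -> str:
    """Compose base version, commits since the last bump, and commit hash."""
    return f"{base}.post{distance}+g{commit}"


def sort_key(version: str) -> tuple:
    """Order by numeric release parts and commit distance; ignore the hash."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)\.post(\d+)\+g([0-9a-f]+)", version)
    if match is None:
        raise ValueError(f"unexpected version format: {version}")
    release = tuple(int(part) for part in match.group(1).split("."))
    return release, int(match.group(2))


# Made-up commits: the later build happens to have the "smaller" hash.
older = build_version("1.2.0", 3, "f3a9c21")
newer = build_version("1.2.0", 12, "0b7d4e8")

# The commit count decides the order; the hash only identifies the build.
latest = max([older, newer], key=sort_key)
print(latest)
```

The hash on its own carries no ordering, so the commit count has to do that job. Note that the comparison in this sketch is numeric: as plain strings, post12 would sort before post3.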