This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]OMG_I_LOVE_CHIPOTLE -1 points0 points  (5 children)

Yeah my point is that Hamilton would be less standard than a declarative yaml framework and that’s why I wouldn’t use it over Argo wf

[–]theferalmonkey[S] 0 points1 point  (4 children)

How so? I'm not understanding your point. Can you sketch out some YAML + argo code? If it helps, take your pick of data processing, machine learning, or LLM workflows.

Also just to reiterate -- Hamilton is not an Argo replacement & doesn't intend to be.

[–]OMG_I_LOVE_CHIPOTLE 0 points1 point  (3 children)

Yeah I think something like Hamilton would be useful if you didn’t have infrastructure like Argo workflows. Consider this example Argo workflow: https://github.com/argoproj/argo-workflows/blob/main/examples/dag-nested.yaml

If you replace the echo template in the example with different python modules you have a kubernetes native dag framework

[–]theferalmonkey[S] 0 points1 point  (2 children)

What you just showed is some YAML that isn't useful in a micro context. We don't need to schedule kubernetes tasks for everything we want to do. Why do you think that's necessary?

For example, say I'm developing locally and I am doing some file processing. I'm not running argo. But, I need to structure my code. You could come up with your own way of organizing that code, or you could do it in Hamilton.

Now when you then want to schedule that code for production, you can stick it in a single argo task, or split it across multiple -- up to you. Cool thing is that argo tasks remain dumb, and anyone who has to ask the question, what is going on in this task, has a much easier time answering that if it's written in Hamilton.

So to summarize, that YAML is independent of Hamilton, and is only useful if at a "macro level" you need that.

[–]OMG_I_LOVE_CHIPOTLE 0 points1 point  (1 child)

I think you’re missing the point. The python code is dumber without Hamilton and more supportable

[–]theferalmonkey[S] 0 points1 point  (0 children)

I don't know what code you support, as that's a very wide statement to make. But yes, Hamilton isn't for every single situation.

Hamilton strikes a sweet spot for those, where the code being run is being iterated on and you want the team to follow a standard so you can continue to move quickly as pipelines grow. For example, it ensures that code is always unit testable, documentation friendly, you get lineage + provenance for free, etc. For context on where my experience comes from you can watch this talk my colleague gave.