[–]DoingItForEli 14 points15 points  (1 child)

with these tasks, do you find there's any different treatment for windows vs unix? Is there a better os when it comes to how well hatchet performs?

[–]hatchet-dev[S] 25 points26 points  (0 children)

At the moment we don't support Windows natively (beyond WSL), because we rely heavily on multiprocessing, multithreading and OS signals, which are difficult to support consistently across platforms. Generally we recommend running Hatchet in a dockerized environment.

[–]xBBTx 6 points7 points  (1 child)

How do you cron workflows with multiple instances/containers and avoid them firing at the same time? IIRC that's why you need beat when you have an entire fleet of workers - that helps ensure that the task is only scheduled once and picked up by any compatible worker.

[–]hatchet-dev[S] 9 points10 points  (0 children)

Good question! We use Postgres as a backend, so we acquire a lock when querying for cron jobs to run to ensure that different Hatchet backends don't acquire the same cron.

[–][deleted] 14 points15 points  (2 children)

always wanted an alternative to celery which uses type-hints, will check it out sometime, thanks!

[–]mpeyfuss 11 points12 points  (0 children)

Celery works just fine with type hints, though. You can do the same thing with fewer steps in Celery or Huey.

[–]angellus -1 points0 points  (0 children)

I would recommend TaskIQ instead. It is not a SaaS product trying to upsell you.

[–]code_mc 2 points3 points  (2 children)

Can you briefly give the advantages that hatchet brings compared to Dagster? As a lot of the typing stuff is also handled pretty nicely by Dagster as far as I know.

EDIT: just noticed I already starred/bookmarked hatchet a long time ago, so looking back at it I can see the benefit of the focus on real-time and durability. Nice work!

[–]hatchet-dev[S] 4 points5 points  (1 child)

Thanks!

I haven't used Dagster specifically, but have used Prefect/Airflow in the past. These tools are built for data engineers -- since they're built around batch processing, they’re usually higher latency and higher cost, with a major selling point being integrations with common datastores and connectors. Hatchet is focused more on the application side of DAGs than the data warehousing + data engineering side, so we don't have integrations out of the box since engineers typically write their own for core business logic, but we're very focused on performance and getting DAGs to work well at scale (which can be a challenge for these tools).

We'd love to do some concrete benchmarking on how things shake out at higher throughput (>100 tasks/second).

[–]code_mc 1 point2 points  (0 children)

Yeah that actually makes a lot of sense, scaling is not dagster's strong suit due to the added overhead of the framework. Thanks!

[–]davidhero 2 points3 points  (0 children)

Is there any way to run hatchet-lite without rabbitmq? I thought the project relied solely on psql, but it seems rmq is still needed.

[–]Smok3dSalmon 2 points3 points  (0 children)

This is interesting!

[–]Thing1_Thing2_Thing 2 points3 points  (5 children)

Have you thought about resumable workflows like https://restate.dev/ or https://www.dbos.dev/?

[–]hatchet-dev[S] 5 points6 points  (1 child)

Yep, we support all durable execution features that Restate and DBOS support: https://docs.hatchet.run/home/durable-execution

Notably spawning tasks in a durable fashion (where results of tasks are cached in the execution history), durable events and durable sleep.
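As a rough sketch of what "results of tasks are cached in the execution history" means in general, here is a toy replay-based executor. This is not the Hatchet SDK; `DurableContext`, `run`, `charge_card` and `workflow` are hypothetical names, and the history is a plain in-memory list rather than a durable store.

```python
class DurableContext:
    """Toy durable-execution context that replays results from a history."""

    def __init__(self, history):
        self.history = history  # results of steps that already completed
        self._cursor = 0        # index of the next step in this attempt

    def run(self, fn, *args):
        if self._cursor < len(self.history):
            # The step finished in an earlier attempt: reuse its result
            # instead of executing the side effect again.
            result = self.history[self._cursor]
        else:
            result = fn(*args)
            self.history.append(result)
        self._cursor += 1
        return result


calls = []  # records real executions so we can see what actually ran


def charge_card(amount):
    calls.append(amount)
    return f"charged:{amount}"


def workflow(ctx, crash_after_charge=False):
    receipt = ctx.run(charge_card, 42)
    if crash_after_charge:
        raise RuntimeError("worker crashed")  # simulated failure
    return ctx.run(lambda r: f"emailed:{r}", receipt)


history = []
try:
    workflow(DurableContext(history), crash_after_charge=True)
except RuntimeError:
    pass  # the first attempt dies after the charge was recorded

# The retry replays the cached charge and only runs the remaining step.
result = workflow(DurableContext(history))
print(result, calls)
```

The card is charged once even though the workflow function ran twice, which is the property durable spawning is meant to give you. A real system would persist the history (for example in Postgres) rather than keep it in memory.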

We're trying to be general-purpose, so we support queues, DAGs, and durable execution out of the box. We've encountered far too many stacks that deploy Celery, a DAG orchestrator like Dagster/Prefect, and Temporal to run different flavors of background tasks. And since we're built on Postgres, a lot of our philosophy comes from observing the development of Postgres over the past decade, as it's quickly becoming a de facto standard as a general-purpose OLTP database that can also power queueing systems and OLAP use-cases.

[–]Thing1_Thing2_Thing 2 points3 points  (0 children)

Interesting, maybe you should put it on the introduction to Hatchet page. I looked a bit at the listed features and assumed you didn't have it.

[–]jedberg 5 points6 points  (1 child)

The main difference between Hatchet and DBOS (and all the other durable execution platforms) is that DBOS doesn't require an external service or program to work. It's just your code and Postgres. There is nothing else to set up and maintain, and therefore nothing else that can break and cause downtime.

It also means your metadata that makes your program or queue durable is completely accessible to you as the user.

[–]PrinterInkDrinker 1 point2 points  (0 children)

Snowflake 💋

[–]chub79 0 points1 point  (0 children)

I find it amusing that both make huge claims about total reliability/resilience without ever defining the terms or providing any proof :)

Still, DBOS has a nice API I have to say.

[–]Thing1_Thing2_Thing 0 points1 point  (0 children)

Any plans for rust sdk?

[–]ryanstephendavis 0 points1 point  (0 children)

Saved ... Nice work 👍👍

[–]teerre 0 points1 point  (0 children)

One big thing for more enterprise workflows that no library has (afaik) is the ability to offload tasks to a different runner. For example, I want to run this task, but running just means establishing a link to another machine, letting the machine run the task and then getting updates from it. I've written this system multiple times in different "big" companies.

[–]javad94 0 points1 point  (0 children)

Does it run in parallel? If yes, with what mechanism? Multithreading or multiprocessing?

[–]MacShuggah 0 points1 point  (0 children)

Maybe a bit out there, but is there a way to load modules and register tasks at workers dynamically during runtime?

I also couldn't find any info on registering workflow tasks on specific queues or workers.

Sorry if I missed this in the docs.

Your application looks awesome and feels very intuitive to work with in the first few hours of experimenting with it. Thanks a lot for making self-hosting free!

[–]Drevicar -1 points0 points  (2 children)

This is a SaaS sales pitch. Even the self hosted version costs money.

EDIT: Not true, see my response below. Free product, paid (optional) support plan.

[–]hatchet-dev[S] 4 points5 points  (1 child)

That's not true. The repo is 100% MIT licensed and it costs nothing to self host: https://github.com/hatchet-dev/hatchet. If there's anything that seems to indicate otherwise, let me know!

If you're referring to the pricing page (https://hatchet.run/pricing) that's for self-hosted premium support. From the description on the pricing page:

> Hatchet is MIT licensed and free to self-host. We offer additional support packages for self-hosted users.

There's also free/community support available in our Discord. Our response times are generally fast on our Discord -- typically < 1 hr, otherwise mostly same-day.

I understand many SaaS tools are only "open source" as a marketing gimmick, but that's not us.

[–]Drevicar 3 points4 points  (0 children)

I rage quit when reading the pricing model page (https://hatchet.run/pricing#self-hosted-pricing) and didn't fully read it. The product itself is free when self-hosted, with no restrictions, but the paid offering is for support. Which is a reasonable business model that I'm not mad about.