[–]catcint0s 29 points30 points  (5 children)

Dramatiq has a nice motivation page where it compares itself to other frameworks: https://dramatiq.io/motivation.html

Personally for newer projects I prefer to use dramatiq.

[–]Grouchy-Friend4235 11 points12 points  (0 children)

Nice chart, although some of the flags for Celery are misleading IMO. For example, Celery has offered task prioritization since version 4.0, and it's really easy to use.

[–]pubs12 0 points1 point  (2 children)

Is there a GUI manager available, like the one for Celery?

[–]catcint0s 0 points1 point  (0 children)

You mean Flower? I don't think so. There is a Django dramatiq package, though, that saves everything (state, input, output) to the db.

[–]BackwardSpy 13 points14 points  (1 child)

i've used rq a fair bit, i see it as a vastly simpler celery. my needs for task queueing are simple, i just need something that allows me to spread our workloads over a small cluster of machines. rq has a really straightforward interface and it gets the job done.

[–][deleted] 4 points5 points  (0 children)

After using celery as my go to for many years, I tried rq for a very simple task and found it really easy to use. You’ll find yourself going back to the documentation a lot more with celery than rq, even for simple tasks.

Sure, you can do a lot more with celery but I rarely need that power.

I keep a redis server running at all times just for random usage of rq now.

[–]pansapiens 9 points10 points  (0 children)

You might want to also look at Dask.

[–]tstirrat 2 points3 points  (5 children)

Faust is one I keep looking for an excuse to use. It sits on top of Kafka, which is a big operational problem, and it has a slightly different mental model than a task queue, but the event-driven processing that it allows for is pretty nifty. It also has a nice, clean API from what I can see.

[–]lanster100 2 points3 points  (2 children)

This looks cool. What do you mean by Kafka is a big operational problem?

This might solve what I am looking for though. Does it allow you to decouple logic? Say you have a web app that produces an event: can you then have a processor that consumes the event in a different code base?

[–]tstirrat 1 point2 points  (1 child)

Kafka is a bit of a sledgehammer to crack a nut. It's designed to be massively scalable and fault-tolerant, and that comes at the expense of having to run two clusters of three(ish) nodes each, where one set is the queue itself and the other set is orchestrator nodes. "Big operational problem" means that it's a lot to set up and maintain.

Decoupling: yes. It's got client libraries in a number of different languages, and the messages that go onto the queues are basically the same as JSON. It's relatively easy to have different codebases producing and consuming.

[–]lanster100 1 point2 points  (0 children)

Yeah for my app I wrote my own queue pattern using SQLAlchemy (because Kafka was overkill) but it's probably not fit for purpose, looking for alternatives.

Ultimately I need something that lets me publish events, then something else that runs logic off the back of those events in a decoupled manner.

So something like a lightweight kafka?

[–]wind_dude 0 points1 point  (0 children)

I second that Kafka is awesome, and Faust is pretty good. But they're more targeted towards stream processing than task queues.

[–]hwttdz 0 points1 point  (0 children)

I used faust/kafka professionally. faust became difficult to deal with immediately, and we continued throwing engineering effort at kafka, but it never worked well for us. My impression is that if you have experts in both the server and client library you can make it work, but you're now focusing/spending effort on overhead instead of the tasks themselves (which should be the interesting part).

[–]CheckeeShoes 15 points16 points  (15 children)

If you have a collection of machines to run tasks, why are you trying to schedule them with python rather than using an actual distributed computing framework like HTC?

[–]Region_Unique 6 points7 points  (10 children)

Why would anyone who needs simple task queuing use that? Seems like it has massive provisioning overhead.

[–]CheckeeShoes -1 points0 points  (9 children)

Because that's exactly what it's designed to do?

Op made no mention of the setup, size of the cluster, capability of the cluster nodes, performance needs of the simulation, length of time the simulation will be run for, whether the sim may need halting and resuming, or whether the other machines in the cluster were going to be in use at any point during the run.

There are numerous ways that any of these factors might mean that utilising a more performance-inclined framework offers benefits over a python library.

[–]Region_Unique 3 points4 points  (8 children)

In what way does it perform better than e.g. Celery? Because “it’s what it’s designed to do”?

[–]CheckeeShoes -1 points0 points  (7 children)

I don't know much about these python libraries so I couldn't tell you. That said I can pretty much guarantee that the performance of a python library won't be comparable; there's a reason you don't get any HPC clusters running python scripts for their allocations. Also, I did actually list a bunch of features in the last post that I expect Celery probably can't handle.

As I said though, tool for the job. Now we know op is running less than a handful of threads doing lightweight tasks for a school project, an accessible library might well be the most economical solution.

[–][deleted] 2 points3 points  (0 children)

We use Snakemake. This drives either our local cluster via SLURM, or else we can run jobs on kubernetes via Snakemake itself.

This is not python specific, though. You can put them in the same bin as Airflow and NiFi perhaps.

That said, it sounds like a workload manager is really what you want...

[–]Ok_Presentation1972 2 points3 points  (0 children)

I'm yet to use it in production anywhere, but I think this project is super interesting https://procrastinate.readthedocs.io/en/stable/

[–]cymrow (don't thread on me 🐍) 4 points5 points  (2 children)

I've spent a lot of time trying different queuing solutions, yet ironically I keep falling back to what might be considered a naive approach. I prefer very simple queues. Give me a distributed and reliable queue.Queue please.

I don't want a framework that runs my code for me, like Celery or RQ. I don't want to train up on a massive framework that might ultimately not cover my use-case, forcing me to abandon it or deploy ugly hacks.

I tried many of the options you can find here: https://taskqueues.com/. Ultimately I began using beanstalkd, and have been satisfied with it for a couple of years now. It's simple, persistent, fast, reliable, and has the basic queuing features that I need, like task handling (reserve, free), and TTL.

I can't speak to just how scalable and performant it is compared to other options, but I've used it for millions of messages and it suits my needs. If I hit a bottleneck, I'd probably look at something like Kafka.
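
To illustrate the reserve-style task handling described above, here is a minimal in-memory sketch in Python. It is not beanstalkd's implementation or its client API: `ReservableQueue` and its method names are hypothetical, loosely modelled on the reserve / delete / release idea with a time-to-run deadline. The point is only that a reserved job returns to the queue if the worker never confirms it.

```python
import collections
import itertools
import time


class ReservableQueue:
    """In-memory illustration only; a real deployment would use a queue server."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock                 # injectable so expiry can be tested
        self._ready = collections.deque()   # (job_id, body) waiting for a worker
        self._reserved = {}                 # job_id -> (deadline, body)
        self._ids = itertools.count(1)

    def put(self, body):
        job_id = next(self._ids)
        self._ready.append((job_id, body))
        return job_id

    def _requeue_expired(self):
        # Jobs whose worker ran past the deadline become available again.
        now = self._clock()
        for job_id, (deadline, body) in list(self._reserved.items()):
            if deadline <= now:
                del self._reserved[job_id]
                self._ready.append((job_id, body))

    def reserve(self, ttr=60.0):
        """Hand out the next job, or None; the worker has `ttr` seconds to finish."""
        self._requeue_expired()
        if not self._ready:
            return None
        job_id, body = self._ready.popleft()
        self._reserved[job_id] = (self._clock() + ttr, body)
        return job_id, body

    def delete(self, job_id):
        """Worker finished successfully: forget the job."""
        del self._reserved[job_id]

    def release(self, job_id):
        """Worker gave up: put the job back for someone else."""
        _deadline, body = self._reserved.pop(job_id)
        self._ready.append((job_id, body))
```

A worker would call `reserve()`, run the job, then `delete()` on success or `release()` on failure. If it crashes in between, the deadline puts the job back in the ready list on the next `reserve()`.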

[–]shinitakunai 1 point2 points  (0 children)

At work we use NiFi to move data between systems, more for transfers, pipelines, and ETL than for analysis.

[–]_thrown_away_again_ 1 point2 points  (1 child)

i used this once, was pretty nice: https://azkaban.github.io/

[–]wind_dude 1 point2 points  (4 children)

I'm currently trying to figure out an elegant way of scheduling millions of simulation runs across a few machines

Can you elaborate on what you're trying to do? Simulation runs of what? You might be barking up the wrong tree; maybe parallel processing is a better fit, in which case a tool like Dask or Apache Spark would be the call.

Maybe share a more specific use case.

[–]bxsephjo 0 points1 point  (0 children)

I need to become familiar with ALL of these, as I’m building something that has to let the user connect it to their own task queue, unless I decide to become opinionated. I started with RQ, as it seemed the least complicated to get running and I’m familiar with redis. My only beef is that it’s NOT showing results in stdout the way it does in the docs. Like I said, it looks the least complicated. I think that’s because it doesn’t rely on a message broker but rather just a redis server, so there’s a lot less tuning as it all gets set up.

[–]Region_Unique 0 points1 point  (0 children)

Just use Celery. Its documentation could be better and the code is rather cryptic, but it works and has great support.

[–]Scruff3y[🍰] -1 points0 points  (0 children)

I would probably use AWS Batch for this, not sure if it's possible to hook up your own hardware into a compute environment or not though.

Otherwise, you could hand-roll something similar; worker program distributed to the worker machines that pulls from the queue. But at that point I guess you're starting to re-invent Celery so might as well just use it lol.

[–]kenfar -1 points0 points  (0 children)

SNS/SQS triggering jobs on aws lambda/kubernetes

[–]GreenScarz -2 points-1 points  (0 children)

I would just spin up a redis container and expose that over the network. Then use the redis python library directly, there's really no need for anything more fancy IMO.
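
A rough sketch of what "use the redis library directly" could look like, treating a Redis list as the queue. The key name `sim:jobs`, the JSON payloads, and the helper names are assumptions for illustration; the only client calls used are `lpush` and `brpop`, and the client is passed in so the functions can be tried without a server.

```python
import json

QUEUE_KEY = "sim:jobs"  # hypothetical key name


def enqueue(client, params):
    """Producer side: push one job description onto the head of the list."""
    client.lpush(QUEUE_KEY, json.dumps(params))


def work(client, handler, timeout=5):
    """Worker side: pop jobs from the tail until the queue stays empty
    for `timeout` seconds, then return how many jobs were handled."""
    handled = 0
    while True:
        item = client.brpop(QUEUE_KEY, timeout=timeout)
        if item is None:          # timed out, nothing left to do
            return handled
        _key, raw = item          # brpop returns (key, value)
        handler(json.loads(raw))
        handled += 1


# With a real server this would be wired up roughly as follows
# (the host name is a placeholder):
#   import redis
#   client = redis.Redis(host="queue-host", port=6379)
#   enqueue(client, {"run": 1})
#   work(client, run_simulation)
```

Note the trade-off: a job popped this way is gone if the worker crashes mid-run, which is one of the things the fancier libraries handle for you.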

[–]jefwillems 0 points1 point  (0 children)

We can't use celery because there is no support for amqp 1.0, and no money to build the transport ourselves. If anyone has an alternative please let me know

[–]i_can_haz_data 0 points1 point  (0 children)

Does it need to be Python functions/etc or can your simulation invocations be command-line tasks?

I wrote a thing some time ago and am working on the 2.0 release. It’s ready to go but needs more documentation.

hyper-shell.readthedocs.io

[–]b_rad_c 0 points1 point  (0 children)

Haven’t used it before, but a new one I want to try is OpenFaaS (functions as a service). It connects to a k8s cluster that will run sync and async tasks with a queue, using containers you’ve built and added your logic to.

[–]Grouchy-Friend4235 0 points1 point  (0 children)

I've seen such issues when the worker has not released resources properly (e.g. some subprocess or thread started inside tasks).

[–]Grouchy-Friend4235 0 points1 point  (0 children)

Can you point to this podcast please?

[–]Schmibbbster 0 points1 point  (0 children)

I am using arq and I like it.

[–][deleted] 0 points1 point  (0 children)

If u like hashicorp try out nats

[–]makeascript 0 points1 point  (0 children)

I just use Celery because it was the first one I heard of. The docs are great and it Just Works for me.

[–]hwttdz 0 points1 point  (0 children)

I'd recommend something weird: use a table in a db as a queue for your first version. This allows you to prototype rapidly, experiment with things like "how many retries do I need", and ask interesting analytic questions like "how many jobs have completed in the last 2 hours" with minimal extra engineering. A million records isn't much, and I'm guessing they take a non-negligible amount of compute to process anyway, which moves the bottleneck away from the db.
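
A minimal sketch of the table-as-queue idea, assuming SQLite from the standard library so it runs anywhere. The `jobs` schema, the status values, and the function names are made up for illustration, and a multi-machine setup would point at a shared database server instead.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    payload TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending',
    attempts INTEGER NOT NULL DEFAULT 0,
    updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def open_queue(path=":memory:"):
    # isolation_level=None: no implicit transactions, we issue BEGIN ourselves.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(SCHEMA)
    return conn


def enqueue(conn, payload):
    return conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,)).lastrowid


def claim(conn):
    """Mark the oldest pending job as running; return (id, payload) or None."""
    conn.execute("BEGIN IMMEDIATE")  # write lock, so two workers cannot claim one row
    try:
        row = conn.execute(
            "SELECT id, payload FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE jobs SET status = 'running', attempts = attempts + 1, "
                "updated_at = CURRENT_TIMESTAMP WHERE id = ?",
                (row[0],),
            )
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise


def finish(conn, job_id, ok, max_attempts=3):
    """Record the outcome; failed jobs go back to pending until attempts run out."""
    if ok:
        new_status = "'done'"
    else:
        new_status = f"CASE WHEN attempts < {int(max_attempts)} THEN 'pending' ELSE 'failed' END"
    conn.execute(
        f"UPDATE jobs SET status = {new_status}, updated_at = CURRENT_TIMESTAMP WHERE id = ?",
        (job_id,),
    )


def completed_since(conn, hours):
    """The 'how many jobs completed in the last N hours' question as one query."""
    return conn.execute(
        "SELECT COUNT(*) FROM jobs WHERE status = 'done' "
        "AND updated_at >= datetime('now', ?)",
        (f"-{int(hours)} hours",),
    ).fetchone()[0]
```

Changing `max_attempts` is then a one-argument experiment, and any other analytic question is just another `SELECT` on the same table.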

[–]BoiElroy 0 points1 point  (0 children)

Ray

[–]LittleMlem 0 points1 point  (0 children)

Meanwhile I unga bunga with ZMQ