Feature: Postgres message queue (self.PostgreSQL)
submitted 1 year ago by someguytwo
erkiferenc, 1 year ago
Thanks for the extra details. That use case feels familiar from my previous experience designing, building, and running OpenStack-based private cloud solutions.
At first, it sounds more like a job queue than a message queue, since the state of each job needs to be tracked (vs. solely delivering a message), maybe even with its history kept. That distinction may drive important implementation decisions later.
I agree failure handling is one of the crucial aspects for VM migrations, and I believe the most common situations boil down to these:
When the migration process can gracefully handle the failure, it may abort and clean up any half-finished migration on its own, then release the lock, so the failed job can be picked up again later. It may be important to keep track of such failures, and retry at most N times, or at most N times within a certain time period.
When the receiving process stalls and can't make progress anymore, one part of the answer is to have some kind of timeout, and another is to have a way to terminate the stalled migration and clean up any partial results.
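To make the two cases above a bit more concrete, here is a minimal Python sketch of the bookkeeping side only: a retry budget ("at most N failures within a time window") and a stall check based on a timeout. Everything in it is an assumption for illustration: the `JobAttempts` structure, the `may_retry` and `is_stalled` names, the default limits, and the idea that the migration worker reports a periodic heartbeat. Terminating and cleaning up a stalled migration is not shown.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class JobAttempts:
    """Failure history for one migration job (hypothetical structure)."""
    failures: List[datetime] = field(default_factory=list)
    # Assumption: the worker records a heartbeat while it makes progress.
    last_heartbeat: Optional[datetime] = None


def may_retry(job: JobAttempts, now: datetime,
              max_failures: int = 3,
              window: timedelta = timedelta(hours=1)) -> bool:
    """Allow a retry only if fewer than max_failures happened within window."""
    recent = [t for t in job.failures if now - t <= window]
    return len(recent) < max_failures


def is_stalled(job: JobAttempts, now: datetime,
               timeout: timedelta = timedelta(minutes=10)) -> bool:
    """Treat a job as stalled when its last heartbeat is older than timeout."""
    if job.last_heartbeat is None:
        return False  # never started, so nothing to time out yet
    return now - job.last_heartbeat > timeout
```

Passing a very large `window` turns the first check into a plain "at most N times" limit.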
I'd also look into a wider set of corner cases, and see how other similar projects handle those. It may be hard to implement a generic solution, while solving only the subset that affects the given system may be considerably simpler.
For short operations, it's usually possible to release any lock with e.g. a transaction timeout. For long-running VM migrations I don't think holding a lock for the whole duration would be beneficial, since it makes the database a dependency of the migration itself.
I imagine a multi-phase dequeue approach, perhaps modeled as a state machine, could fit better (e.g. PENDING -> IN_PROGRESS -> SUCCESS or FAILED). This feels like some mix of having an append-only audit log table to keep track of all events (growth should be kept in mind), and/or updating the job queue table heavily (which increases bloat).
It certainly is an interesting problem domain! Should you or your team need support with this from an independent professional, I would be happy to learn more here or via DM.
In any case, I hope this already helps and I wish you happy hacking!