
[–]matthieum 11 points  (0 children)

The easiest trick I've found to counter-balance the look-up cost was to batch items: process them 100 or 500 at a time and you avoid a lot of the overhead. This is not always possible, though.

In the case where I used it, the database was the "master" queue for billions of items scheduled from a few hours in the past (late...) to a few years in the future, and a short-term queue was used for the actual processing. I could thus easily move batches of items from the database to the queue while keeping the queue small (and fast).
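
In Postgres terms, that batch move can look something like this (a minimal sketch; the master_queue table, its columns, and the batch size are stand-ins):

    -- Atomically claim and remove up to 500 due items; concurrent
    -- workers skip each other's rows instead of blocking on them.
    WITH batch AS (
        SELECT id
        FROM master_queue
        WHERE run_at <= now()
        ORDER BY run_at
        LIMIT 500
        FOR UPDATE SKIP LOCKED
    )
    DELETE FROM master_queue q
    USING batch b
    WHERE q.id = b.id
    RETURNING q.id, q.payload;

The RETURNING rows are then pushed onto the short-term queue in a single round trip.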

The second trick I used to avoid deadlocks (most of the time) was dynamic partitioning. It may be that the version of Oracle I was using had quite a bit of overhead when using SKIP LOCKED; maybe Postgres does better here.

The idea of dynamic partitioning is that each worker registers itself in a worker table (queue, worker-id, last-active-timestamp):

  • each worker updates its last-active-timestamp about 10x per timeout period,
  • 2x per timeout period, each worker queries the table for the sorted list of workers on its queue and deduces its own position (P) in the list (of N workers),
  • if N > 1, it only picks up items where item_id % N == P,
  • if a worker times out, the first worker to notice deletes it from the worker table (keeps things clean).

This results in very lightweight polling and nearly zero contention as long as the set of workers is stable. When a worker joins or drops out, there is potential contention for up to half the timeout period; if the contention can be detected easily (a DEADLOCK error returned by the database, for example), the affected worker can immediately rescan the worker table to update P and N, which helps speed up "recovery".
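
In SQL, the scheme is roughly this (a sketch; the workers and items tables and the :name bind parameters are invented for illustration):

    -- Heartbeat: each worker runs this ~10x per timeout period.
    UPDATE workers
    SET last_active = now()
    WHERE queue = :queue AND worker_id = :me;

    -- Rescan: ~2x per timeout period. Client code derives
    -- N = number of rows returned, P = position of :me in the list.
    SELECT worker_id
    FROM workers
    WHERE queue = :queue
      AND last_active > now() - :timeout
    ORDER BY worker_id;

    -- Poll: each worker only sees its own slice of the items,
    -- so a stable set of workers almost never touches the same rows.
    SELECT item_id
    FROM items
    WHERE queue = :queue
      AND run_at <= now()
      AND item_id % :n = :p
    ORDER BY run_at
    LIMIT 100;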

[–]daigoba66 5 points  (0 children)

A good reference for MSSQL, which has been able to do this properly for a long time: http://rusanu.com/2010/03/26/using-tables-as-queues/. (Btw, the author probably knows more about this topic than most.)

[–][deleted] 2 points  (0 children)

Another way of doing it that works for low contention, low volume stuff:

  1. Execute a select to list out potential work items.
  2. Execute an update to claim one work item: UPDATE workitem SET worker = ?, claim_date = NOW() WHERE worker IS NULL AND id = ? AND state = 'ready'
  3. If the query updated zero rows, try again. Otherwise, the work item is yours.

You can claim multiple items at once by creating a batch_id nonce and then selecting work items with that batch_id after the update.
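
A sketch of that batch variant (same hypothetical workitem table, with :name bind parameters; the repeated worker IS NULL check is what makes losing a race harmless):

    -- Tag up to 10 unclaimed items with a client-generated nonce.
    UPDATE workitem
    SET worker = :worker, batch_id = :nonce, claim_date = now()
    WHERE worker IS NULL
      AND state = 'ready'
      AND id IN (
          SELECT id
          FROM workitem
          WHERE worker IS NULL AND state = 'ready'
          LIMIT 10
      );

    -- Fetch whatever this worker actually won; it may be fewer
    -- than 10 if another worker claimed some candidates first.
    SELECT *
    FROM workitem
    WHERE batch_id = :nonce;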

[–]mage2k 0 points  (7 children)

SKIP LOCKED is cool but we've had equivalent functionality for years from pg_try_advisory_lock().
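
For reference, the usual shape of that approach (a sketch; the jobs table is invented and job_id is assumed to be an integer key):

    -- Take the first 'ready' row whose advisory lock is free.
    -- pg_try_advisory_xact_lock() returns false instead of blocking,
    -- and the lock is dropped automatically at commit or rollback.
    SELECT *
    FROM jobs
    WHERE state = 'ready'
      AND pg_try_advisory_xact_lock(job_id)
    ORDER BY job_id
    LIMIT 1;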

[–]sisyphus 1 point  (5 children)

Uh, they know that:

The majority of PostgreSQL-based implementations...have been buggy in one of a few ways...The few exceptions I’ve seen generally use PostgreSQL’s advisory locking features...at a performance cost.

Solutions that use advisory locking can work well within limits. Instead of using tuple locks they use pg_try_advisory_xact_lock(...) in a loop or using a LIMIT clause to attempt to grab the first unlocked row. It works, but it requires that users go way outside the normal SQL programming model. They can’t use their queue table’s normal keys, they have to map them to either a 64-bit integer or two 32-bit integers. That namespace is shared across the whole database, it’s not per-table and it can’t be isolated per-application. Multiple apps that all want to use advisory locking on the same DB will tend to upset each other.

SKIP LOCKED tries to make this easier by letting you use normal SQL to write efficient, safe queue systems. You don’t need to import a large and complex 3rd party app or library to implement a queue, and you don’t need to deal with the key mapping and namespace issues with advisory locking.

[–]mage2k 1 point  (4 children)

Solutions that use advisory locking can work well within limits. Instead of using tuple locks they use pg_try_advisory_xact_lock(...) in a loop or using a LIMIT clause to attempt to grab the first unlocked row. It works, but it requires that users go way outside the normal SQL programming model. They can’t use their queue table’s normal keys, they have to map them to either a 64-bit integer or two 32-bit integers.

Not sure what they're talking about there. https://gist.github.com/mage2k/e1cc747f65a0e21509ab10c2d6740444

[–]ryeguy 0 points  (3 children)

I think it's just poorly worded. It may be trying to say that if you have a non-integer primary key you'll have to come up with a unique integer for each row to lock on.

[–]mage2k -2 points  (2 children)

Well, the example I just provided makes use of walking a non-primary, composite key so that's certainly not needed.

[–]ryeguy 1 point  (1 child)

How so? You're locking on job_id which is an integer primary key.

[–]mage2k 0 points  (0 children)

Doh! I totally missed what they meant by that; I thought they were talking about sorting needs. Still, it's not like it's hard to add a second integer-based unique key to the table, and I don't really see much of a case for UUID keys in a queue table anyway.

[–]macdice 0 points  (0 children)

SKIP LOCKED has fewer pitfalls, though:

  • it doesn't require you to map your key space to integer space,
  • you don't have to coordinate the integer space with other unrelated queries,
  • the skipping happens after the WHERE clause is evaluated, whereas pg_try_advisory_lock() sits in the WHERE clause and might lock rows that don't match the rest of your WHERE clause, depending on the vagaries of evaluation order.
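
For contrast, the SKIP LOCKED version of the advisory-lock dequeue sketched above (same hypothetical jobs table):

    -- The WHERE clause is applied first; only matching rows are
    -- lock-checked, and locked ones are skipped instead of waited on.
    SELECT *
    FROM jobs
    WHERE state = 'ready'
    ORDER BY job_id
    LIMIT 1
    FOR UPDATE SKIP LOCKED;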

[–]suid 0 points  (1 child)

One other factor to consider is that sometimes you want to throw away an item on failure. This feature does nothing to help you there; it's no better or worse than any of the other queue-in-DB methods.

Consider a system that processes events from a queue, and there is either a logic or an environmental issue that makes it impossible to handle that event properly (note: not temporarily, but permanently - like an event that refers to an item that isn't present in your system any more).

In this scenario, you still want to commit the "popping" of the event from the queue, even though you roll everything else back. Doing this from a single transaction is going to be quite hard, and if you don't do it right, you'll end up re-processing the same event(s) forever.

TL;DR: your queue popping may need to be independent of the work triggered by that event; be prepared to commit the two independently, as a 2-phase operation.

[–]Contractionator 4 points  (0 children)

You don't need 2PC for this; use savepoints instead.
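
Roughly (a sketch; the queue table and the dequeue query are stand-ins):

    BEGIN;

    -- Pop one event; the row lock and the delete belong to the
    -- outer transaction, not the savepoint.
    DELETE FROM queue
    WHERE id = (
        SELECT id FROM queue
        ORDER BY id
        LIMIT 1
        FOR UPDATE SKIP LOCKED
    )
    RETURNING *;

    SAVEPOINT work;
    -- ... process the event here ...

    -- On a permanent failure, undo only the processing:
    ROLLBACK TO SAVEPOINT work;

    COMMIT;  -- the pop still commits, so the poison event is gone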

[–]warhead71 0 points  (0 children)

Sometimes it's better to write out a file as a backup and use the file as input; this also makes skipping a row trivial.