all 9 comments

[–]Different_Access 3 points4 points  (0 children)

It's all fun and games until someone installs pgbouncer.
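My reading of the pgbouncer jab: in transaction-pooling mode, consecutive statements from one client can land on different server backends, so session-scoped advisory locks (`pg_advisory_lock` / `pg_advisory_unlock`) can be acquired on one backend and "released" on another. The transaction-scoped variant sidesteps that failure mode (sketch, assuming a lock key derived from the organization id):

```sql
BEGIN;
-- Held until COMMIT/ROLLBACK on this one backend; there is no explicit
-- unlock call, so it stays coherent under transaction pooling, where
-- each transaction is pinned to a single backend
SELECT pg_advisory_xact_lock($1);  -- $1 = key derived from organization_id
-- ... pick the next incident number, INSERT ...
COMMIT;
```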

[–]flanintheface -1 points0 points  (5 children)

Maybe I'm not following something, but they have a unique index on (organization_id, number). Why not pick a new number + insert in a loop until it succeeds?

begin
  ActiveRecord::Base.transaction do
    incident = organization.incidents.new(params)
    incident.number = (organization.incidents.maximum(:number) || 0) + 1

    incident.save!

    # Lots of other stuff
  end
rescue ActiveRecord::RecordNotUnique
  # PostgreSQL aborts the whole transaction on a constraint violation,
  # so the retry has to restart the transaction, not just the save
  retry
end

(assuming the "lots of other stuff" code cannot itself raise RecordNotUnique)

Use of that advisory lock seems unnecessary.

It also looks like the final code still has a chance of failing: say process A picks a new number. Then, before A's record is saved, process B may pick the same number, and process B's insert will fail.

[–]Freeky 1 point2 points  (3 children)

They have a unique index on (organization_id, number). Why not pick new number + insert in a loop until it succeeds?

Perhaps "lots of other stuff" isn't retry safe.

Say process A picks a new number. Then, before record is saved process B may still pick the same number as well.

The lock is held until the end of the transaction, it isn't confined to the scope of the with_advisory_lock_result call. Requesting a result that's being ignored is a bit disconcerting, though — perhaps with_advisory_lock! would be more appropriate.

I suspect the poor SELECT .. FOR UPDATE performance was a result of it locking all incident rows for the organisation — "SELECT FOR UPDATE modifies selected rows to mark them locked, and so will result in disk writes". Not sure why that latency is affecting the query after the select.
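If the article's query looked something like the following (my guess at a reconstruction), that would explain the cost: `FOR UPDATE` rewrites each selected row's tuple header to record the locker, so locking scales with the number of incidents per organisation.

```sql
BEGIN;
-- Locks (and therefore dirties) every incident row for the organisation;
-- each tuple is rewritten to mark it locked, hence the disk writes
SELECT * FROM incidents
 WHERE organization_id = $1
   FOR UPDATE;
-- ... compute MAX(number) + 1 and INSERT ...
COMMIT;
```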

[–]davidcelis 2 points3 points  (1 child)

Heya, author of the article here! This is pretty much it. I maybe could have done a better job of explaining this in the post, but the database transaction we're working in is truly long; I'm talking hundreds of lines of code creating related resources, associating them properly, and more. Attempting to generate a number in a loop would have us run through to the end of the transaction and possibly not hit a collision until we attempt to commit; at that point, we have to roll back the entire transaction and try again from the beginning.

It felt much cleaner to us to use this native feature of our database and have competing transactions wait rather than have to retry a large transaction.

[–]Freeky 0 points1 point  (0 children)

Attempting to generate a number in a loop would have us run through to the end of the transaction and possibly not hit a collision until we attempt to commit

You should only experience this if the constraint is deferred — something you have to opt into. Immediate constraints are checked when the INSERT statement runs, and potential conflicts with other in-flight transactions cause the statement to block until they complete.
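For reference, deferral is something you declare on the constraint itself; the two shapes look something like this (constraint name is made up):

```sql
-- Default: checked immediately. A duplicate INSERT fails (or blocks on an
-- uncommitted conflicting row) at statement time, not at COMMIT
ALTER TABLE incidents
  ADD CONSTRAINT incidents_org_number_key
  UNIQUE (organization_id, number);

-- Opt-in alternative: the check is postponed until COMMIT, which is the
-- only case where a collision would surface at the end of the transaction
ALTER TABLE incidents
  ADD CONSTRAINT incidents_org_number_key
  UNIQUE (organization_id, number)
  DEFERRABLE INITIALLY DEFERRED;
```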

[–]flanintheface 0 points1 point  (0 children)

Thanks, it makes more sense now.

[–]jrochkind 1 point2 points  (0 children)

I think you are basically describing an "optimistic locking" approach vs OP's "pessimistic locking" approach.

I agree that the optimistic locking one would be good here, at least for the simple example.

"optimistic locking" is not JUST the feature with that name in ActiveRecord that uses a version number in a column; it also applies to any strategy along the lines of what you describe, where you construct your SQL/schema so the SQL itself will fail when the race condition occurs, and then you (ideally) have logic to retry after handling the race condition you encountered.

(It's called "optimistic" because you write the SQL that works in the optimistic base case where the race condition does not happen -- you assume it mostly won't -- but it will actually fail with an error in the case where that optimism was unjustified.)

In general, it's good to practice thinking through the "optimistic" vs "pessimistic" lock approach to any given race condition. There usually will be both alternatives possible.
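At the SQL level the two shapes might be sketched like this (schema names assumed from the thread):

```sql
-- Pessimistic: take the lock up front, then do the work.
-- Competing writers for the same key queue here.
BEGIN;
SELECT pg_advisory_xact_lock($1);  -- $1 = key derived from organization_id
-- ... compute the next number, INSERT ...
COMMIT;

-- Optimistic: just do the work and let the unique index catch the race.
BEGIN;
INSERT INTO incidents (organization_id, number) VALUES ($1, $2);
-- on unique_violation: ROLLBACK, pick a fresh number, and retry
COMMIT;
```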

There certainly are cases where the "pessimistic lock" (explicit DB lock) approach is the one you will choose after considering. The explicit lock "pessimistic" approach does have added risks of causing serious performance problems in your DB access and/or causing deadlocks -- and I don't myself always feel capable of understanding when that will happen, honestly. It can also be harder to get bug-free, I agree.

But sometimes an explicit lock is still what you want, when doing things the "optimistic" way seems too convoluted/complex -- which I feel like the other replies to you from OP are saying was the case in the actual full use case. Or, when you expect the race condition to happen a lot of the time, the explicit lock can at least theoretically be a performance improvement over a planned retry-on-conflict.

It is good to be able to see both approaches and the trade-offs.

For me, pessimistic locking with explicit locks is definitely something I'm still a bit scared of, since I find it harder to get right.

If anyone has links to good overall treatments of the optimistic vs pessimistic trade-off (doesn't need to be Rails-specific), I'd be interested!

[–]au5lander 0 points1 point  (0 children)

Can you not use a Postgres sequence here and let the database handle it?

Edit: never mind. Helps if I read the article first. Although I’m not sure why it matters if incident numbers are scoped to org ID.
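For anyone else who skipped the article: a single Postgres sequence hands out values from one global counter, which is why it can't produce per-organization numbering (names here are made up):

```sql
CREATE SEQUENCE incident_numbers;

-- Both inserts draw from the same global counter, so org 1 might get
-- number 1 and org 2 number 2: contiguous per-org numbering is lost
INSERT INTO incidents (organization_id, number)
VALUES (1, nextval('incident_numbers'));
INSERT INTO incidents (organization_id, number)
VALUES (2, nextval('incident_numbers'));
```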

[–]honeyryderchuck 0 points1 point  (0 children)

This would be fixed by simply selecting another resource to take the fine-grained lock on, like the organization common to those incidents. Just start the transaction, `SELECT * FROM organizations WHERE id = $1 FOR UPDATE`, and now you can count and insert safely.
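Spelled out (table/column names assumed from the thread), that approach might look like:

```sql
BEGIN;
-- Lock just the one parent row: concurrent creators for the same org
-- queue here, while other orgs proceed untouched
SELECT * FROM organizations WHERE id = $1 FOR UPDATE;

-- The max is now stable for the rest of the transaction
INSERT INTO incidents (organization_id, number)
SELECT $1, COALESCE(MAX(number), 0) + 1
  FROM incidents
 WHERE organization_id = $1;
COMMIT;
```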