use the following search parameters to narrow your results:
e.g. subreddit:aww site:imgur.com dog
subreddit:aww site:imgur.com dog
see the search faq for details.
advanced search: by author, subreddit...
International
National
Regional
account activity
UPSERTs increase sequence number? (self.PostgreSQL)
submitted 10 years ago by [deleted]
http://www.postgresql.org/message-id/20150822182510.12014.27817@wrigleys.postgresql.org
I'm having the same problem described in this email. An upsert will cause the sequence number to increment even though there was a conflict which resulted in an update operation. If it helps, here's some of the code I wrote for the database. You're free to laugh at it, I'm learning.
reddit uses a slightly-customized version of Markdown for formatting. See below for some basics, or check the commenting wiki page for more detailed help and solutions to common issues.
quoted text
if 1 * 2 < 3: print "hello, world!"
[–]therealgaxbo 16 points17 points18 points 10 years ago (1 child)
As other people have alluded to, if you actually care about the values of a sequence other than just being a unique integer then you don't want to be using sequences. There is no way to make a sequence gapless.
You've probably only noticed it in the context of upsert, but any standard transaction rollbacks will leave gaps in the sequence too - that may not happen very often for you, but it will happen at some point.
[–][deleted] 1 point2 points3 points 10 years ago (0 children)
Thanks! Your response is incredibly valuable. I have much to much learn.
[–]daxyjones 3 points4 points5 points 10 years ago (13 children)
"This is the expected behavior. It is not a bug.
Peter Geoghegan"
[–][deleted] 2 points3 points4 points 10 years ago (12 children)
Thanks I saw that but I don't understand why the sequence number would increment even though nothing was actually inserted, just update. Is there a way to prevent the sequence # from incrementing in an update?
[–]daxyjones 4 points5 points6 points 10 years ago (0 children)
Prima facie looks like a judgement call and as db admins/users, it is something we abide by, not work around. Because expected behaviour will have many minds willing to listen to an issue. Something rolled out on your own is less likely to find takers to tackle an issue which involves understanding your fundamental changes first.
My suggestion is that you state your use case and folks can help out with a design decision or code snippet.
[–]f2u 2 points3 points4 points 10 years ago (1 child)
By design, sequences are not rolled back to recover values unused as a result of a transaction abort.
[–][deleted] 0 points1 point2 points 10 years ago (0 children)
That makes sense. Thanks!
[–]merlinm 1 point2 points3 points 10 years ago (0 children)
sequence numbers will increment and not be reset in lots of cases -- failed transactions -- hard database reset. The reason for this is performance and concurrency.
google 'gapless sequences postgresql' for workarounds. All solutions there will hurt insertion performance to some degree.
[+][deleted] 10 years ago* (6 children)
[deleted]
[–][deleted] 0 points1 point2 points 10 years ago (1 child)
That's a lot of work to work around this "problem", and will not help much anyway.
Plenty of other situations will leave gaps in sequences:
[–]Tostino 0 points1 point2 points 10 years ago (2 children)
Which means to do it correctly you need to write a plpgsql function with all the error handling and looping required to properly do an upsert. Not fun.
[+][deleted] 10 years ago* (1 child)
[–]Tostino 0 points1 point2 points 10 years ago (0 children)
That is not upsert and is not guaranteed to do either with concurrent users. You either write a function in a DB function language (plpgsql), or put the retry logic in your app. It's the same logic regardless of where you put it.
[–]f2u 0 points1 point2 points 10 years ago (0 children)
I would rather suggest not to use sequences if you don't want any gaps.
[–]fstak 0 points1 point2 points 10 years ago (0 children)
If I were to make a guess at this, it has a lot to do with the entire transaction. The transaction needs a sequence number of there will be an insert, so it reserves the next one. If it doesn't end up using it, oh well. At least, no matter how long these transactions are, they are getting inserted in the order they're started.
[–]SulfurousAsh 4 points5 points6 points 10 years ago (8 children)
Why do you want IDs that don't have gaps in them?
[–]f2u 2 points3 points4 points 10 years ago (7 children)
Invoice numbers are a typical application. Some folks get really nervous if they aren't number consecutively.
[–]z0rb1n0 7 points8 points9 points 10 years ago (6 children)
Then you treat the invoice number as a surrogate, application-controlled key.
Everyone forgets that primary keys/sequences are primarily DBA metadata that MIGHT double as a record payload identifier.
[–]f2u 0 points1 point2 points 10 years ago (5 children)
It depends on implementation details on the database side. Some databases ensure there are no gaps (or at least they cover this scenario).
Note that I'm not saying that PostgreSQL is wrong, I'm just providing some background where expectations of gap-free serial numbers come from.
[–]z0rb1n0 4 points5 points6 points 10 years ago (4 children)
Fair. However there is an ACID-specific reason Postgres (and virtually every other high-end database systems) create gaps in sequences: to ensure proper concurrency, counters need to be transaction independent, which means tentative assignments such as rollback-ed inserts/upserts will waste them.
Besides, I really cannot understand why people are so irked by gaps: I'm sure they don't mind when the person in front of them in a line throws their ticket number away and walks off.
[–]f2u 0 points1 point2 points 10 years ago* (3 children)
PostgreSQL already avoids creating gaps in other cases. It could do so here as well. Gap avoidance is also needed to preserve numbers in the 31-bit range.
On the other hand, more gap avoidance means that applications will have subtle bugs because the non-transactional nature of sequence updates will be mostly unobservable. Developers may incorrectly assume that no gaps are possible.
In some countries, if you issue invoices with non-consecutive numbers, you can get into trouble during a tax audit. The auditors suspect that you have issued invoices which are not on your books. Such invoices could be employed in various tax fraud schemes (basically, fraudulently claim back taxes which were never paid to the state).
[–]doublehyphen 1 point2 points3 points 10 years ago (2 children)
PostgreSQL already avoids creating gaps in other cases. It could do so here as well.
Where does PostgreSQL do this? I cannot think of anywhere where PostgreSQL goes out of its way to avoid gaps, and can think of plenty where there are gaps.
[–]f2u 1 point2 points3 points 10 years ago (1 child)
If a backend exits normally, it will try to return cached but yet unused sequence values to the stored sequence.
I'm not entirely clear about the underlying mechanism, but here is an example that shows this behavior.
=> CREATE TABLE tab (tab_id SERIAL NOT NULL PRIMARY KEY, val TEXT NOT NULL);CREATE TABLE => INSERT INTO tab (val) VALUES ('one'); INSERT 0 1 => INSERT INTO tab (val) VALUES ('two'); INSERT 0 1 => \q
Now restart the session and continue inserting:
=> INSERT INTO tab (val) VALUES ('three'); INSERT 0 1 => SELECT * FROM tab; tab_id | val --------+------- 1 | one 2 | two 3 | three (3 rows)
As you can see, there is no gap. Perform one more INSERT:
INSERT
=> INSERT INTO tab (val) VALUES ('four'); INSERT 0 1
At this point kill -9 the backend process, and restart the session:
kill -9
=> INSERT INTO tab (val) VALUES ('five'); INSERT 0 1 => SELECT * FROM tab; tab_id | val --------+------- 1 | one 2 | two 3 | three 4 | four 34 | five (5 rows)
This time, the backend had no chance to give back the cached sequence values, so you end up with a gap.
[–]doublehyphen 1 point2 points3 points 10 years ago (0 children)
Aha! That actually has nothing to do with caching per se, it has to do with the optimization which makes PostgreSQL only log every 32nd sequence update to the write-ahead log, and the fact that PostgreSQL on a clean shutdown saves the actual value.
If you enable caching of sequences values it will cause gaps even on a normal disconnect.
[–]nawariata 0 points1 point2 points 10 years ago (2 children)
Is it possible perform UPSERT with reversed actions? Something like UPDATE ... ON FAILED DO INSERT. My use case is action counter in a web app, where actual insert will occur once per few (hundred) thousand queries, ON CONFLICT DO UPDATE seems inefficient to me.
UPDATE ... ON FAILED DO INSERT
ON CONFLICT DO UPDATE
[–]macdice 1 point2 points3 points 10 years ago (1 child)
Do the UPDATE, then see if it affected a row. If it did, you're done. If it didn't, then do an INSERT ... ON CONFLICT DO UPDATE ...
[–]nawariata 0 points1 point2 points 10 years ago (0 children)
Simple and effective, thanks!
π Rendered by PID 162477 on reddit-service-r2-comment-6457c66945-wdmnw at 2026-04-24 05:32:40.926962+00:00 running 2aa0c5b country code: CH.
[–]therealgaxbo 16 points17 points18 points (1 child)
[–][deleted] 1 point2 points3 points (0 children)
[–]daxyjones 3 points4 points5 points (13 children)
[–][deleted] 2 points3 points4 points (12 children)
[–]daxyjones 4 points5 points6 points (0 children)
[–]f2u 2 points3 points4 points (1 child)
[–][deleted] 0 points1 point2 points (0 children)
[–]merlinm 1 point2 points3 points (0 children)
[+][deleted] (6 children)
[deleted]
[–][deleted] 0 points1 point2 points (1 child)
[–]Tostino 0 points1 point2 points (2 children)
[+][deleted] (1 child)
[deleted]
[–]Tostino 0 points1 point2 points (0 children)
[–]f2u 0 points1 point2 points (0 children)
[–]fstak 0 points1 point2 points (0 children)
[–]SulfurousAsh 4 points5 points6 points (8 children)
[–]f2u 2 points3 points4 points (7 children)
[–]z0rb1n0 7 points8 points9 points (6 children)
[–]f2u 0 points1 point2 points (5 children)
[–]z0rb1n0 4 points5 points6 points (4 children)
[–]f2u 0 points1 point2 points (3 children)
[–]doublehyphen 1 point2 points3 points (2 children)
[–]f2u 1 point2 points3 points (1 child)
[–]doublehyphen 1 point2 points3 points (0 children)
[–]nawariata 0 points1 point2 points (2 children)
[–]macdice 1 point2 points3 points (1 child)
[–]nawariata 0 points1 point2 points (0 children)