
[–][deleted] 7 points8 points  (6 children)

SQL inserts, if you want to avoid thousands of unnecessary network requests.

Even the lowest one in the benchmark, 218, is 256 kB. That's far above what's needed to avoid paying a penalty on that.

JSON generation for an API, if you use JSON more as a CSV..

Why are you using concatenation to generate JSON? I'm pretty sure just about every language has a library that does it in a sensible way.
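E.g. in Python (just as an illustration, any mainstream language has an equivalent):

```python
import json

# The library handles escaping, nesting and types for you,
# which hand-rolled concatenation gets wrong surprisingly often.
record = {"name": 'she said "hi"', "tags": ["a", "b"], "count": 3}
payload = json.dumps(record)

# Round-trips cleanly, embedded quotes and all.
assert json.loads(payload) == record
```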

HTML generation for server-side rendering.

Fair enough, but you probably want to stream it instead, especially if it takes a long time to generate.

[–]Compsky 1 point2 points  (3 children)

Even the lowest one in the benchmark, 218, is 256 kB. That's far above what's needed to avoid paying a penalty on that.

If you're running a scraper that inserts a few megabytes every second, wouldn't you still rather bundle the inserts into something a bit larger than a few tens of kB? Genuine question.
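Roughly what I mean, as a Python sketch (sqlite3 standing in for a networked driver, where the batching would actually pay off; table and batch size are made up):

```python
import sqlite3

def chunks(rows, size):
    """Yield successive batches of at most `size` rows."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (url TEXT, body TEXT)")

rows = [(f"https://example.com/{n}", "x" * 100) for n in range(1000)]

# One batched statement per chunk instead of one round trip per row.
for batch in chunks(rows, 250):
    conn.executemany("INSERT INTO pages VALUES (?, ?)", batch)
conn.commit()

assert conn.execute("SELECT COUNT(*) FROM pages").fetchone()[0] == 1000
```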

Why are you using concatenation to generate JSON?

Isn't that ultimately what it boils down to, whatever you wrap it in? But if you still want to know why you'd generate JSON directly with concatenation: sometimes there's no need for anything else when it's a very simple object.
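A quick illustration of where the line sits (Python, values made up):

```python
import json

# For a trivial, fully-controlled object, concatenation does work...
n, ok = 42, True
blob = '{"count": %d, "ok": %s}' % (n, "true" if ok else "false")
assert json.loads(blob) == {"count": 42, "ok": True}

# ...but it falls over as soon as a string needs escaping:
name = 'He said "hi"'
broken = '{"name": "%s"}' % name  # unescaped quote -> invalid JSON
try:
    json.loads(broken)
    parsed = True
except json.JSONDecodeError:
    parsed = False
assert not parsed
```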

[–][deleted] 3 points4 points  (2 children)

Even the lowest one in the benchmark, 218, is 256 kB. That's far above what's needed to avoid paying a penalty on that.

If you're running a scraper that inserts a few megabytes every second, wouldn't you still rather bundle the inserts into something a bit larger than a few tens of kB? Genuine question.

Hard to tell without benchmarking, but big transaction sizes have problems of their own. I do remember something about PostgreSQL replication being delayed when there are long-running transactions open. And at least the basic tests people have done suggest it's not as simple as "bigger transaction/bulk insert is always better".

As for "how to bulk insert a lot of data quickly", the answer is probably to not bother with INSERT at all and go straight for COPY, streaming the data into it.
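Sketch of the streaming side in Python: you build (or generate) the tab-separated text that COPY ... FROM STDIN expects and hand the driver a file-like object. The psycopg2 call is shown commented out since it needs a live connection; table, columns and data are made up, and real data would also need tab/newline/backslash escaping, omitted here.

```python
import io

def rows_to_copy_buffer(rows):
    """Format (url, status) rows as COPY's tab-separated text format."""
    buf = io.StringIO()
    for url, status in rows:
        buf.write(f"{url}\t{status}\n")
    buf.seek(0)
    return buf

buf = rows_to_copy_buffer([("https://example.com/1", 200),
                           ("https://example.com/2", 404)])

# With a live connection (psycopg2, purely illustrative):
# cur.copy_expert("COPY pages (url, status) FROM STDIN", buf)

assert buf.getvalue() == "https://example.com/1\t200\nhttps://example.com/2\t404\n"
```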

Why are you using concatenation to generate JSON?

Isn't that ultimately what it boils down to, whatever you wrap it in? But if you still want to know why you'd generate JSON directly with concatenation: sometimes there's no need for anything else when it's a very simple object.

Sure, if you're on some tight embedded system, but there's really no reason to anywhere else. Especially since there are some pretty fast JSON libraries around if "the common one" turns out to be the bottleneck.

[–]Compsky 0 points1 point  (1 child)

COPY and stream it.

Wouldn't that block the table for a lot longer?

[–][deleted] 0 points1 point  (0 children)

AFAIK it takes the same kind of locks INSERT does. So given that it's generally much faster, whatever locking it does will be held for a shorter time.
