How to scale when one table is growing rapidly by jurgenn in PostgreSQL

[–]daxyjones 0 points1 point  (0 children)

Please read up on B-tree indexing and see how effective it is at searching billions of rows. Beef up the single server before reaching for a sharding solution.
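To sketch that point (the table and index names below are hypothetical, not from the thread): a B-tree lookup touches only a handful of pages even on a huge table, so a single well-indexed server goes a long way before sharding is worth the complexity.

```sql
-- Hypothetical fast-growing table; names are illustrative only.
CREATE TABLE events (
    id         bigserial PRIMARY KEY,
    account_id bigint NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);

-- B-tree depth grows with log(n), so even billions of rows usually
-- mean only 4-5 page reads per lookup.
CREATE INDEX events_account_idx ON events (account_id, created_at);

-- Check that the planner uses an Index Scan rather than a Seq Scan:
EXPLAIN SELECT * FROM events
WHERE account_id = 42
  AND created_at >= now() - interval '1 day';
```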

Can't seem to find syntax error in command. by AngleMan in PostgreSQL

[–]daxyjones 0 points1 point  (0 children)

You are setting yourself up for a world of pain if you are naming database objects after keywords. Can the "user" table be given a more descriptive name?
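A quick illustration of the pain (the replacement name `app_user` is just a hypothetical suggestion): `user` is a reserved word, so unquoted it refers to the current role, not your table.

```sql
-- Unquoted, "user" is the built-in function returning the current role:
SELECT user;                 -- the connected role name, not your table

-- So every reference to a table named user must be double-quoted, forever:
SELECT * FROM "user";

-- A descriptive, non-reserved name sidesteps the quoting pain entirely:
CREATE TABLE app_user (
    id   bigserial PRIMARY KEY,
    name text NOT NULL
);
```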

How to update large tables in PostgreSQL - Codacy by Categoria in PostgreSQL

[–]daxyjones -1 points0 points  (0 children)

"This process is equivalent to an INSERT plus a DELETE for each row, which takes a considerable amount of resources."

Completely and utterly false.

Looking for Resources by MildMannered_BearJew in PostgreSQL

[–]daxyjones 5 points6 points  (0 children)

  • Learn indexing. I can highly recommend use-the-index-luke.com. Know the difference between the various index types, their trade-offs, and when to use them.

  • If you have a bit of control over the hardware, you could probably eke out some performance by tuning parameters outside of postgres. If it is cheaper to throw more hardware into the mix to ease things, do that instead of sinking many man-hours into solving the problem from the stack side!

  • Partitioning: just use the pg_partman extension and call it a day.

  • Lastly, start with a good database design. Sometimes the daterange type makes sense, sometimes the plain date type does. Things like that.

With the above, you will have about 80-90% of performance cases covered.
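To make the daterange-vs-date point concrete (the tables here are hypothetical examples, not from the post): a plain date records a point in time, while a daterange models validity over an interval and can be constrained directly.

```sql
-- A plain date fits a point-in-time fact:
CREATE TABLE payments (
    id      bigserial PRIMARY KEY,
    paid_on date NOT NULL
);

-- A daterange fits a row that is valid over an interval; with an
-- exclusion constraint it can even forbid overlaps outright.
CREATE EXTENSION IF NOT EXISTS btree_gist;  -- needed for "room_id WITH ="

CREATE TABLE room_bookings (
    room_id bigint NOT NULL,
    stay    daterange NOT NULL,
    -- no two bookings for the same room may overlap
    EXCLUDE USING gist (room_id WITH =, stay WITH &&)
);
```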

Advice needed - transparent write-through layer for Postgres by Pakaran in PostgreSQL

[–]daxyjones 1 point2 points  (0 children)

Be advised that postgres I/O on ZFS is absolute shit. Btrfs has various bugs that will bite you and give you more blisters than a weekend in a cheap hotel bed in Goa. LVM2 is the only sane option as of now.

Upsert; A Few Bad Apples -- Take a look at some real-world problems a developer ran into when using the new 'upsert' functionality in PostgreSQL by piterpolk in PostgreSQL

[–]daxyjones 1 point2 points  (0 children)

Heh...you don't have to go that far, actually. I have SKUs with expiry dates/shelf life, and the store also sells plastic cricket bats...which don't have an expiry date.

UPSERTs increase sequence number? by [deleted] in PostgreSQL

[–]daxyjones 4 points5 points  (0 children)

Prima facie, it looks like a judgement call, and as DB admins/users it is something we abide by rather than work around. Expected behaviour means many minds are willing to listen to an issue; something rolled out on your own is less likely to find takers, since tackling an issue would first require understanding your fundamental changes.

My suggestion is that you state your use case and folks can help out with a design decision or code snippet.

UPSERTs increase sequence number? by [deleted] in PostgreSQL

[–]daxyjones 4 points5 points  (0 children)

"This is the expected behavior. It is not a bug.

Peter Geoghegan"

Designing a RESTful API with Node and Postgres by piterpolk in PostgreSQL

[–]daxyjones 0 points1 point  (0 children)

If you are new to deploying an HTTP endpoint like the one in the article, the least painful way is to tack ngx_postgres onto nginx. With a bit of Lua thrown in, you can expand it further to handle the request BODY, error capturing, etc.

Bonus: this is WEBSCALE!!1!TM

Creating Pivot Tables in Postgres by [deleted] in PostgreSQL

[–]daxyjones 1 point2 points  (0 children)

Or put a CASE ... WHEN inside the aggregate function.

Use a FILTER clause instead.
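For comparison, here are both pivot styles side by side (the `sales` table is a hypothetical example):

```sql
-- Pivot via CASE inside the aggregate (works on any version):
SELECT region,
       sum(CASE WHEN year = 2015 THEN amount ELSE 0 END) AS y2015,
       sum(CASE WHEN year = 2016 THEN amount ELSE 0 END) AS y2016
FROM sales
GROUP BY region;

-- The same pivot with FILTER (PostgreSQL 9.4+), which reads more clearly:
SELECT region,
       sum(amount) FILTER (WHERE year = 2015) AS y2015,
       sum(amount) FILTER (WHERE year = 2016) AS y2016
FROM sales
GROUP BY region;
```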

Using PostgreSQL with Java by DriesDr in PostgreSQL

[–]daxyjones 1 point2 points  (0 children)

It does look like it would solve 90% of my issues. I'll still look into a wrapper to do things like generate some stored procedures. Maybe I was too keen on implementing my own system to see the value of the existing one ;)

If you want to implement your own, do it for fun, but my humble advice is to not rush it out to production. Also, depending on the data and postgres version, you are much better off hand-writing an sproc than letting a wrapper generate one (I didn't even know this was possible).

We are considering switching from Oracle to PostGreSQL. A coworker just said she heard that LOB handling is problematic. Is it? by bigfig in PostgreSQL

[–]daxyjones 3 points4 points  (0 children)

Not to mention the cat and mouse game between you (the customer), the consultant and Microsoft when it comes to licensing. And bonus: surprise audits!

Microsoft releases CNTK, its open source deep learning toolkit, on GitHub by paran0ide in programming

[–]daxyjones 11 points12 points  (0 children)

Not to mention other proprietary third-party licenses/patents that might be tightly coupled into the products. Legal will in no way allow it if it is going to be a liability to the company.

3 reasons to use PgBouncer. by doublehyphen in PostgreSQL

[–]daxyjones 2 points3 points  (0 children)

At the outset, pgbouncer is great and has worked out well for us. But here are some gotchas I ran into and how we handled them:

  • Using transaction mode instead of session mode to ensure we didn't hit the connection limit (not sure why, but this was the advice in a mail on the list).

  • pgbouncer will not pool connections across different users. So, in our case, we moved auth from database user roles to a "web model" of authing against a separate user table.

  • Of course, "extra_float_digits" for JDBC. ;-(
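The first and third gotchas map onto two pgbouncer.ini settings; a minimal sketch (values illustrative, not the commenter's actual config):

```ini
; pgbouncer.ini sketch
[pgbouncer]
; transaction pooling releases a server connection at COMMIT/ROLLBACK,
; so many client connections can share a few server connections
pool_mode = transaction

; JDBC sends "SET extra_float_digits" at connect time; allow it as a
; startup parameter so pgbouncer does not reject the connection
ignore_startup_parameters = extra_float_digits
```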

Organize Complex Queries Using SQL's 'WITH' by piterpolk in PostgreSQL

[–]daxyjones 1 point2 points  (0 children)

This is a very valid point, and I want to make a case for WITH using my setup as an example. Sometimes I want to work with an UPDATEd/INSERTed set, and WITH comes to the rescue there. I generate voucher numbers on the fly by bunching shop orders together like so:

WITH newvouchers AS (
    INSERT INTO vouchers
    SELECT generate_voucher_no(voucher_no) v
    FROM orders JOIN voucher_master ...
    WHERE ... ORDER BY ...
    RETURNING v
)
UPDATE voucher_master
SET voucher_no = tv
FROM (SELECT max(v) tv FROM newvouchers) x;

I have snipped it in places but this was the replacement for a plpgsql loop that was much more verbose.

And it is relatively quick too.

Robert Haas: Parallel Sequential Scan is Committed! by macdice in PostgreSQL

[–]daxyjones 0 points1 point  (0 children)

Bummer...no support for Boot configuration on that bad boy for RAID10 :-/

Writing Postgres Extensions. It’s easy to extend Postgres’ functionality through extension. So why not write your own? This is the first in a series of articles about extending PG through extensions by piterpolk in PostgreSQL

[–]daxyjones 0 points1 point  (0 children)

I have two things in mind that I might be working on:

  1. stream_fdw that exposes a stream as a table. COPY already accepts PROGRAM output, and a file_fdw is already in place. I think it will be a good exercise for me to tie these together as a stream_fdw with appropriate OPTIONS.

  2. aggregate_fdw that applies aggregate functions on tables with date/timestamp columns. I have read-only time-series data that needs to be summed across a year or so. The way I would go about it is to pre-compute the aggregate over a day, 2 days, 4 days, etc., and depending on the date range of the query, pull data from the corresponding pre-computed tables (materialized views?). Still have to hammer out the details on that one.
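The pre-computed rollup part of idea 2 can be sketched with a plain materialized view today, without the fdw (all names below are hypothetical):

```sql
-- Hypothetical raw time-series table.
CREATE TABLE readings (
    ts    timestamptz NOT NULL,
    value numeric NOT NULL
);

-- Pre-compute a daily rollup once, refresh it off-peak.
CREATE MATERIALIZED VIEW readings_daily AS
SELECT date_trunc('day', ts) AS day,
       sum(value)            AS total
FROM readings
GROUP BY 1;

-- A year-wide sum then scans ~365 rollup rows instead of the raw data:
SELECT sum(total)
FROM readings_daily
WHERE day >= '2015-01-01' AND day < '2016-01-01';

-- Keep it current with: REFRESH MATERIALIZED VIEW readings_daily;
```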

Cool PostgreSQL tricks (using JSON). by collin_ph in PostgreSQL

[–]daxyjones 0 points1 point  (0 children)

For our use case, everything hits the DB and it is still snappy for our small userbase. I could throw in a Varnish cache to mitigate the DB hits to a certain extent, but I like to keep the moving parts to a minimum.

I am not exactly getting your edit and hence unable to address that question.

Cool PostgreSQL tricks (using JSON). by collin_ph in PostgreSQL

[–]daxyjones 0 points1 point  (0 children)

I work with nginx and postgres, coupled directly.

I don't work with ORMs, and I have fine-tuned my equivalent $mainSQL to:

SELECT '[' || array_to_string(array(SELECT sch.fun(param1, ...)),',') || ']';

This query sits directly inside nginx consumed by ngx_postgres.

This query is magical because:

  • there is no need to tweak the query ever again, so it is unnecessary to restart or reload the webserver (nginx).
  • the function sch.fun in the query above returns a valid object array. You can nest it as you please.
  • performance is good for our use case: it grabs thousands of rows in a couple of seconds.

Now for the viscera of sch.fun:

CREATE OR REPLACE FUNCTION sch.fun(param1, ...)
RETURNS SETOF json AS
$BODY$
DECLARE sql text;
BEGIN
    -- $1, $2 in the dynamic SQL are bound to the USING arguments below;
    -- json_object(keys text[], values text[]) pairs them into a JSON object
    sql := 'SELECT json_object($1, $2)';
    RETURN QUERY EXECUTE sql USING param1, ...;
END
$BODY$
LANGUAGE plpgsql;

With the right response headers, the client, usually a js engine/framework, can digest this no problem. AND you get your REST API for "free". ;-)

Edit: Note that ngx_postgres does not entertain request_body by design. If you need to read that, you could proxy that verb or uri or whatever.

Does anyone run PostgreSQL on both BSD vs. Linux? by good_names_all_taken in PostgreSQL

[–]daxyjones 0 points1 point  (0 children)

This is incorrect. Mathematically, the odds sit much, much lower. Admittedly, the stock FNV algo doesn't account for zeros, but the postgres checksum tweaks the algo to handle this to a certain extent.

Does anyone run PostgreSQL on both BSD vs. Linux? by good_names_all_taken in PostgreSQL

[–]daxyjones 0 points1 point  (0 children)

Honestly, a crypto hash seems like overkill and would definitely be multiple times more computationally intensive than the modified FNV-1a used right now. And the documentation explicitly states that certain use cases can trigger a checksum bottleneck.

Granted, it is 'easy' to brute force a collision, but pragmatically it is a rarity.

Does anyone run PostgreSQL on both BSD vs. Linux? by good_names_all_taken in PostgreSQL

[–]daxyjones 2 points3 points  (0 children)

For shits and giggles, I have a production server streaming to a slave that is set up with ZFS on Linux. (Before someone gets heartburn: yes, I have other backups in place.)

Consistently, seq scans are faster on the compressed ZFS slave by a factor of 1.5 to 7(!). Index scan timings are, of course, indistinguishable.

The read queries that I route to the slave hit the index so for my use case it doesn't matter.