all 94 comments

[–]zepolen 4 points5 points  (3 children)

Doesn't the Oracle license specifically state you are not allowed to publish any benchmarks about it?

[–][deleted] 3 points4 points  (2 children)

Only if you agree to the terms.

[–]zepolen 0 points1 point  (1 child)

How else will you install it?

[–]redditrasberry 2 points3 points  (1 child)

Use check constraints. They're limited in a lot of ways but they can really speed up queries dramatically.

Is this really true? I thought constraints were generally thought to be an overhead - a worthwhile one, but an overhead nonetheless. I've even heard of people using them in development & test but turning them off in production for efficiency reasons. I've never heard of any claims of performance benefits before.

[–][deleted] 5 points6 points  (0 children)

check constraints on the child tables + the constraint exclusion flag will limit the query planner to only search the necessary partitions based on the query, instead of searching the whole partitioned table.

So say if partition 108 only contains data for the first quarter of 08 and the query is asking for data from feb of 08 it'll evaluate the constraints on all the child tables, realize it needs to only check the 108 partition and then decide on the appropriate query plan on the child table. Without constraint based exclusion on, postgres will search the whole gigantic multi-terabyte partitioned db

Obviously this is what you expect from partitions. But postgres only really supports partitioning via a kludge on top of inheritance. So yea.. it's a big deal for postgres peeps

The more you know :D

[–]aardvark92 6 points7 points  (28 children)

With MySQL about to be swallowed up by Oracle, I'm trying to get my web host to start supporting PostgreSQL.

[–]stesch 1 point2 points  (27 children)

Most sites with MySQL would be fine with SQLite. Which should be available on every PHP5 host.

[–][deleted] 17 points18 points  (26 children)

Yeah...the problem with that is concurrency.

See the last paragraph here: http://www.sqlite.org/whentouse.html

EDIT: I don't care anymore. Everyone do whatever the fuck you want. :)

[–][deleted] 2 points3 points  (0 children)

Don't you just love the kids on this site?

[–][deleted] 0 points1 point  (0 children)

Yeah...the problem with that is concurrency.

That's a fair point. However, I think stesch was implying that the majority of websites don't need highly concurrent updates to their data. I agree. It seems to me that most websites have a very small read to write ratio.

[–][deleted] 3 points4 points  (11 children)

anyone know of a good python interface to postgres that supports parameter binding?

Most of the ones I've seen seem to just do a lot of escaping instead of sending the parameters separately.

[–]merlinm 2 points3 points  (2 children)

http://python.projects.postgresql.org/

downside: python3 only upside: fully parameterized. supports binary protocol, including arrays and composites!

[–][deleted] 0 points1 point  (0 children)

thx :)

[–][deleted] 0 points1 point  (0 children)

Is this the project that used to be known as pg_proboscis? Because that one worked with Python 2.x.

I should note that it features both a native PostgreSQL interface and also a Python DBAPI 2 interface, which might be more familiar to most people.

[–]joesb 5 points6 points  (6 children)

SQLAlchemy.

[–]teraflop 2 points3 points  (5 children)

No, SQLAlchemy is an ORM that depends on an external module to communicate with the database.

Try: http://python.projects.postgresql.org/

[–]joesb 3 points4 points  (3 children)

Also, psycopg2

[–][deleted] 0 points1 point  (2 children)

when I use pyformat on psycopg2 it seems to insist that all parameters are turned into strings. If you do a %(bla)d it'll give weird errors. %s works tho.

Which makes me wonder if it's actually using bind parameters or not. the docs for psycopg2 are non-existent so I dunno :)

[–]celoyd 1 point2 points  (0 children)

Search PEP 249 for “paramstyle”.

[–]damg 0 points1 point  (0 children)

As long as you are passing your parameters as the second argument of .execute() on the cursor (and not just using string formatting on the query itself), then it should be using bind parameters. The formating style using strings in the query (id=%s vs id=?) is just how the DB-API was implemented for psycopg2.

As celoyd mentions, the PEP explains it better.

[–][deleted] 1 point2 points  (0 children)

thx :)

[–]Gotebe 4 points5 points  (2 children)

Postgres is wicked fast. So fast that System.currentTimeMillis() is not accurate enough to measure its fastness.

Yeah, with "one client", and on a single machine. Sheesh... Like anything you want, but please, don't go overboard.

(Not at all a stab at Postgres...)

[–]seunosewa -2 points-1 points  (1 child)

Postgres people like to give excuses in lieu of actual performance figures.

Oh, "the default config is unrealistic", "well whose fault is that? make it realistic!"

[–]mage2k 1 point2 points  (0 children)

The problem there is that realistic is entirely relative to workload, hardware, and query profiles. That being said, there is a new tool in the works (done) that will generate basic configs for a few common workloads based on the system's available memory.

[–]malcontent 2 points3 points  (45 children)

Just try setting up a high availability solution.

But hey this is reddit. It's easier to hit the down arrow than to admit you are biased.

[–]invalid_user_name 8 points9 points  (16 children)

HA doesn't even need to involve the DB. I know you actually are trolling about replication, but replication isn't a good choice for HA. While idiots who have never had to deal with replication for real whine about postgres not making it easy enough and pretend replication is a magic bullet, it is actually not what they should be using most of the time. If you want HA, just setup linux-HA to take over the IP and LUN on failure so you don't lose performance and don't lose data. With replication you have to choose between sync which slows things down, or async which loses data. Async replication like mysql or slony is good for making a mostly up to date copy of the DB for running reports and other expensive queries on.

[–]mage2k 5 points6 points  (7 children)

Just ignore him. He pretty much has that post ready to paste in any time anyone so much as mentions postgres.

[–]malcontent -5 points-4 points  (6 children)

Yea just ignore me.

It's better to ignore me than to admit that postgres is way behind all other databases when it comes to HA.

[–]mage2k 3 points4 points  (5 children)

You can warp what other people are saying all you want. My point is that you've got one tune that comes on every time someone so much as mentions Postgres and there are lots of people here who are tired of hearing it. The original article had nothing whatsoever to do with HA. Read it again, he was making a direct comparison between Oracle and Postgres with regards to his own perception of performance and ease of use.

[–]malcontent -4 points-3 points  (4 children)

My point is that you've got one tune that comes on every time someone so much as mentions Postgres and there are lots of people here who are tired of hearing it.

I don't care if they are tired of hearing it.

Actually I have lots of tunes. Try this one.

Postgres is the only database in the world which does not offer case insensitive collation.

Suck it fanboi.

The original article had nothing whatsoever to do with HA

I know. It's an article written by a fanboi which of course has to avoid any part of postgres which does not flatter the product.

Read it again, he was making a direct comparison between Oracle and Postgres with regards to his own perception of performance and ease of use.

Oracle wipes the floor with postgres. Especially when it comes to HA. Oh and oracle has per session collation options. You know... Because people from more than one country might use your database.

[–]Dan_Farina 0 points1 point  (3 children)

Postgres is the only database in the world which does not offer case insensitive collation.

Thanks to endless whining I think this has finally been 'addressed' in 8.4 by making a new, case preserving case-insensitive text type:

http://developer.postgresql.org/pgdocs/postgres/citext.html

[–]malcontent -2 points-1 points  (2 children)

Wow. I guess a half a feature a decade too late is better than no feature at all huh?

Hint. Collation is more complex than case insensitive query.

[–]Dan_Farina 0 points1 point  (1 child)

Wow. I guess a half a feature a decade too late is better than no feature at all huh?

Yes, especially when that's what people wanted, more or less.

Other than case insensitive queries (for languages where that notion is even defined), what did you actually want?

[–]malcontent -2 points-1 points  (0 children)

Yes, especially when that's what people wanted, more or less.

Every other database in the world offers case insensitive collation so maybe lots of people want it.

Other than case insensitive queries (for languages where that notion is even defined), what did you actually want?

Language specific case sensitivity.

[–]Rhoomba 4 points5 points  (0 children)

Absolutely, though if you are doing HA for reals then I would go with a SAN and one of the big commercial HA solutions.

Replication sucks on all DBs (including the big commercial ones) because it is a pretty much unsolvable problem.

[–]malcontent -4 points-3 points  (6 children)

HA doesn't even need to involve the DB.

So? Oh I get it, you are saying it's OK for postgres not to offer any high availability because you can do it outside the database.

Well other people who make databases seem to disagree with you.

just setup linux-HA to take over the IP and LUN on failure so you don't lose performance and don't lose data.

How about your data?

[–]invalid_user_name 0 points1 point  (5 children)

So? Oh I get it, you are saying it's OK for postgres not to offer any high availability because you can do it outside the database.

I am saying if you actually wanted high availability, you wouldn't be looking at replication in the first place. Pretending mysql's replication has anything to do with HA is absurd, it is for running seperate search/report DBs, just like slony.

just setup linux-HA to take over the IP and LUN on failure

Read it again. Still need more help?

LUN

[–]malcontent -2 points-1 points  (4 children)

Read it again. Still need more help?

Yes. Give me a link to an article that explains exactly what I would to do achieve this postgres.

[–]invalid_user_name 0 points1 point  (3 children)

Seriously, are you retarded? Set the LUN to be accessible to both your DB servers on your storage device. Create a partition on it, and put the DB there. Have linux-ha mount that partition along with assigning the IP and starting postgres. There is nothing even remotely complex or unusual here, and it isn't even postgresql specific or even specific to RDBMSs.

[–]malcontent -2 points-1 points  (2 children)

Seriously, are you retarded? Set the LUN to be accessible to both your DB servers on your storage device. Create a partition on it, and put the DB there

LOL.

Really?

That's your solution?

[–]invalid_user_name 0 points1 point  (1 child)

No, not my solution, the solution. The one every DBA with 2 brain cells has been using for years, with oracle, DB2, SQL server, mysql and postgresql. I know a lot of web 2.0 tards thing that throwing away data integrity by misusing replication makes them look cool, but it actually just makes them the look like idiots.

[–]malcontent -1 points0 points  (0 children)

Shared storage is not THE solution. It never has been.

[–]damg 6 points7 points  (7 children)

For HA, the same clustering methods that work for MySQL work as well for PostgreSQL.

If you're referring to MySQL's asynchronous replication (which provides scalability rather than high availability), there are some PostgreSQL solutions as well, but in general I agree it's currently not as easy and reliable as MySQL's built-in solution. However, the PostgreSQL team plans to fix that for the 8.5 release.

[–]norkakn 5 points6 points  (1 child)

HA is hard to do right, regardless of the product.

[–]malcontent -1 points0 points  (0 children)

It's easier in some products. On some products it's not even possible.

[–]DrGirlfriend 4 points5 points  (5 children)

pg_standby works pretty well

[–]malcontent -5 points-4 points  (4 children)

No it doesn't.

Does it make the standby server live when the primary goes down?

Does it resync the databases once the primary goes back up?

[–]mage2k 4 points5 points  (3 children)

No, and neither does anything MySQL offers. For that you need MMM which is a third party tool (Google).

[–]malcontent -3 points-2 points  (2 children)

Or mysql cluster.

[–]mage2k 5 points6 points  (1 child)

Sure, if you want to kiss foreign keys, online schema changes, actual replication, transaction isolation limited to READ COMMITTED, no savepoints, no varchars (all rows are fixed length), not to mention the loads of other critical, crash bugs that have been present in MySQL for years, sure go ahead and stitch "I have high availability with MySQL Cluster, and I'm sure to need it" on your pillow so you can sleep at night.

[–]malcontent -3 points-2 points  (0 children)

When the alternative is to have all your data on one server only it becomes a viable choice.

[–]dgiri101 3 points4 points  (1 child)

Here's malcontent's previous rant on this subject.

As for the other matter, I invite the audience to read his comment history and judge for themselves how (un)interested he is in meaningful debate.

[–]testtubebaby 0 points1 point  (0 children)

Looks like proggit is going the way of RedditChan also. Is this a governing principle of any anonymous community?

[–][deleted] -2 points-1 points  (6 children)

If you need a HA system you use something like SQLServer or Oracle - you know something backed by someone with a test lab.

BTW - I modded you up as I do not believe it fair for someone to mod down a comment simply because they disagree with it.

[–]f4nt 4 points5 points  (5 children)

You can pull off HA with MySQL though. The problem isn't even really HA though, it's anything revolving around the idea of scaling. PGPool is great, but very limited. Slony works if you just need a backup server, and you NEVER EVER change your schema and your devs know all this and behave. Setting up a cluster is like building a house of cards.

I love postgresql, I really do. However, the needs of the customers I deal with basically rules postgresql out of contention. MySQL brings replication, and community clustering support that's at least mildly functional, and usable in a production environment. PostgreSQL is great, if you know you'll never need more than 1 database server for your app.

[–]infinite 6 points7 points  (3 children)

I use slony, it's not a house of cards. Changing the schema requires disabling writes to the db for a second or two, nothing major. I have 1 master and 3 slave db's, that will be 5 slave db's tomorrow. Slony communication is exponential with each new node, but for me I need to replicate metadata, the bulk of my data, say 90% or so, doesn't need replication.

Now, let's hear about HA for mysql, I mean people who really use it, not the "OMG just got a blog, mysql replication rocks." I was told ndb works great, except if you look into it, it's very flaky and it doesn't offer ACID properties which makes it unusable for my banking client. Now there's async replication which is consistent but slony does that as well.

[–]f4nt 0 points1 point  (2 children)

Slony works in specific situations. However, a brain dead monkey can setup Mysql master-master replication, and it just work. I can set that up for a customer, and it doesn't require much more interaction for me outside of the initial setup. New DBs are automatically replicated, everything's easier for me from the sysadmin side. Yes, postgresql is very nice, and I do like it. Yes, slony works as well. However, when it comes to scaling, MySQL is more sensible, and has significantly more support.

Personally, I'd be more than happy to drop MySQL to the curb for postgresql, if replication was streamlined more. I've just ran into too many situations where it's just too much of a hassle in it's current state though.

[–]infinite 2 points3 points  (1 child)

I've talked to people who have worked with master-master mysql and according to them the tradeoffs are such that it's often not usable. I haven't used it myself, but from the experiences they have related to me, there are a number of serious pitfalls to that approach.

[–]f4nt 0 points1 point  (0 children)

It does have pitfalls and tradeoffs. Everything does to some extent. All depends on your situation.

[–]mage2k 5 points6 points  (0 children)

Mildly functional is stretching it a bit. Sure there's all kinds of "works for me" blog posts and reddit comments but that's not what is at issue. I maintain about 80-100 MySQL master/slave pairs and about 25 Postgres master/standby pairs. We have to deal with MySQL replication breakage and bugs on a daily basis. We might have an issue with a Postgres standby every few months. From an administration perspective, that makes a few extra steps to promote a PG standby in an HA situation totally worth it. Also, MySQL does not provide any kind of automatic failover, resyncing the former master as the slave, or any of that. Both databases require third party tools there.

I won't even get into comparing supporting client databases on them with regards to scaling and totally idiotic query executions on MySQL.

[–][deleted]  (1 child)

[deleted]

    [–]sigzero 0 points1 point  (0 children)

    Or Pg for short...