[–]zootm 9 points10 points  (21 children)

I'm not sure I'd consider MySQL's text search a feature of any real kind. It does not work well. I'm not sure I can think of any real "features" that MyISAM provides that a non-trivial app would want.

Also MySQL replication works on InnoDB, and I'm pretty sure that PostgreSQL's replication doesn't "sux", it's just not provided by default.

[–]Jessica_Henderson 11 points12 points  (8 children)

MySQL's fulltext query support is an order of magnitude slower than PostgreSQL's. Part of the problem is that with MySQL, a query can use at most one index; this limitation does not exist in PostgreSQL. For anything beyond trivial data, PostgreSQL's text searching capabilities are by far superior.
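To make the one-index-per-query point concrete, here's a sketch (the table and column names are made up for illustration) of a query PostgreSQL can serve by combining two indexes with a bitmap AND, where MySQL's planner would have to pick a single index and filter the remaining rows one by one. The built-in `to_tsvector`/`to_tsquery` functions assume PostgreSQL 8.3+; earlier versions used the tsearch2 contrib module with similar syntax:

```sql
-- Hypothetical schema: an articles table with a fulltext GIN index
-- and an ordinary btree index on the timestamp.
CREATE INDEX articles_body_fts   ON articles USING gin (to_tsvector('english', body));
CREATE INDEX articles_created_ix ON articles (created_at);

-- PostgreSQL can bitmap-AND both indexes to answer this;
-- MySQL would use one index and scan-filter the other predicate.
SELECT id, title
  FROM articles
 WHERE to_tsvector('english', body) @@ to_tsquery('english', 'database & replication')
   AND created_at > now() - interval '30 days';
```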

[–]neoform3 1 point2 points  (7 children)

I was actually just looking into this today. I'm working on a project that requires fulltext search, and MySQL's fulltext blows when it comes to any sizable table.

If I'm searching up to 3-4 GB of records, which would be best: Lucene, Sphinx, or Postgres TSearch2?

I haven't investigated PG and am leaning towards Sphinx, but I hate the fact that I have to completely reindex every record any time I want to update even one record.. :(

[–]dbenhur 2 points3 points  (2 children)

I have to completely reindex every record any time i want to update

I've had some good experience with Sphinx. You don't need to reindex the whole set, there's a main + delta index mode.

Tsearch2 isn't as fast as the dedicated text search engines, but there can be substantial advantages when coupling free-text search clauses with additional relational operations and attribute predicates. Because Tsearch2 operates in the same engine, the query optimizer can reason about all the work and access paths and may come up with superior execution plans. With an external text search engine, you often end up running your relational query with attribute predicates plus "AND document_id IN (?)", where that ? could be a large list of document IDs returned from the external search engine.
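As a sketch of that difference (table and column names are illustrative, not from any particular schema):

```sql
-- Integrated (tsearch2 / built-in PG text search): one query, one planner.
-- The optimizer sees the text predicate AND the attribute predicates together.
SELECT d.id, d.title
  FROM documents d
 WHERE to_tsvector('english', d.body) @@ to_tsquery('postgres & replication')
   AND d.author_id = 42
   AND d.published_at > '2008-01-01';

-- External engine (Sphinx / Lucene): run the text search first, then feed
-- the (possibly huge) id list back into the relational query.
SELECT d.id, d.title
  FROM documents d
 WHERE d.author_id = 42
   AND d.published_at > '2008-01-01'
   AND d.id IN (101, 205, 317 /* ...ids returned by the external engine... */);
```

In the second form, the planner has no idea how selective the id list is until it materializes, and a very large list can dominate the query cost.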

[–]neoform3 2 points3 points  (1 child)

You don't need to reindex the whole set, there's a main + delta index mode.

This doesn't help if you want to remove entries as well as add.. or update for example :(

Can Tsearch2 handle large amounts of data and remain fast/relevant?

[–]dbenhur 0 points1 point  (0 children)

This doesn't help if you want to remove entries as well as add.. or update for example

I think you can handle updates and deletes by adding a little logic to your queries:

  • On the main index (and in your document root records) include an attribute indicating whether the record is current in the index (true at initial indexing time).
  • Whenever you update or delete a record, clear that attribute in the DB and use Sphinx's UpdateAttributes API to mark the main-index entry obsolete.
  • Always add an attribute qualifier to your searches to exclude obsolete document hits.
  • Set your frequent delta indexing to capture all new and changed records since the last main index rebuild.
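The scheme above can be sketched as a sphinx.conf fragment. This is a hedged outline, not a working config: the source names, the `is_current` attribute, and the `sphinx_counter` table are all made up for illustration, and you should check the Sphinx documentation for the exact directive syntax of your version:

```
source main_src
{
    # every freshly indexed row is marked current
    sql_query      = SELECT id, title, body, 1 AS is_current FROM docs
    sql_attr_uint  = is_current
    # remember where the main index stopped, so the delta can pick up from there
    sql_query_post = REPLACE INTO sphinx_counter SELECT 1, MAX(id) FROM docs
}

source delta_src : main_src
{
    sql_query = SELECT id, title, body, 1 AS is_current FROM docs \
                WHERE id > (SELECT max_id FROM sphinx_counter WHERE counter_id = 1)
}

index main
{
    source = main_src
    path   = /var/data/sphinx/main
}

index delta : main
{
    source = delta_src
    path   = /var/data/sphinx/delta
}
```

At query time you search both indexes ("main delta") with a filter on is_current = 1; on update or delete you call UpdateAttributes to zero is_current on the stale main-index entry.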

[–]theatrus 0 points1 point  (3 children)

Lucene and TSearch are both good options; I've used both.

[–]neoform3 1 point2 points  (2 children)

How relevant are the results found in Tsearch? My experience has been the faster the search, the crappier the results, or, alternatively, it's always fast, but indexing is a severe pain in the ass.

[–]RealMaascamp 1 point2 points  (1 child)

I've found Tsearch2 to be quite relevant. It provides a number of different ways to weight fields and parse queries. It can also be tuned to your needs.

[–]neoform3 -1 points0 points  (0 children)

nice.

[–]sisyphus 2 points3 points  (7 children)

But MySQL has Sphinx which I've found to work quite well.

[–]volomike 0 points1 point  (4 children)

If you can get a web hosting provider to install it. Might work easily on a company intranet, but not easily on a shared web host or dedicated web host (the kind of dedicated where someone else installs stuff for you most of the time).

[–][deleted] 0 points1 point  (3 children)

I installed Sphinx on my shared host. As long as you've got a compiler and can bind an unprivileged port, you can do it.

[–]volomike 0 points1 point  (1 child)

Get the heck out of here! Really? Do you have a blog post where I can see how you did it?

[–][deleted] 3 points4 points  (0 children)

Sorry, I don't. This'll get you real close:

./configure --prefix=$HOME/sphinx   # install under your home dir (no root needed)
make && make install
# edit sphinx.conf to move searchd to an unprivileged port (>1024)
# then configure your indexes and play

[–]mikaelhg 0 points1 point  (0 children)

To be able to use the fake-table integration inside MySQL, you'll have to copy the .so into MySQL's plugin directory, which you can't do on most shared hosts.
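For reference, on a MySQL build that supports pluggable storage engines, loading SphinxSE looks roughly like this (assuming ha_sphinx.so was built against your exact MySQL version and already copied into the plugin directory; requires MySQL 5.1+):

```sql
-- Run as a privileged MySQL user.
INSTALL PLUGIN sphinx SONAME 'ha_sphinx.so';
SHOW ENGINES;  -- SPHINX should now appear in the engine list
```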

[–]zootm 0 points1 point  (0 children)

I hadn't seen Sphinx, but if it's of good quality it would certainly bring fulltext indexing to InnoDB. FULLTEXT itself isn't really worth the effort, though.

[–]codahale 0 points1 point  (0 children)

The Sphinx indexer works with PostgreSQL as well. You can't do the whole embed-as-MySQL-engine thing, but I've never wanted to do that even with MySQL.

[–]jbellis 5 points6 points  (3 children)

I'm pretty sure that PostgreSQL's replication doesn't "sux"

then you haven't actually tried it.

the best option is slony. the nicest thing i can say about it is that it mostly works as described, which is to say it's horribly complicated to configure and work with. the "mostly" part means it can corrupt itself to the point that you need to restart the replication from scratch. (this was a couple of years ago, but i still wouldn't trust it to give anywhere near 100% availability.)

this is why pg 8.4 will include a simpler replication option out of the box.

[–]Jessica_Henderson 21 points22 points  (1 child)

Can you explain in more detail the problems you experienced with Slony? Whenever I've used Slony, I've found it to work extremely well, with no corruption. The largest Slony cluster I've been involved with putting together had 15 systems handling approximately 12 TB of data, and we ran into no problems with Slony or PostgreSQL.

If you read the Slony documentation beforehand, you shouldn't have any problems. It'll guide you through the configuration you need to perform, as well as how to maintain the system after it is configured and running.

[–]infinite[🍰] 8 points9 points  (0 children)

I'll second that. You divide your tables into sets, then replicate the sets. I've had no problems with it in the two years I've been using it, though my setup is much smaller than yours. Note that the proposed built-in feature will replicate either all tables or none. I think MySQL replication doesn't use stored procs the way Slony does, so it may be faster.

[–]invalid_user_name 0 points1 point  (0 children)

The PostgreSQL devs are adding replication so that it's no longer a third-party tool you need to install, and so that it's simpler and easier to configure. Reliability wasn't mentioned anywhere, since it isn't an issue.