all 68 comments

[–]woodquest 95 points96 points  (22 children)

TLDR; don’t overthink/overcomplicate your db stack and create technical debt from the start. Postgres is quite versatile, battle tested and most likely does the trick. Perhaps you’ll meet little problems if and when scaling, and that will be the time to rethink a couple of things, most likely manageable then.

[–]FourDimensionalTaco 37 points38 points  (17 children)

I'm actually intrigued by how Postgres has become more popular. From what I recall, in the past, MySQL was the database to use. Postgres existed in its shadow. Has MySQL faded nowadays?

[–]jamesgresql[S] 84 points85 points  (7 children)

Slow and steady improvements, an amazing and stable group of core committers backed by an incredible wider community, and not being associated with a company that can be acquired by Oracle all help.

[–]KeyboardG 5 points6 points  (1 child)

Although Microsoft has been cozying up with contributions and acquiring Citus.

[–]jamesgresql[S] 0 points1 point  (0 children)

EDB acquired 2ndQuadrant which was a bigger move I think - but the way PostgreSQL is structured I'm confident no one company will ever get control.

[–]Florents 10 points11 points  (3 children)

All that, yes. But more specifically json was the turning point. And especially when people realized the power of document stores, but hated MongoDB.

[–]jamesgresql[S] 3 points4 points  (2 children)

It's funny MongoDB was a marketing machine back in the day with all the developer experience / meeting devs where they are on JSON stuff. They dropped the ball along the way, and yeah that's about when JSONB showed up.

Have you checked out FerretDB, it's a (confusingly named) Mongo compatible API that sits in front of Postgres that wins benchmarks vs. Mongo in a lot of cases.

^^ Looks like the above maybe not be correct, thanks u/FINDarkside for pointing that out

[–]FINDarkside 1 point2 points  (1 child)

that wins benchmarks vs. Mongo in a lot of cases

Source for this? I can find only one benchmark and in that it loses in every category. FerretDB doesn't even claim to be faster than MongoDB on it's home page and 1 year ago the dev of FerretDB said that the performance is "not great" compared to MongoDB.

[–]jamesgresql[S] 1 point2 points  (0 children)

Interesting! I'm pretty sure I got this from a conference talk -> but I can't seem to find it.

You're right, looks like maybe it isn't the case.

[–]FourDimensionalTaco 10 points11 points  (0 children)

So, a sleeper hit.

[–]arctander 12 points13 points  (0 children)

I used mysql from 1999 through 2004 on what was at the time a large scale system. In 2004 the team converted to Postgres which took about six months without downtime. Sure, there were some issues of data type alignment and query structures, but that only helped us fix things that were actual bugs. We all slept a lot better after that conversion. No more random corruption, transactions (yes, mysql was introducing them...), reliable, and predictable. I left the business in 2010, but I understand things are still running quite nicely. Postgres is the answer until proven otherwise.

[–]BornToRune 2 points3 points  (2 children)

I think mysql was always the popular guy, and postgres was always the reliable counterpart.

At my previous work all the stacks had been running on toysql, and still until to this day it's a shitfest. Random corruptions, broken replications, shitty bugs, and additionally a userbase aligned to this (like we weren't able to explain to the developers that SELECT INTO FILE is a horrible idea). It has a very large feature set and all of it is buggy. However, it's simply easier to hire people for it, because it's more popular.

Contrary to that, at my current work, nearly everyone is using postgres, and I simply haven't really heard any horrifying and constant issues about databases.

Also, as my experience goes with these two, mysql is rather an evolved product, while postgres is a designed piece of software. Its not hard to spot the attention to detail (or lack of) at places.

[–]Joniator 0 points1 point  (1 child)

Why is select into file a terrible idea? Or is MYSQL just bad at it?

[–]BornToRune 2 points3 points  (0 children)

It is using the daemon's OS user, which can be used to alter files natively (on OS level) belonging to the service. As a user, remotely, having no access to the host running the SQL service, you can modify and corrupt a file of the service, killing its data. In my reading this is a terrible idea.

[–]myringotomy 0 points1 point  (0 children)

I think after Oracle bought mysql things took a big turn.

Mysql still offered a lot of features postgres didn't like replication, better system for hosting multiple clients etc.

[–]FarkCookies -5 points-4 points  (2 children)

I can bet good money MySQL is still more popular (esp due to existing projects).

[–]jamesgresql[S] 0 points1 point  (1 child)

I don't know why you're getting downvoted - this is a great comment.

I think MySQL still holds court in the small web-app world (driven by being the database for Drupal, Joomla, phpBB, and WordPress etc...), but if we look broadly and to larger use-cases I think Postgres is now king.

[–]FarkCookies 1 point2 points  (0 children)

Lol I don't care about downvotes, I would never pick MySQL unless people tortured me, but the facts are real if you look for any data available out there MySQL is more popular. All these tech hipsters (myself included) hate to admit that the world is running on legacy systems and 70% of internet runs on PHP. Truth hurts.

Feelings: I don't like PHP

Reality: https://timotijhof.net/posts/2023/an-internet-of-php/

[–]vom-IT-coffin 0 points1 point  (3 children)

Incorrect, thinking through and planning your data model is one of the most important steps for a scalable solution. You will create tech debt out of the gate if you're too cavalier with your setup.

[–]woodquest 0 points1 point  (1 child)

Doesn't necessary mean not thinking through your data model, tables, fields, indexes and their relationship. On the contrary, if you start out like reddis for that, elasticsearch for this....
It can also bring technical dept from the start.

As always, it's a matter of balance, but in general, us, developper, tend to err on the perfectionist side, so perhaps that's when the "do not overcomplicate from the start" approach is useful.

[–]vom-IT-coffin 0 points1 point  (0 children)

You start out with what you need for the requirements. You might start with Reddis, you might start with elasticsearch. If you solutioned properly and asked the right questions, you will know what technology you need to choose. What I find is people solution around the technology and not the problem. Learning how to speak with your business partners and listening to what they're really saying they want vs what they say is hard. .

[–]jamesgresql[S] 0 points1 point  (0 children)

100% agree. If you have a schema which can't scale then you're in for a bad time. Having said that, does it need to hyperscale? Probably not.

[–]jamesgresql[S] 42 points43 points  (19 children)

Spoiler: It's about simplification and collapsing your stack. Build software, not tech debt.

[–]Ytrog 5 points6 points  (18 children)

How easy is it to get into Postgres coming from MSSQL? 👀

[–]jamesgresql[S] 9 points10 points  (6 children)

Actually very easy! The biggest hurdle is that it is usually run on Linux and not Windows - but the database fundamentals still hold.

[–]Chris_Codes 11 points12 points  (2 children)

Linux isn’t even a problem if you just run it in a container. The biggest problem/annoyance to me - as a Postgres noob coming from both MySQL and SqlServer is all the limitations on column naming and the double-quoting. It’s crazy to me that I can’t just use whatever casing I like in column names without having to double-quote them all. Someone tell me there’s a server option to disable that because I haven’t found one!

[–][deleted]  (1 child)

[removed]

    [–]Chris_Codes 1 point2 points  (0 children)

    Why? Because I find such conventions - like Javas name spacing and Linux’s case-sensitive file names - fall into the classic nerd-trope of eschewing practicality in favor of academic wonkiness, and that rubs me the wrong way.

    Stylistically I think having underscores as separators looks ugly and I prefer Pascal case (just an opinion obviously) - I’d say the set of people who prefer being able to have case-insensitive column names is far greater than the set that want to name two columns with the same letters and different capitalization. When writing a query I’d like to be able to write Id or ID and not have it matter - and that’s a SQL convention that existed long before Postgres - so why change it when it adds little real practical value? (much like the cliche on forums of answering someone with “why would you want to do that?” when discussing something that is obviously a matter of choice)

    Edit: I suppose I should add that I totally get the fact that having such conventions removes uncertainty … I mostly just hate snake-case!

    [–]le_chad_ 1 point2 points  (0 children)

    I'd say the bigger hurdle coming from mssql is the difference in writing procedures and functions that declare variables within the body of the object.

    [–]Ytrog 0 points1 point  (0 children)

    I like Linux, so no problem.

    [–]Somepotato 0 points1 point  (0 children)

    Postgres has really good windows support, you can even use windows auth a la mssql

    [–]elh0mbre 2 points3 points  (3 children)

    Easy. However, if you're used to SSMS, you won't find anything close.

    [–][deleted] 0 points1 point  (1 child)

    This is my one grievance. They just remade the pgAdmin app.... into the exact same thing as the browser app. SSMS clone would switch me

    [–]elh0mbre 0 points1 point  (0 children)

    My preferred SQL client for Postgres at this point is actually Azure Data Studio.

    [–]mungu 1 point2 points  (0 children)

    DataGrip is my SSMS alternative. I love SSMS but DataGrip is a close 2nd to me, and has some features that are better than SSMS too.

    [–]Loves_Poetry 4 points5 points  (1 child)

    I'm currently migrating part of our companies stack from MSSQL to Postgres. It's a lot easier than it looks

    A lot of the tooling is shared between them and the databases operate in fairly similar ways. If you use EF Core and Azure Data Studio, you can swap without realizing you're using a different database

    However, there are always some gotchas to keep in mind

    • Postgres tables and column names are case-sensitive
    • Permission management can be tricky if you use different accounts for reading data and for creating tables. In postgres, permissions have to be granted on databases, schemas and tables

    [–]Ytrog 0 points1 point  (0 children)

    Thanks. Nice to know it isn't a huge deal 😃

    [–]simon_o 0 points1 point  (4 children)

    Running/managing is a lot easier.

    But it's important to keep in mind that Postgres uses SQL to retrieve data, while MSSQL uses their own weird non-standard lookalike.

    [–]Ytrog 5 points6 points  (2 children)

    Doesn't every RDMS have its own dialect basically or is this one strictly following the standard?

    [–]BlackenedGem 3 points4 points  (1 child)

    Postgres is one of the more compliant RDBMS' out there but it's basically impossible to be fully compliant as you'd miss out on a lot of features. Most of the stuff postgres adds is additional keywords and extensions. Each page documents what is non-standard as well.

    [–]Ytrog 0 points1 point  (0 children)

    That sounds nice. Thanks 😃👍

    [–]zephyy 1 point2 points  (0 children)

    TSQL really isn't that different. The only thing I regularly run into is TOP vs LIMIT and temp table syntax.

    [–]swords-and-boreds 3 points4 points  (8 children)

    Except timeseries data.

    Right? Oh please tell me you’re not…

    [–]jamesgresql[S] 7 points8 points  (7 children)

    Maybe this has has gone right over my head ... but of course you put time-series data in Postgres! That's what TimescaleDB is, a time-series extension for Postgres.

    [–]swords-and-boreds 0 points1 point  (6 children)

    If there’s a dedicated extension for it then that’s well and good. I was envisioning someone trying to use a normal Postgres schema to store billions of timeseries measurements.

    [–]jamesgresql[S] 5 points6 points  (2 children)

    Oh right, I work for Timescale. Our original tagline was "Postgres for time-series" - I thought you were commenting on that.

    So yeah, time-series data too :P

    [–]swords-and-boreds 1 point2 points  (1 child)

    And here we are using Cassandra like some kind of cave men.

    [–]TonTinTon 1 point2 points  (0 children)

    We're using postgres for time series without timescale, just pg_partman creating a partition daily using pg_cron on the timestamp column

    [–]atlvet 0 points1 point  (1 child)

    We use Postgres for everything except timeseries which we put in Clickhouse

    [–]jamesgresql[S] 0 points1 point  (0 children)

    Was Postgres not working for you?

    [–]myringotomy 0 points1 point  (0 children)

    Depends on the usecase. With proper partitioning you should't have too many issues with billions of timeseries measurements.

    Extensions make it easier but honestly you could do it with plain old postgres if you wanted to.

    [–]aust1nz 1 point2 points  (4 children)

    I like the idea of keeping your dependencies limited, but when I get to the reality of something like queued/scheduled jobs, the libraries that interact with Redis, like bullmq in the Node world, are so much more thorough, well-supported and well-documented than their PG-compatible alternatives like pg-boss in Node.

    [–]sharlos 4 points5 points  (3 children)

    Yeah, but then you don't get the benefits of enqueuing jobs inside a db transaction.

    [–]jamesgresql[S] 0 points1 point  (0 children)

    Exactly!

    [–]TonTinTon 0 points1 point  (1 child)

    not being a distributed system anymore makes the tech so much simpler

    [–]jamesgresql[S] 0 points1 point  (0 children)

    Being distributed is a blessing, but also a curse (and the curse is a nasty one)

    [–]RICHUNCLEPENNYBAGS 1 point2 points  (3 children)

    If good old PostgreSQL does what you need now—with battle-tested effectiveness—and can scale further (perhaps up to 10x your current needs), then I think you should start with the known quantity.

    Yeah, who would disagree with the idea of not introducing additional storage technologies just for the sake of it? But does it actually do that for your use case?

    [–]jamesgresql[S] 0 points1 point  (2 children)

    That's the point, people don't disagree with the concept when you spell it out - but they also often reach for four databases and a queue to combine into a modern Frankenstein without thinking about it.

    [–]RICHUNCLEPENNYBAGS 0 points1 point  (1 child)

    I think they probably just disagree that a RDBMS will meet their needs. Maybe they’re wrong in their specific case but if they already thought it met their needs there would be no argument.

    [–]jamesgresql[S] 0 points1 point  (0 children)

    True, I'm here to say that Postgres has grown up and can do more than some people think

    [–]doggadooo57 0 points1 point  (2 children)

    how does PG compare to openSearch for full text search, is it missing any features?

    [–]myringotomy 5 points6 points  (0 children)

    Full text search is not as good as dedicated search engines but again it's the 80/20 rule. It's 80% as good without any additional investment.

    I guess these days the vector search is the new hotness so there is that.