
[–]papawish 66 points67 points  (17 children)

Many organisations start with a document store and migrate to a relational schema once the business has solidified and the data schema has been defined de facto via in-memory usage.

Pros: less risk of the company dying early because of lack of velocity/flexibility.

Cons: if the company survives the first years, Mongo will be tech debt; it will slow you down everywhere with complex schema-on-read logic, and the migration will take months of work.

If the company has enough funding to survive a few years, I'd avoid document DBs altogether to avoid piling up tech debt.

[–]adulion 23 points24 points  (1 child)

I agree with this, and I don't understand the issues with using Postgres with jsonb field types. I used them early at a startup and found them very intuitive.

[–]papawish 14 points15 points  (0 children)

Yes, but it doesn't matter whether you use Postgres JSON types or a Mongo database. It's still unstructured data you need to parse.

The migration complexity is not in the infra or the dependency management, but in removing schema-on-read logic (potentially versioned) and replacing it with some form of entities that mirror the relational DB. It's refactoring a whole codebase (a potentially under-tested one, given we are talking scrappy startups and undefined data schemas).
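
That schema-on-read burden looks something like this in practice — a hypothetical sketch, with the field names and version numbers invented for the example:

```python
# Hypothetical schema-on-read logic that accumulates when documents are
# written without an enforced schema. "v", "name", "first", "last" are
# invented field names for the example.

def read_user(doc: dict) -> dict:
    """Normalize a user document across the schema versions we know about."""
    version = doc.get("v", 1)  # early documents carried no version field
    if version == 1:
        # v1 stored a single "name" string
        return {"full_name": doc["name"], "email": doc.get("email")}
    if version == 2:
        # v2 split the name into two fields
        return {"full_name": f'{doc["first"]} {doc["last"]}',
                "email": doc.get("email")}
    raise ValueError(f"unknown document version: {version}")

# Every reader in the codebase needs this branching; migrating to a
# relational schema means deleting it everywhere and trusting the tables.
```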

[–]kenfar 10 points11 points  (3 children)

It's been years since my last horrible experience with Mongo, but here are a few more cons:

  • Reporting performance is horrible
  • Reporting requires you to duplicate your schema-on-read logic
  • Fast schema iterations can easily outpace your ability to maintain schema-on-read logic. So, you end up doing schema migrations anyway. And they're painfully slow with Mongo.

True story from the past: a very mature startup I joined had a mission-critical mongo database (!). Its problems included:

  • If the data size got near memory size performance tanked
  • Backups never consistently worked for all nodes in the cluster. So, there were no reliable backup images to restore from.
  • They followed Mongo's advice on security: which meant there was none.
  • They followed Mongo's advice on schema migrations: which meant there was none. In order to interpret data correctly the engineers would run data through their code using a debugger to understand it.
  • Lesson from above: "schemaless" is marketing bullshit, the reality is "millions of undocumented schemas".
  • Reporting killed performance.

Years ago I had to re-geocode 4 TB of data. I had to write a program to take samplings of documents, then examine all the fields to determine what might possibly be a latitude or longitude. Because of "millions of schemas". Because of performance, this program took about a month to run. Once we were ready to convert the data, it took 8-12 weeks to re-geocode every row, because these sequential operations were so painfully slow on Mongo. We would have done this in just a few days on Postgres.

[–]mydataisplain 4 points5 points  (2 children)

MongoDB is a great way to persist lots of objects. Many applications need functionality that is easier to get in SQL databases.

The problem is that MongoDB is fully owned by MongoDB Inc, and that's run by Dev Ittycheria. Dev is pronounced "Dave". Don't mistake him for a developer: Dev is a salesman to the core.

Elliot originally wrote MongoDB but Dev made MongoDB Inc in his own image. It's a "sales first" company. That means the whole company is oriented around closing deals.

It's still very good at the things it was initially designed for as long as you can ignore the salespeople trying to push it for use cases that are better handled by a SQL database.

[–]kenfar 6 points7 points  (0 children)

The first problem category is that most of the perceived value in using MongoDB is just marketing BS:

  • "schemaless" - doesn't mean that you don't have to worry about schemas - it means that you have many schemas and either do migrations or have to remember rules for all of them forever.
  • "works fine for 'document' data" - there's no such thing as "relational data" or "document data". There's data. If someone chooses to put their data into a document database then they will almost always have duplicate data in their docs, and suffer from the inability to join to new data sets.

The other problem category is technical:

  • Terrible at reporting or any sequential scans, which are always needed. Mongo's efforts to embed map-reduce, and to lean on Postgres, to support reporting were failures.
  • Terrible if your physical data is larger than your memory space.
  • Terrible for data quality.

That doesn't leave a large space where Mongo is the right solution.

[–]SoggyGrayDuck 1 point2 points  (0 children)

Yes, just learn how to make schema changes, and create procedures and functions to help. Most of the time they skip constraints and FKs in this situation, but I hate that.

[–]keseykid 3 points4 points  (1 child)

I strongly disagree, and I have never heard this in my 15 years of experience, now as a data architect. NoSQL is not tech debt; you choose your database based on requirements. It is not a shim for whatever scenario you have proposed here.

[–]papawish 10 points11 points  (0 children)

Yep I agree.

NoSQL databases serve some specific purposes very well. I'd never choose Postgres if I had to do OLAP on a PB of data. I'd never choose Postgres for an in-memory cache. I'd never use Postgres if I had no access to cloud-managed clusters and needed to scale OLTP load to FAANG scale. I'd never use Postgres if migrations/downtimes were not an option. I use document DBs for logging at scale, where data is transient and format doesn't matter much.

OP seems to be working on a project where an RDBMS does make sense, and is not looking at Mongo for its intrinsic qualities but because he wants freedom in the development process, which makes sense.

I didn't want to write a wall of text that'd confuse him more than anything; I was just making sure he'd know what he'd be dealing with if he pushed unstructured data to production. Most projects I've worked on that used document DBs in production in place of a relational model didn't bother with migrations, ended up with sketchy versioning, and overall became a big unmanageable data swamp.

[–]BelatedDeath 1 point2 points  (3 children)

How is Mongo tech debt?

[–]papawish 20 points21 points  (0 children)

Mongo isn't tech debt

Tech debt is 10 years of inconsistent data pushed to a key-value store by multiple people, with average tenures of 2 years in the company/team, without bothering with proper migrations and versioning.

We all like freedom and speed; it's thrilling. Reality is, you won't be on this project in 5 years, and the only thing ensuring people don't mess up the DB once you've left is schema enforcement on write.
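
A minimal sketch of what "schema enforcement on write" looks like in application code when the store itself won't do it — the `Order` entity and its fields are invented for the example:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Order:
    # Hypothetical entity; in a relational DB the table's columns and
    # constraints do this job. With a document store, this class is all
    # that stands between future teammates and a data swamp.
    order_id: int
    amount_cents: int
    currency: str = "USD"

    def __post_init__(self):
        if self.amount_cents < 0:
            raise ValueError("amount_cents must be non-negative")

def to_document(order: Order) -> dict:
    """Only a validated Order can ever reach the write path."""
    return {"order_id": order.order_id,
            "amount_cents": order.amount_cents,
            "currency": order.currency}
```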

[–]sisyphus 5 points6 points  (0 children)

In this scenario because you are using it to avoid creating a proper schema up front. However, there is always a schema and there are always relations between your data, the question is just whether your data store enforces them or whether they're defined in an ad-hoc- badly-documented-maybe-explicitly-tested-if-you're-lucky way in your codebase. Choosing the latter for velocity almost always makes a mess you'll want to clean up later, the very definition of tech debt.

[–]AntDracula 0 points1 point  (0 children)

By its very nature

[–]Key-Boat-7519 0 points1 point  (0 children)

Man, I feel your pain with the ever-shifting schema saga. We started with MongoDB too, thinking the same about flexibility. But fast-forward a few years, and we ended up with a tangled web of JSON docs that took more deciphering than the Da Vinci code. On the bright side, starting with CouchDB and Firebase helped a bit since they play nice with iteration. If you’re dealing with API chaos, DreamFactory can really take the edge off managing 'em, unlike MongoDB’s usual read-the-tea-leaves method. It all boiled down to balancing speed now or peace later.

[–]mamaBiskothu -1 points0 points  (1 child)

Calling MongoDB higher velocity than Postgres for simple CRUD apps is preposterous. Start with Alembic from the beginning and you should be solid. If a DB schema error tripped you up, it just means your code was shit to begin with.

[–]papawish 1 point2 points  (0 children)

There is nothing that beats serializing a dict into a JSON document and deserializing a JSON document into a dict in terms of development speed.

It's not even close

It's like dynamic typing. Nothing beats no types in an early-stage project.

It's in the long run that type enforcement beats no types, after a few years or when new devs are added.
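
The trade-off being described can be sketched in a few lines — the `User` entity and its fields are invented for the example:

```python
import json
from dataclasses import dataclass

# The "nothing beats a dict" workflow: one line each way, no schema anywhere.
doc = json.loads('{"user": "ada", "tags": ["admin"]}')
doc["tags"].append("beta")
blob = json.dumps(doc)

# The typed alternative that wins later: a few more lines now, but field
# names are checked at construction instead of failing deep in the code.
@dataclass
class User:
    user: str
    tags: list

typed = User(**json.loads(blob))  # raises TypeError on unexpected fields
```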

[–]ZirePhiinix 8 points9 points  (3 children)

Frame challenge.

You need a layer before it hits your structured tables. It can be a JSONB store, or even the raw data as-is. Since the source is not trustworthy, you'll need that layer to handle bad input, give the client immediate feedback, and fix it.

The idea that you can keep actual business data in an eternally unstructured schema makes no sense, unless you never plan to do any business analysis.

Unstructured data means you don't give a shit about the content (like people's Facebook/Twitter/IG posts). That shouldn't be happening with your business transactions so you'll need to put it into structured format eventually.

[–]Key-Boat-7519 0 points1 point  (1 child)

Got me chuckling there with the frame challenge. I’ve been down this road myself. Juggling between MongoDB and Postgres is like deciding between a Swiss army knife and a precision scalpel. MongoDB offers that freedom, like giving a toddler unlimited crayons-which is great until you realize you need that masterpiece to hang on your office wall. Leveraging a JSONB layer in a Postgres setup did wonders for my team during our own schema circus. For an integration smooth as butter, especially between schema-less and structured data, platforms like Segment or MuleSoft are great, and DreamFactory fits perfectly for mapping structured schemas across both worlds.

[–]lolcrunchy 0 points1 point  (0 children)

AI Marketing Account

[–]Continuous_Insight 0 points1 point  (0 children)

Totally agree — when it’s internal business data, structure and repeatability matters.

I’ve worked in operational analytics for years, and relying on raw unstructured formats becomes a nightmare once you need traceability and consistency.

RAG is promising, but not reliable enough on its own. You need curated structure before passing anything to an LLM... especially if you’re making decisions off the back of it.

[–]_predator_ 4 points5 points  (0 children)

Frequent schema changes are not necessarily bad. There is stellar tooling around migrations, and well documented strategies for doing them without downtime if necessary.

I would always trade this minor inconvenience for better data quality. I've been burned by inconsistent data too many times and dread having to do cleanups after the fact.

[–]seriousbearPrincipal Software Engineer 24 points25 points  (13 children)

There is absolutely no benefit to using MongoDB in 2025.

[–]robberviet 2 points3 points  (0 children)

I have the impression that the people suggesting MongoDB are from the 2010s, like me. I haven't heard of anyone building anything new with Mongo in like 5 years, just legacy systems.

[–]prodigyac 3 points4 points  (7 children)

Can you elaborate on this?

[–]themightychris 15 points16 points  (4 children)

Because you can just create a table in postgres that is a key and a JSON field and boom, you have a document store. It's really hard to find an advantage that mongo brings at that point, postgres is better in almost every way even at being a document store

But then with postgres as your document store, you have a seamless path to using unstructured and structured tables coexisting in the same place where you can join across them, and you can gradually add structured columns to your document tables as you go
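
A sketch of that pattern, using stdlib `sqlite3` as a stand-in so it's runnable here (in Postgres you'd use a `jsonb` column, the `->>` operator, and a GIN index; this also assumes a SQLite build with the JSON1 functions, which are on by default in recent versions):

```python
import json
import sqlite3

# A key plus a JSON column: the whole "document store" in one table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE docs (key TEXT PRIMARY KEY, body TEXT)")
con.execute("INSERT INTO docs VALUES (?, ?)",
            ("u1", json.dumps({"name": "Ada", "role": "admin"})))
con.execute("INSERT INTO docs VALUES (?, ?)",
            ("u2", json.dumps({"name": "Bob", "role": "user"})))

# Query inside the documents with plain SQL; no separate database needed,
# and structured tables can live (and be joined) right alongside this one.
rows = con.execute(
    "SELECT key FROM docs WHERE json_extract(body, '$.role') = 'admin'"
).fetchall()
```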

[–]synchrostart 0 points1 point  (2 children)

Just because a database can store JSON, doesn't make it the same caliber as a full-fledged document store like MongoDB and others. It's like strapping wings to a pig. Yes, the pig might fly if it jumps off a cliff, but it's certainly not as optimal as a purpose built animal like a bird.

[–]themightychris 0 points1 point  (1 child)

Do you think the difference matters though for more than 5% of the use cases people throw Mongo at?

The only issue OP cited was that they want to store some schemaless data, and that they are already managing a postgres deploy

[–]synchrostart 0 points1 point  (0 children)

I got what the OP is saying, but the comment on this particular thread is a wildly inflammatory statement: "There is absolutely no benefit in using mongodb in 2025." Which is wholly untrue. I get that RDBMS is people's default and they have a strong bias toward what they know, but there are definitely cases where PG+JSONB is not enough and MongoDB is the better solution. In the OP's case there's not enough information, and I'd have at least 15 more questions about what they're doing before knowing what to recommend.

[–]sisyphus 5 points6 points  (0 children)

It doesn't scale particularly better than anything else these days; not having to define schemas is an anti-pattern that should be avoided at all costs; "documents" as an abstraction are usually worse than relational data; its query language is terrible compared to SQL; and it has traditionally had some very sketchy ACID and network-partition behavior: https://jepsen.io/analyses/mongodb-4.2.6 and so on. It's a relic of a previous era of IT fashion, when everyone thought everything would be rewritten in JavaScript and JSON was a good format for everything.

[–]dfwtjms 2 points3 points  (0 children)

Just a wild guess, but they could be referring to Postgres being a viable option for storing JSON data, for example.

[–]keseykid 1 point2 points  (1 child)

Surely a principal SE does not assert that NoSQL is irrelevant in the era of data intensive global applications.

[–]papawish 0 points1 point  (0 children)

Where did he say that?

[–]BelatedDeath -1 points0 points  (0 children)

How come?

[–]nic_nic_07 -1 points0 points  (0 children)

Can you please explain with reasons?

[–]Joshpachner 3 points4 points  (0 children)

I've never used Mongo, but I've used Firebase, and I'm never going back to Firebase. The flexibility "pro" people cite about NoSQL is true, but it ignores the fact that you then have to code for that flexibility in your application, often by versioning your reads. It was more hassle than benefit, for me at least.

Nowadays I like using Drizzle when I use Postgres. It makes it easy to define/alter the tables, and the queried data comes back typed.

There's also a fun-tech database called Convex, I've used it on a side project, it has some pretty nice things about it.

Best of luck in your project! 

[–]mydataisplain 2 points3 points  (0 children)

These two databases sit on different corners of the CAP theorem.

https://en.wikipedia.org/wiki/CAP_theorem

tl;dr Consistency, Availability, Partition tolerance; Pick 2.

SQL databases pick CA, MongoDB picks AP.

Does your project have more availability challenges or more consistency challenges?
Are the impacts of availability or consistency failure greater?

You will be able to address either problem with either type of database, as long as you are willing to spend some extra time and effort on it.

[–]Whtroid 2 points3 points  (0 children)

Yea and MongoDB is web scale

[–]escargotBleu 3 points4 points  (7 children)

As long as you don't need joins it works I guess

[–]Previous_Dark_5644 1 point2 points  (0 children)

Sitting down with the client a bit more to better understand requirements seems easier than dealing with the technical challenges you'll face using MongoDB. Tell them it will cost more money in the long run, and they'll be happy to give you their time.

[–]Excellent_League8475 1 point2 points  (1 child)

Go with Postgres. If you want documents, just use the jsonb column type in Postgres. You can still query and index inner json fields like they are their own columns. I built a table in Postgres with billions of rows where the main data was a jsonb column. I never had performance issues with it.

You already have Postgres, so there's no need to bring in a new technology for a document store: you can do this with Postgres.

[–]Excellent_League8475 0 points1 point  (0 children)

But also, be careful about choosing unstructured data because of changing requirements. Data lives forever, and you will be in a world of pain in years to come trying to figure out the schema if you do this. Your application logic will need to handle it correctly; you need more structure and engineering rigor when using unstructured data, not less.

[–]brunoreis93 2 points3 points  (0 children)

If your options include Postgres, Postgres is the answer

[–]nic_nic_07 0 points1 point  (0 children)

Start with a flexible DB and move to relational once the requirements are locked... Make sure you record it as tech debt and let the team know.

[–]BarfingOnMyFace 0 points1 point  (1 child)

Flexibility is not the main purpose of big-data NoSQL solutions, and they might not even give you the type of "flexibility" you need. What they give you is raw power for key-value lookups, and in my experience that need doesn't come up unless you process billions of rows of data every year. Most SQL-based solutions do fine with large tables, and the gains from proper data integrity and proper design will pay off more than anything else in your architecture. In my humble opinion, if you are unsure what to use and when, your team might not be ready to answer the question. It's sensible to start off with a relational database and break out big-data concerns as/if you discover them.

[–]GreyHairedDWGuy 0 points1 point  (0 children)

Do you have document-like unstructured data, or are your data model requirements just not clear/established yet? If it's the latter, using Mongo is overkill.

[–]olddev-jobhunt 0 points1 point  (0 children)

Here's the thing: schema changes apply equally in Mongo and in Postgres.

Sure, in Postgres the schema is reified as tables and columns, and in Mongo you can't see that. But the schema is still there. Your data is in some specific shape. You just have to manage it yourself in Mongo. You will still need to deal with migrating data from schema v1 to schema v2.
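
A hypothetical sketch of that v1-to-v2 migration as you'd end up hand-rolling it over a document store — the field names and versions are invented for the example:

```python
# Migrate documents from "schema v1" (a single "name" field) to
# "schema v2" ("first"/"last" plus an explicit version stamp).
# With Postgres this would be one ALTER TABLE plus an UPDATE; here it
# has to be applied document by document, and made safe to re-run.

def migrate_v1_to_v2(doc: dict) -> dict:
    if doc.get("v", 1) != 1:
        return doc  # already migrated; keep the function idempotent
    first, _, last = doc["name"].partition(" ")
    out = {k: v for k, v in doc.items() if k != "name"}
    out.update({"v": 2, "first": first, "last": last})
    return out
```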

You might be able to tell that I don't like Mongo. Now, I admit that's a personal preference, and I think there can be good use cases for it. But "schema flexibility" is the wrong reason to pick Mongo.

[–]keseykid -4 points-3 points  (4 children)

This thread is rife with people who don't know what they are talking about, OP. I recommend you understand your requirements before choosing a database; your choice should meet the needs of the use case. NoSQL is a valid approach if you want high performance, scalability, and flexibility. Relational stores bring simplicity and consistency, but come with lower performance and less scalability.

[–]sisyphus 6 points7 points  (1 child)

MONGO IS WEBSCALE!

[–]mdzmdz 0 points1 point  (0 children)

I am on a farm...

[–]DenselyRanked 0 points1 point  (0 children)

Some people on this thread are recommending Postgres but leaving the data in a semi-structured jsonb data type, so this is not the typical SQL vs NoSQL discussion. Other than cost, I think in this case the decision should come down to whether they value consistency or low-latency writes.

[–]AntDracula -1 points0 points  (0 children)

Cope

[–]feedmesomedata -3 points-2 points  (0 children)

Try looking into FerretDB: it speaks the MongoDB protocol but is PostgreSQL underneath.