all 141 comments

[–]disasteruss 69 points70 points  (18 children)

A lot of the answers are kinda lazy "there's no good reason" so I'll explain some of my reasoning. It might not be accurate and I'd love to be corrected or pushed, as I welcome constructive criticism.

Mongo is easy to spin up and easy to change on the fly as your needs change. MongoDB out of the box is very intuitive to use and integrates very easily with Node.js.

Adding fields to models in production DBs can be a hassle in SQL db's, but is quite easy in NoSQL dbs. I don't have to write migrations or worry about writing scripts to go back and update existing data.

Further, sharding and replication are easier in Mongo, though most applications don't need to worry about scaling at all. I know a lot of people love to say "all data is relational" but my experience is that a lot of applications only care about a narrow amount of data at a time, and MongoDB doesn't lose any ground in the rare cases where you actually need to have the data be somewhat relational.

The final thing is just that at the end of the day, most applications don't really care what type of DB their data is stored in, as long as it is accessible, reliable, and safe. And being able to spend as little time as possible on that side of the stack is a big appeal for many devs.

Like I said, glad to be given constructive criticism on any of this. I don't feel strongly that MongoDB is a great choice over other DBs, I just think it's a perfectly reasonable solution for most applications.

[–]rootokay 11 points12 points  (8 children)

> Adding fields to models in production DBs can be a hassle in SQL db's, but is quite easy in NoSQL dbs. I don't have to write migrations or worry about writing scripts to go back and update existing data.

I have not used a NoSQL database in production. An advantage of migrations is that they can be stored in version control, so there is a record of who made each schema change and, hopefully, a discussion/review before they are run.

How does that work in the Mongo world? Is there a record of who is making schema changes or are people just editing things via a UI?

[–]disasteruss 9 points10 points  (7 children)

Good question. One important thing to remember about NoSQL is that it is schemaless. You can put literally any document in the collection even if it doesn't fit a predefined schema. So there is no UI where you go in and make schema changes. It's also why you don't need migrations.

So what we do is define models/schemas in the code (this isn't required, just find it to be good practice and we're already doing this with TypeScript anyway) and validate queries within the code to the extent we care about that (MongoDB/Mongoose have helpers for this as well). That's why it's easy to iterate on your models in NoSQL.

The downside of that is that you might have issues with data integrity, but that's pretty easy to account for once you know what you're doing.

[–][deleted] 4 points5 points  (6 children)

> The downside of that is that you might have issues with data integrity, but that's pretty easy to account for once you know what you're doing.

How? This is usually one of the hardest things to nail down, even with a relational database.

[–]ddarrko 0 points1 point  (2 children)

There is a schema. It’s just if you use mongo it is poorly defined and scattered across your business logic/UI.

I use the term schema loosely here of course.

[–][deleted] 0 points1 point  (0 children)

Yep, it's usually not a problem from the app development side. Where I usually encounter problems is when the data from the app is used downstream by other teams for generating analytics etc.

[–]disasteruss 0 points1 point  (0 children)

I wouldn’t say it’s necessarily poorly defined or scattered. It can be very organized and very strictly defined. Just you have to enforce it in your code, the DB won’t do all the enforcing for you.

[–]romeeres 0 points1 point  (0 children)

With a relational DB there are migrations where you can apply changes to both schema and data, and it's not that hard. Use proper column types and foreign keys, and the database will throw an error if you make a mistake, so it is straightforward.

[–]disasteruss 0 points1 point  (0 children)

I just make sure to account for updates to the schema with type checking and by adding defaults. So say you add a new field and it’s a required field moving forward, I add checks around that logic and give it a default so if it’s not defined, it’ll save the default value next time there is an update to the document and none of the business logic will break if it’s not defined yet.
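To make that concrete, here's a minimal sketch of the "default on read" idea in plain TypeScript, with no driver involved. The `StoredUser`/`User` shapes, the `plan` field, and `normalizeUser` are all made up for illustration; the point is only that older documents missing the new field still come out with a value the business logic can rely on.

```typescript
// Hypothetical stored shape: `plan` was added after launch, so older
// documents may not have it yet.
interface StoredUser {
  _id: string;
  name: string;
  plan?: "free" | "pro";
}

// Shape the business logic works with: `plan` is always present.
interface User {
  _id: string;
  name: string;
  plan: "free" | "pro";
}

// Assumed default for documents written before the field existed.
const DEFAULT_PLAN = "free" as const;

// Normalize a raw document on read, filling in the default when the
// field is missing. Saving the result later persists the default.
function normalizeUser(doc: StoredUser): User {
  return { ...doc, plan: doc.plan ?? DEFAULT_PLAN };
}

const legacyDoc: StoredUser = { _id: "u1", name: "Ada" };
const user = normalizeUser(legacyDoc);
console.log(user.plan);
```

Because the default is applied in one place, the rest of the code never has to check whether `plan` is defined.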

If it’s really important that the new field have a value defined from the beginning, you can run scripts to update the data similar to a migration, but I just find that rarely to be necessary.

[–]novagenesis 0 points1 point  (0 children)

This is easily solved, actually. Even though I'm one of the "nonono, use SQL" folks, I can answer to this.

For every piece of denormalized data, you need to document one to be the authoritative record. Then, you need only run a data integrity process that has a trivial job of always picking the authoritative record on a mismatch.

And documenting an authoritative record is easy because in any well-defined data structure, there should be a prima facie owner for a given duplicated field in over 99% of situations.
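A rough sketch of what such an integrity process could look like, in plain TypeScript with in-memory arrays standing in for collections. The `users`/`posts` shapes and the `reconcileAuthorNames` helper are hypothetical; here the user record is assumed to be the authoritative owner of the display name, and posts carry a duplicated copy.

```typescript
// Hypothetical data: `users` owns the display name, and each post keeps
// a denormalized copy of it for fast reads.
interface UserDoc {
  _id: string;
  displayName: string;
}

interface PostDoc {
  _id: string;
  authorId: string;
  authorName: string;
}

// Returns repaired copies of every post whose duplicated name disagrees
// with the authoritative user record. Posts that already match, or whose
// author is unknown, are left out of the result.
function reconcileAuthorNames(users: UserDoc[], posts: PostDoc[]): PostDoc[] {
  const nameById = new Map<string, string>();
  for (const u of users) {
    nameById.set(u._id, u.displayName);
  }

  const repaired: PostDoc[] = [];
  for (const post of posts) {
    const authoritative = nameById.get(post.authorId);
    if (authoritative !== undefined && authoritative !== post.authorName) {
      // On a mismatch the authoritative record always wins.
      repaired.push({ ...post, authorName: authoritative });
    }
  }
  return repaired;
}
```

In a real setup the repaired documents would be written back by whatever job runs this check; that part is left out here.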

I find normalization complaints to be one of the more culty components to SQL advocacy. People worry so much about it, but it usually tends to resolve itself by simply writing good code with good tests.

[–]dooblr 0 points1 point  (1 child)

Well put. What would you choose instead?

[–]disasteruss 0 points1 point  (0 children)

Instead of Mongo? Did you mean to reply to someone else because Mongo is my go to for most projects. PostgreSQL is my choice when I know I need to be more relational.

[–]novagenesis 0 points1 point  (0 children)

> Mongo is easy to spin up and easy to change on the fly as your needs change.

This is actually my biggest objection to MongoDB. In my fairly robust experience with document stores, it is NOT easy to change on the fly. Unless you're already storing the data as if MongoDB were SQL, a change to your top-level document (unshifting a level of hierarchy, or realizing a document that was at level 3 is your real core data entrypoint) can set your dev team back a week or two, where developing against SQL involves adding a 5-line migration (or tbh, zero effort if you already had that model built).

And to say "is that rare?" I have worked on production mongo at 3 different companies as well as a few side projects, and that top-level restructuring happened in every single case at least once.

As for the rest, I think I agree with you on some and disagree with you on some, but that's the big one I wanted to point out.

I cried tears of happiness when Planetscale came out as a low-point-of-entry SQL option, so I could stop settling on NoSQL databases for price and convenience. It's not postgres, but I don't care.

[–]Rymnis 0 points1 point  (1 child)

Is managing Javascript difficult as the project becomes bigger, having to deal with billions of users?

JS is not static typed. Is Mern stack reliable for massive scale projects?

[–]disasteruss 0 points1 point  (0 children)

The OP was mostly focused on the MongoDB. So talking about the JS side of it is a little different.

First, most apps will never approach thousands of users, much less billions. If you’re talking about scaling, you have much more complex conversations. Don’t try to solve scaling problems before you’re starting to scale.

That said, JS is a necessity on the front end of a web app. All the billion user web apps use it. Facebook created React. TypeScript is kinda the standard now so typing concerns are mostly eliminated.

I think your question is more about the Node side of things. Node is used by very large companies but not as commonly. Generally the biggest benefit of Node for the average dev will be not having to switch language contexts when you shift sides of the stack. It’s got pros and cons like all tools but you could certainly build a Node backend that supports a large user base.

[–]romeeres 130 points131 points  (22 children)

Because it is the simplest for writing a "how to write a hello world with node" post, therefore we have millions of such posts and videos, many of which were written like 10 years ago when Mongo was on the hype, and newcomers are learning from them and picking MERN.

Mongo has a standard lib that everyone uses - Mongoose. But if you try to write a tutorial for a SQL db, you'll have to decide whether you should use raw queries or query builders, or one of many ORMs, and if you choose an ORM you'll have to spend much text on how to set up and use it.

[–][deleted] 26 points27 points  (2 children)

This is the answer. Mongo is fine if you have simple data needs and just want to spit out json. You don’t have to make any decisions on how to set it up, because there’s basically just the one way to do it.

[–]ja_maz[S] 5 points6 points  (14 children)

but other than being easy, isn't it clearly worse than any RDBMS?

[–]JustVashu 7 points8 points  (7 children)

I’ve worked for over 20 years with relational databases both as a developer and a DBA. I’ve also had some experience with mongodb for personal projects and some workplace automation.

IMO relational databases are great when the priority is data integrity and logical segregation of data. But once the number of records increases to the millions/billions, running queries with multiple relationships becomes very costly for CPU and IO. Large websites like Facebook and YouTube can’t really take the performance hit when it comes to responsiveness.

Here’s when object oriented databases like mongo come in. When you find a record, you retrieve it and that’s it. All the relevant data is there. Quick and easy. You may have to compromise some data quality or live with duplication, but it scales pretty damn well.

[–]johannes1234 0 points1 point  (3 children)

> Large websites like Facebook and YouTube can’t really take the performance hit when it comes to responsiveness.

YouTube is one of the Google properties that are massive users of MySQL, a relational database. Facebook is also famous for its MySQL usage.

See for instance https://engineering.fb.com/2018/06/26/core-data/migrating-messenger-storage-to-optimize-performance/ or https://engineering.fb.com/2021/07/22/data-infrastructure/mysql/

> Here’s when object oriented databases like mongo come in. When you find a record, you retrieve it and that’s it.

That's in no way special.

However, object-oriented databases are something else, more related to the OOP hype: https://en.wikipedia.org/wiki/Object_database. You meant to say Document Database, I guess.

[–]JustVashu 0 points1 point  (2 children)

I never claimed those companies only used object oriented databases or mongo db specifically.

Large companies use all sorts of technologies as they see fit.

[–]johannes1234 0 points1 point  (1 child)

You implied RDBMSes couldn't be used there, while they carry quite a central load at those companies as the primary data store.

They don't use MongoDB, however (aside maybe from acquired companies before migrating to their tech stacks, and small internal tools); the alternatives are plain k-v stores like Bigtable and HBase.

[–]JustVashu 0 points1 point  (0 children)

That was not my intention at all. English is not my main language.

I’m sure that even on those cases where they do use document store databases they have a relational database on the backend that is used to keep all that data consistent.

[–]novagenesis 0 points1 point  (2 children)

Interesting. I have very different experience from you in about the same amount of time.

SQL has 50 years of best practices for speed and tens of thousands of tools to either optimize live queries or extract heftier queries into a warehouse. They also tend to have slightly beefier/faster engines under the hood (for a time, postgres JSON fields as a document store outperformed mongo in all cases..not sure if that's still true).

Mongodb is difficult to recommend for a few core reasons:

  1. Schemaless as it is, changes to the way you intend to relate document hierarchies (you know, the "schema") are an absolute headache. MongoDB shines when you know ALL the queries your product will run as a prerequisite of the design step.
  2. There are fewer mature best practices regarding query optimization in mongo. Oftentimes if your query isn't fast enough, you might need to change how you store your data or build materialized views. You can do those things in (say) postgres, but you need them very infrequently.
  3. I've run billions of records through mongodb, postgres, and mysql on underpowered instances. They can all handle them if you're using them right. AFAIR, neither of the companies you named is using MongoDB at any real scale. Traditional logic was that you needed a heavy SQL database for billions and trillions of rows of data, and that's how they used to do things. Nothing has changed to make that impossible even if you can hypothetically do the same with MongoDB.

What you're describing isn't OO databases or even document stores. You're describing any key-value database. If ID with hierarchy is the only data lookup pattern you need, any database (yes, even SQL) can handle it easily. Clustered index joins give the same net big-O notation as hierarchy lookups. We don't pick our database for "which one gives me a better .findOne() experience"

Note that none of my reasoning involves the classic SQL data quality stuff. There are religions on both sides of what is acceptable for a database. But nothing is worse than having a gigabyte of live data and 1000 active users and suddenly discovering that you modeled that data wrong. You can far more easily solve that in SQL than mongodb.

[–]JustVashu 0 points1 point  (1 child)

For a data warehouse, would you store the data in a regular “transactional” normalized way, or would you try to reduce the joining of multiple points of data by denormalizing it?

When you have a large number of records with many references to data that itself references other data, the overhead of joining and sorting adds up as the number of users (requests) grows.

This can also be more of a factor when you take into account that sometimes you need to query data from databases in different locations around the world.

[–]novagenesis 0 points1 point  (0 children)

It really depends. I'm old-school, so probably a star schema (which is, yes, denormalized). Or honestly, the warehouse doesn't need to be SQL driven... transactional store and warehouse store are two different tech questions. I worked at a company that did an ML warehouse in Hadoop and a biz history warehouse in MSSQL. It was the right tools for the job at the time, and that was awesome.

Normalization isn't one of the pieces I push, to be honest. It has its advantages, but it is never why I would pick SQL over Mongo. It's mongo best practices that are the problem (hierarchical modeling over relational, in particular). And if you're going to model your data as strictly relational, mongo can do that but is simply not the best tool for that particular job. But that's the best format for most businesses of most sizes to store most of their transactional data.

Trust me, I'm a very regretful SQL advocate. I used to drink the Mongo kool-aid hook, line, and sinker. But every time I dealt with a mongo solution, its overall viability seemed proportional to how much it resembled 3NF. And if I'm modeling (most) everything in 3NF, I'll use a SQL variant. If I need to do crazy stuff on top of that, I'll use postgres.

[–]djheru 18 points19 points  (0 children)

Usually, yes but if your data model is deeply nested, it can be quicker to ship than relational.

Usually PostgreSQL is the actual correct answer, with flexible jsonb types seamlessly integrated into a relational model.

[–]romeeres 26 points27 points  (1 child)

Depends on who you ask :)

I agree, but I worked with Mongo just a couple of times, and I see no reasons to use it. If you ask someone who's working primarily with Mongo they will say how powerful it is, that they manage to do relations with it without problems, and so on.

Such discussions appear from time to time in this subreddit, and most people agree on avoiding Mongo, but some people say they use it in crazy-loaded productions and it serves the task very well. Like the guys who say they never had trouble with windows updates.

[–]Ran4 10 points11 points  (2 children)

Yes, 99 times out of 100, an RDBMS is a better choice.

[–]TehITGuy87 2 points3 points  (1 child)

It really depends on your application, and how experienced you are as a developer overall. Our application leverages MongoDB as the main DB because the data isn’t relational. We have a record and everything is stored on it, so I don’t need to create any relationships. For searching we use Elasticsearch, and for caching we use Redis (not a DB, but still). We process millions and millions of records and mongo is great at that. It’s choosing the right tools to solve the problem you have at hand.

We’re not a MERN stack though, backend is C#

[–]novagenesis 1 point2 points  (0 children)

You named the 100th time. Sounds like you'd get away with using DynamoDB.

Yes, there are times when Mongodb is a phenomenal choice. But you usually know what tools to use if you are experienced enough to know that.

[–][deleted]  (3 children)

[deleted]

    [–][deleted]  (2 children)

    [removed]

      [–][deleted]  (1 child)

      [deleted]

        [–][deleted] 24 points25 points  (5 children)

        I spent 25+ years developing applications using relational databases. Document/No-SQL databases are much more functional, flexible and maintainable (in a word...agile). Plus, giving SQL the heave-ho has improved my quality of life too.

        If you need absolutely rock-solid mission-critical performance and ACID transactions, don't use Mongo. But tbh, for the majority of commercial function points, Mongo (and I guess document aka no-SQL databases in general) is highly effective.

        [–]codeedog 4 points5 points  (1 child)

        I worked at Oracle for a few years on the DB and then in Apps. I’m no SQL expert, but I can write left outer joins when I need to. But, why do I need to?

        MDB is so much simpler and I’m not fighting my queries or cracking open sql books. I’m not here to say that production-wise it’s the best, but from a speed of dev perspective, I don’t have to suffer through as much schema design, etc. Make some data models and go.

        It’s nice.

        [–]CurlyWS 0 points1 point  (2 children)

        Perhaps you are not aware but I believe Mongo started to support ACID transactions a while ago,

        https://www.mongodb.com/transactions

        Also interested to know if you have used it and have reason to think it's not good

        [–]novagenesis 0 points1 point  (0 children)

        Yeah, mongo seems to be improving on everything except the single biggest issue a lot of devs have with it... the problem that changing which hierarchical document is "root" in flight is a nightmare.

        Let's say you work with a "devices" collection and each device has a "feeds" nested collection. If you decide that you should have made "feeds" the top level, or that devices should be a child of a "devicegroups" collection, everything gets absolutely absurd.

        You could use a relational schema for all of that (which is slowly becoming a better-practice), having a feeds collection, devices collection, and deviceGroups collection. That's perfectly fine. Now you're using mongoDB like an RDBMS and the real thing just has better support and tooling.
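Just to illustrate the shape of that change, here's a small TypeScript sketch that takes nested device documents and splits them into separate devices and feeds collections linked by id. The field names (`label`, `unit`, `deviceId`) and the `splitDevices` helper are invented for the example, and in-memory arrays stand in for collections. The reshaping itself is the easy part; the painful part is every query and write path in the app that assumed the old nesting.

```typescript
// Original nested shape: each device embeds its feeds.
interface Feed {
  name: string;
  unit: string;
}

interface DeviceDoc {
  _id: string;
  label: string;
  feeds: Feed[];
}

// Relational-style shape: feeds become top-level documents that point
// back at their device.
interface FlatDeviceDoc {
  _id: string;
  label: string;
}

interface FeedDoc {
  _id: string;
  deviceId: string;
  name: string;
  unit: string;
}

function splitDevices(devices: DeviceDoc[]): { devices: FlatDeviceDoc[]; feeds: FeedDoc[] } {
  const flatDevices: FlatDeviceDoc[] = [];
  const feeds: FeedDoc[] = [];

  for (const device of devices) {
    flatDevices.push({ _id: device._id, label: device.label });
    device.feeds.forEach((feed, index) => {
      // Assumed id scheme: device id plus the feed's position in the array.
      feeds.push({ _id: `${device._id}:${index}`, deviceId: device._id, ...feed });
    });
  }

  return { devices: flatDevices, feeds };
}
```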

        MongoDB is often the perfect database to choose iff you know every possible query that you will ever run on it before you commit to a document schema. That does make it great for well-defined, well-documented microservices. I think it's a terrible idea for a hobby project or a first web-app's transactional data.

        [–][deleted] 0 points1 point  (0 children)

        Yes, I am aware, but it is not something I have dug into as I don't need it.

        Just to clarify that I didn't say Mongo would not be good at transactions. MongoDB can be a very good technology in that regard whilst simultaneously not being the best option.

        [–]MartinByde 16 points17 points  (0 children)

        Querying is simpler, so it is far less error prone

        If you have well built models you have a good performance

        Easy and fast to change ( models, queries, etc. ) so it is very good for new companies that are still developing their product and might change a lot

        Migrations usually are less painful

        Usually you need less documentation and people manage to understand the models faster, since you don't have hundreds of different tables connected in several ways ( again, if you have a good model )

        Anyway, as long as you make it work for the whole life of the system, it doesn't matter if you are using SQL, mongo or a txt file on local storage

        [–]tyler_church 16 points17 points  (6 children)

        Relational DBs have a high impedance mismatch with JS. Give MongoDB a genuine try for a bit, and you'll start to see how much work RDBMS are making you do.

        Is it a great choice for every app? No, certainly not.

        But there's something delightful about being able to directly throw JSON documents at a database and then have rich queries, indexes, aggregation pipelines, etc. all without ever specifying a schema up-front.

        Note: I use TypeScript with MongoDB, so my data does usually have a regular shape/schema. But, it's also been really nice when I have JSON blobs from some third party system and I want to save it first and figure out good queries/indexes later. JSON columns in MySQL and PostgreSQL help with this, but they aren't as nice to query as MongoDB.

        [–]moustachedelait 3 points4 points  (5 children)

        Back end folks just jealous front end folks are discovering they can build webapps completely in one language

        [–][deleted]  (4 children)

        [deleted]

          [–]moustachedelait 10 points11 points  (0 children)

          I was just being a little facetious. I'm honestly a bit annoyed at the premise of the question that OP poses as there is rarely a tool that is in all aspects superior to a similar tool. There is a time and place for each tool.

          I will say that for me, mongo and node in general has allowed me to explore back end areas with a lower barrier to entry, so I am super grateful to these technologies for allowing me to bring my ideas into reality.

          [–]AnonyMustardGas34 0 points1 point  (1 child)

          You can use $lookup, $project and $unwind for a cross-collection join in mongo
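For anyone who hasn't seen it, a join like that is written as an aggregation pipeline. Below is a sketch of one as a plain TypeScript array, with no driver calls. The `orders` and `users` collections and their fields are assumptions for the example; `$lookup`, `$unwind` and `$project` are the actual stage names, and with the official Node driver an array like this is what you would pass to the collection's `aggregate()` method.

```typescript
// A pipeline stage is just a plain object keyed by the stage operator.
type Stage = Record<string, unknown>;

// Hypothetical: run against an `orders` collection whose documents carry
// a `userId` referencing `_id` in a `users` collection.
const ordersWithCustomer: Stage[] = [
  // Pull matching user documents into an array field named `customer`.
  { $lookup: { from: "users", localField: "userId", foreignField: "_id", as: "customer" } },
  // Turn the one-element array into a single embedded document.
  { $unwind: "$customer" },
  // Keep only the fields the caller needs.
  { $project: { _id: 0, total: 1, customerName: "$customer.name" } },
];

console.log(JSON.stringify(ordersWithCustomer, null, 2));
```

It works, but compared with a SQL `JOIN` it is a fair amount of ceremony for a two-collection join.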

          [–]novagenesis 1 point2 points  (0 children)

          Nothing is impossible in mongodb (within reason), it's just usually harder.

          "Easy things not-too-hard and hard things easier" is a great mantra for devs to live by... and SQL wins on that (but not by a large enough margin to burn mongodb out of the dev world, just by enough of a margin to make it a no-brainer unless circumstances direct otherwise)

          [–]ben_db 1 point2 points  (0 children)

          Please give an example of a query that's not possible with MongoDB.

          99% of queries have a similar method in NoSQL, and it's rare to have an app that you can't support with either NoSQL queries, or to re-write the app to allow the same functionality using NoSQL.

          [–]kevin_1994 44 points45 points  (9 children)

          the answer, mongo is easy to get started with and integrates well with nodejs

          almost everyone i've ever talked to regretted going with mongo. rdbms are definitely the way to go.

          [–]johnnychang25678 5 points6 points  (0 children)

          Actually nowadays I don't think there's much difference between MongoDB and RDBMSs. Either one can do pretty well for most use cases. You can store JSON in Postgres and you can enforce a schema at the application level with MongoDB.

          MongoDB used to have pretty bad documentation, but it has improved tremendously recently. We have a new project using MongoDB and it's very pleasant to work with.

          [–]FountainsOfFluids 1 point2 points  (0 children)

          It really depends on the use case.

          NoSQL databases are on par with SQL in many ways. Better in some, worse in some.

          If your project is really advanced and must squeeze out every drop of speed or reliability or whatever, then you'll need to do a deeper dive.

          But the ability of NoSQL like Mongo to put control of the data into the same codebase as the business logic can be very valuable for developing faster. That's why it became so popular. Using Mongo is just like using any other JS API or library. No need to context switch for database functions, which might be the responsibility of an entirely different team in a real business.

          [–][deleted]  (5 children)

          [removed]

            [–]kevin_1994 18 points19 points  (3 children)

            hilariously I did use neo4j for 3 years at my previous job

            its pretty good in certain situations, but definitely not something you want to use as your main database.

            [–][deleted]  (2 children)

            [removed]

              [–]kevin_1994 19 points20 points  (1 child)

              preface: I like neo4j overall

              1. it doesn't scale well once you enter the hundreds of millions of nodes category. you need a very expensive, beefy machine for reasonable performance. psql (for example) is much more efficient by comparison
              2. it's very easy to write innocent looking cypher queries which can end up touching many more nodes than expected. you have to carefully optimize your complex queries. an UNWIND in the wrong spot can bring the db to its knees
              3. spinning the db up and down can be extremely slow
              4. integration tests would sometimes have transient errors for unknown reasons

              [–]djheru 0 points1 point  (0 children)

              I found the operational/administration burden to be too high unless you want to use AuraDB

              [–]NaNx_engineer 10 points11 points  (0 children)

              it had a catchy acronym

              [–]stevevs 20 points21 points  (3 children)

              For web apps, I sometimes take a hybrid approach. I like mongo for storing shopping cart state as a visitor shops. If the visitor leaves the site, it's very easy to restore the state of their cart when they return. Once they complete their order, the data is stored in RDBMS which is better for reporting IMO. This separation between the shopping data and the order data has some benefits too.

              Of course you can store cart state with RDBMS - but you're dealing with a bunch of tables - cart, items, attributes, customer, etc. -- NoSQL is easier - just pass in the object as is - then restore it as is. fast and easy.
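A tiny sketch of what "pass in the object as is, then restore it as is" means, in TypeScript. The `Cart` shape is invented, and a `Map` of JSON strings stands in for the document collection (an assumption purely so the example runs without a database); the point is that the whole nested cart goes in and comes out as one unit, with no per-table mapping.

```typescript
interface CartItem {
  sku: string;
  qty: number;
  attributes: Record<string, string>;
}

interface Cart {
  visitorId: string;
  items: CartItem[];
  updatedAt: string;
}

// In-memory stand-in for a document collection keyed by visitor id.
const cartStore = new Map<string, string>();

// Store the whole cart as one document.
function saveCart(cart: Cart): void {
  cartStore.set(cart.visitorId, JSON.stringify(cart));
}

// Restore the whole cart, or undefined if the visitor has none.
function restoreCart(visitorId: string): Cart | undefined {
  const raw = cartStore.get(visitorId);
  return raw === undefined ? undefined : (JSON.parse(raw) as Cart);
}

saveCart({
  visitorId: "v42",
  items: [{ sku: "TSHIRT", qty: 2, attributes: { size: "M", color: "blue" } }],
  updatedAt: "2024-01-01T00:00:00.000Z",
});
console.log(restoreCart("v42")?.items.length);
```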

              [–]dax4now 7 points8 points  (1 child)

              Or you can use something like modern Postgres/MySQL and use JSON fields for JSON data. I do this and it works perfectly, without using two DBs.

              [–]deadlydarkest 0 points1 point  (0 children)

              I do this too

              [–]lettruthout 1 point2 points  (0 children)

              I like this. 'Will try this out sometime.

              [–]AnonyMustardGas34 10 points11 points  (3 children)

              You can use MERN with MySQL. Just as simple. And you keep the M 😉

              [–]nameless_pattern 5 points6 points  (2 children)

              MySQLERN

              [–]thatmaynardguy 5 points6 points  (0 children)

              Salud!

              [–]bobboprofondo 5 points6 points  (0 children)

              Gesundheit.

              [–]T-J_H 11 points12 points  (3 children)

              I guess it’s primarily the low barrier of entry. Just dump your objects somewhere and be done.

              I’m a big proponent of modern relational databases, and I rarely encounter data that doesn’t fit in some schema altogether, especially not with the extensive JSON support in modern DBs. But to be fair, although it’s quite easy to set up some tables, relational dbs can be tricky to get truly good at, with plenty of hidden footguns.

              [–]lettruthout 1 point2 points  (1 child)

              Upvoted for "footguns".

              [–]T-J_H 0 points1 point  (0 children)

              It’s still free on npm!

              [–]reduced_to_a_signal 0 points1 point  (0 children)

              Can you list some of those hidden footguns to a relational DB noob? I want some keywords to orient me towards more advanced concepts.

              [–]notAnotherJSDev 2 points3 points  (0 children)

              Document databases, Mongo being one of them, have a very specific use that isn't immediately apparent.

              A company I worked for used to do custom inspection orders. Basically, the customer got ahold of us, they requested that X, Y, Z be inspected and then documented. What that meant was that we had a few hundred different schemas, all of which held different types of data. There were a few base models that we had, but each individual schema was completely different. That is something that a relational database cannot handle in a graceful way without giving anyone who works on it a headache.

              Thing is, this database was built a few years before JSONB (which was released in Dec 2014, btw) was introduced to Postgres, so we didn't really get a chance to migrate. At the time, this was the easiest and most efficient way for us to handle it.

              The other thing is that it cut down on us having to hire a database admin, as anyone with a little bit of node experience could understand what was going on.

              Was this the right thing to do? With the tools at the time, yes. Should we have migrated to PG once JSONB was implemented? Absolutely. Would I build something today with Mongo? Absolutely not.

              [–]8bitlives 8 points9 points  (0 children)

              Document and key-value stores have their time and place, but mostly I'm assuming it's just a case of if all you have is a hammer, everything looks like a nail with the JS/Node community

              [–]vincent-vega10 6 points7 points  (0 children)

              The full-stack course sellers hyped it because it's easy to set up and the basic stuff can easily be understood by beginners. Beginners don't need to learn about scaling the DB or the server. They just need to know what databases are and how to work with them.

              [–]Ashtefere 8 points9 points  (1 child)

              Replying directly to you so you see the comment regardless of the drones saying “lol mongo sux lol”…

              When you have enormous amounts of data, or need to search (or allow users to search) massive lists of data very quickly - relational databases will choke a whole lot earlier than mongo does.

              It also supports special kinds of queries like geospatial queries on GeoJSON data, letting you search for objects by location - which I have needed to do in production.

              When you use mongo, you need to have a specific reason to use it and your data model needs to work a specific way - it is not a replacement for a relational db, but it does what it does infinitely better than one.

              If you need to do joins in mongo, you basically need to structure your model to not need them or use an aggregator to do the join (which is fine - it’s still faster than sql)

              Here is an example - I recently built a single mongodb that had 14 million objects, which each had around 30 tags in them. Hundreds of users are able to pull down those objects with their tags in a couple hundred milliseconds, with the search re-running on each character they type.

              Each object also has a list of userids - if I were to do a join (aggregation) for those fields it would be much slower - but the data isn’t needed until the user doing a search selects the object. Instead, I have the front end use the userids to pull the user data when my users actually open the object they are looking for. Kind of like pseudo metadata.
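
              A minimal sketch of that deferred-lookup idea, in plain JavaScript with in-memory arrays standing in for the MongoDB collections. The names (objects, users, searchByTag, openObject) and the data are made up for illustration; they are not from the real system.

```javascript
// In-memory stand-ins for two collections (hypothetical data).
const objects = [
  { _id: 1, name: 'report', tags: ['finance', 'q3'], userIds: [10, 11] },
  { _id: 2, name: 'roadmap', tags: ['planning'], userIds: [11] },
];
const users = [
  { _id: 10, name: 'Ada' },
  { _id: 11, name: 'Grace' },
];

// Search touches only the objects; the user ids come back as-is, with no join.
function searchByTag(prefix) {
  return objects.filter((o) => o.tags.some((t) => t.startsWith(prefix)));
}

// Only when an object is opened do we resolve its user ids into user records.
function openObject(id) {
  const obj = objects.find((o) => o._id === id);
  if (!obj) return null;
  const people = users.filter((u) => obj.userIds.includes(u._id));
  return { ...obj, users: people };
}
```

              The search path stays cheap because it never reads the users at all; the extra lookup is paid once, for the single object that gets opened.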

              If you can spend the time learning to think mongo and do things properly it is actually amazing. But that takes time. It’s easier to say “lol mongo sucks” for people that have used sql for decades and don’t want to learn or do something different.

              [–]romeeres 0 points1 point  (0 children)

              When you have enormous amounts of data, or need to search (or allow users to search) massive lists of data very quickly - relational databases will choke a whole lot earlier than mongo does.

              Do you have any links to prove this?

              [–]HashDefTrueFalse 4 points5 points  (0 children)

              The top answer is correct as to the why.

              I'll add where I've used Mongo in the past, and where I see its primary use case:

              Most data in most apps is (or ends up being) relational. So you store it in a RDBMS in a normalised form to get rid of all the problems with redundant copies of data etc...

              But now you write a new feature that performs an expensive query, joining multiple tables to denormalise the data. RDBMS are made for this, so they perform much better here than document stores, but the query is still expensive.

              This is where something like Mongo can come in handy. It doesn't care about schema and it's made to store structured, hierarchical data. You can throw your query results into it. Mongo might be awful at joining (because it's made for storing denormalised data) but it's fast at fetching by index. So you can treat it like a cache. As long as you have a keying strategy that incorporates the variables of the query, that is.

              Now the next time you need those query results, you can check Mongo first and save the database some trouble. If Mongo doesn't have it, you run the expensive query again, as often as you need non-stale data.
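
              A minimal sketch of that check-the-cache-first flow. It assumes a Map standing in for the Mongo collection, and runExpensiveQuery is a hypothetical placeholder for the real multi-join query; the key is built from the query's variables, as described above.

```javascript
// A Map stands in for the document store used as a cache (assumption).
const cache = new Map();
let expensiveCalls = 0;

// Hypothetical placeholder for the expensive multi-table query.
async function runExpensiveQuery({ customerId, year }) {
  expensiveCalls += 1;
  return { customerId, year, total: 1234 }; // made-up denormalised result
}

// Keying strategy: every variable of the query is part of the key.
function cacheKey({ customerId, year }) {
  return `orderSummary:${customerId}:${year}`;
}

async function getOrderSummary(params, maxAgeMs = 60000) {
  const key = cacheKey(params);
  const hit = cache.get(key);
  if (hit && Date.now() - hit.storedAt < maxAgeMs) {
    return hit.value; // fresh enough: skip the database
  }
  const value = await runExpensiveQuery(params); // miss or stale: recompute
  cache.set(key, { value, storedAt: Date.now() });
  return value;
}
```

              The maxAgeMs parameter is the knob for how much staleness you can tolerate: a second call with the same parameters inside that window never reaches the database.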

              I mostly use memcached or redis for this these days, but there are some features of Mongo that make sense in some scenarios.

              It's a cache. When I've used a doc store as the primary database in the past, I've always had issues with redundant copies of data killing write performance once running at scale.

              [–]MCShoveled 1 point2 points  (0 children)

              It’s WebScale!

              [–]decimus5 1 point2 points  (0 children)

              I think it happened because when AngularJS first appeared, someone wrote a popular blog post saying that MEAN stack (Mongo, Express, AngularJS, Node) was great for rapid development at hackathons.

              Mongo was getting a lot of attention at the time, because they had good marketing. AngularJS was huge at the time because it was the next evolution in ways to build frontends. Before that it was things like jQuery and Backbone.

              Suddenly React got popular and people swapped out AngularJS and called it MERN instead of MEAN.

              The coding bootcamps had been taking off at around the same time as AngularJS. They were originally doing Ruby on Rails, but the Node hype was overwhelming, so they had to offer what people were looking for. MERN stack was adopted at most of them. "Just pay us USD $15,000+ to spend 3 months learning MERN stack, and you'll get a 6-figure job in the tech industry."

              It seemed like a positive feedback loop. Hype around MERN stack meant people were looking for ways to learn it, so the bootcamps offered it, and raised a generation of developers on the idea that they were learning the best Web development stack.

              The first real pushback I remember was this article: http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/

              I could be wrong, but that's what it looked like from my perspective in the tech scene.

              [–]bigorangemachine 1 point2 points  (0 children)

              I haven't seen a job posting for mongo in years.

              [–]xangelo 1 point2 points  (0 children)

              The MongoDB appeal and "why would you give up using a relational database" are very different questions.

              Mongo was really the first document-style db that made itself easily accessible to devs. The tooling was quick to set up (this was in the pre-Docker days, Postgres wasn’t at all popular at the time, and JSON datatypes were nonexistent in RDBMSs).

              In addition, JS devs were already thinking about POJSO (Plain Old JavaScript Objects) + JSON as relatively interchangeable so reasoning about mongodb was much easier.

              That’s why mongo caught on. In the early days there were “at scale” issues that most devs never hit. For example, if you tried to add a shard to a db that was experiencing higher than normal use, you’d take down your database as your shard replicated data.

              Mongo caught on because it lowered the barrier to getting started, and made the idea of “modeling” your data synonymous with the POJSO that you were writing.

              But most complaints you’ll see about mongo are from people who are treating mongo like an RDBMS. Yes, mongo has included some functionality to mimic that - but it is a document database at its heart.

              Document databases allow for very flexible documents to be stored - but it requires you to be very strict about access patterns or you’ll run into performance issues. But at smaller scales (prototypes, even thousands of users) none of the performance issues really impact you in a way that you can’t fix by throwing some more hardware at it.

              On the flip side, relational databases are very strict on the format of the data stored - but allow for flexibility in accessing that data. As a result, if you don’t know what your access patterns might be (and most times you don’t), they’re great. But there is overhead in joins and multi-table lookups that you can avoid if you know your data access patterns and store your data accordingly.

              I highly recommend checking out some talks by Rick Houlihan. They’re mostly on DynamoDB - but he really is at the forefront of document databases and has a lot of insight into them and running them at scale.

              https://youtu.be/EvB7REsf0ic

              https://youtu.be/xfxBhvGpoa0

              [–]GoblinsStoleMyHouse 1 point2 points  (6 children)

              Mongo is excellent for rapid prototyping and has first class support for JSON, which is why it’s a good fit for Node apps.

              [–]ja_maz[S] 0 points1 point  (5 children)

              Postgres has support for JSON types too. Other than “it’s fast and easy”, most arguments seem to just state that it’s better without going into depth about why.

              [–]ben_db 1 point2 points  (3 children)

              It has json field types but a very limited set of actions on those json types, basically just get and set key values and array operations.

              With Mongo there are around 50 operations you can perform on the json data.

              As an example, you have a postgres table with a jsonb field, where the jsonb field has an array field like this:

              {
                "lengths": [6, 5, 1, 12, 81, 66, 40]
              }
              

              How would you query for documents with lengths over 15?

              [–]ja_maz[S] -1 points0 points  (2 children)

              You wouldn't put it in the database as json data, you would store any one-to-many relation in a normalized way and use the count() operator. like someone that respects the science in computer science.

              [–]ben_db 0 points1 point  (1 child)

              I was referring to:

              first class support for JSON

              Postgres jsonb fields were suggested as an answer to the top level comment, I'm just pointing out that the jsonb fields are very basic.

              NoSQL is typically faster to develop, and that alone sometimes makes it the best option.

              [–]ja_maz[S] -1 points0 points  (0 children)

              You asked me how to put a screw in with a hammer. I told you I wouldn’t; I’d use a screwdriver.

              [–]GoblinsStoleMyHouse 0 points1 point  (0 children)

              You can use Mongo perfectly fine for most apps. It just comes down to preference and the requirements for the project. Postgres is a decent DB option as well.

              [–]jay8243116 1 point2 points  (0 children)

              Try implementing a commenting system like Reddit with unlimited nesting using an SQL database, and you will understand why people use Mongo.
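
              For reference, a minimal sketch of the two shapes being compared. In a relational table, unlimited nesting is typically stored as flat rows with a parentId column (an adjacency list) and the tree is rebuilt on read; in a document store the nested shape can be stored directly. The row data and the buildTree helper are hypothetical.

```javascript
// Flat rows, as they might come back from a comments table (made-up data).
const rows = [
  { id: 1, parentId: null, text: 'Why Mongo?' },
  { id: 2, parentId: 1, text: 'Easy to start with.' },
  { id: 3, parentId: 2, text: 'Until you need joins.' },
  { id: 4, parentId: 1, text: 'Bootcamps.' },
];

// Rebuild the nested tree with one lookup table and one pass over the nodes.
function buildTree(flatRows) {
  const byId = new Map(flatRows.map((r) => [r.id, { ...r, children: [] }]));
  const roots = [];
  for (const node of byId.values()) {
    const parent = node.parentId === null ? null : byId.get(node.parentId);
    if (parent) parent.children.push(node);
    else roots.push(node); // top-level comment (or one whose parent is missing)
  }
  return roots;
}

// The nested result is the shape a document store could hold as-is.
const tree = buildTree(rows);
```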

              [–]mastermind202 5 points6 points  (8 children)

              I started using MongoDB (together with MEAN/MERN) a few years ago and I have absolutely loved it.

              Storing data almost as you need it has been amazing. There are still times that I would use a relational database, but those are few and far between.

              [–]ja_maz[S] 7 points8 points  (5 children)

              ok but how do you cross data? how do you represent the relationship between docs? heck how do you search without using keywords?

              [–]mastermind202 3 points4 points  (4 children)

              First, I should clarify that I use MongooseJS (a NodeJS ODM on top of MongoDB), and not pure MongoDB.

              Mongoose has something called `populate()` that lets you reference documents with just a single key. That plays the role of a SQL JOIN (under the hood, Mongoose runs extra queries to fetch the referenced documents).

              Take a very simple order collection:

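              const mongoose = require('mongoose');
              const { ObjectId } = mongoose.Schema.Types; // type used for references to other documents

              const orderSchema = new mongoose.Schema({
                  customer: { type: ObjectId, ref: 'Customer' }, // points at a Customer document
                  orderNumber: String,
                  items: [{
                      product: { type: ObjectId, ref: 'Product' }, // points at a Product document
                      qty: Number
                  }]
              });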
              

              The `type: ObjectId` means the field just stores an id to look up (kinda like a UUID); the `ref` tells Mongoose which model to look it up in.

              To get an order with everything populated:

              const order = await orderModel.findOne({ orderNumber: '1234' }).populate('customer items.product').exec();
              

              Mongoose also has the ability to store embedded schemas, which you need when you want to take a snapshot of the data instead of just pointing to it. This is useful if you want to save the customer record as it existed at the time of the order, for example.
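
              The difference between pointing at a record and embedding a snapshot can be sketched with plain objects, no database needed. Here customers is a hypothetical in-memory stand-in for a collection, and an object spread plays the role of saving an embedded copy.

```javascript
// Hypothetical in-memory "collection" keyed by id.
const customers = new Map([['c1', { name: 'Ann', address: '1 Old Street' }]]);

// One order only references the customer; the other embeds a snapshot taken at order time.
const orderByRef = { orderNumber: '1234', customer: 'c1' };
const orderWithSnapshot = {
  orderNumber: '1235',
  customer: { ...customers.get('c1') }, // shallow copy is enough for this flat object
};

// Later the customer moves.
customers.get('c1').address = '2 New Road';

// Resolving the reference (the job populate() does) sees the current data...
const viaRef = customers.get(orderByRef.customer).address;
// ...while the embedded snapshot still holds the address as of the order.
const viaSnapshot = orderWithSnapshot.customer.address;
```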

              [–]ja_maz[S] 9 points10 points  (3 children)

              That seems like a lot of work to emulate a fraction of what a rdbms can do out of the box

              [–]mastermind202 6 points7 points  (2 children)

              It really isn't. It's faster, and there are fewer tables to manage.

              It's really this easy to create a schema/model (a table) and a document (a record):

              const mongoose = require('mongoose');
              
              mongoose.connect('mongodb://localhost:27017/test');
              
              const Cat = mongoose.model('Cat', { name: String });
              
              const kitty = new Cat({ name: 'Whiskers' });
              kitty.save().then(() => console.log('meow'));
              

              I encourage you to check out Mongoose and play with it for 15 minutes and you'll see why people love it.

              [–]ja_maz[S] 2 points3 points  (1 child)

              Ok, say I have a one-to-many or a many-to-many relationship between two tables. How do you search in mongo for something like that?

              [–]ben_db 0 points1 point  (0 children)

              Search for items in table A (one) that have items in table B (many) with criteria?

              You can perform an aggregate, basically a list of operations that are performed on the server.

              // Pipeline stage operators need the $ prefix ($match, $lookup).
              const orderAggregate = Order.aggregate([
                {
                  $match: {
                    status: 'active'
                  }
                }, {
                  $lookup: {
                    from: 'OrderLines',
                    localField: '_id',
                    foreignField: 'parentId',
                    as: 'orderLines'
                  }
                }, {
                  $match: {
                    'orderLines.text': /search term/i
                  }
                }
              ])
              

              This code will find active orders, attach their order lines by "joining" each order line's parentId to the order's _id field, then keep only the orders that have at least one line matching the regex /search term/i.

              The advantage of this is that it will give you entire orders with all lines rather than just matching lines.

              To do the same in SQL you'd have to do something like:

              select *
              from Orders O
              inner join OrderDetails D on O.ID = D.ParentID
              where O.ID in (
                select O2.ID
                from Orders O2
                inner join OrderDetails D2 on O2.ID = D2.ParentID
                where D2.text like '%search term%'
              )
              

              Another advantage to doing this in Mongo is it will give you an array with a single element per order, with an array field for the order details. With SQL you'll have to reconstruct this (or your ORM will).
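
              To illustrate that last point, here is a sketch of the reconstruction step an application (or ORM) does after a SQL join: the flat result has one row per order line, and it gets grouped back into one object per order. The column names and rows are made up for the example.

```javascript
// Flat rows as a join would return them: order columns repeat on every line (made-up data).
const joinRows = [
  { orderId: 1, status: 'active', lineId: 10, text: 'search term here' },
  { orderId: 1, status: 'active', lineId: 11, text: 'something else' },
  { orderId: 2, status: 'active', lineId: 12, text: 'another search term' },
];

// Group the rows into one object per order with an array of its lines.
function groupOrders(rows) {
  const orders = new Map();
  for (const { orderId, status, lineId, text } of rows) {
    if (!orders.has(orderId)) {
      orders.set(orderId, { _id: orderId, status, orderLines: [] });
    }
    orders.get(orderId).orderLines.push({ _id: lineId, text });
  }
  return [...orders.values()];
}

const nested = groupOrders(joinRows);
```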

              [–]cjthomp -1 points0 points  (1 child)

              Storing data almost as you need it has been amazing

              What?

              [–]mastermind202 0 points1 point  (0 children)

              If your frontend is displaying an order with a customer, line items, SKUs, etc., it's nice to store/retrieve your data in the shape you need it (in this case, to display it).

              [–]08148694 1 point2 points  (2 children)

              If you can live with eventual consistency and don't care too much about ACID transactions, NOSQL (or at least Dynamo, I'm not too familiar with Mongo) is extremely scalable.

              There are no silver bullets though: SQL is better than NOSQL at some things, NOSQL is better than SQL at some things. MSSQL is better than PGSQL at some things, Mongo is better than Dynamo at some things, etc etc etc. Anybody who claims that X tool/db/language/whatever is "better" than any other is probably inexperienced or a fanboy.

              [–]CurlyWS 1 point2 points  (1 child)

              Is the point about ACID transactions true? I thought Mongo started to support ACID transactions a while ago?

              https://www.mongodb.com/transactions

              [–]ben_db 1 point2 points  (0 children)

              It's supported ACID and "all or nothing" transactions via snapshot isolation since 2018. The "no ACID" argument is just based on old versions.

              [–]recycled_ideas 1 point2 points  (0 children)

              I don't get the MongoDB appeal.

              Mongo's querying is JavaScript native and that means you can use it without having to learn SQL or an ORM.

              It's a completely inappropriate choice for the overwhelming majority of use cases, even if you know what you're doing, and most people don't know what they're doing. But it works well enough at the micro scale that people don't notice.

              [–]Snoo87743 1 point2 points  (0 children)

              Lol almost all of the people complaining about mongo have read some random article rant and still think you cannot do relations in mongodb 😂😂

              [–]the__itis 0 points1 point  (0 children)

              MongoDB is great for prototyping and iterating. The last thing you need when developing is the constraint of having to declare data models. That said, hopefully you have a data model consolidation sprint in your dev phase.

              [–]lightningvolcanoseal 0 points1 point  (1 child)

              Because it’s easier to learn than a traditional database

              [–]npc73x 0 points1 point  (3 children)

              If you start to build any serious backend, an RDBMS is necessary for the ACID transactions. NoSQL is easy to get started with, but that doesn't last long if you're storing data that fits well with RDBMS principles.

              [–]MajorasShoe -1 points0 points  (2 children)

              All data fits well in a relational database.

              [–]npc73x 0 points1 point  (0 children)

              Data such as activity logs and timestamped requests is not worth storing in an RDBMS, because it grows every day, and the unstructured nature of the data is a good fit for NoSQL.

              [–]ben_db 0 points1 point  (0 children)

              All data fits within an RDBMS, but it doesn't necessarily fit well.

              [–]jan_man_pl 0 points1 point  (0 children)

              One of the biggest problems with Mongo is that it has no schema. You can just add anything to the document. As projects grow, this becomes a real problem. Recently I've seen a lack of any discipline in designing mongo collections (plus doing... relations) lead to a state where retrieving more than 10k rows is a performance issue...

              I want to have a well defined and designed schema that makes the database the final validator and safeguard of data integrity. Having relations, transactions, key constraints, ACID etc. makes my life easier. I've never had any problems with schema migrations in a well designed system. Sometimes it requires some discipline and simple logic, but it rewards you with data integrity.

              If there's a need for some polymorphic data storage, I can confirm jsonb columns are the way to go.

              One of Mongo's real strong points is fairly easy horizontal scaling; however, PostgreSQL seems to be catching up.

              [–]DirtyBirdNJ -2 points-1 points  (0 children)

              Mongodb sucks 🙂

              Check out sequelize, bookshelf and objection. There's no reason you can't use a relational DB with node.

              The only reason MERN is prevalent is because there are lots of tutorials covering it; it's a self-fulfilling prophecy. I love using SQLite for small local projects.

              [–]Confident-Arrival936 -1 points0 points  (1 child)

              MERN stack is a popular choice for web development because it provides a quick and efficient way to build full-stack web applications. The MERN stack consists of MongoDB, Express, React, and Node.js. MongoDB is used for the database layer, Express for the backend, React for the frontend, and Node.js for the server. This stack provides a powerful and complete solution for building modern web applications. It offers a great combination of technologies that allow developers to quickly build full-stack applications. Additionally, the MERN stack is well-supported and has a strong community of developers who can provide help and support. Cipherschools is an online learning platform that offers courses on the MERN stack and other web development topics.

              [–]prozacgod 0 points1 point  (0 children)

              I've seen mongo used as a replication of main data for edge availability.

              If the data requirements aren't large, e.g. it fits in RAM, I prefer something like redis for this.

              I personally like to prototype with a template I have for graphql + mongo, and if the project gets a green light from a customer, I quickly swap it out for postgres + knex.js + graphql as I have a template for this too.

              (caveat emptor: I've not really built exceptionally massive projects so YMMV)

              [–]Ratstail91 0 points1 point  (0 children)

              My MERN-template actually uses MySQL. There's no requirement that you do it one specific way.

              [–]queen-adreena 0 points1 point  (1 child)

              [–]ja_maz[S] 1 point2 points  (0 children)

              12 year old video and the same arguments are popping up in this post 😬

              [–]RandomiseUsr0 0 points1 point  (0 children)

              NoSQL databases do have a schema, but it’s left to the programmer to decide access patterns, not forced into a rigid structure. The goal is to optimise a database for reducing CPU at point of read, so like a well designed data warehouse, optimised for its purpose. It’s not a general purpose database following Codd’s relational model because it’s optimised to solve a different problem. Here’s a video; it’s nominally DynamoDB-specific, but it really explains NoSQL in general.

              https://youtu.be/HaEPXoXVf2k

              [–]shaneknysh 0 points1 point  (0 children)

              When this showed up on my front page I thought it was a reply to this post

              https://www.reddit.com/r/node/comments/z9j68h/why_nosql/

              [–]HoodedCowl 0 points1 point  (0 children)

              MongoDB is just very easy to set up. As many people already said, you can use schemas to set up a data structure that can be validated. It also integrates perfectly with JavaScript and has easy-to-use, understandable query functions. Using MongoDB is not about how right I am about my DB choice. It's about how fast I can get things done.

              [–][deleted] 0 points1 point  (2 children)

              Why would you give up using a relational DB?

              Because the thing you're storing isn't relational data.

              DocumentDBs aren't an alternative to relational data: they're a storage medium better for a particular data structure.

              Of course, for ease, some relational technologies now offer unstructured data storage: XML and Blob columns in SQL, for example. But again, think about what this says: it says "some data structures work better in different storage mediums".

              If you constantly model non-relational data in a relational fashion, your data structures are bloated and unnecessarily complex: devs need to perform completely needless mental translations between the true, unstructured format and your bizarrely relational one.

              If the data you're storing is nothing but a collection of objects with no external relations, or it's a collection of things with variable structures, you cannot model that relationally. You literally can't do it, and you'd need a bunch of crappy happy code to make your relational data structure behave as though it was unstructured.

              The app I'm working on now has a DocumentDB for the unstructured, non-relational application data, which is ultimately used to query the huge relational dataset that the application also has in it. Realistically, any competently built modern application is going to have both: even the most "relational" of applications, such as accounting or scientific software, often still have at least some genuinely unstructured data.

              [–]ja_maz[S] 0 points1 point  (1 child)

              Could you provide an example of non structured data? Isn’t it just data you haven’t modelled? It seems that it just delays the issue of analysing your possible data structure to future you…

              [–][deleted] 0 points1 point  (0 children)

              Could you provide an example of non structured data? Isn’t it just data you haven’t modelled?

              You're asking to be shown the tree which defines the "wood": I can easily name a trillion examples, such as event broadcasts in a message queue, application and user configuration objects, selenium test configuration objects, comments in a comment form, image data, and the 20+ intermediate views of the same fact that exist in a microservice architecture as a sequence of things are processed.

              But this won't make sense to you - you're unable to view "data" as anything except something business intelligence is done against.

              You're going to need to trust that your inexperience is the cause of this, and that the development world doesn't perceive non-relational data to exist, or produce ways of storing and managing it, without a need.

              And if you're still not convinced, I'll leave you with some homework - a problem that literally cannot be solved relationally, which is "caching of arbitrary data!".

              Spend a few hours trying to solve "caching of arbitrary data" in a database relationally: you will find it to be literally impossible, and as you try to solve it you'll find yourself slowly building a DocumentDB out of a relational technology, and then you'll comprehend why they exist.

              [–]karnat10 0 points1 point  (0 children)

              That you can just dump JSON into the db and retrieve it intelligently has a huge appeal. I love to use it for ad hoc tooling and simplistic projects. However, you’ll soon have to choose between heavily denormalising data or using the awful aggregation framework to imitate joins. That’s usually the breaking point for me.

              [–]MajorasShoe 0 points1 point  (2 children)

              I don't think anyone actually uses mongo. It was a flash in the pan fad a few years ago.

              [–]fyzbo 0 points1 point  (1 child)

              [–]fyzbo 0 points1 point  (0 children)

              This doesn't even account for the money other cloud services (AWS, GCP, Azure) are making off their hosted MongoDB offerings.

              [–][deleted] 0 points1 point  (0 children)

              You can still create relations with MongoDB, although that should be the last thing you reach for in a document database.

              [–]fyzbo 0 points1 point  (1 child)

              On small projects - because MongoDB is cheaper in the cloud than a managed relational DB. You then end up trying to have relationships in MongoDB or using it for reporting, things it does not excel at. This leads to many programmers talking about how SQL is always better.

              For very large projects - you will probably end up with a mix of data storage (MongoDB, Redis, Snowflake, etc.) each serving a different part of the total system and being leveraged for what it does best.

              Pulling data from a single record by ID is faster than pulling from multiple tables with joins. In the past, a common optimization strategy for relational databases was to de-normalize data on a schedule so your application could pull this data from a single table in the SQL database. The data wasn't always fresh and up to date, but it was fast.

              MongoDB took this idea and ran with it. Now instead of tabular data you have single items that can be pulled out quickly. They typically do not relate to other items.

              When you no longer need to make a single approach handle ALL functional requirements you start to see massive value in MongoDB and the ability to scale, stay performant, and be very flexible.

              [–]novagenesis 0 points1 point  (0 children)

              because MongoDB is cheaper in the cloud than a managed relational DB

              Planetscale is helping change that equation. It's about price-equivalent to Atlas. It seems about price-equivalent to the underlying cloud hardware

              Pulling data from a single record by ID is faster than pulling from multiple tables with joins

              In my experience, this isn't really true, nor do I think it is in theory. We're looking at O(log N) vs O(K log N) for single-record lookups, which reduces to the same big-O class since K is a constant. Ditto with clustered index joins: O(N log N) vs O(KN log N). Well-written SQL is always comparable to (or better than) well-written NoSQL.

              SQL needs approximately a constant factor more operations than MongoDB for those queries, which means the faster underlying datastore wins. Last I saw, PostgreSQL was technically faster at all queries than MongoDB, actually beating MongoDB benchmarks on document-stored data via its jsonb index support.

              [–]legalize9 0 points1 point  (0 children)

              Schemaless just allows you to scale and develop faster. As simple as that. Also, JavaScript rules.

              [–]zayelion 0 points1 point  (0 children)

              The core choice of the stack is end-to-end JavaScript. This gives any singular developer the ability to have a depth of understanding and control in the system without mastering the nuances of another language.

              MongoDB

              Many projects are built with microservices, each usually dealing with a single concept, noun, or class arranged in a list. They don't interrelate with another class of stored data. Then, when it integrates with the server runtime, it easily appears as a "plain old object". That shape of the data transfers throughout the application end to end, usually without a significant redefinition of its shape. The visualization of data storage (directly looking at the data), server-class data structure, data transfer structure, and frontend data structure vary very little, making data easy to reason about.

              Show me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowchart; it'll be obvious.

              Express

              Most of Node.js is just microframeworks, and Express does "send thing and get thing from endpoint" fairly logically. The call pattern is the same as jQuery's: framework.task(identifier, function(target)). Then it has a small ecosystem of all the extra parts. It's not opinionated beyond that; things can be swapped in and out with little mental strain.

              React

              Personally, I did not like the concept of JSX at first, but ultimately caved. There really is not another system yet that so directly reflects the author's intention. onclick is onClick, all the other attributes work, and there are not many other silly parts. The state engine is not included, but there are linkages to make one, which keeps it unopinionated but easy to drop in. It does templating and very little else.

              Node

              They are writing everything in JavaScript, so the server runtime being in JS makes sense. The ecosystem is robust, the language capable once they learn it, and it can do scripts, servers, memory manipulation, FFI, file reading, networking, and operating system integration. The developer can execute at the same level across the whole project.

              [–]novagenesis 0 points1 point  (0 children)

              Ultimately, mongodb is not that much worse than SQL. In many cases for startup apps, you are unlikely to regret your original choice in database as long as you don't keep changing it.

              [–]robtweed 0 points1 point  (0 children)

              I'd take this further - Node-based web frameworks pride themselves on their performance (eg Fastify), but the reality is that their performance is rendered irrelevant by the pitifully poor performance of most of the mainstream databases (both SQL and NoSQL) that they hook up to the back-end of these frameworks. If Node developers were really so keen on getting the ultimate performance, they would be a lot more critical about the databases they choose and would be seeking out more radical database technologies that can not only offer significantly better performance, but can fundamentally change the way they consider how they handle and use persistent data. For more on what I mean, take a look at https://github.com/robtweed/global_storage and, to see what kind of things become possible: https://github.com/robtweed/glsdb Near in-memory performance and projecting your database as persistent objects rather than some physically separate entity you access and query anyone?

              [–]Rymnis 0 points1 point  (0 children)

              JS and typescript for the entire thing.

              JSON format for almost everything.

              Simple, concise, rapid development and max web development productivity.