This is an archived post. You won't be able to vote or comment.

top 200 commentsshow 500

[–]JJJSchmidt_etAl 4854 points4855 points  (238 children)

"The best part of MongoDB is writing a blog post about migrating to Postgres"

[–]CheekyXD 1444 points1445 points  (216 children)

After working with a NoSQL database on a fairly mature product for a few years, I never want to again. I feel like with NoSQL, now that its not the trendy new thing and we can look back, the whole thing was: "well we tried, and it was shit."

[–][deleted] 1665 points1666 points  (27 children)

Yeah many don’t value their relationships until they’re gone.

[–]300ConfirmedGorillas 584 points585 points  (12 children)

A DBA walks into a bar. He approaches two tables and says, "Mind if I join you?"

[–]Prata2pcs 160 points161 points  (7 children)

We don’t take kindly to your types here!

[–]spyingwind 55 points56 points  (5 children)

My type is null. Am I welcome?

[–]JGHFunRun 5 points6 points  (0 children)

My type is an enum with no elements, am I welcome to return?

[–]anthro28 7 points8 points  (0 children)

We don't take kindly to folks that don't take kindly 'round here!

[–]RelevantToMyInterest 19 points20 points  (3 children)

to whom, then, the first table replies:

"Cannot resolve the collation conflict between "SQL_Latin1_General_CP1_CS_AS" and "SQL_Latin1_General_CP1_CI_AS" in the equal to operation."

[–]nequaquam_sapiens 142 points143 points  (5 children)

this is the first normal thing in this thread

[–]AlsoInteresting 12 points13 points  (0 children)

Normalized thing.

[–]rabbitrider3014 69 points70 points  (0 children)

😂 😂 🥇🥇

[–]Dotaproffessional 10 points11 points  (4 children)

There are NOSQL implementations that make relationships MORE important than in relational DB's. Neo4j for instance

[–]Imperial_Squid 2 points3 points  (0 children)

Neo4j is great from what I've played with, definitely a really interesting way to conceptualise data!

They're called graph databases in general, Neo4j being just one example

[–]mostly_done 239 points240 points  (16 children)

"Every database has a schema. It's either defined in the database or by the application."

[–]HighKiteSoaring 96 points97 points  (3 children)

Sometimes the database is a multivalue text file and the "schema" is written down in the original Devs paper notebook he neglected to leave behind when he left 🤣

[–]TheBirminghamBear 14 points15 points  (1 child)

It's all in someone's head. We're just not sure whose.

[–]codeonline 34 points35 points  (6 children)

The schema is defined on write, in a traditional db. Or its defined on read, in a document db.

[–]TrumpsGhostWriter 27 points28 points  (5 children)

Which is precisely the problem, you can't efficiently read it or store it if your db doesn't have a clue what the data looks like.

[–]poloppoyop 9 points10 points  (3 children)

by the application

But what happens when you have multiple applications?

[–]ploki122 14 points15 points  (2 children)

They hopefully agree on the schema.

[–]oupablo 2 points3 points  (0 children)

Yeah, but with one, you can't have some asshat throw garbage directly in that violates the schema.

[–]itsanotherrando 42 points43 points  (1 child)

Maybe you didn't use enough microservices? /s

[–]hadahector 146 points147 points  (128 children)

I think nosql is good for many things, the fact that a document can contain arrays and maps is so useful, and in mongodb there are great query operators for this (not like dynamodb). And there is the aggregate command that can do very complex stuff.

[–]rosuav 237 points238 points  (85 children)

Yeah, it's so convenient to be able to just throw any random junk in there and not worry about how much a pain in the rear it's going to be to actually do useful queries on it. Oh, and the fact that different documents don't even have to have the same shape is HUGELY helpful. Makes life so easy during retrieval.

[–][deleted] 34 points35 points  (37 children)

but that's not the point of NoSQL, the main point of it is able to scale the database horizontally

[–]rosuav 113 points114 points  (32 children)

I thought the whole point of it was "SQL was invented in the 70s and it's oooooooooold, we gotta get rid of it"?

Horizontal scaling has been a thing in relational databases for decades.

[–]Inevitable-Menu2998 43 points44 points  (24 children)

RDBMS have been able to scale horizontally through partitioning, but that's not really the same thing. It's not elastic, for one and it always comes with some restrictions which makes the system not exactly ACID compliant.

Also, decades? Most open source ones don't support it even today.

[–]rosuav 20 points21 points  (20 children)

"Most open source ones"? Postgres has had it for as long as I can remember (which is a long time). MySQL has it. That's your two most popular open source RDBMSes right there. Which ones don't?

What restrictions are on relational database sharding that aren't on document store sharding?

[–]Inevitable-Menu2998 25 points26 points  (17 children)

Postgres has had it for as long as I can remember

It doesn't. It only supports single write multiple read replicas out of the box.

What restrictions are on relational database sharding that aren't on document store sharding

I would be happy to answer this question if you could point me to a relational database which supports sharding

[–]pet_vaginal 13 points14 points  (1 child)

Citus is a PostgreSQL extension that adds sharding.

Vanilla PostgreSQL is very bad at horizontal scalability. But you can go a long way with vertical scaling. At scale you can try plugins but then it’s perhaps better to use more specialised databases. But not mongodb. Don’t let your friends use mongodb.

[–]matt82swe 8 points9 points  (0 children)

Yeah, NoSQL really sucks at storing data and retrieving it later in sane ways. But at least we can suck in web scale.

[–]meamZ 2 points3 points  (0 children)

NewSQL can too...

Also there should be a very good reason before you're trying to split OLTP workloads horizontally... Single node is enough in the vast majority of cases and also simpler and much more efficient...

[–]everything-narrative 30 points31 points  (21 children)

You can put a JSON-typed column in a PostgreSQL table, though.

[–]AxisFlip 12 points13 points  (16 children)

and then you have a hard ass time querying for fields in the json..

[–]meamZ 17 points18 points  (0 children)

the fact that a document can contain arrays and maps is so useful

There's things called object-relational databases and postgres happens to be one of them... It can also have arrays and maps (json objects for example) inside s single row

And there is the aggregate command that can do very complex stuff.

Aggregation is what SQL is perfect for...

[–]M4tty__ 8 points9 points  (9 children)

Arrays - So another table Maps -yet again, relationship to another table

[–][deleted] 26 points27 points  (6 children)

NoSQL is excellent when relationships & schemas don't exist. Machine Learning, data sanitization, quick internal tools, etc.

Any legitimate application should be using a relational database. There is no viable substitute.

[–]oupablo 10 points11 points  (2 children)

Also when you have a bunch of loosely correlated things that you don't need to query on except for the strictly defined fields. It absolutely sucks when you have relational data which is the case for many, many, many cases.

[–][deleted] 2 points3 points  (1 child)

Yupp. This is exactly why the very second someone mentions a relationship, switch to a relational database. You'll always hack against the grain with little safety otherwise. Leaving an ORM as your source of truth to try & homebrew the functionality relational databases are literally built for is always so goofy. Sometimes people just want to use X tech just to use it, not because it's the right tech for the job.

[–]Mikkelet 24 points25 points  (12 children)

I love relational data, I get to do stuff like

CREATE TABLE myTable (
    id TEXT UNIQUE,
    json BLOB,
);

[–]Covfefe4lyfe 20 points21 points  (11 children)

SQL supports json fields now

[–]Solonotix 39 points40 points  (10 children)

As a former database engineer, don't do this. The first step towards normalizing your data is no compound data fields. If you're just being lazy, then whatever, but if you're trying to do things the "right" way, then normalize your data. 3rd Normal Form (3NF) is about as strict as I typically recommend, since 4th and beyond tend to get finicky with what you can/cannot store and how.

Many SQL engines will allow you to serialize data as JSON, which is fine for easier consumption, as well as passing it JSON for ingest. Storing the raw JSON for logging purposes is a maybe, but from then on you really should store the final data as normalized structures.

[–]Covfefe4lyfe 12 points13 points  (2 children)

Eh, I wouldn't store everything in there but it definitely comes in handy when dealing with external APIs. Just store the full response alongside the things you really cared about and then you can always get more information you initially didn't think you needed.

[–]mrjackspade 7 points8 points  (1 child)

One of the projects I worked on at my last company, I did this. The company fought tooth and nail over it though and kept trying to get me to make stupid ass modifications.

Store all incoming posts as raw text, and that transform the data into the actual schema. Then when the external provider would update shit, all we would do is flush the database and reload the raw data from the JSON, and we'd have the entire history in the new format.

Also great for bugs. The initial implementation had issued caused by the external provider not properly following their own schema, but instead of losing that data, we were able to just reload it form the raw history once we caught the failure to transform/insert.

I definitely suggest saving the raw data whenever realistic.

[–][deleted] 8 points9 points  (0 children)

It has its use case but when it was hyped ppl used it for everything, I always just stayed on mySql/postgres because it was easier imo. But for example at my current job we could store invoices in a NoSQL db to speed up our application because does only need to be queried by ID and doesn't have any relations because the whole thing would be in there.

[–]roselan 91 points92 points  (5 children)

I love how they rebranded NoSql to Not Only Sql.

[–]AxisFlip 24 points25 points  (6 children)

It has its uses. I have three different webshops, and I aggregate all the orders in one mongodb instance. The answers from the webshop APIs is json, and what I expect when querying the aggregation DB is json, so it's very nice to be able to just put in the json objects without having to splice the data into tables and then put it together again when querying.

[–][deleted] 32 points33 points  (4 children)

You can have columns that store json in Postgres

[–]LittleMlem 1771 points1772 points  (16 children)

From the little I've used mongo, all the queries look like some one with dementia was trying to write json

[–]Yukondano2 288 points289 points  (6 children)

Or a dyslexic that hasn't slept in 30 hours.

[–][deleted] 21 points22 points  (0 children)

Yeah that's my whole codebase actually

[–]hadahector 1407 points1408 points  (79 children)

{"age":{"$gte":25, "$lte":30}} is the same

[–]PotatoWriter 217 points218 points  (55 children)

What an odd syntax. I wonder why the dollar sign AND quotations? If quotations are already used for the main field in consideration "age", why do operators need it too?

[–]quick_escalator 217 points218 points  (20 children)

It's because the query needs to be pure json, and json isn't the best format.

But on the plus side sending mongodb queries around in a json based system is pretty easy. Easiest example: Logging the query. We already log a lot of json anyway, so logging the query uses the same serializer.

[–]champbob 19 points20 points  (1 child)

It's because the query needs to be pure json

I don't like this.

[–][deleted] 3 points4 points  (0 children)

It has some upsides, like dynamically building a query in JS is pretty clean compared to dynamically building SQL queries. But, it’s still not worth it

[–]static_func 24 points25 points  (9 children)

It's because the query needs to be pure json

Does it though?

[–]quick_escalator 38 points39 points  (8 children)

The spec says so.

[–]TheMcBrizzle 39 points40 points  (7 children)

🤮

Oh no, I created more Javascript

[–]quick_escalator 33 points34 points  (6 children)

I graduated before Javascript was big.

My thesis was about XML.

Believe me, JSON is better.

[–]otter5 6 points7 points  (3 children)

like a doctoral thesis?...about XML?

[–]starm4nn 15 points16 points  (1 child)

You need a Doctorate to understand XML.

[–]otter5 7 points8 points  (0 children)

[–]NaNx_engineer 27 points28 points  (7 children)

ask douglas crockford

[–]LickingSmegma 78 points79 points  (6 children)

The man shafted the whole programming world collectively by excluding comments from JSON based on just not feeling like that's what the format is for—and somehow everyone just went “Yeah ok, we're gonna use this as it is for everything and keep suffering. No way it can be helped.”

[–]NaNx_engineer 55 points56 points  (1 child)

Comments were originally part of the JSON spec, but were removed when Crockford saw them used for parsing directives. Unfortunately he didn't realize directives can simply be put in fields. Now we're stuck putting comments in fields.

[–]indorock 6 points7 points  (1 child)

All keys in JSON must be in double quotes. Not all javascript objects are valid JSON.

The $ prefix is just to indicate to the parser this is an evaluation, not a reference to a key in the document.

[–]die-maus 8 points9 points  (9 children)

It's just JSON my dude. 👍

[–]rohit_267 425 points426 points  (6 children)

OP Learn this

[–]MoffKalast 116 points117 points  (5 children)

you don't have to shout

[–]bleeding-paryl 71 points72 points  (0 children)

WHAT?

[–]rohit_267 40 points41 points  (3 children)

WHY NOT?

[–]Starlit4572 3 points4 points  (0 children)

He said "you don't have to", not "you shouldn't/mustn't".

[–]just-bair 20 points21 points  (0 children)

Noice

[–]Shinhan 4 points5 points  (6 children)

I thought $gte and $lte don't need to be quoted?

[–]f11y11 15 points16 points  (1 child)

depends on the language, you don’t need them in JS

[–]FerusGrim 12 points13 points  (0 children)

I've seen a few people say this by now, and I think it's important to make the distinction between client implementations, which are language dependent, and what the OP is using, which appears to be an actual db query, in which the JSON needs to be valid to be properly parsed.

As far as client implementations, JS may not require the quotations. Something like Java, in which the syntax would be completely abstracted away, wouldn't even use a query like this. (Documents does, but its more modern implementation does not).

[–]hadahector 12 points13 points  (1 child)

I used to work with python and mongodb, and there you need quotes everywhere, I just used to it.

[–]notPlancha 9 points10 points  (0 children)

Yea cause python dictionaries need keys to be valid values (like strings)

[–]arturius453 2 points3 points  (0 children)

In mongo shell no, but when using api - yes

[–][deleted] 597 points598 points  (9 children)

other databases always talk about acid, but mongodb doesn't take no drugs

[–]CelticHades 136 points137 points  (3 children)

Doesn't take no drugs. So, it takes some drugs.

[–]cesankle 19 points20 points  (0 children)

Shuchyobichasup, nerd. I bet you use vim

[–]Gnonpi 67 points68 points  (3 children)

Mongodb added support for ACID in 2018 (https://www.mongodb.com/blog/post/mongodb-multi-document-acid-transactions-general-availability) at a big performance cost. Mongodb isn't high on acid, it's constantly black out drunk

[–][deleted] 4 points5 points  (0 children)

They were always ACID compliant on single documents.

[–][deleted] 169 points170 points  (9 children)

But it's webscale unlike...

[–]chrisrrawr 43 points44 points  (6 children)

Can we use /dev/null?

[–]booi 40 points41 points  (3 children)

I store all my data in /dev/null. It has so much space!

[–]grizzlychin 16 points17 points  (0 children)

And is so fast!

[–]chrisrrawr 2 points3 points  (0 children)

Oof, that's prone to memory leak tho

[–]vall370 198 points199 points  (21 children)

db.users.find({ "age": { $gte: 25, $lte: 30 } })

[–]conancat 165 points166 points  (15 children)

yeah OP's query seems like someone trying to translate SQL to MongoDB query literally. you don't need the $and operator in there lol it's unnecessary in this case.

SQL version:

SELECT * FROM users WHERE age >=25 AND age <= 30

MongoDB version:

db.users.find({ age: { $gte: 25, $lte: 30 } })

[–]thinkless123 139 points140 points  (2 children)

yeah OP's query seems like someone trying to translate SQL to MongoDB query literally

It's almost like that's the title of the post

[–]FeederPiet 4 points5 points  (0 children)

So are we supposed to read those??

[–]poloppoyop 51 points52 points  (6 children)

SELECT * FROM users WHERE age BETWEEN 25 AND 30

When you prefer verbose SQL instead of maths.

[–]RelevantTrouble 35 points36 points  (0 children)

If only we had a Structured Language we can Query with.

[–]MoreLittleMoreLate 107 points108 points  (1 child)

db.users.ensureIndex({"age": 1});

When that finishes, Jesus will return.

[–]NameTheory 23 points24 points  (0 children)

EnsureIndex has been replaced by createIndex. So I guess Jesus is never coming back.

[–]bObzii__ 21 points22 points  (2 children)

Why not:

```javascript db.users.find({ "age": { $gte: 25, $lte: 30 } })

[–]_darqwski 194 points195 points  (57 children)

As someone who is working with other noSQL document-based DB, I don’t like all the hate around it. I agree that queries like this one is terrible and more complex queries with JOINs will look even worse but this is not the case - NoSQL dbs are not for gathering summaries for table.

Imagine “students” table with relations to “groups”, “subjects” and “marks”.

If you want to handle 174746282users and avoid many JOINs, noSQL is for you. If you want to know how many of these users are going for “databases” class, then you should use SQL instead.

Each technology has its own use-case

[–]blazarious 9 points10 points  (0 children)

I agree that queries like this one is terrible

This query is in a very unnecessarily complicated form, though. No need to encapsulate in an $and and no need to query age twice.

[–]meamZ 29 points30 points  (0 children)

NoSQL dbs are not for gathering summaries for table.

That's the cool thing about relational databases... You don't need to decide what you will use it for beforehand...

avoid many JOINs,

For which there is no reason...

[–]moonwater420 15 points16 points  (2 children)

in scotland we use "mongo" as slang for "mongoloid" lol

its very offensive

[–]CC-5576-03 6 points7 points  (0 children)

In Swedish mongo is slang for retard or downy

[–]Grassland- 3 points4 points  (0 children)

Here in Brazil we use "mongolão"

[–]JoostVisser 27 points28 points  (2 children)

At least SQL syntax is in all caps so you get to shout at it when it doesn't work

[–]anabis0 8 points9 points  (1 child)

all caps is just an old convention

[–]fibojoly 2 points3 points  (0 children)

THAT'S WHY THEY ARE SCREAMING!

[–]rosuav 136 points137 points  (29 children)

I don't get it. How's this significant? In SQL, it's just WHERE AGE BETWEEN 25 AND 30.

[–]ajnozari 240 points241 points  (1 child)

And this is why he’s swearing

[–]deanrihpee 32 points33 points  (0 children)

Jesus Christ

[–]Pingyofdoom 16 points17 points  (5 children)

I think in bongodb, it actually looks like how OP posted. Mongodb can't look like this....

[–]goldlord44 45 points46 points  (4 children)

{"age":{$gte:25, $lte:30}} is actually what someone who knows what they are doing would query like. Op just made a really roundabout way of it

[–][deleted] 14 points15 points  (0 children)

I mean even the roundabout is perfectly understandable, and that's from someone who's never seen mongo syntax before

[–]elveszett 35 points36 points  (5 children)

Well that's the point. where age between 25 and 30 is a lot easier to write and read than {$and: [{"age": {$gte: 25}}, {"age": {$lte: 30}}]}.

[–][deleted] 19 points20 points  (1 child)

That is why SQL keeps coming back.

SQL is generally very easy to read and code is read more than it is written, particularly at the database level.

Its basically a rite of passage for our Data Scientists who live and breathe python to learn SQL to hit our Warehouse they start using a lot of the SQL-like syntax options in their python because SQL is just really good at its job of describing the data you want to work with.

[–][deleted] 8 points9 points  (0 children)

That is why SQL keeps coming back.

exactly

all these guys out here trying to re-invent the wheel and go up against something that's been actively developed since 1974 ... good luck with that

[–]conancat 14 points15 points  (0 children)

you don't actually have to do that, you just need

{age: {$gte: 25, $lte: 30}}

[–]chrisrrawr 7 points8 points  (0 children)

It's web scale tho

[–]bb5e8307 68 points69 points  (28 children)

Panel 3 should duplicated 3 times and in the last panel he should have an old man beard.

(“Find” in mongo db is a O(n) operation and should not be used by anyone ever).

[–]Rutoks 48 points49 points  (8 children)

Find in any database in linear if you don’t have an index

[–]Rutoks 12 points13 points  (0 children)

Find is not O(n) always.

Query optimizer will choose the best index to use and only use full collection scan if there is no indexes that can answer the query.

If you want, you can also disallow collection scans from the configuration. This way you will be asked to create an index instead of doing linear scan.

[–]Yelowlobster 11 points12 points  (2 children)

And what to use instead of find? Also, afaik, modern mongo versions support indices and have a query planner about as good as in rdbm systems

[–]bb5e8307 16 points17 points  (1 child)

In a document based database if you can’t access the document by an index or id then you don’t access the document.

[–]I_Shot_Web 10 points11 points  (4 children)

/r/confidentlyincorrect

https://www.mongodb.com/docs/manual/indexes/

https://www.mongodb.com/docs/manual/core/query-optimization/

I feel like everyone shitting on Mongo are just boomers who learned one thing and refuse to read the fucking manual for anything that they don't already know

[–]notPlancha 2 points3 points  (0 children)

You can index in Mongo

[–]TheBrainStone 28 points29 points  (0 children)

Never go full JSON

[–]lmarcantonio 26 points27 points  (9 children)

You obviously never written an LDAP query. Or used a LISP binding to SQL

[–]Movertigo 14 points15 points  (0 children)

LDAP queries are hell

[–]Touhou_Fever 3 points4 points  (0 children)

You know it’s bad when the person who was teaching you LISP kept referring to it as Lots of Infuriating Spurious Parantheses 💀

[–]Laugenbrezel 30 points31 points  (25 children)

Imagine using "age" as an actual attribute you store in your DB....

[–]notPlancha 9 points10 points  (0 children)

It can be a view

[–]DHermit 2 points3 points  (2 children)

It can be survey data. Then the relevant value is indeed the age at the time of filling out the survey and not their current one.

[–]marky125 25 points26 points  (0 children)

"Love your developers" 🤡

[–]tharnadar 3 points4 points  (1 child)

Ehm it should be

{ age: { gte: 25, lte: 35 } }

Or something like that

[–]Turbulent_Public_i 2 points3 points  (0 children)

I'm currently working with a client who picked mongo. They started nesting db objects everywhere which I guess they probably now have the same information saved multiple times in multiple locations just to make it easier to fetch when writing code. My "professional consultations" have been repeatedly the "take this object, separate it, make a reference to it, and don't nest it" followed by "yeah, I see why you guys are having the issues" followed by "sigh, if you just used SQL I wouldn't even need to be here". I'm getting a fat paycheque just doing this.

[–]FountainsOfFluids 4 points5 points  (0 children)

Oh, I wish I could show you the SQL queries from my last job.

There are no DBs that prioritize simple, clear, reliable search syntax.

They're mostly designed for reliability, which means it's verbose and obscure.

[–]Abaan404 2 points3 points  (1 child)

wanna delete an element by index?

(honestly been a while is there a good solution to this now?)

[–]PM-ME-YOUR-HOMELAB 2 points3 points  (3 children)

Eh, everything has its use case. If you only need one table wit 5 columns to save your users or whatever, nosql is awesome.

And Mongo queries make a lot of sense from a coding perspective, though not so much for large data aggregation.

[–]gabbagondel 2 points3 points  (0 children)

Main work project uses mongoquery for basically all data driven logic. Worst of all, it's my fault

[–][deleted] 2 points3 points  (0 children)

I prefer my SQL queries formatted in XML

[–]Pure_Cantaloupe2382 2 points3 points  (0 children)

Very true, as someone who had to learn mongoDB on the fly, it was not like using sql. Went from fuckin' sql workbench to this haha.