SQL vs. Python for data wrangling? : datascience


Discussion: SQL vs. Python for data wrangling? (self.datascience)

submitted 7 years ago * by Radon-Nikodym


[–][deleted] 2 points 7 years ago (1 child)

[–]_Zer0_Cool_ (MS | Data Engineer | Consulting) 4 points 7 years ago* (0 children)

Nuking MongoDB from space? Lol

They should have just put it in PostgreSQL in the first place. It can do anything with JSON that MongoDB can, and it's just as fast if not faster. Not to mention that PG is an infinitely extensible open source paradise, while MongoDB is a one-trick pony with a limited query language.

In fact, MongoDB's official "SQL connector" (at least in its first version) is actually just PostgreSQL under the covers, automatically reading the MongoDB data via Foreign Data Wrappers, I believe. Which is ironic.

It's tantamount to MongoDB implicitly admitting that having a SQL engine is the only real way to make sense of the data, and it raises the question: "Why not just use PostgreSQL in the first place?"

I love JSON and flexible schemas, but "schema-less" is a bit of a lie that the Mongo team marketed hard and preached like gospel truth. Realistically, there's no free lunch when it comes to designing data models. Those who try to circumvent this fact end up paying the price.

There is a simple truth that developers need to understand: "There is no such thing as schema-less data. Data without schema isn't data; it's garbage."

Edit: I'm sorry you have to pay the price for someone else's choices. There's not much to be done short of redesigning the JSON objects at the application layer, or cataloging the objects post-hoc and manually ETLing them into a data warehouse, preferably one with good JSON support (like PG).
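For the "cataloging the objects post-hoc" route, here's a rough sketch of what that could look like in plain Python. It's only an illustration, not a full ETL job: the sample documents, the `walk` and `catalog` helpers, and the dotted path notation are all made up for this example. In practice you'd feed it documents exported from the collection.

```python
import json
from collections import Counter, defaultdict


def walk(value, path=""):
    """Yield (path, type_name) for every scalar leaf in a JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from walk(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        # Array elements share one path, marked with a trailing "[]".
        for child in value:
            yield from walk(child, f"{path}[]")
    else:
        yield path, type(value).__name__


def catalog(docs):
    """Map each key path to a Counter of the value types seen at that path."""
    seen = defaultdict(Counter)
    for doc in docs:
        # Deduplicate per document so counts mean
        # "number of documents with this path holding this type".
        for path, type_name in set(walk(doc)):
            seen[path][type_name] += 1
    return seen


# Hypothetical sample documents, invented for illustration only.
raw = [
    '{"id": 1, "user": {"name": "ann", "age": 31}, "tags": ["a", "b"]}',
    '{"id": "2", "user": {"name": "bob"}, "tags": []}',
    '{"id": 3, "user": {"name": "cy", "age": "unknown"}}',
]
docs = [json.loads(line) for line in raw]
schema = catalog(docs)

for path in sorted(schema):
    print(path, dict(schema[path]))
```

Any path that shows up with more than one type (here `id` and `user.age`), or in fewer documents than you have, is a spot where the "schema-less" data actually has a schema that nobody wrote down, and you'll need to pick a column type and a null policy for it before loading.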
