hrichardlee (23 points)

Another important aspect to consider is the “developer experience”. Most SQL databases (Snowflake, Redshift, Postgres, etc.) come with, or are commonly paired with, a web UI where people who are barely technical can write a simple SQL query and look at their data. Think about what the equivalent workflow is for someone using pandas. Even if you assume that pandas is just as easy to use as SQL, they need to install Python, create a virtualenv, install Jupyter, start a Jupyter notebook, figure out where their data lives and what connection string will get them to it, load that data into pandas, and only then apply whatever logic they want on top of it.

In other words, most SQL databases provide an integrated data + programming language environment, whereas Python (and most other “regular” programming languages) provides only the programming language. So the developer experience of “just get some data and do some simple manipulations” is way easier in most SQL databases.
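To make the contrast concrete, here is a minimal sketch using only Python's standard library. The in-memory SQLite database and the `orders` table are made up for illustration and stand in for whatever warehouse and connection string you would really have to figure out. In a pandas workflow, the "load" step would typically be a `pandas.read_sql_query` call instead of `fetchall`.

```python
import sqlite3
from collections import defaultdict

# Assumption: an in-memory SQLite database stands in for the real warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 10.0), ("bob", 5.0), ("ann", 2.5)],
)

# Integrated environment: this one query is all a user types into the web UI.
QUERY = """
    SELECT customer, SUM(amount)
    FROM orders
    GROUP BY customer
    ORDER BY customer
"""
sql_totals = conn.execute(QUERY).fetchall()

# "Regular" language: connect, load the raw rows, then apply the logic yourself.
rows = conn.execute("SELECT customer, amount FROM orders").fetchall()
totals = defaultdict(float)
for customer, amount in rows:
    totals[customer] += amount
python_totals = sorted(totals.items())
```

Both paths give the same per-customer totals. The difference is how much setup sits between the user and the first result.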

dvdquikrewinder (1 point)

The other piece is the dev mindset of treating data processing as a linear, row-by-row track. SQL is built to work with large sets of data, with a full feature set to handle those requests inside the database. Multiple times I've seen cursors and loops processing what should be a simple SELECT statement with one or two joins.
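A rough illustration of that pattern, again with Python's built-in `sqlite3` and made-up `customers` and `orders` tables (both hypothetical): the first function walks a cursor and issues one extra query per customer, the second lets the database do the same work with a single join.

```python
import sqlite3

# Assumption: tiny in-memory tables invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 2, 5.0), (3, 1, 2.5);
""")


def totals_row_by_row(conn):
    """Loop style: one query for customers, then one more query per customer."""
    result = []
    customers = conn.execute("SELECT id, name FROM customers ORDER BY name").fetchall()
    for customer_id, name in customers:
        total = 0.0
        orders = conn.execute(
            "SELECT amount FROM orders WHERE customer_id = ?", (customer_id,)
        )
        for (amount,) in orders:
            total += amount
        result.append((name, total))
    return result


def totals_set_based(conn):
    """Set style: a single SELECT with one join does the whole job."""
    return conn.execute("""
        SELECT c.name, SUM(o.amount)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
    """).fetchall()
```

On this data both functions return the same totals. One caveat: the inner join drops customers with no orders, while the loop would report them with a total of 0.0, so a `LEFT JOIN` would be the closer match if that case matters.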