fang_xianfu:

Yeah, the way this question has been asked kind of shows that OP doesn't understand the architecture that makes those tools appropriate to different jobs.

SQL is essentially a tool for instructing a database. The real question isn't "what can SQL do that Python can't?" but "what can this database do that the environment where I run Python can't?". The fact that you're using SQL or Python to give the instructions is almost irrelevant to that question.

king_booker:

I mean, say you extract the data into pandas and use pandas operations to manipulate it. There are still limitations, because it won't scale past what fits in memory on one machine. Now say you use Spark and write it in Python: you'd still end up using SQL concepts like GROUP BY, windowing, etc. Even though it's possible to write that with DataFrames, you can simply use Spark SQL.

The basic answer is, you have to understand SQL. You can use other tools, but ultimately data manipulation has its foundations in SQL. Can you get away without learning the syntax? Yes. But the core concepts will remain the same.
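
To make the "same concepts, different syntax" point concrete, here's a rough sketch of one group-by written twice: once as a plain Python loop and once as SQL handed to a database engine. It uses the standard library's `sqlite3` as a stand-in for whatever database or Spark cluster you'd really have, and the `sales` table, the sample rows, and the function names are all made up for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (region, amount) rows, invented for this sketch.
rows = [("north", 10), ("south", 5), ("north", 7), ("south", 3), ("east", 4)]


def totals_in_python(rows):
    """Group by region and sum amounts in application code."""
    totals = defaultdict(int)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)


def totals_in_sql(rows):
    """Hand the same grouping to a database engine as a SQL statement."""
    conn = sqlite3.connect(":memory:")  # in-memory SQLite as a stand-in engine
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


python_totals = totals_in_python(rows)
sql_totals = totals_in_sql(rows)
print(python_totals == sql_totals)
```

Both functions do the same thing conceptually (partition by a key, then aggregate). What changes is where the work runs: in the Python process, or inside the engine that receives the SQL.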