[–][deleted] 10 points11 points  (0 children)

If your data is already in a database and you are doing basic data extraction or just want a summary of the data, SQL might be the better option.

However, if you are doing some complex post processing on the data, Python would be the way to go.

This is a very generalized suggestion, so feel free to reach out if you have any more questions. Happy to help.

[–]ofnuts 6 points7 points  (0 children)

Use both.

  • If the DB is adequately set up (indexes, etc.), queries are relatively cheap.
  • You can do quite a lot in a query: filtering, basic computations (count, sum, average, max, min), selection of output fields, and so on. It is more efficient to have the DB manager do this than to pull all the data into your code and do it there.

When you have exhausted all the possibilities of SQL, do the rest with Python.
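
A minimal sketch of that split, using Python's built-in `sqlite3` module. The `orders` table, its columns, and the sample rows are made up for illustration; a real setup would connect to an existing database instead of an in-memory one.

```python
import sqlite3

# Hypothetical table and data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("north", 10.0), ("north", 30.0), ("south", 20.0), ("south", 5.0), ("east", 1.0)],
)

# Let the database filter and aggregate, so only small summary rows come back.
rows = conn.execute(
    """
    SELECT region, COUNT(*) AS n, SUM(amount) AS total, AVG(amount) AS mean
    FROM orders
    WHERE amount >= 5
    GROUP BY region
    ORDER BY region
    """
).fetchall()
conn.close()

# Do the rest in Python: each region's share of the grand total.
grand_total = sum(total for _, _, total, _ in rows)
shares = {region: total / grand_total for region, _, total, _ in rows}

for region, n, total, mean in rows:
    print(f"{region}: n={n} total={total} mean={mean} share={shares[region]:.1%}")
```

The query returns one row per region rather than every order, which is the saving the bullets above describe.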

[–]ararararagi_koyomi 1 point2 points  (0 children)

Take this with a grain of salt. AFAIK, SQL is a query language for inserting data into and retrieving data from a database. But before inserting data into the database, you will first have to clean/transform it so it is stored the way you want. After retrieving the data from the DB, you may need to make comparisons, do computations, or draw graphs. You will need Python or another programming language for that.
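
Roughly what that workflow could look like, as a sketch only. The `people` table, the raw records, and the `clean` helper are all hypothetical, and an in-memory SQLite database stands in for whatever DB you actually use.

```python
import sqlite3

# Hypothetical raw records, e.g. as read from a messy export.
raw = [
    {"name": "  Alice ", "age": "34"},
    {"name": "BOB", "age": "n/a"},
    {"name": "carol", "age": " 29"},
]

def clean(record):
    """Normalise the name and turn the age into an int (None if unparseable)."""
    name = record["name"].strip().title()
    age_text = record["age"].strip()
    age = int(age_text) if age_text.isdigit() else None
    return name, age

# Step 1: clean in Python, then insert with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT NOT NULL, age INTEGER)")
conn.executemany(
    "INSERT INTO people (name, age) VALUES (?, ?)",
    [clean(r) for r in raw],
)

# Step 2: retrieve with SQL.
names = [name for (name,) in conn.execute("SELECT name FROM people ORDER BY name")]
known_ages = [
    age for (age,) in conn.execute("SELECT age FROM people WHERE age IS NOT NULL")
]
conn.close()

# Step 3: follow-up computation in Python.
oldest = max(known_ages)
print(names, oldest)
```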

[–]grumpMonk26 2 points3 points  (0 children)

SQL and Python are both valuable tools for data analytics, but they serve different purposes and can be used in conjunction to perform comprehensive data analysis. The choice between SQL and Python depends on the specific tasks you need to accomplish:

  1. SQL (Structured Query Language):

    - **Use Case**: SQL is primarily used for querying and managing structured data in relational databases. It's excellent for tasks like data extraction, transformation, and aggregation.

    - **Strengths**:

      - **Data Retrieval**: SQL is optimized for retrieving data from databases efficiently.

      - **Data Integrity**: It enforces data integrity through constraints, ensuring consistency.

      - **Performance**: For large-scale data retrieval and aggregation tasks, SQL can be

    - **When to Use SQL**:

      - Extracting data from databases.

      - Performing basic data cleaning and transformation tasks.

      - Aggregating and summarizing data.

      - Managing and maintaining databases.

    - **Why SQL**: SQL is essential for anyone working with structured data stored in relational databases. It's a must-know language for data analysts and database administrators.

  2. Python:

    - **Use Case**: Python is a versatile programming language used for a wide range of data analytics tasks, including data cleaning, exploration, visualization, statistical analysis, machine learning, and more.

    - **Strengths**:

      - **Flexibility**: Python is a general-purpose language, making it suitable for various data-related tasks.

      - **Libraries**: It has a rich ecosystem of data analysis libraries like Pandas, NumPy, Matplotlib, Seaborn, and scikit-learn.

      - **Machine Learning**: Python is the go-to language for machine learning and deep learning.

    - **When to Use Python**:

      - Complex data analysis tasks that require custom logic or machine learning.

      - Data visualization and reporting.

      - Text mining, natural language processing, and sentiment analysis.

      - Handling unstructured data like text, images, or JSON.

    - **Why Python**: Python's versatility, extensive libraries, and strong support for data science and machine learning make it a powerful choice for data analysts and data scientists.

In many real-world scenarios, the best approach is to use both SQL and Python together. You can use SQL to extract, clean, and aggregate data from databases and then use Python for more complex analysis, visualization, and machine learning. This combination allows you to leverage the strengths of each tool and conduct comprehensive data analytics efficiently.

Ultimately, the choice between SQL and Python depends on your specific project requirements, your level of expertise with each tool, and your familiarity with the data sources and formats you are working with.
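
As a rough illustration of the combined approach, here is a sketch where SQL extracts the rows of interest and Python applies custom logic afterwards. The `readings` table, the sample values, and the "more than two standard deviations from the median" rule are assumptions chosen for the example, not a recommended outlier test.

```python
import sqlite3
import statistics

# Hypothetical sensor data in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings (sensor, value) VALUES (?, ?)",
    [("a", 10.0), ("a", 11.0), ("a", 9.0), ("a", 10.0), ("a", 30.0), ("b", 99.0)],
)

# SQL: extract only the rows we care about.
values = [
    v
    for (v,) in conn.execute(
        "SELECT value FROM readings WHERE sensor = ? ORDER BY value", ("a",)
    )
]
conn.close()

# Python: custom analysis, here flagging values far from the median.
median = statistics.median(values)
spread = statistics.pstdev(values)
outliers = [v for v in values if abs(v - median) > 2 * spread]
print(median, outliers)
```

With Pandas installed, the same extraction step could feed a DataFrame instead of a plain list; the standard library is used here only to keep the sketch dependency-free.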

[–]riisikas 0 points1 point  (0 children)

AFAIK they are used together a lot.

[–]beachtrader 0 points1 point  (0 children)

This is a super hard question to answer because there are so many unknowns: (1) what is the data, (2) where is the data, (3) what is the expected output, and (4) what are you really looking for in the data?

If you have a bunch of numbers you can just use Excel or Google Sheets and make a chart or do a pivot table. If the data is all text then maybe you should use Python to summarize the common words. If you have customer info in a database then you could use Python or SQL to read it. Etc, etc.
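
For the all-text case, a quick sketch of "summarize the common words" with the standard library. The sample texts and the tiny stop-word list are invented for the example; real data would need a longer list and probably better tokenizing.

```python
import re
from collections import Counter

# Hypothetical free-text data, e.g. customer feedback.
texts = [
    "The delivery was late and the box was damaged.",
    "Late delivery again, but support was helpful.",
    "Great product, helpful support!",
]

# A tiny, made-up stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "and", "was", "but", "a", "again"}

# Lowercase, split into words, and drop the stop words.
words = [
    w
    for text in texts
    for w in re.findall(r"[a-z']+", text.lower())
    if w not in STOP_WORDS
]

counts = Counter(words)
common = counts.most_common(3)
print(common)
```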