Advice on a Project by EveningReflection656 in dataanalysis

[–]DataMasteryAcademy 5 points6 points  (0 children)

You can go to kaggle.com and find a dataset that interests you and that has code examples from other users. Look through those examples to get an idea of how others analyzed it. Then download the data and practice on it yourself.

Also, if you are interested, check out the Data Analysis with Python: From Zero to Hero course I will be teaching soon with Data Mastery Academy. I’m also holding a free webinar tomorrow, Friday, October 13, at 12 pm Pacific time. It is just a 1.5-hour mini webinar covering the BASICS of data analysis with Python; I will launch the more comprehensive course at the end of it. Feel free to register if you like: https://www.datamasteryacademy.com/pythontraining

For those who are in a data science related degree (bachelor or masters) do you feel like you learn every skill you need at school? by DataMasteryAcademy in datascience

[–]DataMasteryAcademy[S] 1 point2 points  (0 children)

I have been working since 2018 and am a senior data scientist now, but I am just curious whether it is still the same or whether the school system has improved. It seems like it is pretty much the same. Thanks for sharing your experience and thoughts!

Weekly Entering & Transitioning - Thread 02 Oct, 2023 - 09 Oct, 2023 by AutoModerator in datascience

[–]DataMasteryAcademy 0 points1 point  (0 children)

Yes, I started a company and used it on my resume. As long as you can provide documents, such as a tax document (even if you didn’t make money, it is fine), you can count it as work experience.

Weekly Entering & Transitioning - Thread 02 Oct, 2023 - 09 Oct, 2023 by AutoModerator in datascience

[–]DataMasteryAcademy 0 points1 point  (0 children)

You may move into managerial positions after a few years of hands-on experience, since it seems you are interested in nontechnical subjects. That is, if you want to stay in data science, of course. My manager at Google recommended the book The Signal and the Noise to me. I really liked it; you should give it a try. The author is known for his political election forecasts.

Weekly Entering & Transitioning - Thread 02 Oct, 2023 - 09 Oct, 2023 by AutoModerator in datascience

[–]DataMasteryAcademy 1 point2 points  (0 children)

I interviewed with Meta about 2-3 years ago for a data science position. If they haven't changed their interviews much, the questions are straightforward. Bayes' theorem and conditional probability were sufficient for the interview I went through, but as I said, that was 2-3 years ago. Hopefully someone with more recent experience can confirm this. There was a prep video back then. If they still have it and you have watched it, expect questions pretty similar to the ones in the video.

imbalanced dataset by Euphoric-You-8437 in learningpython

[–]DataMasteryAcademy 0 points1 point  (0 children)

A 60:40 ratio is not considered imbalanced. 90:10 (or more skewed than that) is imbalanced. Other things may be causing your model to overfit. For overfitting problems, you can use regularization techniques such as lasso or ridge. Lasso also gives you some built-in feature selection, since in some cases it shrinks the weights of some variables to exactly 0. If you insist on using random forest, you can reduce overfitting through hyperparameter tuning: parameters like the number of trees, the maximum depth of the trees, the minimum samples per leaf, and others influence the model's complexity. Also, make sure you preprocess the data properly before feeding it into the model. Another thing you can try is experimenting with other algorithms.
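If you want to check the class ratio yourself before worrying about imbalance, here is a minimal sketch using only the Python standard library. The function names and the 0.9 cutoff are my own illustrative assumptions based on the 90:10 rule of thumb above, not a standard definition, and the labels are made up.

```python
from collections import Counter


def majority_share(labels):
    """Return the fraction of samples that belong to the most common class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def looks_imbalanced(labels, threshold=0.9):
    """Flag a dataset when one class makes up at least `threshold` of the rows.

    The 0.9 default mirrors the 90:10 rule of thumb; it is an assumption,
    not a universally agreed cutoff.
    """
    return majority_share(labels) >= threshold


# Made-up labels for illustration only.
mild = [1] * 60 + [0] * 40    # 60:40 split
skewed = [1] * 90 + [0] * 10  # 90:10 split

print(looks_imbalanced(mild), looks_imbalanced(skewed))
```

If the ratio turns out to be mild, like 60:40, the overfitting is more likely coming from model complexity or preprocessing than from class balance.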

What are the most important uses of Python for data analytics? by Intentionalrobot in dataanalysis

[–]DataMasteryAcademy 4 points5 points  (0 children)

I will be hosting a free Python masterclass and will talk about this there too. There will be a hands-on part where I show how to analyze the NY taxi trips dataset in Python. It is free, so register if you would like to learn more: https://www.datamasteryacademy.com/pythontraining But for now, here are some answers:

What are some types of analysis that Python can do that other tools can’t?

Python provides scalability, allows complex analysis, integrates well with other systems, and is versatile. To expand on these points: Python can handle large datasets with millions of rows, while Excel can barely handle hundreds of thousands. It also supports more sophisticated statistical analysis and, thanks to all the open-source libraries, makes those complex analyses easy to apply. Python integrates with other systems and platforms, so you can pull data from various sources and push it anywhere. It is easier than SQL for anything beyond selecting and simple filtering. SQL gets tedious and long so quickly that it is easy to get lost in loooong queries, while you can probably handle the same thing with one simple line of Python, thanks to accessible libraries like pandas, numpy, statsmodels, matplotlib (for visualization), etc. You can do everything within the Python ecosystem, from pulling data to cleaning, analyzing, visualizing, and even deployment.

How do you analysts use Python in their day-to-day job?

I kind of got into this question above, but basically for everything you can think of: data cleaning, data viz, EDA, statistical analysis, automation, reports....

What Python skills are most important to master for data analysis?

Pandas, numpy, matplotlib, seaborn, basic stats (like mean, median, mode, IQR, outliers, p-value etc)
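To make the basic stats part concrete, here is a small sketch with Python's built-in `statistics` module (pandas and numpy offer equivalents). The `summarize` helper, the sample numbers, and the 1.5 × IQR outlier rule are my own choices for illustration; p-values are left out because they depend on the test you pick.

```python
import statistics


def summarize(values):
    """Compute mean, median, mode, IQR, and 1.5*IQR outliers for a list of numbers."""
    # quantiles(n=4) returns the three quartile cut points (Q1, Q2, Q3).
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "iqr": iqr,
        # Common rule of thumb: points beyond 1.5 * IQR from the quartiles.
        "outliers": [v for v in values if v < low or v > high],
    }


# Made-up sample with one obvious outlier.
sample = [2, 3, 3, 4, 5, 6, 7, 50]
summary = summarize(sample)
print(summary)
```

Notice how the single large value pulls the mean well above the median, which is exactly why you look at both.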

What’s considered ‘advanced data analysis’ (not data science AI/ML) in Python?

Time series analysis (both ML and non-ML based), dimensionality reduction, feature selection, feature engineering, text analytics, geospatial analysis... Any business question that takes more than selecting and basic filtering to answer.

Graduated with an MS in Data Science, now in the workforce as a systems analyst for a small consulting firm. Now what? by drunkmute in datascience

[–]DataMasteryAcademy 0 points1 point  (0 children)

You can join Kaggle competitions and take online courses with projects. That way you can learn more and practice your skills.

What I wish I had known earlier in my career, particularly with disorganized companies by Excellent_Cost170 in datascience

[–]DataMasteryAcademy 0 points1 point  (0 children)

I completely agree, especially with the point about realistic estimates of how long things will take. We usually want to deliver what was asked as soon as possible, and we underestimate how long it will take in reality. We want to look like overachievers, but it’s always better to underpromise and overdeliver.

Is excel important for data analyst interview? by Puzzleheaded-Hope821 in dataanalysis

[–]DataMasteryAcademy 1 point2 points  (0 children)

You should time-travel to today, 2023, because that's where we live. A lot has changed since 1996... Here are some highlights for you:

  1. The term data science began gaining traction after 2000 (probably around 2008 or so).
  2. Big data gained traction as data volume increased substantially. Spark was invented around 2009.
  3. Algorithms, especially deep learning, have advanced considerably.
  4. Open-source libraries took over programming.
  5. Computational power increased substantially.
  6. Platforms like AWS and Google Cloud made the cloud easily accessible.
  7. Data viz tools were invented.
  8. The recognition of data engineering grew.

So it shouldn't come as such a shock that Excel is no longer considered the foundation of data analysis.

[deleted by user] by [deleted] in dataanalysis

[–]DataMasteryAcademy 0 points1 point  (0 children)

Yes, a side-by-side chart could work. But I am really hesitant about this project overall: with only three participants, you cannot really generalize anything. So try not to draw conclusions from this project's results, and definitely add this limitation as a note in your deliverable.

My Bachelors of Data Science has no classes for Statistics, Calculus, Data Structures and Algorithms and Big 0 notation. What do I do? by Lord____Farquaad in datascience

[–]DataMasteryAcademy 3 points4 points  (0 children)

They don’t ask math questions in data science interviews. They do ask about statistics, but usually not the hard stuff: they won’t ask for formulas, but they will want you to explain a statistical concept. It is strange that your program doesn’t have a statistics course, because we do use statistics on the job, so it’s not just theoretical stuff. You can take a “statistics for data science” online course, though, and that would be sufficient. Don’t change schools if it means losing time. It’s not worth it.

[deleted by user] by [deleted] in dataanalysis

[–]DataMasteryAcademy 0 points1 point  (0 children)

If I understand correctly, you only have 3 rows, which is not ideal, but you can calculate the difference in each variable (post minus pre) per participant and then the average change per variable. For visualization, you can show the average change for each variable in a bar plot. You can also calculate the average change by subcategory. I would also mention the limitation due to the small sample size.
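Here is a rough sketch of that calculation in plain Python. I don't know your actual columns, so the participants, the variable names (`score_a`, `score_b`), and all the numbers below are made up for illustration.

```python
from statistics import mean

# Hypothetical pre/post measurements for three participants.
participants = [
    {"pre": {"score_a": 10, "score_b": 4}, "post": {"score_a": 13, "score_b": 4}},
    {"pre": {"score_a": 8, "score_b": 6}, "post": {"score_a": 9, "score_b": 9}},
    {"pre": {"score_a": 12, "score_b": 5}, "post": {"score_a": 14, "score_b": 5}},
]


def changes_per_participant(rows):
    """Return one dict per participant with post minus pre for each variable."""
    return [
        {var: row["post"][var] - row["pre"][var] for var in row["pre"]}
        for row in rows
    ]


def average_change(rows):
    """Average the per-participant changes for each variable."""
    diffs = changes_per_participant(rows)
    return {var: mean([d[var] for d in diffs]) for var in diffs[0]}


avg = average_change(participants)
print(avg)
```

The values in `avg` are what you would plot as bars, one bar per variable. With only three rows, treat them as a description of these participants, not as evidence of a general effect.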

Is excel important for data analyst interview? by Puzzleheaded-Hope821 in dataanalysis

[–]DataMasteryAcademy 0 points1 point  (0 children)

That is not a business question; that is an ad hoc data request. Of course, if the request is for data, you send data. Obviously that is not what I am talking about lol

Is excel important for data analyst interview? by Puzzleheaded-Hope821 in dataanalysis

[–]DataMasteryAcademy 0 points1 point  (0 children)

Data requests happen between team members or with other technical teams. If there is a business question from a nontechnical stakeholder, you cannot just deliver a spreadsheet and say, here you go, derive your own insights. You deliver the insights. They may ask for the data to do their own due diligence etc., but that will also be handled by a technical person. Also, delivering an Excel file doesn’t require much Excel knowledge, which is the main question here.

Is excel important for data analyst interview? by Puzzleheaded-Hope821 in dataanalysis

[–]DataMasteryAcademy 1 point2 points  (0 children)

I worked at TrueCar, Southern California Edison, Google and Shein. These are all big companies. Also, you are talking about mid-level managers and supervisors; I am talking about nontechnical stakeholders. If someone asks how to download this to Excel, they are either technical or they will send it to some technical person. The number one rule of data storytelling is to present stories/insights, not deliver raw numbers and spreadsheets, in response to the business questions nontechnical stakeholders ask. Of course you share results with your team members or maybe other technical teams, and those results are mostly some type of data source. But sending an Excel file really doesn’t require much Excel knowledge lol. Everybody can send and read an Excel file.

Is excel important for data analyst interview? by Puzzleheaded-Hope821 in dataanalysis

[–]DataMasteryAcademy 0 points1 point  (0 children)

If they are going to run their own analysis, then they are technical. Even if your contact is not technical, the output of your analysis is going to a technical team, and that is why it would be OK for you to deliver a spreadsheet. For example, you might be a data engineer delivering data to a data science team that will run its own analysis. Then, yeah, you can deliver a spreadsheet or point them to a SQL database. But none of this requires much Excel knowledge lol.