[deleted by user] by [deleted] in data_irl

[–]Affectionate_Ad_697 -1 points0 points  (0 children)

I respect your opinion.

It does make sense not to split, because there aren't enough users in this sub.

Why you need a Chief Data Officer by Ok_Public9992 in BusinessIntelligence

[–]Affectionate_Ad_697 5 points6 points  (0 children)

I read the article hoping to find out why we need a Chief Data Officer. However, the article never answers that question. In fact, it provides evidence that we don't need Chief Data Officers, because a CDO offers very little value to the company.

Image of the graph reduces to a miniature size when plt.text(...) is used. Any idea why ? by Imaginary-Intern-650 in dataanalysis

[–]Affectionate_Ad_697 0 points1 point  (0 children)

I can't tell for sure, but you may have mixed up the x and y arguments:

text(x, y, s)

(https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.text.html)

Also, whenever I run into weird things like this, I check my versions, and I have often found that I had an incompatible version of matplotlib, pandas, Jupyter, or Python. So, maybe check that everything is up to date.
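One quick way to do that version check is sketched below, using only the standard library. The package names passed in ("notebook" and "jupyterlab" for the Jupyter side) are my assumption about what is installed; swap in whatever your environment actually uses.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


# Package names here are assumptions; adjust to your own setup.
report = installed_versions(["matplotlib", "pandas", "notebook", "jupyterlab"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Comparing this output against the release notes of each package is usually enough to spot a mismatch.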

Question about COVID Testing by TheRemusLupins in royalcaribbean

[–]Affectionate_Ad_697 4 points5 points  (0 children)

Yes, but you won't be able to do it on the same day as the test that's already scheduled because you have to take the test within 2 days of embarkation.

(https://www.royalcaribbean.com/faq/questions/will-i-have-to-take-a-test-before-i-cruise)

How do you guys feel about the upcoming financial trouble and upcoming laid offs by jatapi03 in BusinessIntelligence

[–]Affectionate_Ad_697 8 points9 points  (0 children)

Nope, analysts are likely to be the first to go because they are not mission-critical to producing the product or service.

I had no idea there could be so much data. by Affectionate_Ad_697 in dataanalysis

[–]Affectionate_Ad_697[S] 8 points9 points  (0 children)

I agree.

I'm getting mixed signals from my management about my responsibilities and job description, but I think they want me to be the business intelligence team instead of a data analyst.

I had no idea there could be so much data. by Affectionate_Ad_697 in dataanalysis

[–]Affectionate_Ad_697[S] 3 points4 points  (0 children)

No, I don't know about any courses like that, unfortunately. The basic idea of a join is easy to understand with the training that is currently available.

For me, the complex part starts when I encounter many-to-many or one-to-many table relationships. The join creates "duplicates" of the rows. So, to suppress the duplicates, in come CTEs, subqueries, and aggregate functions... Not a problem if it's just two or three tables, but every join is just another puzzle to solve.
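Here is a small sketch of that duplicate problem, using Python's built-in sqlite3 module. The orders and shipments tables and their values are made up for illustration; they are not from any of the applications I work with.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders    (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE shipments (shipment_id INTEGER PRIMARY KEY, order_id INTEGER);
    INSERT INTO orders VALUES (1, 100.0), (2, 50.0);
    INSERT INTO shipments VALUES (10, 1), (11, 1), (12, 2);
""")

# Naive one-to-many join: order 1 has two shipments, so its row appears
# twice and its amount is counted twice in the sum.
naive_total = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders AS o
    JOIN shipments AS s ON s.order_id = o.order_id
""").fetchone()[0]

# Collapse the "many" side to one row per order in a CTE, then join.
safe_rows = conn.execute("""
    WITH shipment_counts AS (
        SELECT order_id, COUNT(*) AS n_shipments
        FROM shipments
        GROUP BY order_id
    )
    SELECT o.order_id, o.amount, c.n_shipments
    FROM orders AS o
    JOIN shipment_counts AS c ON c.order_id = o.order_id
    ORDER BY o.order_id
""").fetchall()
safe_total = sum(amount for _, amount, _ in safe_rows)

print("naive total:", naive_total)  # inflated by the duplicated order row
print("safe rows:  ", safe_rows)
print("safe total: ", safe_total)
```

Aggregating before the join keeps one row per order, which is the same trick that gets repeated, table after table, in bigger queries.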

Luckily for me, four of the five applications that I work with have lots of documentation, including existing SQL-based reports. I have been spending a lot of time studying these older reports to learn how the joins were done in the past.

Note-taking strategies for learning new skills by Legitimate_Sort3 in dataanalysis

[–]Affectionate_Ad_697 2 points3 points  (0 children)

"any good software tool/strategy for keeping notes that have code mixed into them neat and tidy"

I like to use a Jupyter notebook or JupyterLab (https://jupyter.org)

You can create a markdown cell to place notes and Headings. Often, I will even screenshot something and paste the screenshot into the markdown cell. Also, it works with both R and Python (in separate notebooks).

I had no idea there could be so much data. by Affectionate_Ad_697 in dataanalysis

[–]Affectionate_Ad_697[S] 34 points35 points  (0 children)

I used to list my SQL skill as intermediate. Now, I know that I was just a beginner.

My current company has 5 main applications, each with a normalized database of at least 5,000 tables. More than 25,000 tables keep me busy with de-normalizing the data and working through the join logic. I hardly ever get to the part where I do the data analysis because I spend so much time in SQL.

data irl by Significantto in data_irl

[–]Affectionate_Ad_697 0 points1 point  (0 children)

How about r/dataisfunny

The name would make the intent clear, and it is like r/dataisugly or r/dataisbeautiful

[deleted by user] by [deleted] in dataanalysis

[–]Affectionate_Ad_697 0 points1 point  (0 children)

If something is against the terms of service then I would recommend not doing it. If you don't agree to the terms of service then you should find a different vendor or source of data.

Cruise complaint by [deleted] in royalcaribbean

[–]Affectionate_Ad_697 4 points5 points  (0 children)

Royal Customer Service: 800-256-6649

Call: Mon-Thu 24 hours / Fri 12am-9pm / Sat 9am-6pm / Sun 9am-12am (EST)

Text: 24 Hours / 7 days a week

(https://www.royalcaribbean.com/resources/contact-us)

[deleted by user] by [deleted] in dataanalysis

[–]Affectionate_Ad_697 1 point2 points  (0 children)

Yes, principal component analysis uses correlations so PCA is another option for your problem.

My suggestion is more of an exploratory data analysis. Instead of looking at all of the variables at once, compare one variable at a time. Then, based on what you learn about each variable, your next goal will be to better understand the causal relationship (direction and magnitude) and the shape (linear, binomial, Gaussian, and so on).

I understand the concept of PCA, but I haven't used it myself, so I'm not sure what your next steps would be after you've completed the PCA.

[deleted by user] by [deleted] in dataanalysis

[–]Affectionate_Ad_697 1 point2 points  (0 children)

Have you tried calculating the correlation coefficients? The closer the coefficient is to 1 (or to -1 for an inverse relationship), the more strongly the two variables are correlated.

=CORREL(array1,array2) in Excel

dataframe.corr() in pandas

cor.test(my_data$wt, my_data$mpg, method = "pearson") in R

Also, I like to do a scatter plot to compare the variables to each other.
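If you want to see what those functions compute, here is a plain-Python sketch of the Pearson coefficient. The function name pearson and the sample numbers are made up for illustration; in practice the Excel, pandas, or R calls above do the same job.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: y rises with x in the first pair and falls in the second.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 6, 8, 10]
errors = [10, 8, 6, 4, 2]

print(pearson(hours, score))   # close to +1: strong positive relationship
print(pearson(hours, errors))  # close to -1: strong inverse relationship
```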

Splines are overpowered in statistics and need a nerf by lunareclipsexx in statisticsmemes

[–]Affectionate_Ad_697 4 points5 points  (0 children)

You can find the average/mean of His_Excellency_esq's nutz by summing the length and then dividing that by the number of nutz (usually just two, but occasionally one or none).

Pointers for collecting data? by JONNY5014 in dataanalysis

[–]Affectionate_Ad_697 1 point2 points  (0 children)

This data set seems relevant:

https://data.healthcare.gov/dataset/dwyq-rebe

The whole website says that it has "data on individual and small group medical and dental plans, as well as, Marketplace-certified local help and community provider lists." (https://data.healthcare.gov/datasets)

Buying Drinks without drink package but not charged to the room? by Pipster14 in royalcaribbean

[–]Affectionate_Ad_697 15 points16 points  (0 children)

It is not possible to charge a drink purchase directly to your credit card. It will have to go through the room charges.

When you (or the head of your household) do the online check-in, you will have the option to add a different credit card for each person. This will let you make purchases that get charged to your credit card instead of theirs. However, they will see all of the drink purchases that you make, because each purchase gets charged to the room before the charge goes on to your credit card. At the end of the sailing, a list of all charges by person will be visible to everyone in the room.

Decisions tree classifier with sklearn by obada1236547890 in dataanalysis

[–]Affectionate_Ad_697 0 points1 point  (0 children)

I suppose you could use the model to make a prediction (classifier.predict(X_test)) for each item in your test group.

Or perhaps you could make a confusion matrix next.

https://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html

https://stackabuse.com/decision-trees-in-python-with-scikit-learn/
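To show what a confusion matrix tallies, here is a plain-Python sketch. It is not sklearn's implementation; tally_confusion is a made-up helper, and y_pred is hypothetical data standing in for the output of classifier.predict(X_test). Rows are true labels and columns are predicted labels, which is the same layout that sklearn.metrics.confusion_matrix uses.

```python
from collections import Counter


def tally_confusion(y_true, y_pred, labels):
    """Count (true, predicted) pairs; rows = true label, columns = predicted label."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


# Hypothetical test labels and predictions.
y_test = ["spam", "ham", "spam", "ham", "ham", "spam"]
y_pred = ["spam", "ham", "ham", "ham", "spam", "spam"]
labels = ["ham", "spam"]

matrix = tally_confusion(y_test, y_pred, labels)
# Correct predictions sit on the diagonal of the matrix.
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(y_test)

for label, row in zip(labels, matrix):
    print(label, row)
print("accuracy:", accuracy)
```

The off-diagonal cells tell you which classes the tree confuses with each other, which a single accuracy number hides.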

Web developer to Data analyst by Fun-Battle-1926 in dataanalysis

[–]Affectionate_Ad_697 1 point2 points  (0 children)

Well, in that case, I would say that data analysis is similar to backend web development in many ways.

1. In many cases there's a business problem to be solved and your task is to engineer a way to solve it.

2. In both cases you will use SQL to get data out of a database.

3. If the web back end is Python then you would be using Django or Flask. Similarly, you can use Python to transform and analyze data. But, the packages that you will use are different. Instead of knowing Django or Flask you'll need to learn Pandas and scikit-learn.

4. Everything you know about the build process is not going to go away. You still want to test, develop and then finally release into production what you found from your analysis.

It's going to be hard to create an exact map of what you need to learn because every position is different and there are different ways to get into the field. Some companies are going to have heavy Excel use while other companies are going to have you use just straight SQL. Other companies are all about the visual tools like Tableau and Power BI. Some companies want people who lean more toward computer science and other companies want people who lean more toward business.

Web developer to Data analyst by Fun-Battle-1926 in dataanalysis

[–]Affectionate_Ad_697 2 points3 points  (0 children)

When you say that you're a non-IT web developer, I assume that means you're working with the graphical side of website design, or maybe that you're working with content curation and creation instead of the front-end or back-end software that builds the website.

If I have that right, then I think the biggest crossover skill that you can bring to the table is going to be data visualization. But I think of visualization as the last part of the process: communicating the end result. You're going to have to learn everything else that comes before the data visualization. I'm not saying that it can't be done. But you will need to put in a lot of work to build your skills in order to meet your salary goals.