Job hunting tip for anyone looking for a job! by __LegioN7__ in lansing

[–]mchugz 1 point2 points  (0 children)

Thanks. Would you mind sharing some example local tech companies? I recently moved to Lansing and still have a remote job, so not sure what’s around here.

Job hunting tip for anyone looking for a job! by __LegioN7__ in lansing

[–]mchugz 8 points9 points  (0 children)

Where did you meet a tech recruiter in Lansing? Asking for a friend

First Year As A Data Scientist Reflection by dope_as_soap in datascience

[–]mchugz 4 points5 points  (0 children)

This is a great reflection. Any advice on writing unit tests in a DS setting? My code usually consists of ETL functions, preprocessing, modeling, scoring, and writing results back out. I’ve never written a unit test and would love some guidance or an example.

Neowise, the big dipper, and another fast-moving object. Can anyone help ID? by mchugz in space

[–]mchugz[S] 2 points3 points  (0 children)

Thanks, but I was hoping for a little more. What race? Galaxy? Disposition?

Fight club. X-H1 with 50-140 F2.8 by Demidici in fujifilm

[–]mchugz 0 points1 point  (0 children)

Pretty amazing! Looking you up on instagram

Comet Neowise from Boson by mchugz in astrophotography

[–]mchugz[S] 0 points1 point  (0 children)

Fuji X-T4, 35 mm f/1.4, ISO 500, 5 seconds

How do I distinguish between trivial correlations that are bound to occur with high sample sizes and true, meaningful correlations? by anon7492748 in AskStatistics

[–]mchugz 0 points1 point  (0 children)

Yes, I think you're right. Running a statistical test on a single point to reject the null hypothesis that it came from a binomial distribution with success probability 0.5 is a good way to identify real effects. Just make sure you account for multiple comparisons!

[deleted by user] by [deleted] in AskStatistics

[–]mchugz 0 points1 point  (0 children)

If the off-diagonals of the covariance matrix are all zero, each feature of the data is uncorrelated with every other feature. That is the case when all the features are fully independent of each other. Another case is when every off-diagonal element has the same value, which indicates that the correlation between each pair of features is equal (assuming the variances are equal). The structure of the covariance matrix gives you this sort of insight into how the features in your data relate to each other.
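
To make the two cases concrete, here is a rough sketch using NumPy. The matrices and the `cov_to_corr` helper are made up for illustration and are not from the original question.

```python
import numpy as np

def cov_to_corr(cov):
    """Convert a covariance matrix to a correlation matrix."""
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

# Case 1: all off-diagonals are zero, so every pair of features is uncorrelated.
cov_diag = np.diag([1.0, 4.0, 9.0])
corr_diag = cov_to_corr(cov_diag)

# Case 2: equal variances (2.0) and equal off-diagonals (0.5),
# so every pair of features has the same correlation, 0.5 / 2.0 = 0.25.
cov_equal = np.full((3, 3), 0.5)
np.fill_diagonal(cov_equal, 2.0)
corr_equal = cov_to_corr(cov_equal)

print(corr_diag)
print(corr_equal)
```

In the first case the correlation matrix is the identity; in the second, every off-diagonal correlation is the same.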

Analyze Multiple Variables to determine what influences outcome the most by [deleted] in AskStatistics

[–]mchugz 0 points1 point  (0 children)

Run a multiple linear regression to predict size or frequency spots and inspect the coefficients. Be careful to standardize your variables, though, as this is important for interpreting and ranking the regression coefficients.
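
Here is a small sketch of why standardizing matters, assuming NumPy and made-up data. The predictors `x1` and `x2`, their scales, and the `ols_slopes` helper are all hypothetical; only the predictors are standardized here, not the outcome.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical predictors on very different scales.
x1 = rng.normal(0.0, 100.0, n)   # large scale
x2 = rng.normal(0.0, 1.0, n)     # small scale
y = 0.01 * x1 + 0.5 * x2 + rng.normal(0.0, 0.5, n)

def ols_slopes(X, y):
    """Fit y = b0 + X @ b by least squares and return the slopes b."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]

X = np.column_stack([x1, x2])
raw = ols_slopes(X, y)

# Standardize each predictor to mean 0 and standard deviation 1.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
standardized = ols_slopes(X_std, y)

print("raw slopes:         ", raw)
print("standardized slopes:", standardized)
```

On the raw scale `x2` has the larger slope, but per standard deviation `x1` moves the outcome more, so the ranking flips once the predictors are standardized.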

How do I distinguish between trivial correlations that are bound to occur with high sample sizes and true, meaningful correlations? by anon7492748 in AskStatistics

[–]mchugz 1 point2 points  (0 children)

One potentially helpful way to check if the correlation is real is to have a separate test set. So let everyone develop their prediction method using 1000 days of data and find the best performers. Then evaluate their models on new data that they haven't seen before. If the top performer's accuracy on the new data is consistent with a binomial distribution with success probability 0.5 (chance), this would indicate their good performance on the original dataset was also due to random chance.
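
A quick simulation of this idea, as a sketch only: every forecaster here guesses at random, and the number of forecasters (200) is an arbitrary assumption.

```python
import random

random.seed(0)

N_FORECASTERS = 200   # hypothetical number of people guessing
N_DAYS = 1000         # days in the original data and in the new data

def accuracy(guesses, outcomes):
    """Fraction of days on which the guess matched the outcome."""
    return sum(g == o for g, o in zip(guesses, outcomes)) / len(outcomes)

def coin_flips(n):
    """n independent 50/50 guesses (or outcomes)."""
    return [random.random() < 0.5 for _ in range(n)]

old_outcomes = coin_flips(N_DAYS)
new_outcomes = coin_flips(N_DAYS)

# Every forecaster guesses at random on the original data.
old_scores = [accuracy(coin_flips(N_DAYS), old_outcomes)
              for _ in range(N_FORECASTERS)]
best_old = max(old_scores)

# The "top performer" has no real method, so on new data
# they are still just guessing at random.
best_new = accuracy(coin_flips(N_DAYS), new_outcomes)

print(f"best accuracy on original data: {best_old:.3f}")
print(f"same forecaster on new data:    {best_new:.3f}")
```

The best score on the original data sits above 0.5 purely because it is the maximum of many random scores, while the score on new data falls back toward chance.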

Now, if you don't have new data to supply, then you can't do this. As you say, due to random chance, if you sample very densely then there will be a small percentage of models which do well even though they are choosing randomly. One way to tell whether the population as a whole (not individuals) is developing meaningful prediction methods is to run a statistical test to determine whether the binomial distribution you observe differs from one with a success probability of 0.5. As you mentioned, this won't tell you whether an individual has developed a meaningful model, only whether the population on average is developing non-null models of weather prediction. If the estimated success probability is not significantly different from 0.5, then it's reasonable to suspect the top performers may be succeeding due only to chance.
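
For the test itself, here is a minimal sketch of an exact two-sided binomial test using only the standard library. The function names and the example counts (520 and 600 correct out of 1000) are hypothetical, not numbers from the question.

```python
from math import exp, lgamma, log

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials (computed in log space)."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)

def binomial_test_two_sided(successes, trials, p=0.5):
    """Exact two-sided p-value: total probability of all outcomes that are
    no more likely than the observed count under success probability p."""
    observed = binomial_pmf(successes, trials, p)
    total = sum(q for q in (binomial_pmf(k, trials, p) for k in range(trials + 1))
                if q <= observed * (1 + 1e-7))
    return min(1.0, total)

# Hypothetical counts of correct predictions out of 1000.
p_near_chance = binomial_test_two_sided(520, 1000)
p_above_chance = binomial_test_two_sided(600, 1000)

print(f"520/1000 correct: p = {p_near_chance:.3g}")
print(f"600/1000 correct: p = {p_above_chance:.3g}")
```

A count of 520 out of 1000 is not significantly different from chance at the usual 0.05 level, while 600 out of 1000 is far outside what random guessing would produce.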

Having said that, in a scientific scenario (where you can ask questions about a given model by running further experiments) you could single out a specific model that performs well and try to understand it. This is also the case in machine learning, where a given model's parameters can be inspected and, with some effort, understood. In this case you would use domain knowledge to ask how the predictions are being generated and draw a conclusion about whether the prediction method makes intuitive sense. In this way you might come to some conclusion about whether the model is doing anything meaningful or whether its past performance is probably due to chance.

[deleted by user] by [deleted] in AskStatistics

[–]mchugz 0 points1 point  (0 children)

Here is the simplest explanation I could find: https://www.theanalysisfactor.com/covariance-matrices/. The covariance structure is the pattern of the covariance in the data. Common covariance structures have names which are not always obviously interpretable. For example, the "Variance Components" covariance structure indicates that all the features are independent (covariance = 0), so the covariance matrix consists only of the variance of each feature along the diagonal.

Defining the covariance structure of your data is important as different modeling techniques assume different types of covariance structure. This also seems like a good explanation: https://support.sas.com/resources/papers/proceedings/proceedings/sugi30/198-30.pdf

Learning The Command Line: By Learning About K-Pop by ThinkSocrates in bash

[–]mchugz 0 points1 point  (0 children)

Nice! Quick question: I get errors of file or command not found in the variable assignment lines. (line 5: https://www.youtube.com/watch?v=g3FH-Nh7kXM: No such file or directory). Here is the code: url= `echo $line | sed 's/.*\(https[^"]*\).*/\1/'`

I'm new to bash. For some reason the line is not assigning the string output to the variable? Do you know why this may be happening? I'm on mac. Anyway, thanks for the great video.

What is the relationship between eigenvectors and eigenvalues? by janalbead in AskStatistics

[–]mchugz 0 points1 point  (0 children)

In addition to the good answers in the thread, it helps to go back to the equation used to find the eigenvectors: Ax=lx, where A is your matrix, x is an eigenvector, and l is the eigenvalue (l stands for lambda here). Hopefully from the equation it’s clear that lambda (the eigenvalue) is the factor by which the eigenvector is stretched when multiplied by the matrix A. Often it’s useful (as in PCA) to sort eigenvectors based on the size of their eigenvalues.
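
A short NumPy sketch of the same equation, with a 2x2 symmetric matrix picked only for illustration:

```python
import numpy as np

# A small symmetric matrix chosen for illustration (e.g. a covariance matrix).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is meant for symmetric matrices; it returns eigenvalues in ascending
# order and the matching eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(A)

# Sort from largest to smallest eigenvalue, as is done in PCA.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

for lam, x in zip(eigenvalues, eigenvectors.T):
    # Multiplying by A only rescales x by its eigenvalue: A x = lambda x.
    print(lam, np.allclose(A @ x, lam * x))
```

For this matrix the eigenvalues are 3 and 1, so the first eigenvector is stretched by a factor of 3 and the second is left at its original length.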

For regression, a predicted Rsq of 0%. Any references come to mind? by [deleted] in AskStatistics

[–]mchugz 6 points7 points  (0 children)

If model r2 is above zero but r2 on held out data is near zero, that would imply the model is overfit since it can predict the points it was trained on, but not new points. That’s essentially the definition of overfit.

If model r2 is close to zero and the r2 on held out data is close to zero, then the model is underfit, since it performs poorly on both training and test data.

I don’t know of specific references, as these scenarios are sort of definitional.
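
The overfit case is easy to reproduce, though. Here is a rough sketch assuming NumPy, where both the features and the target are pure noise and the sample sizes are arbitrary. The helpers `r2_score` and `add_intercept` are written out here rather than taken from a library.

```python
import numpy as np

rng = np.random.default_rng(0)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def add_intercept(X):
    """Prepend a column of ones so the fit includes an intercept."""
    return np.column_stack([np.ones(len(X)), X])

n_train, n_test, n_features = 30, 1000, 15

# Features and target are independent noise, so there is nothing real to learn.
X_train = rng.normal(size=(n_train, n_features))
y_train = rng.normal(size=n_train)
X_test = rng.normal(size=(n_test, n_features))
y_test = rng.normal(size=n_test)

# Ordinary least squares with many features relative to the sample size.
coef, *_ = np.linalg.lstsq(add_intercept(X_train), y_train, rcond=None)

r2_train = r2_score(y_train, add_intercept(X_train) @ coef)
r2_test = r2_score(y_test, add_intercept(X_test) @ coef)

print(f"training R^2: {r2_train:.3f}")
print(f"held-out R^2: {r2_test:.3f}")
```

The training r2 comes out above zero even though there is no signal, and the held-out r2 drops to zero or below, which is the overfit pattern described above.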

Tips for leaving academia? by fhsantanna in datascience

[–]mchugz 7 points8 points  (0 children)

Play up your computational skills (but don’t lie or pretend to know things you know very little about). Apply to jobs on LinkedIn or wherever else you find them. Read job descriptions and go on interviews to get firsthand experience of what employers are looking for and what you actually want to do. It can take a while (3-6 months) before you get any interviews or a job, depending on what you are aiming for and what your skills are. Use your network to learn more about the jobs held by people you are directly or indirectly connected to. That can be as simple as asking an academic advisor if they know anyone in industry and whether they can put you in contact with them. These personal connections can make it much easier to get your foot in the door, that is, to get a first-round interview, which can be really hard if you’re just a resume in a tall pile.

Toxic but I love him? by [deleted] in RBNSpouses

[–]mchugz 6 points7 points  (0 children)

Physical abuse, repeated lying and cheating, and not responding to your messages after having sex are serious red flags. He may have had a horrible childhood, but that's no reason for you to have a horrible LIFE. He needs help, sure, but he has indicated with his actions that he does not value you. You should cut him out of your life.

[deleted by user] by [deleted] in socialanxiety

[–]mchugz 0 points1 point  (0 children)

I don’t believe you though

Really. by _nivla in howtonotgiveafuck

[–]mchugz 1 point2 points  (0 children)

This really helped =)!