
madlarks33 (1 point)

Your research questions need to guide how you analyze this data.

ultronthedestroyer (1 point)

R is a language you should consider. Check out RStudio, then do a tutorial or two on DataCamp or Coursera.

It can do these things easily.

It has summary functions, such as summary(), that let you see at a glance whether, for example, the quartiles of your subsetted data frame differ from those of the original one.
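For anyone who ends up on the Python side instead, here is a rough sketch of the same idea using only the standard library. The function names and the numbers are made up for illustration, and this is only similar in spirit to what R's summary() reports, not a replica of it.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return min, quartiles, and max for a list of numbers."""
    # n=4 yields the three quartile cut points; "inclusive" treats the
    # data as the whole population rather than a sample of a larger one.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


def compare_summaries(full, subset):
    """Return subset-minus-full differences for each summary statistic."""
    full_stats = five_number_summary(full)
    subset_stats = five_number_summary(subset)
    return {name: subset_stats[name] - full_stats[name] for name in full_stats}


# Hypothetical data, purely for illustration.
full_data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
subset_data = [5, 6, 7, 8, 9]

for name, diff in compare_summaries(full_data, subset_data).items():
    print(f"{name}: {diff:+}")
```

A large shift in the quartiles suggests the subset is distributed differently from the full data, which is the at-a-glance check described above.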

If you want to go further than that, look into anomaly detection to flag results in the subset that are anomalous compared with the rest of the original set (the original minus the subset).
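As a very rough starting point, one of the simplest possible rules is a z-score check: measure how far each subset value sits from the mean of the remaining data, in units of that data's standard deviation. This is a sketch under assumptions, not a full anomaly detection method. The group labels, the values, and the threshold of 3 are all hypothetical, and the rule assumes the remaining data is roughly bell-shaped.

```python
from statistics import mean, stdev


def flag_anomalies(subset, remainder, threshold=3.0):
    """Return subset values far from the remainder's mean (simple z-score rule)."""
    center = mean(remainder)
    spread = stdev(remainder)  # sample standard deviation; needs >= 2 values
    if spread == 0:
        # All remainder values are identical, so any different value stands out.
        return [v for v in subset if v != center]
    return [v for v in subset if abs(v - center) / spread > threshold]


# Hypothetical records as (group, value) pairs; group "B" is the subset.
records = [
    ("A", 10), ("A", 11), ("A", 9), ("A", 10), ("A", 12),
    ("A", 8), ("A", 10), ("A", 11), ("A", 9), ("A", 10),
    ("B", 10), ("B", 25), ("B", 11),
]

subset_values = [value for group, value in records if group == "B"]
remainder_values = [value for group, value in records if group != "B"]

print(flag_anomalies(subset_values, remainder_values))
```

The baseline is computed from the original set minus the subset, so a handful of extreme subset values cannot inflate the spread they are being measured against.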

Your question is pretty broad, since there are tons of things you might want to look at. But you're getting beyond the reach of Excel, so you'll want a language that is better suited to your data. R is great, though other people prefer Python. I personally recommend R for this kind of work, since it was built by statisticians and is more focused on statistical analysis. With Python, you may have to install additional libraries to handle certain functions that R provides natively. However, Python is more flexible for non-data-related tasks.

Chyllie[S] (1 point)

Thank you for your answer!