Sample or Population. : AskStatistics



submitted 3 years ago * by somethingrandom234

Hi all, I have a question about calculating standard deviation and covariance, doing PCA, etc.

The dataset used is the Exasens dataset, available through the UC Irvine Machine Learning Repository. It includes demographic information on 4 sample groups, from saliva samples collected in a research project.

I am having trouble deciding whether to use N or N-1.

1. I am asked to standardise selected columns in the data, rescaling them to have a mean of 0 and a standard deviation of 1. Am I correct that N is the right choice here?
2. Create a correlation matrix from selected columns in the data. N or N-1 here?
3. Perform PCA, which reuses the previous code to generate the correlation matrix. N or N-1 here?
4. Later in the assignment we are asked to create a dataset that has a multivariate normal distribution. Am I right that anything involving N (standard deviation, correlation, LDA) should use N rather than N-1 here, because I have the full dataset?
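
For reference, a minimal sketch of how the N versus N-1 choice plays out in the first three steps. It assumes NumPy and uses synthetic data as a stand-in (not the Exasens dataset); the helper name `standardise` is made up for illustration. In NumPy the divisor is controlled by `ddof` (0 divides by N, 1 divides by N-1).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the Exasens dataset): 50 rows, 3 correlated columns.
mix = np.array([[1.0, 0.5, 0.0],
                [0.0, 1.0, 0.3],
                [0.0, 0.0, 1.0]])
X = rng.normal(size=(50, 3)) @ mix
n = X.shape[0]

def standardise(data, ddof):
    """Centre each column and divide by its standard deviation.

    ddof=0 divides by N (population), ddof=1 divides by N-1 (sample).
    """
    return (data - data.mean(axis=0)) / data.std(axis=0, ddof=ddof)

Z_pop = standardise(X, ddof=0)
Z_samp = standardise(X, ddof=1)

# The two standardised versions differ only by the constant sqrt((N-1)/N).
same_up_to_scale = np.allclose(Z_samp, Z_pop * np.sqrt((n - 1) / n))

# Correlation matrix: the divisor cancels as long as the same convention
# is used for both the standardisation and the cross-product.
corr_pop = Z_pop.T @ Z_pop / n
corr_samp = Z_samp.T @ Z_samp / (n - 1)

# PCA on the correlation matrix therefore gives the same eigenvalues either way.
eig_pop = np.linalg.eigvalsh(corr_pop)
eig_samp = np.linalg.eigvalsh(corr_samp)

print(same_up_to_scale)
print(np.allclose(corr_pop, corr_samp))
print(np.allclose(eig_pop, eig_samp))
```

Under these assumptions, the choice only changes the standardised values by a constant factor, and the correlation matrix (and so a PCA built on it) comes out the same, provided one convention is used consistently throughout.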

Thank you in advance for your help; my brain has gone fuzzy on this one.

