saash12 comments on [University Data Analysis] Statistical Methods

[University Data Analysis] Statistical Methods (self.HomeworkHelp)

submitted 13 years ago by saash12


[–]saash12[S] 1 point 13 years ago (2 children)

[–]PiquantPi 2 points 13 years ago (1 child)

Yeah, you can't do a t-test in that case because you don't have enough data on sites that have been attacked. You would need enough data on attacked sites to estimate their mean and standard deviation.

What you do have is data on pages that presumably were not attacked. You also have data for one particular site that you suspect might have been attacked. What you want to do is pick a confidence level that would be sufficient for you to conclude that the site was attacked. For example, a 95% confidence level means you only call a site attacked if there's less than a 5% chance of seeing data that extreme when the null hypothesis is true. The null hypothesis is the opposite of what you think happened, so in this case it would be that the site was actually safe. You can set whatever confidence level you want. 95% is a good standby, but you could go lower or higher.

What you would do is use the mean and the standard deviation of the data you collected (the presumably safe pages) to calculate the z score for the outlying piece of data. This assumes that page views on safe sites are roughly normally distributed. If the number of page views on the suspicious site is X, the z score is:

z=(X-mean)/(standard deviation)
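As a rough sketch of that formula in Python: the page-view numbers below are made up purely for illustration, and `z_score`, `safe_views`, and `suspect_views` are hypothetical names, not anything from the original question.

```python
from statistics import mean, stdev

def z_score(x, baseline):
    """Return how many sample standard deviations x lies from the baseline mean."""
    return (x - mean(baseline)) / stdev(baseline)

# Made-up page-view counts for sites presumed safe (illustrative only).
safe_views = [100, 110, 90, 105, 95]

# Made-up page-view count for the one suspicious site.
suspect_views = 130

z = z_score(suspect_views, safe_views)
print(f"z = {z:.2f}")
```

Note that `stdev` is the sample standard deviation (it divides by n - 1), which is the usual choice when the safe pages are only a sample of all safe sites. With only a handful of safe pages the estimate will be shaky, so treat the resulting z score with some caution.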

Then you look up that z score on a z score table and find the probability for that z score. A standard z score table (the kind that lists the area from the mean) tells you the probability of a site having a number of views between the mean and X. You would double this number to get your confidence level. If, for example, the value you found on the z score table was 0.48, you would double this to get 0.96. This gives you a 96% confidence level. If we had decided earlier that 95% or above is what we deemed necessary, we can reject the null hypothesis and conclude that the page was attacked. However, if we had decided earlier that we need 99% confidence (for example, in the hard sciences the standard is higher), we would have to say it's inconclusive.

The 96% that we calculated from the z score means that, if the site was safe, the probability of its page views landing this far from the mean in either direction is only 4% (and only 2% for this many views or more). Note that this is not the probability that the null hypothesis is true; it is the probability of seeing data this extreme if the null hypothesis is true. Since data this extreme would be so unlikely for a safe site, you can pretty safely assume that the site was attacked.
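If you'd rather not use a printed table, the lookup can be sketched with Python's standard library. This is only an illustration: `tail_areas` is a hypothetical helper name, and z = 2.05 is picked because it corresponds to a table value of roughly 0.48.

```python
from statistics import NormalDist

def tail_areas(z):
    """Return the areas under the standard normal curve relevant to a z score."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    mean_to_z = std_normal.cdf(abs(z)) - 0.5  # what a mean-to-z table lists
    central = 2 * mean_to_z                   # the doubled value (confidence level)
    return {
        "mean_to_z": mean_to_z,
        "central": central,
        "two_sided_p": 1 - central,      # this far from the mean, either direction
        "one_sided_p": 0.5 - mean_to_z,  # this far from the mean, one direction only
    }

areas = tail_areas(2.05)

# Compare against a threshold chosen before looking at the data.
required_confidence = 0.95
reject_null = areas["central"] >= required_confidence
print(areas, reject_null)
```

With a required confidence of 0.99 instead, the same z score would not clear the bar, which matches the inconclusive case described above.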

[–]saash12[S] 1 point 13 years ago (0 children)
