Instructions: Write a summary (less than two pages) using the bookdown format html_document2 of the key ideas in Foundations of Inference. Your summary should include a discussion/definition of the following: hypotheses (null and alternative), null distribution, test statistic, p-value, types of errors in hypothesis testing, confidence intervals, and bootstrapping. Properly labeled Figures, Tables, and Equations will be extra credit. Print out and turn in a hard copy of your html document with your source code stapled to the back of the html no later than 5pm October 26, 2017.

1 Hypothesis Testing

Hypothesis testing is the practice of assessing whether the variation between two sample distributions can be explained by random chance alone. Before we conclude that two distributions differ in a meaningful way, we must take sufficient precautions to ensure that the differences are not simply due to chance. The concern behind Type I error is that we do not want to reject the null hypothesis without warrant, so we exercise care by keeping the chance of such a rejection small. Traditionally the Type I error rate is set at 0.05 or 0.01, meaning there is only a 5-in-100 or 1-in-100 chance that the variation we are seeing is due to chance. This threshold is called the level of significance. There is no guarantee that 5 in 100 is rare enough, however, so significance levels need to be chosen carefully. For example, a factory where a six sigma quality control system has been implemented requires that errors never exceed the probability of being six standard deviations away from the mean (an incredibly rare event). The evidence against the null hypothesis is reported as a p-value, which is compared against the chosen significance level.

Sources:

https://www.stat.berkeley.edu/~hhuang/STAT141/Lecture-FDR.pdf

https://onlinecourses.science.psu.edu/statprogram/node/137

1.1 Null hypothesis:

The null hypothesis, \(H_0\), states that there is no statistically significant relationship or difference between the two variables under study. It is the hypothesis that the researcher is trying to disprove.

\[H_0: \mu_1 - \mu_2 = 0\]

where \(\mu_1\) and \(\mu_2\) are the two population means.

1.2 Alternative hypothesis:

The alternative hypothesis, \(H_A\), is the hypothesis that posits an effect contrary to the null hypothesis. It asserts that the sample observations are influenced by some non-random cause.

\[H_A: \mu_1 - \mu_2 \ne 0\]
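
As an illustration, these hypotheses can be tested in R with a two-sample \(t\) test. The following is a minimal sketch; the two groups are simulated purely for the example, with made-up means and standard deviations.

```r
# A minimal sketch: two-sample t test of H0: mu1 - mu2 = 0
# against the two-sided alternative. Data are simulated for illustration.
set.seed(2)
group1 <- rnorm(25, mean = 5.0, sd = 2)   # hypothetical sample 1
group2 <- rnorm(25, mean = 5.8, sd = 2)   # hypothetical sample 2
t.test(group1, group2, alternative = "two.sided")
```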

2 Null distribution:

The null distribution is the probability distribution of the test statistic when the null hypothesis is true. Generating the distribution of the statistic under the null hypothesis shows which values are typical when \(H_0\) holds, and therefore whether the observed data are inconsistent with the null hypothesis.
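
A null distribution can be approximated by simulation. The following is a minimal sketch, assuming a hypothetical null population that is normal with mean 5 and standard deviation 2 and samples of size 30; it repeatedly computes the \(z\) statistic defined in the next section.

```r
# A minimal sketch: simulating the null distribution of a z statistic,
# assuming a hypothetical N(5, 2) null population and samples of size 30.
set.seed(42)
null_stats <- replicate(10000, {
  x <- rnorm(30, mean = 5, sd = 2)   # draw a sample with H0 true
  (mean(x) - 5) / (2 / sqrt(30))     # standardize the sample mean
})
hist(null_stats, main = "Simulated null distribution", xlab = "z")
```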

3 Test statistic:

A test statistic, \(z\), is a standardized value calculated from sample data during a hypothesis test. It is used to decide whether to reject the null hypothesis: the test statistic measures how far the data fall from what is expected under the null hypothesis.

Once the test statistic is evaluated on the observed sample data \(\mathbf{x}\), the result is called the observed test statistic, \(T_{obs}\).

A \(z\) test statistic is computed by standardizing a normal random variable (here \(\bar x\)).

\[z = \frac{\bar x - \mu_0}{ \frac{\sigma}{\sqrt{n}}}\]

where \(\mu_0\) is the hypothesized expected value of \(\bar x\), \(\sigma\) is the population standard deviation and \(n\) is the sample size. A \(z\) test has limited practical use (because it assumes that the population standard deviation is known) but it demonstrates the idea of a test statistic well enough.

Notice that if \(\sigma\) is replaced by its estimate \(s\), the denominator becomes the standard error of \(\bar x\), and the resulting statistic is the one-sample \(t\) statistic.
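
The following is a minimal sketch of computing \(z\) from this formula in R; the sample, \(\mu_0\), and \(\sigma\) are all assumed values chosen for illustration.

```r
# A minimal sketch: computing the z statistic from the formula above.
# The sample, mu0, and sigma are all assumed values for illustration.
set.seed(1)
x <- rnorm(30, mean = 5.6, sd = 2)                # hypothetical sample
mu0 <- 5                                          # hypothesized mean under H0
sigma <- 2                                        # assumed known population sd
z <- (mean(x) - mu0) / (sigma / sqrt(length(x)))
z
```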

4 P-value:

The p-value is the probability, computed assuming the null hypothesis is true, of observing data as extreme as or more extreme than the data we actually obtained.
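
For a two-sided \(z\) test, the p-value is the area in both tails of the standard normal distribution beyond the observed statistic. A minimal sketch in R, with the observed value assumed for illustration:

```r
# A minimal sketch: two-sided p-value for an observed z statistic.
z_obs <- 1.8                        # assumed observed value, for illustration
p_value <- 2 * pnorm(-abs(z_obs))   # area in both tails of N(0, 1)
p_value                             # approximately 0.072
```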

5 Error types in hypothesis testing:

5.1 Type I

Type I error, also known as a “false positive”, is the error of rejecting a null hypothesis when it is actually true. In other words, it is the error of accepting the alternative hypothesis (the real hypothesis of interest) when the results can in fact be attributed to chance. Plainly speaking, it occurs when we observe a difference when in truth there is none (or, more precisely, no statistically significant difference). The probability of making a Type I error in a test with rejection region \(R\) is \(P(R \mid H_0 \text{ is true})\).

5.2 Type II

Type II error, also known as a “false negative”, is the error of failing to reject the null hypothesis when the alternative hypothesis is the true state of nature. In other words, it is the error of failing to detect a real effect, which typically happens when the test lacks adequate power. Plainly speaking, it occurs when we fail to observe a difference when in truth there is one. The probability of making a Type II error in a test with rejection region \(R\) is \(1 - P(R \mid H_A \text{ is true})\), and the power of the test is \(P(R \mid H_A \text{ is true})\).
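
The meaning of the Type I error rate can be checked by simulation: if data are repeatedly generated with the null hypothesis true and tested at level \(\alpha = 0.05\), the long-run rejection rate should be close to 0.05. A minimal sketch, with the population parameters assumed for illustration:

```r
# A minimal sketch: estimating the Type I error rate by simulation.
# Every sample is drawn with H0 true, so every rejection is a false positive.
set.seed(7)
alpha <- 0.05
reject <- replicate(10000, {
  x <- rnorm(30, mean = 5, sd = 2)     # H0 is true by construction
  z <- (mean(x) - 5) / (2 / sqrt(30))
  abs(z) > qnorm(1 - alpha / 2)        # two-sided rejection rule
})
mean(reject)   # long-run rejection rate, close to alpha = 0.05
```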

6 Confidence intervals:

A confidence interval quantifies the uncertainty associated with a sample estimate (such as a mean or proportion) of a true population parameter.

The confidence interval is the sample mean or proportion plus or minus the margin of error, the value used to calculate the upper and lower limits around the sample statistic.
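
For a population mean, for example, the interval takes the form

\[\bar x \pm z^* \frac{s}{\sqrt{n}}\]

where \(z^*\) is the critical value corresponding to the chosen confidence level and \(s/\sqrt{n}\) is the standard error of \(\bar x\).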

Before calculating a confidence interval from a sample mean or proportion, choose a confidence level, typically 90%, 95%, or 99%. The confidence level describes the reliability of the sampling method: if the same sampling method were used repeatedly, 95% of the resulting confidence intervals would contain the true population value, and 5% would not.
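
In R, t.test() reports such an interval directly. A minimal sketch, with the data simulated for illustration:

```r
# A minimal sketch: a 95% confidence interval for a mean.
# The sample is simulated; in practice x would be your observed data.
set.seed(3)
x <- rnorm(30, mean = 5, sd = 2)
t.test(x, conf.level = 0.95)$conf.int
```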

7 Bootstrapping:

The bootstrap is a statistical procedure for estimating the sampling distribution of a statistic by resampling the observed data with replacement. Because the bootstrap treats the sample as a stand-in for the population, we must make sure that the sample is representative of the population.
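
A minimal sketch of a percentile bootstrap for the sample mean, using base R only; the original sample is simulated for illustration.

```r
# A minimal sketch: percentile bootstrap for the sample mean.
# Each resample is drawn from the observed data with replacement.
set.seed(11)
x <- rnorm(30, mean = 5, sd = 2)        # hypothetical original sample
boot_means <- replicate(10000, mean(sample(x, replace = TRUE)))
quantile(boot_means, c(0.025, 0.975))   # 95% percentile bootstrap interval
```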