
Hypothesis Testing

Testing an assumption regarding a population

Null Hypothesis assumes no effect

Alternative Hypothesis suggests an effect

Types of Hypothesis Tests: Z Test

Used to determine whether the difference between a sample mean and an assumed population mean is statistically significant.

Requires more than 30 data points

Formula to determine Z score

z = (x - μ) / (σ / √n)

x = The sample mean being evaluated

μ = The population mean assumed by the null hypothesis

σ = Standard deviation

n = Sample size
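The formula translates directly into a small R helper. The sketch below is an illustration only: the function name `z_score`, its argument names, and the variable `z_crit` are assumptions chosen here, not part of any package.

```r
# Hypothetical helper (name and arguments are assumptions for illustration).
# x     : the sample mean being evaluated
# mu    : the mean assumed by the null hypothesis
# sigma : the standard deviation
# n     : the sample size
z_score <- function(x, mu, sigma, n) {
  (x - mu) / (sigma / sqrt(n))
}

# Two-sided critical value at the 0.05 significance level (about 1.96)
z_crit <- qnorm(1 - 0.05 / 2)
```

A z-score whose absolute value is larger than `z_crit` lands in the rejection region; anything smaller lies in the interval of acceptance.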

Example 1:

Let’s assume for our null hypothesis that the average height for men in the U.S. is 71 inches. After measuring 100 men, the measured average height is 70 inches, with a standard deviation of 2 inches. Using our z test:

(70 - 71) / (2 / √100) = -5

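This z-score of -5 falls far outside the interval of acceptance (between -1.96 and 1.96 at the 0.05 significance level), therefore the null hypothesis is rejected. Because the z-score is negative, the average height of men is likely smaller than 71 inches.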
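The arithmetic in Example 1 can be checked in R. This is a sketch: the variable names are chosen here for illustration, and the two-sided p-value from `pnorm` is an added check rather than something the example itself computes.

```r
# Values taken from Example 1 (variable names are illustrative)
x_bar <- 70    # measured average height (inches)
mu    <- 71    # average height assumed by the null hypothesis
sigma <- 2     # standard deviation
n     <- 100   # sample size

z <- (x_bar - mu) / (sigma / sqrt(n))
z                              # -5

# Two-sided p-value: probability of a z-score at least this extreme
p_value <- 2 * pnorm(-abs(z))
p_value < 0.05                 # TRUE, so the null hypothesis is rejected
```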

Example 1 Graph:

Code for graph:

    library(ggplot2)

    # Standard normal density, kept only inside the interval of acceptance
    dnormTwo = function(x) {
      twoside = dnorm(x)
      twoside[x <= -1.96 | x >= 1.96] = NA
      return(twoside)
    }

    # Full normal curve with the interval of acceptance shaded green
    g = ggplot(data.frame(x = c(-6, 6)), aes(x = x)) +
      stat_function(fun = dnorm) +
      stat_function(fun = dnormTwo, geom = "area", fill = "green", alpha = 0.3)

    # Critical values at +/-1.96 and the observed z-score at -5
    g +
      geom_vline(xintercept = 1.96) +
      geom_vline(xintercept = -1.96) +
      geom_vline(xintercept = -5, linetype = "dotted", color = "red") +
      geom_label(
        label = "Value inside of rejection region",
        x = -3.0, y = 0.2,
        label.padding = unit(0.05, "lines"),
        label.size = 0.05,
        color = "black", fill = "#69b3a2"
      )

Example 2:

Using the mtcars data set that ships with R, we’ll assume for our null hypothesis that the average mpg of its 32 cars (1973-74 models) is 20.1.

The measured mean is 20.09, and the standard deviation is 6.03.

(20.09 - 20.1) / (6.03 / √32) = -0.009

This z-score is within the interval of acceptance (between -1.96 and 1.96), therefore the null hypothesis is not rejected.
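The same check can be run against the data itself rather than rounded figures. A minimal sketch, assuming the sample standard deviation from `sd()` is an acceptable stand-in for σ; the variable names are illustrative.

```r
# mtcars ships with base R (datasets package)
mpg <- mtcars$mpg

mu    <- 20.1          # mean assumed by the null hypothesis
x_bar <- mean(mpg)     # measured mean
s     <- sd(mpg)       # sample standard deviation, used in place of sigma
n     <- length(mpg)   # sample size

z <- (x_bar - mu) / (s / sqrt(n))
abs(z) < qnorm(0.975)  # TRUE when z lies inside the interval of acceptance
```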

Example 2 Graph:

Example 3:

Again using the mtcars data set, we will assume the mean horsepower of the cars is 150. The standard deviation is 68.56, the measured mean is 146.7, and the sample size is 32.

(146.7 - 150) / (68.56 / √32) = -0.27

This z-score is within the interval of acceptance (between -1.96 and 1.96), therefore the null hypothesis is not rejected.
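Since the same steps recur in every example, they can be wrapped in one function and applied to the horsepower column. This is a sketch under stated assumptions: `z_test_mean` is a hypothetical helper written here (not a base R or package function), and it uses the sample standard deviation in place of σ.

```r
# Hypothetical helper: z statistic and two-sided decision for a numeric vector.
# The sample standard deviation stands in for sigma (an assumption).
z_test_mean <- function(values, mu, alpha = 0.05) {
  n <- length(values)
  z <- (mean(values) - mu) / (sd(values) / sqrt(n))
  critical <- qnorm(1 - alpha / 2)
  list(
    z = z,                     # the z-score
    critical = critical,       # two-sided critical value
    reject = abs(z) > critical # TRUE when z is in the rejection region
  )
}

result <- z_test_mean(mtcars$hp, mu = 150)
result$z       # z-score for the horsepower example
result$reject  # TRUE only if z lies in the rejection region
```

Recomputing from the raw column avoids transcription slips in the hand-entered mean and standard deviation.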

Example 3 Graph: