\(H_0\) is the null hypothesis (what we are testing)
\(H_a\) is the alternative hypothesis (what is contrary to the null hypothesis)
Example: Test whether the average exam score is 78 (two-sided test)
\(H_0: \mu = 78\)
\(H_a: \mu \neq 78\)
3/27/2020
Test if the average speed of the cars from the dataset “cars” is less than 17 mph.
\(H_0: \mu = 17\)
\(H_a: \mu < 17\)
## 
##  One Sample t-test
## 
## data:  cars$speed
## t = -2.1397, df = 49, p-value = 0.01869
## alternative hypothesis: true mean is less than 17
## 95 percent confidence interval:
##     -Inf 16.6537
## sample estimates:
## mean of x 
##      15.4
data("cars")
h <- t.test(cars$speed, mu = 17, alternative = "less")
h
In the previous example, the p-value was 0.01869, which is less than our significance level \(\alpha = 0.05\), so we reject the null hypothesis \(H_0: \mu = 17\) in favor of \(H_a: \mu < 17\). We conclude that the true average speed of the cars is less than 17 mph.
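The t statistic and p-value reported by `t.test()` can be reproduced by hand, which makes the decision rule easier to follow. The sketch below is illustrative only; the variable names (`x`, `mu0`, `t_stat`, `p_value`) are assumptions and do not come from the original slides.

```r
# Recompute the one-sample, left-tailed t-test by hand (illustrative sketch).
data("cars")
x   <- cars$speed
mu0 <- 17                 # hypothesized mean under H0
n   <- length(x)

# t = (sample mean - hypothesized mean) / standard error of the mean
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Left-tailed p-value: P(T <= t_stat) for a t distribution with n - 1 df
p_value <- pt(t_stat, df = n - 1)

t_stat
p_value
```

Both values should agree with the `t` and `p-value` fields printed by `t.test()`.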
Before conducting the hypothesis test, we should have checked whether the data are approximately normally distributed. A normal Q-Q plot lets us check this visually.
library(ggpubr)  # provides ggqqplot()
data("cars")
ggqqplot(cars$speed, ylab = "Speed of Cars")
The points lie close to the reference line, so the data are approximately normal and the hypothesis test can proceed.
We can also visualize the results of the t-test we calculated with cars data.
library(gginference)  # provides ggttest()
data("cars")
ggttest(t.test(cars$speed, mu = 17, alternative = "less"))
The plot shows the density curve of the t distribution with our test statistic, t = -2.1397, inside the rejection region in the left tail (below roughly -1.68 for 49 degrees of freedom), which once again shows that we can reject \(H_0: \mu = 17\).
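The edge of the rejection region can also be computed directly rather than read off the plot. The sketch below assumes the same left-tailed test at \(\alpha = 0.05\) and compares the t cutoff with the standard normal cutoff of about -1.645; the names `alpha`, `deg_free`, `t_crit`, `z_crit`, and `reject` are illustrative.

```r
# Critical values for a left-tailed test at the 5% level (illustrative sketch).
data("cars")
alpha    <- 0.05
deg_free <- length(cars$speed) - 1   # 49 degrees of freedom

t_crit <- qt(alpha, df = deg_free)   # cutoff from the t distribution
z_crit <- qnorm(alpha)               # cutoff from the standard normal

# Reject H0 when the observed t statistic falls below the t cutoff
h <- t.test(cars$speed, mu = 17, alternative = "less")
reject <- unname(h$statistic) < t_crit

c(t_crit = t_crit, z_crit = z_crit)
reject
```

The t cutoff lies slightly further into the tail than the normal cutoff because the t distribution has heavier tails; the observed statistic falls below both.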
If we wanted to test the correlation between speed and distance of the cars we could visualize that in plotly to get an idea.
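The plotting code is not shown here, so the following is only one plausible way to draw such a scatter plot, assuming the `plotly` package is installed. The object name `fig` and the axis titles are illustrative choices.

```r
library(plotly)

data("cars")

# Interactive scatter plot of stopping distance against speed
fig <- plot_ly(
  data = cars,
  x = ~speed,
  y = ~dist,
  type = "scatter",
  mode = "markers"
)

# Label the axes with the units documented for the cars dataset
fig <- layout(
  fig,
  xaxis = list(title = "Speed (mph)"),
  yaxis = list(title = "Stopping distance (ft)")
)

fig
```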
The plotly scatter plot suggests some correlation, but to assess it formally we can use a hypothesis test of whether the true correlation between speed and distance differs from zero:
\(H_0: \rho = 0\) and \(H_a: \rho \neq 0\), where \(\rho\) is the true (population) correlation
Using Pearson’s product-moment correlation test in R we get:
## 
##  Pearson's product-moment correlation
## 
## data:  cars$speed and cars$dist
## t = 9.464, df = 48, p-value = 1.49e-12
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.6816422 0.8862036
## sample estimates:
##       cor 
## 0.8068949
cor.test(cars$speed, cars$dist, method = "pearson")
From the test, the p-value (1.49e-12) is far below \(\alpha = 0.05\), so we reject \(H_0\) and conclude that speed and distance are significantly correlated, with a sample correlation of about 0.807.
The sample correlation coefficient \(r\) is computed from the sample means \(\bar{x}, \bar{y}\) and the sample standard deviations \(s_x, s_y\):
\(r = {\sum(x-\bar{x})(y-\bar{y})\over(n-1)s_xs_y}\)
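The formula can be checked against the `cars` data directly. The sketch below computes the correlation by hand and then converts it to the t statistic used by the Pearson test, \(t = r\sqrt{n-2}/\sqrt{1-r^2}\) with \(n-2\) degrees of freedom. The variable names (`r_manual`, `t_stat`, `p_value`) are illustrative and not part of the original slides.

```r
# Compute the sample correlation by hand and test it against zero (sketch).
data("cars")
x <- cars$speed
y <- cars$dist
n <- length(x)

# r = sum((x - xbar) * (y - ybar)) / ((n - 1) * s_x * s_y)
r_manual <- sum((x - mean(x)) * (y - mean(y))) / ((n - 1) * sd(x) * sd(y))

# Test statistic for a zero true correlation, with n - 2 degrees of freedom
t_stat <- r_manual * sqrt(n - 2) / sqrt(1 - r_manual^2)

# Two-sided p-value
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

r_manual
t_stat
p_value
```

The results should match the `cor`, `t`, and `p-value` fields reported by `cor.test()`.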