Language of Hypothesis Testing/One Sample t-tests
2024-10-22
Comments and Questions about HW 5
A few minutes for R Questions 🪄
Review of the Concept of Testing a Hypothesis using Confidence Interval
Language of Hypothesis Testing
Goals and Framework for Testing Hypotheses
What we can say and what we can’t
One Sided vs. Two Sided Hypothesis
In this course we will use R and RStudio to understand statistical concepts.
You will access R and RStudio through Posit Cloud.
I will post R/RStudio files on Posit Cloud that you can access in provided links.
I will also provide demo videos that show how to access files and complete exercises.
NOTE: The free Posit Cloud account is limited to 25 hours per month.
I demo how to download completed work so that you can use this allotment efficiently.
For those who want to go further with R/RStudio:
In Lecture 16, we looked at a survey about Electric Vehicles (EVs).
1025 people were surveyed
318 people said they were extremely, very, or somewhat likely to buy an EV.
Use the prop.test command to estimate the 90% confidence interval for the proportion of US adults who are likely to buy an EV soon.
What is the lower bound of this confidence interval?
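A minimal sketch of the call this question asks for, using the two counts from the survey. Note that prop.test applies a continuity correction by default; leaving that default in place is an assumption here, since the slides do not say which setting was used. The object name ev_ci_90 is just for illustration.

```r
# Counts from the EV survey described above
likely_to_buy <- 318
surveyed <- 1025

# 90% confidence interval for the proportion of likely EV buyers
# (continuity correction left at its default, which is an assumption)
ev_ci_90 <- prop.test(likely_to_buy, surveyed, conf.level = 0.90)

ev_ci_90$estimate      # sample proportion, 318 / 1025
ev_ci_90$conf.int      # lower and upper bounds of the 90% interval
ev_ci_90$conf.int[1]   # the lower bound asked for in the question
```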
In the last lecture we talked about what a hypothesis is and how we can test it using a confidence interval.
For example, a green energy skeptic says that less than a quarter of Adults are likely to buy an EV.
We can use a 95% confidence interval to test his claim.
If our 95% confidence interval is fully above 25%, then we are 95% confident that the true proportion is in our interval and above 25%, which contradicts his claim.
What do we conclude based on the 95% confidence interval of these proportion data?
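One way to check the skeptic's claim in R is to compare the lower bound of the 95% interval with 0.25. This is a sketch using the same survey counts; the names claimed_ceiling and interval_above_claim are made up for this example.

```r
# 95% confidence interval for the same EV survey counts
ev_ci_95 <- prop.test(318, 1025, conf.level = 0.95)

# The skeptic claims the true proportion is below one quarter
claimed_ceiling <- 0.25

ev_ci_95$conf.int

# TRUE only if the entire interval lies above 25%
interval_above_claim <- ev_ci_95$conf.int[1] > claimed_ceiling
interval_above_claim
```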
Testing hypotheses requires TWO Hypotheses
The two hypotheses are COMPLEMENTS and cover all possible values.
In other words, EXACTLY one of the two hypotheses is true
There is NO WAY both hypotheses can be true
There is NO WAY both hypotheses can be false
For example, let’s examine the heights of human male characters in the Star Wars franchise.
I hypothesize that these characters, on average, are significantly taller than human males in the United States.
Specifying TWO Hypotheses
The NULL Hypothesis, \(H_{0}\), is the default. This is what we try to DISPROVE
The ALTERNATIVE Hypothesis, \(H_{A}\), is what we hope to prove.
For our Star Wars data, we hope to prove that the average height of Star Wars male humans is significantly greater than the population mean height of males in the United States.
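\(\mu_{0} = 176\): Population mean of male heights in the United States (in cm)
\(\mu_{SW}\) is the true mean height of human male characters in Star Wars
\(\overline{X}_{SW}\) is the sample mean of heights for Star Wars characters, which we use to estimate \(\mu_{SW}\)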
\(H_{0}:\) The average height of Star Wars males is less than or equal to the average height of all US males.
\(H_{A}:\) The average height of Star Wars males is greater than the average height of all US males.
Here are the same hypotheses written in formal notation:
\(H_{0}: \mu_{SW} \leq \mu_{0}\)
\(H_{A}: \mu_{SW} \gt \mu_{0}\)
Notice that we specify what we are trying to prove as the ALTERNATIVE
We can NEVER prove that the null hypothesis, \(H_{0}\), is true.
We assume the null hypothesis \(H_{0}\) is true, and test if our data contradict this assumption.
The null hypothesis is ALWAYS specified to include an equality: \(\leq\), \(\geq\), or \(=\)
The alternative hypothesis is ALWAYS specified as a strict inequality: \(\lt\), \(\gt\), or \(\neq\)
\(H_{0}: \mu_{SW} \leq 176\)
\(H_{A}: \mu_{SW} \gt 176\)
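library(readr)  # provides read_csv()
sw_male_heights <- read_csv("data/StarWars_Human_Male_Heights.csv", show_col_types = FALSE)
# One-sided, one-sample t-test of H0: mu <= 176 vs. HA: mu > 176
t.test(sw_male_heights$height, mu = 176, alternative = "greater")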
One Sample t-test
data: sw_male_heights$height
t = 3.7177, df = 22, p-value = 0.0005989
alternative hypothesis: true mean is greater than 176
95 percent confidence interval:
179.4159 Inf
sample estimates:
mean of x
182.3478
This is a ONE-SIDED test.
We reject the null hypothesis if our sample mean is significantly GREATER than 176.
The p-value indicates the probability of seeing sample data at least as extreme as ours if the null hypothesis is true.
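To connect this definition to the t.test output, the one-sided p-value can be recovered as the upper-tail area of a t distribution. This is a sketch that plugs in the rounded t statistic and degrees of freedom from the output, so the result matches the reported p-value only up to rounding.

```r
# Values copied from the t.test output (t is rounded)
t_stat <- 3.7177
df <- 22          # degrees of freedom reported by t.test

# Upper-tail area: P(T >= t_stat) when the null hypothesis is true.
# This is the p-value for alternative = "greater".
p_value <- pt(t_stat, df = df, lower.tail = FALSE)
p_value
```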
Interpreting the p-value from the t.test output
One Sample t-test
data: sw_male_heights$height
t = 3.7177, df = 22, p-value = 0.0005989
alternative hypothesis: true mean is greater than 176
95 percent confidence interval:
179.4159 Inf
sample estimates:
mean of x
182.3478
If the mean height of males in the Star Wars world is 176 or less, then the chance of seeing a sample mean at least as large as ours is at most about 0.0006 (the p-value).
The Confidence Interval ONLY shows the Lower Bound because this is a ONE-SIDED TEST
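The lower bound in the output can be reproduced by hand. This sketch recovers the standard error from the reported (rounded) t statistic, then subtracts a one-sided critical value from the sample mean; the variable names are illustrative, and the result agrees with the output only up to rounding.

```r
# Values copied from the t.test output
x_bar <- 182.3478   # sample mean
mu_0 <- 176         # null value
t_stat <- 3.7177    # reported t statistic
df <- 22            # degrees of freedom

# t = (x_bar - mu_0) / se, so the standard error is:
se <- (x_bar - mu_0) / t_stat

# One-sided 95% interval: all 5% of the error goes in one tail,
# so the critical value is the 0.95 quantile (not 0.975)
t_crit <- qt(0.95, df = df)

lower_bound <- x_bar - t_crit * se
c(lower = lower_bound, upper = Inf)
```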
Based on this P-value what do we conclude?
In the Star Wars Example, the P-value is very small so the decision is clear-cut
In many cases we have to set a cutoff, also called an \(\alpha\) level
For hypothesis tests:
\(\alpha\) is the type 1 error rate, the probability of rejecting \(H_{0}\) when it is actually true.
We (the analysts) specify the \(\alpha\) cutoff.
Analysts most commonly set \(\alpha\) at 0.05 for hypothesis tests, but SOMETIMES 0.01 or 0.10 is used.
Even with a cutoff, we should interpret the p-value along a spectrum
A p-value of 0.049 is almost identical to a p-value of 0.05001.
It is wise to set an objective cutoff BEFORE we analyze the data.
In a standard situation when we use \(\alpha=0.05\), I think of the evidence against the null hypothesis along this spectrum:
\(\alpha\) is the probability that we falsely reject \(H_{0}\) when it is TRUE.
In some disciplines, you want to minimize the chance of making a mistake.
In other disciplines, you might be willing to take a riskier more exploratory approach to testing a hypothesis.
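A small simulation can make the meaning of \(\alpha\) concrete. The sketch below repeatedly draws samples from a population where the null hypothesis is exactly true and counts how often the test rejects anyway. The sample size, standard deviation, and number of simulations are arbitrary choices for illustration, not values from the course data.

```r
set.seed(17)

alpha <- 0.05
n_sims <- 5000

# Each simulated sample comes from a population whose true mean IS 176,
# so every rejection is a type 1 error.
p_values <- replicate(n_sims, {
  x <- rnorm(23, mean = 176, sd = 8)   # hypothetical n and sd
  t.test(x, mu = 176, alternative = "greater")$p.value
})

# Share of simulations that falsely reject the null hypothesis
type1_rate <- mean(p_values < alpha)
type1_rate
```

The share of false rejections should land close to the chosen \(\alpha\); rerunning with alpha set to 0.01 or 0.10 shows the trade-off between caution and willingness to reject.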
We have data from two Coca-cola plants.
Each plant is required to fill each can with 12 ounces of soda.
If, on average, they are underfilling OR overfilling the cans, the plant will have to shut down and recalibrate the machinery.
The default is that each plant is working fine. That’s our null hypothesis, \(H_{0}\).
How do we state the null and alternative hypotheses we are testing based on the question we want to answer?
A. \(H_{0}: \mu \leq 12\) vs. \(H_{A}: \mu \gt 12\)
B. \(H_{0}: \mu = 12\) vs. \(H_{A}: \mu \neq 12\)
C. \(H_{0}: \mu \geq 12\) vs. \(H_{A}: \mu \lt 12\)
NOTE: Translating a question into testable hypotheses is often challenging and takes some practice.
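Each of the three hypothesis pairs above corresponds to a different value of the alternative argument in t.test. The sketch below runs all three on simulated fill volumes; the vector fills is made-up data standing in for a plant's sample, not the course data.

```r
set.seed(12)

# Hypothetical fill volumes in ounces (simulated, not the course data)
fills <- rnorm(30, mean = 12, sd = 0.1)

# H_A: mu > 12   (H_0: mu <= 12)
upper <- t.test(fills, mu = 12, alternative = "greater")

# H_A: mu != 12  (H_0: mu = 12)
two_sided <- t.test(fills, mu = 12, alternative = "two.sided")

# H_A: mu < 12   (H_0: mu >= 12)
lower <- t.test(fills, mu = 12, alternative = "less")

c(greater = upper$p.value,
  two.sided = two_sided$p.value,
  less = lower$p.value)
```

For the same data, the two one-sided p-values add up to 1, and the two-sided p-value is twice the smaller of them.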
Run the t.test command for a two-tailed test with \(\alpha = 0.05\) (default options) for both Plant 1 and Plant 2.
Which plant would need to shut down if we set \(\alpha = 0.05\)?
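The plant data files are not shown in these notes, so the sketch below uses simulated samples as stand-ins. The vectors plant1 and plant2, their sizes, and their means are all hypothetical; with the real data, replace them with the fill-volume columns from the course files.

```r
set.seed(2024)

# Hypothetical stand-ins for the two plants' samples (ounces per can)
plant1 <- rnorm(40, mean = 12.00, sd = 0.15)
plant2 <- rnorm(40, mean = 11.90, sd = 0.15)

# Two-sided test of H0: mu = 12.
# alternative = "two.sided" and conf.level = 0.95 are the defaults.
plant1_test <- t.test(plant1, mu = 12)
plant2_test <- t.test(plant2, mu = 12)

alpha <- 0.05

# TRUE means we reject H0 for that plant at this alpha
c(plant1 = plant1_test$p.value, plant2 = plant2_test$p.value) < alpha
```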
How are hypothesis tests and confidence intervals related?
If the same \(\alpha\) is used for a two-sided hypothesis test and a confidence interval, the results will agree.
If the P-value \(< \alpha\), we reject \(H_{0}\) and conclude that \(\mu \neq \mu_{0}\)
The \((1 - \alpha) \times 100\%\) Confidence Interval does not include \(\mu_{0}\)
If the P-value \(\geq \alpha\), we DON’T reject \(H_{0}\) and conclude that there is not enough evidence to contradict \(\mu = \mu_{0}\)
The \((1 - \alpha) \times 100\%\) Confidence Interval includes \(\mu_{0}\)
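The agreement between the two approaches can be checked directly. This sketch uses simulated fill volumes (the vector fills is hypothetical) and compares the p-value decision with whether the matching confidence interval excludes \(\mu_{0}\).

```r
set.seed(99)

# Hypothetical fill volumes in ounces (simulated, not the course data)
fills <- rnorm(35, mean = 12.05, sd = 0.12)

mu_0 <- 12
alpha <- 0.05

# Two-sided test, with the confidence level matched to alpha
res <- t.test(fills, mu = mu_0, conf.level = 1 - alpha)

# Decision based on the p-value
reject_by_p <- res$p.value < alpha

# Decision based on the confidence interval
ci_excludes_mu0 <- mu_0 < res$conf.int[1] || mu_0 > res$conf.int[2]

# Both entries are always the same: both TRUE or both FALSE
c(reject_by_p = reject_by_p, ci_excludes_mu0 = ci_excludes_mu0)
```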
Shutting down a plant is very expensive, and the owner feels these tests should have been done using \(\alpha = 0.01\).
Would either plant have to shut down if \(\alpha\) was set to 0.01 for this question?
NOTE that the P-value does not change but we change the Confidence Level to match our new \(\alpha = 0.01\).
NOTE: It is unethical to change \(\alpha\) AFTER looking at the data, but we will do it here to better understand the effect of these choices.
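In R, switching to \(\alpha = 0.01\) means changing only the conf.level argument. The sketch below uses simulated fill volumes (hypothetical data) to show that the p-value stays the same while the interval gets wider.

```r
set.seed(7)

# Hypothetical fill volumes in ounces (simulated, not the course data)
fills <- rnorm(35, mean = 12.05, sd = 0.12)

test_95 <- t.test(fills, mu = 12, conf.level = 0.95)  # matches alpha = 0.05
test_99 <- t.test(fills, mu = 12, conf.level = 0.99)  # matches alpha = 0.01

# The p-value depends only on the data and the hypotheses
c(p_95 = test_95$p.value, p_99 = test_99$p.value)

# The 99% interval is wider than the 95% interval
test_95$conf.int
test_99$conf.int
```

The decision can still change, because the same p-value is now compared with 0.01 instead of 0.05.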
The language of hypothesis testing takes a little time to get used to
The challenge is to interpret the question posed and translate that into testable hypotheses.
The hypotheses are set up so that the alternative hypothesis, \(H_{A}\), is what you are trying to prove.
The null hypothesis, \(H_{0}\), always includes an equality
The alternative hypothesis, \(H_{A}\), always includes a strict inequality
Today we used the t.test command to find our p-values and confidence intervals
Next week: vdist_t_prob to visualize the p-values, review of t calculation, and two sample tests.
To submit an Engagement Question or Comment about material from Lecture 17: Submit it by midnight today (day of lecture).