MAS 261 - Lecture 17

Language of Hypothesis Testing/One Sample t-tests

Author

Penelope Pooler Eisenbies

Published

October 22, 2024

Housekeeping

  • Comments and Questions about HW 5

  • A few minutes for R Questions 🪄

  • Review of the Concept of Testing a Hypothesis using a Confidence Interval

  • Language of Hypothesis Testing

  • Goals and Framework for Testing Hypotheses

  • What we can say and what we can’t

  • One Sided vs. Two Sided Hypothesis

R and RStudio

  • In this course we will use R and RStudio to understand statistical concepts.

  • You will access R and RStudio through Posit Cloud.

  • I will post R/RStudio files on Posit Cloud that you can access in provided links.

  • I will also provide demo videos that show how to access files and complete exercises.

  • NOTE: The free Posit Cloud account is limited to 25 hours per month.

    • I demo how to download completed work so that you can use this allotment efficiently.

    • For those who want to go further with R/RStudio:

Lecture 17 In-class Exercises - Q1

In Lecture 16, we looked at a survey about Electric Vehicles (EVs).

  • 1025 people were surveyed

  • 318 people said they were extremely, very, or somewhat likely to buy an EV.


  • Use the prop.test command to estimate the 90% confidence interval for the proportion of US Adults who are likely to buy an EV soon.

  • What is the lower bound of this confidence interval?
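As a minimal sketch of the command this exercise asks for, the call below passes the two survey counts to `prop.test` and sets `conf.level = 0.90`. The object names (`n_surveyed`, `n_likely`, `ev_test`) are chosen here for illustration. Note that `prop.test` applies a continuity correction by default, so its interval can differ slightly from a hand calculation.

```r
# Counts reported in the survey
n_surveyed <- 1025
n_likely   <- 318

# 90% confidence interval for the proportion likely to buy an EV
ev_test <- prop.test(x = n_likely, n = n_surveyed, conf.level = 0.90)

ev_test$estimate     # sample proportion, 318 / 1025
ev_test$conf.int     # lower and upper bounds of the 90% interval
ev_test$conf.int[1]  # lower bound only
```

The lower bound is the first element of `conf.int`, which is the value the question asks about.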

Testing Hypotheses Informally

In the last lecture we talked about what a hypothesis is and how we can test it using a confidence interval.

For example, a green energy skeptic says that less than a quarter of Adults are likely to buy an EV.

  • We can use a 95% confidence interval to test his claim.

  • If our 95% confidence interval is fully above 25%, then we are 95% confident that the true proportion is in our interval and above 25%, which would contradict his claim.

What do we conclude based on the 95% confidence interval of these proportion data?
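One way to check the skeptic's claim in R is to compare the bounds of the 95% interval with 0.25. This is only a sketch of the informal approach described above; the names `ev_95`, `lower_95`, `upper_95`, and `claim` are made up for this example.

```r
# 95% confidence interval for the same survey counts (318 out of 1025)
ev_95 <- prop.test(x = 318, n = 1025, conf.level = 0.95)

lower_95 <- ev_95$conf.int[1]
upper_95 <- ev_95$conf.int[2]
claim    <- 0.25  # the skeptic says the true proportion is below this

c(lower = lower_95, upper = upper_95)

# Is the whole interval above the claimed value?
lower_95 > claim
```

If the last line returns `TRUE`, the entire interval sits above 25% and the data contradict the claim.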

Formal Language of Hypothesis Testing

  • Testing hypotheses requires TWO Hypotheses

  • The two hypotheses are COMPLEMENTS and cover all possible values.

    • In other words ONLY one or the other can be true

    • There is NO WAY both hypotheses can be true

    • There is NO WAY that both hypotheses are false

  • For example, let’s examine the heights of human male characters in the Star Wars franchise.

  • I hypothesize that these characters, on average, are significantly taller than human males in the United States.

    • How should I test this?

Setting up a Formal Hypothesis Test

Specifying TWO Hypotheses

  • The NULL Hypothesis, \(H_{0}\), is the default. This is what we try to DISPROVE

  • The ALTERNATIVE Hypothesis, \(H_{A}\), is what we hope to prove.

  • For our Star Wars Data, we hope to prove that the average height of Star Wars male humans is significantly taller than the population mean height of males in the United States.

    • \(\mu_{0} = 176\): Population mean of male heights in the United States

    • \(\overline{X}_{SW}\) is the sample mean of heights for Star Wars Characters

      • \(\overline{X}_{SW}\) is an estimate of \(\mu_{SW}\), the population mean of all Star Wars male heights
  • \(H_{0}:\) Average heights of Star Wars males are less than or equal to the average heights of US males.

  • \(H_{A}:\) Average heights of Star Wars males are greater than the average heights of US males.

Important Details about These Hypotheses

\(H_{0}:\) Average heights of Star Wars males are less than or equal to the average height of all US males.

\(H_{A}:\) Average heights of Star Wars males are greater than the average height of all US males.

Here are the same Hypotheses written in Formal Notation:

  • \(H_{0}: \mu_{SW} \leq \mu_{0}\)

  • \(H_{A}: \mu_{SW} \gt \mu_{0}\)

  • Notice that we specify what we are trying to prove as the ALTERNATIVE

    • We CAN NEVER prove that the Null hypothesis, \(H_{0}\), is true.

    • We assume the null hypothesis \(H_{0}\) is true, and test if our data contradict this assumption.

    • The null hypothesis is ALWAYS specified to include an equality: \(\leq\), \(\geq\), or \(=\)

    • The alternative hypothesis is ALWAYS specified as a strict inequality: \(\lt\), \(\gt\), or \(\neq\)
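In R, the direction of the alternative hypothesis is passed to `t.test` through its `alternative` argument. The sketch below uses a small invented sample (`x` is hypothetical, not the Star Wars data) only to show how each pair of hypotheses maps to the syntax.

```r
# Hypothetical sample, invented only to illustrate the syntax
x    <- c(178, 183, 175, 190, 181, 186, 179, 184)
mu_0 <- 176

# H0: mu <= mu_0  vs.  HA: mu > mu_0
p_greater <- t.test(x, mu = mu_0, alternative = "greater")$p.value

# H0: mu >= mu_0  vs.  HA: mu < mu_0
p_less <- t.test(x, mu = mu_0, alternative = "less")$p.value

# H0: mu = mu_0  vs.  HA: mu != mu_0 (the default)
p_two_sided <- t.test(x, mu = mu_0, alternative = "two.sided")$p.value

c(greater = p_greater, less = p_less, two_sided = p_two_sided)
```

For a one sample t-test, the two one-sided p-values add up to 1, and the two-sided p-value is twice the smaller of them.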

Testing Our Hypothesis and Drawing a Conclusion

\(H_{0}: \mu_{SW} \leq 176\)

\(H_{A}: \mu_{SW} \gt 176\)

```{r echo=T}
library(readr)  # provides read_csv()
sw_male_heights <- read_csv("data/StarWars_Human_Male_Heights.csv", show_col_types = FALSE)
# One-sided test of H0: mu <= 176 vs. HA: mu > 176
t.test(sw_male_heights$height, mu = 176, alternative = "greater")
```

    One Sample t-test

data:  sw_male_heights$height
t = 3.7177, df = 22, p-value = 0.0005989
alternative hypothesis: true mean is greater than 176
95 percent confidence interval:
 179.4159      Inf
sample estimates:
mean of x 
 182.3478 
  • This is a ONE-SIDED test.

  • We reject the null hypothesis if our sample mean is significantly GREATER than 176.

  • The p-value indicates the probability of seeing sample data at least as extreme as ours if the null hypothesis is true.
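To see where the p-value comes from, it can be reproduced from the test statistic and degrees of freedom printed in the output. This sketch assumes only the rounded values shown above (`t = 3.7177`, `df = 22`), so the result matches the printed p-value up to rounding.

```r
# Values copied from the t.test output
t_stat <- 3.7177
df     <- 22

# One-sided (upper tail) p-value: P(T >= t_stat) when H0 is true
p_value <- pt(t_stat, df = df, lower.tail = FALSE)
p_value
```

Because the alternative is "greater", only the upper tail of the t distribution is used.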

Lecture 17 In-class Exercises - Q2

Interpreting the p-value from the t.test


    One Sample t-test

data:  sw_male_heights$height
t = 3.7177, df = 22, p-value = 0.0005989
alternative hypothesis: true mean is greater than 176
95 percent confidence interval:
 179.4159      Inf
sample estimates:
mean of x 
 182.3478 
  • If the mean height of males in the Star Wars world is 176 or less, then the chance of seeing data at least as extreme as ours is at most

    • 0.0006 or 0.06%
  • The Confidence Interval ONLY shows the Lower Bound because this is a ONE-SIDED TEST

  • Based on this P-value what do we conclude?
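The one-sided lower bound in the output can also be rebuilt by hand, which may help show why there is no upper bound. This is a sketch that works backwards from the rounded numbers in the output, so the standard error `se` is recovered from the t statistic and is an approximation.

```r
# Values copied from the t.test output
x_bar  <- 182.3478  # sample mean
mu_0   <- 176       # hypothesized mean
t_stat <- 3.7177
df     <- 22

# t = (x_bar - mu_0) / se, so the standard error is:
se <- (x_bar - mu_0) / t_stat

# A one-sided 95% interval puts all 5% of the error in the lower tail
lower_bound <- x_bar - qt(0.95, df = df) * se
lower_bound

# Is the hypothesized mean below the whole interval?
lower_bound > mu_0
```

The interval runs from `lower_bound` to infinity because the alternative only considers values greater than 176.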

General Guidelines for Interpreting P-values

  • In the Star Wars Example, the P-value is very small so the decision is clear-cut

  • In many cases we have to set a cutoff, also called an \(\alpha\) level

    • Yes, this is related to \(\alpha\) in Confidence Intervals - stay tuned
  • For hypothesis tests:

    • \(\alpha\) is the type 1 error rate, the probability of rejecting \(H_{0}\) when it is actually true.

    • We (the analyst) specify the \(\alpha\) cutoff, typically but not always as 0.05
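A short simulation can make the meaning of the type 1 error rate concrete. In this sketch the null hypothesis is true by construction (the data are drawn with mean 176), so every rejection is a false rejection. The sample size of 23 mirrors the Star Wars example, while the standard deviation of 8, the seed, and the number of simulations are arbitrary assumptions.

```r
set.seed(261)   # arbitrary seed so the sketch is reproducible
alpha  <- 0.05
n_sims <- 2000

# Simulate many samples for which H0 (mu = 176) is actually true
p_values <- replicate(n_sims, {
  x <- rnorm(23, mean = 176, sd = 8)
  t.test(x, mu = 176, alternative = "greater")$p.value
})

# Proportion of simulated tests that (wrongly) reject H0
type1_rate <- mean(p_values < alpha)
type1_rate
```

The proportion of false rejections should land close to the chosen \(\alpha\), which is what "type 1 error rate" means.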

Four Possible Outcomes of a Hypothesis test

Choice of \(\alpha\)

Analysts most commonly set \(\alpha\) at 0.05 for Hypothesis Tests, but SOMETIMES 0.01 or 0.10 are used.

Interpreting p-values sensibly

  • By far, the most typical \(\alpha\) is 0.05.

  • Even with a cutoff, we should interpret the p-value along a spectrum

    • A p-value of 0.049 is almost identical to 0.05001.

    • It is wise to set an objective cutoff BEFORE we analyze the data,

      • BUT we also should put results in perspective.
  • In a standard situation when we use \(\alpha=0.05\), I think of the evidence against the null hypothesis along this spectrum:

    • 0.0 - 0.01 Extremely strong evidence against \(H_{0}\)
    • 0.011 - 0.03 Strong evidence against \(H_{0}\)
    • 0.031 - 0.049 Some evidence against \(H_{0}\)
    • 0.05 - 0.07 Suggestive evidence against \(H_{0}\)
    • 0.071 - 0.099 Minimal evidence against \(H_{0}\)
    • 0.1 and above No evidence against \(H_{0}\)
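The spectrum above could be turned into a small helper with `cut`. The function name `evidence_label` is hypothetical, and the handling of values that fall exactly on a boundary is an assumption of this sketch (for example, a p-value of exactly 0.01 lands in "Strong" here).

```r
# Hypothetical helper: label p-values using the spectrum above
evidence_label <- function(p) {
  cut(p,
      breaks = c(0, 0.01, 0.03, 0.05, 0.07, 0.10, 1),
      labels = c("Extremely strong", "Strong", "Some",
                 "Suggestive", "Minimal", "None"),
      right = FALSE,          # intervals are [lower, upper)
      include.lowest = TRUE)  # so that p = 1 is still labelled
}

# The Star Wars p-value plus two invented p-values
evidence_label(c(0.0005989, 0.06, 0.2))
```

Labels like these are a reading aid, not a replacement for reporting the p-value itself.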

Why choose a different \(\alpha\) than 0.05

\(\alpha\) is the probability that we falsely reject \(H_{0}\) when it is TRUE.

  • Depending on the discipline, we might want to set a different cutoff


  • In some disciplines, you want to minimize the chance of making a mistake.

    • For example, in a road safety or drug approval study, you want to be sure of your conclusion before going public.


  • In other disciplines, you might be willing to take a riskier, more exploratory approach to testing a hypothesis.

    • For example, in the initial stages of exploratory scientific research, you may relax your criteria in an initial pilot study to determine if further study is warranted.

A Two-Sided Hypothesis Test Example

We have data from two Coca-cola plants.

  • Each plant is required to fill each can with 12 ounces of soda.

  • If, on average, they are underfilling OR overfilling the cans, the plant will have to shut down and recalibrate the machinery.

  • The default is that each plant is working fine. That’s our null hypothesis, \(H_{0}\).

Lecture 17 In-class Exercises - Q3

How do we state the null and alternative hypotheses we are testing based on the question we want to answer?


A. \(H_{0}: \mu \leq 12\) vs. \(H_{A}: \mu \gt 12\)

B. \(H_{0}: \mu = 12\) vs. \(H_{A}: \mu \neq 12\)

C. \(H_{0}: \mu \geq 12\) vs. \(H_{A}: \mu \lt 12\)


NOTE: Translating a question into testable hypotheses is often challenging and takes some practice.

Lecture 17 In-class Exercises - Q4

Run the t-test command for a two-tailed test with \(\alpha = 0.05\) (default options) for both Plant 1 and Plant 2.

Which plant would need to shut down if we set \(\alpha = 0.05\)?

```{r eval = F, echo=T}
library(readr)  # provides read_csv()
coke <- read_csv("data/Coca-cola.csv", show_col_types = FALSE)

t.test(coke$Plant_1, mu=12)

t.test(coke$Plant_2, mu=12)
```
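The shut-down decision can be read directly from the object that `t.test` returns. Since the Coca-cola data file is not reproduced here, this sketch uses a small invented vector of fill volumes (`fills` is hypothetical), so the result says nothing about Plant 1 or Plant 2.

```r
# Hypothetical fill volumes in ounces, invented for illustration only
fills <- c(12.10, 12.20, 12.15, 12.30, 12.25, 12.20, 12.10, 12.18)
alpha <- 0.05

# Two-sided test of H0: mu = 12 vs. HA: mu != 12 (the default alternative)
result <- t.test(fills, mu = 12)

result$p.value   # compare this to alpha
result$conf.int  # 95% confidence interval for the mean fill

# Decision rule: reject H0 (shut down and recalibrate) if p-value < alpha
shut_down <- result$p.value < alpha
shut_down
```

The same pattern applies to `coke$Plant_1` and `coke$Plant_2` once the data are loaded.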

Complete Conclusions for Plant 1

Complete Conclusions for Plant 2

t-test P-values and Confidence Intervals

How are they related?

If the same \(\alpha\) is used for a hypothesis test and a confidence interval, results will agree.


  • If the P-value \(< \alpha\), we reject \(H_{0}\) and conclude that \(\mu \neq \mu_{0}\)

  • In that case, the (1 - \(\alpha\))x100% Confidence Interval does not include \(\mu_{0}\)


  • If the P-value \(\geq \alpha\), we DON’T reject \(H_{0}\) and conclude that there is no evidence to contradict \(\mu = \mu_{0}\)

  • In that case, the (1 - \(\alpha\))x100% Confidence Interval includes \(\mu_{0}\)
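The agreement between the two approaches can be checked in R for any data set. The sketch below defines a hypothetical helper, `agrees`, and runs it on simulated fill volumes (the seed, sample size, mean, and standard deviation are arbitrary assumptions) at three \(\alpha\) levels.

```r
# Hypothetical helper: does the p-value decision match the CI decision?
agrees <- function(x, mu_0, alpha) {
  res <- t.test(x, mu = mu_0, conf.level = 1 - alpha)
  reject_h0  <- res$p.value < alpha
  outside_ci <- mu_0 < res$conf.int[1] || mu_0 > res$conf.int[2]
  reject_h0 == outside_ci
}

# Simulated fill volumes, invented for illustration only
set.seed(17)
fills <- rnorm(30, mean = 12.05, sd = 0.2)

# Check agreement at alpha = 0.10, 0.05, and 0.01
sapply(c(0.10, 0.05, 0.01), function(a) agrees(fills, mu_0 = 12, alpha = a))
```

Each element is `TRUE` when rejecting \(H_{0}\) and "\(\mu_{0}\) falls outside the interval" give the same answer.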

Lecture 17 In-class Exercises - Q5

Shutting down a plant is very expensive and the owner feels these tests should have been done using \(\alpha = 0.01\)

Would either plant have to shut down if \(\alpha\) was set to 0.01 for this question?


NOTE that the P-value does not change but we change the Confidence Level to match our new \(\alpha = 0.01\).

```{r eval = F, echo=T}
t.test(coke$Plant_1, mu=12, conf.level = .99)

t.test(coke$Plant_2, mu=12, conf.level = .99)
```
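To see what `conf.level` does and does not change, the sketch below runs the same two-sided test at both confidence levels on an invented vector (`fills` is hypothetical, not the plant data) and compares the results.

```r
# Hypothetical fill volumes in ounces, invented for illustration only
fills <- c(11.90, 12.20, 12.15, 12.30, 11.95, 12.20, 12.10, 12.18)

res_95 <- t.test(fills, mu = 12, conf.level = 0.95)
res_99 <- t.test(fills, mu = 12, conf.level = 0.99)

# The p-value depends only on the data and mu, not on conf.level
c(p_95 = res_95$p.value, p_99 = res_99$p.value)

# The interval widens as the confidence level increases
width_95 <- diff(res_95$conf.int)
width_99 <- diff(res_99$conf.int)
c(width_95 = width_95, width_99 = width_99)
```

Only the interval changes, so the decision at \(\alpha = 0.01\) comes from comparing the same p-value to a stricter cutoff.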

NOTE: It is unethical to change \(\alpha\) AFTER looking at the data, but we will do it here to better understand the effect of these choices.