2024-09-22

What is hypothesis testing?

  • Hypothesis testing is a way to use data to make a decision. It helps you figure out whether a hypothesis about some thing is likely true or likely not true.
  • In this presentation we will be conducting a hypothesis test to understand how GDP affects life expectancy.

Hypothesis

We’ll define the null hypothesis and alternative hypothesis as follows:

\[H_0: \text{No relationship between GDP per capita} \\ \text{and life expectancy.} \]

\[ H_1: \text{There is a relationship between GDP per capita} \\ \text{and life expectancy.} \]

In mathematical terms:

\[ H_0: \beta_1 = 0 \] \[ H_1: \beta_1 \neq 0 \]

\(\beta_1\) is the slope of the regression line from the linear regression that we will conduct.

The Data

  • We will be looking at a dataset called gapminder. It that contains data on countries all around the globe, including info on Life Expectancy, GDP per capita, Population, Year, and Continent.

Data (continued)

Mean Life Exp Median Life Expy Mean GDP per Capita Median GDP per Capita
59.47444 60.7125 7215.327 3531.847

Testing the Hypothesis

  1. Re-iterate the Hypotheses:
    • \(H_0\): No significant relationship between life expectancy and GDP per capita exists (\(\beta_1 = 0\)).
    • \(H_1\): Significant relationship between life expectancy and GDP per exists (\(\beta_1 \neq 0\)).
  2. Significance Level:
    • We’ll choose \(\alpha = 0.05\).

Testing the Hypothesis (cont.)

Perform the Test:

  • We will perform a linear regression to examine the relationship between life expectancy and GDP per capita.

Linear Regression Equation

\[ Y = \beta_0 + \beta_1 X + \epsilon \]

R Code for the linear regression

model <- lm(lifeExp ~ gdpPercap, data = gapminder)

Results

Results (cont.)

  • \(\beta_1\) = 7.6488265^{-4}

  • p-value = 3.5657242^{-156}

  • And because \(\text{p-value} < \alpha = 0.05\), we can reject the null hypothesis \(\beta_1 \neq 0\)

  • Therefore there IS a significant relationship between GDP per capita and life expectancy

  • In this case, if we multiply \(\beta_1\) by 10,000, that means that according to the linear regression, with every $10k increase in GDP per capita, life expectancy can be expected to increase by ~7.64 years