Inference of population means

Learning Objectives

  • State hypotheses for 1-sample and 2-sample t-tests

  • State the linear model formulations of 1-sample and 2-sample t-tests

  • State and check assumptions of 1-sample and 2-sample t-tests

  • Obtain and interpret results of 1-sample and 2-sample t-tests

  • Obtain and interpret confidence intervals for population means & mean differences

Load necessary libraries

# Load necessary packages
library(tidyverse)
library(ggthemes)
library(flextable)
# Set ggplot theme for visualizations
theme_set(ggthemes::theme_few())

# Set options for flextables
set_flextable_defaults(na_str = "NA")

# Load function for printing tables nicely
source("https://raw.githubusercontent.com/dilernia/STA323/main/Functions/make_flex.R")

Let’s import data to use for this activity

# Load Palmer penguins data
penguins <- readr::read_csv("https://raw.githubusercontent.com/dilernia/STA323/main/Data/penguins.csv")

One-sample t-test

Test if the true average flipper length of penguins is 200 mm

We are interested in testing whether the mean flipper length of all penguins in the Palmer Archipelago, Antarctica, is equal to 200 mm or not.

Formally, the one-sample \(t\)-test hypotheses are:

H0: \(\mu = 200\) vs. Ha: \(\mu \ne 200\)

where \(\mu\) is the mean flipper length (in mm) of all penguins in the Palmer Archipelago, Antarctica.

Next, we calculate some summary statistics for the flipper lengths of penguins

Descriptive statistics

# Calculating descriptive statistics
quant1Stats <- penguins %>% 
  dplyr::summarize(
  Minimum = min(flipper_length_mm, na.rm = TRUE),
  Q1 = quantile(flipper_length_mm, na.rm = TRUE, probs = 0.25),
  M = median(flipper_length_mm, na.rm = TRUE),
  Q3 = quantile(flipper_length_mm, na.rm = TRUE, probs = 0.75),
  Maximum = max(flipper_length_mm, na.rm = TRUE),
  Mean = mean(flipper_length_mm, na.rm = TRUE),
  R = Maximum - Minimum,
  s = sd(flipper_length_mm, na.rm = TRUE),
  n = n()  # number of rows; this count includes penguins with a missing flipper length
)

# Printing table of statistics
quant1Stats %>% 
  make_flex(caption = "Quantitative summary statistics for penguin flipper lengths (mm).")
Table 1: Quantitative summary statistics for penguin flipper lengths (mm).

Minimum   Q1       M        Q3       Maximum   Mean     R       s       n
172.00    190.00   197.00   213.00   231.00    200.92   59.00   14.06   344

Let’s get a histogram to explore the distribution of the flipper lengths.

Histogram

# Creating a histogram
penguins %>% 
  ggplot(aes(x = flipper_length_mm)) + 
  geom_histogram(color = "white") +
  scale_y_continuous(expand = expansion(mult = c(0, 0.10))) +
  labs(title = "Penguin flipper lengths",
       x = "Flipper length (mm)",
       y = "Frequency",
       caption = "Data source: palmerpenguins R package")

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

This is not a normal distribution because:

  • The histogram is slightly right-skewed

  • It is bimodal as opposed to normal distribution with one peak

We can also create a box plot of the data.

Box plot

# Creating a box plot
penguins %>% 
  ggplot(aes(x = flipper_length_mm)) + 
  geom_boxplot() +
  scale_y_discrete(breaks = NULL) +
    labs(title = "Penguin flipper lengths",
       x = "Flipper length (mm)",
       caption = "Data source: palmerpenguins R package")

We can also use a quantile-quantile (QQ) plot.

Quantile-quantile plot

# Creating a Quantile-Quantile (QQ) plot
penguins %>% 
  ggplot(aes(sample = flipper_length_mm)) + 
  stat_qq_line() +
  stat_qq() +
  labs(title = "QQ-plot for penguin flipper lengths",
       x = "Theoretical normal quantiles",
       y = "Empirical quantiles",
       caption = "Data source: palmerpenguins R package")

These data are not normally distributed because:

  • The points stray from the reference line in the QQ plot.

  • For normally distributed data, the points should fall close to the reference line across the whole range, including the tails.
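To make the idea behind a QQ plot concrete, the following is a minimal base-R sketch of the coordinates a normal QQ plot displays: the sorted sample values paired with the matching standard normal quantiles. It uses simulated values rather than the penguin measurements, and the names `qq_points`, `normalSample`, and `bimodalSample` are illustrative assumptions, not part of the activity.

```r
# Sketch: coordinates behind a normal QQ plot, using simulated data
# (assumption: these values are NOT the penguin measurements)
set.seed(323)
n <- 342

# A roughly normal sample and a clearly bimodal sample
normalSample  <- rnorm(n, mean = 200, sd = 14)
bimodalSample <- rnorm(n, mean = sample(c(190, 217), n, replace = TRUE), sd = 6)

# Hypothetical helper: pair sorted data with theoretical normal quantiles
qq_points <- function(x) {
  x <- sort(x[!is.na(x)])
  data.frame(theoretical = qnorm(ppoints(length(x))),
             empirical = x)
}

# A correlation near 1 means the points hug a straight line
qqCorNormal  <- with(qq_points(normalSample), cor(theoretical, empirical))
qqCorBimodal <- with(qq_points(bimodalSample), cor(theoretical, empirical))
c(normal = qqCorNormal, bimodal = qqCorBimodal)
```

The bimodal sample should give a visibly lower correlation, which is the numeric counterpart of points bending away from the reference line.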

Checking the model assumptions

State each assumption of the one-sample t-test, indicate whether it is met, and give evidence in each case.

Independent observations:

For the sake of the problem, we will assume that the penguins were randomly selected and that each measurement comes from a different penguin.

Normality assumption:

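Since n = 344 rows (342 of them with a recorded flipper length), which is > 100, the Central Limit Theorem tells us the sampling distribution of the sample mean is approximately normal. The normality assumption is therefore met even though the individual flipper lengths are not normally distributed.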

Since the assumptions are met we proceed with the t-test in R.
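The Central Limit Theorem argument above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is a made-up bimodal mixture (not the penguin data), and `draw_sample`, `sampleMeans`, and the mixture parameters are hypothetical choices for illustration.

```r
# Sketch: sampling distribution of the mean for a non-normal population
# (assumption: a simulated bimodal population, not the penguin data)
set.seed(323)
n <- 342          # observations per simulated sample
reps <- 2000      # number of simulated samples

# Draw one sample from a two-component normal mixture
draw_sample <- function(size) {
  centers <- sample(c(190, 217), size, replace = TRUE)
  rnorm(size, mean = centers, sd = 6)
}

# Sample mean from each simulated sample
sampleMeans <- replicate(reps, mean(draw_sample(n)))

# Theoretical standard error: population SD divided by sqrt(n)
popSD <- sqrt(6^2 + ((217 - 190) / 2)^2)
theoreticalSE <- popSD / sqrt(n)

# Compare simulated and theoretical standard errors
c(simulated = sd(sampleMeans), theoretical = theoreticalSE)

# hist(sampleMeans) should look bell-shaped even though the population is bimodal
```

With samples this large, the simulated sample means should cluster symmetrically around the population mean, which is why the t-test remains reasonable for non-normal data.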

Implementing a one-sample t-test

# Implementing a one-sample t-test using R
ttestRes <- t.test(penguins$flipper_length_mm,
                   mu = 200, conf.level = 0.95)

# Printing model output
ttestRes %>% 
  broom::tidy() %>% 
  make_flex()

estimate  statistic  p.value  parameter  conf.low  conf.high  method             alternative
200.92    1.20       0.23     341.00     199.42    202.41     One Sample t-test  two.sided
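As a check on where these numbers come from, the sketch below rebuilds the test statistic, p-value, and confidence interval by hand from the rounded summary statistics. It assumes n = 342 non-missing flipper lengths, which is implied by the 341 degrees of freedom in the output; because the inputs are rounded, the results may differ from the table in the last decimal place.

```r
# Sketch: rebuild the one-sample t-test from the reported summary statistics
ybar <- 200.92   # sample mean (mm)
s    <- 14.06    # sample standard deviation (mm)
n    <- 342      # assumed non-missing observations (degrees of freedom 341 + 1)
mu0  <- 200      # hypothesized mean under H0

se    <- s / sqrt(n)                   # standard error of the mean
tStat <- (ybar - mu0) / se             # test statistic
df    <- n - 1                         # degrees of freedom
pVal  <- 2 * pt(-abs(tStat), df = df)  # two-sided p-value

# 95% confidence interval for mu
ci <- ybar + c(-1, 1) * qt(0.975, df = df) * se

round(c(statistic = tStat, p.value = pVal, conf.low = ci[1], conf.high = ci[2]), 2)
```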

State our p-value, decision, and conclusion in the context of the problem testing at the 5% significance level, citing specific evidence from the obtained output.

  • If the p-value is < 0.05, reject H0 (if the p-value is low, reject H0);

  • If the p-value is >= 0.05, fail to reject H0 (if the p-value is high, fail to reject H0).

p-value: 0.23

Decision: Since the p-value of 0.23 is > 0.05, we fail to reject H0.

Interpretation: At the 5% significance level, we have insufficient evidence that the mean flipper length of penguins from the Palmer Archipelago, Antarctica, differs from 200 mm.

CI limits for \(\mu\): (199.42, 202.41)

We are 95% confident that the mean flipper length of penguins in Palmer … is between 199.42 mm and 202.41 mm.
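The learning objectives mention the linear model formulation of the one-sample t-test. A minimal sketch of that equivalence follows, assuming simulated flipper lengths rather than the penguin data; the object names (`flipperSim`, `shifted`, `lmSim`) are hypothetical. The idea is that testing \(\mu = 200\) is the same as testing whether the intercept of an intercept-only model for \(y - 200\) equals 0.

```r
# Sketch: one-sample t-test as an intercept-only linear model
# (assumption: simulated flipper lengths, not the penguin data)
set.seed(323)
flipperSim <- rnorm(50, mean = 201, sd = 14)

# Classical one-sample t-test of H0: mu = 200
ttestSim <- t.test(flipperSim, mu = 200)

# Linear model y - 200 = beta0 + error; testing beta0 = 0 is the same test
shifted <- flipperSim - 200
lmSim <- lm(shifted ~ 1)
lmCoefs <- summary(lmSim)$coefficients

# The t statistics and p-values agree
c(t.test = unname(ttestSim$statistic), lm = lmCoefs[1, "t value"])
c(t.test = ttestSim$p.value, lm = lmCoefs[1, "Pr(>|t|)"])

# Shifting the intercept's interval back by 200 gives the interval for mu
confint(lmSim) + 200
```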

Two-sample t-test

A formal statement of the hypotheses for testing whether penguins from Dream Island weigh the same, on average, as penguins from Biscoe Island is:

H0: \(\mu_1 = \mu_2\) vs. Ha: \(\mu_1 \ne \mu_2\)

where \(\mu_1\) is the mean body mass in grams of penguins from Dream Island, and \(\mu_2\) is the mean body mass in grams of penguins from Biscoe Island.

Calculating descriptive statistics

# Calculating descriptive statistics
quant2Stats <- penguins %>% 
  dplyr::filter(island %in% c("Biscoe", "Dream")) %>% 
  group_by(island) %>% 
  summarize(
  Minimum = min(body_mass_g, na.rm = TRUE),
  Q1 = quantile(body_mass_g, na.rm = TRUE, probs = 0.25),
  M = median(body_mass_g, na.rm = TRUE),
  Q3 = quantile(body_mass_g, na.rm = TRUE, probs = 0.75),
  Maximum = max(body_mass_g, na.rm = TRUE),
  Mean = mean(body_mass_g, na.rm = TRUE),
  R = Maximum - Minimum,
  s = sd(body_mass_g, na.rm = TRUE),
  n = n()  # number of rows per island; this count includes any penguin with a missing body mass
)

# Printing table of statistics
quant2Stats %>% 
  make_flex(caption = "Summary statistics for penguin body masses (g) by island.")
Table 3: Summary statistics for penguin body masses (g) by island.

island  Minimum   Q1        M         Q3        Maximum   Mean      R         s       n
Biscoe  2,850.00  4,200.00  4,775.00  5,325.00  6,300.00  4,716.02  3,450.00  782.86  168
Dream   2,700.00  3,400.00  3,687.50  3,956.25  4,800.00  3,712.90  2,100.00  416.64  124

Creating side-by-side box plots

# Creating side-by-side box plots
penguins %>% 
  dplyr::filter(island %in% c("Biscoe", "Dream")) %>% 
  ggplot(aes(x = island, y = body_mass_g, fill = island)) + 
  geom_boxplot() + 
  scale_fill_manual(values = c("#1F77B4", "#2CA02C")) +
        labs(title = "Penguin body masses by island",
       x = "Island",
       y = "Body mass (g)",
       caption = "Data source: palmerpenguins R package") +
  theme(legend.position = "none")

## Warning: Removed 1 rows containing non-finite values (`stat_boxplot()`).

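Describe the box plots in terms of center and symmetry

Both box plots are roughly symmetric, with no unusual outliers. The center is higher for Biscoe Island (median 4,775 g) than for Dream Island (median 3,687.5 g).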

Creating a histogram

# Creating a histogram
penguins %>% 
    dplyr::filter(island %in% c("Biscoe")) %>% 
  ggplot(aes(x = body_mass_g)) + 
  geom_histogram(color = "white", fill = "#1F77B4") +
  scale_y_continuous(expand = expansion(mult = c(0, 0.10))) +
  labs(title = "Biscoe island penguin body masses",
       x = "Body mass (g)",
       y = "Frequency",
       caption = "Data source: palmerpenguins R package")

Biscoe Island histogram is:

  • unimodal

  • fairly symmetric

  • slightly skewed

# Creating a histogram
penguins %>% 
    dplyr::filter(island %in% c("Dream")) %>% 
  ggplot(aes(x = body_mass_g)) + 
  geom_histogram(color = "white", fill = "#2CA02C") +
  scale_y_continuous(expand = expansion(mult = c(0, 0.10))) +
  labs(title = "Dream island penguin body masses",
       x = "Body mass (g)",
       y = "Frequency",
       caption = "Data source: palmerpenguins R package")

Dream Island histogram is:

  • unimodal

  • fairly symmetric

  • slightly right-skewed

  • average ≈ 3,800 to 3,900 g, read from the graph (the sample mean in the summary statistics table is 3,712.90 g)

We can make QQ plots for each as well

# Creating a Quantile-Quantile (QQ) plot
penguins %>% 
    dplyr::filter(island %in% c("Biscoe")) %>% 
  ggplot(aes(sample = body_mass_g)) + 
  stat_qq_line() +
  stat_qq(color = "white", fill = "#1F77B4", pch = 21) +
  labs(title = "QQ-plot for Biscoe island penguin body masses",
       x = "Theoretical normal quantiles",
       y = "Empirical quantiles",
       caption = "Data source: palmerpenguins R package")

  • There are some deviations, but the data look approximately normally distributed.

  • A little wandering away from the line is okay.

# Creating a Quantile-Quantile (QQ) plot
penguins %>% 
    dplyr::filter(island %in% c("Dream")) %>% 
  ggplot(aes(sample = body_mass_g)) + 
  stat_qq_line() +
  stat_qq(color = "white", fill = "#2CA02C", pch = 21) +
  labs(title = "QQ-plot for Dream island penguin body masses",
       x = "Theoretical normal quantiles",
       y = "Empirical quantiles",
       caption = "Data source: palmerpenguins R package")

  • Normality is met.

When each sample size is > 100, departures from normality in the data matter much less.

Since the sample sizes for both groups are greater than 100 (\(n_1 = 124\), \(n_2 = 168\)), the normality assumption is met by the Central Limit Theorem.

  • Moreover, since these are all measurements for separate penguins, the independence assumption is met.

Implementing a two-sample t-test

Let’s conduct the unpooled independent samples t-test

# Creating vectors of values
dreamMasses <- dplyr::filter(penguins, island == "Dream") %>% 
  dplyr::pull(body_mass_g)

biscoeMasses <- dplyr::filter(penguins, island == "Biscoe") %>% 
  dplyr::pull(body_mass_g)

# Implementing the two-sample t-test
ttestRes2 <- t.test(x = dreamMasses,
                    y = biscoeMasses,
                    mu = 0,
                    conf.level = 0.95)

# Printing model output
ttestRes2 %>% 
  broom::tidy() %>% 
  make_flex(ndigits = 1)

estimate  estimate1  estimate2  statistic  p.value  parameter  conf.low  conf.high  method                   alternative
-1,003.1  3,712.9    4,716.0    -14.1      <2e-16   264.8      -1,143.3  -862.9     Welch Two Sample t-test  two.sided
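To show how the Welch (unpooled) output arises, the sketch below recomputes the test statistic, degrees of freedom, and confidence interval from the rounded summary statistics. One labeled assumption: it uses 167 rather than 168 Biscoe penguins, treating the single row removed in the box plot warning as a Biscoe penguin with a missing body mass. That choice reproduces the reported 264.8 degrees of freedom.

```r
# Sketch: rebuild the Welch two-sample t-test from the summary statistics
# Group 1 = Dream, group 2 = Biscoe
ybar1 <- 3712.90; s1 <- 416.64; n1 <- 124
# Assumption: 167 non-missing Biscoe masses (the table's n of 168 counts a missing value)
ybar2 <- 4716.02; s2 <- 782.86; n2 <- 167

v1 <- s1^2 / n1                 # variance of the first sample mean
v2 <- s2^2 / n2                 # variance of the second sample mean
se <- sqrt(v1 + v2)             # unpooled standard error of the difference

tStat <- (ybar1 - ybar2) / se   # Welch test statistic
# Welch-Satterthwaite degrees of freedom
df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
pVal <- 2 * pt(-abs(tStat), df = df)

# 95% confidence interval for mu1 - mu2
ci <- (ybar1 - ybar2) + c(-1, 1) * qt(0.975, df = df) * se

round(c(statistic = tStat, parameter = df, conf.low = ci[1], conf.high = ci[2]), 1)
```

Because the inputs are rounded, small differences from the printed output in the last digit are possible.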

What is our estimate of \(\mu_1\), denoted by \(\bar{y}_1\), for this data set?

3,712.9g

What would be our estimate of the average body mass in grams of all penguins across the Dream and Biscoe islands, assuming an equal number of penguins living on each island?

Take the average of the two estimates: (3,712.9 g + 4,716.0 g) / 2 ≈ 4,214.5 g

How many penguins are there in this data set from each island?

From the summary statistics table: 168 penguins from Biscoe Island and 124 penguins from Dream Island.

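What is \(s_2^2\)?

Check the summary statistics table: take the standard deviation for group 2 (Biscoe Island), \(s_2 = 782.86\) g, and square it, giving \(s_2^2 = 782.86^2 \approx 612{,}870\) g².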

Provide our test statistic, p-value, and decision at the 5% significance level citing evidence from the output.

Test statistic: -14.1

p-value: <2e-16

Decision: Reject the null hypothesis, since the p-value is < 0.05 (if the p-value is low, reject H0).

The only possible decisions are to reject H0 or to fail to reject H0.

Interpretation of result: Since the p-value is less than 0.05, we have sufficient evidence at the 5% significance level that the mean body mass of penguins from Dream Island is less than that of penguins from Biscoe Island.

Any time we have a two sample t-test that is statistically significant (p-value < 0.05) we should indicate which group was higher or lower, on average in our interpretation.

Confidence interval limits: (-1,143.3, -862.9), which we report as (862.9, 1,143.3)

We are 95% confident that the mean body mass of penguins from Dream Island is between 862.9 and 1,143.3 grams less than that of penguins from Biscoe Island.

We never include negative bounds in a confidence interval interpretation; instead, we state which group is lower and report the positive magnitudes.

The confidence interval is more useful than the p-value alone, since it tells us whether or not there is a statistically significant difference between the two groups (the interval excludes 0) and it quantifies the magnitude of that difference.
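The learning objectives also list the linear model formulation of the two-sample t-test. The sketch below illustrates it with simulated body masses (an assumption; these are not the penguin data, and `massSim`, `lmSim2`, and the group sizes are hypothetical). One caveat worth noting: the slope test from `lm()` matches the pooled, equal-variance t-test, not the Welch test reported in this activity.

```r
# Sketch: two-sample comparison as a linear model with a group indicator
# (assumption: simulated body masses, not the penguin data)
set.seed(323)
massSim <- data.frame(
  island = factor(rep(c("Biscoe", "Dream"), times = c(60, 50))),
  body_mass_g = c(rnorm(60, mean = 4700, sd = 780),
                  rnorm(50, mean = 3700, sd = 420))
)

# Linear model: body_mass_g = beta0 + beta1 * (island is Dream) + error
lmSim2 <- lm(body_mass_g ~ island, data = massSim)
lmT <- summary(lmSim2)$coefficients["islandDream", "t value"]

# The slope test matches the pooled (equal-variance) t-test ...
pooledT <- t.test(body_mass_g ~ island, data = massSim, var.equal = TRUE)$statistic
# ... but generally not the Welch (unpooled) t-test
welchT <- t.test(body_mass_g ~ island, data = massSim)$statistic

# Signs differ: lm reports Dream minus Biscoe, t.test reports Biscoe minus Dream
c(lm = lmT, pooled = unname(pooledT), welch = unname(welchT))
```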