Hypothesis Testing and p-Values

Example data

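We generate synthetic execution-time data for two algorithms to demonstrate hypothesis testing in practice. Each sample contains 30 normally distributed times with standard deviation 8; Algorithm A has a true mean of 120 and Algorithm B a true mean of 110. No random seed is set, so rerunning the code yields slightly different values each time.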

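# Simulate 30 execution times per algorithm from normal distributions
algo_A <- rnorm(30, mean = 120, sd = 8)
algo_B <- rnorm(30, mean = 110, sd = 8)

# Long-format data frame: one row per measurement, labelled by algorithm
df <- data.frame(
  time = c(algo_A, algo_B),
  algorithm = rep(c("Algorithm A", "Algorithm B"), each = 30)
)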
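As a quick sanity check on the simulated data, the per-algorithm sample means and standard deviations can be computed from the data frame. This is only a sketch: the seed value is an arbitrary assumption added for reproducibility and is not the seed behind any output shown in this document, and the names group_means and group_sds are illustrative.

```r
# Assumption: arbitrary seed, used only so this sketch is reproducible.
set.seed(42)
algo_A <- rnorm(30, mean = 120, sd = 8)
algo_B <- rnorm(30, mean = 110, sd = 8)

df <- data.frame(
  time = c(algo_A, algo_B),
  algorithm = rep(c("Algorithm A", "Algorithm B"), each = 30)
)

# Split the times by algorithm and summarise each group.
times_by_algo <- split(df$time, df$algorithm)
group_means <- sapply(times_by_algo, mean)
group_sds <- sapply(times_by_algo, sd)

group_means
group_sds
```

The sample means should land near the true means of 120 and 110, but not exactly on them, which is the sampling variability the test has to account for.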

What is Hypothesis Testing?

Hypothesis testing is a statistical method for deciding whether sample data provide enough evidence to reject a default assumption.

We:

1. Start with a null hypothesis.
2. Collect sample data.
3. Measure how likely data at least this extreme would be if the null hypothesis were true.

This likelihood is measured using a p-value.

Problem Statement: We compare the execution times of two algorithms.

Hypotheses:

Let \(\mu_A\) and \(\mu_B\) be the true mean execution times.

\[ H_0: \mu_A = \mu_B \]

\[ H_1: \mu_A \ne \mu_B \]

Execution Time Comparison

Let \(\bar{X}_A\) and \(\bar{X}_B\) denote the sample mean execution times of Algorithm A and Algorithm B, respectively.

\[ \Delta = \bar{X}_A - \bar{X}_B \]

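This quantity measures the observed difference in mean execution time between the two algorithms. A positive \(\Delta\) means Algorithm A was slower on average in the sample.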
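A minimal sketch of computing this difference in R, assuming the same simulation setup as the example data. The seed is an arbitrary assumption added for reproducibility, and delta is an illustrative variable name.

```r
# Assumption: arbitrary seed, used only so this sketch is reproducible.
set.seed(42)
algo_A <- rnorm(30, mean = 120, sd = 8)
algo_B <- rnorm(30, mean = 110, sd = 8)

# Observed difference in sample means (A minus B).
delta <- mean(algo_A) - mean(algo_B)
delta
```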

Test Statistic & p-Value

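For a two-sample t-test that does not assume equal variances (Welch's t-test, which is what R's t.test runs by default), the test statistic is built from the sample means, the sample variances \(s_A^2\) and \(s_B^2\), and the sample sizes \(n_A\) and \(n_B\):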

\[ t = \frac{\bar{X}_A - \bar{X}_B} {\sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}} \]

The p-value is the probability of observing a test statistic at least as extreme as the one obtained, assuming the null hypothesis is true. It is not the probability that the null hypothesis itself is true.
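To make the formula concrete, the statistic and its two-sided p-value can be computed by hand. The sketch below assumes Welch's test, so the degrees of freedom come from the Welch-Satterthwaite approximation, which is not written out in the text above. The seed is an arbitrary assumption, and the variable names are illustrative.

```r
# Assumption: arbitrary seed, used only so this sketch is reproducible.
set.seed(42)
algo_A <- rnorm(30, mean = 120, sd = 8)
algo_B <- rnorm(30, mean = 110, sd = 8)

n_A <- length(algo_A)
n_B <- length(algo_B)

# Squared standard errors of each sample mean: s^2 / n.
v_A <- var(algo_A) / n_A
v_B <- var(algo_B) / n_B

# Test statistic: difference in means over its estimated standard error.
t_stat <- (mean(algo_A) - mean(algo_B)) / sqrt(v_A + v_B)

# Welch-Satterthwaite approximation to the degrees of freedom.
df_welch <- (v_A + v_B)^2 / (v_A^2 / (n_A - 1) + v_B^2 / (n_B - 1))

# Two-sided p-value: probability mass in both tails beyond |t|.
p_value <- 2 * pt(-abs(t_stat), df = df_welch)

c(t = t_stat, df = df_welch, p = p_value)
```

On the same data, these three values should agree with the statistic, df, and p-value reported by t.test(algo_A, algo_B).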

Running the Hypothesis Test in R

t.test(algo_A, algo_B)

## 
##  Welch Two Sample t-test
## 
## data:  algo_A and algo_B
## t = 4.8216, df = 56.249, p-value = 1.122e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##   6.736535 16.311390
## sample estimates:
## mean of x mean of y 
##  120.5487  109.0247
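The printed result can also be used programmatically. The sketch below extracts the p-value and confidence interval from the object returned by t.test and compares the p-value with a significance level. The level alpha = 0.05 is an assumed convention, not something stated above, and the seed is an arbitrary assumption, so the numbers will differ from the output shown.

```r
# Assumption: arbitrary seed, used only so this sketch is reproducible.
set.seed(42)
algo_A <- rnorm(30, mean = 120, sd = 8)
algo_B <- rnorm(30, mean = 110, sd = 8)

# Assumption: conventional 5% significance level.
alpha <- 0.05

result <- t.test(algo_A, algo_B)

result$p.value   # two-sided p-value
result$conf.int  # 95% confidence interval for mu_A - mu_B

# Reject H0 when the p-value falls below the chosen significance level.
reject_null <- result$p.value < alpha
reject_null
```

For the output shown above, the p-value of 1.122e-05 is far below 0.05, so the null hypothesis of equal means would be rejected at that level, and the confidence interval for the difference excludes 0.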

Sampling Distributions

p-Value Surface