Slide 1: What is Bootstrapping?

Bootstrapping is a resampling technique that approximates the sampling distribution of a statistic (such as the mean) by repeatedly sampling with replacement from the observed data. Standard errors and confidence intervals are then read off that distribution.

It is especially useful when we do not want to assume a normal distribution or when the sample is too small for large-sample formulas to be trusted.

Slide 2: Why Use Bootstrapping?

  • No assumption of normality
  • Works with small samples
  • Helps estimate standard errors and confidence intervals
  • Easily implemented with code

Slide 3: Bootstrapping Process

  1. Start with your observed sample of n values
  2. Resample n values from it with replacement (many times)
  3. Compute your statistic (e.g., mean) for each sample
  4. Analyze the distribution of those statistics
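
The four steps above can be sketched as a small helper. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `bootstrap_stat`, its arguments `x`, `stat`, and `B`, and the simulated skewed data are all hypothetical choices made for illustration.

```r
# Hypothetical helper: bootstrap_stat() is an illustrative name, not from any package.
bootstrap_stat <- function(x, stat = mean, B = 1000) {
  n <- length(x)
  # Steps 2 and 3: draw n values with replacement, compute the statistic, repeat B times
  replicate(B, stat(sample(x, size = n, replace = TRUE)))
}

set.seed(123)
x <- rexp(40, rate = 1 / 50)   # Step 1: the observed sample (simulated here as an assumption)
boot_medians <- bootstrap_stat(x, stat = median, B = 2000)

# Step 4: summarize the distribution of the bootstrap statistics
summary(boot_medians)
```

Passing the statistic as a function means the same helper can be reused for the median, a trimmed mean, or any other summary that takes a numeric vector.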

Slide 4: R Code for Resampling

set.seed(123)
library(ggplot2); library(plotly)

data <- rnorm(100, 50, 10)
boot_means <- replicate(1000, mean(sample(data, replace = TRUE)))
df <- data.frame(boot_means)

Slide 5: Distribution of Bootstrap Means

ggplot(df, aes(x = boot_means)) +
  geom_histogram(fill = "#8C1D40", bins = 30, color = "white") +
  labs(title = "Bootstrap Distribution of the Mean", x = "Mean", y = "Frequency")

Slide 6: Bootstrap Confidence Interval

ci <- quantile(boot_means, c(0.025, 0.975))
ggplot(df, aes(x = boot_means)) + geom_density(fill = "#8C1D40", alpha = 0.4) +
  geom_vline(xintercept = ci, linetype = "dashed", color = "blue") +
  labs(x = "Mean", y = "Density")

Slide 7: Interactive Plotly Histogram

plot_ly(x = ~boot_means, type = "histogram", nbinsx = 30, marker = list(color = "#8C1D40")) %>%
  layout(
    margin = list(t = 10),  # reduce top margin
    xaxis = list(title = "Bootstrapped Means"),
    yaxis = list(title = "Count")
  )

Slide 8: LaTeX - Bootstrap Standard Error

Let \(x_1, x_2, \dots, x_n\) be the sample.

The bootstrap estimate of standard error is:

\[ \hat{SE}_{boot} = \sqrt{ \frac{1}{B-1} \sum_{b=1}^{B} \left( \bar{x}^{(b)} - \bar{x}_{boot} \right)^2 } \]

Where:
- \(\bar{x}^{(b)}\) = mean of the b-th bootstrap sample
- \(\bar{x}_{boot}\) = average of all bootstrap means
- \(B\) = number of bootstrap samples
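
As a hedged illustration of the formula, the sketch below computes the bootstrap standard error term by term and then with `sd()`, which also divides by \(B - 1\). The simulated data and the variable names (`x`, `se_manual`, `se_boot`, `se_formula`) are assumptions made for the example. For the mean, the result can be compared with the textbook estimate \(s / \sqrt{n}\).

```r
set.seed(123)
x <- rnorm(100, mean = 50, sd = 10)   # simulated sample (assumption for illustration)
B <- 1000

# Mean of each of the B bootstrap samples
boot_means <- replicate(B, mean(sample(x, replace = TRUE)))

# Term-by-term version of the formula
x_bar_boot <- mean(boot_means)
se_manual  <- sqrt(sum((boot_means - x_bar_boot)^2) / (B - 1))

# sd() uses the same B - 1 denominator, so it gives the same value
se_boot <- sd(boot_means)

# Analytic standard error of the mean, for comparison
se_formula <- sd(x) / sqrt(length(x))

c(manual = se_manual, sd = se_boot, formula = se_formula)
```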

Slide 9: LaTeX - Bootstrap Confidence Interval

The percentile bootstrap confidence interval is given by:

\[ CI_{boot} = \left[ \text{quantile}_{0.025}(\bar{x}^{(b)}), \ \text{quantile}_{0.975}(\bar{x}^{(b)}) \right] \]

This interval is computed directly from the distribution of resampled statistics.

Advantages:
- Does not assume normality
- Adapts to the shape of your data
- Easy to compute using quantile() in R
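
A minimal sketch of the percentile interval at a general confidence level, assuming simulated data and the illustrative names `alpha`, `ci_boot`, and `ci_t`. The normal-theory interval from `t.test()` is included only as a point of comparison; for roughly normal data like this, the two should be close.

```r
set.seed(123)
x <- rnorm(100, mean = 50, sd = 10)   # simulated sample (assumption for illustration)
boot_means <- replicate(1000, mean(sample(x, replace = TRUE)))

alpha <- 0.05   # 1 - alpha is the confidence level

# Percentile bootstrap interval: lower and upper quantiles of the bootstrap means
ci_boot <- quantile(boot_means, probs = c(alpha / 2, 1 - alpha / 2))

# Normal-theory t interval, for comparison
ci_t <- t.test(x, conf.level = 1 - alpha)$conf.int

ci_boot
ci_t
```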

Slide 10: Conclusion

  • Bootstrapping is intuitive and powerful
  • It helps estimate standard errors and confidence intervals without strict assumptions
  • Works well even with small sample sizes
  • Visual, flexible, and easy to implement in R

Practice using it on your own datasets. It is a great tool for modern data analysis!