# Load required libraries
library(tidyverse)      # Data manipulation and visualization
library(knitr)          # Dynamic report generation
library(kableExtra)     # Enhanced table formatting
library(openxlsx)       # Excel file operations
library(moments)        # Statistical moments (skewness, kurtosis)
library(gridExtra)      # Arrange multiple plots
library(scales)         # Scale functions for visualization
library(DT)             # Interactive data tables

# Set ggplot theme
theme_set(theme_minimal(base_size = 12))

Executive Summary

This document explores bootstrap resampling as a non-parametric approach to statistical inference. Through four case studies, we show how resampling yields intuitive probability estimates without assuming a particular parametric distribution for the data.

Key Learning Objectives:

  • Understand the theoretical foundation of bootstrap resampling
  • Apply resampling methods to real-world business and scientific problems
  • Interpret bootstrap probability estimates and confidence intervals
  • Evaluate the strengths and limitations of non-parametric inference
  • Develop computational skills for implementing resampling in R

1 Introduction to Resampling Methods

1.1 Theoretical Foundation

Resampling is a straightforward method for drawing statistical conclusions from data. The principle is to regenerate the data many times by sampling with replacement from the observed sample, compute the statistic of interest on each resample, and base inferences on the distribution of those resampled statistics.

1.1.1 The Bootstrap Principle

“The bootstrap is a computer-based method for assigning measures of accuracy to statistical estimates.” - Efron & Tibshirani (1994)

Conceptual Illustration of Bootstrap Resampling

1.1.2 Key Characteristics

Sampling With Replacement:

  • Each data point may be selected multiple times in a single resample
  • Sample size of resample equals original sample size
  • Creates variation in resampled statistics
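
As a small illustration of these three properties, the sketch below draws a single resample from a hypothetical five-value vector. The values and the seed are made up for demonstration and are not taken from the case studies.

```r
# Hypothetical data, for illustration only
x <- c(12, 15, 9, 20, 17)

set.seed(1)  # arbitrary seed so the draw is repeatable
resample <- sample(x, size = length(x), replace = TRUE)

# The resample has the same length as the original sample
length(resample) == length(x)

# It contains only observed values, some of which may repeat
all(resample %in% x)
table(resample)

# Its mean will usually differ from the original mean
c(original = mean(x), resample = mean(resample))
```

Repeating the draw many times and recording the statistic each time produces the variation that the bootstrap relies on.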

Statistical Inference:

  • Empirical distribution approximates sampling distribution
  • Confidence intervals constructed from percentiles
  • Hypothesis testing via probability estimation
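
The points above can be sketched in a few lines of R. The example builds the empirical (bootstrap) distribution of a sample mean, reads a 95% percentile interval off it, and estimates a probability as a simple proportion. The data vector, the seed, the number of resamples `B`, and the threshold of 5 are all assumptions chosen for illustration.

```r
# Hypothetical sample, for illustration only
x <- c(4.1, 5.3, 6.0, 4.8, 5.9, 7.2, 5.1, 6.4)

set.seed(42)   # arbitrary seed
B <- 2000      # number of bootstrap resamples (an assumed value)

# Bootstrap distribution of the sample mean
boot_means <- replicate(B, mean(sample(x, replace = TRUE)))

# 95% percentile interval: the 2.5th and 97.5th percentiles
ci <- quantile(boot_means, probs = c(0.025, 0.975))
ci

# Estimated probability that the mean exceeds a hypothetical threshold of 5
p_above_5 <- mean(boot_means > 5)
p_above_5
```

The percentile interval is only one of several bootstrap interval constructions. It is used here because it matches the percentile-based description above.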

1.2 Problems Addressed by Resampling

Bootstrap resampling is particularly valuable for addressing questions such as:

  1. Treatment Efficacy: What is the probability that online teaching improves student learning?
  2. Medical Interventions: What is the probability that Ivermectin reduces coronavirus infection incidence?
  3. Process Optimization: What is the probability that Machine 1 is most productive among three machines?
  4. Comparative Analysis: What is the probability that high-pressure manufacturing improves product yield?

2 Case Study 1: Manufacturing Process Optimization

2.1 Problem Statement

Research Question: Does high-pressure manufacturing significantly improve product yield compared to low-pressure conditions?

An industrial engineer conducted an experiment producing batches of a product under two different pressure conditions:

  • High Pressure: 10 batches produced
  • Low Pressure: 8 batches produced

We aim to estimate the probability that high pressure yields superior results using bootstrap resampling.


2.2 Data Presentation

# Original manufacturing data
high_pressure <- c(17.55, 39.93, 48.98, 41.40, 35.70, 
                   42.24, 54.75, 46.96, 30.19, 34.80)
low_pressure <- c(37.37, 43.20, 34.85, 31.22, 26.59, 
                  42.67, 10.00, 43.70)

# Create comprehensive data frame
max_length <- max(length(high_pressure), length(low_pressure))
yield_data <- data.frame(
  Observation = 1:max_length,
  High_Pressure = c(high_pressure, rep(NA, max_length - length(high_pressure))),
  Low_Pressure = c(low_pressure, rep(NA, max_length - length(low_pressure)))
)

Table 1: Product Yield Data by Pressure Condition

Observation   High Pressure   Low Pressure
1             17.55           37.37
2             39.93           43.20
3             48.98           34.85
4             41.40           31.22
5             35.70           26.59
6             42.24           42.67
7             54.75           10.00
8             46.96           43.70
9             30.19           NA
10            34.80           NA

2.3 Exploratory Data Analysis

2.3.1 Descriptive Statistics

# Function to compute comprehensive descriptive statistics
compute_descriptive_stats <- function(data, label) {
  tibble(
    Statistic = c(
      "Sample Size (n)", "Mean (x̄)", "Median", 
      "Standard Deviation (s)", "Variance (s²)",
      "Standard Error (SE)", "Minimum", "Maximum", 
      "Range", "IQR", "CV (%)", "Skewness"
    ),
    Value = c(
      length(data),
      mean(data),
      median(data),
      sd(data),
      var(data),
      sd(data)/sqrt(length(data)),
      min(data),
      max(data),
      max(data) - min(data),
      IQR(data),
      (sd(data)/mean(data))*100,
      skewness(data)
    )
  ) %>%
    mutate(Value = round(Value, 4))
}

# Compute statistics for both conditions
stats_high <- compute_descriptive_stats(high_pressure, "High Pressure")
stats_low <- compute_descriptive_stats(low_pressure, "Low Pressure")

# Combine for display
stats_combined <- left_join(
  stats_high, stats_low, 
  by = "Statistic", 
  suffix = c("_High", "_Low")
)

Table 2: Comprehensive Descriptive Statistics

Statistic         High Pressure   Low Pressure
Sample Size (n)   10.0000         8.0000
Mean (x̄)          39.250          33.700

2.3.2 Key Observations

# Calculate observed difference
observed_diff <- mean(high_pressure) - mean(low_pressure)

Observed Mean Difference: 5.55 units

  • High Pressure Mean: 39.25
  • Low Pressure Mean: 33.70
  • Relative Difference: 16.47%

Critical Question: Is this observed difference statistically meaningful, or could it have occurred by chance?


2.3.3 Outlier Detection

# Function for outlier detection using Tukey's method
detect_outliers <- function(data, label) {
  Q1 <- quantile(data, 0.25)
  Q3 <- quantile(data, 0.75)
  IQR_val <- Q3 - Q1
  lower_bound <- Q1 - 1.5 * IQR_val
  upper_bound <- Q3 + 1.5 * IQR_val
  
  outliers <- data[data < lower_bound | data > upper_bound]
  
  list(
    outliers = outliers,
    lower = lower_bound,
    upper = upper_bound,
    Q1 = Q1,
    Q3 = Q3,
    IQR = IQR_val
  )
}

# Detect outliers in both datasets
outliers_high <- detect_outliers(high_pressure, "High Pressure")
outliers_low <- detect_outliers(low_pressure, "Low Pressure")

Table 3: Outlier Detection Results (Tukey’s Method)

Condition       Lower Fence   Upper Fence   Outliers Detected
High Pressure   18.89         61.91         17.55
Low Pressure    10.95         61.91         10.00

Methodological Note: Tukey's method flags one value in each group as a statistical outlier: 17.55 in the High Pressure group and 10.00 in the Low Pressure group. We proceed with all data to demonstrate the robustness of bootstrap methods, but readers should consider:

  1. Potential measurement error
  2. Process deviation or equipment malfunction
  3. Sensitivity analysis (with/without outliers)
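
One way to carry out the sensitivity analysis in item 3 is to recompute the observed mean difference after dropping the values outside Tukey's fences. The sketch below does this with the yield data from Section 2.2. The helper `drop_tukey_outliers()` is a hypothetical name introduced here, not part of the original analysis.

```r
# Yield data as given in Section 2.2
high_pressure <- c(17.55, 39.93, 48.98, 41.40, 35.70,
                   42.24, 54.75, 46.96, 30.19, 34.80)
low_pressure <- c(37.37, 43.20, 34.85, 31.22, 26.59,
                  42.67, 10.00, 43.70)

# Hypothetical helper: keep only values inside Tukey's fences
drop_tukey_outliers <- function(x) {
  q <- quantile(x, c(0.25, 0.75))
  fence <- 1.5 * (q[2] - q[1])
  x[x >= q[1] - fence & x <= q[2] + fence]
}

# Observed difference with all data and with flagged values removed
diff_all <- mean(high_pressure) - mean(low_pressure)
diff_trimmed <- mean(drop_tukey_outliers(high_pressure)) -
  mean(drop_tukey_outliers(low_pressure))

round(c(all_data = diff_all, outliers_removed = diff_trimmed), 2)
```

With both flagged values removed, the observed difference shrinks from 5.55 to about 4.58 but stays positive. The same comparison could be repeated on the bootstrap probability itself.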

2.3.4 Data Visualization

# Create side-by-side boxplots
data_long <- data.frame(
  Yield = c(high_pressure, low_pressure),
  Condition = factor(
    c(rep("High Pressure", length(high_pressure)),
      rep("Low Pressure", length(low_pressure))),
    levels = c("High Pressure", "Low Pressure")
  )
)

ggplot(data_long, aes(x = Condition, y = Yield, fill = Condition)) +
  geom_boxplot(alpha = 0.7, outlier.size = 3, outlier.color = "red") +
  geom_jitter(width = 0.1, alpha = 0.4, size = 2) +
  stat_summary(fun = mean, geom = "point", shape = 23, 
               size = 4, fill = "red", color = "darkred") +
  scale_fill_manual(values = c("High Pressure" = "steelblue", 
                                "Low Pressure" = "coral")) +
  labs(
    title = "Product Yield Distribution by Pressure Condition",
    subtitle = "Diamond = Mean | Box = Median & IQR | Red dots = Outliers",
    x = "",
    y = "Product Yield (units)",
    caption = "Data: Manufacturing experiment (n_high=10, n_low=8)"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    legend.position = "none",
    plot.title = element_text(face = "bold", hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5, size = 11)
  )
Figure 1: Comparative Boxplots of Product Yields

ggplot(data_long, aes(x = Yield, fill = Condition)) +
  geom_density(alpha = 0.5, linewidth = 1) +
  geom_vline(xintercept = mean(high_pressure), 
             color = "darkblue", linetype = "dashed", linewidth = 1) +
  geom_vline(xintercept = mean(low_pressure), 
             color = "darkred", linetype = "dashed", linewidth = 1) +
  scale_fill_manual(values = c("High Pressure" = "steelblue", 
                                "Low Pressure" = "coral")) +
  labs(
    title = "Probability Density Functions of Product Yields",
    subtitle = "Dashed lines represent group means",
    x = "Product Yield (units)",
    y = "Density",
    fill = "Condition"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5),
    legend.position = "top"
  )
Figure 2: Kernel Density Estimation of Yield Distributions


2.4 Bootstrap Resampling Methodology

2.4.1 Algorithmic Framework

The bootstrap procedure follows these steps:

  1. Initialize: Set the number of iterations \(B = 1000\)
  2. Resample: For \(b = 1\) to \(B\):
       • Draw \(n_{high}\) observations with replacement from the high pressure data
       • Draw \(n_{low}\) observations with replacement from the low pressure data
       • Calculate \(\bar{x}_{high}^{(b)}\) and \(\bar{x}_{low}^{(b)}\)
       • Record whether \(\bar{x}_{high}^{(b)} > \bar{x}_{low}^{(b)}\)
  3. Estimate: \(\hat{P}(\mu_{high} > \mu_{low}) = \frac{1}{B}\sum_{b=1}^{B} I(\bar{x}_{high}^{(b)} > \bar{x}_{low}^{(b)})\)
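
Because \(\hat{P}\) is a proportion computed from \(B\) resamples, it carries some simulation noise that depends on \(B\). Treating the \(B\) indicator values as independent draws, a standard approximation for this Monte Carlo standard error is \(\sqrt{\hat{P}(1-\hat{P})/B}\). The sketch below evaluates it for a hypothetical estimate of 0.80, which is an assumed value and not a result of this analysis. The function name `mc_se()` is likewise introduced only for illustration.

```r
# Monte Carlo standard error of a bootstrap proportion
mc_se <- function(p_hat, B) sqrt(p_hat * (1 - p_hat) / B)

p_hat <- 0.80                    # hypothetical estimate, for illustration only
B_values <- c(100, 1000, 10000)  # candidate numbers of iterations

data.frame(
  B = B_values,
  mc_se = round(mc_se(p_hat, B_values), 4)
)
```

For this assumed value, the simulation error falls from 0.04 at \(B = 100\) to roughly 0.013 at \(B = 1000\), which gives a rough sense of how much the reported probability could change between runs with different seeds.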

2.4.2 R Implementation

#' Bootstrap Resampling for Two-Sample Comparison
#'
#' @param high_data Numeric vector of high pressure observations
#' @param low_data Numeric vector of low pressure observations
#' @param n_iterations Number of bootstrap iterations (default: 1000)
#' @param seed Random seed for reproducibility
#'
#' @return List containing bootstrap results
perform_resampling <- function(high_data, low_data, 
                               n_iterations = 1000, seed = 123) {
  set.seed(seed)
  
  # Initialize storage
  high_greater <- numeric(n_iterations)
  mean_diff_distribution <- numeric(n_iterations)
  resampled_high_means <- numeric(n_iterations)
  resampled_low_means <- numeric(n_iterations)
  
  # Progress bar
  pb <- txtProgressBar(min = 0, max = n_iterations, style = 3)
  
  # Perform resampling
  for (i in seq_len(n_iterations)) {
    # Resample with replacement
    resampled_high <- sample(high_data, size = length(high_data), replace = TRUE)
    resampled_low <- sample(low_data, size = length(low_data), replace = TRUE)
    
    # Calculate statistics
    mean_high <- mean(resampled_high)
    mean_low <- mean(resampled_low)
    
    # Store results
    resampled_high_means[i] <- mean_high
    resampled_low_means[i] <- mean_low
    mean_diff_distribution[i] <- mean_high - mean_low
    high_greater[i] <- ifelse(mean_high > mean_low, 1, 0)
    
    # Update progress
    setTxtProgressBar(pb, i)
  }
  close(pb)
  
  return(list(
    high_greater = high_greater,
    mean_diff_distribution = mean_diff_distribution,
    resampled_high_means = resampled_high_means,
    resampled_low_means = resampled_low_means
  ))
}
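
For comparison, the same procedure can be written more compactly with `replicate()`. The sketch below is a hypothetical variant, not the implementation used in this report: the name `bootstrap_prob()` and its return structure are assumptions, and it omits the progress bar, which keeps knitted output free of progress text.

```r
# Hypothetical compact variant of the two-sample bootstrap
bootstrap_prob <- function(high_data, low_data,
                           n_iterations = 1000, seed = 123) {
  set.seed(seed)
  # Each replicate resamples both groups with replacement and
  # returns the difference between the resampled means
  diffs <- replicate(n_iterations, {
    mean(sample(high_data, replace = TRUE)) -
      mean(sample(low_data, replace = TRUE))
  })
  list(
    mean_diff_distribution = diffs,
    prob_high_greater = mean(diffs > 0)  # share of resamples with high > low
  )
}

# Usage with the yield data from Section 2.2
high_pressure <- c(17.55, 39.93, 48.98, 41.40, 35.70,
                   42.24, 54.75, 46.96, 30.19, 34.80)
low_pressure <- c(37.37, 43.20, 34.85, 31.22, 26.59,
                  42.67, 10.00, 43.70)

res <- bootstrap_prob(high_pressure, low_pressure)
res$prob_high_greater
```

The loop-based version above stores the resampled group means separately, which is useful for plotting. This variant keeps only the differences, which is enough for the probability estimate and for a percentile interval.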

2.4.3 Execute Bootstrap Analysis

# Perform bootstrap resampling
n_iterations <- 1000
results <- perform_resampling(high_pressure, low_pressure, n_iterations)
                                                      |==============================================================        |  89%  |                                                                              |===============================================================       |  89%  |                                                                              |===============================================================       |  90%  |                                                                              |===============================================================       |  91%  |                                                                              |================================================================      |  91%  |                                                                              |================================================================      |  92%  |                                                                              |=================================================================     |  92%  |                                                                              |=================================================================     |  93%  |                                                                              |=================================================================     |  94%  |                                                                              |==================================================================    |  94%  |                                                                              |==================================================================    |  95%  |                                                                              |===================================================================   |  95%  |                                                                              
|===================================================================   |  96%  |                                                                              |====================================================================  |  96%  |                                                                              |====================================================================  |  97%  |                                                                              |====================================================================  |  98%  |                                                                              |===================================================================== |  98%  |                                                                              |===================================================================== |  99%  |                                                                              |======================================================================|  99%  |                                                                              |======================================================================| 100%
# Calculate key statistics
probability_high_better <- mean(results$high_greater)
times_high_greater <- sum(results$high_greater)

# Bootstrap confidence intervals
ci_95 <- quantile(results$mean_diff_distribution, probs = c(0.025, 0.975))
ci_90 <- quantile(results$mean_diff_distribution, probs = c(0.05, 0.95))
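The percentile intervals above take the middle 95% (or 90%) of the bootstrap distribution directly. As a minimal, self-contained sketch of the same idea, the block below builds a bootstrap distribution from two made-up vectors (`group_a` and `group_b` are illustrative assumptions, not the study data) and reads the interval off its quantiles.

```r
# Minimal sketch: percentile bootstrap CI for a difference in means.
# group_a and group_b are made-up illustration data, not the study data.
set.seed(123)
group_a <- c(12.1, 15.3, 9.8, 14.2, 11.7, 13.5, 16.0, 10.4)
group_b <- c(10.2, 11.9, 8.7, 12.5, 9.9, 11.1)

n_boot <- 2000

# Each replicate resamples both groups with replacement
# and records the difference in resampled means
boot_diffs <- replicate(n_boot, {
  mean(sample(group_a, replace = TRUE)) - mean(sample(group_b, replace = TRUE))
})

# Percentile interval: the 2.5th and 97.5th percentiles of the distribution
ci_95_sketch <- quantile(boot_diffs, probs = c(0.025, 0.975))

# Share of resamples in which group A's mean exceeds group B's
prob_a_greater <- mean(boot_diffs > 0)
```

The percentile method is the simplest bootstrap interval; with samples this small, alternatives such as the BCa interval may behave differently.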

2.5 Results and Interpretation

2.5.1 Primary Findings

Table 4: Bootstrap Resampling Results Summary

Metric                                    Value
Bootstrap Iterations                      1000
High Pressure Mean > Low Pressure Mean    860
High Pressure Mean ≤ Low Pressure Mean    140
Probability (High > Low)                  86%
95% CI for Mean Difference (Lower)        -3.8197
95% CI for Mean Difference (Upper)        15.4894
Bootstrap Mean Difference                 5.5508
Bootstrap SD of Difference                4.9693
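As a rough cross-check on Table 4, the bootstrap mean and SD can be plugged into a normal approximation. This is only a sanity check under the assumption that the bootstrap distribution is approximately normal, which the report does not claim. If it holds, the approximation should land near the 86% counted directly from the resamples.

```r
# Values copied from Table 4
boot_mean_diff <- 5.5508
boot_sd_diff   <- 4.9693

# Under a normal approximation, P(difference > 0) = pnorm(mean / sd)
z_ratio <- boot_mean_diff / boot_sd_diff
p_normal_approx <- pnorm(z_ratio)

# Normal-approximation 95% interval, for comparison with the percentile CI
ci_normal_approx <- boot_mean_diff + c(-1, 1) * qnorm(0.975) * boot_sd_diff

round(z_ratio, 2)
round(p_normal_approx, 2)
round(ci_normal_approx, 2)
```

The approximation gives a probability of about 0.87 and an interval of roughly [-4.19, 15.29], both close to the resampling-based values in the table.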

2.5.2 KEY FINDING

Probability that High Pressure yields better results: 86%

The 95% confidence interval for the mean difference is [-3.82, 15.49] units.


2.5.3 Bootstrap Distribution Visualization

# Create bootstrap data frame
bootstrap_df <- data.frame(
  Iteration = 1:n_iterations,
  Mean_Difference = results$mean_diff_distribution,
  # as.integer() keeps the 0/1 levels valid even if the indicator is stored as logical
  High_Greater = factor(as.integer(results$high_greater), levels = c(0, 1))
)

# Main plot
# Note: after_stat() and linewidth replace the deprecated ..density.. and
# line `size` arguments (ggplot2 >= 3.4.0)
p1 <- ggplot(bootstrap_df, aes(x = Mean_Difference)) +
  geom_histogram(aes(y = after_stat(density)), bins = 40, 
                 fill = "steelblue", alpha = 0.7, color = "black") +
  geom_density(color = "darkblue", linewidth = 1.5) +
  geom_vline(xintercept = 0, color = "red", 
             linetype = "dashed", linewidth = 1.2) +
  geom_vline(xintercept = observed_diff, color = "darkgreen", 
             linetype = "solid", linewidth = 1.2) +
  geom_vline(xintercept = ci_95[1], color = "orange", 
             linetype = "dotted", linewidth = 1) +
  geom_vline(xintercept = ci_95[2], color = "orange", 
             linetype = "dotted", linewidth = 1) +
  annotate("text", x = observed_diff, 
           y = max(density(bootstrap_df$Mean_Difference)$y) * 0.9,
           label = "Observed\nDifference", color = "darkgreen", 
           fontface = "bold", hjust = -0.1, size = 4) +
  annotate("text", x = 0, 
           y = max(density(bootstrap_df$Mean_Difference)$y) * 0.7,
           label = "H₀: No\nDifference", color = "red", 
           fontface = "bold", hjust = 1.1, size = 4) +
  annotate("rect", xmin = ci_95[1], xmax = ci_95[2],
           ymin = 0, ymax = Inf, alpha = 0.1, fill = "orange") +
  labs(
    title = "Bootstrap Distribution of Mean Differences (High - Low Pressure)",
    subtitle = sprintf("P(High > Low) = %.2f%% | 95%% CI shown in shaded region", 
                       probability_high_better * 100),
    x = "Mean Difference (High Pressure - Low Pressure)",
    y = "Density",
    caption = paste("Based on", n_iterations, "bootstrap iterations")
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5, size = 11)
  )

print(p1)
Figure 3: Bootstrap Distribution of Mean Differences


# Cumulative probability plot
bootstrap_df$Cumulative_Prop <- cumsum(bootstrap_df$High_Greater == 1) / 
                                 seq_along(bootstrap_df$High_Greater)

# linewidth replaces the deprecated line `size` argument (ggplot2 >= 3.4.0)
ggplot(bootstrap_df, aes(x = Iteration, y = Cumulative_Prop)) +
  geom_line(color = "steelblue", linewidth = 1) +
  geom_hline(yintercept = probability_high_better, 
             color = "darkblue", linetype = "dashed", linewidth = 1) +
  geom_hline(yintercept = 0.95, color = "red", 
             linetype = "dotted", linewidth = 0.8) +
  geom_hline(yintercept = 0.90, color = "orange", 
             linetype = "dotted", linewidth = 0.8) +
  annotate("text", x = n_iterations * 0.7, 
           y = probability_high_better + 0.02,
           label = sprintf("Final: %.2f%%", probability_high_better * 100),
           color = "darkblue", fontface = "bold", size = 4) +
  annotate("text", x = n_iterations * 0.1, y = 0.95 + 0.01,
           label = "α = 0.05", color = "red", fontface = "bold") +
  annotate("text", x = n_iterations * 0.1, y = 0.90 + 0.01,
           label = "α = 0.10", color = "orange", fontface = "bold") +
  labs(
    title = "Convergence of Bootstrap Probability Estimate",
    subtitle = "Cumulative proportion where High Pressure mean exceeds Low Pressure mean",
    x = "Bootstrap Iteration",
    y = "Cumulative Proportion",
    caption = "Horizontal lines indicate conventional significance thresholds"
  ) +
  scale_y_continuous(limits = c(0, 1), labels = percent) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5, size = 11)
  )
Figure 4: Convergence of Bootstrap Probability Estimate
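
The convergence plot shows the estimate settling as iterations accumulate. One way to quantify the remaining simulation noise is the binomial Monte Carlo standard error of the proportion. The sketch below uses the values reported in Table 4. It captures only the noise from using a finite number of resamples, not the sampling uncertainty that comes from the small original samples.

```r
# Values reported in Table 4
p_hat  <- 0.86   # proportion of resamples with High > Low
n_boot <- 1000   # number of bootstrap iterations

# Monte Carlo standard error of a proportion estimated from n_boot draws
mc_se <- sqrt(p_hat * (1 - p_hat) / n_boot)

# Approximate 95% band for the simulation noise around the estimate
mc_interval <- p_hat + c(-1, 1) * qnorm(0.975) * mc_se

round(mc_se, 4)
round(mc_interval, 3)
```

The standard error is about 0.011, so rerunning the 1000 iterations with a different seed would typically move the 86% figure by only a point or two.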



2.5.4 Academic Interpretation

2.5.5 Statistical Conclusion: MODERATE EVIDENCE

With a bootstrap probability of 86.00%, there is MODERATE EVIDENCE of an advantage for high-pressure manufacturing, but the evidence is not conclusive: the 95% confidence interval for the mean difference, [-3.82, 15.49], includes zero.

Recommendations:

  • Collect additional data to increase statistical power
  • Investigate sources of variability
  • Consider cost-benefit analysis of implementing high pressure

2.5.6 Contextual Considerations

2.5.6.1 Sample Size Limitations

  • High Pressure: n = 10 (small sample)
  • Low Pressure: n = 8 (small sample)
  • Implication: Limited statistical power; larger samples would provide more robust inference

2.5.6.2 Practical Significance

  • Statistical significance ≠ Practical/economic significance
  • Mean difference: 5.55 units
  • Consider: Cost of high-pressure equipment, production efficiency, quality implications

2.5.6.3 Outlier Impact

  • The value 10.00 in the low-pressure group is a clear outlier
  • The bootstrap makes no distributional assumptions, but a resampled mean is still pulled by extreme values, so a sensitivity analysis is recommended
  • Investigate the cause of the extreme value
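
To illustrate how a single extreme value can be flagged and how strongly it pulls the mean, the sketch below applies Tukey's 1.5 × IQR rule to a made-up vector. The data in `x` are an assumption for illustration only, and the report's own outlier detection may use a different rule.

```r
# Made-up measurements for illustration only (not the study data)
x <- c(52, 55, 49, 51, 54, 10, 53, 50)

# Tukey's rule: flag points more than 1.5 * IQR beyond the quartiles
q <- quantile(x, probs = c(0.25, 0.75))
fence_low  <- q[[1]] - 1.5 * IQR(x)
fence_high <- q[[2]] + 1.5 * IQR(x)
flagged <- x[x < fence_low | x > fence_high]

# Compare location estimates with and without the flagged value
x_clean <- x[!x %in% flagged]
location_summary <- c(
  mean_all     = mean(x),
  mean_clean   = mean(x_clean),
  median_all   = median(x),
  median_clean = median(x_clean)
)
location_summary
```

In this toy example the mean shifts far more than the median once the flagged point is dropped, which is why bootstrapped means warrant a sensitivity check.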

2.6 Sensitivity Analysis

# Remove outliers for sensitivity check
high_pressure_clean <- high_pressure[!high_pressure %in% outliers_high$outliers]
low_pressure_clean <- low_pressure[!low_pressure %in% outliers_low$outliers]

# Perform resampling without outliers (if sufficient data remains)
if (length(high_pressure_clean) >= 5 && length(low_pressure_clean) >= 5) {
  results_clean <- perform_resampling(high_pressure_clean, low_pressure_clean, 
                                      n_iterations, seed = 456)
  probability_clean <- mean(results_clean$high_greater)
  
  # Create comparison
  sensitivity_comparison <- tibble(
    Analysis = c("With All Data", "Without Outliers"),
    `Sample Size High` = c(length(high_pressure), length(high_pressure_clean)),
    `Sample Size Low` = c(length(low_pressure), length(low_pressure_clean)),
    `Probability (%)` = c(probability_high_better * 100, probability_clean * 100),
    `Difference` = c(NA, (probability_clean - probability_high_better) * 100)
  )
}
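
For readers who want to experiment with the sensitivity check in isolation, the following is a hypothetical stand-in for a two-sample resampling routine. The name `resample_sketch`, its arguments, and the input vectors are assumptions for illustration. Only the two returned fields mirror those used in this section, and the report's own `perform_resampling()` may be implemented differently.

```r
# Hypothetical stand-in; not the report's perform_resampling() definition
resample_sketch <- function(high, low, n_iter = 1000, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  diffs <- numeric(n_iter)
  for (i in seq_len(n_iter)) {
    # Resample each group with replacement, keeping its original size
    diffs[i] <- mean(sample(high, replace = TRUE)) -
      mean(sample(low, replace = TRUE))
  }
  list(
    mean_diff_distribution = diffs,
    high_greater = as.integer(diffs > 0)  # 1 if resampled high mean > low mean
  )
}

# Made-up inputs for illustration only
demo <- resample_sketch(c(5, 7, 9, 6, 8), c(4, 6, 5, 7, 3),
                        n_iter = 500, seed = 456)
demo_probability <- mean(demo$high_greater)
```

Fixing the seed makes the comparison between the full and reduced data sets reproducible, although the two runs still draw different resamples because the inputs differ.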
                                                      |==============================================================        |  89%  |                                                                              |===============================================================       |  89%  |                                                                              |===============================================================       |  90%  |                                                                              |===============================================================       |  91%  |                                                                              |================================================================      |  91%  |                                                                              |================================================================      |  92%  |                                                                              |=================================================================     |  92%  |                                                                              |=================================================================     |  93%  |                                                                              |=================================================================     |  94%  |                                                                              |==================================================================    |  94%  |                                                                              |==================================================================    |  95%  |                                                                              |===================================================================   |  95%  |                                                                              
|===================================================================   |  96%  |                                                                              |====================================================================  |  96%  |                                                                              |====================================================================  |  97%  |                                                                              |====================================================================  |  98%  |                                                                              |===================================================================== |  98%  |                                                                              |===================================================================== |  99%  |                                                                              |======================================================================|  99%  |                                                                              |======================================================================| 100%
Table 5: Sensitivity Analysis - Impact of Outliers

Analysis          Sample Size High  Sample Size Low  Probability (%)  Difference (pp)
With All Data     10                8                86.0             NA
Without Outliers  9                 7                92.9             6.9
Figure 5: Sensitivity Analysis Visualization

2.6.1 Sensitivity Analysis Conclusion

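Results show MODERATE SENSITIVITY to outlier removal: the estimated probability rises from 86.0% to 92.9%, a change of 6.9 percentage points, which meets the ≥ 5 percentage point threshold.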

Interpretation: Careful consideration of outliers is warranted. Further investigation into the cause of extreme values is recommended before making final decisions.
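
The threshold check behind this conclusion can be sketched as follows. This is a minimal illustration that uses the two probabilities from Table 5 and the 5-point threshold quoted above; the variable names are hypothetical. The report does not state where the "moderate" band ends, so only the lower bound is tested.

```r
# Probabilities (%) taken from Table 5
prob_all_data    <- 86.0
prob_no_outliers <- 92.9

# Threshold quoted in the conclusion: a change of 5 or more percentage points
sensitivity_threshold <- 5

# Change in percentage points, rounded to the table's precision
change_pp <- round(prob_no_outliers - prob_all_data, 1)
exceeds_threshold <- abs(change_pp) >= sensitivity_threshold

cat(sprintf("Change: %.1f percentage points; meets threshold: %s\n",
            change_pp, exceeds_threshold))
```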


3 Problem 1: Influenza Drug Effectiveness

3.1 Research Context

A pharmaceutical company conducted a clinical trial comparing a new influenza drug against a placebo. The primary outcome was patient-reported symptom improvement.

Study Design:

  • Drug Group (n = 34): 25 felt better, 9 felt worse
  • Placebo Group (n = 19): 11 felt better, 8 felt worse

Research Question: What is the probability that the drug is more effective than the placebo?


3.2 Data Summary

# Clinical trial data
drug_better <- 25
drug_worse <- 9
drug_total <- drug_better + drug_worse

placebo_better <- 11
placebo_worse <- 8
placebo_total <- placebo_better + placebo_worse

# Create binary outcome vectors (1 = better, 0 = worse)
drug_results <- c(rep(1, drug_better), rep(0, drug_worse))
placebo_results <- c(rep(1, placebo_better), rep(0, placebo_worse))

# Summary table
trial_summary <- tibble(
  Group = c("Drug", "Placebo"),
  `Total Patients` = c(drug_total, placebo_total),
  `Felt Better` = c(drug_better, placebo_better),
  `Felt Worse` = c(drug_worse, placebo_worse),
  `Proportion Better (%)` = c(
    round((drug_better/drug_total)*100, 2),
    round((placebo_better/placebo_total)*100, 2)
  )
)

Table 6: Clinical Trial Results Summary

Group    Total Patients  Felt Better  Felt Worse  Proportion Better (%)
Drug     34              25           9           73.53
Placebo  19              11           8           57.89

Observed Difference in Proportions: 15.63 percentage points
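
Subtracting the rounded proportions in Table 6 gives 15.64 rather than 15.63. The short sketch below, with illustrative variable names, suggests the reported figure comes from the unrounded proportions, rounded only at the end.

```r
# Counts from the clinical trial data
drug_better    <- 25; drug_total    <- 34
placebo_better <- 11; placebo_total <- 19

prop_drug    <- drug_better / drug_total
prop_placebo <- placebo_better / placebo_total

# Difference computed from unrounded proportions, rounded once at the end
diff_unrounded <- (prop_drug - prop_placebo) * 100

# Difference computed from the already-rounded table values
diff_from_rounded <- round(prop_drug * 100, 2) - round(prop_placebo * 100, 2)

round(diff_unrounded, 2)     # 15.63
round(diff_from_rounded, 2)  # 15.64
```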


3.3 Bootstrap Analysis for Proportions

#' Bootstrap Resampling for Proportion Comparison
#'
#' @param drug_data Binary vector (1=success, 0=failure) for drug group
#' @param placebo_data Binary vector for placebo group
#' @param n_iterations Number of bootstrap iterations
#' @param seed Random seed
#'
#' @return List with bootstrap results
resample_proportions <- function(drug_data, placebo_data, 
                                 n_iterations = 1000, seed = 789) {
  set.seed(seed)
  drug_better_results <- numeric(n_iterations)
  prop_diff_distribution <- numeric(n_iterations)
  
  pb <- txtProgressBar(min = 0, max = n_iterations, style = 3)
  
  for (i in seq_len(n_iterations)) {  # seq_len() is safe even if n_iterations is 0
    # Resample with replacement
    resampled_drug <- sample(drug_data, size = length(drug_data), replace = TRUE)
    resampled_placebo <- sample(placebo_data, size = length(placebo_data), replace = TRUE)
    
    # Calculate proportions
    prop_drug <- mean(resampled_drug)
    prop_placebo <- mean(resampled_placebo)
    
    # Store results
    prop_diff_distribution[i] <- prop_drug - prop_placebo
    drug_better_results[i] <- as.numeric(prop_drug > prop_placebo)  # 1 if drug wins, else 0 (ties count as 0)
    
    setTxtProgressBar(pb, i)
  }
  close(pb)
  
  return(list(
    drug_better = drug_better_results,
    prop_diff_distribution = prop_diff_distribution
  ))
}

# Execute bootstrap
drug_resampling <- resample_proportions(drug_results, placebo_results, n_iterations)
prob_drug_better <- mean(drug_resampling$drug_better)

# Confidence intervals
ci_drug_95 <- quantile(drug_resampling$prop_diff_distribution, probs = c(0.025, 0.975))

3.4 Results

Table 7: Bootstrap Results for Drug Efficacy

| Metric | Value |
|---|---|
| Probability (Drug > Placebo) | 87.50% |
| 95% CI Lower Bound (%) | -8.99 |
| 95% CI Upper Bound (%) | 42.57 |
| Bootstrap Mean Difference (%) | 15.98 |
| Times Drug Superior | 875 / 1000 |

3.4.1 Clinical Finding

Probability that drug is more effective than placebo: 87.5%

95% Confidence Interval for difference in proportions: [-8.99%, 42.57%]
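
Both figures are read off the same bootstrap distribution, which is why a probability below 97.5% goes together with a 95% percentile interval that reaches below zero. A minimal sketch of that relationship follows. It uses a simulated stand-in vector: `boot_diff` and its normal shape are assumptions for illustration, not the study's actual replicates.

```r
set.seed(42)

# Hypothetical stand-in for drug_resampling$prop_diff_distribution:
# 1000 simulated differences in proportions (drug - placebo).
boot_diff <- rnorm(1000, mean = 0.16, sd = 0.13)

# Share of replicates in which the drug arm does better
prob_positive <- mean(boot_diff > 0)

# 95% percentile interval from the same replicates
ci_95 <- quantile(boot_diff, probs = c(0.025, 0.975))

# If more than 2.5% of replicates are <= 0, the lower bound cannot be positive
lower_bound_positive <- ci_95[[1]] > 0
```

Read this way, the probability and the interval are two summaries of one set of replicates rather than two separate tests.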


drug_bootstrap_df <- data.frame(
  Iteration = 1:n_iterations,
  Prop_Difference = drug_resampling$prop_diff_distribution * 100
)

ggplot(drug_bootstrap_df, aes(x = Prop_Difference)) +
  # Density-scaled histogram with kernel density overlay
  # (after_stat() and linewidth need ggplot2 >= 3.4.0)
  geom_histogram(aes(y = after_stat(density)), bins = 40, 
                 fill = "mediumseagreen", alpha = 0.7, color = "black") +
  geom_density(color = "darkgreen", linewidth = 1.5) +
  # Red dashed line: no difference between drug and placebo
  geom_vline(xintercept = 0, color = "red", 
             linetype = "dashed", linewidth = 1.2) +
  # Blue solid line: observed difference in the original sample
  geom_vline(xintercept = (drug_better/drug_total - placebo_better/placebo_total)*100, 
             color = "darkblue", linetype = "solid", linewidth = 1.2) +
  # Orange dotted lines and shaded band: 95% percentile interval
  geom_vline(xintercept = ci_drug_95[1]*100, 
             color = "orange", linetype = "dotted", linewidth = 1) +
  geom_vline(xintercept = ci_drug_95[2]*100, 
             color = "orange", linetype = "dotted", linewidth = 1) +
  annotate("rect", xmin = ci_drug_95[1]*100, xmax = ci_drug_95[2]*100,
           ymin = 0, ymax = Inf, alpha = 0.1, fill = "orange") +
  labs(
    title = "Bootstrap Distribution: Difference in Proportion Feeling Better",
    subtitle = sprintf("Drug - Placebo | P(Drug > Placebo) = %.2f%%", 
                       prob_drug_better * 100),
    x = "Difference in Proportion Feeling Better (%)",
    y = "Density",
    caption = paste("Based on", n_iterations, "bootstrap iterations")
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5, size = 11)
  )
Figure 6: Bootstrap Distribution of Proportion Differences



3.5 Clinical Interpretation

3.5.1 Recommendation: ADDITIONAL TRIALS WARRANTED

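The bootstrap gives moderate evidence of drug efficacy: the drug outperformed placebo in 87.5% of resamples, but the 95% confidence interval for the difference in proportions (-8.99% to 42.57%) includes zero. The results are promising but not definitive enough for regulatory approval.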

Next Steps:

  • Conduct larger clinical trial
  • Increase statistical power through extended recruitment
  • Consider stratified analysis by patient characteristics
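
To see why a larger trial is the natural next step, the sketch below compares bootstrap interval widths at two sample sizes. The response rates (60% vs 45%) and the arm sizes are hypothetical, chosen only for illustration, and `boot_ci_width` is a helper introduced here rather than part of the analysis above. It relies on the fact that resampling n binary outcomes with replacement is equivalent to a binomial draw.

```r
# Width of the 95% percentile interval for (drug - placebo) response rates.
# All inputs are hypothetical illustration values.
boot_ci_width <- function(n_per_arm, p_drug = 0.60, p_placebo = 0.45,
                          n_boot = 2000, seed = 123) {
  set.seed(seed)
  # Observed proportions for a trial of this size (rounded to whole patients)
  p_hat_drug    <- round(n_per_arm * p_drug) / n_per_arm
  p_hat_placebo <- round(n_per_arm * p_placebo) / n_per_arm
  # Resampling n binary outcomes with replacement is a Binomial(n, p_hat) draw
  boot_drug    <- rbinom(n_boot, n_per_arm, p_hat_drug) / n_per_arm
  boot_placebo <- rbinom(n_boot, n_per_arm, p_hat_placebo) / n_per_arm
  ci <- quantile(boot_drug - boot_placebo, probs = c(0.025, 0.975))
  unname(ci[2] - ci[1])
}

width_small <- boot_ci_width(n_per_arm = 50)
width_large <- boot_ci_width(n_per_arm = 400)
```

Under these assumed rates the interval at 400 patients per arm is much narrower than at 50, which is what makes an interval that currently spans zero potentially resolvable with more data.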

4 Problem 2: Stock Beta Comparison

4.1 Financial Theory Background

4.1.1 What is Beta (β)?

Beta is a fundamental measure in finance that quantifies how sensitive a stock’s returns are to movements in the overall market (its systematic risk):

\(\beta = \frac{\operatorname{Cov}(R_{stock}, R_{market})}{\operatorname{Var}(R_{market})}\)

Interpretation:

  • β > 1: Stock is MORE sensitive than the market and amplifies its moves (cyclical/aggressive)
  • β = 1: Stock moves, on average, one-for-one WITH the market
  • 0 < β < 1: Stock is LESS sensitive than the market and dampens its moves (defensive)
  • β < 0: Stock tends to move OPPOSITE to the market (rare)
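
The covariance ratio in the formula and the slope of a regression of stock returns on market returns are the same quantity, which is why beta can be estimated with `lm()`. A small check on simulated returns (the series and the true beta of 1.5 are made up for illustration):

```r
set.seed(1)

# Hypothetical monthly returns: a market series and a stock with true beta 1.5
market <- rnorm(120, mean = 0.01, sd = 0.04)
stock  <- 0.002 + 1.5 * market + rnorm(120, mean = 0, sd = 0.03)

# Beta from the definition: Cov(stock, market) / Var(market)
beta_cov <- cov(stock, market) / var(market)

# Beta as the slope of the regression stock ~ market
beta_lm <- unname(coef(lm(stock ~ market))[2])

# The two agree up to floating-point error
all.equal(beta_cov, beta_lm)
```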

4.1.2 Research Question

Is Microsoft (MSFT) less cyclical than Pfizer (PFE)?

Equivalently: What is the probability that MSFT has a lower beta than PFE?


4.2 Data Simulation

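Note: In practice, you would load actual historical return data from Betaresampling.xlsx. For pedagogical purposes, we instead simulate 150 monthly returns from a CAPM-style model in which the true betas are set to 1.2 for MSFT and 0.8 for PFE, so the estimates below can be checked against known values.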

# Simulate monthly returns (replace with actual data in practice)
set.seed(999)
n_months <- 150  # ~12.5 years of monthly data

# Market returns (S&P 500)
sp500_returns <- rnorm(n_months, mean = 0.01, sd = 0.04)

# Stock returns following CAPM: R = α + β*R_market + ε
msft_returns <- 0.002 + 1.2 * sp500_returns + rnorm(n_months, 0, 0.03)
pfe_returns <- 0.001 + 0.8 * sp500_returns + rnorm(n_months, 0, 0.025)

# Create financial data frame
financial_data <- tibble(
  Month = 1:n_months,
  Date = seq.Date(from = as.Date("2010-01-01"), 
                  by = "month", length.out = n_months),
  SP500 = sp500_returns * 100,  # Convert to percentage
  MSFT = msft_returns * 100,
  PFE = pfe_returns * 100
)
Table 8: Sample of Monthly Stock Returns (%)

| Month | Date | SP500 | MSFT | PFE |
|---|---|---|---|---|
| 1 | 2010-01-01 | -0.127 | 3.033 | 3.059 |
| 2 | 2010-02-01 | -4.250 | -0.671 | -3.807 |
| 3 | 2010-03-01 | 4.181 | 5.530 | 6.172 |
| 4 | 2010-04-01 | 2.080 | 4.978 | 4.543 |
| 5 | 2010-05-01 | -0.109 | 2.756 | 0.187 |
| 6 | 2010-06-01 | -1.264 | -1.610 | -2.795 |
| 7 | 2010-07-01 | -6.515 | -8.123 | -3.557 |
| 8 | 2010-08-01 | -4.067 | -0.241 | 0.515 |
| 9 | 2010-09-01 | -2.871 | 0.998 | -0.788 |
| 10 | 2010-10-01 | -3.484 | -5.719 | -1.506 |

4.3 Beta Estimation

#' Calculate Stock Beta using Linear Regression
#'
#' Beta = Slope coefficient from: R_stock ~ R_market
#'
#' @param stock_returns Numeric vector of stock returns
#' @param market_returns Numeric vector of market returns
#'
#' @return Numeric beta estimate
calculate_beta <- function(stock_returns, market_returns) {
  model <- lm(stock_returns ~ market_returns)
  return(coef(model)[2])  # Return slope coefficient
}

# Calculate original betas
beta_msft_original <- calculate_beta(msft_returns, sp500_returns)
beta_pfe_original <- calculate_beta(pfe_returns, sp500_returns)
Table 9: Original Beta Estimates

| Stock | Beta | Interpretation |
|---|---|---|
| Microsoft (MSFT) | 1.2036 | More cyclical than market |
| Pfizer (PFE) | 0.6926 | Less cyclical than market |
| Difference (MSFT - PFE) | 0.5109 | MSFT more cyclical |

4.4 Bootstrap Beta Analysis
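
The resampling in this section draws one set of month indices and applies it to every series, so each stock return stays matched with the market return from the same month. A minimal sketch of why that matters, on simulated data (all names and values here are hypothetical): resampling the two series with separate index draws breaks the pairing and pushes the estimated slope toward zero.

```r
set.seed(2024)

# Hypothetical returns with a true beta of 1.2
market <- rnorm(150, mean = 0.01, sd = 0.04)
stock  <- 0.002 + 1.2 * market + rnorm(150, mean = 0, sd = 0.03)

# Regression slope of y on x
slope <- function(y, x) unname(coef(lm(y ~ x))[2])

n_boot   <- 500
paired   <- numeric(n_boot)
unpaired <- numeric(n_boot)

for (i in seq_len(n_boot)) {
  idx_a <- sample(seq_along(market), replace = TRUE)
  idx_b <- sample(seq_along(market), replace = TRUE)
  # Same indices for both series: month-level pairing is preserved
  paired[i]   <- slope(stock[idx_a], market[idx_a])
  # Independent indices: stock and market returns come from different months
  unpaired[i] <- slope(stock[idx_b], market[idx_a])
}

mean_paired   <- mean(paired)    # stays near the full-sample slope
mean_unpaired <- mean(unpaired)  # collapses toward zero
```

Note that paired resampling of rows treats months as exchangeable; it preserves the cross-series pairing but not the time ordering of the returns.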

#' Bootstrap Resampling for Beta Comparison
#'
#' @param msft_ret Microsoft returns
#' @param pfe_ret Pfizer returns
#' @param sp500_ret Market returns
#' @param n_iterations Number of bootstrap iterations
#' @param seed Random seed
#'
#' @return List with bootstrap results
resample_beta <- function(msft_ret, pfe_ret, sp500_ret, 
                          n_iterations = 1000, seed = 111) {
  set.seed(seed)
  msft_lower <- numeric(n_iterations)
  beta_msft_dist <- numeric(n_iterations)
  beta_pfe_dist <- numeric(n_iterations)
  
  pb <- txtProgressBar(min = 0, max = n_iterations, style = 3)
  
  for (i in 1:n_iterations) {
    # Draw month indices with replacement; reusing them for every series
    # keeps each month's stock and market returns paired
    indices <- sample(seq_along(sp500_ret), size = length(sp500_ret), replace = TRUE)
    
    # Resample all series with same indices
    resampled_sp500 <- sp500_ret[indices]
    resampled_msft <- msft_ret[indices]
    resampled_pfe <- pfe_ret[indices]
    
    # Calculate betas
    beta_msft <- calculate_beta(resampled_msft, resampled_sp500)
    beta_pfe <- calculate_beta(resampled_pfe, resampled_sp500)
    
    # Store results
    beta_msft_dist[i] <- beta_msft
    beta_pfe_dist[i] <- beta_pfe
    msft_lower[i] <- as.numeric(beta_msft < beta_pfe)
    
    setTxtProgressBar(pb, i)
  }
  close(pb)
  
  return(list(
    msft_lower = msft_lower,
    beta_msft_dist = beta_msft_dist,
    beta_pfe_dist = beta_pfe_dist
  ))
}

# Execute beta resampling
beta_resampling <- resample_beta(msft_returns, pfe_returns, 
                                 sp500_returns, n_iterations)
prob_msft_lower_beta <- mean(beta_resampling$msft_lower)

4.5 Results

Table 10: Bootstrap Results for Beta Comparison
Metric                          Value
Probability (MSFT β < PFE β)    0.00%
Probability (MSFT β ≥ PFE β)    100.00%
Bootstrap Mean Beta - MSFT      1.2061
Bootstrap Mean Beta - PFE       0.6938
Bootstrap SD Beta - MSFT        0.0579
Bootstrap SD Beta - PFE         0.0476
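An estimate of 0.00% means the event was (almost) never observed among the resamples, not that it is impossible: the resolution of a bootstrap probability is limited by the number of resamples. The sketch below shows how an upper bound could be attached to a zero count. The resample count `n_boot` is a hypothetical stand-in for the document's `n_iterations`, and `n_events = 0` is an assumption consistent with the 0.00% reported in Table 10.

```r
# Hypothetical inputs: replace n_boot with the actual n_iterations
n_boot   <- 1000   # assumed number of bootstrap resamples
n_events <- 0      # resamples in which MSFT beta < PFE beta (assumed zero)

# Point estimate, as computed by mean() of the 0/1 indicator
p_hat <- n_events / n_boot

# Exact one-sided 95% upper bound on the resampling probability
# when zero events are observed in n_boot independent resamples
upper_exact <- 1 - 0.05^(1 / n_boot)

# "Rule of three" approximation to the same bound
upper_rule_of_three <- 3 / n_boot

c(p_hat = p_hat,
  upper_exact = upper_exact,
  upper_rule_of_three = upper_rule_of_three)
```

This bound describes Monte Carlo error only (how far the finite-resample estimate may sit from the ideal bootstrap probability); it says nothing about sampling error in the original return data.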

4.5.1 Investment Insight

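Estimated probability that MSFT has a lower beta than PFE: 0%

Estimated probability that MSFT's beta is at least as high as PFE's: 100%

These are bootstrap estimates based on a finite number of resamples, so 0% indicates a very small probability rather than a strict impossibility.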
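A rough plausibility check on this result is to express the gap between the two bootstrap mean betas in units of its standard deviation. The sketch below uses only the values reported in Table 10. It assumes the two bootstrap distributions are independent and roughly normal, which is a simplification: if both betas are estimated from the same resampled market returns they will be correlated, and the true spread of the gap will differ.

```r
# Values copied from Table 10
mean_msft <- 1.2061
mean_pfe  <- 0.6938
sd_msft   <- 0.0579
sd_pfe    <- 0.0476

# Gap between the bootstrap mean betas
gap <- mean_msft - mean_pfe

# SD of the gap under the (assumed) independence of the two distributions
sd_gap <- sqrt(sd_msft^2 + sd_pfe^2)

# Gap in SD units, and the matching normal-approximation tail probability
z        <- gap / sd_gap
p_normal <- pnorm(-z)

round(c(gap = gap, sd_gap = sd_gap, z = z), 4)
p_normal
```

A gap of more than six standard deviations is consistent with the table: under these assumptions, a resample in which MSFT's beta falls below PFE's would be extremely rare.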


# Create beta comparison data
beta_comparison_df <- data.frame(
  Beta = c(beta_resampling$beta_msft_dist, beta_resampling$beta_pfe_dist),
  Stock = rep(c("Microsoft (MSFT)", "Pfizer (PFE)"), each = n_iterations)
)

ggplot(beta_comparison_df, aes(x = Beta, fill = Stock)) +
  # linewidth replaces size for line geoms as of ggplot2 3.4.0
  geom_density(alpha = 0.5, linewidth = 1) +
  geom_vline(xintercept = beta_msft_original, 
             color = "darkblue", linetype = "dashed", linewidth = 1) +
  geom_vline(xintercept = beta_pfe_original, 
             color = "darkred", linetype = "dashed", linewidth = 1) +
  geom_vline(xintercept = 1, 
             color = "black", linetype = "dotted", linewidth = 1.2) +
  scale_fill_manual(values = c("Microsoft (MSFT)" = "steelblue", 
                                "Pfizer (PFE)" = "coral")) +
  annotate("text", x = 1, y = max(density(beta_comparison_df$Beta)$y) * 0.95,
           label = "Market β = 1", angle = 90, vjust = -0.5, 
           fontface = "bold", size = 4) +
  labs(
    title = "Bootstrap Distribution of Stock Betas",
    subtitle = sprintf("P(MSFT β < PFE β) = %.2f%%", prob_msft_lower_beta * 100),
    x = "Beta (β)",
    y = "Density",
    fill = "Stock",
    caption = "Dashed vertical lines represent original beta estimates"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5, size = 11),
    legend.position = "top"
  )
Figure 7: Bootstrap Distribution of Stock Betas

beta_pairs_df <- data.frame(
  MSFT_Beta = beta_resampling$beta_msft_dist,
  PFE_Beta = beta_resampling$beta_pfe_dist,
  MSFT_Lower = factor(beta_resampling$msft_lower, 
                      levels = c(0, 1),
                      labels = c("MSFT ≥ PFE", "MSFT < PFE"))
)

ggplot(beta_pairs_df, aes(x = PFE_Beta, y = MSFT_Beta, color = MSFT_Lower)) +
  geom_point(alpha = 0.3, size = 1.5) +
  geom_abline(slope = 1, intercept = 0, 
              color = "red", linetype = "dashed", linewidth = 1.2) +
  scale_color_manual(values = c("MSFT ≥ PFE" = "coral", 
                                 "MSFT < PFE" = "steelblue")) +
  labs(
    title = "Scatterplot of Bootstrap Beta Comparisons",
    subtitle = "Points below diagonal indicate MSFT has lower beta than PFE",
    x = "Pfizer Beta (β)",
    y = "Microsoft Beta (β)",
    color = "Comparison Result",
    caption = "Red diagonal line represents β_MSFT = β_PFE"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5, size = 11),
    legend.position = "top"
  )
Figure 8: Scatterplot of Bootstrap Beta Pairs


4.6 Investment Implications

4.6.1 Portfolio Recommendation

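Microsoft (MSFT) is very likely more sensitive to market movements (more cyclical) than Pfizer (PFE): its beta was at least as large as Pfizer's in 100% of bootstrap resamples.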

Characteristics:

  • MSFT: Higher beta → Higher systematic risk → Higher expected return (growth stock)
  • PFE: Lower beta → Lower systematic risk → More stable returns (defensive stock)

Investment Strategy:

  • Risk-Averse Investors: Consider PFE for stability and dividend income
  • Growth-Oriented Investors: MSFT may offer higher long-term returns
  • Diversification: Hold both to balance risk/return profile
  • Market Timing: Favor PFE during economic uncertainty, MSFT during expansion
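The chain "higher beta → higher systematic risk → higher expected return" in the characteristics above is the CAPM relationship E[R_i] = R_f + β_i (E[R_m] - R_f). As a minimal sketch, the bootstrap mean betas from Table 10 can be plugged into it. The risk-free rate and market risk premium below are purely hypothetical values chosen for illustration; neither comes from this analysis.

```r
# Bootstrap mean betas from Table 10
betas <- c(MSFT = 1.2061, PFE = 0.6938)

# Hypothetical market inputs (assumptions, not estimated in this document)
risk_free      <- 0.03   # annual risk-free rate
market_premium <- 0.05   # expected market return minus risk-free rate

# CAPM expected return for each stock
expected_return <- risk_free + betas * market_premium

round(expected_return, 4)
```

Under these assumed inputs the higher-beta stock receives the higher CAPM expected return. The ordering follows from the betas alone; the levels depend entirely on the assumed rates.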

5 Problem 3: Cholesterol Reduction Study

5.1 Study Design

5.1.1 Research Context

An educational intervention study examined whether a health talk on cholesterol dangers could motivate lifestyle changes leading to cholesterol reduction.

Design: Paired (before-after) single-group study

Participants: 8 workers

Intervention: Educational talk on dangers of high cholesterol

Outcome: Cholesterol levels (mg/dL) measured before and after intervention


5.2 Data

# Cholesterol measurements
cholesterol_before <- c(220, 195, 250, 200, 220, 260, 175, 198)
cholesterol_after <- c(210, 198, 210, 199, 224, 212, 179, 184)

# Create comprehensive data frame
cholesterol_data <- tibble(
  Worker_ID = 1:8,
  Before = cholesterol_before,
  After = cholesterol_after,
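  # Difference is Before - After, so positive values mean a reduction
  Difference = cholesterol_before - cholesterol_after,
  # Percent_Change is (After - Before) / Before, so negative values mean a reduction
  Percent_Change = round(((cholesterol_after - cholesterol_before) / 
                           cholesterol_before) * 100, 2),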
  Direction = case_when(
    Difference > 0 ~ "Decreased ↓",
    Difference < 0 ~ "Increased ↑",
    TRUE ~ "No Change"
  )
)
Table 11: Cholesterol Measurements Before and After Intervention
Worker   Before (mg/dL)   After (mg/dL)   Difference (mg/dL)   % Change   Direction
1        220              210             10                   -4.55      Decreased ↓
2        195              198             -3                   1.54       Increased ↑
3        250              210             40                   -16.00     Decreased ↓
4        200              199             1                    -0.50      Decreased ↓
5        220              224             -4                   1.82       Increased ↑
6        260              212             48                   -18.46     Decreased ↓
7        175              179             -4                   2.29       Increased ↑
8        198              184             14                   -7.07      Decreased ↓
Note: Difference = Before - After (positive = reduction); % Change = (After - Before) / Before × 100 (negative = reduction).

5.3 Descriptive Analysis

Table 12: Cholesterol Change Summary Statistics
Metric                     Value
Mean Cholesterol Before    214.75 mg/dL
Mean Cholesterol After     202.00 mg/dL
Mean Reduction             12.75 mg/dL
Median Reduction           5.50 mg/dL
SD of Differences          20.50 mg/dL
Workers with Decrease      5 (62.5%)
Workers with Increase      3 (37.5%)
Workers with No Change     0 (0.0%)
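The figures in Table 12 follow directly from the eight before/after measurements. A minimal sketch of one way to compute them is shown below; the object names `summary_stats` and `direction_counts` are illustrative and are not taken from the original analysis code.

```r
# Measurements from Section 5.2
before <- c(220, 195, 250, 200, 220, 260, 175, 198)
after  <- c(210, 198, 210, 199, 224, 212, 179, 184)
diffs  <- before - after   # positive = reduction

# Location and spread of the measurements and of the paired differences
summary_stats <- c(
  mean_before      = mean(before),
  mean_after       = mean(after),
  mean_reduction   = mean(diffs),
  median_reduction = median(diffs),
  sd_differences   = sd(diffs)   # sample SD (n - 1 denominator)
)

# Number of workers whose cholesterol fell, rose, or stayed the same
direction_counts <- c(
  decrease  = sum(diffs > 0),
  increase  = sum(diffs < 0),
  no_change = sum(diffs == 0)
)

round(summary_stats, 2)
direction_counts
```

The median reduction (5.50) sits well below the mean (12.75) because two workers with large drops (40 and 48 mg/dL) pull the mean upward.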

5.4 Visualization

# Create long format for plotting
cholesterol_long <- data.frame(
  Worker = rep(1:8, 2),
  Time = factor(rep(c("Before", "After"), each = 8), 
                levels = c("Before", "After")),
  Cholesterol = c(cholesterol_before, cholesterol_after)
)

# Create direction variable first
cholesterol_long <- cholesterol_long %>%
  group_by(Worker) %>%
  mutate(Direction = Cholesterol[Time == "Before"] > Cholesterol[Time == "After"]) %>%
  ungroup()

# Then plot
ggplot(cholesterol_long, aes(x = Time, y = Cholesterol, group = Worker)) +
  geom_line(aes(color = Direction), alpha = 0.6, linewidth = 1) +
  geom_point(size = 3) +
  scale_color_manual(values = c("FALSE" = "coral", "TRUE" = "steelblue"),
                     labels = c("Increased", "Decreased")) 
Figure 9: Individual Cholesterol Trajectories

ggplot(cholesterol_data, aes(x = Difference)) +
  geom_histogram(bins = 8, fill = "steelblue", 
                 color = "black", alpha = 0.7) +
  geom_vline(xintercept = 0, color = "red", 
             linetype = "dashed", linewidth = 1.2) +
  geom_vline(xintercept = mean(cholesterol_data$Difference), 
             color = "darkblue", linetype = "solid", linewidth = 1.2) +
  annotate("text", x = mean(cholesterol_data$Difference), 
           y = 2.5, label = sprintf("Mean\nReduction\n%.1f mg/dL", 
                                    mean(cholesterol_data$Difference)), 
           color = "darkblue", fontface = "bold", hjust = -0.2, size = 4) +
  labs(
    title = "Distribution of Cholesterol Changes (Before - After)",
    subtitle = "Positive values indicate reduction (improvement)",
    x = "Change in Cholesterol (mg/dL)",
    y = "Frequency",
    caption = "Red dashed line at zero = no change"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5, size = 11)
  )
Figure 10: Distribution of Cholesterol Changes


5.5 Bootstrap Analysis for Paired Data

#' Bootstrap Resampling for Paired Data
#'
#' Maintains pairing structure by resampling paired observations together
#'
#' @param before Numeric vector of before measurements
#' @param after Numeric vector of after measurements
#' @param n_iterations Number of bootstrap iterations
#' @param seed Random seed
#'
#' @return List with reduction results and difference distribution
resample_paired <- function(before, after, n_iterations = 1000, seed = 222) {
  # Pairing only makes sense when both vectors describe the same subjects
  stopifnot(length(before) == length(after), length(before) > 0)
  
  set.seed(seed)
  n_pairs <- length(before)
  reduction_occurred <- numeric(n_iterations)
  mean_diff_distribution <- numeric(n_iterations)
  
  pb <- txtProgressBar(min = 0, max = n_iterations, style = 3)
  
  for (i in seq_len(n_iterations)) {
    # Resample subject indices so each before/after pair stays together
    indices <- sample(seq_len(n_pairs), size = n_pairs, replace = TRUE)
    
    # Mean difference within the resample (positive = reduction)
    mean_diff <- mean(before[indices] - after[indices])
    mean_diff_distribution[i] <- mean_diff
    
    # Record 1 if the resampled mean shows a reduction, 0 otherwise
    reduction_occurred[i] <- as.numeric(mean_diff > 0)
    
    setTxtProgressBar(pb, i)
  }
  close(pb)
  
  list(
    reduction = reduction_occurred,
    diff_distribution = mean_diff_distribution
  )
}
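To see why the function draws a single set of indices for both vectors, it can help to compare it with a version that breaks the pairing. The sketch below is illustrative only: the 2,000 replicates and the object names `paired` and `unpaired` are assumptions made for this example. It resamples workers jointly in one case and resamples the before and after vectors independently in the other.

```r
before <- c(220, 195, 250, 200, 220, 260, 175, 198)
after  <- c(210, 198, 210, 199, 224, 212, 179, 184)
n      <- length(before)

set.seed(222)

# Paired bootstrap: one index vector selects both measurements of a worker
paired <- replicate(2000, {
  idx <- sample(n, replace = TRUE)
  mean(before[idx] - after[idx])
})

# Unpaired bootstrap: independent indices mix measurements across workers
unpaired <- replicate(2000, {
  mean(before[sample(n, replace = TRUE)]) - mean(after[sample(n, replace = TRUE)])
})

# Both centre near the observed mean reduction, but the spreads differ
round(c(paired_mean = mean(paired), unpaired_mean = mean(unpaired),
        paired_sd = sd(paired), unpaired_sd = sd(unpaired)), 2)
```

Because each worker's before and after values are strongly related, ignoring the pairing adds between-worker variability to the resampled mean difference and widens its distribution. Keeping pairs together is equivalent to bootstrapping the eight per-worker differences directly.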

# Execute paired resampling
cholesterol_resampling <- resample_paired(cholesterol_before, 
                                          cholesterol_after, n_iterations)
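Once a bootstrap distribution of mean differences is available, it can be summarised as a probability of reduction and a percentile confidence interval. The following is a self-contained sketch rather than the document's own results code: it bootstraps the per-worker differences from Table 11 directly, and `n_boot = 1000` is an assumed resample count standing in for `n_iterations`, which is defined elsewhere in the document.

```r
# Per-worker differences (Before - After) from Table 11; positive = reduction
diffs <- c(10, -3, 40, 1, -4, 48, -4, 14)

set.seed(222)    # mirrors the function's default seed; any seed would do
n_boot <- 1000   # assumed resample count

# Bootstrap distribution of the mean difference
boot_means <- replicate(n_boot, mean(sample(diffs, replace = TRUE)))

# Share of resamples in which the mean shows a reduction
prob_reduction <- mean(boot_means > 0)

# 95% percentile interval for the mean reduction (mg/dL)
ci_95 <- quantile(boot_means, probs = c(0.025, 0.975))

prob_reduction
ci_95
```

With only eight workers the bootstrap distribution is coarse and right-skewed, so the percentile interval should be read as approximate.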
                                                      |==============================================================        |  89%  |                                                                              |===============================================================       |  89%  |                                                                              |===============================================================       |  90%  |                                                                              |===============================================================       |  91%  |                                                                              |================================================================      |  91%  |                                                                              |================================================================      |  92%  |                                                                              |=================================================================     |  92%  |                                                                              |=================================================================     |  93%  |                                                                              |=================================================================     |  94%  |                                                                              |==================================================================    |  94%  |                                                                              |==================================================================    |  95%  |                                                                              |===================================================================   |  95%  |                                                                              
|===================================================================   |  96%  |                                                                              |====================================================================  |  96%  |                                                                              |====================================================================  |  97%  |                                                                              |====================================================================  |  98%  |                                                                              |===================================================================== |  98%  |                                                                              |===================================================================== |  99%  |                                                                              |======================================================================|  99%  |                                                                              |======================================================================| 100%
prob_reduction <- mean(cholesterol_resampling$reduction)

# Confidence intervals
ci_chol_95 <- quantile(cholesterol_resampling$diff_distribution, 
                       probs = c(0.025, 0.975))
ci_chol_90 <- quantile(cholesterol_resampling$diff_distribution, 
                       probs = c(0.05, 0.95))

5.6 Results

Table 13: Bootstrap Results for Cholesterol Reduction

| Metric | Value |
|---|---|
| Probability of Cholesterol Reduction | 98.10% |
| 95% CI Lower Bound (mg/dL) | 0.86 |
| 95% CI Upper Bound (mg/dL) | 26.13 |
| 90% CI Lower Bound (mg/dL) | 2.5 |
| 90% CI Upper Bound (mg/dL) | 24.75 |
| Bootstrap Mean Reduction (mg/dL) | 12.66 |
| Bootstrap SD (mg/dL) | 6.7 |
| Times Reduction Observed | 981 / 1000 |
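
Because the 98.1% figure is itself estimated from 1,000 resamples, it carries some simulation (Monte Carlo) error. The sketch below gives a rough sense of its size, under the simplifying assumption that each resample behaves like an independent Bernoulli trial. The variable names are illustrative and are not part of the analysis code.

```r
# Monte Carlo standard error of a bootstrap proportion (rough sketch).
# Assumption: resamples behave like independent Bernoulli trials.
n_success <- 981      # resamples showing a reduction (Table 13)
n_boot    <- 1000     # bootstrap iterations

p_hat <- n_success / n_boot
mc_se <- sqrt(p_hat * (1 - p_hat) / n_boot)

# Approximate 95% range for the proportion due to simulation noise alone
mc_range <- p_hat + c(-1, 1) * qnorm(0.975) * mc_se

round(c(estimate = p_hat, mc_se = mc_se), 4)
round(mc_range, 3)
```

The simulation error is roughly 0.4 percentage points. It reflects only the finite number of resamples, not the sampling uncertainty that comes from having eight participants.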

5.6.1 Public Health Finding

Bootstrap probability of a mean cholesterol reduction following the educational intervention: 98.1%

95% CI for mean reduction: [0.86, 26.13] mg/dL
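
For readers who want to see how numbers of this kind arise, the following is a minimal sketch of a paired bootstrap. The before/after values are hypothetical placeholders, not the study data, and `paired_bootstrap()` is an illustrative helper rather than the function used in this analysis.

```r
# Paired bootstrap sketch: resample per-subject differences with replacement.
# NOTE: the before/after values below are made up for illustration only.
paired_bootstrap <- function(before, after, n_iterations = 1000, seed = 333) {
  stopifnot(length(before) == length(after))
  set.seed(seed)
  differences <- before - after            # positive value = reduction
  n <- length(differences)

  # Mean difference in each bootstrap resample
  boot_means <- replicate(
    n_iterations,
    mean(sample(differences, size = n, replace = TRUE))
  )

  list(
    diff_distribution = boot_means,
    prob_reduction    = mean(boot_means > 0),  # share of resamples with a reduction
    ci_95             = quantile(boot_means, probs = c(0.025, 0.975))
  )
}

before <- c(210, 235, 198, 250, 221, 240, 205, 229)  # hypothetical
after  <- c(195, 230, 180, 228, 226, 215, 200, 211)  # hypothetical

sketch <- paired_bootstrap(before, after)
sketch$prob_reduction
sketch$ci_95
```

Resampling the differences, rather than the before and after columns separately, keeps each participant's two measurements together, which is what makes the design paired.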


cholesterol_bootstrap_df <- data.frame(
  Iteration = 1:n_iterations,
  Mean_Difference = cholesterol_resampling$diff_distribution
)

ggplot(cholesterol_bootstrap_df, aes(x = Mean_Difference)) +
  # after_stat(density) replaces the deprecated ..density.. notation
  geom_histogram(aes(y = after_stat(density)), bins = 40,
                 fill = "mediumseagreen", alpha = 0.7, color = "black") +
  # linewidth replaces size for line geoms (ggplot2 >= 3.4.0)
  geom_density(color = "darkgreen", linewidth = 1.5) +
  geom_vline(xintercept = 0, color = "red",
             linetype = "dashed", linewidth = 1.2) +
  geom_vline(xintercept = mean(cholesterol_data$Difference),
             color = "darkblue", linetype = "solid", linewidth = 1.2) +
  geom_vline(xintercept = ci_chol_95[1],
             color = "orange", linetype = "dotted", linewidth = 1) +
  geom_vline(xintercept = ci_chol_95[2],
             color = "orange", linetype = "dotted", linewidth = 1) +
  annotate("rect", xmin = ci_chol_95[1], xmax = ci_chol_95[2],
           ymin = 0, ymax = Inf, alpha = 0.1, fill = "orange") +
  annotate("text", x = mean(cholesterol_data$Difference), 
           y = max(density(cholesterol_bootstrap_df$Mean_Difference)$y) * 0.9,
           label = "Observed\nReduction", color = "darkblue", 
           fontface = "bold", hjust = -0.1, size = 4) +
  labs(
    title = "Bootstrap Distribution of Mean Cholesterol Reduction",
    subtitle = sprintf("P(Reduction > 0) = %.2f%% | 95%% CI shaded", 
                       prob_reduction * 100),
    x = "Mean Cholesterol Reduction (mg/dL)",
    y = "Density",
    caption = paste("Based on", n_iterations, "bootstrap iterations")
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5, size = 11)
  )
Figure 11: Bootstrap Distribution of Mean Cholesterol Reduction



5.7 Public Health Interpretation

5.7.1 Recommendation: IMPLEMENT INTERVENTION

The bootstrap analysis provides strong evidence that the educational intervention was effective in motivating lifestyle changes that reduced cholesterol, subject to the limitations in Section 5.7.2.

Program Recommendations:

  • Implement educational intervention as part of workplace wellness program
  • Expected benefit: an average cholesterol reduction of about 12.8 mg/dL (95% bootstrap CI: 0.86 to 26.13 mg/dL)
  • Conduct follow-up measurements at 6 and 12 months to assess sustainability
  • Consider cost-effectiveness analysis for large-scale deployment
  • Expand to larger employee population with proper evaluation

5.7.2 Methodological Considerations

5.7.2.1 Study Limitations

  1. Small Sample Size: n = 8 limits statistical power and generalizability
  2. No Control Group: Cannot definitively attribute changes to intervention vs. natural variation
  3. Short-Term Follow-up: Long-term sustainability unknown
  4. Individual Variation: One worker (worker 5) showed a cholesterol increase
  5. Self-Selection Bias: Participants may have been more motivated than general population
  6. Measurement Error: Single measurements subject to biological variation

5.7.2.2 Clinical Significance

  • Mean reduction of 12.8 mg/dL
  • For context, 10% cholesterol reduction → ~20% reduction in cardiac risk
  • Individual variation suggests personalized approaches may be beneficial

6 Comprehensive Summary

6.1 Comparative Analysis of All Studies

# Create comprehensive summary
summary_results <- tibble(
  Problem = c(
    "1. High vs Low Pressure Yield", 
    "2. Drug vs Placebo Efficacy", 
    "3. MSFT vs PFE Beta (MSFT Lower)", 
    "4. Cholesterol Reduction"
  ),
  `Sample Size` = c(
    "n₁=10, n₂=8",
    "n₁=34, n₂=19",
    "n=150 months",
    "n=8 paired"
  ),
  `Probability (%)` = c(
    round(probability_high_better * 100, 2),
    round(prob_drug_better * 100, 2),
    round(prob_msft_lower_beta * 100, 2),
    round(prob_reduction * 100, 2)
  ),
  `Evidence Level` = c(
    ifelse(probability_high_better >= 0.95, "Strong", 
           ifelse(probability_high_better >= 0.80, "Moderate", "Weak")),
    ifelse(prob_drug_better >= 0.95, "Strong", 
           ifelse(prob_drug_better >= 0.80, "Moderate", "Weak")),
    ifelse(prob_msft_lower_beta >= 0.95 | prob_msft_lower_beta <= 0.05, 
           "Strong", ifelse(prob_msft_lower_beta >= 0.80 | 
                            prob_msft_lower_beta <= 0.20, "Moderate", "Weak")),
    ifelse(prob_reduction >= 0.95, "Strong", 
           ifelse(prob_reduction >= 0.80, "Moderate", "Weak"))
  ),
  `Primary Conclusion` = c(
    ifelse(probability_high_better >= 0.90, 
           "High pressure improves yield", "Inconclusive"),
    ifelse(prob_drug_better >= 0.80, 
           "Drug is effective", "Weak evidence"),
    ifelse(prob_msft_lower_beta < 0.50, 
           "MSFT more cyclical", "PFE more cyclical"),
    ifelse(prob_reduction >= 0.80, 
           "Intervention effective", "Weak evidence")
  )
)
Table 14: Comprehensive Summary of All Bootstrap Analyses

| Problem | Sample Size | Probability (%) | Evidence Level | Primary Conclusion |
|---|---|---|---|---|
| 1. High vs Low Pressure Yield | n₁=10, n₂=8 | 86.0 | Moderate | Inconclusive |
| 2. Drug vs Placebo Efficacy | n₁=34, n₂=19 | 87.5 | Moderate | Drug is effective |
| 3. MSFT vs PFE Beta (MSFT Lower) | n=150 months | 0.0 | Strong | MSFT more cyclical |
| 4. Cholesterol Reduction | n=8 paired | 98.1 | Strong | Intervention effective |
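
The evidence-level rules applied in the summary (0.95 for "Strong", 0.80 for "Moderate", with mirrored cut-offs for the beta comparison) can be collected in one small helper. This is a sketch of an alternative to the nested `ifelse()` calls; `classify_evidence()` and its `two_sided` argument are hypothetical names, not part of the original analysis.

```r
# Hypothetical helper: map a bootstrap probability to an evidence label.
# Thresholds (0.95 and 0.80) follow the rules used in the summary table.
classify_evidence <- function(prob, two_sided = FALSE) {
  # For two-sided questions, probabilities near 0 count as much as those near 1
  strong   <- prob >= 0.95 | (two_sided & prob <= 0.05)
  moderate <- prob >= 0.80 | (two_sided & prob <= 0.20)
  ifelse(strong, "Strong", ifelse(moderate, "Moderate", "Weak"))
}

classify_evidence(c(0.86, 0.875, 0.981))   # one-sided studies
classify_evidence(0, two_sided = TRUE)     # beta comparison
```

Keeping the thresholds in one place makes it harder for the four studies to drift onto different cut-offs.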

ggplot(summary_results, 
       aes(x = reorder(Problem, `Probability (%)`), 
           y = `Probability (%)`,
           fill = `Evidence Level`)) +
  # geom_col() is the idiomatic equivalent of geom_bar(stat = "identity")
  geom_col(alpha = 0.8, color = "black", width = 0.7) +
  geom_text(aes(label = sprintf("%.2f%%", `Probability (%)`)),
            hjust = -0.1, size = 5, fontface = "bold") +
  # Reference thresholds; linewidth replaces the deprecated size argument for lines
  geom_hline(yintercept = 95, color = "red",
             linetype = "dashed", alpha = 0.7, linewidth = 1) +
  geom_hline(yintercept = 90, color = "orange",
             linetype = "dashed", alpha = 0.7, linewidth = 1) +
  geom_hline(yintercept = 80, color = "yellow3",
             linetype = "dashed", alpha = 0.7, linewidth = 1) +
  scale_fill_manual(values = c("Strong" = "darkgreen", 
                                "Moderate" = "goldenrod", 
                                "Weak" = "coral")) +
  coord_flip() +
  labs(
    title = "Comparative Summary: Bootstrap Probability Estimates Across All Studies",
    subtitle = "Evidence level classification with conventional significance thresholds",
    x = "",
    y = "Probability (%)",
    fill = "Evidence Level",
    caption = paste("All analyses based on", n_iterations, "bootstrap iterations | dashed lines mark the 80%, 90%, and 95% thresholds")
  ) +
  ylim(0, 110) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
    plot.subtitle = element_text(size = 12, hjust = 0.5),
    legend.position = "top",
    axis.text.y = element_text(size = 11)
  )
Figure 12: Comparative Summary of All Resampling Analyses



7 Methodological Reflection

7.1 Strengths of Bootstrap Resampling

  1. Non-Parametric Nature
  • No specific distributional form (such as normality) needs to be assumed
  • Particularly valuable when normality assumptions are questionable
  • Still requires independent, representative observations unless a specialized variant is used
  2. Intuitive Appeal
  • Conceptually straightforward: “What if we repeated this experiment?”
  • Easy to explain to non-statisticians and stakeholders
  • Transparent methodology builds trust
  3. Flexibility
  • Applicable to a wide variety of statistics (means, medians, correlations, regression coefficients, etc.)
  • Can handle complex sample structures (paired, clustered, stratified)
  • Extensions available for dependent data (e.g., block bootstrap methods)
  4. Robust Inference
  • Does not depend on normal-theory formulas that skewness or outliers can invalidate
  • Provides empirical confidence intervals without distributional assumptions
  • Works well with moderate sample sizes; very small samples call for caution
  5. Computational Advantages
  • Leverages modern computing power effectively
  • Easily parallelizable for faster computation
  • Reproducible with proper seed setting
  6. Comprehensive Output
  • Probability estimates with intuitive interpretation
  • Bootstrap distributions reveal shape and spread
  • Confidence intervals naturally emerge from percentiles
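
To illustrate the flexibility and reproducibility points above, here is a minimal sketch that bootstraps a median, a statistic with no simple closed-form standard error. The data are simulated and the helper name `bootstrap_statistic()` is hypothetical.

```r
# Bootstrap any single-sample statistic; the seed makes runs reproducible.
bootstrap_statistic <- function(x, statistic, n_iterations = 2000, seed = 333) {
  set.seed(seed)
  vapply(
    seq_len(n_iterations),
    function(i) statistic(sample(x, size = length(x), replace = TRUE)),
    numeric(1)
  )
}

set.seed(1)
skewed_sample <- rexp(40, rate = 0.1)   # simulated right-skewed data

boot_medians <- bootstrap_statistic(skewed_sample, median)
median_ci <- quantile(boot_medians, probs = c(0.025, 0.975))  # percentile 95% CI
median_ci

# Same seed, same resamples: a second run reproduces the first exactly
same_result <- identical(boot_medians,
                         bootstrap_statistic(skewed_sample, median))
same_result
```

Swapping `median` for another function of a numeric vector (for example `sd` or a trimmed mean) reuses the same machinery without any new formulas.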

7.2 Limitations and Considerations

7.2.1 Critical Limitations to Consider

  1. Small Sample Issues
  • Bootstrap cannot create information that doesn’t exist in the data
  • With n < 10, bootstrap distributions may be unstable
  • Original sample must be representative of the population
  2. Computational Cost
  • Typically requires 1,000 or more iterations for stable estimates
  • Can be time-consuming for complex models
  • May be prohibitive for very large datasets
  3. Independence Assumptions
  • Standard bootstrap assumes independent observations
  • Time series and spatial data require specialized methods
  • Clustered data needs careful consideration
  4. Outlier Sensitivity
  • Bootstrap statistics such as the mean are still influenced by extreme values
  • Original outliers are resampled and can dominate bootstrap samples
  • Sensitivity analysis is crucial
  5. Sample Quality Requirements
  • “Garbage in, garbage out” principle applies
  • Biased samples lead to biased bootstrap estimates
  • Cannot correct for selection bias or measurement error
  6. Interpretation Challenges
  • Probability ≠ certainty; context is essential
  • Multiple testing issues arise when conducting many comparisons
  • A bootstrap probability is not a p-value: it is the share of resamples supporting a claim, not the probability of the data under a null hypothesis
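
The independence limitation can be made concrete with a moving block bootstrap, which resamples contiguous blocks so that short-range dependence is preserved within each block. This is a simplified sketch on a simulated AR(1) series. The block length of 10 is an arbitrary illustrative choice and `moving_block_bootstrap()` is a hypothetical helper.

```r
# Moving block bootstrap of the mean for a dependent series (sketch).
moving_block_bootstrap <- function(x, block_length, n_iterations = 1000,
                                   seed = 333) {
  set.seed(seed)
  n <- length(x)
  n_blocks <- ceiling(n / block_length)
  starts <- seq_len(n - block_length + 1)      # all admissible block starts

  replicate(n_iterations, {
    chosen <- sample(starts, size = n_blocks, replace = TRUE)
    # Concatenate the chosen blocks, then trim to the original length
    idx <- unlist(lapply(chosen, function(s) s:(s + block_length - 1)))
    mean(x[idx[seq_len(n)]])
  })
}

set.seed(42)
series <- as.numeric(arima.sim(model = list(ar = 0.7), n = 200))

# Ordinary (i.i.d.) bootstrap ignores the serial dependence
iid_means   <- replicate(1000, mean(sample(series, replace = TRUE)))
block_means <- moving_block_bootstrap(series, block_length = 10)

# Compare the two bootstrap standard errors of the mean
c(iid_se = sd(iid_means), block_se = sd(block_means))
```

With positively autocorrelated data the ordinary bootstrap tends to understate the standard error of the mean, which is why the block version is preferred for series like this one.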

7.3 When to Use Resampling Methods

7.3.1 Ideal Scenarios:

  • Small to moderate sample sizes (10 ≤ n ≤ 1000)
  • Unknown or non-normal distributions
  • Complex statistics without closed-form standard errors
  • Preliminary or exploratory analyses
  • Teaching statistical concepts

7.3.2 Consider Alternatives:

  • Very large datasets (computational burden)
  • Strongly dependent or time-series data (need specialized methods)
  • When parametric assumptions clearly hold (classical methods may be more powerful)
  • Regulatory submissions requiring exact p-values
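
When the data are plausibly normal, a classical interval and a percentile bootstrap interval usually land close together, which is one quick way to judge whether resampling adds anything. A sketch on simulated data follows; all values are illustrative.

```r
# Compare a classical t interval with a percentile bootstrap interval
# on simulated, approximately normal data.
set.seed(2025)
x <- rnorm(50, mean = 100, sd = 15)

t_ci <- as.numeric(t.test(x, conf.level = 0.95)$conf.int)

boot_means <- replicate(2000, mean(sample(x, replace = TRUE)))
boot_ci <- as.numeric(quantile(boot_means, probs = c(0.025, 0.975)))

rbind(t_interval = t_ci, bootstrap_percentile = boot_ci)
```

If the two intervals disagree noticeably on real data, that is a hint that skewness, outliers, or a small sample are at work and the classical assumptions deserve a closer look.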

7.4 Best Practices for Resampling


8 Pedagogical Insights

8.1 Learning Objectives Achieved

By completing this comprehensive resampling analysis, students have:

  • Understood the theoretical foundation of bootstrap resampling
  • Applied resampling to diverse real-world problems
  • Interpreted bootstrap probability estimates and confidence intervals
  • Evaluated strengths and limitations of non-parametric inference
  • Developed computational skills in R for statistical resampling
  • Communicated findings effectively through visualization and reporting


8.2 Key Takeaways for Practice

8.2.1 For Business Analysts

  • Resampling provides intuitive answers to business questions
  • Probability estimates directly address stakeholder concerns
  • Visualizations facilitate communication with non-technical audiences
  • Sensitivity analyses demonstrate robustness of conclusions

8.2.2 For Researchers

  • Bootstrap methods expand analytical toolkit beyond parametric assumptions
  • Particularly valuable in pilot studies and exploratory research
  • Transparent methodology enhances reproducibility
  • Appropriate for diverse data structures and outcome types

8.2.3 For Decision-Makers

  • Probability-based inference aligns with risk assessment frameworks
  • Confidence intervals quantify uncertainty for informed decisions
  • Comparative analyses support evidence-based choices
  • Flexible methodology adapts to unique organizational contexts

9 Further Reading and Resources

9.1 Foundational Texts

  1. Efron, B., & Tibshirani, R. J. (1994). An Introduction to the Bootstrap. Chapman and Hall/CRC.
  • The seminal text on bootstrap methodology
  • Comprehensive theoretical treatment with practical examples
  2. Davison, A. C., & Hinkley, D. V. (1997). Bootstrap Methods and their Application. Cambridge University Press.
  • Advanced topics and extensions
  • Excellent coverage of confidence interval methods
  3. Good, P. I. (2005). Resampling Methods: A Practical Guide to Data Analysis. Birkhäuser.
  • Accessible introduction for practitioners
  • Focus on applied problems across disciplines

9.2 R Packages for Resampling

# Core resampling functionality
library(boot)           # Comprehensive bootstrap and jackknife tools
library(bootstrap)      # Original bootstrap methods
library(rsample)        # Tidyverse-friendly resampling

# Specialized applications
library(bootLR)         # Bootstrapped confidence intervals for (negative) likelihood ratio tests
library(meboot)         # Maximum entropy bootstrap for time series
# Time series bootstrap methods: use boot::tsboot() (tsboot is a function in boot, not a package)
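
As a brief illustration of the first package listed, the sketch below uses `boot::boot()` and `boot::boot.ci()` on simulated data. The statistic passed to `boot()` must accept the data and a vector of resampled indices. The data and the name `mean_stat` are placeholders.

```r
library(boot)

# boot() calls the statistic with the data and a vector of resampled indices
mean_stat <- function(data, indices) mean(data[indices])

set.seed(333)
sample_data <- rnorm(30, mean = 50, sd = 8)   # simulated placeholder data

boot_out <- boot(data = sample_data, statistic = mean_stat, R = 1000)

# Percentile and bias-corrected accelerated (BCa) intervals
boot_intervals <- boot.ci(boot_out, conf = 0.95, type = c("perc", "bca"))
boot_intervals
```

The BCa interval adjusts for bias and skewness in the bootstrap distribution, so it is a useful cross-check on the plain percentile intervals used throughout this document.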

10 Appendix: Utility Functions

10.1 General Bootstrap Function

#' General Bootstrap Resampling Function
#'
#' Flexible function for bootstrapping any statistic from a single sample
#'
#' @param data Numeric vector of observations
#' @param statistic_func Function to calculate statistic (e.g., mean, median, sd)
#' @param n_iterations Number of bootstrap iterations (default: 1000)
#' @param confidence_level Confidence level for interval (default: 0.95)
#' @param seed Random seed for reproducibility
#'
#' @return List with bootstrap results and confidence interval
#'
#' @examples
#' result <- general_resampling(c(1:20), mean, 1000, 0.95)
#' cat("95% CI:", result$ci_lower, "to", result$ci_upper)
general_resampling <- function(data, statistic_func, 
                               n_iterations = 1000, 
                               confidence_level = 0.95, 
                               seed = 333) {
  set.seed(seed)
  results <- numeric(n_iterations)
  
  # seq_len() avoids the 1:0 pitfall if n_iterations is ever 0
  for (i in seq_len(n_iterations)) {
    resampled_data <- sample(data, size = length(data), replace = TRUE)
    results[i] <- statistic_func(resampled_data)
  }
  
  # Percentile confidence interval (names dropped for cleaner printing)
  alpha <- 1 - confidence_level
  ci_lower <- quantile(results, alpha / 2, names = FALSE)
  ci_upper <- quantile(results, 1 - alpha / 2, names = FALSE)
  
  return(list(
    results = results,
    mean = mean(results),
    median = median(results),
    sd = sd(results),
    ci_lower = ci_lower,
    ci_upper = ci_upper,
    confidence_level = confidence_level,
    original_stat = statistic_func(data)
  ))
}

10.2 Data Export Function

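#' Export Resampling Results to Excel
#'
#' Writes `summary_results`, `yield_data`, and `cholesterol_data` to separate
#' sheets. These objects must already exist in the calling environment.
#'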
#' @param filename Name of Excel file to create
#'
#' @examples
#' export_resampling_results("My_Analysis_Results.xlsx")
export_resampling_results <- function(filename = "Resampling_Results_R.xlsx") {
  # Create workbook
  wb <- createWorkbook()
  
  # Add Summary sheet
  addWorksheet(wb, "Summary")
  writeData(wb, "Summary", summary_results)
  
  # Add Original Data sheet
  addWorksheet(wb, "Original_Data")
  writeData(wb, "Original_Data", yield_data)
  
  # Add Cholesterol Data sheet
  addWorksheet(wb, "Cholesterol_Data")
  writeData(wb, "Cholesterol_Data", cholesterol_data)
  
  # Save workbook
  saveWorkbook(wb, filename, overwrite = TRUE)
  message(paste("Results exported to:", filename))
}

# Uncomment to export:
# export_resampling_results()

11 Session Information

sessionInfo()
R version 4.5.1 (2025-06-13 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26100)

Matrix products: default
  LAPACK version 3.12.1

locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: Africa/Lagos
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DT_0.34.0        scales_1.4.0     gridExtra_2.3    moments_0.14.1  
 [5] openxlsx_4.2.8   kableExtra_1.4.0 knitr_1.50       lubridate_1.9.4 
 [9] forcats_1.0.1    stringr_1.5.2    dplyr_1.1.4      purrr_1.1.0     
[13] readr_2.1.5      tidyr_1.3.1      tibble_3.3.0     ggplot2_4.0.0   
[17] tidyverse_2.0.0 

loaded via a namespace (and not attached):
 [1] sass_0.4.10        generics_0.1.4     xml2_1.4.0         stringi_1.8.7     
 [5] hms_1.1.3          digest_0.6.37      magrittr_2.0.4     evaluate_1.0.5    
 [9] grid_4.5.1         timechange_0.3.0   RColorBrewer_1.1-3 fastmap_1.2.0     
[13] jsonlite_2.0.0     zip_2.3.3          viridisLite_0.4.2  textshaping_1.0.4 
[17] jquerylib_0.1.4    cli_3.6.5          rlang_1.1.6        withr_3.0.2       
[21] cachem_1.1.0       yaml_2.3.10        tools_4.5.1        tzdb_0.5.0        
[25] vctrs_0.6.5        R6_2.6.1           lifecycle_1.0.4    htmlwidgets_1.6.4 
[29] pkgconfig_2.0.3    pillar_1.11.1      bslib_0.9.0        gtable_0.3.6      
[33] glue_1.8.0         Rcpp_1.1.0         systemfonts_1.3.1  xfun_0.53         
[37] tidyselect_1.2.1   rstudioapi_0.17.1  farver_2.1.2       htmltools_0.5.8.1 
[41] labeling_0.4.3     rmarkdown_2.30     svglite_2.2.1      compiler_4.5.1    
[45] S7_0.2.0          

Acknowledgments

This document was prepared for Data Analytics II at Pan-Atlantic University, Lagos Business School. The resampling methodology presented here draws upon foundational work by Bradley Efron and Robert Tibshirani, whose contributions revolutionized modern statistical practice.

Special Thanks:

  • Students of Data Analytics II for engaging discussions
  • Colleagues who provided feedback on pedagogical approaches
  • The R community for developing exceptional open-source tools

Contact Information

Author: Bongo Adi
Institution: Pan-Atlantic University - Lagos Business School
Course: Data Analytics II
Date: October 21, 2025