```r
library(dplyr)
library(ggplot2)
library(robustbase)
library(knitr)

#' @param B number of resamples.
#' @param n number of observations in the generated data.
#' @param k number of regressors in the generated data.
#' @param sigma standard deviation of the errors in the generated data.
#' @param level probability for the confidence intervals, with the lower bound at confidence `conf = (1 - level)/2` and the upper bound at `1 - conf`.
#' @param sample_perc percentage of resampled observations with respect to the total sample.
#'   The number of observations in the subsample is computed as `n_bar = trunc(n*sample_perc)`.
#' @param ci type of confidence interval. Can be `empiric`, in which bootstrapped quantiles are used,
#'   or `theoric`, in which normal quantiles are used.
#' @param d threshold parameter for the Tukey function.
#' @param eps threshold for outliers. We generate a random sequence `tau` in `[0,1]`.
#'   When `tau < eps` the observation becomes an outlier.
#' @param z value of the outlier.
#' @param type type of contamination. When `type = "R"` the outlying observation becomes equal to `z`;
#'   when `type = "M"` it becomes equal to `z*Y[tau < eps]`.

# Data generation parameters
n = 1000
k = 8
sigma = 4
seed = 0

# Bootstrap parameters
B = 500
level = 0.95
sample_perc = 0.5

# Outliers parameters
d = 4.56
eps = 0.05
z = 1.9
type = "M"
```
We review a bootstrap method for robust estimators presented by Salibián-Barrera, Van Aelst, and Willems (2008), which is computationally faster and more resistant to outliers than the classical bootstrap. Under reasonable regularity conditions, this fast and robust bootstrap method is asymptotically consistent.
Robustness to outliers
In most practical cases the bootstrap is based on simulating the distribution of the estimator of interest by generating a large number of new samples randomly drawn from the original data. The estimator is recalculated in each of these bootstrap samples, and the empirical distribution of these bootstrapped estimates yields an approximation to the estimator’s true sampling distribution.
However, two main problems arise when we want to use the bootstrap to estimate the sampling distribution of robust estimators: numerical instability and computational cost.
Intuitively, the reason for the numerical instability mentioned above is as follows. Since outlying and non-outlying observations have the same chance of belonging to any bootstrap sample, with a certain positive probability, the proportion of outliers in a bootstrap sample may be larger than the fraction of contamination that the robust estimator can tolerate. In other words, a certain proportion of the re-calculated values of the robust estimate may be heavily influenced by the outliers in the data.
Salibian-Barrera and Zamar (2002) introduced a bootstrap method for robust estimators which has the advantage of being at the same time easy to compute and resistant to outliers in the sample (in other words, it can overcome the two problems mentioned above). The method applies to robust estimators that can be written as the solution of a system of smooth fixed-point (weighted least squares) equations; examples of well-known robust estimators that fulfill this condition are S-estimators, MM-estimators and \tau-estimators.
Notations
Let us denote by n the number of observations and by k the number of regressors. To simplify the computations, we will use matrix notation. All bold symbols are interpreted as vectors or matrices.
The functions \rho_0 and \rho_1 are the Tukey (bisquare) functions used in the S-estimate and in the MM-estimate, respectively. For simplicity we fix a threshold d and use the same Tukey function for both the S and MM estimates: \rho_d(x) = \begin{cases} \dfrac{3x^2}{d^2} - \dfrac{3x^4}{d^4} + \dfrac{x^6}{d^6} & |x| \le d \\ 1 & |x| > d \end{cases} \tag{1}
For the derivative functions of Tukey's \rho we refer to the linked page; a small R implementation is also sketched below.
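The weights and correction factors below use \rho, \rho^{\prime} and \rho^{\prime\prime}; a direct R implementation of Equation 1 and of its first two derivatives could look as follows (the names `rho_d`, `rho_d1` and `rho_d2` are illustrative, not part of any package):

```r
# Tukey's function (Equation 1) and its first two derivatives;
# rho_d, rho_d1 and rho_d2 are illustrative names used only in this post.
rho_d <- function(x, d = 4.56){
  ifelse(abs(x) <= d, 3*x^2/d^2 - 3*x^4/d^4 + x^6/d^6, 1)
}
rho_d1 <- function(x, d = 4.56){ # first derivative
  ifelse(abs(x) <= d, 6*x/d^2 - 12*x^3/d^4 + 6*x^5/d^6, 0)
}
rho_d2 <- function(x, d = 4.56){ # second derivative
  ifelse(abs(x) <= d, 6/d^2 - 36*x^2/d^4 + 30*x^4/d^6, 0)
}
```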
The estimate \hat{\boldsymbol{\beta}} is obtained from an MM-regression, while \tilde{\boldsymbol{\beta}} and \hat{\sigma} are obtained from an S-regression.
Robust Regression
Consider a regression model in matrix form
\underset{\small n \times 1}{\boldsymbol{Y}} = \underset{\small n \times k}{\boldsymbol{X}} \; \underset{\small k \times 1}{\boldsymbol{\beta}} + \sigma \underset{\small n \times 1}{\boldsymbol{\epsilon}} where \boldsymbol{\epsilon} \sim \mathbb{F}^{c}, and \boldsymbol{\beta} and \sigma are the true population parameters. For simplicity we consider normally distributed errors, namely \boldsymbol{\epsilon} \sim \mathbb{F}^{c} = N(0,1).
Initialization
Given initial estimates \hat{\boldsymbol{\beta}}, \tilde{\boldsymbol{\beta}} and \hat{\sigma}, compute the **residuals** for the MM and S regressions respectively:

\underset{\small n \times 1}{\boldsymbol{r}} = \underset{\small n \times 1}{\boldsymbol{Y}} - \underset{\small n \times k}{\boldsymbol{X}} \; \underset{\small k \times 1}{\hat{\boldsymbol{\beta}}} \quad \quad \underset{\small n \times 1}{\boldsymbol{\tilde{r}}} = \underset{\small n \times 1}{\boldsymbol{Y}} - \underset{\small n \times k}{\boldsymbol{X}} \; \underset{\small k \times 1}{\tilde{\boldsymbol{\beta}}} \tag{2}

Compute the **weights** for i = 1, 2, \dots, n:

w_i = \frac{1}{r_i}\rho_1^{\prime} \biggl(\frac{r_i}{\hat{\sigma}}\biggr) \tag{3}

\tilde{w}_i = \frac{1}{\tilde{r}_i}\rho_0^{\prime} \biggl(\frac{\tilde{r}_i}{\hat{\sigma}}\biggr) \tag{4}

v_i = \frac{\hat{\sigma}}{n \, b \, \tilde{r}_i}\rho_0 \biggl(\frac{\tilde{r}_i}{\hat{\sigma}}\biggr) \tag{5}

where b = \mathbb{E}^{\mathbb{F}^{c}}[\rho_0(\epsilon)]. Then the MM-parameters are computed as:

\underset{k \times 1}{\hat{\boldsymbol{\beta}}} = (\underset{k \times n}{\boldsymbol{X}^{T}} \underset{n \times n}{\text{diag}(\boldsymbol{w})} \underset{n \times k}{\boldsymbol{X}})^{-1} \underset{k \times n}{\boldsymbol{X}^T} \underset{n \times n}{\text{diag}(\boldsymbol{w})} \underset{n \times 1}{\boldsymbol{Y}} \tag{6}

while for the S-parameters,

\underset{k \times 1}{\boldsymbol{\tilde{\beta}}} = (\underset{k \times n}{\boldsymbol{X}^{T}} \underset{n \times n}{\text{diag}(\boldsymbol{\tilde{w}})} \underset{n \times k}{\boldsymbol{X}})^{-1} \underset{k \times n}{\boldsymbol{X}^T} \underset{n \times n}{\text{diag}(\boldsymbol{\tilde{w}})} \underset{n \times 1}{\boldsymbol{Y}} \tag{7}

and the S-scale is obtained as

\underset{1 \times 1}{\hat{\sigma}} = \underset{1 \times n}{\tilde{\boldsymbol{r}}^T} \, \underset{n \times 1}{\boldsymbol{v}} \iff \hat{\sigma} = \sum_{i = 1}^{n} v_i \tilde{r}_i \tag{8}

Finally, the correction factors, computed once on the complete sample, are defined as:

\underset{k \times k}{\boldsymbol{M}} = \hat{\sigma} \biggl(\boldsymbol{X}^T \text{diag}\Bigl[\rho_1^{\prime\prime}\bigl(\tfrac{\boldsymbol{r}}{\hat{\sigma}}\bigr)\Bigr] \boldsymbol{X}\biggr)^{-1} \boldsymbol{X}^T \text{diag}(\boldsymbol{w}) \, \boldsymbol{X} \quad\quad \underset{k \times k}{\tilde{\boldsymbol{M}}} = \boldsymbol{0} \tag{9}

\underset{k \times 1}{\boldsymbol{d}} = \frac{1}{a_n} \biggl(\boldsymbol{X}^T \text{diag}\Bigl[\rho_1^{\prime\prime}\bigl(\tfrac{\boldsymbol{r}}{\hat{\sigma}}\bigr)\Bigr] \boldsymbol{X}\biggr)^{-1} \boldsymbol{X}^T \Bigl[\rho_1^{\prime\prime}\bigl(\tfrac{\boldsymbol{r}}{\hat{\sigma}}\bigr) \odot \boldsymbol{r}\Bigr] \tag{10}

\underset{1 \times 1}{a_n} = \frac{1}{n \, b} \sum_{i=1}^{n} \biggl[\rho_0^{\prime} \biggl(\frac{\tilde{r}_i}{\hat{\sigma}}\biggr) \, \frac{\tilde{r}_i}{\hat{\sigma}} \biggr] \tag{11}

where \odot denotes the Hadamard (element-wise) product.
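As an illustration of these formulas (not the actual `fastRobBoot` code), the initialization step could be sketched in R as follows, assuming the design matrix `X`, the response `Y` and initial estimates `beta_hat` (MM), `beta_tilde` and `sigma_hat` (S) are available, for instance from `robustbase::lmrob`, and reusing the Tukey helpers sketched above:

```r
# Minimal sketch of the initialization step (illustrative names only).
# Assumes X (n x k), Y (n x 1) and initial estimates beta_hat (MM),
# beta_tilde and sigma_hat (S), e.g. obtained with robustbase::lmrob().
n <- nrow(X)
b <- integrate(function(e) rho_d(e, d = 4.56) * dnorm(e), -Inf, Inf)$value  # b = E[rho_0(eps)]

r  <- as.numeric(Y - X %*% beta_hat)     # MM residuals (Equation 2)
rt <- as.numeric(Y - X %*% beta_tilde)   # S residuals (Equation 2)

w  <- rho_d1(r  / sigma_hat) / r                        # Equation 3
wt <- rho_d1(rt / sigma_hat) / rt                       # Equation 4 (same rho for S and MM here)
v  <- sigma_hat * rho_d(rt / sigma_hat) / (n * b * rt)  # Equation 5

# Weighted least-squares representations (Equations 6-8)
beta_hat_wls   <- solve(t(X) %*% (w  * X), t(X) %*% (w  * Y))
beta_tilde_wls <- solve(t(X) %*% (wt * X), t(X) %*% (wt * Y))
sigma_hat_wls  <- sum(v * rt)

# Correction factors (Equations 9-11), computed once on the complete sample
A   <- solve(t(X) %*% (rho_d2(r / sigma_hat) * X))
M   <- sigma_hat * A %*% (t(X) %*% (w * X))
a_n <- sum(rho_d1(rt / sigma_hat) * rt / sigma_hat) / (n * b)
d_n <- A %*% (t(X) %*% (rho_d2(r / sigma_hat) * r)) / a_n
```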
Bootstrapping
Once the initial parameters are estimated, consider a sample of n^{\ast} observations from which we extract \boldsymbol{X}^{\ast}, \boldsymbol{Y}^{\ast}. The residuals are then updated on the sub-sample as:

\underset{\small n^{\ast} \times 1}{\boldsymbol{r}^{\ast}} = \underset{\small n^{\ast} \times 1}{\boldsymbol{Y}^{\ast}} - \underset{\small n^{\ast} \times k}{\boldsymbol{X}^{\ast}} \; \underset{\small k \times 1}{\hat{\boldsymbol{\beta}}} \quad \quad \underset{\small n^{\ast} \times 1}{\boldsymbol{\tilde{r}}^{\ast}} = \underset{\small n^{\ast} \times 1}{\boldsymbol{Y}^{\ast}} - \underset{\small n^{\ast} \times k}{\boldsymbol{X}^{\ast}} \; \underset{\small k \times 1}{\tilde{\boldsymbol{\beta}}} \tag{12}

Then, for i = 1, 2, \dots, n^{\ast}, the **updated weights** become:

w_i^{\ast} = \frac{1}{r_i^{\ast}}\rho_1^{\prime} \biggl(\frac{r_i^{\ast}}{\hat{\sigma}}\biggr) \tag{13}

\tilde{w}_i^{\ast} = \frac{1}{\tilde{r}_i^{\ast}}\rho_0^{\prime} \biggl(\frac{\tilde{r}_i^{\ast}}{\hat{\sigma}}\biggr) \tag{14}

v_i^{\ast} = \frac{\hat{\sigma}}{n^{\ast} \, b \, \tilde{r}_i^{\ast}}\rho_0 \biggl(\frac{\tilde{r}_i^{\ast}}{\hat{\sigma}}\biggr) \tag{15}

We recalculate the one-step estimates \boldsymbol{\hat{\beta}}^{1\ast}, \boldsymbol{\tilde{\beta}}^{1\ast} and \hat{\sigma}^{1\ast} as in Equation 6, Equation 7 and Equation 8 respectively, but with the sub-sample elements. Then, the linearly corrected estimator used for building the confidence intervals is obtained as:

\underset{k \times 1}{\hat{\boldsymbol{\beta}}^{R\ast}} = \hat{\boldsymbol{\beta}} + \boldsymbol{M} (\boldsymbol{\hat{\beta}}^{1\ast} - \boldsymbol{\hat{\beta}}) + \boldsymbol{d} (\hat{\sigma}^{1\ast} - \hat{\sigma}) + \boldsymbol{\tilde{M}} (\boldsymbol{\tilde{\beta}}^{1\ast} - \boldsymbol{\tilde{\beta}}) \tag{16}
Since \tilde{\boldsymbol{M}} is assumed to be zero, it is possible to avoid the calculation of \boldsymbol{\tilde{\beta}}^{1\ast} and \tilde{w}_i^{\ast} on the sub-sample and consider a simplified version: \underset{k \times 1}{\hat{\boldsymbol{\beta}}^{R\ast}} = \hat{\boldsymbol{\beta}} + \boldsymbol{M} (\boldsymbol{\hat{\beta}}^{1\ast} - \boldsymbol{\hat{\beta}}) + \boldsymbol{d} (\hat{\sigma}^{1\ast} - \hat{\sigma}) \tag{17}
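Putting Equation 12 to Equation 17 together, a single bootstrap replicate can be sketched as follows (illustrative names again, reusing the quantities from the initialization sketch; here the subsample is drawn with replacement, with `sample_perc = 0.5`):

```r
# Minimal sketch of a single fast robust bootstrap replicate (Equations 12-17);
# reuses X, Y, beta_hat, beta_tilde, sigma_hat, b, M and d_n from the sketch above.
n_star <- trunc(nrow(X) * 0.5)                        # subsample size (sample_perc = 0.5)
idx    <- sample(nrow(X), size = n_star, replace = TRUE)
X_star <- X[idx, , drop = FALSE]
Y_star <- Y[idx, , drop = FALSE]

r_star  <- as.numeric(Y_star - X_star %*% beta_hat)   # Equation 12
rt_star <- as.numeric(Y_star - X_star %*% beta_tilde) # Equation 12

w_star <- rho_d1(r_star / sigma_hat) / r_star                              # Equation 13
v_star <- sigma_hat * rho_d(rt_star / sigma_hat) / (n_star * b * rt_star)  # Equation 15

# One-step estimates on the subsample (Equations 6 and 8 with the starred quantities)
beta_1star  <- solve(t(X_star) %*% (w_star * X_star), t(X_star) %*% (w_star * Y_star))
sigma_1star <- sum(v_star * rt_star)

# Linearly corrected bootstrap estimate (Equation 17)
beta_Rstar <- beta_hat + M %*% (beta_1star - beta_hat) + d_n * (sigma_1star - sigma_hat)
```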
Routine
1. **Initialization**
   - Compute \hat{\boldsymbol{\beta}} (MM-estimate) and \tilde{\boldsymbol{\beta}}, \hat{\sigma} (S-estimate).
   - Compute the correction factors \boldsymbol{M}, a_n and \boldsymbol{d} on the complete sample.
2. **Bootstrapping.** For b = 1, 2, \dots, B:
   - Extract a sample \boldsymbol{X}^{\ast}, \boldsymbol{Y}^{\ast} with n^{\ast} observations.
   - Update the weights \boldsymbol{w}^{\ast}, \boldsymbol{v}^{\ast}.
   - Compute the one-step estimates \hat{\boldsymbol{\beta}}^{1\ast} and \hat{\sigma}^{1\ast}, and the corrected estimate \hat{\boldsymbol{\beta}}^{R\ast}.
3. **Confidence intervals** for \hat{\boldsymbol{\beta}}_j, j = 1,\dots,k, with two methods (see the sketch after this list):
   - Theoretical (`ci = "theoric"`): with normal quantiles, \hat{\boldsymbol{\beta}}_{j} \pm z_{\alpha/2} \, \hat{\sigma}_j, where z_{\alpha/2} is the standard normal quantile of order 1 - \alpha/2 and \hat{\sigma}_j is the estimated asymptotic standard deviation of \hat{\boldsymbol{\beta}}_{j}.
   - Empirical (`ci = "empiric"`): with the empirical quantiles of the bootstrapped distribution of \hat{\boldsymbol{\beta}}_{j}.
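The two confidence-interval rules of the last step can be sketched as follows, where `beta_boot` is an assumed B x k matrix of corrected bootstrap estimates and `beta_hat` the vector of initial estimates (both names are illustrative):

```r
# Minimal sketch of the two confidence-interval rules; beta_boot (B x k matrix of
# corrected bootstrap estimates) and beta_hat (initial estimates) are assumed available.
alpha <- 1 - level   # level = 0.95, as set in the parameters above

# Empirical: quantiles of the bootstrapped distribution
ci_empiric <- apply(beta_boot, 2, quantile, probs = c(alpha / 2, 1 - alpha / 2))

# Theoretical: normal quantiles around the estimate, using the bootstrap standard deviation
beta_sd    <- apply(beta_boot, 2, sd)
ci_theoric <- rbind(lower = beta_hat - qnorm(1 - alpha / 2) * beta_sd,
                    upper = beta_hat + qnorm(1 - alpha / 2) * beta_sd)
```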
Implementation Example
Data Generator Function
```r
#' Generate regression data
#'
#' @param n number of observations in the generated data.
#' @param k number of regressors in the generated data.
#' @param sigma standard deviation of the errors in the generated data.
#' @param intercept set to `TRUE` to include the intercept parameter.
#' @param seed random seed.
#' @param eps threshold for outliers. We generate a random sequence `tau` in `[0,1]`.
#'   When `tau < eps` the observation becomes an outlier.
#' @param z value of the outlier.
#' @param type type of contamination. When `type = "R"` the outlying observation becomes equal to `z`;
#'   when `type = "M"` it becomes equal to `z*Y[tau < eps]`.
#' @rdname generate_regression_data
#' @name generate_regression_data
generate_regression_data <- function(n = 5000, k = 8, sigma = 2, intercept = TRUE,
                                     eps = 0, z = 100, type = "R", seed = 1){
  set.seed(seed)
  # Simulate the true beta
  true_beta <- rnorm(k, mean = 0, sd = 1)
  names(true_beta) <- paste0("beta_", 1:k)
  # Vector of true parameters
  theta_true <- c(true_beta, sigma = sigma)
  # Simulate the regressors
  X <- matrix(rnorm(n*k), nrow = n, ncol = k, byrow = TRUE)
  colnames(X) <- paste0("X", 1:k)
  # Include an intercept
  if (intercept) {
    X[, 1] <- rep(1, n)
  }
  # Regression errors
  eps_sim <- rnorm(n, mean = 0, sd = sigma)
  # Response variable
  Y <- X %*% true_beta + eps_sim
  # Contamination with outliers
  if (eps != 0) {
    tau <- runif(n)
    type <- match.arg(type, choices = c("R", "M"))
    if (type == "R") {
      Y[tau < eps, ] <- z
    } else if (type == "M") {
      Y[tau < eps, ] <- z*Y[tau < eps, ]
    }
  }
  colnames(Y) <- "Y"
  structure(
    list(
      X = X,
      Y = Y,
      e = eps,          # contamination threshold
      z = z,
      type = type,
      eps = eps_sim,    # simulated errors
      k = k,
      n = n,
      theta = theta_true
    )
  )
}
```
```r
#' @param object object coming from the fastRobBoot function.
#' @param k index of the parameter to plot; if `NULL` all the parameters will be plotted.
#' @param legend position of the legend, can be `none`, `top`, `left`, `right`, `bottom`.
#'   In case of multiple plotted parameters, it will be included only on the first plot.
#' @param title when `TRUE` a subtitle is added, only on the first plot.
plot.fastRobBoot <- function(object, k = NULL, legend = "none", title = TRUE){
  plot_fastRobBoot <- function(object, k = 1, legend = "top", title = TRUE){
    # Bootstrapped series
    theta_boot <- c(object$bootstrap[, k])[[1]]
    # Parameter name
    theta_name <- names(object$bootstrap[, k])
    theta_min  <- min(theta_boot)             # minimum value
    theta_max  <- max(theta_boot)             # maximum value
    theta_lo   <- object$results$lower[k]     # lower quantile
    theta_up   <- object$results$upper[k]     # upper quantile
    theta_true <- object$results$true[k]      # true parameter
    theta_init <- object$results$init[k]      # initial estimate
    e_theta    <- object$results$mean[k]      # expected value
    sd_theta   <- object$results$sigma[k]     # standard deviation
    # Theoretical distribution
    xy_e_theta  <- list(x = e_theta,  y = dnorm(e_theta,  mean = e_theta, sd = sd_theta))
    xy_theta_lo <- list(x = theta_lo, y = dnorm(theta_lo, mean = e_theta, sd = sd_theta))
    xy_theta_up <- list(x = theta_up, y = dnorm(theta_up, mean = e_theta, sd = sd_theta))
    # Grid for the theoretical distribution
    grid <- list(x = 0, y = 0)
    grid$x <- seq(theta_min, theta_max, length.out = 100)
    grid$y <- dnorm(grid$x, mean = e_theta, sd = sd_theta)
    plt <- ggplot() +
      geom_histogram(aes(theta_boot, ..density..), color = "black", fill = "lightgray",
                     alpha = 0.5, bins = 30) +
      geom_segment(aes(x = xy_e_theta$x, xend = xy_e_theta$x, y = 0, yend = xy_e_theta$y,
                       color = "mean"), linewidth = 1.1) +
      geom_segment(aes(x = xy_theta_lo$x, xend = xy_theta_lo$x, y = 0, yend = xy_theta_lo$y,
                       color = "lower"), linewidth = 1.1) +
      geom_segment(aes(x = xy_theta_up$x, xend = xy_theta_up$x, y = 0, yend = xy_theta_up$y,
                       color = "upper"), linewidth = 1.1) +
      geom_line(aes(grid$x, grid$y, color = "theoric"), linewidth = 1.3) +
      geom_point(aes(x = theta_init, y = 0, color = "init"), size = 3)
    if (!is.na(theta_true[1])){
      plt <- plt +
        geom_point(aes(x = theta_true, y = 0, color = "true"), size = 2,
                   fill = "purple", shape = 17)
    }
    plt <- plt +
      scale_color_manual(
        values = c(mean = "green", lower = "red", upper = "blue", theoric = "darkorange",
                   true = "purple", init = "magenta"),
        labels = c(mean = "Mean", lower = "Lower", upper = "Upper", theoric = "Theoretical",
                   true = "True", init = "Initial")
      ) +
      theme_bw() +
      theme(legend.position = legend) +
      labs(x = theta_name, y = NULL, color = NULL,
           subtitle = ifelse(title, paste0(nrow(object$bootstrap), " bootstrap"), ""))
    return(plt)
  }
  max_k <- nrow(object$results)
  if (is.null(k)){
    k <- 1:max_k
  }
  plt <- list()
  plt[[1]] <- plot_fastRobBoot(object, k = k[1], legend = legend, title = title)
  if (length(k) > 1){
    k[k > max_k | k == k[1]] <- NA
    k <- na.omit(k)
    for (i in k){
      plt[[i]] <- plot_fastRobBoot(object, k = i, legend = "none", title = FALSE)
    }
  }
  gridExtra::grid.arrange(grobs = plt)
}
```
Test: is it fast?
We compare the proposed fast robust bootstrap procedure with another approach that is still robust but slower: in the slow robust approach we simply re-estimate the parameters on each sub-sample with an MM-estimator (sketched below). We can see that the fast robust bootstrap is almost 7x faster.
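For reference, the slow robust comparator boils down to a full MM refit on every resample, along the lines of this sketch (using `robustbase::lmrob`; `slow_robust_boot` is an illustrative name, not the `slowRobBoot` function used below):

```r
# Minimal sketch of the slow robust bootstrap: a full MM refit (robustbase::lmrob)
# on every resample; this refit is the costly step that the fast robust bootstrap avoids.
slow_robust_boot <- function(X, Y, B = 500, sample_perc = 0.5){
  n_bar <- trunc(nrow(X) * sample_perc)
  replicate(B, {
    idx <- sample(nrow(X), size = n_bar, replace = TRUE)
    coef(robustbase::lmrob(Y[idx] ~ X[idx, ] - 1))  # X already contains the intercept column
  })
}
```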
```r
init_time <- Sys.time()
object <- fastRobBoot(B = B, n = n, k = k, sigma = sigma,
                      level = level, sample_perc = sample_perc, d = d,
                      ci = "empiric", seed = seed, quiet = TRUE)
end_time <- Sys.time()

init_time2 <- Sys.time()
object2 <- slowRobBoot(B = B, n = n, k = k, sigma = sigma,
                       level = level, sample_perc = sample_perc,
                       ci = "empiric", seed = seed, quiet = TRUE, robust = TRUE)
end_time2 <- Sys.time()
```
[1] "Execution time (Fast Robust): 3.54428815841675 seconds!"
[1] "Execution time (Slow Robust): 26.7345499992371 seconds!"
Figure (Fast Robust Bootstrap)
plot(object)
Fast robust bootstrap on subsamples without outliers. In green the bootstrapped mean parameter, in magenta the initial estimate parameter, in purple the true parameter. The histogram is obtained from the bootstrapped distribution while the theoretical normal distribution is in orange. The lower and upper confidence levels are respectively in red and blue.
Figure (Slow Robust Bootstrap)
plot(object2)
Slow robust bootstrap on subsamples without outliers. In green the bootstrapped mean parameter, in magenta the initial estimate parameter, in purple the true parameter. The histogram is obtained from the bootstrapped distribution while the theoretical normal distribution is in orange. The lower and upper confidence levels are respectively in red and blue.
(Fast Robust) Confidence intervals without outliers.
| theta | init | lower | mean | upper | true | bias | sigma |
|---|---|---|---|---|---|---|---|
| \hat{\beta_{1}} | 1.0589404 | 0.7889691 | 1.0548896 | 1.3078614 | 1.2629543 | -0.2080647 | 0.1294458 |
| \hat{\beta_{2}} | -0.3988854 | -0.6793477 | -0.4046007 | -0.1390852 | -0.3262334 | -0.0783674 | 0.1334516 |
| \hat{\beta_{3}} | 1.3919871 | 1.1320694 | 1.3882910 | 1.6424921 | 1.3297993 | 0.0584917 | 0.1284604 |
| \hat{\beta_{4}} | 1.3962083 | 1.1343451 | 1.3945891 | 1.6418834 | 1.2724293 | 0.1221598 | 0.1260848 |
| \hat{\beta_{5}} | 0.5257455 | 0.2716822 | 0.5210130 | 0.7807042 | 0.4146414 | 0.1063715 | 0.1301632 |
| \hat{\beta_{6}} | -1.4569341 | -1.6987144 | -1.4554621 | -1.2101879 | -1.5399500 | 0.0844879 | 0.1295230 |
| \hat{\beta_{7}} | -0.8917018 | -1.1550028 | -0.8972858 | -0.6389953 | -0.9285670 | 0.0312813 | 0.1332243 |
| \hat{\beta_{8}} | -0.1095115 | -0.3542053 | -0.1062331 | 0.1613692 | -0.2947204 | 0.1884873 | 0.1348299 |
| \hat{\sigma} | 4.0257902 | 3.6790257 | 3.9813193 | 4.2993828 | 4.0000000 | -0.0186807 | 0.1525036 |
(Slow Robust) Confidence intervals without outliers.
| theta | init | lower | mean | upper | true | bias | sigma |
|---|---|---|---|---|---|---|---|
| \hat{\beta_{1}} | 1.0511706 | 0.7878794 | 1.0674608 | 1.3628851 | 1.2629543 | -0.1954935 | 0.1480856 |
| \hat{\beta_{2}} | -0.3981599 | -0.6498027 | -0.3865137 | -0.1022582 | -0.3262334 | -0.0602804 | 0.1517938 |
| \hat{\beta_{3}} | 1.3961376 | 1.1330371 | 1.3866320 | 1.6821311 | 1.3297993 | 0.0568327 | 0.1409718 |
| \hat{\beta_{4}} | 1.3954471 | 1.1021259 | 1.3930762 | 1.6847461 | 1.2724293 | 0.1206469 | 0.1418879 |
| \hat{\beta_{5}} | 0.5261062 | 0.2400689 | 0.5275744 | 0.7928579 | 0.4146414 | 0.1129330 | 0.1577865 |
| \hat{\beta_{6}} | -1.4521350 | -1.7256519 | -1.4659156 | -1.2128659 | -1.5399500 | 0.0740344 | 0.1444875 |
| \hat{\beta_{7}} | -0.8949850 | -1.1179134 | -0.8891667 | -0.6380188 | -0.9285670 | 0.0394003 | 0.1317179 |
| \hat{\beta_{8}} | -0.1095828 | -0.3904141 | -0.0922017 | 0.1573025 | -0.2947204 | 0.2025187 | 0.1453933 |
| \hat{\sigma} | 4.0112498 | 3.8397435 | 4.0192902 | 4.2228865 | 4.0000000 | 0.0192902 | 0.1002278 |
Test: is it robust?
We compare the proposed fast robust bootstrap procedure with another approach that is faster but not robust: in the non-robust approach we simply re-estimate the parameters on each sub-sample with an OLS estimator (sketched below).
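The non-robust comparator simply replaces the MM refit with an OLS fit on each resample, roughly as follows (`ols_boot` is an illustrative name, not the `slowRobBoot(robust = FALSE)` call used below):

```r
# Minimal sketch of the non-robust bootstrap: plain OLS on every resample.
# Fast, but the refits are driven by whatever outliers end up in each subsample.
ols_boot <- function(X, Y, B = 500, sample_perc = 0.5){
  n_bar <- trunc(nrow(X) * sample_perc)
  replicate(B, {
    idx <- sample(nrow(X), size = n_bar, replace = TRUE)
    coef(lm(Y[idx] ~ X[idx, ] - 1))  # X already contains the intercept column
  })
}
```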
```r
init_time <- Sys.time()
object_pert <- fastRobBoot(B = B, n = n, k = k, sigma = sigma,
                           level = level, sample_perc = sample_perc, d = d,
                           eps = eps, z = z, type = type,
                           ci = "empiric", seed = seed, quiet = TRUE, consistency = FALSE)
end_time <- Sys.time()

init_time2 <- Sys.time()
object2_pert <- slowRobBoot(B = B, n = n, k = k, sigma = sigma,
                            level = level, sample_perc = sample_perc,
                            eps = eps, z = z, type = type,
                            ci = "empiric", seed = seed, quiet = TRUE, robust = FALSE)
end_time2 <- Sys.time()
```
[1] "Execution time (Fast Robust): 3.40817880630493 seconds!"
[1] "Execution time (non-Robust): 0.718003988265991 seconds!"
Figure (Fast Robust Bootstrap)
plot(object_pert)
Fast robust bootstrap on subsamples with outliers. In green the bootstrapped mean parameter, in magenta the initial estimate parameter, in purple the true parameter. The histogram is obtained from the bootstrapped distribution while the theoretical normal distribution is in orange. The lower and upper confidence levels are respectively in red and blue.
Figure (non-Robust Bootstrap)
plot(object2_pert)
Slow non-robust bootstrap on subsamples with outliers. In green the bootstrapped mean parameter, in magenta the initial estimate parameter, in purple the true parameter. The histogram is obtained from the bootstrapped distribution while the theoretical normal distribution is in orange. The lower and upper confidence levels are respectively in red and blue.
(Fast Robust) Confidence intervals with outliers.
| theta | init | lower | mean | upper | true | bias | sigma |
|---|---|---|---|---|---|---|---|
| \hat{\beta_{1}} | 1.0589404 | 0.7889691 | 1.0553627 | 1.3078614 | 1.2629543 | -0.2075915 | 0.1292296 |
| \hat{\beta_{2}} | -0.3988854 | -0.6793477 | -0.4042600 | -0.1390852 | -0.3262334 | -0.0780266 | 0.1336887 |
| \hat{\beta_{3}} | 1.3919871 | 1.1235711 | 1.3875019 | 1.6424921 | 1.3297993 | 0.0577026 | 0.1290206 |
| \hat{\beta_{4}} | 1.3962083 | 1.1343451 | 1.3948776 | 1.6418834 | 1.2724293 | 0.1224483 | 0.1263326 |
| \hat{\beta_{5}} | 0.5257455 | 0.2716822 | 0.5219242 | 0.7844976 | 0.4146414 | 0.1072828 | 0.1311146 |
| \hat{\beta_{6}} | -1.4569341 | -1.6987144 | -1.4550449 | -1.2044396 | -1.5399500 | 0.0849052 | 0.1300482 |
| \hat{\beta_{7}} | -0.8917018 | -1.1550028 | -0.8973723 | -0.6389953 | -0.9285670 | 0.0311947 | 0.1332723 |
| \hat{\beta_{8}} | -0.1095115 | -0.3542053 | -0.1051146 | 0.1637056 | -0.2947204 | 0.1896058 | 0.1358755 |
| \hat{\sigma} | 4.0257902 | 3.6790257 | 3.9815491 | 4.2993828 | 4.0000000 | -0.0184509 | 0.1530552 |
(non-Robust) Confidence intervals with outliers.
| theta | init | lower | mean | upper | true | bias | sigma |
|---|---|---|---|---|---|---|---|
| \hat{\beta_{1}} | 1.0511706 | 0.8005770 | 1.0449652 | 1.2820813 | 1.2629543 | -0.2179891 | 0.1206155 |
| \hat{\beta_{2}} | -0.3981599 | -0.6669506 | -0.4086902 | -0.1466543 | -0.3262334 | -0.0824569 | 0.1302746 |
| \hat{\beta_{3}} | 1.3961376 | 1.1036999 | 1.3592628 | 1.6133904 | 1.3297993 | 0.0294636 | 0.1312717 |
| \hat{\beta_{4}} | 1.3954471 | 1.2360310 | 1.4701240 | 1.7206092 | 1.2724293 | 0.1976947 | 0.1168997 |
| \hat{\beta_{5}} | 0.5261062 | 0.2960187 | 0.5253167 | 0.7519219 | 0.4146414 | 0.1106753 | 0.1168807 |
| \hat{\beta_{6}} | -1.4521350 | -1.6617490 | -1.4503151 | -1.2281579 | -1.5399500 | 0.0896350 | 0.1153854 |
| \hat{\beta_{7}} | -0.8949850 | -1.1344081 | -0.8735736 | -0.5855254 | -0.9285670 | 0.0549934 | 0.1359038 |
| \hat{\beta_{8}} | -0.1095828 | -0.4125427 | -0.1510493 | 0.1253507 | -0.2947204 | 0.1436711 | 0.1358338 |
| \hat{\sigma} | 4.0048564 | 3.8003654 | 3.9915528 | 4.1610923 | 4.0000000 | -0.0084472 | 0.0916854 |
Test: is it consistent?
We generate a new random error at each step and re-estimate the linearly corrected beta parameter to show that its expected value converges to the true population value.
```r
object_consistency <- fastRobBoot(B = B, n = n, k = k, sigma = sigma,
                                  level = level, sample_perc = sample_perc, d = d,
                                  ci = "empiric", seed = seed, quiet = TRUE, consistency = TRUE)
```
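The `consistency = TRUE` routine applies the linear correction of Equation 17 to freshly generated errors at every replication. The following simplified sketch only illustrates the underlying "regenerate the errors and re-estimate" idea, using a plain robust refit instead of the corrected estimator (all names are illustrative):

```r
# Simplified illustration of the consistency check: regenerate the errors at each step,
# re-estimate, and look at the average deviation from the true beta. The actual
# fastRobBoot(consistency = TRUE) uses the linearly corrected estimator of Equation 17;
# here a plain robust refit keeps the sketch short.
consistency_check <- function(X, true_beta, sigma, B = 100){
  est <- replicate(B, {
    Y_new <- as.numeric(X %*% true_beta) + rnorm(nrow(X), sd = sigma)  # fresh errors
    coef(robustbase::lmrob(Y_new ~ X - 1))                             # robust refit
  })
  rowMeans(est) - true_beta   # average bias, expected to shrink towards zero
}
```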
Figure (Consistency Fast Robust Bootstrap)
plot(object_consistency)
Bootstrapped confidence intervals on the full sample for consistency. In green the bootstrapped mean parameter, in magenta the initial estimate parameter, in purple the true parameter. The histogram is obtained from the bootstrapped distribution while the theoretical normal distribution is in orange. The lower and upper confidence levels are respectively in red and blue.
(Fast Robust) Consistency of bootstrapped expectation.
Finally, we apply the procedure to real data. For illustration purposes we use the iris dataset, which contains only 150 observations.
```r
object_emp <- lm(Sepal.Width ~ ., data = iris)
boot <- fastRobBoot.lm(object_emp, B = 20000, level = 0.95,
                       sample_perc = 0.5, d = 4.56, quiet = TRUE)
```
Fast robust bootstrap on subsamples of real data. In green the bootstrapped mean parameter, in magenta the initial estimate parameter. The histogram is obtained from the bootstrapped distribution while the theoretical normal distribution is in orange. The lower and upper confidence levels are respectively in red and blue.
(Fast Robust) Confidence intervals on real data.
| theta | init | lower | mean | upper | sigma |
|---|---|---|---|---|---|
| \hat{\beta_{1}} | 1.8149570 | 1.1593555 | 1.7685507 | 2.3911049 | 0.3163409 |
| \hat{\beta_{2}} | 0.3289448 | 0.1920690 | 0.3442840 | 0.4940748 | 0.0777053 |
| \hat{\beta_{3}} | -0.1336001 | -0.3570081 | -0.1519162 | 0.0536134 | 0.1037093 |
| \hat{\beta_{4}} | 0.6508073 | 0.3749381 | 0.6488262 | 0.9445551 | 0.1442441 |
| \hat{\beta_{5}} | -1.2747748 | -1.6833223 | -1.2440808 | -0.7838698 | 0.2282095 |
| \hat{\beta_{6}} | -1.5776975 | -2.1674423 | -1.5282813 | -0.8621432 | 0.3308906 |
| \hat{\sigma} | 0.2822656 | 0.1829479 | 0.2398150 | 0.2965793 | 0.0292481 |
References
Salibian-Barrera, M., and R. H. Zamar. 2002. "Bootstrapping Robust Estimates of Regression." Annals of Statistics 30 (2): 556–82. https://doi.org/10.1214/aos/1021379865.
Salibián-Barrera, Matías, Stefan Van Aelst, and Gert Willems. 2008. "Fast and Robust Bootstrap." Statistical Methods & Applications 17 (1): 41–71. https://doi.org/10.1007/s10260-007-0048-6.