Optional Advanced Topic: Best Out of Sample Prediction

First we will reload the statsr and BAS packages along with the wage data.

library(statsr)
library(BAS)
data(wage)

One approach to help select among alternate estimators is to use out of sample of sample validation to compare predictive accuracy. The following code creates a training data and test data set based on a left out 10% sample. We obtain the posterior distribution of models and model specific parameters using BAS using only the training data. Using this posterior distribution we then predict lwage using the explanatory variables in the test data set for several estimators: BMA, BPM, MPM and HPM and extract the predicted (fit) values. The function cv.summary.bas computes the average square root mean squared error between the fitted values and the observed values of lwage in the left our test data. We repeat this in a loop This may take some time to run.

set.seed(42)

wage_no_na = na.omit(wage)

n = nrow(wage_no_na)
n_cv = 50
ape = matrix(NA, ncol=4, nrow=n_cv)
colnames(ape) = c("BMA", "BPM", "HPM", "MPM")

for (i in 1:n_cv) {
  train = sample(1:n, size=round(.90*n), replace=FALSE)
  lwage_train = wage_no_na[train,]
  lwage_test = wage_no_na[-train,]

  bma_train_wage = bas.lm(lwage ~ . - wage, data=lwage_train, 
                          prior="BIC", modelprior=uniform(), initprobs="eplogp")
  yhat_bma = predict(bma_train_wage, lwage_test, estimator="BMA")$fit
  yhat_hpm = predict(bma_train_wage, lwage_test, estimator="HPM")$fit
  yhat_mpm = predict(bma_train_wage, lwage_test, estimator="MPM")$fit
  yhat_bpm = predict(bma_train_wage, lwage_test, estimator="BPM")$fit
  ape[i, "BMA"] = cv.summary.bas(yhat_bma, lwage_test$lwage)
  ape[i, "BPM"] = cv.summary.bas(yhat_bpm, lwage_test$lwage)
  ape[i, "HPM"] = cv.summary.bas(yhat_hpm, lwage_test$lwage)
  ape[i, "MPM"] = cv.summary.bas(yhat_mpm, lwage_test$lwage)
}

We can look at how sensitive the predictions are to the choice of estimator by viewing the side-by-side boxplots of the average prediction errors as well as the mean of the APE over the different test sets.

boxplot(ape)

apply(ape, 2, mean)
##       BMA       BPM       HPM       MPM 
## 0.3498728 0.3500223 0.3560288 0.3556935

While BMA is the best, followed by BPM, they are all pretty close in this case. Note the values of the average prediction error are on the same scale as the lwage or the residual MSE from lm where the smaller the better.

This can be used to compare different prior distributions (say BIC versus ZS-null) or other options.

Exercise: Using the reduced data set, compare the average prediction error using the four different estimators with BIC. What happens if you switch to using prior="ZS-null"' withbas.lm<ol><b> A: TheBPMis the best, followed by "BMA". However, they are all closer than those usingprior=“BIC”`.
wage_red <- subset(wage,
                   select = -c(sibs, brthord, feduc, meduc))

set.seed(42)
wage_red_no.na <- na.omit(wage_red)
n_red <- nrow(wage_red_no.na)
n_cv_red <- 50
ape_red <- matrix(NA, ncol = 4, nrow = n_cv_red)
colnames(ape_red) <- c("BMA", "BPM", "HPM", "MPM")

for (i in 1:n_cv_red){
    train <- sample(1:n_red, size=round(0.9*n_red), replace = FALSE)
    lwage_train_red <- wage_red_no.na[train,]
    lwage_test_red <- wage_red_no.na[-train,]
    
    bma_train_wage_red <- bas.lm(lwage ~ . -wage, data = lwage_train_red,
                                 prior = "ZS-null", modelprior = uniform(), initprobs = "eplogp")
    yhat_red_bma <- predict(bma_train_wage_red, lwage_test_red, estimator = "BMA")$fit
    yhat_red_bpm <- predict(bma_train_wage_red, lwage_test_red, estimator = "BPM")$fit
    yhat_red_hpm <- predict(bma_train_wage_red, lwage_test_red, estimator = "HPM")$fit
    yhat_red_mpm <- predict(bma_train_wage_red, lwage_test_red, estimator = "MPM")$fit
    ape_red[i, "BMA"] <- cv.summary.bas(yhat_red_bma, lwage_test_red$lwage)
    ape_red[i, "BPM"] <- cv.summary.bas(yhat_red_bpm, lwage_test_red$lwage)
    ape_red[i, "HPM"] <- cv.summary.bas(yhat_red_hpm, lwage_test_red$lwage)
    ape_red[i, "MPM"] <- cv.summary.bas(yhat_red_mpm, lwage_test_red$lwage)
}
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(lwage ~ . - wage, data = lwage_train_red, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
## Warning in bas.lm(eval(object$call$formula), data = eval(object$call
## $data), : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
boxplot(ape_red)

apply(ape_red, 2, mean)
##       BMA       BPM       HPM       MPM 
## 0.3632045 0.3628332 0.3635935 0.3634771

Experimental: Outlier Detection

Looking at the residual plot of observed and fitted values under BMA using the original model fit,

bma_lwage = bas.lm(lwage ~ . -wage, data = wage_no_na,
                   prior = "BIC", 
                   modelprior = uniform())
plot(bma_lwage, which=1)

it appears that there may be some outliers in the data - observations 379, 440 and 560 have been flagged as the points with the three largest absolute residuals. Are they outliers or points who have a different mean or distribution than the others?

We can add an indicator variable for each observation to the design matrix where if the indicator variable is included this corresponds to the case having a different mean than what is expected under the regression model. If all indicators are included, then that corresponds to each case having a different mean, which is not very useful, so some form of outliers selection or model averaging is needed in combination with posterior inference about the other variables.

set.seed(42)
n = nrow(wage_no_na)
wage_outliers = cbind(wage_no_na, diag(1, nrow=n))
outliers_lwage = bas.lm(lwage ~ . -wage, data=wage_outliers, 
                        prior="ZS-null", a=n,
                        modelprior=tr.beta.binomial(a=1, b=1, trunc=n/2),
                        method="MCMC",
                        initprobs="marg-eplogp",
                        MCMC.iterations=500000, n.models=2^15
                        )
## Warning in bas.lm(lwage ~ . - wage, data = wage_outliers, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation

This uses the method="MCMC" as enumeration is not feasible. We are also introducing a truncated beta-binomial distribution on the model size. This assigns all models with more than n/2 coefficients a zero probability a priori which limits the number of possible outliers. (This could be smaller, say even 16, to prevent models with too many outliers and predictors). The initprobs argument used here helps to sort variables and makes the algorithm more efficient with the MCMC option.

Looking at the diagnostic plot for MCMC

diagnostics(outliers_lwage, type="pip")

it appears that we did not use enough MCMC iterations for convergence to the marginal inclusion probabilities. Let’s rerun again and also lower the truncation point.

outliers_lwage = bas.lm(lwage ~ . -wage, data=wage_outliers, 
                        prior="ZS-null", a=n,
                        modelprior=tr.beta.binomial(a=1, b=1, trunc=16),
                        method="MCMC",
                        initprobs="marg-eplogp",
                        MCMC.iterations=2*10^6)
## Warning in bas.lm(lwage ~ . - wage, data = wage_outliers, prior = "ZS-
## null", : We recommend using the implementation using the Jeffreys-Zellner-
## Siow prior (prior='JZS') which uses numerical integration rahter than the
## Laplace approximation
diagnostics(outliers_lwage, type="pip")

The diagnostic plot looks better, with the larger inclusion probabilities in reasonable agreement, however, several below 0.3 could benefit from running longer. This is probably adequate for the outlier detection and implementing model averaging, however, as predicted values tend to converge faster than probabilities.

Exercise: Fit the outlier model with the reduced dataset. Create a diagnostic plot with the output and determine how many MCMC iterations are needed.

    A: The MCMC iteration requires 4 million times.

    set.seed(42)
    n_red <- nrow(wage_red_no.na)
    wage_red_outliers <- cbind(wage_red_no.na, diag(1, nrow = n_red))
    outliers_lwage_red <- bas.lm(lwage ~ . -wage, data = wage_red_outliers,
                                 prior = "ZS-null", a=n_red,
                                 modelprior = tr.beta.binomial(a=1, b=1, trunc = n_red/2),
                                 method = "MCMC",
                                 initprob = "marg-eplogp",
                                 MCMC.iterations = 500000, n.models = 2^11)
    ## Warning in bas.lm(lwage ~ . - wage, data = wage_red_outliers, prior = "ZS-
    ## null", : We recommend using the implementation using the Jeffreys-Zellner-
    ## Siow prior (prior='JZS') which uses numerical integration rahter than the
    ## Laplace approximation
    diagnostics(outliers_lwage_red, type = "pip")

    outliers_lwage_red <- bas.lm(lwage ~ . -wage, data = wage_red_outliers,
                                 prior = "ZS-null", a=n_red,
                                 modelprior = tr.beta.binomial(a=1, b=1, trunc = 16),
                                 method = "MCMC",
                                 initprob = "marg-eplogp",
                                 MCMC.iterations = 2*10^6)
    ## Warning in bas.lm(lwage ~ . - wage, data = wage_red_outliers, prior = "ZS-
    ## null", : We recommend using the implementation using the Jeffreys-Zellner-
    ## Siow prior (prior='JZS') which uses numerical integration rahter than the
    ## Laplace approximation
    diagnostics(outliers_lwage_red, type = "pip")

    outliers_lwage_red <- bas.lm(lwage ~ . -wage, data = wage_red_outliers,
                                 prior = "ZS-null", a=n_red,
                                 modelprior = tr.beta.binomial(a=1, b=1, trunc = 16),
                                 method = "MCMC",
                                 initprob = "marg-eplogp",
                                 MCMC.iterations = 3*10^6)
    ## Warning in bas.lm(lwage ~ . - wage, data = wage_red_outliers, prior = "ZS-
    ## null", : We recommend using the implementation using the Jeffreys-Zellner-
    ## Siow prior (prior='JZS') which uses numerical integration rahter than the
    ## Laplace approximation
    diagnostics(outliers_lwage_red, type = "pip")

    outliers_lwage_red <- bas.lm(lwage ~ . -wage, data = wage_red_outliers,
                                 prior = "ZS-null", a=n_red,
                                 modelprior = tr.beta.binomial(a=1, b=1, trunc = 16),
                                 method = "MCMC",
                                 initprob = "marg-eplogp",
                                 MCMC.iterations = 4*10^6)
    ## Warning in bas.lm(lwage ~ . - wage, data = wage_red_outliers, prior = "ZS-
    ## null", : We recommend using the implementation using the Jeffreys-Zellner-
    ## Siow prior (prior='JZS') which uses numerical integration rahter than the
    ## Laplace approximation
    diagnostics(outliers_lwage_red, type = "pip")

The summary function is not as useful with so many possible predictors. Instead, let’s look at which variables have high marginal inclusion probabilities:

outliers_lwage$namesx[outliers_lwage$probne0 > .5]
##  [1] "Intercept" "iq"        "kww"       "educ"      "tenure"   
##  [6] "married1"  "urban1"    "`379`"     "`440`"     "`499`"    
## [11] "`560`"

This suggests that cases 379, 440 and 560 are potential outliers (this list may vary depending on Monte Carlo variation). A good data sleuth will investigate these cases and determine if there are valid reasons for why they may appear to be from a different population as justification for removing them. The model averaging paradigm will let us perform an analysis that accounts for the uncertainty that they have different means than what any of the regression models would specify.

Exercise: Figure out which variables have high marginal inlusion probabilities with reduce dataset.
    A: This suggests that cases 514,616 and 784 are potential outliers.
outliers_lwage_red$namesx[outliers_lwage_red$probne0 > .5]
##  [1] "Intercept" "iq"        "educ"      "exper"     "tenure"   
##  [6] "married1"  "black1"    "urban1"    "`514`"     "`616`"    
## [11] "`784`"