Boosting

Boosting is another approach for improving the predictions resulting from a decision tree. Like bagging, boosting is a general approach that can be applied to many statistical learning methods for regression or classification.

Boosting works much like bagging, except that the trees are grown sequentially: each tree is grown using information from the previously grown trees. Boosting does not involve bootstrap sampling; instead, each tree is fit on a modified version of the original data set.

The boosting approach learns slowly. Given the current model, we fit a decision tree to the residuals from that model. That is, we fit a tree using the current residuals, rather than the outcome \(Y\), as the response. We then add a shrunken version of the new tree to the model, update the residuals, and repeat until a specified number of trees, denoted by \(B\), has been constructed. The shrinkage parameter \(\lambda\), a small positive number, controls the rate at which boosting learns. Typical values are 0.01 or 0.001, depending on the problem; a smaller \(\lambda\) generally requires a larger \(B\) to achieve good performance.
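
The residual-fitting loop described above can be sketched by hand. This is only an illustrative sketch, not the algorithm as implemented in any particular package: it assumes simulated data rather than the Boston data, uses single-split trees from the rpart package, and the names `f.hat`, `r`, `step` and `train.mse` are made up for the example. At each step the new tree is scaled by \(\lambda\) before being added to the model, and the same scaled fit is subtracted from the residuals.

```r
library(rpart)  # recommended package that ships with R

# Simulated data (an assumption for illustration, not the Boston data)
set.seed(1)
n <- 200
x <- runif(n, -3, 3)
y <- sin(x) + rnorm(n, sd = 0.3)
dat <- data.frame(x = x)

B      <- 100         # number of trees
lambda <- 0.1         # shrinkage parameter
f.hat  <- rep(0, n)   # current model: starts at zero ...
r      <- y           # ... so the initial residuals equal the outcome
train.mse <- numeric(B)

for (b in seq_len(B)) {
  dat$r <- r
  # Fit a small tree to the current residuals, not to y
  tree.b <- rpart(r ~ x, data = dat,
                  control = rpart.control(maxdepth = 1, cp = 0))
  step  <- lambda * predict(tree.b, newdata = dat)
  f.hat <- f.hat + step   # add a shrunken copy of the new tree
  r     <- r - step       # update the residuals
  train.mse[b] <- mean((y - f.hat)^2)
}
```

Plotting `train.mse` against the tree index shows how quickly the training error falls for a given `lambda`; rerunning with a smaller `lambda` should make the decline visibly slower, which is the sense in which boosting "learns slowly".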

Below, boosting is applied to predict median home values (medv) using the Boston data set from the MASS package.

##     BOOSTING     ##

# Learns slowly
# Sequential (n.trees = ?)
# Decision Tree 1: (outcome as response)    -> Residual 1
# Decision Tree 2: (Residual 1 as response) -> Residual 2
# Decision Tree 3: (Residual 2 as response) -> Residual 3
# ...

library(gbm)
library(MASS)
data(Boston)  # load the Boston housing data from MASS

# Validation Set
set.seed(2)
train.index <- sample(nrow(Boston), 354, replace = FALSE)  # about 70% of the 506 rows
train <- Boston[train.index,]
test <- Boston[-train.index,]

# Build the Boosted Regression Model
set.seed(24)
boost.boston <- gbm(medv ~ . , distribution = "gaussian", 
                    data=train, n.trees = 500, interaction.depth=4,
                    shrinkage = 0.1) # learning rate

summary(boost.boston)

##             var     rel.inf
## lstat     lstat 38.59131366
## rm           rm 31.39322601
## dis         dis  7.70481752
## crim       crim  4.53862014
## nox         nox  4.53227291
## ptratio ptratio  3.90171594
## age         age  3.45819700
## black     black  2.61605484
## tax         tax  1.53609406
## indus     indus  1.01617657
## rad         rad  0.56204710
## zn           zn  0.12284648
## chas       chas  0.02661777

# Validating the Model (test-set MSE)
boost.pred <- predict(boost.boston, newdata = test, n.trees = 500)  # use all 500 trees
mean((boost.pred - test$medv)^2)
## [1] 9.495608

# Comparison with Random Forest
library(randomForest)

set.seed(25)
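# The argument is `ntree`; a misspelled `ntrees` is silently ignored,
# so the default of 500 trees is what gets used.
rf.boston <- randomForest(medv ~ . , data = train, ntree = 500)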

# MSE of random forest
rf.pred <- predict(rf.boston, newdata=test)
mean((rf.pred - test$medv)^2)
## [1] 10.82841
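
The interplay between the shrinkage parameter \(\lambda\) and the number of trees \(B\) can be explored on the same train/test split. The following is a rough sketch rather than a tuned analysis: the grids `tree.grid` and `shrink.grid` are illustrative assumptions, and the test set is reused for every setting, so the resulting errors are optimistic as a model-selection criterion. It relies on `predict()` for gbm objects accepting a vector of `n.trees` and returning one column of predictions per value.

```r
library(gbm)
library(MASS)   # provides the Boston data

# Same train/test split as in the example above
set.seed(2)
train.index <- sample(nrow(Boston), 354)
train <- Boston[train.index, ]
test  <- Boston[-train.index, ]

tree.grid   <- seq(100, 2000, by = 100)   # candidate values of B (assumed)
shrink.grid <- c(0.1, 0.01, 0.001)        # candidate values of lambda (assumed)

test.mse <- sapply(shrink.grid, function(s) {
  set.seed(24)
  fit <- gbm(medv ~ ., distribution = "gaussian", data = train,
             n.trees = max(tree.grid), interaction.depth = 4, shrinkage = s)
  # One column of predictions per entry of tree.grid
  pred <- predict(fit, newdata = test, n.trees = tree.grid)
  colMeans((pred - test$medv)^2)
})
rownames(test.mse) <- tree.grid
colnames(test.mse) <- paste0("lambda=", shrink.grid)

# Smallest test MSE reached for each shrinkage value
apply(test.mse, 2, min)
```

Each column of `test.mse` traces the test error as trees are added for one value of \(\lambda\), so `matplot(tree.grid, test.mse, type = "l")` gives a quick visual comparison.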