The stump model is a CART model with just one split, hence having two terminal nodes. The key is to find a splitting rule (with a splitting variable and a cutting point), and properly record the fitted value at each terminal node. For this question, I use a one-dimensional example, so only the cutting point needs to be searched. An additional difficulty is that I need to incorporate subject weights, which is used in AdaBoost.
set.seed(5)
n = 200
x = runif(n)
py <- function(x) sin(4*pi*x)/3 + 0.5
y = (rbinom(n, 1, py(x))-0.5)*2
plot(x, y + 0.1*runif(n, -1, 1), ylim = c(-1.1, 1.1), pch = 19,
col = ifelse(y == 1, "darkorange", "deepskyblue"), ylab = "y")
lines(sort(x), py(x)[order(x)] - 0.5)
testx = seq(0, 1, length.out = 1000)
testy = (rbinom(1000, 1, py(testx))-0.5)*2
The stump model works this way:
I wrote a function called myStump(x, y, w) that outputs the cutoff point, and the left and right predictions. After that, I test the code in the following two scenarios: * All training data has equal weights * Observations with \(x \geq 0.5\) has weights 2, while observations with \(x < 0.5\) has weights 0.1.
gini <- function(y, w){
p = weighted.mean(y ==1, w)
return(p*(1-p)) }
stump <-function(x, y, w){
n = length(x)
score = rep(1, n-1)
# search for the cutoff
for (i in 1:(n-1)){
index = (x <= x[i])
score[i] = -(sum(w[index])*gini(y[index], w[index]) +
sum(w[!index])*gini(y[!index], w[!index]))/sum(w) }
C = x[which.max(score)]
# label: vote by (weighted) majority
pl = weighted.mean(y[x <= C] ==1, w[x <= C])
pr = weighted.mean(y[x > C] ==1, w[x > C])
return(list(cutoff = C, left = ifelse(pl>0.5,1, -1),
right = ifelse(pr>0.5,1, -1))) }
# check two scenarios
w = rep(1/n, n)
stump(x, y, w)
## $cutoff
## [1] 0.2486734
##
## $left
## [1] 1
##
## $right
## [1] -1
w = ifelse(x >=0.5,2,0.1)
stump(x, y, w)
## $cutoff
## [1] 0.8036775
##
## $left
## [1] 1
##
## $right
## [1] -1
Let’s write my own code for AdaBoost which is an iterative method that calls the stump model and updates the weights.
shrinkage factors and plotted the final model (functional value of \(F\), and also the sign) with the observed data. # fit Adaboost
my_adaboost <-function(x, y, ntrees, epsilon = 0.5){
modellist = list()
w = rep(1/length(x), length(x))
for(k in 1 : ntrees){
fk = stump(x, y, w)
fit = ifelse(x <= fk$cutoff, fk$left, fk$right)
e = sum(w*(fit != y))
# cat(paste(" e is ", e, "\n") )
a =0.5* log((1-e)/e)
w = w * exp(-epsilon*a*y*fit)
w = w / sum(w)
# apply shrinkage parameter to alpha
fk['alpha'] = a * epsilon
modellist[[k]] = fk }
return(modellist) }
# predict by AdaBoost
my_adaboost_pred <-function (fit, K, testx, testy){
pred = rep(0, length(testx))
exp_loss = err = rep(NA, K)
for(k in 1: min(K, length(fit))){
pred_tree = ifelse(testx <= fit[[k]]$cutoff, fit[[k]]$left, fit[[k]]$right)
pred = pred + fit[[k]]$alpha * pred_tree
exp_loss[k] = mean( exp(- testy * pred) )
err[k] = mean(testy != sign(pred)) }
return (list("pred"= pred,"exp_loss"= exp_loss,"err"= err))}
#Fit model and prediction
K = c(200,400,1000)
delta = c(1,0.5,0.1)
pred_test <- list()
pred_train <- list()
for(i in 1:3){
myfit = my_adaboost(x, y, K[i], delta[i])
pred_test[[i]] = my_adaboost_pred(myfit, K[i], testx, testy)
pred_train[[i]] = my_adaboost_pred(myfit, K[i], x, y) }
par(mfrow = c(2,3))
# plot the exponential loss and error
for(i in 1:3){
main_text = paste0("Exponential loss and errors, delta = ", delta[i])
plot(pred_train[[i]]$exp_loss, type ="l", ylim = c(0,1),
main = main_text, ylab ="exponetial loss", xlab ="ntrees")
lines(pred_train[[i]]$err, col ='green')
lines(pred_test[[i]]$err, col ='red')
legend("right", lty =1, legend = c("exp loss","train error","test error"),
col = c("black","green","red")) }
# plot fitted model (last iteration)
for(i in 1:3){
main_text = paste0("Final model, delta = ", delta[i])
plot(x, y +0.1*runif(n, -1,1), ylim = c(-1.1,1.1),
col = ifelse(y ==1,"darkorange","deepskyblue"), ylab ="y",
main = main_text)
lines(sort(x), py(x)[order(x)] -0.5)
lines(testx, pred_test[[i]]$pred, col ="red")
lines(testx, ifelse(pred_test[[i]]$pred >0,1, -1), col ="blue", lty =2)}