p = seq(0, 1, 0.05)
gini = 2 * p * (1 - p)                          # Gini index: 2p(1 - p)
entropy = -(p * log(p) + (1 - p) * log(1 - p))  # cross-entropy (natural log); NaN at p = 0 and p = 1, which matplot drops
class.err = 1 - pmax(p, 1 - p)                  # classification error: 1 - max(p, 1 - p)
matplot(p, cbind(gini, entropy, class.err), type = "l", col = c("blue", "red", "green"),
        xlab = "p", ylab = "Impurity measure")
legend("topleft", legend = c("gini", "entropy", "classification.error"),
       col = c("blue", "red", "green"), lty = 1:3, cex = 0.8)
library(ISLR)   # provides the OJ data set
library(tree)   # classification trees
train = sample(dim(OJ)[1], 800)   # note: without set.seed(), this split is not reproducible
OJ.train = OJ[train, ]
OJ.test = OJ[-train, ]
oj.tree = tree(Purchase ~ ., data = OJ.train)
summary(oj.tree)
##
## Classification tree:
## tree(formula = Purchase ~ ., data = OJ.train)
## Variables actually used in tree construction:
## [1] "LoyalCH" "PriceDiff" "ListPriceDiff" "PctDiscMM"
## Number of terminal nodes: 8
## Residual mean deviance: 0.776 = 614.6 / 792
## Misclassification error rate: 0.1662 = 133 / 800
The tree uses four variables: LoyalCH, PriceDiff, ListPriceDiff, and PctDiscMM. It has 8 terminal nodes and a training misclassification error rate of 0.1662 (133/800).
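The variables actually used can also be read off the tree's frame component (every split variable except the "<leaf>" marker):

setdiff(unique(as.character(oj.tree$frame$var)), "<leaf>")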
oj.tree
## node), split, n, deviance, yval, (yprob)
## * denotes terminal node
##
## 1) root 800 1067.00 CH ( 0.61375 0.38625 )
## 2) LoyalCH < 0.5036 355 434.60 MM ( 0.30141 0.69859 )
## 4) LoyalCH < 0.276142 167 126.30 MM ( 0.12575 0.87425 )
## 8) LoyalCH < 0.0356415 56 10.03 MM ( 0.01786 0.98214 ) *
## 9) LoyalCH > 0.0356415 111 104.70 MM ( 0.18018 0.81982 ) *
## 5) LoyalCH > 0.276142 188 259.30 MM ( 0.45745 0.54255 )
## 10) PriceDiff < 0.05 76 83.21 MM ( 0.23684 0.76316 ) *
## 11) PriceDiff > 0.05 112 150.10 CH ( 0.60714 0.39286 ) *
## 3) LoyalCH > 0.5036 445 355.70 CH ( 0.86292 0.13708 )
## 6) LoyalCH < 0.764572 186 214.50 CH ( 0.73656 0.26344 )
## 12) ListPriceDiff < 0.235 76 105.30 CH ( 0.51316 0.48684 )
## 24) PctDiscMM < 0.196196 61 81.77 CH ( 0.60656 0.39344 ) *
## 25) PctDiscMM > 0.196196 15 11.78 MM ( 0.13333 0.86667 ) *
## 13) ListPriceDiff > 0.235 110 75.81 CH ( 0.89091 0.10909 ) *
## 7) LoyalCH > 0.764572 259 97.16 CH ( 0.95367 0.04633 ) *
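The terminal nodes can be pulled out of the frame directly; the row names are the node numbers in the printout above:

oj.tree$frame[oj.tree$frame$var == "<leaf>", c("n", "dev", "yval")]   # terminal nodes only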
I chose node 8, a terminal node (marked with a *). It is reached by three successive splits on LoyalCH: LoyalCH < 0.5036, then LoyalCH < 0.276142, then LoyalCH < 0.0356415. It contains 56 training observations with a deviance of 10.03; the predicted class is MM (Minute Maid), and about 98% of the observations in this node bought Minute Maid.

#### (d) Create a plot of the tree, and interpret the results.
plot(oj.tree)
text(oj.tree, pretty = 0)
Brand loyalty (LoyalCH) is the dominant factor: customers highly loyal to Citrus Hill are classified as CH, and customers with low loyalty as MM. For customers in between, the price variables (PriceDiff, ListPriceDiff, PctDiscMM) decide the prediction.
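To illustrate the loyalty effect, a small hypothetical check: duplicate one test observation and vary only LoyalCH. The prediction generally moves from MM at low loyalty to CH at high loyalty; the middle case depends on that row's price variables.

demo = OJ.test[rep(1, 3), ]        # hypothetical rows: three copies of test row 1
demo$LoyalCH = c(0.02, 0.50, 0.90) # low, middle, high loyalty
predict(oj.tree, demo, type = "class")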
oj.pred = predict(oj.tree, OJ.test, type = "class")
table(OJ.test$Purchase, oj.pred)
## oj.pred
## CH MM
## CH 151 11
## MM 30 78
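The overall test error rate follows from the confusion matrix: (11 + 30) / 270, about 0.152.

conf.mat = table(OJ.test$Purchase, oj.pred)
(conf.mat["CH", "MM"] + conf.mat["MM", "CH"]) / sum(conf.mat)   # off-diagonal counts / total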
cv.oj = cv.tree(oj.tree, FUN = prune.tree)   # cross-validation guided by deviance
par(mfrow = c(1, 2))
plot(cv.oj$size, cv.oj$dev, type = "b")
plot(cv.oj$k, cv.oj$dev, type = "b")
A tree size of 7 minimizes the cross-validated deviance (note: with FUN = prune.tree, cv.oj$dev holds deviance, not the classification error rate).
plot(cv.oj$size, cv.oj$dev, type = "b", xlab = "Tree Size", ylab = "Cross-validated deviance")
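As an aside, cv.tree can instead cross-validate on the misclassification count via FUN = prune.misclass; a sketch (results may differ from the deviance-based run above):

cv.oj.class = cv.tree(oj.tree, FUN = prune.misclass)   # $dev now holds CV misclassification counts
plot(cv.oj.class$size, cv.oj.class$dev, type = "b",
     xlab = "Tree Size", ylab = "CV misclassification count")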
treesize = 7
oj.pruned = prune.tree(oj.tree, best = treesize)
par(mfrow = c(1, 1))
plot(oj.pruned)
text(oj.pruned, pretty = 0)
summary(oj.pruned)
##
## Classification tree:
## snip.tree(tree = oj.tree, nodes = 4L)
## Variables actually used in tree construction:
## [1] "LoyalCH" "PriceDiff" "ListPriceDiff" "PctDiscMM"
## Number of terminal nodes: 7
## Residual mean deviance: 0.7896 = 626.1 / 793
## Misclassification error rate: 0.1662 = 133 / 800
The pruned tree's residual mean deviance is slightly higher (0.7896 vs. 0.776), while the training misclassification rate is unchanged at 0.1662.
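Both training error rates can also be computed directly as a check; they should match the summary() output:

mean(predict(oj.tree, OJ.train, type = "class") != OJ.train$Purchase)    # unpruned
mean(predict(oj.pruned, OJ.train, type = "class") != OJ.train$Purchase)  # pruned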
pred.unpruned = predict(oj.tree, OJ.test, type = "class")
misclass.unpruned = sum(OJ.test$Purchase != pred.unpruned)
misclass.unpruned/length(pred.unpruned)
## [1] 0.1518519
pred.pruned = predict(oj.pruned, OJ.test, type = "class")
misclass.pruned = sum(OJ.test$Purchase != pred.pruned)
misclass.pruned/length(pred.pruned)
## [1] 0.1518519
This time the pruned and unpruned trees have identical test error rates (0.1519), so pruning yielded a simpler tree at no cost in test accuracy.