3.)

# Create a sequence of probabilities
p <- seq(0, 1, length.out = 200)

# Classification Error: min(p, 1-p)
class_error <- pmin(p, 1 - p)

# Gini Index: 2 * p * (1 - p)
gini <- 2 * p * (1 - p)

# Entropy: -[p*log(p) + (1-p)*log(1-p)] with the natural log; 0*log(0) evaluates to NaN, replaced with 0 below
entropy <- - (p * log(p) + (1 - p) * log(1 - p))
entropy[is.na(entropy)] <- 0

# Plotting
plot(p, class_error, type = "l", col = "red", lwd = 2,
     ylim = c(0, max(entropy)), xlab = expression(hat(p)[m1]),
     ylab = "Measure Value", main = "Classification Error, Gini, and Entropy")
lines(p, gini, col = "blue", lwd = 2)
lines(p, entropy, col = "green", lwd = 2)

Legend:

-Red: Classification Error

-Blue: Gini Index

-Green: Entropy
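A legend can also be drawn directly on the plot with base R's legend(); a minimal sketch matching the colors above:

# Optional: in-plot legend matching the line colors above
legend("top", legend = c("Classification Error", "Gini Index", "Entropy"),
       col = c("red", "blue", "green"), lwd = 2)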

4.)

a.)

b.)

5.)

-The majority vote approach and the average probability approach yield different classifications: the majority vote classifies the observation as Red, while the average probability classifies it as Green. Both are close to the 0.5 cutoff, each skewing slightly toward its respective class.
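A quick check in R, assuming the ten bootstrapped estimates of P(Class is Red | X) given in the standard version of this exercise:

# Ten bootstrapped estimates of P(Class is Red | X), as assumed from the exercise
probs <- c(0.1, 0.15, 0.2, 0.2, 0.55, 0.6, 0.6, 0.65, 0.7, 0.75)

# Majority vote: Red if more than half of the estimates exceed 0.5
sum(probs > 0.5) > length(probs) / 2   # TRUE  -> Red (6 of 10 exceed 0.5)

# Average probability: Red only if the mean estimate exceeds 0.5
mean(probs) > 0.5                      # FALSE -> Green (mean = 0.45)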

6.)

1.) Start with all the data in one node, i.e., treat the entire training set as a single group.

2.) Recursive Splitting

-At each node, search over all predictors and all possible split points to find the single split that most reduces the residual sum of squares (RSS); a sketch of this search appears after this list.

-Split the data into two child nodes accordingly.

-Repeat this process (recursively) in each child node, again looking for the best single split to reduce RSS.

3.) Stop when a node reaches a minimum size or when further splitting no longer meaningfully reduces the RSS.

4.) “Leaf” Predictions

-Each leaf of the tree is assigned the mean of the response values of the training observations that fall in that leaf.

-For a new observation, you “drop” it down the tree, following the splits, and predict the leaf’s mean.
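A minimal sketch of the split search in step 2, assuming a data frame dat with numeric predictors and a numeric response column (the function name best_split is hypothetical); this illustrates the idea rather than a full tree-growing implementation:

# Brute-force search for the single split that most reduces the RSS
best_split <- function(dat, response) {
  predictors <- setdiff(names(dat), response)
  best <- list(rss = Inf, var = NA, cut = NA)
  for (v in predictors) {
    for (s in sort(unique(dat[[v]]))) {
      left  <- dat[[response]][dat[[v]] <  s]
      right <- dat[[response]][dat[[v]] >= s]
      if (length(left) == 0 || length(right) == 0) next
      # RSS of the two child nodes, each predicted by its own mean (step 4's leaf rule)
      rss <- sum((left - mean(left))^2) + sum((right - mean(right))^2)
      if (rss < best$rss) best <- list(rss = rss, var = v, cut = s)
    }
  }
  best
}

# Example: best first split for predicting mpg from wt and hp (built-in mtcars data)
best_split(mtcars[, c("mpg", "wt", "hp")], response = "mpg")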