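First, I used the following code to build the data set and split it into test and training data.

    library(dplyr)
    # Keep versicolor and virginica, and drop the unused 'setosa' factor level
    iris_sub <- iris %>% filter(Species != 'setosa') %>% mutate(Species = factor(Species))
    # Test data: the first 10 rows of each species
    iris_test <- iris_sub %>%
      group_by(Species) %>% slice(1:10) %>% ungroup()
    # Training data: the remaining 40 rows of each species
    iris_train <- iris_sub %>%
      group_by(Species) %>% slice(11:n()) %>% ungroup()
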
The class-specific means of the predictor variables for the training data are as follows.
For Sepal.Length, the class-specific means are 5.8950 (versicolor) and 6.5925 (virginica).
For Sepal.Width, the class-specific means are 2.7450 (versicolor) and 2.9825 (virginica).
We can see this from the following table:
| Species | Sepal.Length | Sepal.Width |
|---|---|---|
| versicolor | 5.8950 | 2.7450 |
| virginica | 6.5925 | 2.9825 |
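
As a sketch of how these class-specific means can be computed, the block below rebuilds the training split in base R (so it runs on its own, without dplyr) and averages each predictor within each species. The helper variable `pos` is introduced here for illustration and is not part of the original code.

```r
# Rebuild the two-species data set and drop the unused 'setosa' level
iris_sub <- droplevels(subset(iris, Species != "setosa"))

# Position of each row within its species (1..50); illustrative helper
pos <- ave(seq_len(nrow(iris_sub)), iris_sub$Species, FUN = seq_along)

# Training data: rows 11..50 of each species
iris_train <- iris_sub[pos > 10, ]

# Mean of each predictor within each class
class_means <- aggregate(cbind(Sepal.Length, Sepal.Width) ~ Species,
                         data = iris_train, FUN = mean)
print(class_means)
```

With the built-in `iris` data, the printed means should agree with the table above.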

The confusion matrix for the test data is shown below:

| | versicolor | virginica |
|---|---|---|
| versicolor | 4 | 2 |
| virginica | 6 | 8 |
3(a) Using Sepal.Length and Sepal.Width as predictors, I got the following results:

| | versicolor | virginica |
|---|---|---|
| versicolor | 4 | 2 |
| virginica | 6 | 8 |
iii. Sepal.Width does not appear to be a necessary predictor variable for the purpose of classification. The main reason is that its coefficient is not statistically significant: its p-value is high (p = 0.4188).
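
A p-value like the one above is read from a fitted model's coefficient table. The sketch below assumes the classifier is a logistic regression fit with `glm()` and that test flowers are classified with a 0.5 probability cutoff. Neither detail is stated above, so the numbers it prints may differ from the ones reported.

```r
# Rebuild the split in base R so this block runs on its own
iris_sub <- droplevels(subset(iris, Species != "setosa"))
pos <- ave(seq_len(nrow(iris_sub)), iris_sub$Species, FUN = seq_along)
iris_test  <- iris_sub[pos <= 10, ]
iris_train <- iris_sub[pos > 10, ]

# Assumed model: logistic regression. With a factor response, glm() models
# the probability of the second level, here "virginica".
fit <- glm(Species ~ Sepal.Length + Sepal.Width,
           data = iris_train, family = binomial)

# Columns: Estimate, Std. Error, z value, Pr(>|z|)
coef_table <- summary(fit)$coefficients
print(coef_table)

# Assumed decision rule: predict "virginica" when the fitted probability > 0.5
prob <- predict(fit, newdata = iris_test, type = "response")
pred <- factor(ifelse(prob > 0.5, "virginica", "versicolor"),
               levels = levels(iris_test$Species))

conf_mat   <- table(predicted = pred, actual = iris_test$Species)
error_rate <- mean(pred != iris_test$Species)
print(conf_mat)
print(error_rate)
```

The `Pr(>|z|)` entry in the `Sepal.Width` row is the quantity to compare against a significance level. Refitting with `Species ~ Sepal.Length` alone gives the one-predictor version.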
3(b) Using Sepal.Length as a one-dimensional predictor, I got the following results:

| | versicolor | virginica |
|---|---|---|
| versicolor | 4 | 2 |
| virginica | 6 | 8 |
iii. Comparing my results here with those in 3(a), the estimates of the model parameters, both the \(\hat\beta\)s and their standard errors, are very close in parts (a) and (b). Additionally, the misclassification error rate and the confusion matrices are identical in parts (a) and (b). Therefore, yes, my result in 3(b)(ii) supports my answer to 3(a)(iii): dropping Sepal.Width leaves the test-set classifications unchanged, which is consistent with Sepal.Width not being a statistically significant predictor.
For k = 1, the misclassification error rate is 0.35 (7 of the 20 test observations), and the confusion matrix is shown below:

| | versicolor | virginica |
|---|---|---|
| versicolor | 5 | 2 |
| virginica | 5 | 8 |
For k = 5, the misclassification error rate is 0.4 (8 of the 20 test observations), and the confusion matrix is shown below:

| | versicolor | virginica |
|---|---|---|
| versicolor | 4 | 2 |
| virginica | 6 | 8 |
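
A minimal sketch of how the k-NN results might be reproduced, assuming `knn()` from the `class` package with both Sepal.Length and Sepal.Width as predictors. The choice of predictors, the seed, and the helper `knn_error` are assumptions made for illustration. Because `knn()` breaks ties at random, the output can differ from the matrices above.

```r
library(class)  # provides knn(); a recommended package shipped with R

# Rebuild the split in base R so this block runs on its own
iris_sub <- droplevels(subset(iris, Species != "setosa"))
pos <- ave(seq_len(nrow(iris_sub)), iris_sub$Species, FUN = seq_along)
iris_test  <- iris_sub[pos <= 10, ]
iris_train <- iris_sub[pos > 10, ]

# Assumed predictor set; the text above does not say which columns were used
predictors <- c("Sepal.Length", "Sepal.Width")

# Hypothetical helper: fit k-NN, print the confusion matrix, return the error rate
knn_error <- function(k) {
  pred <- knn(train = iris_train[, predictors],
              test  = iris_test[, predictors],
              cl    = iris_train$Species,
              k     = k)
  print(table(predicted = pred, actual = iris_test$Species))
  mean(pred != iris_test$Species)
}

set.seed(1)  # arbitrary seed, only to make the random tie-breaking repeatable
err_k1 <- knn_error(1)
err_k5 <- knn_error(5)
print(c(k1 = err_k1, k5 = err_k5))
```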
The three classification methods all produce the same confusion matrix and the same misclassification error rate of 0.4. The only exception is the k-NN classification method with k = 1, whose error rate of 0.35 is still close. Because the results are the same or nearly the same, each method performed about as well as the others on this test set. Overall, though, none of them performed especially well: they correctly classified only 60% to 65% of the test observations and misclassified the remaining 35% to 40%.
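
The error rates quoted above follow directly from the confusion matrices, since the off-diagonal counts are the misclassified test observations. A short check, with the matrices typed in by hand from the tables above (the function name `misclass_rate` is made up for illustration):

```r
# Misclassification rate = off-diagonal count / total count
misclass_rate <- function(cm) (sum(cm) - sum(diag(cm))) / sum(cm)

species <- c("versicolor", "virginica")

# matrix() fills column by column, so c(4, 6, 2, 8) gives rows (4, 2) and (6, 8)
cm_shared <- matrix(c(4, 6, 2, 8), nrow = 2, dimnames = list(species, species))
cm_k1     <- matrix(c(5, 5, 2, 8), nrow = 2, dimnames = list(species, species))

rate_shared <- misclass_rate(cm_shared)  # 8 of 20 misclassified
rate_k1     <- misclass_rate(cm_k1)      # 7 of 20 misclassified

print(c(shared = rate_shared, k1 = rate_k1))
print(1 - c(shared = rate_shared, k1 = rate_k1))  # corresponding accuracies
```

This gives 0.4 for the shared matrix and 0.35 for k = 1, that is, accuracies of 60% and 65%.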