knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
To find the best classifier among KNN and SVM models, I approach the problem in two ways: (a) cross-validation and (b) a train/validation/test split, applied to KNN and to linear/non-linear SVMs. The accuracy results are reported in a summary table in part (c) Result Discussion at the end of this section (Question 3.1).
The following parts walk through my code step by step for each model.
#### a.1 K-Nearest-Neighbor Model
Step 1. Install and load the kknn package. Then import the dataset, set the seed for reproducibility, and treat the 11th attribute as the response. I randomly assign 80% of the data points to cross-validation and keep the remaining 20% for testing.
library(kknn)
data <- read.table("~/Dropbox/Nga_GA/ISYE6501/Homework2_ISYE6501/data3.1/credit_card_data.txt", header = FALSE)
compare_result <- data.frame(Model = character(),
Validation = numeric(),
Test = numeric())
set.seed(123)
data$V11 <- as.factor(data$V11)
n <- nrow(data)
cv_idx <- sample(1:n, 0.8 * n)
cv_data <- data[cv_idx, ]
test_data <- data[-cv_idx, ]
Step 2. I set the number of folds for cross-validation and a list of k values to iterate over. Detailed steps are noted in the in-code comments.
k_folds <- 10 #Divide dataset into 10 folds
folds <- sample(rep(1:k_folds, length.out = nrow(cv_data))) #Randomly assign each row of the cross-validation data to 1 of the 10 folds
k_vals <- 1:30 #Create a list of k values from 1 to 30
acc_cv <- numeric(length(k_vals)) #the average accuracy for each k value
Step 3. Create the nested loop for cross-validation. For each k value, I iterate through the folds, using one fold for validation and training the model on the remaining 9 folds. This process repeats 10 times (once per fold) for each k value, yielding an average accuracy acc_cv for each k, which is then plotted in Figure 1 for visualization.
for (k in k_vals) {
acc_per_fold <- numeric(k_folds)
for (fold in 1:k_folds) {
validate_idx <- which(folds == fold)
train_data <- cv_data[-validate_idx, ]
validate_data <- cv_data[validate_idx, ]
knn_model <- kknn(V11 ~ ., train = train_data, test = validate_data,
k = k, kernel = 'rectangular', scale=TRUE)
pred <- predict(knn_model, type= "raw")
acc_per_fold[fold] <- mean(pred == validate_data$V11) #Getting the accuracy for each fold
}
acc_cv[k] <- mean(acc_per_fold) #Taking accuracy for each k value by averaging out the accuracy of all folds
}
plot(k_vals, acc_cv, type = "b", col = "blue", pch = 16,
xlab = "k", ylab = "Average CV Accuracy",
main = "Figure 1. Cross-Validated Accuracy vs k")
Step 4. Find the best k value based on the accuracy results across k values above.
best_k <- which.max(acc_cv)
cat("Best k from CV:", best_k, "with accuracy:", round(acc_cv[best_k], 4), "\n")
## Best k from CV: 7 with accuracy: 0.8548
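As an optional cross-check of the manual loop above (a sketch, not part of the graded workflow), the kknn package also provides train.kknn(), which tunes k by leave-one-out cross-validation on the same data; its chosen k can be compared with best_k.
loocv_knn <- train.kknn(V11 ~ ., data = cv_data, kmax = 30,
                        kernel = "rectangular", scale = TRUE) #Leave-one-out CV over k = 1..30
loocv_knn$best.parameters$k #Best k according to LOOCV; compare with best_k from the manual folds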
Step 5. Use the best k value as the parameter for the final model, then make predictions on the untouched 20% test set.
final_knn_model <- kknn(V11 ~ ., train = cv_data, test = test_data,
k = best_k, kernel = 'rectangular', scale=TRUE) #Use best k for final KNN model
test_pred_knn <- predict(final_knn_model, type= "raw") #Predict on test set
test_accuracy_knn <- mean(test_pred_knn == test_data$V11) #Compute test accuracy
compare_result <- rbind(compare_result,
data.frame(Model = "CV KNN",
Validation = max(acc_cv),
Test = test_accuracy_knn))
cat("Test set accuracy using k =", best_k, "is", round(test_accuracy_knn, 4), "\n")
## Test set accuracy using k = 7 is 0.8015
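For a bit more detail than accuracy alone, a confusion matrix on the test set (an optional check, reusing the objects created above) shows how the errors split between the two classes:
table(Predicted = test_pred_knn, Actual = test_data$V11) #Confusion matrix on the 20% test set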
#### a.2 Linear and Non-Linear SVM Models
I also built SVM models using cross-validation, out of curiosity whether there is any difference or improvement compared to the cross-validated KNN result above.
Step 1. Load kernlab, the SVM package. The dataset import is repeated (commented out below) in case the reviewer runs only this code chunk without running the one above.
#install.packages("kernlab")
library(kernlab)
#data <- read.table("~/Dropbox/Nga_GA/ISYE6501/Homework2_ISYE6501/data3.1/credit_card_data.txt", header = FALSE)
Step 2. After setting the random seed for reproducibility, I divide the dataset into 2 parts: 80% for cross-validation and the remaining 20% for testing. I list one linear kernel (vanilladot) and one non-linear kernel (rbfdot) to compare different types of SVM models, and try a range of C values to fine-tune the parameter combination; the accuracy results are reported in the summary data frame.
set.seed(234)
data$V11 <- as.factor(data$V11)
n <- nrow(data)
cv_idx <- sample(1:n, 0.8 * n)
cv_data <- data[cv_idx, ]
test_data <- data[-cv_idx, ]
kernels <- c("vanilladot", "rbfdot") # Build SVM with different kernels
C_values <- c(0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000, 10000, 100000) #Examples of C values to test
results <- data.frame(Kernel = character(), C = numeric(), Accuracy = numeric())
Step 3. I divide the cross-validation set into 10 folds, then loop through each kernel, each C value, and each fold to obtain the average accuracy. The model is trained on train_data and validated on validate_data. The accuracies across kernels and C values are sorted and reported in the results table.
k_folds <- 10 #Divide dataset into 10 folds
folds <- sample(rep(1:k_folds, length.out = nrow(cv_data))) #Randomly assign each row of the cross-validation data to 1 of the 10 folds
for (j in seq_along(kernels)) {
for (i in seq_along(C_values)) {
acc_per_fold <- numeric(k_folds)
for (fold in 1:k_folds) {
validate_idx <- which(folds == fold)
train_data <- cv_data[-validate_idx, ]
validate_data <- cv_data[validate_idx, ]
model_ksvm <- ksvm(as.matrix(train_data[,1:10]), as.factor(train_data[,11]),
type = "C-svc", kernel = kernels[j], C = C_values[i], scaled = TRUE)
pred <- predict(model_ksvm, validate_data[,1:10])
acc_per_fold[fold] <- mean(pred == validate_data$V11) #Getting the accuracy for each fold
}
acc_cv <- mean(acc_per_fold) #Taking accuracy for each C value by averaging out the accuracy of all folds
# Add to results
results <- rbind(results,
data.frame(Kernel = kernels[j],
C = C_values[i],
Accuracy = acc_cv))
}
}
## Setting default kernel parameters
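The message above is printed once per vanilladot fit, since that kernel has no parameters to set; for rbfdot, ksvm estimates sigma automatically. If one wanted to control sigma explicitly instead of relying on the automatic choice, kernlab's sigest() suggests a plausible range (a hedged sketch, not required for the comparison here):
sigma_range <- sigest(as.matrix(cv_data[, 1:10])) #Low/median/high estimates of a reasonable sigma
rbf_fixed <- ksvm(as.matrix(cv_data[, 1:10]), as.factor(cv_data[, 11]),
                  type = "C-svc", kernel = "rbfdot",
                  kpar = list(sigma = sigma_range[2]), C = 1, scaled = TRUE) #rbfdot with an explicit sigma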
results_sorted <- results[order(-results$Accuracy), ]
results_sorted
## Kernel C Accuracy
## 3 vanilladot 1e-02 0.8604862
## 4 vanilladot 1e-01 0.8604862
## 5 vanilladot 1e+00 0.8604862
## 6 vanilladot 1e+01 0.8604862
## 7 vanilladot 1e+02 0.8604862
## 8 vanilladot 1e+03 0.8604862
## 9 vanilladot 1e+04 0.8604862
## 15 rbfdot 1e+00 0.8585994
## 14 rbfdot 1e-01 0.8547896
## 10 vanilladot 1e+05 0.8529753
## 16 rbfdot 1e+01 0.8299710
## 2 vanilladot 1e-03 0.8070029
## 17 rbfdot 1e+02 0.7956459
## 18 rbfdot 1e+03 0.7804427
## 19 rbfdot 1e+04 0.7670900
## 20 rbfdot 1e+05 0.7554790
## 1 vanilladot 1e-04 0.5428157
## 11 rbfdot 1e-04 0.5428157
## 12 rbfdot 1e-03 0.5428157
## 13 rbfdot 1e-02 0.5428157
best_result <- results[which.max(results$Accuracy), ]
Step 4. Based on the best result (highest accuracy) from Step 3, I now take the parameter combination vanilladot with C = 0.01, retrain the model on all of cv_data, and test it on test_data. The test-set accuracy is reported below:
final_model <- ksvm(as.matrix(cv_data[,1:10]), as.factor(cv_data[,11]),
type = "C-svc", kernel = best_result$Kernel,
C = best_result$C, scaled = TRUE)
## Setting default kernel parameters
test_pred_svm <- predict(final_model, test_data[,1:10])
test_accuracy_svm <- mean(test_pred_svm == test_data$V11)
compare_result <- rbind(compare_result,
data.frame(Model = "CV SVM",
Validation = best_result$Accuracy,
Test = test_accuracy_svm
))
cat("Test accuracy using kernel = '",best_result$Kernel,"' and C = ",best_result$C, "is: ", round(test_accuracy_svm, 4), "\n")
## Test accuracy using kernel = ' vanilladot ' and C = 0.01 is: 0.8702
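Since the selected kernel is linear, the fitted decision boundary can also be inspected directly. The sketch below recovers the coefficients from the kernlab object's xmatrix, coef, and b slots (as I understand kernlab's internals; because scaled = TRUE, the coefficients apply to the scaled predictors):
a  <- colSums(final_model@xmatrix[[1]] * final_model@coef[[1]]) #Linear coefficients (on scaled data)
a0 <- -final_model@b #Intercept
a
a0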
For both the split-based KNN and SVM models below, I use the same train/validation/test split of the data, created here in Step 1.
set.seed(456) # for reproducibility
n <- nrow(data)
data$V11 <- as.factor(data$V11)
splits <- sample(c("train", "validate", "test"), size = n, replace = TRUE,
prob = c(0.6, 0.25, 0.15)) #Split the data points randomly to 3 parts
train_split <- data[splits == "train", ]
validate_split <- data[splits == "validate", ]
test_split <- data[splits == "test", ]
#### b.1 K-Nearest-Neighbor Model
Step 2. I train my KNN model on the training set only, and tune the k value on the validation set.
k_vals <- 1:30
acc_val <- numeric(length(k_vals))
for (k in k_vals) {
split_knn_model <- kknn(V11 ~ ., train = train_split, test = validate_split,
k = k, kernel = "rectangular", scale = TRUE)
pred <- predict(split_knn_model, type = "raw")
acc_val[k] <- mean(pred == validate_split$V11)
}
# Choose best k
best_k_split <- which.max(acc_val)
cat("Best k based on validation:", best_k_split, "with accuracy:",
round(acc_val[best_k_split], 4), "\n")
## Best k based on validation: 7 with accuracy: 0.8497
I also plotted the validation accuracy across different k values for better visualization:
plot(k_vals, acc_val, type = "b", pch = 16, col = "blue",
xlab = "k", ylab = "Validation Accuracy",
main = "Figure 2. KNN Accuracy on Validation Set")
Step 3. Using the best k value found above, I retrain the model on the combined training + validation subset, then use this retrained model to predict on the test set.
# Combine training and validation sets
trainval_split <- rbind(train_split, validate_split)
# Train final model with best k
final_split_knn_model <- kknn(V11 ~ .,
train = trainval_split,
test = test_split,
k = best_k_split,
kernel = "rectangular",
scale = TRUE)
test_pred <- predict(final_split_knn_model, type = "raw")
test_acc_split_knn <- mean(test_pred == test_split$V11)
compare_result <- rbind(compare_result,
data.frame(Model = "Split KNN",
Validation = acc_val[best_k_split],
Test = test_acc_split_knn
))
cat("Final split KNN model accuracy on test set with k =", best_k_split, "is", round(test_acc_split_knn, 4), "\n")
## Final split KNN model accuracy on test set with k = 7 is 0.9091
#### b.2 SVM Model
Step 2. I train both linear (vanilladot) and non-linear (rbfdot) SVM models on the training set, tuning the C value (from 0.0001 to 100000) and choosing the best kernel (either vanilladot or rbfdot).
results_split <- data.frame(Kernel = character(), C = numeric(), Accuracy = numeric())
kernels <- c("vanilladot", "rbfdot") # Build SVM with different kernels
C_values <- c(0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000, 10000, 100000) #Examples of C values to test
for (j in seq_along(kernels)) {
for (i in seq_along(C_values)) {
model_split_ksvm <- ksvm(as.matrix(train_split[,1:10]), as.factor(train_split[,11]),
type = "C-svc", kernel = kernels[j], C = C_values[i], scaled = TRUE)
pred <- predict(model_split_ksvm, validate_split[,1:10])
acc_split_val <- mean(pred == validate_split$V11) #Accuracy on the validation set
# Add to results
results_split <- rbind(results_split,
data.frame(Kernel = kernels[j],
C = C_values[i],
Accuracy = acc_split_val))
}
}
## Setting default kernel parameters
results_split_sorted <- results_split[order(-results_split$Accuracy), ]
results_split_sorted
## Kernel C Accuracy
## 3 vanilladot 1e-02 0.8670520
## 4 vanilladot 1e-01 0.8670520
## 5 vanilladot 1e+00 0.8670520
## 6 vanilladot 1e+01 0.8670520
## 7 vanilladot 1e+02 0.8670520
## 8 vanilladot 1e+03 0.8670520
## 9 vanilladot 1e+04 0.8670520
## 10 vanilladot 1e+05 0.8670520
## 14 rbfdot 1e-01 0.8612717
## 15 rbfdot 1e+00 0.8612717
## 16 rbfdot 1e+01 0.8323699
## 18 rbfdot 1e+03 0.7687861
## 19 rbfdot 1e+04 0.7630058
## 20 rbfdot 1e+05 0.7630058
## 17 rbfdot 1e+02 0.7398844
## 2 vanilladot 1e-03 0.6936416
## 1 vanilladot 1e-04 0.5491329
## 11 rbfdot 1e-04 0.5491329
## 12 rbfdot 1e-03 0.5491329
## 13 rbfdot 1e-02 0.5491329
best_split_svm <- results_split[which.max(results_split$Accuracy), ]
print(best_split_svm)
## Kernel C Accuracy
## 3 vanilladot 0.01 0.867052
Step 3. I now take the best combination, the linear SVM with kernel = "vanilladot" and C = 0.01, retrain it on the combined training + validation data, and test it on the test set.
final_split_ksvm_model <- ksvm(as.matrix(trainval_split[,1:10]),
as.factor(trainval_split[,11]),
type = "C-svc", kernel = best_split_svm$Kernel,
C = best_split_svm$C, scaled = TRUE)
## Setting default kernel parameters
test_pred_split_svm <- predict(final_split_ksvm_model, test_split[,1:10])
test_accuracy_split_svm <- mean(test_pred_split_svm == test_split$V11)
compare_result <- rbind(compare_result,
data.frame(Model = "Split SVM",
Validation = best_split_svm$Accuracy,
Test = test_accuracy_split_svm
))
cat("Test accuracy using kernel = '",best_split_svm$Kernel,"' and C = ",
best_split_svm$C, "is: ", round(test_accuracy_split_svm, 4), "\n")
## Test accuracy using kernel = ' vanilladot ' and C = 0.01 is: 0.8909
The accuracy results of the 4 models above are shown below. The best-performing classifier is the KNN model built by splitting the data set into 3 separate parts: 60% for training, 25% for validation, and 15% for testing. Its accuracy on the test set is 90.91% with k = 7.
Comparing validation accuracies, the linear SVM models appear strongest, yet the cross-validated SVM only ranks third on the test set and the cross-validated KNN drops to last place. This suggests some over-optimism from repeated exposure to the validation data during cross-validation. By splitting the data set and exposing the model to the validation subset only for parameter tuning, not for training, we can limit this over-optimism to some extent and achieve better test accuracy with the split models.
compare_result
## Model Validation Test
## 1 CV KNN 0.8547896 0.8015267
## 2 CV SVM 0.8604862 0.8702290
## 3 Split KNN 0.8497110 0.9090909
## 4 Split SVM 0.8670520 0.8909091
When I worked at a cafe, we wanted to categorize customer profiles into groups so that our marketing strategies could be more specifically targeted and therefore more likely to succeed. However, we did not pre-define those groups, so a clustering model would be suitable in this case. The predictors we would use are:
- age
- occupation
- sales channel (in-person, via app, or on the web)
- mode of consumption (in-store, pickup, or delivery)
- timestamp of the transaction/sale
To find the best k-means clustering of our data points, I divide my modelling process into 3 steps: (1) find the optimal k, (2) find the best combination of predictors, and (3) cross-check the confusion matrices against the results from (2) and the clustering visualizations.
Before any modelling, I import the data set and scale the first 4 variables (all except the last attribute, which is the label). I then load the relevant clustering packages and set the seed for reproducibility. To find the best number of clusters k (the centers argument of the kmeans() function), I iterate over k from 1 to 5, run the k-means model, and compare the total within-cluster sum of squares (ttwss = tot.withinss). I then plot ttwss against k, which shows that for k > 3 the ttwss does not decrease as sharply as for k <= 3. Result: based on this elbow, the optimal k should be 3.
rm(list = ls())
dt <- read.table("/Users/nganguyen/Dropbox/Nga_GA/ISYE6501/Homework2_ISYE6501/data4.2/iris.txt", header = TRUE)
dt_scaled <- scale(dt[1:4])
#install.packages("tidyverse")
#install.packages("cluster")
#install.packages("factoextra")
library(tidyverse)
library(cluster)
library(factoextra)
n_clusters <- 5
ttwss <- numeric(n_clusters)
# Elbow method to choose the optimal number of clusters
set.seed(42)
for (i in 1:n_clusters) {
km_model <- kmeans(dt_scaled, centers = i, nstart = 20)
ttwss[i] = km_model$tot.withinss
}
ttwss_df <- tibble(clusters = 1:n_clusters, ttwss = ttwss)
elbow_plot = ggplot(ttwss_df, aes(x = clusters, y = ttwss, group = 1)) +
geom_point(size = 4)+
geom_line()+
scale_x_continuous(breaks = c(1, 2, 3, 4, 5)) +
xlab("Number of clusters") +
ylab("Total Within-Cluster Sum of Squares")
elbow_plot
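As a quick cross-check of the manual elbow loop (optional; factoextra is already loaded above), fviz_nbclust() produces the same within-cluster sum-of-squares curve in one call:
fviz_nbclust(dt_scaled, kmeans, method = "wss", k.max = 5) #Elbow curve, should match the manual plot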
Because there are 4 variables available for k-means clustering, it is possible to loop through all combinations of them: subsets of 1, 2, 3, and 4 variables. For each combination I run the model, compute the accuracy for comparison, and store all results in a list for visualization later. The accuracy is calculated by matching clusters to true labels with the Hungarian algorithm, via the solve_LSAP() function in the clue package.
Result: the best predictors are Petal.Width (accuracy 96%), Petal.Length (accuracy 94.67%), or the combination of the two (accuracy 96%).
#install.packages("combinat")
#install.packages("clue")
library(combinat)
library(clue)
labels <- dt[, 5] # true class labels
# Encode labels as numeric
true_labels <- as.numeric(as.factor(labels))
combi_results <- list()
all_cols <- colnames(dt_scaled)
for (k in 1:4) {
combi <- combn(all_cols, k, simplify = FALSE)
for (vars in combi) {
subset <- dt_scaled[, vars, drop = FALSE]
set.seed(421)
km <- kmeans(subset, centers = 3, nstart = 20)
tab <- table(km$cluster, true_labels)
mapping <- solve_LSAP(tab, maximum = TRUE)
mapped <- as.numeric(mapping[km$cluster])
acc <- mean(mapped == true_labels)
key <- paste(vars, collapse = ", ")
combi_results[[key]] <- list(
variables = vars,
accuracy = round(acc, 4),
km_model = km,
mapped_clusters = mapped
)
}
}
# Create summary data frame
combi_df <- data.frame(
Variables = names(combi_results),
Accuracy = sapply(combi_results, function(x) x$accuracy),
row.names = NULL
)
combi_df_sorted <- combi_df[order(-combi_df$Accuracy), ]
print(combi_df_sorted)
## Variables Accuracy
## 4 Petal.Width 0.9600
## 10 Petal.Length, Petal.Width 0.9600
## 3 Petal.Length 0.9467
## 13 Sepal.Length, Petal.Length, Petal.Width 0.8667
## 14 Sepal.Width, Petal.Length, Petal.Width 0.8600
## 7 Sepal.Length, Petal.Width 0.8333
## 15 Sepal.Length, Sepal.Width, Petal.Length, Petal.Width 0.8333
## 9 Sepal.Width, Petal.Width 0.8200
## 6 Sepal.Length, Petal.Length 0.8067
## 11 Sepal.Length, Sepal.Width, Petal.Length 0.8067
## 12 Sepal.Length, Sepal.Width, Petal.Width 0.8067
## 5 Sepal.Length, Sepal.Width 0.7733
## 8 Sepal.Width, Petal.Length 0.7667
## 1 Sepal.Length 0.7067
## 2 Sepal.Width 0.5600
This last step takes a look at the confusion matrices of the top 3 combinations of predictors. Based on the petal measurements alone, width or length, our k-means model clusters the iris class setosa perfectly. Looking at the plots, we can easily observe that setosa sits quite far from, and well separated from, the rest of the data points. For the other classes, versicolor and virginica, the model mislabels a few data points located at the boundary between clusters 2 and 3.
library(ggplot2)
library(gridExtra)
top_combos <- head(combi_df_sorted$Variables, 3)
x_axis <- "Petal.Length"
y_axis <- "Petal.Width"
plot_list <- list()
for (name in top_combos) {
res <- combi_results[[name]] # already stored result
# Print confusion matrix
cat("\nConfusion matrix for:", name, "\n")
print(table(Cluster = res$mapped_clusters, TrueLabel = labels))
# Use Petal.Length and Petal.Width from dt_scaled for plotting
plot_df <- as.data.frame(dt_scaled[, c(x_axis, y_axis)])
plot_df$Cluster <- factor(res$mapped_clusters)
plot_df$TrueLabel <- factor(true_labels)
p <- ggplot(plot_df, aes_string(x = x_axis, y = y_axis)) +
geom_point(aes(color = Cluster, shape = TrueLabel), size = 3, alpha = 0.7) +
labs(
title = paste0("Clustered using:", name, "\nAccuracy: ", sprintf("%.2f%%", 100 * res$accuracy))
) +
coord_fixed() +
theme_minimal()
plot_list[[length(plot_list) + 1]] <- p
}
##
## Confusion matrix for: Petal.Width
## TrueLabel
## Cluster setosa versicolor virginica
## 1 50 0 0
## 2 0 48 4
## 3 0 2 46
##
## Confusion matrix for: Petal.Length, Petal.Width
## TrueLabel
## Cluster setosa versicolor virginica
## 1 50 0 0
## 2 0 48 4
## 3 0 2 46
##
## Confusion matrix for: Petal.Length
## TrueLabel
## Cluster setosa versicolor virginica
## 1 50 0 0
## 2 0 48 6
## 3 0 2 44
for (p in plot_list) {
print(p)
}
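As an optional follow-up (not part of the original analysis), the cluster package loaded earlier also provides silhouette widths, which give a label-free view of how well separated the petal-based clusters are; a minimal sketch on the Petal.Length/Petal.Width subset:
petal_sub <- dt_scaled[, c("Petal.Length", "Petal.Width")] #Two best predictors from the table above
set.seed(421)
km_petal <- kmeans(petal_sub, centers = 3, nstart = 20)
sil <- silhouette(km_petal$cluster, dist(petal_sub)) #Silhouette width for each point
mean(sil[, "sil_width"]) #Average silhouette width; closer to 1 means better-separated clusters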