Complete all Exercises, and submit answers to VtopBeta
## Loading packages
library(caret)
## Loading required package: lattice
## Loading required package: ggplot2
library(knitr)
library(mlbench)

| Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species |
|---|---|---|---|---|
| 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 5.0 | 3.6 | 1.4 | 0.2 | setosa |
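
The table above matches the first five rows of the built-in `iris` data frame. The call that produced it is not shown, so the following is only a minimal sketch, assuming the preview was rendered with `knitr::kable()`:

```r
# Assumption: the preview was rendered with knitr::kable(); the original call is not shown.
first_rows <- head(iris, 5)   # first five observations of the built-in data set

if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(first_rows))  # Markdown pipe table
} else {
  print(first_rows)                # plain console fallback
}
```
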

We use the Iris dataset to compare several classification algorithms, then use visualization techniques to show which of them performs best.
# 10-fold cross-validation, repeated 3 times (30 resamples per model)
control <- trainControl(method="repeatedcv", number=10, repeats=3)
set.seed(7)
fit.svm <- train(Species~., data=iris, method="svmRadial", trControl=control)
set.seed(7)
fit.knn <- train(Species~., data=iris, method="knn", trControl=control)
set.seed(7)
fit.rf <- train(Species~., data=iris, method="rf", trControl=control)
set.seed(7)
fit.nb <- train(Species~., data=iris, method="nb", trControl=control)
set.seed(7)
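# Note: "treebag" fits bagged CART trees, not a single decision tree
fit.decisionTree <- train(Species~., data=iris, method="treebag", trControl=control)
# Collect the resampling results of all five models for comparison
results <- resamples(list(DecisionTree=fit.decisionTree, NaiveBayes=fit.nb, SVM=fit.svm, KNN=fit.knn, RF=fit.rf))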
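
The `repeatedcv` control used above splits the data into 10 stratified folds and repeats that split 3 times, which gives each model 30 held-out evaluations. As a rough illustration (not part of the original analysis), caret's `createMultiFolds()` can generate the same kind of partition directly; the variable names `folds` and `held_out` are introduced here only for the example:

```r
library(caret)

set.seed(7)
# createMultiFolds() returns one vector of training-row indices per resample
folds <- createMultiFolds(iris$Species, k = 10, times = 3)

length(folds)                            # 10 folds x 3 repeats
held_out <- nrow(iris) - lengths(folds)  # rows left out of each training set
table(held_out)                          # distribution of held-out fold sizes
```

Every row is held out exactly once per repeat, so the held-out sizes add up to 3 × 150.
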
summary(results)
##
## Call:
## summary.resamples(object = results)
##
## Models: DecisionTree, NaiveBayes, SVM, KNN, RF
## Number of resamples: 30
##
## Accuracy
##                   Min.   1st Qu.    Median      Mean 3rd Qu. Max. NA's
## DecisionTree 0.8000000 0.9333333 0.9666667 0.9511111       1    1    0
## NaiveBayes   0.8000000 0.9333333 1.0000000 0.9600000       1    1    0
## SVM          0.8666667 0.9333333 1.0000000 0.9666667       1    1    0
## KNN          0.8666667 0.9333333 1.0000000 0.9755556       1    1    0
## RF           0.8000000 0.9333333 1.0000000 0.9555556       1    1    0
##
## Kappa
##              Min. 1st Qu. Median      Mean 3rd Qu. Max. NA's
## DecisionTree  0.7     0.9   0.95 0.9266667       1    1    0
## NaiveBayes    0.7     0.9   1.00 0.9400000       1    1    0
## SVM           0.8     0.9   1.00 0.9500000       1    1    0
## KNN           0.8     0.9   1.00 0.9633333       1    1    0
## RF            0.7     0.9   1.00 0.9333333       1    1    0
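
Kappa in the summary above is Cohen's kappa, which rescales accuracy against the agreement expected by chance. The sketch below uses base R and a made-up 15-row held-out fold (the labels are hypothetical, not taken from the run) to show how a single misclassification gives an accuracy of about 0.933 and a kappa of 0.9, the same pairing as the 1st-quartile values in the two tables:

```r
# Hypothetical held-out fold: 5 rows per class, one versicolor predicted as virginica
classes   <- c("setosa", "versicolor", "virginica")
actual    <- factor(rep(classes, each = 5), levels = classes)
predicted <- actual
predicted[6] <- "virginica"

cm <- table(predicted, actual)                       # confusion matrix
n  <- sum(cm)
p_observed  <- sum(diag(cm)) / n                     # plain accuracy
p_expected  <- sum(rowSums(cm) * colSums(cm)) / n^2  # agreement expected by chance
cohen_kappa <- (p_observed - p_expected) / (1 - p_expected)

c(accuracy = p_observed, kappa = cohen_kappa)
```

With three equally sized classes the chance agreement is 1/3, so kappa here is simply 1.5 × accuracy − 0.5.
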
scales <- list(x=list(relation="free"), y=list(relation="free"))
bwplot(results, scales=scales)
densityplot(results, scales=scales, pch = "|")
dotplot(results, scales=scales)
parallelplot(results)
splom(results)
diffs <- diff(results)
summary(diffs)
##
## Call:
## summary.diff.resamples(object = diffs)
##
## p-value adjustment: bonferroni
## Upper diagonal: estimates of the difference
## Lower diagonal: p-value for H0: difference = 0
##
## Accuracy
##              DecisionTree NaiveBayes SVM       KNN       RF
## DecisionTree              -0.008889  -0.015556 -0.024444 -0.004444
## NaiveBayes   1.0000                  -0.006667 -0.015556  0.004444
## SVM          0.1687       0.8307               -0.008889  0.011111
## KNN          0.1366       0.6984     1.0000               0.020000
## RF           1.0000       1.0000     0.5731    0.1737
##
## Kappa
##              DecisionTree NaiveBayes SVM       KNN       RF
## DecisionTree              -0.013333  -0.023333 -0.036667 -0.006667
## NaiveBayes   1.0000                  -0.010000 -0.023333  0.006667
## SVM          0.1687       0.8307               -0.013333  0.016667
## KNN          0.1366       0.6984     1.0000               0.030000
## RF           1.0000       1.0000     0.5731    0.1737
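
The upper-diagonal estimates are the row model's mean metric minus the column model's, and the lower-diagonal p-values carry a Bonferroni adjustment. The sketch below reproduces the accuracy differences in base R from the means reported by `summary(results)`. It then illustrates the adjustment on hypothetical raw p-values, assuming the correction is applied across the 10 model pairs (the unadjusted p-values do not appear in the output):

```r
# Mean accuracies copied from summary(results)
mean_acc <- c(DecisionTree = 0.9511111, NaiveBayes = 0.9600000,
              SVM = 0.9666667, KNN = 0.9755556, RF = 0.9555556)

# Row model minus column model; the upper triangle matches the reported estimates
est <- outer(mean_acc, mean_acc, "-")
est["DecisionTree", "KNN"]

# Bonferroni: multiply each raw p-value by the number of comparisons, capped at 1
n_pairs <- choose(length(mean_acc), 2)   # number of model pairs
raw_p   <- c(0.01, 0.05, 0.20)           # hypothetical values, for illustration only
adj_p   <- p.adjust(raw_p, method = "bonferroni", n = n_pairs)
adj_p
```
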
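
Based on the results above and the accompanying plots, KNN achieves the highest mean accuracy (0.976) and kappa (0.963), followed by SVM (0.967) and Naïve Bayes (0.960), while the Decision Tree (method "treebag") has the lowest mean accuracy (0.951). None of the pairwise differences is statistically significant after Bonferroni adjustment (the smallest p-value is 0.1366), so this ranking is indicative rather than conclusive.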