This famous (Fisher’s or Anderson’s) iris data set gives the measurements in centimeters of the variables sepal length and width and petal length and width, respectively, for 50 flowers from each of 3 species of iris. The species are Iris setosa, versicolor, and virginica.
iris
iris is a data frame with 150 cases (rows) and 5 variables (columns) named Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, and Species.
Fisher, R. A. (1936) The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, Part II, 179–188.
The data were collected by Anderson, Edgar (1935). The irises of the Gaspé Peninsula, Bulletin of the American Iris Society, 59, 2–5.
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.6 ✓ dplyr 1.0.8
## ✓ tidyr 1.2.0 ✓ stringr 1.4.0
## ✓ readr 2.1.2 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(caret)
## Loading required package: lattice
##
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
##
## lift
library(rpart)
library(rpart.plot)
iris
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
## 7 4.6 3.4 1.4 0.3 setosa
## 8 5.0 3.4 1.5 0.2 setosa
## 9 4.4 2.9 1.4 0.2 setosa
## 10 4.9 3.1 1.5 0.1 setosa
## 11 5.4 3.7 1.5 0.2 setosa
## 12 4.8 3.4 1.6 0.2 setosa
## 13 4.8 3.0 1.4 0.1 setosa
## 14 4.3 3.0 1.1 0.1 setosa
## 15 5.8 4.0 1.2 0.2 setosa
## 16 5.7 4.4 1.5 0.4 setosa
## 17 5.4 3.9 1.3 0.4 setosa
## 18 5.1 3.5 1.4 0.3 setosa
## 19 5.7 3.8 1.7 0.3 setosa
## 20 5.1 3.8 1.5 0.3 setosa
## 21 5.4 3.4 1.7 0.2 setosa
## 22 5.1 3.7 1.5 0.4 setosa
## 23 4.6 3.6 1.0 0.2 setosa
## 24 5.1 3.3 1.7 0.5 setosa
## 25 4.8 3.4 1.9 0.2 setosa
## 26 5.0 3.0 1.6 0.2 setosa
## 27 5.0 3.4 1.6 0.4 setosa
## 28 5.2 3.5 1.5 0.2 setosa
## 29 5.2 3.4 1.4 0.2 setosa
## 30 4.7 3.2 1.6 0.2 setosa
## 31 4.8 3.1 1.6 0.2 setosa
## 32 5.4 3.4 1.5 0.4 setosa
## 33 5.2 4.1 1.5 0.1 setosa
## 34 5.5 4.2 1.4 0.2 setosa
## 35 4.9 3.1 1.5 0.2 setosa
## 36 5.0 3.2 1.2 0.2 setosa
## 37 5.5 3.5 1.3 0.2 setosa
## 38 4.9 3.6 1.4 0.1 setosa
## 39 4.4 3.0 1.3 0.2 setosa
## 40 5.1 3.4 1.5 0.2 setosa
## 41 5.0 3.5 1.3 0.3 setosa
## 42 4.5 2.3 1.3 0.3 setosa
## 43 4.4 3.2 1.3 0.2 setosa
## 44 5.0 3.5 1.6 0.6 setosa
## 45 5.1 3.8 1.9 0.4 setosa
## 46 4.8 3.0 1.4 0.3 setosa
## 47 5.1 3.8 1.6 0.2 setosa
## 48 4.6 3.2 1.4 0.2 setosa
## 49 5.3 3.7 1.5 0.2 setosa
## 50 5.0 3.3 1.4 0.2 setosa
## 51 7.0 3.2 4.7 1.4 versicolor
## 52 6.4 3.2 4.5 1.5 versicolor
## 53 6.9 3.1 4.9 1.5 versicolor
## 54 5.5 2.3 4.0 1.3 versicolor
## 55 6.5 2.8 4.6 1.5 versicolor
## 56 5.7 2.8 4.5 1.3 versicolor
## 57 6.3 3.3 4.7 1.6 versicolor
## 58 4.9 2.4 3.3 1.0 versicolor
## 59 6.6 2.9 4.6 1.3 versicolor
## 60 5.2 2.7 3.9 1.4 versicolor
## 61 5.0 2.0 3.5 1.0 versicolor
## 62 5.9 3.0 4.2 1.5 versicolor
## 63 6.0 2.2 4.0 1.0 versicolor
## 64 6.1 2.9 4.7 1.4 versicolor
## 65 5.6 2.9 3.6 1.3 versicolor
## 66 6.7 3.1 4.4 1.4 versicolor
## 67 5.6 3.0 4.5 1.5 versicolor
## 68 5.8 2.7 4.1 1.0 versicolor
## 69 6.2 2.2 4.5 1.5 versicolor
## 70 5.6 2.5 3.9 1.1 versicolor
## 71 5.9 3.2 4.8 1.8 versicolor
## 72 6.1 2.8 4.0 1.3 versicolor
## 73 6.3 2.5 4.9 1.5 versicolor
## 74 6.1 2.8 4.7 1.2 versicolor
## 75 6.4 2.9 4.3 1.3 versicolor
## 76 6.6 3.0 4.4 1.4 versicolor
## 77 6.8 2.8 4.8 1.4 versicolor
## 78 6.7 3.0 5.0 1.7 versicolor
## 79 6.0 2.9 4.5 1.5 versicolor
## 80 5.7 2.6 3.5 1.0 versicolor
## 81 5.5 2.4 3.8 1.1 versicolor
## 82 5.5 2.4 3.7 1.0 versicolor
## 83 5.8 2.7 3.9 1.2 versicolor
## 84 6.0 2.7 5.1 1.6 versicolor
## 85 5.4 3.0 4.5 1.5 versicolor
## 86 6.0 3.4 4.5 1.6 versicolor
## 87 6.7 3.1 4.7 1.5 versicolor
## 88 6.3 2.3 4.4 1.3 versicolor
## 89 5.6 3.0 4.1 1.3 versicolor
## 90 5.5 2.5 4.0 1.3 versicolor
## 91 5.5 2.6 4.4 1.2 versicolor
## 92 6.1 3.0 4.6 1.4 versicolor
## 93 5.8 2.6 4.0 1.2 versicolor
## 94 5.0 2.3 3.3 1.0 versicolor
## 95 5.6 2.7 4.2 1.3 versicolor
## 96 5.7 3.0 4.2 1.2 versicolor
## 97 5.7 2.9 4.2 1.3 versicolor
## 98 6.2 2.9 4.3 1.3 versicolor
## 99 5.1 2.5 3.0 1.1 versicolor
## 100 5.7 2.8 4.1 1.3 versicolor
## 101 6.3 3.3 6.0 2.5 virginica
## 102 5.8 2.7 5.1 1.9 virginica
## 103 7.1 3.0 5.9 2.1 virginica
## 104 6.3 2.9 5.6 1.8 virginica
## 105 6.5 3.0 5.8 2.2 virginica
## 106 7.6 3.0 6.6 2.1 virginica
## 107 4.9 2.5 4.5 1.7 virginica
## 108 7.3 2.9 6.3 1.8 virginica
## 109 6.7 2.5 5.8 1.8 virginica
## 110 7.2 3.6 6.1 2.5 virginica
## 111 6.5 3.2 5.1 2.0 virginica
## 112 6.4 2.7 5.3 1.9 virginica
## 113 6.8 3.0 5.5 2.1 virginica
## 114 5.7 2.5 5.0 2.0 virginica
## 115 5.8 2.8 5.1 2.4 virginica
## 116 6.4 3.2 5.3 2.3 virginica
## 117 6.5 3.0 5.5 1.8 virginica
## 118 7.7 3.8 6.7 2.2 virginica
## 119 7.7 2.6 6.9 2.3 virginica
## 120 6.0 2.2 5.0 1.5 virginica
## 121 6.9 3.2 5.7 2.3 virginica
## 122 5.6 2.8 4.9 2.0 virginica
## 123 7.7 2.8 6.7 2.0 virginica
## 124 6.3 2.7 4.9 1.8 virginica
## 125 6.7 3.3 5.7 2.1 virginica
## 126 7.2 3.2 6.0 1.8 virginica
## 127 6.2 2.8 4.8 1.8 virginica
## 128 6.1 3.0 4.9 1.8 virginica
## 129 6.4 2.8 5.6 2.1 virginica
## 130 7.2 3.0 5.8 1.6 virginica
## 131 7.4 2.8 6.1 1.9 virginica
## 132 7.9 3.8 6.4 2.0 virginica
## 133 6.4 2.8 5.6 2.2 virginica
## 134 6.3 2.8 5.1 1.5 virginica
## 135 6.1 2.6 5.6 1.4 virginica
## 136 7.7 3.0 6.1 2.3 virginica
## 137 6.3 3.4 5.6 2.4 virginica
## 138 6.4 3.1 5.5 1.8 virginica
## 139 6.0 3.0 4.8 1.8 virginica
## 140 6.9 3.1 5.4 2.1 virginica
## 141 6.7 3.1 5.6 2.4 virginica
## 142 6.9 3.1 5.1 2.3 virginica
## 143 5.8 2.7 5.1 1.9 virginica
## 144 6.8 3.2 5.9 2.3 virginica
## 145 6.7 3.3 5.7 2.5 virginica
## 146 6.7 3.0 5.2 2.3 virginica
## 147 6.3 2.5 5.0 1.9 virginica
## 148 6.5 3.0 5.2 2.0 virginica
## 149 6.2 3.4 5.4 2.3 virginica
## 150 5.9 3.0 5.1 1.8 virginica
glimpse(iris)
## Rows: 150
## Columns: 5
## $ Sepal.Length <dbl> 5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9, 5.4, 4.…
## $ Sepal.Width <dbl> 3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, 3.1, 3.7, 3.…
## $ Petal.Length <dbl> 1.4, 1.4, 1.3, 1.5, 1.4, 1.7, 1.4, 1.5, 1.4, 1.5, 1.5, 1.…
## $ Petal.Width <dbl> 0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.3, 0.2, 0.2, 0.1, 0.2, 0.…
## $ Species <fct> setosa, setosa, setosa, setosa, setosa, setosa, setosa, s…
table(iris$Species)
##
## setosa versicolor virginica
## 50 50 50
iris %>%
  count(Species) %>%
  mutate(n_percent = n / sum(n))
## Species n n_percent
## 1 setosa 50 0.3333333
## 2 versicolor 50 0.3333333
## 3 virginica 50 0.3333333
ggplot(iris, aes(Petal.Length, Petal.Width)) +
  geom_jitter(color = '#77aaaa') +
  labs(title = "Petal Width x Petal Length") +
  theme_classic() +
  theme(plot.title = element_text(hjust = 0.5))
ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_jitter(color = '#5982e2') +
  labs(title = "Sepal Width x Sepal Length") +
  theme_classic() +
  theme(plot.title = element_text(hjust = 0.5))
ggplot(iris, aes(Petal.Length, Petal.Width)) +
  geom_jitter(color = '#77aaaa') +
  facet_wrap(~Species) +
  labs(title = "Petal Width x Petal Length of each Species") +
  theme_classic() +
  theme(plot.title = element_text(hjust = 0.5))
ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_jitter(color = '#5982e2') +
  facet_wrap(~Species) +
  labs(title = "Sepal Width x Sepal Length of each Species") +
  theme_classic() +
  theme(plot.title = element_text(hjust = 0.5))
mean(complete.cases(iris))
## [1] 1
set.seed(55)
id <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_df <- iris[id, ] # 80%
test_df <- iris[-id, ] # 20%
train_df
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 7 4.6 3.4 1.4 0.3 setosa
## 8 5.0 3.4 1.5 0.2 setosa
## 10 4.9 3.1 1.5 0.1 setosa
## 11 5.4 3.7 1.5 0.2 setosa
## 12 4.8 3.4 1.6 0.2 setosa
## 13 4.8 3.0 1.4 0.1 setosa
## 14 4.3 3.0 1.1 0.1 setosa
## 15 5.8 4.0 1.2 0.2 setosa
## 17 5.4 3.9 1.3 0.4 setosa
## 19 5.7 3.8 1.7 0.3 setosa
## 20 5.1 3.8 1.5 0.3 setosa
## 21 5.4 3.4 1.7 0.2 setosa
## 22 5.1 3.7 1.5 0.4 setosa
## 24 5.1 3.3 1.7 0.5 setosa
## 26 5.0 3.0 1.6 0.2 setosa
## 28 5.2 3.5 1.5 0.2 setosa
## 29 5.2 3.4 1.4 0.2 setosa
## 30 4.7 3.2 1.6 0.2 setosa
## 32 5.4 3.4 1.5 0.4 setosa
## 33 5.2 4.1 1.5 0.1 setosa
## 34 5.5 4.2 1.4 0.2 setosa
## 35 4.9 3.1 1.5 0.2 setosa
## 36 5.0 3.2 1.2 0.2 setosa
## 37 5.5 3.5 1.3 0.2 setosa
## 38 4.9 3.6 1.4 0.1 setosa
## 39 4.4 3.0 1.3 0.2 setosa
## 40 5.1 3.4 1.5 0.2 setosa
## 41 5.0 3.5 1.3 0.3 setosa
## 42 4.5 2.3 1.3 0.3 setosa
## 43 4.4 3.2 1.3 0.2 setosa
## 44 5.0 3.5 1.6 0.6 setosa
## 45 5.1 3.8 1.9 0.4 setosa
## 46 4.8 3.0 1.4 0.3 setosa
## 47 5.1 3.8 1.6 0.2 setosa
## 48 4.6 3.2 1.4 0.2 setosa
## 49 5.3 3.7 1.5 0.2 setosa
## 50 5.0 3.3 1.4 0.2 setosa
## 51 7.0 3.2 4.7 1.4 versicolor
## 53 6.9 3.1 4.9 1.5 versicolor
## 55 6.5 2.8 4.6 1.5 versicolor
## 56 5.7 2.8 4.5 1.3 versicolor
## 57 6.3 3.3 4.7 1.6 versicolor
## 58 4.9 2.4 3.3 1.0 versicolor
## 60 5.2 2.7 3.9 1.4 versicolor
## 62 5.9 3.0 4.2 1.5 versicolor
## 63 6.0 2.2 4.0 1.0 versicolor
## 64 6.1 2.9 4.7 1.4 versicolor
## 65 5.6 2.9 3.6 1.3 versicolor
## 67 5.6 3.0 4.5 1.5 versicolor
## 68 5.8 2.7 4.1 1.0 versicolor
## 69 6.2 2.2 4.5 1.5 versicolor
## 70 5.6 2.5 3.9 1.1 versicolor
## 71 5.9 3.2 4.8 1.8 versicolor
## 72 6.1 2.8 4.0 1.3 versicolor
## 73 6.3 2.5 4.9 1.5 versicolor
## 74 6.1 2.8 4.7 1.2 versicolor
## 75 6.4 2.9 4.3 1.3 versicolor
## 76 6.6 3.0 4.4 1.4 versicolor
## 77 6.8 2.8 4.8 1.4 versicolor
## 78 6.7 3.0 5.0 1.7 versicolor
## 80 5.7 2.6 3.5 1.0 versicolor
## 81 5.5 2.4 3.8 1.1 versicolor
## 83 5.8 2.7 3.9 1.2 versicolor
## 84 6.0 2.7 5.1 1.6 versicolor
## 85 5.4 3.0 4.5 1.5 versicolor
## 87 6.7 3.1 4.7 1.5 versicolor
## 88 6.3 2.3 4.4 1.3 versicolor
## 89 5.6 3.0 4.1 1.3 versicolor
## 90 5.5 2.5 4.0 1.3 versicolor
## 91 5.5 2.6 4.4 1.2 versicolor
## 92 6.1 3.0 4.6 1.4 versicolor
## 93 5.8 2.6 4.0 1.2 versicolor
## 94 5.0 2.3 3.3 1.0 versicolor
## 96 5.7 3.0 4.2 1.2 versicolor
## 97 5.7 2.9 4.2 1.3 versicolor
## 99 5.1 2.5 3.0 1.1 versicolor
## 100 5.7 2.8 4.1 1.3 versicolor
## 101 6.3 3.3 6.0 2.5 virginica
## 102 5.8 2.7 5.1 1.9 virginica
## 105 6.5 3.0 5.8 2.2 virginica
## 106 7.6 3.0 6.6 2.1 virginica
## 107 4.9 2.5 4.5 1.7 virginica
## 108 7.3 2.9 6.3 1.8 virginica
## 109 6.7 2.5 5.8 1.8 virginica
## 110 7.2 3.6 6.1 2.5 virginica
## 111 6.5 3.2 5.1 2.0 virginica
## 113 6.8 3.0 5.5 2.1 virginica
## 114 5.7 2.5 5.0 2.0 virginica
## 116 6.4 3.2 5.3 2.3 virginica
## 117 6.5 3.0 5.5 1.8 virginica
## 118 7.7 3.8 6.7 2.2 virginica
## 119 7.7 2.6 6.9 2.3 virginica
## 121 6.9 3.2 5.7 2.3 virginica
## 122 5.6 2.8 4.9 2.0 virginica
## 123 7.7 2.8 6.7 2.0 virginica
## 124 6.3 2.7 4.9 1.8 virginica
## 125 6.7 3.3 5.7 2.1 virginica
## 127 6.2 2.8 4.8 1.8 virginica
## 128 6.1 3.0 4.9 1.8 virginica
## 130 7.2 3.0 5.8 1.6 virginica
## 131 7.4 2.8 6.1 1.9 virginica
## 132 7.9 3.8 6.4 2.0 virginica
## 134 6.3 2.8 5.1 1.5 virginica
## 137 6.3 3.4 5.6 2.4 virginica
## 138 6.4 3.1 5.5 1.8 virginica
## 139 6.0 3.0 4.8 1.8 virginica
## 140 6.9 3.1 5.4 2.1 virginica
## 141 6.7 3.1 5.6 2.4 virginica
## 142 6.9 3.1 5.1 2.3 virginica
## 143 5.8 2.7 5.1 1.9 virginica
## 144 6.8 3.2 5.9 2.3 virginica
## 145 6.7 3.3 5.7 2.5 virginica
## 146 6.7 3.0 5.2 2.3 virginica
## 147 6.3 2.5 5.0 1.9 virginica
## 148 6.5 3.0 5.2 2.0 virginica
## 149 6.2 3.4 5.4 2.3 virginica
## 150 5.9 3.0 5.1 1.8 virginica
test_df
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
## 9 4.4 2.9 1.4 0.2 setosa
## 16 5.7 4.4 1.5 0.4 setosa
## 18 5.1 3.5 1.4 0.3 setosa
## 23 4.6 3.6 1.0 0.2 setosa
## 25 4.8 3.4 1.9 0.2 setosa
## 27 5.0 3.4 1.6 0.4 setosa
## 31 4.8 3.1 1.6 0.2 setosa
## 52 6.4 3.2 4.5 1.5 versicolor
## 54 5.5 2.3 4.0 1.3 versicolor
## 59 6.6 2.9 4.6 1.3 versicolor
## 61 5.0 2.0 3.5 1.0 versicolor
## 66 6.7 3.1 4.4 1.4 versicolor
## 79 6.0 2.9 4.5 1.5 versicolor
## 82 5.5 2.4 3.7 1.0 versicolor
## 86 6.0 3.4 4.5 1.6 versicolor
## 95 5.6 2.7 4.2 1.3 versicolor
## 98 6.2 2.9 4.3 1.3 versicolor
## 103 7.1 3.0 5.9 2.1 virginica
## 104 6.3 2.9 5.6 1.8 virginica
## 112 6.4 2.7 5.3 1.9 virginica
## 115 5.8 2.8 5.1 2.4 virginica
## 120 6.0 2.2 5.0 1.5 virginica
## 126 7.2 3.2 6.0 1.8 virginica
## 129 6.4 2.8 5.6 2.1 virginica
## 133 6.4 2.8 5.6 2.2 virginica
## 135 6.1 2.6 5.6 1.4 virginica
## 136 7.7 3.0 6.1 2.3 virginica
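The split above is stratified by Species, so each class should keep its one-third share in both parts. A minimal sketch of how that could be checked, repeating the same `createDataPartition()` call so the snippet runs on its own (the checks themselves are an addition, not part of the original analysis):

```r
# Sketch: confirm that the 80/20 split keeps the class balance.
library(caret)

set.seed(55)
id <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_df <- iris[id, ]
test_df  <- iris[-id, ]

# Row counts per species in each part
print(table(train_df$Species))
print(table(test_df$Species))

# Class proportions; each should be one third in both parts
print(prop.table(table(train_df$Species)))
print(prop.table(table(test_df$Species)))

# No row may appear in both the training and the test set
overlap <- intersect(rownames(train_df), rownames(test_df))
print(length(overlap))
```

Checking the proportions matters more on imbalanced data, where a purely random split can leave a rare class underrepresented in the test set.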
set.seed(55)
# set control methods
control_1 <- trainControl(method = "boot",
                          number = 5,
                          verboseIter = TRUE)
control_2 <- trainControl(method = "cv",
                          number = 5,
                          verboseIter = TRUE)
# train rpart model with boot
rpart_boot <- train(
  Species ~ .,
  data = train_df,
  method = "rpart",
  metric = "Accuracy",
  trControl = control_1
)
## + Resample1: cp=0
## - Resample1: cp=0
## + Resample2: cp=0
## - Resample2: cp=0
## + Resample3: cp=0
## - Resample3: cp=0
## + Resample4: cp=0
## - Resample4: cp=0
## + Resample5: cp=0
## - Resample5: cp=0
## Aggregating results
## Selecting tuning parameters
## Fitting cp = 0 on full training set
# train rpart model with cv
rpart_cv <- train(
  Species ~ .,
  data = train_df,
  method = "rpart",
  metric = "Accuracy",
  trControl = control_2
)
## + Fold1: cp=0
## - Fold1: cp=0
## + Fold2: cp=0
## - Fold2: cp=0
## + Fold3: cp=0
## - Fold3: cp=0
## + Fold4: cp=0
## - Fold4: cp=0
## + Fold5: cp=0
## - Fold5: cp=0
## Aggregating results
## Selecting tuning parameters
## Fitting cp = 0 on full training set
# train knn model with boot
knn_boot <- train(
  Species ~ .,
  data = train_df,
  method = "knn",
  metric = "Accuracy",
  trControl = control_1
)
## + Resample1: k=5
## - Resample1: k=5
## + Resample1: k=7
## - Resample1: k=7
## + Resample1: k=9
## - Resample1: k=9
## + Resample2: k=5
## - Resample2: k=5
## + Resample2: k=7
## - Resample2: k=7
## + Resample2: k=9
## - Resample2: k=9
## + Resample3: k=5
## - Resample3: k=5
## + Resample3: k=7
## - Resample3: k=7
## + Resample3: k=9
## - Resample3: k=9
## + Resample4: k=5
## - Resample4: k=5
## + Resample4: k=7
## - Resample4: k=7
## + Resample4: k=9
## - Resample4: k=9
## + Resample5: k=5
## - Resample5: k=5
## + Resample5: k=7
## - Resample5: k=7
## + Resample5: k=9
## - Resample5: k=9
## Aggregating results
## Selecting tuning parameters
## Fitting k = 5 on full training set
# train knn model with cv
knn_cv <- train(
  Species ~ .,
  data = train_df,
  method = "knn",
  metric = "Accuracy",
  trControl = control_2
)
## + Fold1: k=5
## - Fold1: k=5
## + Fold1: k=7
## - Fold1: k=7
## + Fold1: k=9
## - Fold1: k=9
## + Fold2: k=5
## - Fold2: k=5
## + Fold2: k=7
## - Fold2: k=7
## + Fold2: k=9
## - Fold2: k=9
## + Fold3: k=5
## - Fold3: k=5
## + Fold3: k=7
## - Fold3: k=7
## + Fold3: k=9
## - Fold3: k=9
## + Fold4: k=5
## - Fold4: k=5
## + Fold4: k=7
## - Fold4: k=7
## + Fold4: k=9
## - Fold4: k=9
## + Fold5: k=5
## - Fold5: k=5
## + Fold5: k=7
## - Fold5: k=7
## + Fold5: k=9
## - Fold5: k=9
## Aggregating results
## Selecting tuning parameters
## Fitting k = 5 on full training set
rpart_boot
## CART
##
## 120 samples
## 4 predictor
## 3 classes: 'setosa', 'versicolor', 'virginica'
##
## No pre-processing
## Resampling: Bootstrapped (5 reps)
## Summary of sample sizes: 120, 120, 120, 120, 120
## Resampling results across tuning parameters:
##
## cp Accuracy Kappa
## 0.00 0.9333581 0.8992098
## 0.45 0.6180129 0.4404429
## 0.50 0.6180129 0.4404429
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was cp = 0.
rpart_cv
## CART
##
## 120 samples
## 4 predictor
## 3 classes: 'setosa', 'versicolor', 'virginica'
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 96, 96, 96, 96, 96
## Resampling results across tuning parameters:
##
## cp Accuracy Kappa
## 0.00 0.9583333 0.9375
## 0.45 0.8916667 0.8375
## 0.50 0.3333333 0.0000
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was cp = 0.
knn_boot
## k-Nearest Neighbors
##
## 120 samples
## 4 predictor
## 3 classes: 'setosa', 'versicolor', 'virginica'
##
## No pre-processing
## Resampling: Bootstrapped (5 reps)
## Summary of sample sizes: 120, 120, 120, 120, 120
## Resampling results across tuning parameters:
##
## k Accuracy Kappa
## 5 0.9433118 0.9118366
## 7 0.9348823 0.8992411
## 9 0.9379989 0.9044210
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was k = 5.
knn_cv
## k-Nearest Neighbors
##
## 120 samples
## 4 predictor
## 3 classes: 'setosa', 'versicolor', 'virginica'
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 96, 96, 96, 96, 96
## Resampling results across tuning parameters:
##
## k Accuracy Kappa
## 5 0.9666667 0.9500
## 7 0.9583333 0.9375
## 9 0.9583333 0.9375
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was k = 5.
p_rpart_boot <- predict(rpart_boot, newdata = test_df)
p_rpart_cv <- predict(rpart_cv, newdata = test_df)
p_knn_boot <- predict(knn_boot, newdata = test_df)
p_knn_cv <- predict(knn_cv, newdata = test_df)
table(p_rpart_boot)
## p_rpart_boot
## setosa versicolor virginica
## 10 12 8
table(p_rpart_cv)
## p_rpart_cv
## setosa versicolor virginica
## 10 12 8
table(p_knn_boot)
## p_knn_boot
## setosa versicolor virginica
## 10 11 9
table(p_knn_cv)
## p_knn_cv
## setosa versicolor virginica
## 10 11 9
mean(p_rpart_boot == test_df$Species)
## [1] 0.9333333
mean(p_rpart_cv == test_df$Species)
## [1] 0.9333333
mean(p_knn_boot == test_df$Species)
## [1] 0.9666667
mean(p_knn_cv == test_df$Species)
## [1] 0.9666667
The k-NN model tuned with cross-validation gives the highest resampled accuracy (0.967) and, together with the bootstrapped k-NN model, the highest test-set accuracy (0.967). All four models, however, reach an accuracy above 0.9 in both training and testing.
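A single accuracy figure hides which species are being confused with each other. The sketch below shows one way to build a confusion matrix for a k-NN classifier on a held-out set. To stay self-contained it makes two assumptions that differ from the analysis above: the split is drawn with base R (`train_idx` is a hypothetical name), and `class::knn()` with k = 5 stands in for the caret-trained models, so the numbers it prints need not match the ones reported here.

```r
# Sketch: confusion matrix and per-class recall for k-NN on iris.
# Assumption: class::knn() with k = 5 replaces the caret-trained knn models.
library(class)

set.seed(55)

# Hypothetical stratified 80/20 split in base R: 40 training rows per species
train_idx <- unlist(lapply(split(seq_len(nrow(iris)), iris$Species),
                           function(rows) sample(rows, 40)))
train_df <- iris[train_idx, ]
test_df  <- iris[-train_idx, ]

# Predict the species of the held-out rows from the four measurements
pred <- knn(train = train_df[, 1:4],
            test  = test_df[, 1:4],
            cl    = train_df$Species,
            k     = 5)

# Rows are predicted classes, columns are actual classes
conf_mat <- table(Predicted = pred, Actual = test_df$Species)
print(conf_mat)

# Overall accuracy: share of counts on the diagonal
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(accuracy)

# Per-class recall: correct predictions divided by actual class totals
recall <- diag(conf_mat) / colSums(conf_mat)
print(recall)
```

With the objects from the analysis already in memory, the same table could be obtained more directly with `table(Predicted = p_knn_cv, Actual = test_df$Species)`.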
This analysis uses R's built-in iris data set, which is already clean. Real-world data may contain missing values that need to be cleaned and features on very different scales that need to be normalized. The sample should also be as large as budget, time, and other constraints allow.
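As a rough illustration of the cleaning and normalizing steps mentioned above, the sketch below injects a few missing values into a copy of iris and then handles them in base R. The data frame `messy` and the injected NAs are hypothetical and exist only for demonstration; median imputation and z-score scaling are just two of several reasonable choices.

```r
# Sketch: handle missing values and rescale features before modelling.
# Assumption: `messy` is a hypothetical copy of iris with NAs injected.
set.seed(55)
messy <- iris
messy$Sepal.Length[sample(nrow(messy), 5)] <- NA
messy$Petal.Width[sample(nrow(messy), 5)]  <- NA

# Count missing values per column
print(colSums(is.na(messy)))

# Option 1: drop every row that has at least one missing value
dropped <- na.omit(messy)
print(nrow(dropped))

# Option 2: replace NAs in each numeric column with the column median
num_cols <- sapply(messy, is.numeric)
imputed <- messy
imputed[num_cols] <- lapply(imputed[num_cols], function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  x
})

# Standardize each numeric column to mean 0 and standard deviation 1
scaled <- imputed
scaled[num_cols] <- as.data.frame(scale(imputed[num_cols]))
print(summary(scaled))
```

In a real workflow the medians, means, and standard deviations would normally be computed on the training set only and then applied to the test set, so that no information leaks from the test data. caret's `train()` also accepts a `preProcess` argument (for example `c("center", "scale")`) that applies such steps inside each resample.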