Using the iris dataset:
library(dplyr)
library(caret)
iris %>% head
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
Create a binary response:
d <- iris %>%
  mutate(response = factor(ifelse(Species == 'versicolor', 'IsVersicolor', 'NotVersicolor'))) %>%
  select(-Species) %>%
  arrange(sample(nrow(.))) # Arrange rows randomly
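A quick sanity check on the new response can help before training. The sketch below rebuilds the same column (without the shuffle, which does not affect counts) and looks at the class balance and the factor level order; the name `counts` is just an illustrative choice.

```r
library(dplyr)

# Same binary response as above, built without the random row ordering
d <- iris %>%
  mutate(response = factor(ifelse(Species == 'versicolor', 'IsVersicolor', 'NotVersicolor'))) %>%
  select(-Species)

# Rows per class: iris has 50 flowers of each of its 3 species,
# so the split is 50 IsVersicolor vs 100 NotVersicolor
counts <- table(d$response)
counts

# Factor levels are sorted alphabetically; caret's confusionMatrix()
# treats the first level as the 'positive' class unless told otherwise
levels(d$response)
```

The 1:2 imbalance is worth keeping in mind when reading accuracy, since always predicting 'NotVersicolor' would already be right two thirds of the time.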
Train a model:
# Train a random forest with 10-fold CV and show the results
# * 'tuneLength = 3' makes caret try 3 candidate values of the tuning parameter (mtry for 'rf')
tc <- trainControl(method = 'cv', number = 10, savePredictions = TRUE, classProbs = TRUE)
m <- train(response ~ ., data = d, method = 'rf', tuneLength = 3, trControl = tc) # method could be 'gbm' instead
m
## Random Forest
##
## 150 samples
## 4 predictor
## 2 classes: 'IsVersicolor', 'NotVersicolor'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 135, 135, 135, 135, 135, 135, ...
## Resampling results across tuning parameters:
##
## mtry Accuracy Kappa Accuracy SD Kappa SD
## 2 0.9466667 0.8826521 0.06126244 0.1341698
## 3 0.9533333 0.8956391 0.05488484 0.1233041
## 4 0.9533333 0.8956391 0.05488484 0.1233041
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was mtry = 3.
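If you would rather choose the candidate values yourself than rely on 'tuneLength', caret also accepts an explicit grid through 'tuneGrid'. A minimal sketch, assuming the same three mtry values as in the output above; the names `grid` and `m2` are illustrative, and the numbers will differ from run to run because the folds are random.

```r
library(caret)

# Rebuild the binary response so this block runs on its own
d <- iris
d$response <- factor(ifelse(d$Species == 'versicolor', 'IsVersicolor', 'NotVersicolor'))
d$Species <- NULL

# Explicit candidates for mtry (the only parameter caret tunes for method = 'rf')
grid <- expand.grid(mtry = 2:4)

tc2 <- trainControl(method = 'cv', number = 10)
m2 <- train(response ~ ., data = d, method = 'rf', tuneGrid = grid, trControl = tc2)

# Cross-validated performance per candidate, and the one that was selected
m2$results[, c('mtry', 'Accuracy', 'Kappa')]
m2$bestTune
```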
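Show accuracy and other stats. Two caveats: m$pred holds the hold-out predictions for all three mtry values, so the counts below sum to 450 (3 x 150) rather than 150; and the table is built as (observed, predicted) while confusionMatrix() reads rows as predictions, so Sensitivity/Pos Pred Value and Specificity/Neg Pred Value are swapped relative to their usual meaning: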
confusionMatrix(table(m$pred$obs, m$pred$pred))
## Confusion Matrix and Statistics
##
##
## IsVersicolor NotVersicolor
## IsVersicolor 141 9
## NotVersicolor 13 287
##
## Accuracy : 0.9511
## 95% CI : (0.9269, 0.9691)
## No Information Rate : 0.6578
## P-Value [Acc > NIR] : <2e-16
##
## Kappa : 0.8907
## Mcnemar's Test P-Value : 0.5224
##
## Sensitivity : 0.9156
## Specificity : 0.9696
## Pos Pred Value : 0.9400
## Neg Pred Value : 0.9567
## Prevalence : 0.3422
## Detection Rate : 0.3133
## Detection Prevalence : 0.3333
## Balanced Accuracy : 0.9426
##
## 'Positive' Class : IsVersicolor
##
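The matrix above totals 450 cases, i.e. the saved predictions of all three mtry candidates pooled together. A sketch of one way to report only the selected model: filter m$pred down to the rows for m$bestTune, then pass predictions and observations to confusionMatrix() by name so the orientation is unambiguous. The set.seed() call and the names `best` and `cm` are additions for illustration, and no output is shown because it depends on the random folds.

```r
library(caret)

set.seed(1)  # assumption: added only so this sketch is repeatable

# Rebuild the data and model so this block runs on its own
d <- iris
d$response <- factor(ifelse(d$Species == 'versicolor', 'IsVersicolor', 'NotVersicolor'))
d$Species <- NULL

tc <- trainControl(method = 'cv', number = 10, savePredictions = TRUE, classProbs = TRUE)
m <- train(response ~ ., data = d, method = 'rf', tuneLength = 3, trControl = tc)

# Keep only the hold-out predictions made with the selected mtry
best <- m$pred[m$pred$mtry == m$bestTune$mtry, ]
nrow(best)  # one hold-out prediction per sample

# data = predicted classes, reference = observed classes
cm <- confusionMatrix(data = best$pred, reference = best$obs, positive = 'IsVersicolor')
cm
```

With 10-fold CV each sample is held out exactly once, so the filtered table sums to the 150 rows of the data set.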