Comparing prediction algorithms over the 'iris' dataset

João Martins
2016-06-03

Overview

Algorithms and Parameters

Users can choose between three different algorithms supported by the caret package:

  • Multinomial logistic regression - multinom
  • Random forests - rf
  • K-Nearest-Neighbors - knn

Besides the algorithms, users can change:

  • The number of resampling iterations
  • The split between the training and testing sets of the iris dataset
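
The choices above map fairly directly onto caret calls. The following is a minimal sketch, not the app's actual code: the variable names (`method_name`, `n_resamples`, `train_share`), the seed, and the use of bootstrap resampling are assumptions made for illustration. It also assumes the randomForest package (for `rf`) or nnet (for `multinom`) is installed alongside caret.

```r
library(caret)

set.seed(2016)  # hypothetical seed, only to make the sketch reproducible

# Hypothetical user choices mirroring the options listed above
method_name <- "rf"   # one of "multinom", "rf", "knn"
n_resamples <- 5      # number of resampling iterations
train_share <- 0.6    # share of iris used for training (60/40 split)

# Stratified split on Species: each class keeps the same share in both sets
in_train <- createDataPartition(iris$Species, p = train_share, list = FALSE)
training <- iris[in_train, ]
testing  <- iris[-in_train, ]

# Assumption: bootstrap resampling; the source only says "resampling iterations"
ctrl <- trainControl(method = "boot", number = n_resamples)
fit  <- train(Species ~ ., data = training, method = method_name,
              trControl = ctrl)

# Evaluate on the held-out testing set
pred <- predict(fit, newdata = testing)
print(table(Prediction = pred, Actual = testing$Species))
print(mean(pred == testing$Species))
```

Changing `method_name` to `"multinom"` or `"knn"` switches the algorithm without touching the rest of the sketch, which is the main convenience of going through `train()`.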

Output (1/2)

The output consists of a confusion matrix, the prediction accuracy, and a plot over the two most important prediction variables.

An example using random forests, with 5 resampling iterations and a 60/40 training/testing split, yields the following:

            Actual
Prediction   setosa versicolor virginica
  setosa         20          0         0
  versicolor      0         19         3
  virginica       0          1        17
 Accuracy 
0.9333333 
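
The accuracy figure can be checked by hand from the confusion matrix: it is the sum of the diagonal (correct predictions) divided by the total number of test observations. The sketch below re-enters the matrix shown above in base R; the per-class recall is an extra illustration and is not part of the original output.

```r
# Confusion matrix copied from the output above:
# rows are predictions, columns are actual classes
species <- c("setosa", "versicolor", "virginica")
cm <- matrix(
  c(20,  0,  0,
     0, 19,  3,
     0,  1, 17),
  nrow = 3, byrow = TRUE,
  dimnames = list(Prediction = species, Actual = species)
)

correct  <- sum(diag(cm))     # 20 + 19 + 17 = 56
total    <- sum(cm)           # 60 observations in the testing set
accuracy <- correct / total   # 56 / 60
print(accuracy)               # 0.9333333

# Per-class recall: correct predictions over the actual count of each class
recall <- diag(cm) / colSums(cm)
print(recall)                 # setosa 1.00, versicolor 0.95, virginica 0.85
```

The 60 test observations are consistent with a 60/40 split of the 150 rows in iris.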

Output (2/2)

[Plot: predictions over the two most important prediction variables]
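
One way such a plot could be produced is to rank the predictors with caret's `varImp()` and plot the testing set over the top two. This is a sketch under assumptions, not the code behind the original figure: it uses a random forest with bootstrap resampling, a hypothetical seed, and base graphics, and the colour and symbol choices are illustrative only.

```r
library(caret)

set.seed(2016)  # hypothetical seed, only to make the sketch reproducible

# Same 60/40 stratified split and 5 resampling iterations as the example
in_train <- createDataPartition(iris$Species, p = 0.6, list = FALSE)
training <- iris[in_train, ]
testing  <- iris[-in_train, ]
fit <- train(Species ~ ., data = training, method = "rf",
             trControl = trainControl(method = "boot", number = 5))

# Rank predictors by importance and keep the names of the top two
imp  <- varImp(fit)$importance
top2 <- rownames(imp)[order(imp$Overall, decreasing = TRUE)][1:2]
print(top2)

# Plot the testing set: colour marks the actual species,
# a cross marks a misclassified observation
pred <- predict(fit, newdata = testing)
plot(testing[[top2[1]]], testing[[top2[2]]],
     col  = as.integer(testing$Species),
     pch  = ifelse(pred == testing$Species, 19, 4),
     xlab = top2[1], ylab = top2[2],
     main = "Testing set over the two most important variables")
legend("topleft", legend = levels(testing$Species), col = 1:3, pch = 19)
```

The `Overall` column used for ranking is what `varImp()` returns for a random forest classifier; other methods such as `knn` report importance per class, so the ranking step would need adjusting for them.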