Comparing prediction algorithms over the 'iris' dataset

João Martins
2016-06-03

Overview

A small application that compares how three different algorithms perform on a classification task.
The classification is performed on the iris dataset.
The application is hosted at https://joaotmartins.shinyapps.io/dataproducts/

Algorithms and Parameters

Users can choose between three different algorithms supported by the caret package:

Multinomial logistic regression - multinom
Random forests - rf
k-nearest neighbors - knn

Besides the algorithms, users can change:

The number of resampling iterations
The split of the iris dataset between training and testing sets

Output (1/2)

The output consists of a confusion matrix, the prediction accuracy, and a plot of the two most important predictors.

An example using random forests, with 5 resampling iterations and a 60/40 training/testing split, yields the following:

            Actual
Prediction   setosa versicolor virginica
  setosa         20          0         0
  versicolor      0         19         3
  virginica       0          1        17

 Accuracy 
0.9333333
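
The accuracy figure can be checked by hand from the confusion matrix: it is the number of correct predictions (the diagonal) divided by the total number of test rows. The sketch below is only an arithmetic check written in Python with the standard library. It is not the application's code, and the variable names are illustrative. It assumes the standard 150-row iris dataset, so a 40% testing share should give 60 rows.

```python
# Confusion matrix from the example above.
# Rows are predictions, columns are actual classes.
classes = ["setosa", "versicolor", "virginica"]
confusion = [
    [20, 0, 0],   # predicted setosa
    [0, 19, 3],   # predicted versicolor
    [0, 1, 17],   # predicted virginica
]

# Total number of rows in the testing set.
total = sum(sum(row) for row in confusion)

# Correct predictions sit on the diagonal (prediction == actual).
correct = sum(confusion[i][i] for i in range(len(classes)))

accuracy = correct / total

print(f"test rows: {total}")          # 60, i.e. 40% of 150
print(f"correct:   {correct}")        # 56
print(f"accuracy:  {accuracy:.7f}")   # 0.9333333
```

The four misclassified rows are all confusions between versicolor and virginica; setosa is separated perfectly in this example.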

Output (2/2)

[Figure: plot of the two most important predictors]