iris - Prediction using RandomForest

khsarma
2-August-2018

Shiny App creation

The iris Shiny app uses a sidebar layout with the following inputs:

  • Data partition slider - A slider input ranging from 0.5 to 0.95, with a default of 0.7. It sets the proportion of the data set used for training; the remainder is held out for testing.
  • Number of Trees selection - The user selects the number of trees for the Random Forest algorithm.
  • Predict Values button - The model is refit and the predictions are recomputed only when this button is clicked, rather than reactively on every input change.
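
The three inputs above could be wired up roughly as follows. This is a minimal sketch rather than the app's actual source: the input IDs (partition, ntree, predict), the slider step, the tree-count limits, and the output name confusion are all assumptions made for illustration.

```r
library(shiny)
library(caret)         # createDataPartition()
library(randomForest)  # randomForest()

# Input IDs, the slider step and the tree-count limits are assumptions
# made for this sketch; the original app may differ.
ui <- fluidPage(
  titlePanel("iris - Prediction using RandomForest"),
  sidebarLayout(
    sidebarPanel(
      sliderInput("partition", "Data partition",
                  min = 0.5, max = 0.95, value = 0.7, step = 0.05),
      numericInput("ntree", "Number of trees",
                   value = 500, min = 1, step = 1),
      actionButton("predict", "Predict Values")
    ),
    mainPanel(
      verbatimTextOutput("confusion")
    )
  )
)

server <- function(input, output) {
  # eventReactive() re-runs only when the button is clicked; moving the
  # slider or changing the tree count alone does not trigger a refit.
  result <- eventReactive(input$predict, {
    inTrain  <- createDataPartition(y = iris$Species,
                                    p = input$partition, list = FALSE)
    training <- iris[inTrain, ]
    testing  <- iris[-inTrain, ]
    modFit   <- randomForest(Species ~ ., data = training,
                             ntree = input$ntree)
    table(pred = predict(modFit, testing), actual = testing$Species)
  })
  output$confusion <- renderPrint(result())
}

# Assigning the app object avoids launching it on load;
# start it explicitly with runApp(app).
app <- shinyApp(ui, server)
```

Wrapping the model fit in eventReactive() is one common way to get the behaviour described for the Predict Values button; observeEvent() with a reactive value would be a reasonable alternative.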

Random Forest model creation

Suppose the user provides the following inputs: Data partition = 0.7 and Number of trees = 500. The data set is split into training and testing sets, and the model is fit on the training set as follows:

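library(caret)         # createDataPartition()
library(randomForest)  # randomForest()

# Stratified 70/30 split of iris by Species
inTrain <- createDataPartition(y=iris$Species,
                               p=0.7, list=FALSE)
training <- iris[inTrain,]
testing <- iris[-inTrain,]
# ntree sets the number of trees to grow
modFit <- randomForest(Species~ .,data=training,ntree=500)
modFit

Call:
 randomForest(formula = Species ~ ., data = training, ntree = 500) 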
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 2

        OOB estimate of  error rate: 3.81%
Confusion matrix:
           setosa versicolor virginica class.error
setosa         35          0         0  0.00000000
versicolor      0         33         2  0.05714286
virginica       0          2        33  0.05714286
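
As a quick check on the printed summary, the out-of-bag (OOB) error rate and the per-class errors can be recomputed from the confusion matrix by hand. The sketch below copies the counts from the output above into a matrix; the variable names are illustrative only.

```r
# OOB confusion matrix copied from the printed model (rows = true class).
species <- c("setosa", "versicolor", "virginica")
oob <- matrix(c(35,  0,  0,
                 0, 33,  2,
                 0,  2, 33),
              nrow = 3, byrow = TRUE,
              dimnames = list(actual = species, predicted = species))

# Each species contributes 0.7 * 50 = 35 training rows (stratified split).
rowSums(oob)

# Per-class error: share of each row that falls off the diagonal.
class_error <- 1 - diag(oob) / rowSums(oob)

# Overall OOB error: misclassified training rows over all training rows.
oob_error <- 1 - sum(diag(oob)) / sum(oob)

round(class_error, 8)      # 0, 0.05714286, 0.05714286
round(100 * oob_error, 2)  # 3.81 (4 of 105 rows misclassified)
```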

Prediction

Now, let's predict the values for the testing set:

pred <- predict(modFit,testing) 
testing$predRight <- pred==testing$Species
table(pred,testing$Species)

pred         setosa versicolor virginica
  setosa         15          0         0
  versicolor      0         14         1
  virginica       0          1        14
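
The table can be reduced to a single hold-out accuracy figure and set beside the OOB estimate from the model summary. The following is a small sketch using the counts printed above; the variable names are illustrative only.

```r
# Test-set table copied from the output above
# (rows = predicted class, columns = true class).
species <- c("setosa", "versicolor", "virginica")
tab <- matrix(c(15,  0,  0,
                 0, 14,  1,
                 0,  1, 14),
              nrow = 3, byrow = TRUE,
              dimnames = list(pred = species, actual = species))

# Overall accuracy: correctly classified test rows over all test rows.
accuracy <- sum(diag(tab)) / sum(tab)

# Per-class recall: share of each true class that was predicted correctly.
recall <- diag(tab) / colSums(tab)

round(100 * accuracy, 2)        # 95.56 (43 of 45 rows correct)
round(100 * (1 - accuracy), 2)  # 4.44, close to the 3.81% OOB estimate
round(recall, 4)
```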

Plot Predicted values

The plot below shows each testing-set observation, coloured by whether its predicted species matches the true species (predRight):

library(ggplot2)  # qplot()
qplot(Petal.Width,Petal.Length,colour=predRight,data=testing,
      main="newdata Predictions")

[Figure: Petal.Length versus Petal.Width for the testing set, coloured by predRight, titled "newdata Predictions"]