Iris Species Classification

Saurabh Gupta

10/10/2019

Introduction

This presentation is part of the course project submission for the Developing Data Products course on Coursera.org.

It is an R Presentation generated with RStudio.

The Shiny application pitched by this presentation is at https://bodhi1606.shinyapps.io/Project/

The Shiny app source code is available at https://github.com/bodhi1606/Developing-Data-Products.

About

The Iris flower data set, also known as Fisher’s Iris data set, is a multivariate data set introduced by the British statistician and biologist Ronald Fisher in his 1936 paper. It is one of the most widely used data sets for learning machine learning and statistics. It consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured on each sample: the length and width of the sepals and of the petals, in centimetres. The fifth column records the species of the flower.
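As a quick check of the structure described above, the following sketch inspects the `iris` data frame that ships with base R (in the `datasets` package). It assumes only a standard R installation.

```r
# Load the built-in iris data frame (part of the base R 'datasets' package).
data(iris)

dim(iris)            # number of rows and columns: 150 samples, 5 columns
names(iris)          # four measurements followed by Species
table(iris$Species)  # sample count per species: 50 each
```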

Objective: to predict the species of a flower from its sepal length, sepal width, petal length and petal width.

Prediction using Random Forest

The iris dataset was used to train a random forest prediction model:

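library(caret)  # provides trainControl() and train()

# 5-fold cross-validation for resampling
fitControl <- trainControl(method = "cv",
                           number = 5)
# Random forest on all four measurements (method = "rf" uses the randomForest package)
fitRF <- train(Species ~ .,
               data = iris,
               method = "rf",
               trControl = fitControl)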
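Once such a model is fitted, it can classify a new flower. The sketch below is one possible way to do that, not necessarily how the app does it. It assumes the `caret` and `randomForest` packages are installed; the `newFlower` data frame and its values are hypothetical, and `set.seed()` is added only so that the resampling is repeatable.

```r
library(caret)  # train(), trainControl(); method = "rf" also needs randomForest

set.seed(123)   # assumption: fixed seed so cross-validation folds are repeatable

fitControl <- trainControl(method = "cv", number = 5)
fitRF <- train(Species ~ ., data = iris, method = "rf", trControl = fitControl)

# Hypothetical measurements for one flower, in centimetres.
# Column names must match the predictors in the iris data frame.
newFlower <- data.frame(Sepal.Length = 5.1,
                        Sepal.Width  = 3.5,
                        Petal.Length = 1.4,
                        Petal.Width  = 0.2)

# Predicted species, returned as a factor with the three iris levels.
pred <- predict(fitRF, newdata = newFlower)
print(pred)

# Cross-validated accuracy for each candidate value of the tuning parameter mtry.
print(fitRF$results)
```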

The classification is based on sepal and petal widths and lengths. You set these values with sliders, and density plots of the iris dataset show where your values fall relative to the values in the dataset.
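The visual feedback described above can be approximated with base R graphics. This is a minimal sketch, not the app's actual plotting code: the variable `userSepalLength` is a hypothetical stand-in for a slider value, and only one of the four measurements is shown.

```r
# Hypothetical slider value for sepal length, in centimetres.
userSepalLength <- 5.8

# Kernel density estimate of sepal length across all flowers in the dataset.
dens <- density(iris$Sepal.Length)

# Draw the density curve and mark the chosen value with a dashed vertical line.
plot(dens, main = "Sepal length", xlab = "Sepal length (cm)")
abline(v = userSepalLength, col = "red", lty = 2)

# Share of flowers in the dataset whose sepal length is at or below the chosen value.
shareBelow <- mean(iris$Sepal.Length <= userSepalLength)
print(shareBelow)
```

Repeating the same steps for `Sepal.Width`, `Petal.Length` and `Petal.Width` would give one plot per slider.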

Try the App

Use the Shiny app at https://bodhi1606.shinyapps.io/Project/

Get the app source code at https://github.com/bodhi1606/Developing-Data-Products.