4/28/2018

About Iris Dataset Visualizer

This is a visualizer application developed with Shiny in RStudio. It lets the user create a scatterplot of two chosen variables from the iris dataset. Features of the app include:

  • Build a scatterplot of two numeric variables from the iris dataset, chosen with the selectInput widget.
  • Select and change the number of samples with the sliderInput widget.
  • Fit a generalized additive model (GAM) to the data for a specific species using the checkboxInput widget.
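
The three features above can be sketched as a small Shiny app. This is a minimal illustration, not the application's actual source: the input IDs (xvar, yvar, n, show_gam), the helper fit_species_gam, the slider range, the basis dimension k = 4, the use of base graphics, and the use of the mgcv package for the GAM are all assumptions made for the example.

```r
library(shiny)
library(mgcv)  # assumed GAM implementation; ships with R as a recommended package

# The four numeric columns of iris; Species is a factor and is excluded.
numeric_vars <- names(iris)[sapply(iris, is.numeric)]

# Hypothetical helper: fit y ~ s(x) for one species, or return NULL when the
# subset has too few distinct x values for a smooth with basis dimension k.
fit_species_gam <- function(d, xvar, yvar, k = 4) {
  if (length(unique(d[[xvar]])) <= k) return(NULL)
  f <- as.formula(sprintf("%s ~ s(%s, k = %d)", yvar, xvar, k))
  gam(f, data = d)
}

ui <- fluidPage(
  titlePanel("Iris Dataset Visualizer"),
  sidebarLayout(
    sidebarPanel(
      selectInput("xvar", "X variable", choices = numeric_vars,
                  selected = numeric_vars[1]),
      selectInput("yvar", "Y variable", choices = numeric_vars,
                  selected = numeric_vars[2]),
      sliderInput("n", "Number of samples", min = 30, max = nrow(iris),
                  value = nrow(iris)),
      checkboxInput("show_gam", "Show GAM fit per species", value = FALSE)
    ),
    mainPanel(plotOutput("scatter"))
  )
)

server <- function(input, output) {
  # Random subset of rows; re-drawn whenever the slider value changes.
  sampled <- reactive(iris[sample(nrow(iris), input$n), ])

  output$scatter <- renderPlot({
    d <- sampled()
    plot(d[[input$xvar]], d[[input$yvar]], col = as.integer(d$Species),
         pch = 19, xlab = input$xvar, ylab = input$yvar)
    legend("topleft", legend = levels(d$Species), col = 1:3, pch = 19)

    if (input$show_gam) {
      for (i in seq_along(levels(d$Species))) {
        sub <- d[d$Species == levels(d$Species)[i], ]
        fit <- fit_species_gam(sub, input$xvar, input$yvar)
        if (is.null(fit)) next  # too few distinct x values in this subset
        # Evaluate the fitted smooth on an evenly spaced grid of x values.
        grid <- data.frame(seq(min(sub[[input$xvar]]), max(sub[[input$xvar]]),
                               length.out = 100))
        names(grid) <- input$xvar
        lines(grid[[1]], as.numeric(predict(fit, newdata = grid)),
              col = i, lwd = 2)
      }
    }
  })
}

app <- shinyApp(ui = ui, server = server)
if (interactive()) runApp(app)  # launches the app in an interactive session
```

Keeping the model fit in a plain function outside the server makes it possible to check the GAM step on its own, without starting the app.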

Iris Dataset

The iris dataset is a data frame with 150 observations of 5 variables. It is a multivariate dataset introduced by the British statistician and biologist Ronald Fisher in his 1936 paper. The dataset consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample, in centimeters: the length and the width of the sepals and petals. The fifth variable records the species.

str(iris)
## 'data.frame':    150 obs. of  5 variables:
##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
##  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
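
The balance described above (50 samples per species) can be checked directly. The short sketch below uses only base R; the choice of per-species means as a summary is an illustration, not something the app itself is stated to compute.

```r
# Count observations per species; each of the three species has 50 rows.
species_counts <- table(iris$Species)
species_counts

# Mean of each of the four numeric measurements, by species.
species_means <- aggregate(. ~ Species, data = iris, FUN = mean)
species_means
```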

Generalized Additive Model (GAM)

A generalized additive model (GAM) is a generalized linear model in which the linear predictor depends linearly on unknown smooth functions of some predictor variables, and interest focuses on inference about these smooth functions. GAMs were originally developed by Trevor Hastie and Robert Tibshirani to blend properties of generalized linear models with additive models.

We chose this model because it gave the best fit for most combinations of the numeric variables in the iris dataset.
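
As a rough illustration of fitting such a model to one species, the sketch below uses the mgcv package, which is an assumption here since the text does not name the GAM implementation the app relies on. The choice of species, the variable pair, and the basis dimension k = 5 are also picked only for the example.

```r
library(mgcv)  # assumed GAM implementation; ships with R as a recommended package

# Assumed example: versicolor only, petal length as a function of sepal length.
versicolor <- subset(iris, Species == "versicolor")

# s() requests a smooth function of the predictor; k caps its basis dimension.
gam_fit <- gam(Petal.Length ~ s(Sepal.Length, k = 5), data = versicolor)

# Straight-line baseline for comparison.
lm_fit <- lm(Petal.Length ~ Sepal.Length, data = versicolor)

summary(gam_fit)      # effective degrees of freedom, deviance explained
AIC(gam_fit, lm_fit)  # lower AIC suggests a better fit/complexity trade-off

# Fitted curve on an evenly spaced grid of sepal lengths.
grid <- data.frame(Sepal.Length = seq(min(versicolor$Sepal.Length),
                                      max(versicolor$Sepal.Length),
                                      length.out = 50))
grid$Petal.Length <- as.numeric(predict(gam_fit, newdata = grid))
```

Comparing the GAM against a linear baseline in this way is one possible check of whether the smooth term is worth its extra flexibility for a given pair of variables.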

Snapshot and link to the Application