February 13, 2017
About the Iris Data Set
* The Iris Data Set is a multivariate data set used by R. A. Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems" as an example of linear discriminant analysis.
* The data was collected by Edgar Anderson in 1935 to quantify the morphologic variation of Iris flowers of three related species.
* Two of the three species were collected on the Gaspé Peninsula, Canada, from the same pasture, picked on the same day, and measured at the same time by the same person using the same equipment.
* The Iris data set contains 150 observations and 5 variables (one categorical and four numeric) from three iris species: setosa, versicolor, and virginica.
* There are 50 observations from each of the three iris species, measuring sepal length, sepal width, petal length and petal width, all numeric values in centimeters.
* There is no missing data.
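
The structural claims above (150 observations, 5 variables, 50 per species, no missing values) can be checked against the copy of the data that ships with R. This is a minimal sketch, assuming the built-in `iris` data frame from the `datasets` package is the same data the slide refers to:

```r
# Load the built-in iris data frame (part of R's datasets package)
data(iris)

# Number of rows (observations) and columns (variables)
dim(iris)

# Column types: four numeric measurements and one factor (Species)
str(iris)

# Number of observations per species
table(iris$Species)

# TRUE if any value in the data frame is missing
anyNA(iris)
```

On the built-in data, `dim(iris)` reports 150 rows and 5 columns, `table(iris$Species)` shows 50 rows for each species, and `anyNA(iris)` returns `FALSE`.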
##              vars   n mean   sd median trimmed  mad min max range  skew
## Sepal.Length    1 120 5.81 0.80   5.70    5.79 0.89 4.3 7.9   3.6  0.26
## Sepal.Width     2 120 3.04 0.43   3.00    3.03 0.37 2.0 4.2   2.2  0.22
## Petal.Length    3 120 3.73 1.74   4.35    3.75 1.70 1.0 6.9   5.9 -0.32
## Petal.Width     4 120 1.20 0.78   1.30    1.19 1.04 0.1 2.5   2.4 -0.09
## Species*        5 120 2.00 0.82   2.00    2.00 1.48 1.0 3.0   2.0  0.00
##              kurtosis   se
## Sepal.Length    -0.66 0.07
## Sepal.Width     -0.03 0.04
## Petal.Length    -1.45 0.16
## Petal.Width     -1.38 0.07
## Species*        -1.52 0.07
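
The column layout of the summary above matches the output of `describe()` from the `psych` package, and it reports n = 120 for every variable. The sketch below shows one way a summary in that format could be produced. It assumes `psych` is installed, and the 120-row random sample and the seed are hypothetical choices made only for illustration, so the numbers it prints will not necessarily match the table.

```r
# Assumes the psych package is installed: install.packages("psych")
library(psych)

data(iris)

# Hypothetical subset: 120 of the 150 rows drawn at random
# (the seed is arbitrary and not taken from the original slides)
set.seed(1)
subset_rows <- sample(nrow(iris), 120)
iris_subset <- iris[subset_rows, ]

# describe() reports n, mean, sd, median, trimmed mean, mad, min, max,
# range, skew, kurtosis, and se for each column. Factor columns such as
# Species are converted to integer codes and marked with an asterisk.
summary_stats <- describe(iris_subset)
print(summary_stats)
```

Because `Species` is a factor, its row (shown as `Species*`) summarizes the integer codes 1 to 3 rather than a measurement, so its mean and standard deviation are not meaningful on their own.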