Intro
In this exercise we explore the iris dataset, used in a few previous modules. It contains measurements of petal and sepal dimensions for three species of iris flowers: setosa, versicolor, and virginica.
Loading the Data
First, we load the data and inspect its structure:
iris <- read.csv("iris.csv")
str(iris)
'data.frame': 150 obs. of 6 variables:
$ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
$ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
$ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
$ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
$ Species : chr "setosa" "setosa" "setosa" "setosa" ...
$ Code : int 1 1 1 1 1 1 1 1 1 1 ...
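The str output above shows Species as chr, because read.csv in R 4.0 and later does not convert strings to factors by default. The following is a minimal sketch of reading the column as a factor at load time. It assumes the built-in iris data frame as a stand-in for iris.csv (so there is no Code column), and the names tmp, iris_chr and iris_fct are hypothetical.

```r
# Assumption: the built-in `iris` data frame stands in for iris.csv,
# so this sketch runs without the course file (and has no Code column).
tmp <- tempfile(fileext = ".csv")
write.csv(datasets::iris, tmp, row.names = FALSE)

# Default read: Species comes back as character in R >= 4.0
iris_chr <- read.csv(tmp)

# Ask read.csv to convert string columns to factors while loading
iris_fct <- read.csv(tmp, stringsAsFactors = TRUE)

class(iris_chr$Species)
levels(iris_fct$Species)

unlink(tmp)  # remove the temporary file
```

Either form works for the rest of the exercise; reading Species as chr simply means it has to be converted with as.factor wherever a factor is needed.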
Basic Plot
Next, we make a scatterplot with petal length on the x-axis and petal width on the y-axis:
plot(iris$Petal.Length, iris$Petal.Width,
     xlab = 'Petal Length', ylab = 'Petal Width',
     main = 'Petal Length vs Petal Width')
Plot by Species
Here we color the points by species to show how the three groups differ:
plot(iris$Petal.Length, iris$Petal.Width,
     col = c('purple', 'darkorange', 'blue')[as.integer(as.factor(iris$Species))],
     pch = 16,
     xlab = 'Petal Length', ylab = 'Petal Width',
     main = 'Petal Length vs Width by Species')
legend('topleft', legend = c('setosa', 'versicolor', 'virginica'),
       col = c('purple', 'darkorange', 'blue'), pch = 16)
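The col argument in the plot call relies on an indexing trick that is easy to miss: as.factor sorts the species names alphabetically, as.integer turns each label into its level number, and that number picks a color from the vector. The small sketch below illustrates this on a made-up vector of four labels; species, palette_cols, codes and point_cols are hypothetical names used only for this illustration.

```r
# Hypothetical mini example: four labels, deliberately out of order
species <- c("virginica", "setosa", "versicolor", "setosa")
palette_cols <- c("purple", "darkorange", "blue")

# Factor levels are sorted alphabetically:
# setosa = 1, versicolor = 2, virginica = 3
codes <- as.integer(as.factor(species))

# Use the level numbers to index into the color vector, one color per point
point_cols <- palette_cols[codes]

codes
point_cols
```

Because the levels are alphabetical, the legend must list the species in the same order (setosa, versicolor, virginica) for its colors to match the points.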
Summary Statistics
Now, we look at the mean petal length for each species:
tapply(iris$Petal.Length, iris$Species, mean)
setosa versicolor virginica
1.462 4.260 5.552
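The same tapply pattern extends to other summary functions. As a rough sketch, the block below computes the mean and the standard deviation of petal length per species and collects them in one table. It assumes the built-in iris data frame as a stand-in for iris.csv, and petal_means, petal_sds and summary_tbl are hypothetical names.

```r
# Assumption: the built-in `iris` data frame stands in for iris.csv
dat <- datasets::iris

# One summary value per species for each statistic
petal_means <- tapply(dat$Petal.Length, dat$Species, mean)
petal_sds   <- tapply(dat$Petal.Length, dat$Species, sd)

# Combine both statistics into a single table, one row per species
summary_tbl <- data.frame(
  Species = names(petal_means),
  mean    = as.vector(petal_means),
  sd      = as.vector(petal_sds)
)

summary_tbl
```

Putting the spread next to the mean gives a sense of how much the species overlap, not just where their centers lie.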