“OpenIntro Statistics”

Exercise 1.7 Fisher’s irises.

Sir Ronald Aylmer Fisher was an English statistician, evolutionary biologist, and geneticist who worked on a data set that contained sepal length and width, and petal length and width from three species of iris flowers (setosa, versicolor and virginica). There were 50 flowers from each species in the data set.

  (a) How many cases were included in the data?
  (b) How many numerical variables are included in the data? Indicate what they are, and if they are continuous or discrete.
  (c) How many categorical variables are included in the data, and what are they? List the corresponding levels (categories).

Load library “openintro”

library(openintro)
## Please visit openintro.org for free statistics materials
## 
## Attaching package: 'openintro'
## The following objects are masked from 'package:datasets':
## 
##     cars, trees

Load data “iris”

data(iris)

Take a look at the data

iris

(a) How many cases were included in the data?

nrow(iris)
## [1] 150

We can calculate this from the information given: 3 species × 50 flowers per species = 150 cases.

OR

We can read it from the data: nrow(iris) reports 150 rows, and each row represents one case (one flower).
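
Both routes to 150 can be cross-checked against each other. The following is a minimal sketch using base R only; the names species_counts and n_cases are illustrative and not part of the exercise.

```r
# Count the flowers recorded for each species
species_counts <- table(iris$Species)
species_counts

# Number of species times 50 flowers per species
n_cases <- length(species_counts) * 50

# TRUE when the arithmetic agrees with the number of rows
n_cases == nrow(iris)
```

If every species shows a count of 50 and the final comparison is TRUE, the two answers agree.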

(b) How many numerical variables are included in the data? Indicate what they are, and if they are continuous or discrete.

str(iris)
## 'data.frame':    150 obs. of  5 variables:
##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
##  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

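There are 4 numerical variables: sepal length, sepal width, petal length and petal width. All four are continuous: they are physical measurements (in centimetres) that can take any value within a range, even though they are recorded to one decimal place.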
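
Instead of reading the count off the str() output by eye, the numerical columns can be picked out programmatically. This is a minimal sketch in base R; is_num is an illustrative name.

```r
# Flag each column that R stores as numeric
is_num <- sapply(iris, is.numeric)

# Names of the numerical variables
names(iris)[is_num]

# How many numerical variables there are
sum(is_num)
```

Note that is.numeric() only reports how R stores a column. Whether a variable is continuous or discrete still has to be judged from what it measures.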

(c) How many categorical variables are included in the data, and what are they? List the corresponding levels (categories).

levels(iris$Species)
## [1] "setosa"     "versicolor" "virginica"

There is 1 categorical variable: species. The corresponding levels are: setosa, versicolor, virginica.
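
The same kind of check works for the categorical variables. This is a minimal sketch in base R, assuming categorical variables are stored as factors, as Species is here; is_cat is an illustrative name.

```r
# Flag each column that R stores as a factor
is_cat <- sapply(iris, is.factor)

# Names of the categorical variables
names(iris)[is_cat]

# Number of levels of the Species factor
nlevels(iris$Species)
```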