1 Goal


The goal of this tutorial is to learn how to quickly count the absolute and relative frequencies of the values in a vector.


2 Data import


# In this tutorial we are going to use the iris dataset
# We will count the number of plants of each Species
data("iris")
str(iris)
## 'data.frame':    150 obs. of  5 variables:
##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
##  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

3 Counting the number of plants of each species


# We can obtain this information with summary(), which returns the counts per level because Species is a factor
summary(iris$Species)
##     setosa versicolor  virginica 
##         50         50         50
str(summary(iris$Species))
##  Named int [1:3] 50 50 50
##  - attr(*, "names")= chr [1:3] "setosa" "versicolor" "virginica"
# We can also create a frequency table with table()
table(iris$Species)
## 
##     setosa versicolor  virginica 
##         50         50         50
str(table(iris$Species))
##  'table' int [1:3(1d)] 50 50 50
##  - attr(*, "dimnames")=List of 1
##   ..$ : chr [1:3] "setosa" "versicolor" "virginica"
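One detail worth knowing when counting with table(): by default it leaves missing values out of the result. The sketch below uses a small made-up vector `x` (not part of the iris data) to show the default behaviour next to the `useNA = "ifany"` option.

```r
# Hypothetical toy vector with one missing value
x <- c("a", "b", NA, "a")

# Default: NA is silently dropped from the counts
with_default <- table(x)
with_default

# useNA = "ifany" adds a count for NA when at least one is present
with_na <- table(x, useNA = "ifany")
with_na
```

The iris Species column has no missing values, so both calls would give the same result there.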
# If we want a data frame, we can wrap the table in data.frame()
data.frame(table(iris$Species))
##         Var1 Freq
## 1     setosa   50
## 2 versicolor   50
## 3  virginica   50
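The default column names Var1 and Freq are not very descriptive. As a minimal sketch, we can name the table dimension and use the `responseName` argument of as.data.frame() to get clearer names. The names `Species` and `Count` are just the ones chosen for this example.

```r
data("iris")

# Naming the argument of table() sets the name of the first column;
# responseName sets the name of the counts column
counts <- as.data.frame(table(Species = iris$Species), responseName = "Count")
counts

# Sort by decreasing count (all ties here, since every species has 50 plants)
counts[order(-counts$Count), ]
```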

4 Counting the relative frequency of each species: Proportion


# We can do it by hand, dividing each frequency by the total number of entries
table(iris$Species)/length(iris$Species)
## 
##     setosa versicolor  virginica 
##  0.3333333  0.3333333  0.3333333
# And we can round the result to make it easier to read
round(table(iris$Species)/length(iris$Species),2)
## 
##     setosa versicolor  virginica 
##       0.33       0.33       0.33
# However, there is a function that does this for us: prop.table
prop.table(table(iris$Species))
## 
##     setosa versicolor  virginica 
##  0.3333333  0.3333333  0.3333333
# And we can round again to get a nicer view of the table
round(prop.table(table(iris$Species)),2)
## 
##     setosa versicolor  virginica 
##       0.33       0.33       0.33
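It is often convenient to see the counts and the proportions side by side. The following is one possible way to do it, combining the table() and prop.table() results from above into a single data frame. The column names and the extra `Percent` column are choices made for this sketch.

```r
data("iris")

counts <- table(iris$Species)
props  <- prop.table(counts)

# as.vector() drops the table attributes so each column is a plain vector
freq <- data.frame(
  Species    = names(counts),
  Count      = as.vector(counts),
  Proportion = as.vector(props),
  Percent    = round(100 * as.vector(props), 1)
)
freq
```

Rounding only the `Percent` column keeps the exact proportions available for further calculations.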

5 Conclusion


In this tutorial we have learnt how to get the absolute and relative frequencies of the values in a vector using summary(), table() and prop.table(). This is useful when we want to know how often each value occurs or what share of the vector it accounts for.