1 Goal
2 Data import
3 Counting the number of plants of each species
4 Counting the relative frequency of each species: Proportion
5 Conclusion

1 Goal

The goal of this tutorial is to learn how to quickly compute the absolute and relative frequencies of the values in a vector.

2 Data import

# In this tutorial we use the built-in iris dataset
# We will count the number of plants of each species
data("iris")
str(iris)

## 'data.frame':    150 obs. of  5 variables:
##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
##  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

3 Counting the number of plants of each species

# For a factor, summary() returns the number of entries in each level
summary(iris$Species)

##     setosa versicolor  virginica 
##         50         50         50

str(summary(iris$Species))

##  Named int [1:3] 50 50 50
##  - attr(*, "names")= chr [1:3] "setosa" "versicolor" "virginica"

# We can also create a table with the frequencies 
table(iris$Species)

## 
##     setosa versicolor  virginica 
##         50         50         50

str(table(iris$Species))

##  'table' int [1:3(1d)] 50 50 50
##  - attr(*, "dimnames")=List of 1
##   ..$ : chr [1:3] "setosa" "versicolor" "virginica"
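
Because the table keeps the species names, individual counts can be looked up by name. The lines below are a minimal sketch; the variable name `counts` is chosen here for illustration and does not appear in the original tutorial.

```r
# Store the frequency table so it can be reused (`counts` is an illustrative name)
data("iris")
counts <- table(iris$Species)

# Look up the count of a single species by name
counts[["setosa"]]

# Names of the species with the highest count (all three tie in iris)
names(counts)[counts == max(counts)]
```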

# If we need a data frame, we can convert the table
data.frame(table(iris$Species))

##         Var1 Freq
## 1     setosa   50
## 2 versicolor   50
## 3  virginica   50
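
The default column names `Var1` and `Freq` are not very descriptive. One possible way to get clearer names is to name the table dimension and pass `responseName` to `as.data.frame()`. The names `Species`, `Count` and `species_counts` are illustrative choices, not part of the original tutorial.

```r
data("iris")

# Naming the argument of table() sets the name of the first column;
# responseName sets the name of the count column
species_counts <- as.data.frame(table(Species = iris$Species),
                                responseName = "Count")
species_counts
```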

4 Counting the relative frequency of each species: Proportion

# We can do it by hand, dividing each frequency by the total number of entries
table(iris$Species) / length(iris$Species)

## 
##     setosa versicolor  virginica 
##  0.3333333  0.3333333  0.3333333

# We can round the result to make it easier to read
round(table(iris$Species) / length(iris$Species), 2)

## 
##     setosa versicolor  virginica 
##       0.33       0.33       0.33

# However, there is a function that does this for us: prop.table
prop.table(table(iris$Species))

## 
##     setosa versicolor  virginica 
##  0.3333333  0.3333333  0.3333333
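
The two approaches agree for iris because `Species` has no missing values. As a hedged aside, they can differ when the vector contains `NA`: `table()` drops `NA` by default while `length()` still counts it. The small vector `x` below is made up for illustration and is not taken from the iris data.

```r
# Small illustrative vector with one missing value
x <- c("a", "b", NA, "a")

# table() ignores the NA but length() counts it,
# so these proportions sum to less than 1
by_hand <- table(x) / length(x)
sum(by_hand)

# prop.table() divides by the sum of the table itself,
# so these proportions always sum to 1
by_prop <- prop.table(table(x))
sum(by_prop)

# To treat NA as its own category, ask table() to keep it
prop.table(table(x, useNA = "ifany"))
```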

# Again, we can round to get a cleaner view of the table
round(prop.table(table(iris$Species)), 2)

## 
##     setosa versicolor  virginica 
##       0.33       0.33       0.33
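
Absolute and relative frequencies can also be collected in a single data frame. This is only a sketch of one way to do it; the names `counts`, `freq`, `Count` and `Proportion` are assumptions made for this example.

```r
data("iris")

# Absolute frequencies, with the dimension named so the column is called Species
counts <- table(Species = iris$Species)
freq <- as.data.frame(counts, responseName = "Count")

# Relative frequencies; as.vector() drops the table attributes
# so the new column is a plain numeric vector
freq$Proportion <- round(as.vector(prop.table(counts)), 2)
freq
```

The rows of `freq` and the entries of `prop.table(counts)` both follow the order of the factor levels, so the proportions line up with the right species.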

5 Conclusion

In this tutorial we have learnt how to get the absolute and relative frequencies of the values in a vector, using summary and table for the counts and prop.table for the proportions. This is useful whenever we want to know how often each value occurs or what share of the vector it represents.

Absolute and relative frequencies in a vector: prop.table

Ubiqum Code Academy
