1 Goal

The goal of this tutorial is to drop empty factor levels that a subset inherits from the larger dataset it was taken from.

2 Data preparation

# In this tutorial we are going to use the iris dataset
data("iris")
str(iris)

## 'data.frame':    150 obs. of  5 variables:
##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
##  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

# We can check how many observations of each level the Species factor has
str(iris$Species)

##  Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

table(iris$Species)

## 
##     setosa versicolor  virginica 
##         50         50         50

3 Creating a factor with empty levels

# Suppose we want to select all plants except setosa

iris_sample <- iris[which(iris$Species != "setosa"), ]

# Now let's check the levels of the factor: setosa is still listed
levels(iris_sample$Species)

## [1] "setosa"     "versicolor" "virginica"

str(iris_sample$Species)

##  Factor w/ 3 levels "setosa","versicolor",..: 2 2 2 2 2 2 2 2 2 2 ...

# And count how many observations of each level remain in our table
table(iris_sample$Species)

## 
##     setosa versicolor  virginica 
##          0         50         50
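
The empty level can also be picked out programmatically instead of by reading the table. The following is a minimal sketch, not part of the original tutorial: the names `iris_subset`, `counts` and `empty_levels` are made up for illustration, and the subset is rebuilt here so the block runs on its own.

```r
# Rebuild the subset without setosa (hypothetical name: iris_subset)
data("iris")
iris_subset <- iris[iris$Species != "setosa", ]

# table() counts every declared level, including those with no rows
counts <- table(iris_subset$Species)

# Keep the names of the levels whose count is zero
empty_levels <- names(counts)[counts == 0]
empty_levels

# The factor still declares 3 levels, although only 2 occur in the data
nlevels(iris_subset$Species)
length(unique(iris_subset$Species))
```

A mismatch between `nlevels()` and the number of unique values is a quick sign that a factor carries empty levels.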

4 Removing empty levels in a factor

# Our subset has inherited an empty level from the bigger dataset
# To drop empty levels, we only need to re-create the factor from the remaining data
iris_sample$Species <- factor(iris_sample$Species)

# Now the empty level is gone
levels(iris_sample$Species)

## [1] "versicolor" "virginica"

str(iris_sample$Species)

##  Factor w/ 2 levels "versicolor","virginica": 1 1 1 1 1 1 1 1 1 1 ...

table(iris_sample$Species)

## 
## versicolor  virginica 
##         50         50
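
As an alternative to calling `factor()` again, base R also provides `droplevels()`, which removes unused levels from a single factor or from every factor column of a data frame. The sketch below is an addition to the tutorial, not the method it uses above; the names `iris_subset` and `iris_dropped` are made up for illustration, and the subset is rebuilt so the block runs on its own.

```r
# Rebuild the subset without setosa (hypothetical name: iris_subset)
data("iris")
iris_subset <- iris[iris$Species != "setosa", ]

# droplevels() on a data frame drops unused levels in all its factor columns
iris_dropped <- droplevels(iris_subset)

# The empty level is gone and the counts are unchanged
levels(iris_dropped$Species)
table(iris_dropped$Species)

# It also works on a single factor
species_only <- droplevels(iris_subset$Species)
nlevels(species_only)
```

Calling `droplevels()` on the whole data frame may be more convenient when several factor columns carry empty levels, while `factor()` on one column keeps the change local to that column.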

5 Conclusion

In this tutorial we have learnt how to drop empty levels from a factor. This is useful when we study a subset of a dataset and do not want the levels that no longer occur to appear in summaries such as table().

Drop inherited empty levels in factor

Ubiqum Code Academy
