The goal of this tutorial is to learn how to replace specific names with more general categories read from a different table. This is useful when we want to analyse data by category rather than by individual product.
# First of all we load the data
# For this tutorial we are going to use the iris plant dataset
data(iris)
str(iris)
## 'data.frame': 150 obs. of 5 variables:
## $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
## $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
## $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
## $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
## $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
# First we add a new column called name, which is just the row number,
# and move it to the first position
iris$name <- seq_len(nrow(iris))
iris <- iris[, c(ncol(iris), 1:(ncol(iris) - 1))]
str(iris)
## 'data.frame': 150 obs. of 6 variables:
## $ name : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
## $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
## $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
## $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
## $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
# Now we create a table containing just the name and the Species
iris_Species <- iris[, c(1, ncol(iris))]
iris_Species$Species <- as.character(iris_Species$Species)
str(iris_Species)
## 'data.frame': 150 obs. of 2 variables:
## $ name : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Species: chr "setosa" "setosa" "setosa" "setosa" ...
# And remove the Species from the original table
iris$Species <- NULL
str(iris)
## 'data.frame': 150 obs. of 5 variables:
## $ name : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
## $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
## $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
## $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
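# Now we can replace each name with the matching species from iris_Species.
# This works because name holds the row number, so it can be used directly as a row index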
iris_Species[1, "Species"]
## [1] "setosa"
iris$name <- iris_Species[iris$name, "Species"]
head(iris)
##     name Sepal.Length Sepal.Width Petal.Length Petal.Width
## 1 setosa          5.1         3.5          1.4         0.2
## 2 setosa          4.9         3.0          1.4         0.2
## 3 setosa          4.7         3.2          1.3         0.2
## 4 setosa          4.6         3.1          1.5         0.2
## 5 setosa          5.0         3.6          1.4         0.2
## 6 setosa          5.4         3.9          1.7         0.4
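The replacement above indexes the lookup table by row position, which only works when each name equals the row number of its entry. When the keys are arbitrary, or the lookup table is in a different order, one common alternative is to locate each key with `match()`. The following is a minimal sketch of that idea, not part of the original tutorial; the names `fresh`, `measurements`, `lookup` and `idx` are illustrative, and the tables are rebuilt from a fresh copy of iris so the block runs on its own.

```r
# Rebuild the two tables from an unmodified copy of iris.
# All object names below are illustrative assumptions.
fresh <- datasets::iris
measurements <- data.frame(name = seq_len(nrow(fresh)), fresh[, 1:4])
lookup <- data.frame(name = seq_len(nrow(fresh)),
                     Species = as.character(fresh$Species),
                     stringsAsFactors = FALSE)

# Shuffle the lookup table so that row position no longer equals the key.
set.seed(1)
lookup <- lookup[sample(nrow(lookup)), ]

# match() gives, for each key in measurements$name, the position of the
# same key in lookup$name, so the result does not depend on row order.
idx <- match(measurements$name, lookup$name)
measurements$name <- lookup$Species[idx]

head(measurements)
```

Keys that are missing from the lookup table would come back as `NA` rather than raising an error, so it can be worth checking `anyNA(idx)` before relying on the result.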
# Now we have replaced the name of each specific plant with its species
In this tutorial we have learnt how to replace the specific name of an entry with its larger category, looked up in a different table.
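The same idea can be applied to the product and category case mentioned at the start. The sketch below uses made-up data: the tables `sales` and `categories`, their contents, and the helper vector `category_of` are all hypothetical and only serve to illustrate a lookup through a named vector followed by a summary per category.

```r
# Hypothetical data: product names, units and categories are invented.
sales <- data.frame(product = c("apple", "carrot", "banana", "kale"),
                    units = c(10, 4, 7, 2),
                    stringsAsFactors = FALSE)
categories <- data.frame(product = c("apple", "banana", "carrot"),
                         category = c("fruit", "fruit", "vegetable"),
                         stringsAsFactors = FALSE)

# Named vector: the names are the keys, the values are the categories.
category_of <- setNames(categories$category, categories$product)

# Index by name; products absent from the lookup table (here "kale") become NA.
sales$category <- unname(category_of[sales$product])

# Sum the units per category. With the formula interface, rows whose
# category is NA are dropped by the default na.action.
by_category <- aggregate(units ~ category, data = sales, FUN = sum)
by_category
```

A named vector keeps the lookup to a single line, while leaving unmatched products visible as `NA` so they can be reviewed before any analysis by category.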