The goal of this tutorial is to learn how to select and manipulate specific values within a data frame based on conditions.
# For this tutorial we'll use the ever-popular iris data set.
# It ships with R itself (in the datasets package, which is attached by default), so there is no need to load anything. You can go ahead and check it out:
head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa
# If for some reason you want to restore the data set to its original values
# (let's say after deleting a couple of rows), you can do so by executing:
data(iris)
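# As a quick sanity check of the restore step, here is a minimal sketch, assuming it is run at the top level of an R session.
# It drops two rows from the workspace copy of iris (an arbitrary change, chosen only for illustration), calls data(iris), and compares the row counts.

```r
# Drop the first two rows from the workspace copy (arbitrary example change)
iris <- iris[-(1:2), ]
rows_after_delete <- nrow(iris)

# Reload the original data set, overwriting the modified copy
data(iris)
rows_after_restore <- nrow(iris)

# Row counts before and after the restore
print(c(rows_after_delete, rows_after_restore))
```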
# Let's assume you want to manipulate the 3rd row in the iris data set
iris[3,]
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 3          4.7         3.2          1.3         0.2  setosa
# And let's assume you only want to manipulate its value in the first column, "Sepal.Length".
# You have several options for selecting it:
iris$Sepal.Length[3]
## [1] 4.7
# or
iris[3,1]
## [1] 4.7
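# Two more ways to reach the same cell are selecting the column by name inside the brackets, or using [[ ]].
# The sketch below compares all four variants; the variable names are made up for illustration.
# It assumes iris still holds its original values.

```r
# Four ways to reach the same cell (row 3, column "Sepal.Length")
by_dollar   <- iris$Sepal.Length[3]
by_position <- iris[3, 1]
by_name     <- iris[3, "Sepal.Length"]
by_brackets <- iris[["Sepal.Length"]][3]

# Check that all four return the same single numeric value
all_equal <- identical(by_dollar, by_position) &&
  identical(by_dollar, by_name) &&
  identical(by_dollar, by_brackets)
print(all_equal)
```

# Selecting by column name tends to be more robust than selecting by position, because it keeps working if the column order changes.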
# Now let's assume you want to filter by iris$Species == "setosa" and iris$Sepal.Length > 5
iris$Sepal.Length[which(iris$Species == "setosa" & iris$Sepal.Length > 5)]
## [1] 5.1 5.4 5.4 5.8 5.7 5.4 5.1 5.7 5.1 5.4 5.1 5.1 5.2 5.2 5.4 5.2 5.5
## [18] 5.5 5.1 5.1 5.1 5.3
# Now let's say you also want to narrow the selection to rows with iris$Petal.Width < 0.4
iris$Sepal.Length[which(iris$Species == "setosa" & iris$Sepal.Length > 5 & iris$Petal.Width < 0.4)]
## [1] 5.1 5.4 5.8 5.1 5.7 5.1 5.4 5.2 5.2 5.2 5.5 5.5 5.1 5.1 5.3
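# A note on which(): iris has no missing values, so here a plain logical condition inside the brackets would return the same result.
# The difference shows up once the data contains NA. The sketch below uses a hypothetical toy vector (not part of iris) to illustrate this.

```r
# Hypothetical toy vector with one missing value (not part of iris)
x <- c(4.8, 5.6, NA, 5.2)

# Logical indexing keeps an NA placeholder where the comparison is NA
by_logical <- x[x > 5]

# which() returns only the positions where the condition is TRUE
by_which <- x[which(x > 5)]

print(by_logical)
print(by_which)
```

# With which(), the NA comparison is silently dropped, which is usually what you want when filtering.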
# Regardless of your conditions, you can now manipulate the values of iris$Sepal.Length in various ways
# To keep things short and succinct, one good method is to store the matching row indices in a variable
my_index <- which(iris$Species == "setosa" & iris$Sepal.Length > 5)
# Now you can easily manipulate the values at these indices, for example by halving them:
iris$Sepal.Length[my_index] <- iris$Sepal.Length[my_index]/2
# You could also overwrite them with a specific value
iris$Sepal.Length[my_index] <- 12.5
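# One reason to store the index once: after the values are changed, re-evaluating the original condition may no longer match the same rows (halved values are no longer > 5).
# The sketch below repeats the halving step on a copy, so the workspace iris is not touched. The names iris_copy and idx are made up for illustration.

```r
# Work on a copy (hypothetical name) taken from the pristine data set
iris_copy <- datasets::iris

# Same condition as above, evaluated once on the copy
idx <- which(iris_copy$Species == "setosa" & iris_copy$Sepal.Length > 5)

# Halve only the matching values
iris_copy$Sepal.Length[idx] <- iris_copy$Sepal.Length[idx] / 2

# Number of modified cells
print(length(idx))

# Check that all other rows are unchanged
others_unchanged <- identical(iris_copy$Sepal.Length[-idx],
                              datasets::iris$Sepal.Length[-idx])
print(others_unchanged)

# Re-evaluating the condition now matches no rows at all
rematch <- which(iris_copy$Species == "setosa" & iris_copy$Sepal.Length > 5)
print(length(rematch))
```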
# You can also exclude all rows that match the index (and by doing this remove them)
iris <- iris[-my_index,]
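# One caveat with negative indexing: if the condition matches no rows, the index is empty, and df[-index, ] then returns zero rows instead of all of them.
# The sketch below shows this on a copy and contrasts it with negating the logical condition. It assumes the data has no missing values; iris_copy, empty_index and cond are hypothetical names.

```r
# Work on a copy of the pristine data set
iris_copy <- datasets::iris

# A condition that matches no rows (no setosa flower has Sepal.Length > 10)
cond <- iris_copy$Species == "setosa" & iris_copy$Sepal.Length > 10
empty_index <- which(cond)

# Negative indexing with an empty index drops every row
dropped_all <- iris_copy[-empty_index, ]

# Negating the logical condition keeps all rows, as intended
kept_all <- iris_copy[!cond, ]

print(nrow(dropped_all))
print(nrow(kept_all))
```

# If you stay with which(), checking length(my_index) > 0 before removing rows avoids the problem.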
In this tutorial we have learnt how to select and manipulate specific values and rows based on conditions.