
Goal


The goal of this tutorial is to learn how to select and manipulate specific values within a data frame based on conditions.


Data import


# For this tutorial we'll use the ever-popular iris data set.
# It ships with R itself (in the datasets package, which is attached by default), so there is nothing to load. Go ahead and check it out:
head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa
# If for some reason you want to restore the data set to its original values
# (let's say after deleting a couple of rows) you can do so by executing:
data(iris)
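
A complementary habit is to experiment on a copy, so the built-in data set never needs restoring. The lines below are a minimal sketch of that idea; `iris_copy` is a name chosen for this illustration and is not used elsewhere in the tutorial.

```r
# Work on a copy so the built-in data set stays untouched
# (iris_copy is a hypothetical name used only in this sketch)
iris_copy <- iris

# Delete the first two rows of the copy
iris_copy <- iris_copy[-c(1, 2), ]

dim(iris_copy)  # 148 rows, 5 columns
dim(iris)       # the original still has 150 rows, 5 columns
```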

Selecting & manipulating specific rows


# Let's assume you want to manipulate the 3rd row in the iris data set
iris[3,]
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 3          4.7         3.2          1.3         0.2  setosa
# Let's also assume you only want to manipulate the value in the first column, "Sepal.Length".
# You have several options for selecting it:

iris$Sepal.Length[3]
## [1] 4.7
# or
iris[3,1]
## [1] 4.7
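
The same cell can also be addressed by column name, and any of these selections can appear on the left-hand side of an assignment to change the value. The sketch below works on a copy called `df` (a name assumed for this illustration) so that `iris` itself is not changed, and the replacement value 5.0 is arbitrary.

```r
df <- iris  # copy used only for this illustration

# Three equivalent ways to read row 3 of Sepal.Length
by_dollar   <- df$Sepal.Length[3]
by_position <- df[3, 1]
by_name     <- df[3, "Sepal.Length"]

# Assigning to the same selection overwrites the value in the copy
df[3, "Sepal.Length"] <- 5.0

df$Sepal.Length[3]    # now 5
iris$Sepal.Length[3]  # the original is still 4.7
```

Selecting by name tends to be more robust than selecting by position, because it keeps working if the column order changes.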

Selecting specific values by using conditions


# Now let's assume you want to filter by Species == "setosa" and Sepal.Length > 5
iris$Sepal.Length[which(iris$Species == "setosa" & iris$Sepal.Length > 5)] 
##  [1] 5.1 5.4 5.4 5.8 5.7 5.4 5.1 5.7 5.1 5.4 5.1 5.1 5.2 5.2 5.4 5.2 5.5
## [18] 5.5 5.1 5.1 5.1 5.3
# Now let's say you also want to narrow the selection to rows with Petal.Width < 0.4
iris$Sepal.Length[which(iris$Species == "setosa" & iris$Sepal.Length > 5 & iris$Petal.Width < 0.4)]
##  [1] 5.1 5.4 5.8 5.1 5.7 5.1 5.4 5.2 5.2 5.2 5.5 5.5 5.1 5.1 5.3
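
The condition inside `which()` is itself a logical vector, so it can be used as an index directly, and base R's `subset()` offers a third way to express the same filter. The sketch below compares the three; the variable names are assumptions made for this illustration. One difference worth knowing: if the condition evaluates to `NA` for some rows, `which()` drops them while plain logical indexing returns `NA` entries. `iris` has no missing values, so the results agree here.

```r
# Build the condition once as a logical vector (TRUE/FALSE per row)
is_match <- iris$Species == "setosa" &
  iris$Sepal.Length > 5 &
  iris$Petal.Width < 0.4

# which() turns the logical vector into row numbers
via_which <- iris$Sepal.Length[which(is_match)]

# A logical vector can also be used directly as an index
via_logical <- iris$Sepal.Length[is_match]

# subset() returns the matching rows as a data frame
via_subset <- subset(iris, Species == "setosa" & Sepal.Length > 5 & Petal.Width < 0.4)

identical(via_which, via_logical)              # TRUE
identical(via_which, via_subset$Sepal.Length)  # TRUE
length(via_which)                              # 15
```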

Manipulating/deleting specific values


# Regardless of your conditions, you can now manipulate the values of iris$Sepal.Length in various ways.
# To keep things succinct, one good method is to store the matching row numbers in an index variable and reuse it.

my_index <- which(iris$Species == "setosa" & iris$Sepal.Length > 5)

# Now you can easily manipulate the values within this index, for example:
iris$Sepal.Length[my_index] <- iris$Sepal.Length[my_index]/2

# You could also set the values to a specific value
iris$Sepal.Length[my_index] <- 12.5

# You can also exclude all rows that match the index (and by doing so remove them)
iris <- iris[-my_index,]
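
One pitfall with negative indexing is that an empty index removes every row rather than none. The sketch below shows the effect and one possible way around it, negating the logical condition instead. It starts from `datasets::iris` so that it always sees the unmodified data; `df`, `empty_index`, `keep` and `to_drop` are names assumed for this illustration.

```r
df <- datasets::iris  # pristine copy, independent of any earlier changes

# A condition that matches no rows gives an empty index
empty_index <- which(df$Sepal.Length > 100)

# Negative indexing with an empty index drops every row
nrow(df[-empty_index, ])  # 0

# Negating the logical condition keeps all non-matching rows instead
keep <- !(df$Sepal.Length > 100)
nrow(df[keep, ])          # 150

# The same approach applied to the condition used in this tutorial
to_drop <- df$Species == "setosa" & df$Sepal.Length > 5
df <- df[!to_drop, ]
nrow(df)                  # 128
```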

Conclusion


In this tutorial we have learnt how to select and manipulate specific values and rows based on conditions.