Subsetting

Unlike many other programming languages, R starts indexing at 1, not 0. In Python, for example, the first element of a list has index 0; in R, the first element of a vector has index 1. This is a key difference to keep in mind.
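
As a small illustration of 1-based indexing, the sketch below uses a made-up vector x (not part of the original examples) to show what the first index, index 0, and an out-of-range index return.

```r
x <- c(10, 20, 30)

x[1]          # first element: 10
x[length(x)]  # last element: 30
x[0]          # index 0 selects nothing: numeric(0)
x[4]          # an index past the end gives NA, not an error
```

Note that x[0] and x[4] fail silently rather than raising an error, so an off-by-one mistake carried over from a 0-based language can go unnoticed.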

Subsetting Vectors

by index

c1 = c(1:10)
c1[1]
## [1] 1
c1[2]
## [1] 2
c1[1:2]
## [1] 1 2
c1[c(1:2,5)]
## [1] 1 2 5
tail(c1,5)
## [1]  6  7  8  9 10
head(c1,2)
## [1] 1 2

by negative index

c1[-1]
## [1]  2  3  4  5  6  7  8  9 10
c1[-c(1,2)]
## [1]  3  4  5  6  7  8  9 10

by logical condition

c1[c1>=4]
## [1]  4  5  6  7  8  9 10
c1[c1<=4]
## [1] 1 2 3 4
c1[c1==4] 
## [1] 4
c1[c1!=4]
## [1]  1  2  3  5  6  7  8  9 10
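
Conditions can also be combined. The following sketch uses its own vector v (a hypothetical name) to show the element-wise operators & and |, the %in% operator, and what happens when nothing matches.

```r
v <- 1:10

v[v > 3 & v < 8]       # both conditions hold: 4 5 6 7
v[v < 3 | v > 8]       # either condition holds: 1 2 9 10
v[v %in% c(2, 4, 20)]  # values that appear in a set: 2 4
v[v > 10]              # no match gives an empty vector: integer(0)
```

Use the single & and | inside brackets; the doubled forms && and || are not element-wise and are intended for if() conditions.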

by name (if named vector)

c2 <- c(a=1,b=2,c=3,d=4)
c2["a"]
## a 
## 1
c2[c("a","b")]
## a b 
## 1 2

Subsetting Matrices

by index

# byrow = FALSE (the default): the matrix is filled column by column,
# so the data go down the first column until it is full, then the next, and so on.
m1 <- matrix(1:9,nrow=3,byrow = FALSE)
# byrow = TRUE: the matrix is filled row by row. This overwrites m1 and is the version used below.
m1 <- matrix(1:9,nrow=3,byrow = TRUE)
m1[1, 2]  # Extracts the element in the 1st row, 2nd column
## [1] 2
m1[1, ]   # Extracts the entire 1st row
## [1] 1 2 3
m1[, 2]   # Extracts the entire 2nd column
## [1] 2 5 8
m1[m1 > 5]  # Extracts the elements greater than 5, read column by column
## [1] 7 8 6 9
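
Extracting a single row or column returns a plain vector, as the output above shows. The sketch below, which uses a fresh matrix m so it runs on its own, shows the drop = FALSE argument that keeps the result a matrix, plus selecting several rows and columns at once.

```r
m <- matrix(1:9, nrow = 3, byrow = TRUE)

r1     <- m[1, ]                 # plain vector: 1 2 3
r1_mat <- m[1, , drop = FALSE]   # still a matrix, with 1 row and 3 columns
sub    <- m[1:2, c(1, 3)]        # rows 1 to 2, columns 1 and 3

dim(r1)      # NULL, because a vector has no dimensions
dim(r1_mat)  # 1 3
sub
```

Keeping the matrix shape matters when later code calls functions such as nrow() or ncol() on the result.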

Subsetting Data Frames

By row and column indices

df <- data.frame(a = 1:3, b = 4:6)
df[1, 2]  # Extracts the element in the 1st row, 2nd column
## [1] 4
df[1, ]   # Extracts the entire 1st row
##   a b
## 1 1 4
df[, 2]   # Extracts the entire 2nd column
## [1] 4 5 6

By column name

df["a"]  # Extracts column "a" as a data frame
##   a
## 1 1
## 2 2
## 3 3
df$a     # Extracts column "a" as a vector
## [1] 1 2 3
df[,"a"]
## [1] 1 2 3
df[, c("a","b")]  
##   a b
## 1 1 4
## 2 2 5
## 3 3 6
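
Rows of a data frame can also be selected with a logical condition placed in the row position. The sketch below uses a small made-up data frame called scores (a hypothetical example, not from the text above).

```r
scores <- data.frame(id = c("s1", "s2", "s3", "s4"),
                     score = c(55, 82, 47, 91))

# Keep the rows where the condition is TRUE; the empty column slot keeps all columns
passed <- scores[scores$score >= 60, ]

# A single selected column comes back as a vector...
passed_ids <- scores[scores$score >= 60, "id"]

# ...unless drop = FALSE is given, which keeps a data frame
passed_df <- scores[scores$score >= 60, "id", drop = FALSE]

# subset() from base R expresses the same row filter
same_rows <- subset(scores, score >= 60)
```

The comma after the condition is easy to forget: scores[scores$score >= 60] without it would try to select columns rather than rows.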

Knowing the positions of elements is often important for subsetting. The following function helps with this.

which()

The which() function returns the indices of the elements of a logical vector that are TRUE.

which(c1 > 4) 
## [1]  5  6  7  8  9 10
which(c1 == 4)
## [1] 4
which(m1 < 5) # positions are counted column by column, not row by row
## [1] 1 2 4 7
which(df > 3) # a data frame is also scanned column by column
## [1] 4 5 6

which.max() # Index of the maximum value

which.min() # Index of the minimum value
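
A short sketch of these helpers in use, on a made-up vector v and a fresh matrix m (both hypothetical names). It also shows the arr.ind argument of which(), which reports row and column positions instead of a single running index.

```r
v <- c(7, 2, 9, 2, 9)

which.max(v)     # 3: position of the first maximum
which.min(v)     # 2: position of the first minimum
v[which.max(v)]  # 9: the maximum value itself

m <- matrix(1:9, nrow = 3, byrow = TRUE)
which(m < 5)                         # 1 2 4 7: positions counted down the columns
pos <- which(m < 5, arr.ind = TRUE)  # one row/col pair per matching element
pos
```

When the maximum or minimum occurs more than once, which.max() and which.min() report only the first position; which(v == max(v)) returns all of them.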

Adding elements

The c() function is used to concatenate the original vector with the new elements.

# Creating a vector
vec <- c(1, 2, 3)

# Adding a single element
vec <- c(vec, 4)
print(vec)
## [1] 1 2 3 4
# Adding multiple elements
vec <- c(vec, 5, 6, 7)
print(vec)
## [1] 1 2 3 4 5 6 7
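
c() adds elements at the end (or at the start, if the new values come first). As a sketch of some alternatives, the block below uses base R's append() with its after argument to insert in the middle, and assignment to the position one past the end. The names v, v1, v2 and v3 are illustrative only.

```r
v <- c(1, 2, 3)

v1 <- append(v, 99, after = 1)  # insert after position 1: 1 99 2 3
v2 <- c(0, v)                   # add at the front: 0 1 2 3

v3 <- v
v3[length(v3) + 1] <- 4         # assign one past the end: 1 2 3 4
```

All of these build a new vector, so the result has to be assigned to a name to be kept.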

Adding Elements to Dataframes

You can add columns to a data frame with $ or cbind(), and rows with rbind().

df <- data.frame(a = 1:3, b = 4:6)
df$c <- c(11,2,33)
#df$d <- c(1,2) # error: a new column must have as many elements as the data frame has rows
rbind(df,c(3:5))
##   a b  c
## 1 1 4 11
## 2 2 5  2
## 3 3 6 33
## 4 3 4  5
df = cbind(df,d=c(3:5))
df = rbind(df,c(1:5)) # c(1:5) has 5 values but df has only 4 columns, so R warns
## Warning in rbind(deparse.level, ...): number of columns of result, 4, is not a
## multiple of vector length 5 of arg 2
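
When the columns have different types, one way to avoid length and type surprises is to add the new row as a one-row data frame with matching column names. The sketch below assumes a made-up data frame called people; it is an illustration, not the only way to do this.

```r
people <- data.frame(name = c("Ann", "Ben"), age = c(30, 25))

# A one-row data frame with the same column names
new_row <- data.frame(name = "Cy", age = 41)
people <- rbind(people, new_row)   # columns are matched by name

# A new column computed from an existing one
people$senior <- people$age > 35
people
```

Because rbind() matches data frame columns by name, the new row does not depend on the values being listed in the right order.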

Removing elements

Removing Elements from Matrices

To remove rows or columns from a matrix, use negative indices in the row or column position.

# Creating a matrix
mat <- matrix(1:9, nrow = 3)

# Removing the 1st row
mat <- mat[-1, ]
print(mat)
##      [,1] [,2] [,3]
## [1,]    2    5    8
## [2,]    3    6    9
# Removing the 2nd column
mat <- mat[, -2]
print(mat)
##      [,1] [,2]
## [1,]    2    8
## [2,]    3    9
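
The same negative-index idea carries over to vectors and data frames. The sketch below uses made-up objects v and d (hypothetical names) and also shows assigning NULL to drop a data frame column by name.

```r
v <- c(5, 12, 3, 18)
v <- v[-2]       # drop the 2nd element: 5 3 18
v <- v[v != 3]   # drop by value: 5 18

d <- data.frame(a = 1:3, b = 4:6, c = 7:9)
d$c <- NULL                  # drop column "c" by name
d <- d[-1, ]                 # drop the 1st row
d <- d[, -2, drop = FALSE]   # drop column 2 ("b"); drop = FALSE keeps a data frame
d
```

Without drop = FALSE, removing columns until only one is left would turn the data frame into a plain vector.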

Exercise: Subsetting, Adding, and Removing Elements in R

Objective: This exercise will test your ability to subset, add, and remove elements from various data structures in R, including vectors, matrices, lists, and data frames.

Part 1: Subsetting

Vectors: Given the following vector, extract the elements that are greater than 10.

vec <- c(5, 12, 3, 18, 9, 21)

Matrices: Given the following matrix, extract the elements from the second row.

mat <- matrix(1:9, nrow = 3, byrow = TRUE)

Data Frames: Given the following data frame, extract the rows where the age is greater than 30.

df <- data.frame(name = c("John", "Alice", "Bob", "Catherine"), age = c(25, 32, 28, 41))

Part 2: Adding Elements

Vectors: Add the element 25 to the vector below.

vec <- c(10, 20, 30)

Data Frames: Add a new column salary with the values [45000, 52000, 61000] to the following data frame.

df <- data.frame(name = c("John", "Alice", "Bob"), age = c(25, 32, 28))