apply() in Brief

  • apply() is used here on a data frame. Internally, apply() converts the data frame to a matrix before calling the function.
  • The first argument of apply() is the input data frame.
  • The second argument says whether to apply the function to each ROW or to each COLUMN: 1 for ROWS, 2 for COLUMNS.
  • The third argument is the function to apply to each row or column. For example, to get the mean of each row or of each column, the function is mean().
  • Any arguments for that function can be passed to apply() directly after the third argument. For example, na.rm of mean() can be passed to apply() after the third argument.
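
The argument order described above can be sketched on a tiny matrix. The object `m` and its values are made up for this illustration and are not part of the examples that follow.

```r
# Hypothetical 2 x 3 matrix, used only to illustrate the argument order.
# Filled column by column: col 1 = (1, 2), col 2 = (3, 4), col 3 = (5, NA)
m <- matrix(c(1, 2, 3, 4, 5, NA), nrow = 2)

# First argument: the data; second: 1 = rows, 2 = columns; third: the function
row_means <- apply(m, 1, mean)
col_means <- apply(m, 2, mean)

# Anything after the third argument is forwarded to the function (here mean())
col_means_no_na <- apply(m, 2, mean, na.rm = TRUE)

print(row_means)
print(col_means)
print(col_means_no_na)
```

The second row and the third column each contain an NA, so their means are NA unless na.rm = TRUE is forwarded.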

apply() on

Data Frame (All Numeric, No NA)

df <- data.frame(Sub1 = c(1.1, 2.3, 3, 4.1), Sub2 = c(3,2,1,2), Sub3 = 1:4)
df

Mean of each COLUMN

All columns are numeric, so let's take the mean of each column. There are 3 columns, so we should get 3 mean values as output.

apply(df, 2, mean)
##  Sub1  Sub2  Sub3 
## 2.625 2.000 2.500
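
To make the column-wise call less of a black box, here is a rough sketch of the same computation written as an explicit loop. It is only an approximation of what apply() does for this case, not its actual implementation, and the names `m` and `manual` are made up for the illustration.

```r
df <- data.frame(Sub1 = c(1.1, 2.3, 3, 4.1), Sub2 = c(3, 2, 1, 2), Sub3 = 1:4)

# apply() works on a matrix, so the data frame is converted first
m <- as.matrix(df)

# With the second argument set to 2, the function is called once per column
manual <- numeric(ncol(m))
names(manual) <- colnames(m)
for (j in seq_len(ncol(m))) {
  manual[j] <- mean(m[, j])
}

print(manual)

# Compare the loop result with apply()
print(all.equal(manual, apply(df, 2, mean)))
```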

Sum of each COLUMN

All columns are numeric, so let's take the sum of each column. There are 3 columns, so we should get 3 sum values as output.

apply(df, 2, sum)
## Sub1 Sub2 Sub3 
## 10.5  8.0 10.0

Mean of each ROW

apply(df, 1, mean)
## [1] 1.700000 2.100000 2.333333 3.366667

Sum of each ROW

apply(df, 1, sum)
## [1]  5.1  6.3  7.0 10.1
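
As a cross-check on the four results above, base R also has the dedicated functions colMeans(), colSums(), rowMeans() and rowSums(). The sketch below compares them with the apply() calls, assuming the same `df`. unname() is used so that only the numbers are compared, not any names attached to the results.

```r
df <- data.frame(Sub1 = c(1.1, 2.3, 3, 4.1), Sub2 = c(3, 2, 1, 2), Sub3 = 1:4)

# Each comparison should print TRUE if apply() and the dedicated function agree
same_col_means <- all.equal(unname(apply(df, 2, mean)), unname(colMeans(df)))
same_col_sums  <- all.equal(unname(apply(df, 2, sum)),  unname(colSums(df)))
same_row_means <- all.equal(unname(apply(df, 1, mean)), unname(rowMeans(df)))
same_row_sums  <- all.equal(unname(apply(df, 1, sum)),  unname(rowSums(df)))

print(c(same_col_means, same_col_sums, same_row_means, same_row_sums))
```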

Data Frame (All Numeric, With NA)

df_NA <- data.frame(Sub1 = c(1.1, 2.3, NA, 4.1), Sub2 = c(3,2,1,NA), Sub3 = c(NA,NA,1,2))
df_NA

Mean of each COLUMN

All columns are numeric, so let's take the mean of each column.

apply(df_NA, 2, mean)
## Sub1 Sub2 Sub3 
##   NA   NA   NA

The mean of each column is NA, because mean() returns NA when any value is missing. We can skip the NAs by passing na.rm = TRUE, an argument of mean(), to apply() itself. mean() then drops the NAs and computes the mean of the remaining values.

apply(df_NA, 2, mean, na.rm = TRUE)
## Sub1 Sub2 Sub3 
##  2.5  2.0  1.5
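
One way to see what passing na.rm = TRUE through apply() does is to write the same call with a small wrapper function. This is a sketch for comparison only. The names `forwarded`, `wrapped` and `n_used` are made up for the illustration.

```r
df_NA <- data.frame(Sub1 = c(1.1, 2.3, NA, 4.1), Sub2 = c(3, 2, 1, NA), Sub3 = c(NA, NA, 1, 2))

# Extra argument forwarded by apply() to mean()
forwarded <- apply(df_NA, 2, mean, na.rm = TRUE)

# Same computation with the argument written out inside a wrapper function
wrapped <- apply(df_NA, 2, function(x) mean(x, na.rm = TRUE))

# Number of non-missing values each column mean is based on
n_used <- apply(df_NA, 2, function(x) sum(!is.na(x)))

print(forwarded)
print(wrapped)
print(n_used)
```

Looking at `n_used` is a reminder that each mean is computed from fewer than 4 values once the NAs are skipped.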

Sum of each COLUMN

apply(df_NA, 2, sum, na.rm = TRUE)
## Sub1 Sub2 Sub3 
##  7.5  6.0  3.0

Mean of each ROW

apply(df_NA, 1, mean, na.rm = TRUE)
## [1] 2.05 2.15 1.00 3.05

Sum of each ROW

apply(df_NA, 1, sum, na.rm = TRUE)
## [1] 4.1 4.3 2.0 6.1

Data Frame (Some Numeric, Some Factor, Character)

df1 <- data.frame(Sub1 = c(1.1, 2.3, 3, 4.1), Chr2 = c("A", "D", "D", "D"), Chr3 = letters[11:14],
                  Sub2 = 11:14, Sub3 = 1:4)
df1

Mean of each Numeric COLUMN

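Remove the non-numeric columns before passing the data frame to apply(). apply() converts the data frame to a matrix, and a matrix can hold only one type, so a single non-numeric column turns every value into text and mean() then returns NA with a warning. Here -c(2:3) drops columns 2 and 3 (Chr2 and Chr3).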

apply(df1[,-c(2:3)], 2, mean)
##   Sub1   Sub2   Sub3 
##  2.625 12.500  2.500
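
Dropping columns by position works here, but it depends on knowing where the non-numeric columns sit. A possible alternative, shown as a sketch, is to pick the numeric columns by type with sapply() and is.numeric(). The name `is_num` is made up for the illustration.

```r
df1 <- data.frame(Sub1 = c(1.1, 2.3, 3, 4.1), Chr2 = c("A", "D", "D", "D"),
                  Chr3 = letters[11:14], Sub2 = 11:14, Sub3 = 1:4)

# With mixed columns, the matrix that apply() would work on is character
print(typeof(as.matrix(df1)))

# Logical vector: TRUE for numeric columns, FALSE for the others
is_num <- sapply(df1, is.numeric)
print(is_num)

# Keep only the numeric columns, then apply mean() to each of them
col_means <- apply(df1[, is_num], 2, mean)
print(col_means)
```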

Sum of each Numeric COLUMN

As before, remove the non-numeric columns before passing the data frame to apply().

apply(df1[,-c(2:3)], 2, sum)
## Sub1 Sub2 Sub3 
## 10.5 50.0 10.0

Mean of each ROW

As before, remove the non-numeric columns before passing the data frame to apply().

apply(df1[,-c(2:3)], 1, mean)
## [1] 4.366667 5.433333 6.333333 7.366667

Sum of each ROW

As before, remove the non-numeric columns before passing the data frame to apply().

apply(df1[,-c(2:3)], 1, sum)
## [1] 13.1 16.3 19.0 22.1