The apply family is a set of basic functions in R. It includes apply, lapply, sapply, mapply and tapply.

###(1) apply

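apply: applies a function to each row (MARGIN = 1) or each column (MARGIN = 2) of a matrix or array. A data frame passed to apply is first coerced to a matrix.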

apply(X, MARGIN, FUN, …)

a <- matrix(1:12, nrow = 3, ncol = 4)
a
##      [,1] [,2] [,3] [,4]
## [1,]    1    4    7   10
## [2,]    2    5    8   11
## [3,]    3    6    9   12
apply(a,1,sum)
## [1] 22 26 30
apply(a,2,sum)
## [1]  6 15 24 33
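
FUN does not have to be a built-in function such as sum. The following is a minimal sketch, reusing the same 3 x 4 matrix, of apply with an anonymous function and with a function that returns more than one value per row or column. The names row_spread and col_range are only illustrative.

```r
# Same 3 x 4 matrix as in the example above
a <- matrix(1:12, nrow = 3, ncol = 4)

# FUN can be an anonymous function: here each row is reduced to (max - min)
row_spread <- apply(a, 1, function(r) max(r) - min(r))
row_spread
## [1] 9 9 9

# When FUN returns a vector of length > 1, each result becomes one column
col_range <- apply(a, 2, range)
col_range
##      [,1] [,2] [,3] [,4]
## [1,]    1    4    7   10
## [2,]    3    6    9   12

# For plain sums, rowSums() and colSums() give the same numbers as apply
all(apply(a, 1, sum) == rowSums(a))
## [1] TRUE
```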

###(2) lapply and sapply

lapply and sapply are nearly the same; they differ only in the output format. Both apply a function to each element of a list. Because a data frame is a special kind of list, they work on it column by column.

lapply(X, FUN, …) X: a list (or vector) FUN: the function to apply to each element

a.df<-data.frame(a)
is.list(a.df)
## [1] TRUE
str(a.df)
## 'data.frame':    3 obs. of  4 variables:
##  $ X1: int  1 2 3
##  $ X2: int  4 5 6
##  $ X3: int  7 8 9
##  $ X4: int  10 11 12
lapply(a.df, function(x) x+3)
## $X1
## [1] 4 5 6
## 
## $X2
## [1] 7 8 9
## 
## $X3
## [1] 10 11 12
## 
## $X4
## [1] 13 14 15
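
lapply is not limited to data frames. As a small sketch, assuming a made-up list called scores whose elements have different lengths, the result is always a list with one element per input element:

```r
# Hypothetical list; the elements do not need to have the same length
scores <- list(math = c(90, 80, 70), art = c(60, 100), music = 85)

# lapply always returns a list, keeping the names of the input
means <- lapply(scores, mean)
is.list(means)
## [1] TRUE
means$math
## [1] 80
means$music
## [1] 85

# unlist() flattens the result into a named vector
# (math = 80, art = 80, music = 85)
mean_vec <- unlist(means)
```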

###(3) sapply

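sapply(X, FUN, …, simplify = TRUE)

If simplify = FALSE, the output is a list, the same as with lapply. If simplify = TRUE, the output format depends on what the function returns: a vector when every result has length 1, a matrix when all results share the same length greater than 1, and a list otherwise.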

yy<-sapply(a.df, function(x) x^2)
yy
##      X1 X2 X3  X4
## [1,]  1 16 49 100
## [2,]  4 25 64 121
## [3,]  9 36 81 144
y1<-sapply(a.df, sum)
y1
## X1 X2 X3 X4 
##  6 15 24 33
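
The effect of simplify can be checked directly. The sketch below rebuilds the same a.df and compares the possible output shapes; the names s_list and uneven are only illustrative.

```r
a.df <- data.frame(matrix(1:12, nrow = 3, ncol = 4))

# simplify = FALSE keeps the list, just like lapply
s_list <- sapply(a.df, sum, simplify = FALSE)
identical(s_list, lapply(a.df, sum))
## [1] TRUE

# Results of length 1 simplify to a vector
is.vector(sapply(a.df, sum))
## [1] TRUE

# Results of equal length > 1 simplify to a matrix
is.matrix(sapply(a.df, range))
## [1] TRUE

# Results of unequal length cannot be simplified, so a list comes back
uneven <- sapply(1:3, function(n) seq_len(n))
is.list(uneven)
## [1] TRUE
```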

###(4) mapply

mapply is a multivariate version of sapply. It applies a function to multiple list or vector arguments in parallel: first to the first elements of each argument, then to the second elements, and so on.

mapply(FUN, …, MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)

mapply(function(x,y) x^y, c(1:5), c(1:5))
## [1]    1    4   27  256 3125
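
A few more cases, shown here as a minimal sketch, illustrate the other arguments: shorter inputs are recycled, results of unequal length come back as a list, and MoreArgs carries arguments that stay fixed for every call. The variable names are only illustrative.

```r
# Shorter arguments are recycled: the single exponent 2 is reused for every x
squares <- mapply(function(x, y) x^y, 1:5, 2)
squares
## [1]  1  4  9 16 25

# Results of unequal length cannot be simplified and are returned as a list
reps <- mapply(rep, 1:3, 3:1)
reps[[1]]
## [1] 1 1 1
reps[[3]]
## [1] 3

# MoreArgs holds arguments that are passed unchanged to every call
shifted <- mapply(function(x, y, z) x + y + z, 1:3, 4:6,
                  MoreArgs = list(z = 100))
shifted
## [1] 105 107 109
```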

###(5) tapply

tapply: applies a function to subsets of a vector, where the subsets are defined by the levels of one or more factors.

tapply(X, INDEX, FUN = NULL, …, simplify = TRUE)

X is a vector; INDEX is a factor, or a list of factors, each of the same length as X.

df <- data.frame(year = rep(2001:2003, each = 4),
                 loc  = c('beijing', 'beijing', 'shanghai', 'shanghai'),  # recycled to 12 rows
                 type = rep(c('A', 'B'), 6),
                 sale = 1:12)
df
##    year      loc type sale
## 1  2001  beijing    A    1
## 2  2001  beijing    B    2
## 3  2001 shanghai    A    3
## 4  2001 shanghai    B    4
## 5  2002  beijing    A    5
## 6  2002  beijing    B    6
## 7  2002 shanghai    A    7
## 8  2002 shanghai    B    8
## 9  2003  beijing    A    9
## 10 2003  beijing    B   10
## 11 2003 shanghai    A   11
## 12 2003 shanghai    B   12
tapply(df$sale,df[,c('year','loc')],sum)
##       loc
## year   beijing shanghai
##   2001       3        7
##   2002      11       15
##   2003      19       23
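
With a single grouping variable the result is a one-dimensional named array rather than a table. The sketch below rebuilds the relevant columns under the illustrative name sales so that it runs on its own.

```r
# Same year, loc and sale columns as in the example above
sales <- data.frame(
  year = rep(2001:2003, each = 4),
  loc  = rep(c("beijing", "beijing", "shanghai", "shanghai"), times = 3),
  sale = 1:12
)

# Total sale per year: one value for each level of year
by_year <- tapply(sales$sale, sales$year, sum)
by_year
## 2001 2002 2003 
##   10   26   42 

# FUN can be any summary function, for example the mean sale per location
# (beijing = 5.5, shanghai = 7.5)
by_loc <- tapply(sales$sale, sales$loc, mean)
```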