Apply function in R

This is part of my notes on R programming on my site: https://dataz4s.com/. It is based on Mike Marin’s Statslectures video ‘Apply Function in R’.

About apply functions

The apply functions are a set of loop functions in R. They require less code than an explicit for loop, which lowers the risk of coding errors, and they are usually at least as fast.
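To illustrate the point about less code, here is a minimal sketch that computes column means twice, once with an explicit for loop and once with apply(). The matrix m is a made-up example and is not part of the StockData set used later.

```r
# Hypothetical 3 x 2 matrix, used only for illustration
m <- matrix(c(1, 2, 3, 10, 20, 30), nrow = 3,
            dimnames = list(NULL, c("a", "b")))

# Version 1: explicit for loop over the columns
loop_means <- numeric(ncol(m))
names(loop_means) <- colnames(m)
for (j in seq_len(ncol(m))) {
  loop_means[j] <- mean(m[, j])
}

# Version 2: the same computation with apply()
apply_means <- apply(m, 2, mean)

# Both versions give a = 2 and b = 20
loop_means
apply_means
```

The loop needs a pre-allocated result vector and an index, which are two more places to make a mistake. The apply() call handles both for us.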

Usage and arguments

From ?apply, the usage is apply(X, MARGIN, FUN, ...). X is the object the function is applied to. MARGIN selects rows or columns: MARGIN = 1 means rows and MARGIN = 2 means columns. FUN is the function to apply, and ... stands for any further arguments that are passed on to FUN.

Let’s run an example with the dataset StockData, which can be downloaded from Mike Marin’s page: https://www.statslectures.com/r-scripts-datasets
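Before turning to the stock data, here is a small sketch of how MARGIN and the extra arguments behave. The matrix m and its row and column names are invented for illustration only.

```r
# Hypothetical 2 x 3 matrix with one missing value (filled column by column)
m <- matrix(c(1, 2, 3, 4, NA, 6), nrow = 2,
            dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))

# MARGIN = 1: FUN is called once per row, so we get one value per row
# na.rm = TRUE is not an argument of apply(); it is passed on to sum()
row_sums <- apply(m, MARGIN = 1, FUN = sum, na.rm = TRUE)

# MARGIN = 2: FUN is called once per column
col_sums <- apply(m, MARGIN = 2, FUN = sum, na.rm = TRUE)

row_sums   # r1 = 4, r2 = 12
col_sums   # c1 = 3, c2 = 7, c3 = 6
```

Note that when X is a data frame, apply() first converts it to a matrix with as.matrix(), so it works best when all columns have the same type, as in StockData.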

Read in data

# Read in data via read.table (StockExample.txt is a plain text file, so readxl is not needed)
StockData <- read.table("C:/Users/Usuario/Documents/dataZ4s/R/Apply function/StockExample.txt")
StockData
##       Stock1 Stock2 Stock3 Stock4
## Day1  185.74   1.47   1605  95.05
## Day2  184.26   1.56   1580  97.49
## Day3  162.21   1.39   1490  88.57
## Day4  159.04   1.43   1520  85.55
## Day5  164.87   1.42   1550  92.04
## Day6  162.72   1.36   1525  91.70
## Day7  157.89     NA   1495  89.88
## Day8  159.49   1.43   1485  93.17
## Day9  150.22   1.57   1470  90.12
## Day10 151.02   1.54   1510  92.14
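If the file is not at hand, the same data frame can be rebuilt from the printout above. This is only a sketch that retypes the printed values; building it with data.frame() is an assumption for convenience and not how the original file is loaded.

```r
# Rebuild StockData from the values printed above (not read from the file)
StockData <- data.frame(
  Stock1 = c(185.74, 184.26, 162.21, 159.04, 164.87,
             162.72, 157.89, 159.49, 150.22, 151.02),
  Stock2 = c(1.47, 1.56, 1.39, 1.43, 1.42,
             1.36, NA, 1.43, 1.57, 1.54),
  Stock3 = c(1605, 1580, 1490, 1520, 1550,
             1525, 1495, 1485, 1470, 1510),
  Stock4 = c(95.05, 97.49, 88.57, 85.55, 92.04,
             91.70, 89.88, 93.17, 90.12, 92.14),
  row.names = paste0("Day", 1:10)
)

# Quick structure check: 10 rows (days) and 4 columns (stocks)
dim(StockData)

# The single missing value sits in Stock2 on Day7
StockData["Day7", "Stock2"]
```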

Mean price of each stock

# We will use the apply function
# MARGIN = 2 means columns; the data is StockData and the function is mean()
# NA is returned for column 2 because Stock2 has a missing value on Day7
apply(X = StockData, MARGIN = 2, FUN = mean)
##   Stock1   Stock2   Stock3   Stock4 
##  163.746       NA 1523.000   91.571

Dealing with NA

# na.rm is an argument of mean(), not a function; apply passes it on to FUN
# With na.rm = TRUE the NA values are removed, so we get a mean for all 4 stocks
apply(X = StockData, MARGIN = 2, FUN = mean, na.rm = TRUE)
##      Stock1      Stock2      Stock3      Stock4 
##  163.746000    1.463333 1523.000000   91.571000
# Save the result of apply to an object
AVG <- apply(X = StockData, MARGIN = 2, FUN = mean, na.rm = TRUE)
AVG
##      Stock1      Stock2      Stock3      Stock4 
##  163.746000    1.463333 1523.000000   91.571000
# Once comfortable with the commands and the default argument order, we can skip the argument names
apply(StockData, 2, mean, na.rm = TRUE)
##      Stock1      Stock2      Stock3      Stock4 
##  163.746000    1.463333 1523.000000   91.571000
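FUN does not have to be a named function. As a sketch, an anonymous function can count the missing values per column before we decide to drop them with na.rm = TRUE. Only Stock1 and Stock2 are retyped from the printout here to keep the example short, and the object names TwoStocks and na_count are hypothetical.

```r
# Two of the four stocks, retyped from the printout above
TwoStocks <- data.frame(
  Stock1 = c(185.74, 184.26, 162.21, 159.04, 164.87,
             162.72, 157.89, 159.49, 150.22, 151.02),
  Stock2 = c(1.47, 1.56, 1.39, 1.43, 1.42,
             1.36, NA, 1.43, 1.57, 1.54)
)

# Count the NA values in each column with an anonymous function
na_count <- apply(TwoStocks, 2, function(x) sum(is.na(x)))
na_count   # Stock1 has 0 missing values, Stock2 has 1

# Number of observations actually used for each mean when na.rm = TRUE
nrow(TwoStocks) - na_count
```

This makes it visible that the Stock2 mean is based on 9 days rather than 10.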

colMeans function

# The colMeans command does the same as the apply command that we used above
# The function already knows that it should take the mean of each column
# So it only needs the data, plus na.rm = TRUE to drop the NA values
colMeans(StockData, na.rm = TRUE)
##      Stock1      Stock2      Stock3      Stock4 
##  163.746000    1.463333 1523.000000   91.571000
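As a quick check that the two approaches agree, the sketch below compares them with all.equal(). It uses only Stock2 and Stock4, retyped from the printout, and the object names are hypothetical.

```r
# Two columns retyped from the printout above, including the NA in Stock2
TwoStocks <- data.frame(
  Stock2 = c(1.47, 1.56, 1.39, 1.43, 1.42,
             1.36, NA, 1.43, 1.57, 1.54),
  Stock4 = c(95.05, 97.49, 88.57, 85.55, 92.04,
             91.70, 89.88, 93.17, 90.12, 92.14)
)

via_apply    <- apply(TwoStocks, 2, mean, na.rm = TRUE)
via_colMeans <- colMeans(TwoStocks, na.rm = TRUE)

# all.equal() compares the two named vectors with a small numeric tolerance
all.equal(via_apply, via_colMeans)
```

The R documentation for colMeans describes it as equivalent to apply with FUN = mean but a lot faster, so it is the natural choice when only means or sums are needed.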

Max values and percentiles

# Max values of the stocks
apply(X = StockData, MARGIN = 2, FUN = max, na.rm=TRUE)
##  Stock1  Stock2  Stock3  Stock4 
##  185.74    1.57 1605.00   97.49
# 20th and 80th percentiles
apply(X = StockData, MARGIN = 2, FUN = quantile, probs=c(0.2, 0.8), na.rm=TRUE)
##      Stock1 Stock2 Stock3 Stock4
## 20% 156.516  1.408   1489 89.618
## 80% 168.748  1.548   1556 93.546
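When FUN returns more than one value, apply() returns a matrix with one column per input column, as in the percentile output above. Here is a sketch of storing and indexing that result, using only Stock1 and Stock3 retyped from the printout. The object name pct is hypothetical.

```r
# Two columns retyped from the printout above
TwoStocks <- data.frame(
  Stock1 = c(185.74, 184.26, 162.21, 159.04, 164.87,
             162.72, 157.89, 159.49, 150.22, 151.02),
  Stock3 = c(1605, 1580, 1490, 1520, 1550,
             1525, 1495, 1485, 1470, 1510)
)

# quantile() returns two values per column, so the result is a 2 x 2 matrix
pct <- apply(TwoStocks, 2, quantile, probs = c(0.2, 0.8), na.rm = TRUE)

# Rows are named after the probabilities, columns after the stocks
pct["20%", "Stock1"]
pct["80%", ]

# Spread between the 80th and 20th percentile of each stock
pct["80%", ] - pct["20%", ]
```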

Row sums

# Sum for each row
apply(X=StockData, MARGIN = 1, FUN = sum, na.rm=TRUE)
##    Day1    Day2    Day3    Day4    Day5    Day6    Day7    Day8    Day9   Day10 
## 1887.26 1863.31 1742.17 1766.02 1808.33 1780.78 1742.77 1739.09 1711.91 1754.70
# Just as colMeans gives column means, there is a rowSums command for row sums
rowSums(StockData, na.rm = TRUE)
##    Day1    Day2    Day3    Day4    Day5    Day6    Day7    Day8    Day9   Day10 
## 1887.26 1863.31 1742.17 1766.02 1808.33 1780.78 1742.77 1739.09 1711.91 1754.70

Plots

# Create line plots for each of the stocks
apply(X = StockData, MARGIN = 2, FUN = plot, type="l", main="Stock", ylab="Price", xlab="Day")

## NULL
# Plot for total per day
plot(apply(X=StockData, MARGIN = 1, FUN = sum, na.rm=TRUE), type = "l", ylab = "Total Market Value", xlab = "Day", main = "Markets per day")
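In the per-stock plots above, every plot gets the same title "Stock", because FUN only receives the column values and not the column name. One possible workaround, sketched here under my own assumptions, is to loop over the column names instead. Only Stock1 and Stock3 are retyped from the printout, and the names TwoStocks and stock are hypothetical.

```r
# Two columns retyped from the printout above
TwoStocks <- data.frame(
  Stock1 = c(185.74, 184.26, 162.21, 159.04, 164.87,
             162.72, 157.89, 159.49, 150.22, 151.02),
  Stock3 = c(1605, 1580, 1490, 1520, 1550,
             1525, 1495, 1485, 1470, 1510)
)

# Loop over the column names so each plot can carry its own title
for (stock in names(TwoStocks)) {
  plot(TwoStocks[[stock]], type = "l",
       main = stock, ylab = "Price", xlab = "Day")
}

# If apply() is kept, invisible() hides the NULL that it returns
invisible(apply(TwoStocks, 2, plot, type = "l",
                main = "Stock", ylab = "Price", xlab = "Day"))
```

This is a case where an explicit loop is the simpler tool, since the plots are drawn for their side effect and there is no result to collect.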

View this page on my site: https://dataz4s.com/r-statistical-programming/apply-function-in-r/