Functions

Illya Mowerman

11/27/2017

What are functions?

A function is an algorithm that receives inputs, processes them, and then returns one or more values

Why use functions?

A simple function: x^2

fun1 <- function(x){ x^2 }

fun1(7)
## [1] 49
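
The definition above says a function can return more than one value, while fun1 returns a single result. As a minimal sketch (the name `fun_stats` and its `digits` argument are made up for illustration and are not part of the original examples), a function can bundle several results into a named list, and an argument can carry a default value:

```r
# Hypothetical helper: returns several values at once in a named list
fun_stats <- function(x, digits = 2) {
  list(
    squared = x^2,                      # element-wise square, like fun1
    mean    = round(mean(x), digits),   # rounded using the default digits = 2
    range   = range(x)                  # smallest and largest value
  )
}

res <- fun_stats(c(1, 2, 3))
res$squared  # 1 4 9
res$mean     # 2
```

Each element of the returned list is then accessed by name with `$`.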

Once a function is defined, we can use it almost anywhere

cars <- mtcars
head(cars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
cars$mpg2 <- fun1(cars$mpg)

head(cars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
##                     mpg2
## Mazda RX4         441.00
## Mazda RX4 Wag     441.00
## Datsun 710        519.84
## Hornet 4 Drive    457.96
## Hornet Sportabout 349.69
## Valiant           327.61
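
Creating a new column is one place a function can be reused. As a rough sketch of two others, the block below calls fun1 from inside another function and applies it to several columns at once with `sapply()`. The helper name `sum_of_squares` is an assumption introduced only for this illustration.

```r
# fun1 as defined earlier in this document
fun1 <- function(x){ x^2 }

# Hypothetical helper that reuses fun1 inside another function
sum_of_squares <- function(x) {
  sum(fun1(x))
}

sum_of_squares(c(1, 2, 3))  # 1 + 4 + 9 = 14

# Apply fun1 to two columns of mtcars in one call;
# the result is a matrix with one column per input column
squared_cols <- sapply(mtcars[ , c('mpg' , 'wt')], fun1)
head(squared_cols, 3)
```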

Let’s create a nice plot

library(ggplot2)

ggplot(cars) +
  geom_smooth(aes(x = mpg , y = wt))
## `geom_smooth()` using method = 'loess'

If you want to create the same plot over and over, but with other variables on the y axis, then create a function

scatter_mpg <- function(yAxisVar){
  ggplot(cars) +
    geom_smooth(aes(x = mpg , y = yAxisVar))
}

scatter_mpg(cars$hp)
## `geom_smooth()` using method = 'loess'

Let’s add some more parameters to the function, like labels

scatter_mpg <- function(yAxisVar , ylabel){
  ggplot(cars) +
    geom_smooth(aes(x = mpg , y = yAxisVar)) +
    labs(x = 'Miles per Gallon' , y = ylabel)
}

scatter_mpg(cars$hp , 'Horse Power')
## `geom_smooth()` using method = 'loess'
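
The function above receives a whole column (cars$hp) and always plots the global cars data frame. One possible alternative, sketched here under the assumption that a reasonably recent ggplot2 is installed, passes the data frame and the column name as a string and looks the column up with the `.data` pronoun. The name `smooth_mpg` and its arguments are hypothetical and not part of the original examples.

```r
library(ggplot2)

cars <- mtcars

# Hypothetical variant: y_var is a column name given as a string,
# and the label defaults to that name when none is supplied
smooth_mpg <- function(data, y_var, ylabel = y_var) {
  ggplot(data) +
    geom_smooth(aes(x = mpg , y = .data[[y_var]]),
                method = 'loess', formula = y ~ x) +
    labs(x = 'Miles per Gallon' , y = ylabel)
}

p <- smooth_mpg(cars, 'hp', 'Horse Power')
p
```

Taking the column name as a string also makes the function easy to call from a loop over several variable names.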

Loops: Loops are like history, same story, different actors

In the function world, if you want to run a few correlations with the variable mpg, you would write the following code

fun_corr <- function(var2){ cor.test(cars$mpg , var2)}

fun_corr(cars$hp)
## 
##  Pearson's product-moment correlation
## 
## data:  cars$mpg and var2
## t = -6.7424, df = 30, p-value = 1.788e-07
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.8852686 -0.5860994
## sample estimates:
##        cor 
## -0.7761684
fun_corr(cars$disp)
## 
##  Pearson's product-moment correlation
## 
## data:  cars$mpg and var2
## t = -8.7472, df = 30, p-value = 9.38e-10
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.9233594 -0.7081376
## sample estimates:
##        cor 
## -0.8475514
fun_corr(cars$wt)
## 
##  Pearson's product-moment correlation
## 
## data:  cars$mpg and var2
## t = -9.559, df = 30, p-value = 1.294e-10
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.9338264 -0.7440872
## sample estimates:
##        cor 
## -0.8676594

In the loop world, the same code would be as follows

fun_corr2 <- function(var2){ cor.test(cars$mpg , cars[ , var2])}

vars_for_corr <- c('hp' , 'disp' , 'wt')

for (i in vars_for_corr){
  
  print(fun_corr2(i))
}
## 
##  Pearson's product-moment correlation
## 
## data:  cars$mpg and cars[, var2]
## t = -6.7424, df = 30, p-value = 1.788e-07
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.8852686 -0.5860994
## sample estimates:
##        cor 
## -0.7761684 
## 
## 
##  Pearson's product-moment correlation
## 
## data:  cars$mpg and cars[, var2]
## t = -8.7472, df = 30, p-value = 9.38e-10
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.9233594 -0.7081376
## sample estimates:
##        cor 
## -0.8475514 
## 
## 
##  Pearson's product-moment correlation
## 
## data:  cars$mpg and cars[, var2]
## t = -9.559, df = 30, p-value = 1.294e-10
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.9338264 -0.7440872
## sample estimates:
##        cor 
## -0.8676594
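
The loop above prints each full test result. If only the correlation coefficients are needed, one option is to collect them into a named vector with `sapply()`. This is a sketch rather than the author's method, and the name `cor_estimates` is an assumption introduced for the example.

```r
cars <- mtcars
vars_for_corr <- c('hp' , 'disp' , 'wt')

# For each column name, run cor.test against mpg and keep only the estimate
cor_estimates <- sapply(vars_for_corr, function(v) {
  unname(cor.test(cars$mpg , cars[ , v])$estimate)
})

# A named numeric vector: one correlation per variable
round(cor_estimates, 4)
```

The values match the `cor` estimates printed by the loop, and the result can be sorted or plotted directly.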

Loops can also loop through numeric values: Fibonacci sequence

len <- 10

fibvals <- numeric()

fibvals[1] <- 1
fibvals[2] <- 1

for (i in 3:len) { 
  fibvals[i] <- fibvals[i-1]+fibvals[i-2]
} 

print(fibvals)
##  [1]  1  1  2  3  5  8 13 21 34 55

We can also have a loop within a function: Determine the i-th number in the Fibonacci sequence

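fib_fun <- function(len){

  # The first two Fibonacci numbers are both 1; returning early also
  # avoids 3:len counting backwards when len is 1 or 2
  if (len < 3) return(1)

  fibvals <- numeric(len)

  fibvals[1] <- 1
  fibvals[2] <- 1

  # Each later value is the sum of the two before it
  for (i in 3:len) {
    fibvals[i] <- fibvals[i-1]+fibvals[i-2]
  }

  return(fibvals[len])
}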

fib_fun(12)
## [1] 144
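
The function above stores the whole sequence even though only the last value is returned. A minimal sketch of an alternative keeps just the two most recent values. The name `fib_nth` is hypothetical, and the sketch assumes n is a whole number of at least 1.

```r
# Hypothetical variant: track only the last two Fibonacci numbers
fib_nth <- function(n) {
  stopifnot(n >= 1)

  prev <- 1  # value at position i - 1
  curr <- 1  # value at position i

  # Positions 1 and 2 need no iterations; position n needs n - 2
  for (i in seq_len(max(n - 2, 0))) {
    nxt  <- prev + curr
    prev <- curr
    curr <- nxt
  }

  curr
}

fib_nth(12)  # 144
```

Using `seq_len()` instead of `3:n` means the loop body is simply skipped when n is 1 or 2.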