Functional Programming in R with purrr

I have been writing about the family of packages in the tidyverse. The tidyverse is a collection of R packages for data manipulation, exploration, and visualization that share a common design philosophy. Today, I will talk about purrr.

As I enter the second-to-last semester of my Master of Science in Data Science program, I have become accustomed to writing the same kind of code many times over for various analyses. But typing near-identical code again and again invites mistakes, and those mistakes throw errors. Take the following code, typos included, for example:

aov_mpg <- aov(mpg ~ factor(cyl), data = mtcars)
summary(aov_mpg)

aov_disp <- aov(disp ~ factor(cyll), data = mtcars)
summary(aov_disp)

aov_hp <- aov(hp ~ factor(cyl), data = mrcars)
summry(aov_hpp)

aov_wt <- aov(wt ~ factor(cyl), datas = mtcars)
summary(aov_wt)

In the code chunk above, if you wanted to run the ANOVAs by number of gears instead of number of cylinders, you would have to go back and change factor(cyl) to factor(gear) four times. This is not very efficient, and because you have to type everything multiple times, you are likely to end up with mistakes. It gets worse if you have to repeat the analysis for hundreds of variables.

This is where purrr comes in. purrr cuts down on repetition by applying one function to every element of a list or vector. Here we use purrr to run the same one-way ANOVAs for several dependent variables against a single independent variable. We can see that purrr requires less code, and if we were to change a variable, we would only have to do it once. That's the beauty of purrr.

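library(dplyr)       # mutate(), select(), %>%
library(purrr)       # map(), map_dfr()
library(broom)       # tidy()
library(knitr)       # kable()
library(kableExtra)  # kable_styling()

mtcars %>%
  mutate(cyl = factor(cyl)) %>%
  select(mpg, disp, hp) %>%               # the dependent variables
  # aov() is given data = mtcars (the original data frame), so cyl enters as numeric (1 df)
  map(~ aov(.x ~ cyl, data = mtcars)) %>%
  map_dfr(~ tidy(.), .id = 'source') %>%  # one tidy row per model term
  mutate(p.value = round(p.value, 5)) %>%
  kable() %>%
  kable_styling()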

source  term       df  sumsq        meansq        statistic  p.value
mpg     cyl        1   817.7130     817.71295     79.56103   0
mpg     Residuals  30  308.3342     10.27781      NA         NA
disp    cyl        1   387454.0926  387454.09261  130.99888  0
disp    Residuals  30  88730.7021   2957.69007    NA         NA
hp      cyl        1   100984.1721  100984.17209  67.70993   0
hp      Residuals  30  44742.7029   1491.42343    NA         NA
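
One thing to notice in the table above is that cyl has 1 degree of freedom. The models were fit with data = mtcars, where cyl is still numeric, so the factor conversion earlier in the pipeline never reaches aov(). Below is a minimal sketch of one way to keep cyl as a factor and to store the grouping variable in a single place, so that switching to gear really is a one-line change. The names group_var, outcomes, fit_one, models, and anova_tbl are my own for this illustration, and building the formula with reformulate() is just one possible approach, not the only one.

```r
library(purrr)  # map(), map_dfr(), set_names()
library(broom)  # tidy()

# Hypothetical names for this sketch: the grouping variable lives in ONE place.
group_var <- "cyl"                      # change to "gear" here, and only here
outcomes  <- c("mpg", "disp", "hp", "wt")

# Build e.g. mpg ~ factor(cyl) from strings, then fit a one-way ANOVA.
fit_one <- function(outcome) {
  rhs <- sprintf("factor(%s)", group_var)
  aov(reformulate(rhs, response = outcome), data = mtcars)
}

# set_names() names each element after its outcome so .id can label the rows.
models    <- map(set_names(outcomes), fit_one)
anova_tbl <- map_dfr(models, tidy, .id = "source")

anova_tbl
```

Because the grouping variable is wrapped in factor() inside the formula itself, the grouping term should now carry one degree of freedom less than the number of groups, rather than 1.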