Multiple Steps and Nested Functions

Let’s say we want to use R to find the following:

\[\sqrt{ | \log_{10}{0.75} |}\]

and round the result to two decimal places. How can we achieve that?

Step-by-step we need to:

  1. Find the log of 0.75
  2. Take the absolute value
  3. Square root the result
  4. Round the answer to 2 decimal places

How would that look in R?

Because of how R evaluates nested functions (innermost first), you actually need to write the last step first and work your way back to the first step!

# Written outside-in: round first, then the square root, then the absolute value, then the log
round(digits = 2, sqrt(abs(log10(0.75))))
## [1] 0.35
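
To see what each nested call contributes, the same four steps can be run one at a time with the intermediate results stored along the way. This is only a sketch: the names step1, step2, step3, and answer are made up for illustration and are not part of the original example.

```r
# Step 1: find the base-10 log of 0.75 (this is a negative number)
step1 <- log10(0.75)

# Step 2: take the absolute value so the square root is defined
step2 <- abs(step1)

# Step 3: take the square root
step3 <- sqrt(step2)

# Step 4: round the answer to 2 decimal places
answer <- round(step3, digits = 2)
answer
## [1] 0.35
```

This gives the same answer as the nested version, at the cost of creating three objects we never use again.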

To improve readability, a common practice is to put each nested function on its own line:

round(
  digits = 2, 
  sqrt(
    abs(
      log10(
        0.75
      )
    )
  )
)
## [1] 0.35

Using Pipes in R

One reason the nested code is difficult to read is that the first function written is the one we use in step 4 (round()), while the last function written is the one we want to use first (log10()). So we need to read the code from the center outward (a “middle-out” approach) to understand what is going on. Ideally, we’d start with the first function we need and end with the last function we’d use. So how do we do that?

“Piping” was created as an approach to make code more natural to read and write. The pipe operator (|> in base R 4.1.0 and later, or %>% from the magrittr package used throughout the tidyverse) will “pass” the result of one function to the next, in the order we use them.

I’ll be using the native pipe, |>, but you can use either!
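
As a small sketch of what “passing” means, the native pipe places whatever is on its left into the first argument of the function on its right. The object names below (piped, nested, and so on) are made up for illustration, and the code assumes R 4.1.0 or later.

```r
# x |> f() is just another way of writing f(x)
piped  <- 0.75 |> log10()
nested <- log10(0.75)
identical(piped, nested)
## [1] TRUE

# Extra arguments stay inside the parentheses: x |> f(y) means f(x, y)
piped_round  <- 2 |> sqrt() |> round(digits = 2)
nested_round <- round(sqrt(2), digits = 2)
identical(piped_round, nested_round)
## [1] TRUE

# quote() shows how R rewrites the piped code before running it
rewritten <- quote(0.75 |> log10() |> abs())
rewritten
## abs(log10(0.75))
```

So a pipe does not change what R calculates, only the order in which we write it.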


Note

You can insert either pipe with the keyboard shortcut Ctrl+Shift+M on a PC or Cmd+Shift+M on a Mac.

The shortcut inserts %>% by default, but you can change it to insert |> under

Tools > Global Options… > Code > Use Native Pipe Operator

It’s faster once you get the hang of it!


Using pipes, we can have R calculate the number in the same order of steps we would follow by hand!

  1. Find the log of 0.75
  2. Take the absolute value
  3. Square root the result
  4. Round the answer to 2 decimal places

# Start with 0.75
0.75 |> 
  # Then pass it to the log10() function
  log10() |> 
  # Next, we need the absolute value
  abs() |> 
  # Now that it is positive, we can take the square root
  sqrt() |> 
  # We can pass it into round, but we'll need to specify how many digits to round
  round(digits = 2)
## [1] 0.35

So what’s the big deal about pipes? We can use them with many functions, not just basic calculations!

# Getting the data set from ggplot2
ggplot2::mpg |> 
  # Only keeping the cars from 2008
  dplyr::filter(year == 2008) |> 
  # Pulling out the manufacturer column from the data frame and changing it to a vector
  dplyr::pull(manufacturer) |> 
  # Counting the number times each manufacturer appears using table()
  table(dnn = "Manufacturer") |> 
  # Converting it from a table type object to a data.frame object
  data.frame() |> 
  # Changing the column name from Freq to Count
  dplyr::rename(Count = Freq) |> 
  # Making the data.frame look nice for the knitted document with gt()
  gt::gt()
Manufacturer Count
audi 9
chevrolet 12
dodge 21
ford 10
honda 4
hyundai 8
jeep 6
land rover 2
lincoln 1
mercury 2
nissan 7
pontiac 2
subaru 8
toyota 14
volkswagen 11

When Piping Doesn’t Work

When piping a data set into a function, we can only use the column names directly (no data$column) if the function has a data = argument, or another argument that accepts the data frame, and looks the column names up inside it. Sadly, there are many functions that don’t have a data argument, like table():

# Code commented out so the document will knit:
# table() has no data argument, so R cannot find an object called manufacturer
# ggplot2::mpg |>
#   table(manufacturer, class)

There is a different pipe operator that works with functions missing the data argument, but we won’t worry about that and you aren’t expected to use it.
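
If you are curious, base R also offers a couple of workarounds that keep the native pipe. The sketch below is not the alternative pipe operator mentioned above, and it swaps in the built-in mtcars data set (columns cyl and gear) so that it runs without any packages. The names tab_with and tab_xtabs are made up for illustration.

```r
# Option 1: with() evaluates an expression using the columns of the
# data frame that is piped into it
tab_with <- mtcars |>
  with(table(cyl, gear))

# Option 2: xtabs() does have a data argument. Naming formula = lets the
# piped data frame fall into the data slot
tab_xtabs <- mtcars |>
  xtabs(formula = ~ cyl + gear)

# Both tables count the same 32 cars
sum(tab_with)
## [1] 32
sum(tab_xtabs)
## [1] 32
```

Either way, the column names are looked up inside the data frame instead of in the global environment, which is exactly what table() on its own cannot do.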

Hopefully you can see how piping makes code easier to write and to read. But you still need to add descriptive comments!

Piping is like baking