The magrittr package is the source of the %>% family of pipe operators in R. Although its most commonly used pipe operator, %>%, is now also available through dplyr, magrittr offers other interesting pipe operators as well.
To begin testing magrittr, we import a Heart Disease dataset exported from Kaggle.
library(magrittr)
library(dplyr)
library(DT)
heart <- read.csv("https://raw.githubusercontent.com/mkivenson/Data-Acquisition-and-Management/master/TidyVerse/heart.csv") %>%
  rename(age = 1)
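datatable(heart)
Let’s say that we want to perform the following data wrangling steps on the dataset: select the age, sex, trestbps, and chol columns; keep only the rows where age is at most 50 and sex is 0; gather trestbps and chol into metric/value pairs; and sort the result by age.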
Using dplyr, tidyr, and DT alone (without pipe operators), the code would look something like this:
library(tidyr)  # provides gather()
data <- select(heart, age, sex, trestbps, chol)
data <- filter(data, (age <= 50) & (sex == 0))
data <- gather(data, "metric", "value", 3:4)
data <- arrange(data, age)
datatable(data)
If we instead use the magrittr %>% pipe operator, we can chain the same steps into a single expression. The result is the same, but the code is somewhat cleaner.
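The piped version is not reproduced here, so the following is only a minimal sketch of what it could look like. To stay self-contained it uses a small made-up data frame, heart_demo (hypothetical values, same column names as the heart data), instead of the downloaded file.

```r
library(dplyr)
library(tidyr)
library(magrittr)

# Hypothetical stand-in for the heart data (made-up values, real column names)
heart_demo <- data.frame(
  age      = c(63, 41, 37, 56, 45),
  sex      = c(1, 0, 0, 1, 0),
  trestbps = c(145, 130, 120, 120, 112),
  chol     = c(233, 204, 215, 236, 160),
  thalach  = c(150, 172, 187, 178, 138)
)

# The same four steps as the step-by-step version, chained left to right
piped <- heart_demo %>%
  select(age, sex, trestbps, chol) %>%
  filter((age <= 50) & (sex == 0)) %>%
  gather("metric", "value", 3:4) %>%
  arrange(age)

piped
```

Each %>% passes the result on its left as the first argument of the call on its right, so no intermediate data variable is needed.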
What if we wanted to modify the heart data frame directly, rather than creating a new data frame (or writing heart <- heart %>% ...)? We can start the chain with the %<>% assignment pipe instead of %>%:
heart %<>%
  select(age, sex, trestbps, chol) %>%
  filter((age <= 50) & (sex == 0)) %>%
  gather("metric", "value", 3:4) %>%
  arrange(age)
datatable(heart)
Additional Examples by Fernando Figueres
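Some functions, such as plot(), terminate a chain of piped calls because they do not return the data they were given. To avoid this, we can use the %T>% (tee) operator, which calls the function for its side effect and then passes the original left-hand side on to the next step. If, for example, we would like to display a plot and summary statistics, we can do as follows.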
##       age            value
##  Min.   :34.00   Min.   : 94.0
##  1st Qu.:41.00   1st Qu.:120.5
##  Median :44.00   Median :141.5
##  Mean   :43.52   Mean   :175.8
##  3rd Qu.:46.00   3rd Qu.:219.8
##  Max.   :50.00   Max.   :341.0
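The code that produced the summary above is not shown. The following is a minimal sketch of the %T>% pattern, assuming the plot and the summary were both taken over the age and value columns. The data frame long_demo is hypothetical (made-up values), so its numbers will differ from the output above.

```r
library(dplyr)
library(magrittr)

# Hypothetical long-format data, shaped like heart after the steps above
long_demo <- data.frame(
  age    = c(37, 37, 41, 41, 45, 45),
  metric = c("trestbps", "chol", "trestbps", "chol", "trestbps", "chol"),
  value  = c(120, 215, 130, 204, 112, 160)
)

stats <- long_demo %>%
  select(age, value) %T>%
  plot() %>%    # drawn for its side effect; its return value is discarded
  summary()     # receives the two-column data frame, not the result of plot()

stats
```

With a plain %>% in place of %T>%, summary() would receive whatever plot() returns rather than the data frame.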
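In some cases, we may need to refer to specific variables while working with piped arguments. The %$% (exposition) pipe exposes the columns of the left-hand side by name to the expression on the right, so we can access specific variables without breaking our chain. This is useful for functions that have no data argument.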
In the following example we’ll calculate the correlation between two variables of the heart data set.
## [1] 0.09042612
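The call that produced this value is not shown, and the text does not say which two variables were correlated. The following is a minimal sketch of the %$% pattern, assuming age and chol and using a hypothetical toy data frame (made-up values), so the number it returns will differ from the one above.

```r
library(magrittr)

# Hypothetical stand-in for two columns of the heart data (made-up values)
heart_demo <- data.frame(
  age  = c(63, 41, 37, 56, 45),
  chol = c(233, 204, 215, 236, 160)
)

# cor() has no data argument, so %$% exposes the columns by name
r_exposed <- heart_demo %$% cor(age, chol)

# Equivalent base R call, for comparison
r_base <- cor(heart_demo$age, heart_demo$chol)

r_exposed
```

The two calls compute the same correlation; %$% simply saves repeating the data frame name inside the function call.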