# This code automatically downloads and loads packages
# Better than repeating library() function many times
if (!requireNamespace("pacman")) {
  install.packages("pacman")
}
library(pacman)
p_load(tidyverse, janitor, styler)
This guide provides a practical introduction to pipe operators in R, demonstrating how they enhance readability, streamline data transformation workflows, and support reproducible analysis. Using built-in datasets and real examples, the document explains the Base R pipe |> and the magrittr family of pipes (%>%, %<>%, %$%, and %T>%), highlighting their syntax, use cases, and advantages. By comparing functionalities, illustrating best practices, and addressing common pitfalls, this work offers an accessible yet comprehensive resource for students, researchers, and practitioners seeking to write cleaner, more expressive R code.
Keywords: Data Analysis, R, Tidyverse, Magrittr, Base pipe, Machine Learning, WHO Training, R programming, Data Science
Pipe operators have become one of the most important tools in modern R programming (Wickham, Çetinkaya-Rundel, and Grolemund 2023; R Core Team 2025). They allow you to express a sequence of data transformations in a clear, readable, and intuitive manner.
This document provides a comprehensive guide to the major pipe operators in R, including:
The base R pipe |> (R Core Team 2025)
The magrittr pipes %>%, %<>%, %$%, and %T>% (Wickham, Çetinkaya-Rundel, and Grolemund 2023)
All examples use R’s built-in datasets.
Pipes increase code readability and reduce nested function calls. Compare:
Without pipes:
summary(lm(mpg ~ wt + hp, data = mtcars))
Call:
lm(formula = mpg ~ wt + hp, data = mtcars)
Residuals:
Min 1Q Median 3Q Max
-3.941 -1.600 -0.182 1.050 5.854
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 37.22727 1.59879 23.285 < 2e-16 ***
wt -3.87783 0.63273 -6.129 1.12e-06 ***
hp -0.03177 0.00903 -3.519 0.00145 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.593 on 29 degrees of freedom
Multiple R-squared: 0.8268, Adjusted R-squared: 0.8148
F-statistic: 69.21 on 2 and 29 DF, p-value: 9.109e-12
With a pipe:
mtcars |>
  lm(mpg ~ wt + hp, data = _) |>
  summary()
Call:
lm(formula = mpg ~ wt + hp, data = mtcars)
Residuals:
Min 1Q Median 3Q Max
-3.941 -1.600 -0.182 1.050 5.854
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 37.22727 1.59879 23.285 < 2e-16 ***
wt -3.87783 0.63273 -6.129 1.12e-06 ***
hp -0.03177 0.00903 -3.519 0.00145 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.593 on 29 degrees of freedom
Multiple R-squared: 0.8268, Adjusted R-squared: 0.8148
F-statistic: 69.21 on 2 and 29 DF, p-value: 9.109e-12
|>
The base pipe |> was introduced in R 4.1.0 (R Core Team 2025). It is written as “|” followed by “>” (the greater-than symbol) with no space in between (R Core Team 2024).
The base R pipe passes the value on its left as the first argument of the function call on its right. To pipe into another position, use the “_” placeholder, described below. The simplest case needs no placeholder:
mtcars |>
  head()
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
You read the code as “mtcars, then head()”. In short, you can read the pipe as “then”.
This is the same as calling:
head(mtcars)
_
The base R pipe placeholder “_” is used when the piped object must be inserted somewhere other than the first argument of a function. Like magrittr's %>%, the base R pipe passes its input to the first argument by default, so the “_” placeholder gives you precise control over where the data goes. It is available from R 4.2.0, must be supplied to a named argument, and can appear only once in a call.
For example:
mtcars |>
  lm(mpg ~ disp + hp, data = _) |>
  summary()
Call:
lm(formula = mpg ~ disp + hp, data = mtcars)
Residuals:
Min 1Q Median 3Q Max
-4.7945 -2.3036 -0.8246 1.8582 6.9363
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 30.735904 1.331566 23.083 < 2e-16 ***
disp -0.030346 0.007405 -4.098 0.000306 ***
hp -0.024840 0.013385 -1.856 0.073679 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 3.127 on 29 degrees of freedom
Multiple R-squared: 0.7482, Adjusted R-squared: 0.7309
F-statistic: 43.09 on 2 and 29 DF, p-value: 2.062e-09
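The two behaviours described above (first-argument insertion and the “_” placeholder) can be checked directly. The following is a minimal sketch, assuming R 4.2.0 or later; the object names same_call and car_label are made up for illustration.

```r
# Assumes R >= 4.2.0, where the `_` placeholder is available.
# The names same_call and car_label are illustrative only.

# The parser rewrites a base pipe into an ordinary function call,
# so the two quoted expressions below are the same call.
same_call <- identical(quote(mtcars |> head()), quote(head(mtcars)))

# `_` sends the piped value to a named argument that is not the first one:
# here the string becomes the `x` argument of sub().
car_label <- "Mazda RX4" |> sub(pattern = " ", replacement = "_", x = _)

print(same_call)
print(car_label)
```

Because the rewrite happens when the code is parsed, the base pipe adds no function-call overhead of its own; this is one reason it suits simple base R workflows.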
You could think of the pipe (both |> and %>%) as being synonymous with “then”.
mtcars |>
  subset(select = c(mpg, hp, wt)) |>
  transform(power_ratio = hp / wt) |>
  with(plot(mpg, power_ratio))
We could read this code as:
take mtcars, then
subset (select columns mpg, hp, wt), then
transform (power_ratio = hp / wt), then
plot(mpg, power_ratio)
%>%
The %>% pipe from magrittr is widely used in the tidyverse (Bache and Wickham 2014). The %>% pipe allows more flexibility than the base R pipe (Bache and Wickham 2014); for example, its “.” placeholder can appear more than once in a call and does not have to be a named argument:
# Note with %>% we use "." as the placeholder
# We do this when what we pipe appears in another position other than the first position.
mtcars %>%
  lm(mpg ~ wt + hp, data = .) %>%
  summary()
Call:
lm(formula = mpg ~ wt + hp, data = .)
Residuals:
Min 1Q Median 3Q Max
-3.941 -1.600 -0.182 1.050 5.854
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 37.22727 1.59879 23.285 < 2e-16 ***
wt -3.87783 0.63273 -6.129 1.12e-06 ***
hp -0.03177 0.00903 -3.519 0.00145 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.593 on 29 degrees of freedom
Multiple R-squared: 0.8268, Adjusted R-squared: 0.8148
F-statistic: 69.21 on 2 and 29 DF, p-value: 9.109e-12
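As a small illustration of the extra flexibility of the “.” placeholder, the sketch below uses it twice in one call and then inside a braced expression. It assumes the magrittr package is installed; the object names repeated and shares are made up for illustration.

```r
library(magrittr)

# "." used twice as direct arguments: this evaluates rep(5, 5).
repeated <- 5 %>% rep(., .)

# Braces turn the right-hand side into an arbitrary expression in ".",
# here dividing each value by the total.
shares <- c(1, 4, 9) %>% { . / sum(.) }

print(repeated)
print(shares)
```

Neither form works unchanged with the base pipe, whose “_” placeholder is limited to a single named argument.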
library(dplyr)
iris %>%
  group_by(Species) %>%
  summarise(across(everything(), mean))
# A tibble: 3 × 5
Species Sepal.Length Sepal.Width Petal.Length Petal.Width
<fct> <dbl> <dbl> <dbl> <dbl>
1 setosa 5.01 3.43 1.46 0.246
2 versicolor 5.94 2.77 4.26 1.33
3 virginica 6.59 2.97 5.55 2.03
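%<>%
The assignment pipe %<>% pipes an object into the expression on its right and assigns the result back to the same name, updating the object in place.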
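A minimal sketch of what the in-place update amounts to, assuming magrittr is installed; the vector name scores is hypothetical.

```r
library(magrittr)

# Hypothetical example vector.
scores <- c(3, 1, 2)

# Shorthand for: scores <- scores %>% sort()
scores %<>% sort()

print(scores)
```

The original value of scores is overwritten, so this form is best kept for cases where the earlier state is no longer needed.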
library(magrittr)
x <- 1:5
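x %<>% mean() # x is now 3
%$%
The exposition pipe %$% exposes the columns of a data frame (or the elements of a list) by name, so they can be used directly in the expression on its right.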
Example using mtcars:
mtcars %$% cor(mpg, wt)
[1] -0.8676594
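One way to see what the exposition pipe does is to compare it with two base R spellings of the same computation. This is a sketch assuming magrittr is installed; the names r_exposed, r_with, and r_dollar are made up for illustration.

```r
library(magrittr)

# Exposition pipe: the columns mpg and wt are visible by name.
r_exposed <- mtcars %$% cor(mpg, wt)

# Base R equivalents of the same correlation.
r_with <- with(mtcars, cor(mpg, wt))
r_dollar <- cor(mtcars$mpg, mtcars$wt)

print(c(r_exposed, r_with, r_dollar))
```

All three give the same value; %$% is mainly a convenience when a function such as cor() takes vectors rather than a data argument.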
%T>%
The tee pipe %T>% returns its left-hand side rather than the result of the right-hand side, so you can insert side effects (printing, plotting) without interrupting the pipeline.
Example with PlantGrowth:
PlantGrowth %>%
  group_by(group) %>%
  summarise(avg_weight = mean(weight)) %T>%
  print() %>%
  mutate(rank = rank(-avg_weight))
# A tibble: 3 × 2
group avg_weight
<fct> <dbl>
1 ctrl 5.03
2 trt1 4.66
3 trt2 5.53
# A tibble: 3 × 3
group avg_weight rank
<fct> <dbl> <dbl>
1 ctrl 5.03 2
2 trt1 4.66 3
3 trt2 5.53 1
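The difference between the tee pipe and the ordinary pipe is easiest to see with a side-effect function that does not return its input. The sketch below uses str(), which prints a description and returns NULL; it assumes magrittr is installed, and the object names are made up for illustration.

```r
library(magrittr)

values <- c(1, 2, 3)

# Tee pipe: str() runs for its printed output, then `values` itself
# is passed on to sum().
with_tee <- values %T>% str() %>% sum()

# Ordinary pipe: sum() receives the NULL returned by str(),
# and sum(NULL) is 0.
without_tee <- values %>% str() %>% sum()

print(c(with_tee = with_tee, without_tee = without_tee))
```

In the PlantGrowth example above the difference is less visible, because print() returns its argument invisibly; %T>% matters most with functions such as plot() or str() that return NULL.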
mtcars %>%
  mutate(
    # disp is in cubic inches, so this is horsepower per 1000 cu. in., not per litre
    hp_per_1000_cu_in = hp / disp * 1000
  ) %>%
  group_by(cyl) %>%
  summarise(
    avg_eff = mean(hp_per_1000_cu_in)
  )
# A tibble: 3 × 2
cyl avg_eff
<dbl> <dbl>
1 4 808.
2 6 706.
3 8 608.
library(ggplot2)
iris %>%
  ggplot(aes(Sepal.Length, Petal.Length, color = Species)) +
  geom_point()
ToothGrowth %>%
  group_by(supp, dose) %>%
  summarise(mean_len = mean(len))
# A tibble: 6 × 3
# Groups: supp [2]
supp dose mean_len
<fct> <dbl> <dbl>
1 OJ 0.5 13.2
2 OJ 1 22.7
3 OJ 2 26.1
4 VC 0.5 7.98
5 VC 1 16.8
6 VC 2 26.1
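Common pitfalls:
Using %>% without loading magrittr or a tidyverse package such as dplyr; %<>%, %$%, and %T>% additionally require library(magrittr), because library(tidyverse) does not attach them.
Writing long pipelines inside functions without testing them; isolate the steps when debugging.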
Overusing pipes for very simple tasks.
Not naming intermediate variables when pipelines exceed 6–7 steps.
Best practices:
Use |> for simple, base R workflows.
Use %>% for data wrangling, modeling, and plotting.
Use %<>% for in-place updates.
Use %$% when functions expect several vectors.
Use %T>% for side effects like printing or plotting.
Keep pipelines readable and modular.
Prefer tidyverse pipelines for clean, reproducible analysis.
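The advice to isolate steps and name intermediate results can be sketched with the base pipe and the mtcars columns used earlier. The object names selected and with_ratio are hypothetical and chosen only for illustration.

```r
# Hypothetical object names (selected, with_ratio) are for illustration only.

# Step 1: keep only the columns the analysis needs.
selected <- mtcars |>
  subset(select = c(mpg, hp, wt))

# Step 2: derive the new column from the intermediate result.
with_ratio <- selected |>
  transform(power_ratio = hp / wt)

# Each named step can now be inspected on its own while debugging.
str(selected)
head(with_ratio, 3)
```

Splitting at natural boundaries like this keeps each pipeline short and gives a debugger or a reader a named object to examine between stages.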
styler::style_file("pipe_operators_guide.qmd")
Styling 1 files:
pipe_operators_guide.qmd ℹ
────────────────────────────────────────
Status Count Legend
✔ 0 File unchanged.
ℹ 1 File changed.
✖ 0 Styling threw an error.
────────────────────────────────────────
Please review the changes carefully!
utils::sessionInfo()
R version 4.5.2 (2025-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26220)
Matrix products: default
LAPACK version 3.12.1
locale:
[1] LC_COLLATE=English_United Kingdom.utf8
[2] LC_CTYPE=English_United Kingdom.utf8
[3] LC_MONETARY=English_United Kingdom.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.utf8
time zone: America/La_Paz
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] styler_1.11.0 janitor_2.2.1 lubridate_1.9.4 forcats_1.0.1
[5] stringr_1.6.0 purrr_1.2.0 readr_2.1.5 tidyr_1.3.1
[9] tibble_3.3.0 tidyverse_2.0.0 pacman_0.5.1 ggplot2_4.0.0
[13] magrittr_2.0.4 dplyr_1.1.4
loaded via a namespace (and not attached):
[1] gtable_0.3.6 jsonlite_2.0.0 compiler_4.5.2 tidyselect_1.2.1
[5] snakecase_0.11.1 scales_1.4.0 yaml_2.3.10 fastmap_1.2.0
[9] R6_2.6.1 labeling_0.4.3 generics_0.1.4 knitr_1.50
[13] htmlwidgets_1.6.4 R.cache_0.17.0 pillar_1.11.1 RColorBrewer_1.1-3
[17] tzdb_0.5.0 R.utils_2.13.0 rlang_1.1.6 utf8_1.2.6
[21] stringi_1.8.7 xfun_0.54 S7_0.2.0 timechange_0.3.0
[25] cli_3.6.5 withr_3.0.2 digest_0.6.38 grid_4.5.2
[29] rstudioapi_0.17.1 hms_1.1.4 lifecycle_1.0.4 R.oo_1.27.1
[33] R.methodsS3_1.8.2 vctrs_0.6.5 evaluate_1.0.5 glue_1.8.0
[37] farver_2.1.2 codetools_0.2-20 rmarkdown_2.30 tools_4.5.2
[41] pkgconfig_2.0.3 htmltools_0.5.8.1