A Practical Guide to Pipe Operators in R

Independent Data Analysis Project

Author: John Karuitha, PhD

Published: November 14, 2025

Modified: November 14, 2025

Executive Summary

This guide provides a practical introduction to pipe operators in R, demonstrating how they enhance readability, streamline data transformation workflows, and support reproducible analysis. Using built-in datasets and real examples, the document explains the Base R pipe |> and the magrittr family of pipes—%>%, %<>%, %$%, and %T>%—highlighting their syntax, use cases, and advantages. By comparing functionalities, illustrating best practices, and addressing common pitfalls, this work offers an accessible yet comprehensive resource for students, researchers, and practitioners seeking to write cleaner, more expressive R code.

Keywords

Data Analysis, R, Tidyverse, Magrittr, Base pipe, Machine Learning, WHO Training, R programming, Data Science

1 Introduction

Pipe operators have become one of the most important tools in modern R programming (Wickham, Çetinkaya-Rundel, and Grolemund 2023; R Core Team 2025). They allow you to express a sequence of data transformations in a clear, readable, and intuitive manner.

This document provides a comprehensive guide to the major pipe operators in R: the Base R pipe |> and the magrittr pipes %>%, %<>%, %$%, and %T>%.

All examples use R’s built-in datasets.

Code
# Install pacman if needed; p_load() then installs any missing packages and loads them all
# This avoids repeating library() for every package
if (!requireNamespace("pacman", quietly = TRUE)) {
  install.packages("pacman")
}

library(pacman)

p_load(tidyverse, janitor, styler)

2 Why Use Pipes?

Pipes increase code readability and reduce nested function calls. Compare:

Without pipes:

Code
summary(lm(mpg ~ wt + hp, data = mtcars))

Call:
lm(formula = mpg ~ wt + hp, data = mtcars)

Residuals:
   Min     1Q Median     3Q    Max 
-3.941 -1.600 -0.182  1.050  5.854 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 37.22727    1.59879  23.285  < 2e-16 ***
wt          -3.87783    0.63273  -6.129 1.12e-06 ***
hp          -0.03177    0.00903  -3.519  0.00145 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.593 on 29 degrees of freedom
Multiple R-squared:  0.8268,    Adjusted R-squared:  0.8148 
F-statistic: 69.21 on 2 and 29 DF,  p-value: 9.109e-12

With a pipe:

Code
mtcars |>
  lm(mpg ~ wt + hp, data = _) |>
  summary()

Call:
lm(formula = mpg ~ wt + hp, data = mtcars)

Residuals:
   Min     1Q Median     3Q    Max 
-3.941 -1.600 -0.182  1.050  5.854 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 37.22727    1.59879  23.285  < 2e-16 ***
wt          -3.87783    0.63273  -6.129 1.12e-06 ***
hp          -0.03177    0.00903  -3.519  0.00145 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.593 on 29 degrees of freedom
Multiple R-squared:  0.8268,    Adjusted R-squared:  0.8148 
F-statistic: 69.21 on 2 and 29 DF,  p-value: 9.109e-12

3 The Base R Pipe |>

The base pipe |> was introduced in R 4.1.0 (R Core Team 2025). It is written as “|” followed by “>” (the greater-than symbol), with no space in between (R Core Team 2024).

The base pipe passes its input as the first argument of the function on the right. To pipe into a different position, use the “_” placeholder (available from R 4.2.0), which must be supplied to a named argument.

3.1 Basic Usage

Code
mtcars |>
  head()
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

You read the code as “mtcars, then head()”. In short, you can read the pipe as “then”.

This is the same as calling:

head(mtcars)
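
As a quick check of this equivalence, the sketch below compares the piped and direct forms with identical(). The n = 3 argument is only an illustrative choice: further arguments are written inside the call as usual, and the piped object still fills the first position.

```r
# Piped and direct calls return the same object
piped_default <- mtcars |> head()
direct_default <- head(mtcars)
same_default <- identical(piped_default, direct_default)

# Extra arguments go inside the call; the piped object is still the first argument
piped_three <- mtcars |> head(n = 3)
direct_three <- head(mtcars, n = 3)
same_three <- identical(piped_three, direct_three)

same_default
same_three
```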

3.2 Placeholder _

The Base R pipe placeholder “_” is used when the piped object needs to be inserted somewhere other than the first argument of a function. Like the magrittr pipe, the Base R pipe passes its input to the first argument by default, so the “_” placeholder gives you precise control over where the data should go, enabling more flexible and complex function calls.

For example:

Code
mtcars |> 
  lm(mpg ~ disp + hp, data = _) |> 
  summary()

Call:
lm(formula = mpg ~ disp + hp, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.7945 -2.3036 -0.8246  1.8582  6.9363 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 30.735904   1.331566  23.083  < 2e-16 ***
disp        -0.030346   0.007405  -4.098 0.000306 ***
hp          -0.024840   0.013385  -1.856 0.073679 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.127 on 29 degrees of freedom
Multiple R-squared:  0.7482,    Adjusted R-squared:  0.7309 
F-statistic: 43.09 on 2 and 29 DF,  p-value: 2.062e-09
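
The sketch below shows two ways to send the piped data to an argument other than the first. It assumes R 4.2.0 or later for the “_” placeholder; the anonymous-function form is an alternative that works from R 4.1.0. The object names fit_placeholder and fit_lambda are illustrative only.

```r
# Placeholder form: `_` must be supplied to a named argument (here, data =)
fit_placeholder <- mtcars |>
  lm(mpg ~ disp + hp, data = _)

# Anonymous-function form: wrap the call in a lambda and invoke it immediately
fit_lambda <- mtcars |>
  (\(d) lm(mpg ~ disp + hp, data = d))()

# Both fits estimate the same coefficients
all.equal(coef(fit_placeholder), coef(fit_lambda))
```

The lambda form is more verbose, but it also lets you use the piped object more than once inside the call.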

3.3 How to Read Piped Function Calls

You could think of the pipe (both |> and %>%) as being synonymous with “then”.

Code
mtcars |>
  subset(select = c(mpg, hp, wt)) |>
  transform(power_ratio = hp / wt) |>
  with(plot(mpg, power_ratio))

We could read this code as:

take mtcars, then
subset (select columns mpg, hp, wt), then
transform (power_ratio = hp / wt), then
plot(mpg, power_ratio)
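
To see what the “then” reading saves you, the sketch below writes the same transformation as a nested call, which has to be read from the inside out. The plotting step is left out so that the two results can be compared directly; the names piped and nested are illustrative.

```r
# Piped form: read top to bottom
piped <- mtcars |>
  subset(select = c(mpg, hp, wt)) |>
  transform(power_ratio = hp / wt)

# Nested form: the first step is the innermost call
nested <- transform(
  subset(mtcars, select = c(mpg, hp, wt)),
  power_ratio = hp / wt
)

# Both forms produce the same data frame
identical(piped, nested)
```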

4 The magrittr Pipe %>%

The %>% pipe from magrittr is widely used in the tidyverse (Bache and Wickham 2014). It allows more flexibility than the Base R pipe:

  • Supports the “.” placeholder, which plays the same role as the “_” placeholder of the base pipe.
Code
# With %>%, the placeholder is "."
# Use it when the piped object belongs somewhere other than the first argument.
mtcars %>%
  lm(mpg ~ wt + hp, data = .) %>%
  summary()

Call:
lm(formula = mpg ~ wt + hp, data = .)

Residuals:
   Min     1Q Median     3Q    Max 
-3.941 -1.600 -0.182  1.050  5.854 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 37.22727    1.59879  23.285  < 2e-16 ***
wt          -3.87783    0.63273  -6.129 1.12e-06 ***
hp          -0.03177    0.00903  -3.519  0.00145 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.593 on 29 degrees of freedom
Multiple R-squared:  0.8268,    Adjusted R-squared:  0.8148 
F-statistic: 69.21 on 2 and 29 DF,  p-value: 9.109e-12
  • Provides chaining-friendly transformations
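
Two further magrittr features illustrate this extra flexibility. The sketch below is a minimal illustration, not an exhaustive list: braces let “.” appear inside an arbitrary expression, and a pipeline that starts with “.” defines a reusable function. The names shifted_roots and root_of_sum are made up for this example.

```r
library(magrittr)

# Braces evaluate an arbitrary expression in which `.` is the piped value
shifted_roots <- c(1, 4, 9) %>% {
  sqrt(.) + 1
}

# A pipeline that starts with `.` defines a reusable function
root_of_sum <- . %>%
  sum() %>%
  sqrt()

shifted_roots
root_of_sum(c(16, 9))
```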

4.1 Basic Usage

Code
library(dplyr)

iris %>%
  group_by(Species) %>%
  summarise(across(everything(), mean))
# A tibble: 3 × 5
  Species    Sepal.Length Sepal.Width Petal.Length Petal.Width
  <fct>             <dbl>       <dbl>        <dbl>       <dbl>
1 setosa             5.01        3.43         1.46       0.246
2 versicolor         5.94        2.77         4.26       1.33 
3 virginica          6.59        2.97         5.55       2.03 

4.2 Other Pipes in magrittr (rarely used, so you may skip them for now)

5 The Assignment Pipe %<>%

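The assignment pipe %<>% pipes an object through the right-hand side and assigns the result back to the same name, updating the object in place.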

Code
library(magrittr)

x <- 1:5
x %<>% mean()
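
The sketch below spells out what the assignment pipe is shorthand for by comparing it with an explicit reassignment. The second variable, y, is introduced only for the comparison.

```r
library(magrittr)

# Assignment pipe: x is overwritten with the result
x <- 1:5
x %<>% mean()

# Explicit form with the same effect
y <- 1:5
y <- y %>% mean()

x
identical(x, y)
```

Because %<>% overwrites its input, the original vector is no longer available afterwards, which is worth keeping in mind when you run code interactively.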

6 The Exposition Pipe %$%

This pipe exposes the columns of a data frame (or the elements of a list) by name, so the expression on the right can use them directly, much like with().

Example using mtcars:

Code
mtcars %$% cor(mpg, wt)
[1] -0.8676594
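
For comparison, the sketch below computes the same correlation in three ways: with the exposition pipe, with with(), and with explicit $ extraction. The variable names are illustrative.

```r
library(magrittr)

# Exposition pipe: column names are visible to the right-hand expression
via_pipe <- mtcars %$% cor(mpg, wt)

# Base R equivalents
via_with <- with(mtcars, cor(mpg, wt))
via_dollar <- cor(mtcars$mpg, mtcars$wt)

all.equal(via_pipe, via_with)
all.equal(via_pipe, via_dollar)
```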

7 The Tee Pipe %T>%

The Tee pipe lets you insert side effects (printing, plotting) without interrupting the pipeline: it runs the right-hand side, then passes the original left-hand value on to the next step.

Example with PlantGrowth:

Code
PlantGrowth %>%
  group_by(group) %>%
  summarise(avg_weight = mean(weight)) %T>%
  print() %>%
  mutate(rank = rank(-avg_weight))
# A tibble: 3 × 2
  group avg_weight
  <fct>      <dbl>
1 ctrl        5.03
2 trt1        4.66
3 trt2        5.53
# A tibble: 3 × 3
  group avg_weight  rank
  <fct>      <dbl> <dbl>
1 ctrl        5.03     2
2 trt1        4.66     3
3 trt2        5.53     1
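
The difference from the regular pipe is easiest to see with a function that does not return its input. The sketch below uses str(), which prints a description and returns NULL; the names lost and kept are illustrative.

```r
library(magrittr)

# Regular pipe: str() returns NULL, so sum() receives NULL and the data is lost
lost <- c(1, 2, 3) %>%
  str() %>%
  sum()

# Tee pipe: str() runs for its side effect, then the original vector is passed on
kept <- c(1, 2, 3) %T>%
  str() %>%
  sum()

lost
kept
```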

8 Complex Examples

8.1 mtcars – horsepower efficiency

Code
mtcars %>%
  mutate(
    # disp is in cubic inches, so this is horsepower per 1000 cubic inches
    hp_per_1000_cu_in = hp / disp * 1000
  ) %>%
  group_by(cyl) %>%
  summarise(
    avg_eff = mean(hp_per_1000_cu_in)
  )
# A tibble: 3 × 2
    cyl avg_eff
  <dbl>   <dbl>
1     4    808.
2     6    706.
3     8    608.

8.2 iris – ggplot2 example

Code
library(ggplot2)

iris %>%
  ggplot(aes(Sepal.Length, Petal.Length, color = Species)) +
  geom_point()

8.3 ToothGrowth – comparing supplements

Code
ToothGrowth %>%
  group_by(supp, dose) %>%
  summarise(mean_len = mean(len))
# A tibble: 6 × 3
# Groups:   supp [2]
  supp   dose mean_len
  <fct> <dbl>    <dbl>
1 OJ      0.5    13.2 
2 OJ      1      22.7 
3 OJ      2      26.1 
4 VC      0.5     7.98
5 VC      1      16.8 
6 VC      2      26.1 

9 Performance

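  • The base pipe is rewritten into an ordinary nested call when the code is parsed, so it adds no run-time overhead; %>% is a function call and adds a small cost per step, which rarely matters outside tight loops.
  • magrittr adds flexibility, such as the “.” placeholder and the additional pipes covered above.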
  • Avoid extremely long pipelines.
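
One way to see where the base pipe's speed comes from is to inspect the parsed expression. The sketch below uses quote(), which captures code without running it, so magrittr does not need to be loaded; the names base_expr and magrittr_expr are illustrative.

```r
# The parser rewrites the base pipe, so the captured expression contains no pipe
base_expr <- quote(mtcars |> head(n = 3))
base_expr

# The magrittr pipe remains a call to the function `%>%`
magrittr_expr <- quote(mtcars %>% head(n = 3))
magrittr_expr[[1]]
```

This is a structural illustration rather than a benchmark; actual timings depend on the machine and the workload.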

10 Common Pitfalls

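  • Forgetting to load magrittr, or a package that re-exports %>% such as dplyr (part of the tidyverse), before using %>%. The other magrittr pipes require library(magrittr).

  • Debugging a long pipe inside a function as a single unit; isolate the steps and run them one at a time instead.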

  • Overusing pipes for very simple tasks.

  • Not naming intermediate variables when pipelines exceed 6–7 steps.
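
As a sketch of the advice on isolating steps and naming intermediate results, the hypothetical function below (mean_power_ratio is a made-up name) splits a pipeline into named stages, so each stage can be inspected or checked on its own while debugging. It uses only base R.

```r
mean_power_ratio <- function(data) {
  # Step 1: keep only the columns that are needed
  selected <- data |>
    subset(select = c(cyl, hp, wt))

  # Step 2: add the derived column
  with_ratio <- selected |>
    transform(power_ratio = hp / wt)

  # A named step can be checked before moving on
  stopifnot("power_ratio" %in% names(with_ratio))

  # Step 3: average the ratio within each cylinder group
  aggregate(power_ratio ~ cyl, data = with_ratio, FUN = mean)
}

mean_power_ratio(mtcars)
```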

11 Summary and Best Practices

  • Use |> for simple, base R workflows.

  • Use %>% for data wrangling, modeling, and plotting.

  • Use %<>% for in-place updates.

  • Use %$% when functions expect several vectors.

  • Use %T>% for side effects like printing or plotting.

  • Keep pipelines readable and modular.

  • Prefer tidyverse pipelines for clean, reproducible analysis.

Code
styler::style_file("pipe_operators_guide.qmd")
Styling  1  files:
 pipe_operators_guide.qmd ℹ 
────────────────────────────────────────
Status  Count   Legend 
✔   0   File unchanged.
ℹ   1   File changed.
✖   0   Styling threw an error.
────────────────────────────────────────
Please review the changes carefully!
Code
utils::sessionInfo()
R version 4.5.2 (2025-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26220)

Matrix products: default
  LAPACK version 3.12.1

locale:
[1] LC_COLLATE=English_United Kingdom.utf8 
[2] LC_CTYPE=English_United Kingdom.utf8   
[3] LC_MONETARY=English_United Kingdom.utf8
[4] LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.utf8    

time zone: America/La_Paz
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] styler_1.11.0   janitor_2.2.1   lubridate_1.9.4 forcats_1.0.1  
 [5] stringr_1.6.0   purrr_1.2.0     readr_2.1.5     tidyr_1.3.1    
 [9] tibble_3.3.0    tidyverse_2.0.0 pacman_0.5.1    ggplot2_4.0.0  
[13] magrittr_2.0.4  dplyr_1.1.4    

loaded via a namespace (and not attached):
 [1] gtable_0.3.6       jsonlite_2.0.0     compiler_4.5.2     tidyselect_1.2.1  
 [5] snakecase_0.11.1   scales_1.4.0       yaml_2.3.10        fastmap_1.2.0     
 [9] R6_2.6.1           labeling_0.4.3     generics_0.1.4     knitr_1.50        
[13] htmlwidgets_1.6.4  R.cache_0.17.0     pillar_1.11.1      RColorBrewer_1.1-3
[17] tzdb_0.5.0         R.utils_2.13.0     rlang_1.1.6        utf8_1.2.6        
[21] stringi_1.8.7      xfun_0.54          S7_0.2.0           timechange_0.3.0  
[25] cli_3.6.5          withr_3.0.2        digest_0.6.38      grid_4.5.2        
[29] rstudioapi_0.17.1  hms_1.1.4          lifecycle_1.0.4    R.oo_1.27.1       
[33] R.methodsS3_1.8.2  vctrs_0.6.5        evaluate_1.0.5     glue_1.8.0        
[37] farver_2.1.2       codetools_0.2-20   rmarkdown_2.30     tools_4.5.2       
[41] pkgconfig_2.0.3    htmltools_0.5.8.1 

References

Bache, Stefan Milton, and Hadley Wickham. 2014. “Magrittr: A Forward-Pipe Operator for R.” R Package Version 1.5.
R Core Team. 2024. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing.
———. 2025. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data, 2nd Edition. O’Reilly Media.