Introduction

R provides powerful functions that replace loops by applying a function over a collection of elements. This tutorial covers:

Function    Output Type                Package
lapply()    Always a list              base R
sapply()    Vector, matrix, or list    base R
vapply()    Strict type-safe vector    base R
mapply()    Vector or list             base R
map()       Always a list              purrr

1. lapply() — The Foundation

lapply(X, FUN) applies FUN to each element of X and always returns a list of the same length as X, whatever FUN returns.

# Square each number — result is a LIST
numbers <- list(1, 2, 3, 4, 5)

result <- lapply(numbers, function(x) x^2)
result
#> [[1]]
#> [1] 1
#> 
#> [[2]]
#> [1] 4
#> 
#> [[3]]
#> [1] 9
#> 
#> [[4]]
#> [1] 16
#> 
#> [[5]]
#> [1] 25
# Apply to a character vector — still returns a list
fruits <- c("apple", "banana", "cherry")

result_upper <- lapply(fruits, toupper)
result_upper
#> [[1]]
#> [1] "APPLE"
#> 
#> [[2]]
#> [1] "BANANA"
#> 
#> [[3]]
#> [1] "CHERRY"

Key rule: lapply → always a list. Predictable. Safe.
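
Because a named list (or a data frame, which is a list of columns) is itself a list, lapply() carries the element names through to the result. A minimal sketch, where the `scores` list and its values are made up for illustration:

```r
# Hypothetical data: two named numeric vectors of different lengths
scores <- list(math = c(90, 80, 70), art = c(60, 75))

# lapply() applies mean() to each element and keeps the names
means <- lapply(scores, mean)

names(means)   # "math" "art"
means$math     # 80
means$art      # 67.5
class(means)   # "list"
```

The result can be indexed by name, which is often handier than positional access such as result[[1]].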


2. sapply() — Simplified Output

sapply() works like lapply() but tries to simplify the result into a vector or matrix automatically, and falls back to a list when it cannot.

# sapply returns a plain numeric VECTOR (simpler than a list)
numbers <- 1:5

squared <- sapply(numbers, function(x) x^2)
squared          # plain vector
#> [1]  1  4  9 16 25
class(squared)   # "numeric"
#> [1] "numeric"
# When FUN returns multiple values, sapply makes a MATRIX
stats_result <- sapply(1:4, function(x) c(square = x^2, cube = x^3))
stats_result
#>        [,1] [,2] [,3] [,4]
#> square    1    4    9   16
#> cube      1    8   27   64
# Side-by-side comparison
lapply_out <- lapply(1:3, function(x) x * 10)
sapply_out <- sapply(1:3, function(x) x * 10)

cat("lapply gives:", class(lapply_out), "\n")
#> lapply gives: list
cat("sapply gives:", class(sapply_out), "\n")
#> sapply gives: numeric
sapply_out   # much cleaner!
#> [1] 10 20 30

Key rule: Use sapply() for interactive/exploratory work when you want cleaner output.
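
The reason sapply() is recommended mainly for interactive work is that its return type depends on the data. The sketch below (the inputs are made up for illustration) shows three cases in which the same kind of call comes back as a vector, a list, and an empty list, next to the vapply() call that keeps the declared type:

```r
# Equal-length results: simplified to a vector
same_len <- sapply(1:3, function(i) i * 10)
class(same_len)      # "numeric"

# Results of different lengths: no simplification, a list comes back
ragged <- sapply(1:3, function(i) seq_len(i))
class(ragged)        # "list"

# Zero-length input: sapply() returns an empty list, not numeric(0)
empty <- sapply(integer(0), function(x) x^2)
class(empty)         # "list"

# vapply() returns the declared type even for empty input
empty_typed <- vapply(integer(0), function(x) x^2, numeric(1))
class(empty_typed)   # "numeric"
```

Code that later does arithmetic on the result works in the first case and breaks in the other two, which is why scripts usually prefer a function with a fixed return type.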


3. vapply() — Type-Safe & Reliable

vapply(X, FUN, FUN.VALUE) is like sapply() but you declare the expected output type upfront. This prevents surprises in production code.

# FUN.VALUE = numeric(1) means: each result must be ONE number
numbers <- c(4, 9, 16, 25)

roots <- vapply(numbers, sqrt, FUN.VALUE = numeric(1))
roots
#> [1] 2 3 4 5
# Expect ONE character string per element
fruits <- c("apple", "banana", "cherry")

first_letters <- vapply(fruits, function(x) substr(x, 1, 1),
                        FUN.VALUE = character(1))
first_letters
#>  apple banana cherry 
#>    "a"    "b"    "c"
# vapply checks the type and length of what FUN RETURNS, not of the input
mixed_list <- list(1, "two", 3)

# No error here: as.numeric("two") yields NA with a warning, and NA is still
# a length-1 numeric, so every result passes vapply's check
tryCatch(
  vapply(mixed_list, as.numeric, FUN.VALUE = numeric(1)),
  error = function(e) cat("ERROR caught by vapply:", conditionMessage(e), "\n")
)
#> Warning in vapply(mixed_list, as.numeric, FUN.VALUE = numeric(1)): NAs
#> introduced by coercion
#> [1]  1 NA  3

Key rule: Use vapply() in scripts and functions where type correctness matters. It fails loudly when a result has the wrong type or length, where sapply() would silently return a different structure.
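
To see vapply() actually stop with an error, the function has to return something that does not match FUN.VALUE. The following is a minimal sketch; the identity function and the seq_len() call are chosen only to provoke the two kinds of mismatch, and the exact wording of the messages may vary with R version and locale:

```r
mixed_list <- list(1, "two", 3)

# Type mismatch: the identity function returns "two" unchanged (a character),
# which does not match FUN.VALUE = numeric(1), so vapply() stops
type_msg <- tryCatch(
  {
    vapply(mixed_list, function(x) x, FUN.VALUE = numeric(1))
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# Length mismatch: seq_len(2) has two elements, but one value was declared
length_msg <- tryCatch(
  {
    vapply(1:3, function(i) seq_len(i), FUN.VALUE = integer(1))
    "no error"
  },
  error = function(e) conditionMessage(e)
)

cat(type_msg, "\n")
cat(length_msg, "\n")
```

With sapply() both calls would succeed and quietly return a list instead of a vector.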


4. mapply() — Multiple Inputs in Parallel

mapply(FUN, ...) is the multivariate version of sapply(). It applies FUN to the corresponding elements of multiple vectors/lists at once.

# Multiply corresponding elements from two vectors
x_vals <- c(1, 2, 3, 4)
y_vals <- c(10, 20, 30, 40)

products <- mapply(function(x, y) x * y, x_vals, y_vals)
products
#> [1]  10  40  90 160
# Build personalised greetings from two vectors
names_vec <- c("Alice", "Bob", "Carol")
ages_vec  <- c(25, 30, 22)

greetings <- mapply(function(name, age) {
  paste0("Hello, ", name, "! You are ", age, " years old.")
}, names_vec, ages_vec)

greetings
#>                                 Alice                                   Bob 
#> "Hello, Alice! You are 25 years old."   "Hello, Bob! You are 30 years old." 
#>                                 Carol 
#> "Hello, Carol! You are 22 years old."
# Classic use: rep() with different times per element
mapply(rep, x = 1:4, times = 4:1)
#> [[1]]
#> [1] 1 1 1 1
#> 
#> [[2]]
#> [1] 2 2 2
#> 
#> [[3]]
#> [1] 3 3
#> 
#> [[4]]
#> [1] 4

Key rule: Use mapply() when your function needs two or more parallel inputs — like zipping multiple vectors together.
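
Two mapply() arguments are worth knowing beyond the basics: MoreArgs passes values that stay fixed across calls, and SIMPLIFY = FALSE turns off the sapply-style simplification. A short sketch, with a made-up `scale` parameter for illustration:

```r
# MoreArgs holds arguments that stay the same for every call;
# only x and y are iterated in parallel
scaled <- mapply(function(x, y, scale) (x + y) * scale,
                 1:3, 4:6,
                 MoreArgs = list(scale = 10))
scaled             # 50 70 90

# SIMPLIFY = FALSE always returns a list, whatever the results look like
sums_list <- mapply(function(x, y) x + y, 1:2, 3:4, SIMPLIFY = FALSE)
class(sums_list)   # "list"

# Map() is base R's wrapper around mapply() with SIMPLIFY = FALSE
sums_map <- Map(function(x, y) x + y, 1:2, 3:4)
class(sums_map)    # "list"
```

Map() is to mapply() roughly what lapply() is to sapply(): the same iteration with a predictable list result.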


5. map() — Modern Tidyverse Style (purrr)

map(.x, .f) from the purrr package is the tidyverse equivalent of lapply(). It always returns a list, but comes with typed variants:

Function     Returns
map()        list
map_dbl()    double vector
map_chr()    character vector
map_lgl()    logical vector
map_int()    integer vector
map2()       list (2 inputs)
pmap()       list (n inputs)

# Install purrr if needed: install.packages("purrr")
library(purrr)

# map() — always a list (like lapply)
result <- map(1:5, function(x) x^2)
result
#> [[1]]
#> [1] 1
#> 
#> [[2]]
#> [1] 4
#> 
#> [[3]]
#> [1] 9
#> 
#> [[4]]
#> [1] 16
#> 
#> [[5]]
#> [1] 25
# map_dbl() — returns a numeric vector directly
doubles <- map_dbl(1:5, ~ .x^2)   # ~ .x^2 is purrr's shorthand for function(x) x^2
doubles
#> [1]  1  4  9 16 25
class(doubles)
#> [1] "numeric"
# map_chr() — returns a character vector
labels <- map_chr(1:5, ~ paste0("Item_", .x))
labels
#> [1] "Item_1" "Item_2" "Item_3" "Item_4" "Item_5"
# map2_dbl(): like mapply, applies a function over TWO inputs and returns a double vector
prices  <- c(100, 250, 80)
qty     <- c(3, 1, 5)

totals <- map2_dbl(prices, qty, ~ .x * .y)
totals
#> [1] 300 250 400

Key rule: Use the map() family in tidyverse pipelines. The typed variants (map_dbl, map_chr, etc.) give you vapply()-style safety with cleaner syntax.
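
The table in this section lists pmap() for n inputs, while the examples stop at two. A minimal sketch of the three-input case using the typed variant pmap_dbl(); the `orders` list and its `discount` column are made up for illustration, and purrr is assumed to be installed:

```r
library(purrr)

# Hypothetical order data: three parallel vectors in one named list
orders <- list(
  price    = c(100, 250, 80),
  qty      = c(3, 1, 5),
  discount = c(0, 0.5, 0.25)
)

# pmap_dbl() matches list names to the function's argument names and
# calls the function once per position, returning a double vector
net <- pmap_dbl(orders, function(price, qty, discount) {
  price * qty * (1 - discount)
})
net   # 300 125 300
```

Because the arguments are matched by name, the order of the elements in `orders` does not matter.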


Quick Comparison Summary

# Quick demo: same task done 5 ways
input <- 1:5

cat("lapply  :", unlist(lapply(input, function(x) x * 2)), "\n")
#> lapply  : 2 4 6 8 10
cat("sapply  :", sapply(input, function(x) x * 2),          "\n")
#> sapply  : 2 4 6 8 10
cat("vapply  :", vapply(input, function(x) x * 2, numeric(1)), "\n")
#> vapply  : 2 4 6 8 10
cat("mapply  :", mapply(function(x, y) x * y, input, rep(2,5)), "\n")
#> mapply  : 2 4 6 8 10
cat("map_dbl :", map_dbl(input, ~ .x * 2),                   "\n")
#> map_dbl : 2 4 6 8 10

When to Use Which?

Situation                              Best Choice
Exploring data interactively           sapply()
Need a guaranteed vector/type          vapply() or map_dbl() / map_chr()
Writing reusable functions/packages    vapply()
Two or more parallel input vectors     mapply() or map2() / pmap()
Working in tidyverse/pipe chains       map() family (purrr)
Always need a list back                lapply() or map()

Conclusion

  • lapply() → safe list output, the building block of the family
  • sapply() → convenient auto-simplification for interactive use
  • vapply() → strict type checking, best for reliable scripts
  • mapply() → parallel iteration over multiple vectors
  • map() → modern purrr approach, integrates with tidyverse pipes

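All five functions help you avoid explicit for loops, making your R code shorter and often more readable. They are not inherently faster than a well-written for loop with a preallocated result, so choose them for clarity and predictable output rather than for speed.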
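
For comparison, here is the explicit loop that these functions replace, written next to its vapply() equivalent. This is only a sketch with made-up input values; both versions produce the same vector:

```r
input <- c(4, 9, 16)

# Explicit for loop: preallocate the result, then fill it by index
out_loop <- numeric(length(input))
for (i in seq_along(input)) {
  out_loop[i] <- sqrt(input[i])
}

# vapply(): the same computation in one call, with the type declared
out_vapply <- vapply(input, sqrt, FUN.VALUE = numeric(1))

identical(out_loop, out_vapply)   # TRUE
```

The loop needs three pieces of bookkeeping (preallocation, index, assignment) that the vapply() call handles for you.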


Published with RPubs. Knit this file in RStudio → File → Publish → RPubs.