suppressPackageStartupMessages(library("modelr"))
suppressPackageStartupMessages(library("tidyverse"))
suppressPackageStartupMessages(library("gapminder"))
Many functions in the stringr package take a character vector as input and return a list.
str_split(sentences[1:3], " ")
[[1]]
[1] "The" "birch" "canoe" "slid" "on" "the" "smooth" "planks."
[[2]]
[1] "Glue" "the" "sheet" "to" "the" "dark" "blue"
[8] "background."
[[3]]
[1] "It's" "easy" "to" "tell" "the" "depth" "of" "a" "well."
str_match_all(c("abc", "aa", "aabaa", "abbbc"), "a+")
[[1]]
[,1]
[1,] "a"
[[2]]
[,1]
[1,] "aa"
[[3]]
[,1]
[1,] "aa"
[2,] "aa"
[[4]]
[,1]
[1,] "a"
The map() function takes a vector and always returns a list.
map(1:3, runif)
[[1]]
[1] 0.4197749
[[2]]
[1] 0.6350175 0.5589084
[[3]]
[1] 0.03797927 0.41333901 0.85545111
Some examples of summary functions that, like quantile(), return multiple values are the following.
range(mtcars$mpg)
[1] 10.4 33.9
fivenum(mtcars$mpg)
[1] 10.40 15.35 19.20 22.80 33.90
boxplot.stats(mtcars$mpg)
$stats
[1] 10.40 15.35 19.20 22.80 33.90
$n
[1] 32
$conf
[1] 17.11916 21.28084
$out
numeric(0)
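Because each of these functions returns more than one value, its result does not fit in an ordinary summary column. A minimal sketch, assuming dplyr is loaded, of storing such results in list-columns by wrapping each call in list() (the names mpg_summaries, mpg_range and mpg_fivenum are made up for illustration):

```r
library(dplyr)

# Wrap each multi-value summary in list() so that the whole vector
# is stored in a single cell of a list-column.
mpg_summaries <- mtcars %>%
  group_by(cyl) %>%
  summarise(
    mpg_range = list(range(mpg)),      # length-2 vector per group
    mpg_fivenum = list(fivenum(mpg))   # length-5 vector per group
  )

# Extract the stored vector for the first group.
mpg_summaries$mpg_range[[1]]
```

Each row then holds one group, and the full vector can be recovered with `[[`.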
What is missing in the following data frame? How does quantile() return that missing piece? Why isn’t that helpful here?
mtcars %>%
  group_by(cyl) %>%
  summarise(q = list(quantile(mpg))) %>%
  unnest(cols = c(q))
The quantile probabilities that the values correspond to (0%, 25%, 50%, 75%, 100%) are missing. quantile() returns these as the names of the vector.
quantile(mtcars$mpg)
0% 25% 50% 75% 100%
10.400 15.425 19.200 22.800 33.900
Since unnest() drops the names of the vector, they aren’t useful here.
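One possible way to keep that information is to convert the named vector into a two-column tibble before unnesting. This is only a sketch, assuming the tibble package's enframe() is acceptable here; the names quantiles_by_cyl, quantile and mpg are chosen for illustration:

```r
library(dplyr)
library(tidyr)
library(tibble)

# enframe() turns a named vector into a tibble with one column for
# the names and one for the values, so the labels survive unnest().
quantiles_by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(
    q = list(enframe(quantile(mpg), name = "quantile", value = "mpg"))
  ) %>%
  unnest(cols = c(q))

quantiles_by_cyl
```

The result has one row per combination of cyl and quantile label, with the label in its own column rather than in the dropped names.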
mtcars %>%
group_by(cyl) %>%
summarise_each(funs(list))
It creates a data frame in which each row corresponds to a value of cyl, and each cell in every other column is a vector of all the values of that column for that value of cyl. It seems like it should be useful to have all the observations of each variable for each group, but off the top of my head, I can’t think of a specific use for this. It does, however, seem to cover many of the things that dplyr::do does.
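summarise_each() and funs() are deprecated in recent versions of dplyr. A sketch of what should be an equivalent call, assuming dplyr 1.0.0 or later where across() is available (the name cyl_lists is made up for illustration):

```r
library(dplyr)

# Apply list() to every non-grouping column, so each cell holds the
# full vector of that column's values within the group.
cyl_lists <- mtcars %>%
  group_by(cyl) %>%
  summarise(across(everything(), list))

# All mpg values for the first group.
cyl_lists$mpg[[1]]
```

Every column other than cyl is then a list-column with one vector per group.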