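This piece introduces the purrr, stringr, tidymodels, dplyr, collapse, and nimble packages.
In purrr, the functions for iterating over data include map, map2, pmap, invoke_map, lmap, and others. We start with map.
The map functions transform their input by applying a function to each element of a list or atomic vector and returning an object of the same length as the input. map() always returns a list; map_lgl(), map_int(), map_dbl(), and map_chr() return an atomic vector of the corresponding type; map_dfr() and map_dfc() return a data frame. The usage is as follows: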
map(.x, .f, ...)
map_lgl(.x, .f, ...)
map_chr(.x, .f, ...)
map_int(.x, .f, ...)
map_dbl(.x, .f, ...)
map_raw(.x, .f, ...)
map_dfr(.x, .f, ..., .id = NULL)
map_dfc(.x, .f, ...)
walk(.x, .f, ...)
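About the arguments:
.x: a list or an atomic vector. .f: a function, a formula, or a vector (a character or numeric vector is used to extract elements by name or position).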
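As a minimal sketch of how the return type depends on the variant, the example below applies several map functions to the same small list. The objects scores and people are made up for illustration only.

```r
library(purrr)

# Hypothetical input data, used only for this illustration
scores <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)

# map() always returns a list
means_list <- map(scores, mean)

# map_dbl() returns a double vector, one value per element
means_dbl <- map_dbl(scores, mean)

# map_int() returns an integer vector; length() already yields integers
sizes <- map_int(scores, length)

# map_chr() returns a character vector
labels <- map_chr(scores, ~ paste(length(.x), "values"))

# When .f is a string, it extracts the component with that name
people <- list(list(name = "Ann", age = 31), list(name = "Bob", age = 27))
names_chr <- map_chr(people, "name")
```

The typed variants fail with an error if a result does not have the expected type or is not of length one, which makes them a safer choice than simplifying a list afterwards.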
1 Vectors
library(purrr)
1:5 %>% map(rnorm,n=10)
## [[1]]
## [1] 0.26183272 0.64823698 1.12379658 1.25297323 1.31303127 1.79512995
## [7] 0.04887375 -0.03267512 0.49480743 0.42284264
##
## [[2]]
## [1] 3.1370780 1.8082809 3.4511962 1.6337153 1.6445375 2.0303527 3.9565843
## [8] 1.2004453 1.3725841 0.3694713
##
## [[3]]
## [1] 3.465118 2.398062 2.550844 2.673526 4.893808 4.612892 2.245932 3.725401
## [9] 4.893135 1.000897
##
## [[4]]
## [1] 4.673522 2.810388 4.588789 2.402792 3.456575 3.852705 4.487871 3.410695
## [9] 4.851114 4.277678
##
## [[5]]
## [1] 5.446313 4.067071 6.363740 5.655202 4.452537 5.426301 4.714484 5.854443
## [9] 5.133658 5.429843
The code above is equivalent to
1:5 %>%
map(function(x) rnorm(10, x))
and also to
1:5 %>%
map(~ rnorm(10, .x))
We can then summarize the output of map:
1:5 %>%
map(~ rnorm(10, .x)) %>% map_dbl(mean)
## [1] 1.084967 1.437935 3.124143 4.075858 4.842703
With map we can fit a regression model for each group:
mtcars %>%
split(.$cyl) %>%
map(~ lm(mpg ~ wt, data = .x)) %>%
map(summary)
## $`4`
##
## Call:
## lm(formula = mpg ~ wt, data = .x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.1513 -1.9795 -0.6272 1.9299 5.2523
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 39.571 4.347 9.104 7.77e-06 ***
## wt -5.647 1.850 -3.052 0.0137 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.332 on 9 degrees of freedom
## Multiple R-squared: 0.5086, Adjusted R-squared: 0.454
## F-statistic: 9.316 on 1 and 9 DF, p-value: 0.01374
##
##
## $`6`
##
## Call:
## lm(formula = mpg ~ wt, data = .x)
##
## Residuals:
## Mazda RX4 Mazda RX4 Wag Hornet 4 Drive Valiant Merc 280
## -0.1250 0.5840 1.9292 -0.6897 0.3547
## Merc 280C Ferrari Dino
## -1.0453 -1.0080
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 28.409 4.184 6.789 0.00105 **
## wt -2.780 1.335 -2.083 0.09176 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.165 on 5 degrees of freedom
## Multiple R-squared: 0.4645, Adjusted R-squared: 0.3574
## F-statistic: 4.337 on 1 and 5 DF, p-value: 0.09176
##
##
## $`8`
##
## Call:
## lm(formula = mpg ~ wt, data = .x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.1491 -1.4664 -0.8458 1.5711 3.7619
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 23.8680 3.0055 7.942 4.05e-06 ***
## wt -2.1924 0.7392 -2.966 0.0118 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.024 on 12 degrees of freedom
## Multiple R-squared: 0.423, Adjusted R-squared: 0.3749
## F-statistic: 8.796 on 1 and 12 DF, p-value: 0.01179
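Note how the formula is used here: in ~ lm(mpg ~ wt, data = .x), the placeholder .x stands for the data frame of the current group.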
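Printing every summary is verbose. As a sketch of how the per-group models can be condensed, the example below extracts one number per model. The variable names models, r2, and slopes are chosen here for illustration.

```r
library(purrr)

# Fit one linear model per cylinder group
by_cyl <- split(mtcars, mtcars$cyl)
models <- map(by_cyl, ~ lm(mpg ~ wt, data = .x))

# Passing a string as .f extracts the component of that name from each summary
r2 <- models %>% map(summary) %>% map_dbl("r.squared")

# A formula can pull out the slope of wt from each model
slopes <- map_dbl(models, ~ coef(.x)[["wt"]])
```

Both results are named numeric vectors with one entry per group ("4", "6", "8"), which is easier to compare than three full summaries.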
map2 is similar to map, but it iterates over two lists in parallel. Here is an example:
x <- list(1, 1, 1)
y <- list(10, 20, 30)
map2(x, y, ~ .x + .y) # equivalent to map2(x, y, `+`)
## [[1]]
## [1] 11
##
## [[2]]
## [1] 21
##
## [[3]]
## [1] 31
We can also fit a model per group and then predict per group:
by_cyl <- mtcars %>% split(.$cyl)
mods <- by_cyl %>% map(~ lm(mpg ~ wt, data = .))
map2(mods, by_cyl, predict)
## $`4`
## Datsun 710 Merc 240D Merc 230 Fiat 128 Honda Civic
## 26.47010 21.55719 21.78307 27.14774 30.45125
## Toyota Corolla Toyota Corona Fiat X1-9 Porsche 914-2 Lotus Europa
## 29.20890 25.65128 28.64420 27.48656 31.02725
## Volvo 142E
## 23.87247
##
## $`6`
## Mazda RX4 Mazda RX4 Wag Hornet 4 Drive Valiant Merc 280
## 21.12497 20.41604 19.47080 18.78968 18.84528
## Merc 280C Ferrari Dino
## 18.84528 20.70795
##
## $`8`
## Hornet Sportabout Duster 360 Merc 450SE Merc 450SL
## 16.32604 16.04103 14.94481 15.69024
## Merc 450SLC Cadillac Fleetwood Lincoln Continental Chrysler Imperial
## 15.58061 12.35773 11.97625 12.14945
## Dodge Challenger AMC Javelin Camaro Z28 Pontiac Firebird
## 16.15065 16.33700 15.44907 15.43811
## Ford Pantera L Maserati Bora
## 16.91800 16.04103
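Here split first breaks the data set into groups by cyl, map then fits a regression model for each group, and map2 finally computes predictions from each model on the data of its own group.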
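The same pairing of models and data can feed a per-group error measure. The sketch below computes an in-sample root mean squared error with map2_dbl; the name rmse and the choice of metric are illustrative assumptions, not part of the original example.

```r
library(purrr)

by_cyl <- split(mtcars, mtcars$cyl)
mods <- map(by_cyl, ~ lm(mpg ~ wt, data = .x))

# .x is the model, .y is the data frame of the same group
rmse <- map2_dbl(
  mods, by_cyl,
  ~ sqrt(mean((.y$mpg - predict(.x, newdata = .y))^2))
)
```

Because map2_dbl returns a named double vector, the three error values can be compared directly.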
To process more than two lists at once, use pmap. For example:
x <- rnorm(10)
y <- rnorm(10)
z <- rnorm(10)
a <- rnorm(10)
pmap(list(x, y, z,a), sum)
## [[1]]
## [1] 3.770395
##
## [[2]]
## [1] -1.972205
##
## [[3]]
## [1] -0.2585453
##
## [[4]]
## [1] -0.03204745
##
## [[5]]
## [1] -2.732995
##
## [[6]]
## [1] 0.4160886
##
## [[7]]
## [1] -1.071157
##
## [[8]]
## [1] -0.7131349
##
## [[9]]
## [1] -1.44945
##
## [[10]]
## [1] -2.542039
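In this example we first create four numeric vectors and want the sum of their corresponding elements, which is exactly what pmap provides. Another way to write it is: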
pmap(list(x,y,z,a),function(a,b,c,d) a+b+c+d)
## [[1]]
## [1] 3.770395
##
## [[2]]
## [1] -1.972205
##
## [[3]]
## [1] -0.2585453
##
## [[4]]
## [1] -0.03204745
##
## [[5]]
## [1] -2.732995
##
## [[6]]
## [1] 0.4160886
##
## [[7]]
## [1] -1.071157
##
## [[8]]
## [1] -0.7131349
##
## [[9]]
## [1] -1.44945
##
## [[10]]
## [1] -2.542039
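The inputs above are random, so the numbers change on every run. As a deterministic sketch, the example below shows two further pmap patterns: matching list names to function arguments, and iterating over the rows of a data frame. The objects args and df are hypothetical.

```r
library(purrr)

# Names in the list are matched to the argument names of the function
args <- list(base = c(2, 3, 4), exponent = c(3, 2, 1))
powers <- pmap_dbl(args, function(base, exponent) base^exponent)

# A data frame is a list of columns, so pmap walks over it row by row
df <- data.frame(x = c(1, 2), y = c(10, 20), z = c(100, 200))
row_sums <- pmap_dbl(df, sum)
```

Here powers is 8, 9, 4 and row_sums is 111, 222.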
map iterates over data; to iterate over functions, use invoke_map. First, invoke calls a single function with a list of arguments:
list(c("A","B","C"), c("a","b","c")) %>%
invoke(paste, ., sep = "-")
## [1] "A-a" "B-b" "C-c"
If we have two functions, invoke_map calls each of them with the same argument list:
invoke_map(list(runif, rnorm), list(list(n = 10)))
## [[1]]
## [1] 0.17100382 0.31905889 0.67579353 0.73374089 0.74985077 0.69917388
## [7] 0.11186007 0.08910544 0.95665860 0.22939053
##
## [[2]]
## [1] -0.02762745 -1.73680261 -0.92023588 0.79473687 0.02973006 0.61277657
## [7] 0.01774265 -1.10955484 0.91303617 -0.56892899
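invoke() and invoke_map() are deprecated as of purrr 1.0.0. As one possible replacement, the sketch below gets the same behaviour from map together with base R's do.call; the list fns and the seed are illustrative choices.

```r
library(purrr)

set.seed(42)  # only to make the random draws reproducible

# Iterate over functions: call each one with the same argument list
fns <- list(unif = runif, norm = rnorm)
draws <- map(fns, ~ do.call(.x, list(n = 10)))

# Call a single function with a list of arguments, like invoke()
joined <- do.call(
  paste,
  c(list(c("A", "B", "C"), c("a", "b", "c")), sep = "-")
)
```

draws is a named list with ten values per function, and joined is the character vector "A-a", "B-b", "C-c".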
map has several variants: map_if, map_at, and map_depth.
map_if first tests each element with a predicate and then applies the function only where the predicate holds. For example:
iris %>% map_if(is.factor,as.character,.else = as.integer)
## $Sepal.Length
## [1] 5 4 4 4 5 5 4 5 4 4 5 4 4 4 5 5 5 5 5 5 5 5 4 5 4 5 5 5 5 4 4 5 5 5 4 5 5
## [38] 4 4 5 5 4 4 5 5 4 5 4 5 5 7 6 6 5 6 5 6 4 6 5 5 5 6 6 5 6 5 5 6 5 5 6 6 6
## [75] 6 6 6 6 6 5 5 5 5 6 5 6 6 6 5 5 5 6 5 5 5 5 5 6 5 5 6 5 7 6 6 7 4 7 6 7 6
## [112] 6 6 5 5 6 6 7 7 6 6 5 7 6 6 7 6 6 6 7 7 7 6 6 6 7 6 6 6 6 6 6 5 6 6 6 6 6
## [149] 6 5
##
## $Sepal.Width
## [1] 3 3 3 3 3 3 3 3 2 3 3 3 3 3 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 3 3 3
## [38] 3 3 3 3 2 3 3 3 3 3 3 3 3 3 3 3 2 2 2 3 2 2 2 2 3 2 2 2 3 3 2 2 2 3 2 2 2
## [75] 2 3 2 3 2 2 2 2 2 2 3 3 3 2 3 2 2 3 2 2 2 3 2 2 2 2 3 2 3 2 3 3 2 2 2 3 3
## [112] 2 3 2 2 3 3 3 2 2 3 2 2 2 3 3 2 3 2 3 2 3 2 2 2 3 3 3 3 3 3 3 2 3 3 3 2 3
## [149] 3 3
##
## $Petal.Length
## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 4 4 4 4 4 4 4 3 4 3 3 4 4 4 3 4 4 4 4 3 4 4 4 4
## [75] 4 4 4 5 4 3 3 3 3 5 4 4 4 4 4 4 4 4 4 3 4 4 4 4 3 4 6 5 5 5 5 6 4 6 5 6 5
## [112] 5 5 5 5 5 5 6 6 5 5 4 6 4 5 6 4 4 5 5 6 6 5 5 5 6 5 5 4 5 5 5 5 5 5 5 5 5
## [149] 5 5
##
## $Petal.Width
## [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [38] 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [75] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1 2 2 1 1 1 2 2
## [112] 1 2 2 2 2 1 2 2 1 2 2 2 1 2 1 1 1 2 1 1 2 2 1 1 2 2 1 1 2 2 2 1 2 2 2 1 2
## [149] 2 1
##
## $Species
## [1] "setosa" "setosa" "setosa" "setosa" "setosa"
## [6] "setosa" "setosa" "setosa" "setosa" "setosa"
## [11] "setosa" "setosa" "setosa" "setosa" "setosa"
## [16] "setosa" "setosa" "setosa" "setosa" "setosa"
## [21] "setosa" "setosa" "setosa" "setosa" "setosa"
## [26] "setosa" "setosa" "setosa" "setosa" "setosa"
## [31] "setosa" "setosa" "setosa" "setosa" "setosa"
## [36] "setosa" "setosa" "setosa" "setosa" "setosa"
## [41] "setosa" "setosa" "setosa" "setosa" "setosa"
## [46] "setosa" "setosa" "setosa" "setosa" "setosa"
## [51] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [56] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [61] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [66] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [71] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [76] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [81] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [86] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [91] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [96] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [101] "virginica" "virginica" "virginica" "virginica" "virginica"
## [106] "virginica" "virginica" "virginica" "virginica" "virginica"
## [111] "virginica" "virginica" "virginica" "virginica" "virginica"
## [116] "virginica" "virginica" "virginica" "virginica" "virginica"
## [121] "virginica" "virginica" "virginica" "virginica" "virginica"
## [126] "virginica" "virginica" "virginica" "virginica" "virginica"
## [131] "virginica" "virginica" "virginica" "virginica" "virginica"
## [136] "virginica" "virginica" "virginica" "virginica" "virginica"
## [141] "virginica" "virginica" "virginica" "virginica" "virginica"
## [146] "virginica" "virginica" "virginica" "virginica" "virginica"
The code above converts every factor column to character and every other column to integer.
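The iris output is long, so here is the same idea on a three-row data frame. This is a small sketch; the data frame df is invented for the illustration.

```r
library(purrr)

# Hypothetical data: one integer, one double, and one factor column
df <- data.frame(
  id = 1:3,
  score = c(1.5, 2.5, 3.5),
  grp = factor(c("a", "b", "a"))
)

# Factor columns go through as.character, all others through as.integer
out <- map_if(df, is.factor, as.character, .else = as.integer)
```

Note that out is a plain list rather than a data frame, and that as.integer truncates the scores to 1, 2, 3.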
map_at applies a function only to the elements at specific positions. Some examples:
iris %>% map_at(c(4, 5), is.numeric)
## $Sepal.Length
## [1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4 5.1
## [19] 5.7 5.1 5.4 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5 4.9 5.0
## [37] 5.5 4.9 4.4 5.1 5.0 4.5 4.4 5.0 5.1 4.8 5.1 4.6 5.3 5.0 7.0 6.4 6.9 5.5
## [55] 6.5 5.7 6.3 4.9 6.6 5.2 5.0 5.9 6.0 6.1 5.6 6.7 5.6 5.8 6.2 5.6 5.9 6.1
## [73] 6.3 6.1 6.4 6.6 6.8 6.7 6.0 5.7 5.5 5.5 5.8 6.0 5.4 6.0 6.7 6.3 5.6 5.5
## [91] 5.5 6.1 5.8 5.0 5.6 5.7 5.7 6.2 5.1 5.7 6.3 5.8 7.1 6.3 6.5 7.6 4.9 7.3
## [109] 6.7 7.2 6.5 6.4 6.8 5.7 5.8 6.4 6.5 7.7 7.7 6.0 6.9 5.6 7.7 6.3 6.7 7.2
## [127] 6.2 6.1 6.4 7.2 7.4 7.9 6.4 6.3 6.1 7.7 6.3 6.4 6.0 6.9 6.7 6.9 5.8 6.8
## [145] 6.7 6.7 6.3 6.5 6.2 5.9
##
## $Sepal.Width
## [1] 3.5 3.0 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 3.7 3.4 3.0 3.0 4.0 4.4 3.9 3.5
## [19] 3.8 3.8 3.4 3.7 3.6 3.3 3.4 3.0 3.4 3.5 3.4 3.2 3.1 3.4 4.1 4.2 3.1 3.2
## [37] 3.5 3.6 3.0 3.4 3.5 2.3 3.2 3.5 3.8 3.0 3.8 3.2 3.7 3.3 3.2 3.2 3.1 2.3
## [55] 2.8 2.8 3.3 2.4 2.9 2.7 2.0 3.0 2.2 2.9 2.9 3.1 3.0 2.7 2.2 2.5 3.2 2.8
## [73] 2.5 2.8 2.9 3.0 2.8 3.0 2.9 2.6 2.4 2.4 2.7 2.7 3.0 3.4 3.1 2.3 3.0 2.5
## [91] 2.6 3.0 2.6 2.3 2.7 3.0 2.9 2.9 2.5 2.8 3.3 2.7 3.0 2.9 3.0 3.0 2.5 2.9
## [109] 2.5 3.6 3.2 2.7 3.0 2.5 2.8 3.2 3.0 3.8 2.6 2.2 3.2 2.8 2.8 2.7 3.3 3.2
## [127] 2.8 3.0 2.8 3.0 2.8 3.8 2.8 2.8 2.6 3.0 3.4 3.1 3.0 3.1 3.1 3.1 2.7 3.2
## [145] 3.3 3.0 2.5 3.0 3.4 3.0
##
## $Petal.Length
## [1] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 1.5 1.6 1.4 1.1 1.2 1.5 1.3 1.4
## [19] 1.7 1.5 1.7 1.5 1.0 1.7 1.9 1.6 1.6 1.5 1.4 1.6 1.6 1.5 1.5 1.4 1.5 1.2
## [37] 1.3 1.4 1.3 1.5 1.3 1.3 1.3 1.6 1.9 1.4 1.6 1.4 1.5 1.4 4.7 4.5 4.9 4.0
## [55] 4.6 4.5 4.7 3.3 4.6 3.9 3.5 4.2 4.0 4.7 3.6 4.4 4.5 4.1 4.5 3.9 4.8 4.0
## [73] 4.9 4.7 4.3 4.4 4.8 5.0 4.5 3.5 3.8 3.7 3.9 5.1 4.5 4.5 4.7 4.4 4.1 4.0
## [91] 4.4 4.6 4.0 3.3 4.2 4.2 4.2 4.3 3.0 4.1 6.0 5.1 5.9 5.6 5.8 6.6 4.5 6.3
## [109] 5.8 6.1 5.1 5.3 5.5 5.0 5.1 5.3 5.5 6.7 6.9 5.0 5.7 4.9 6.7 4.9 5.7 6.0
## [127] 4.8 4.9 5.6 5.8 6.1 6.4 5.6 5.1 5.6 6.1 5.6 5.5 4.8 5.4 5.6 5.1 5.1 5.9
## [145] 5.7 5.2 5.0 5.2 5.4 5.1
##
## $Petal.Width
## [1] TRUE
##
## $Species
## [1] FALSE
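This call tests whether the fourth and fifth columns are numeric; the other columns are returned unchanged. Besides column positions, column names can also be used.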
iris %>% map_at("Species", tolower)
## $Sepal.Length
## [1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4 5.1
## [19] 5.7 5.1 5.4 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5 4.9 5.0
## [37] 5.5 4.9 4.4 5.1 5.0 4.5 4.4 5.0 5.1 4.8 5.1 4.6 5.3 5.0 7.0 6.4 6.9 5.5
## [55] 6.5 5.7 6.3 4.9 6.6 5.2 5.0 5.9 6.0 6.1 5.6 6.7 5.6 5.8 6.2 5.6 5.9 6.1
## [73] 6.3 6.1 6.4 6.6 6.8 6.7 6.0 5.7 5.5 5.5 5.8 6.0 5.4 6.0 6.7 6.3 5.6 5.5
## [91] 5.5 6.1 5.8 5.0 5.6 5.7 5.7 6.2 5.1 5.7 6.3 5.8 7.1 6.3 6.5 7.6 4.9 7.3
## [109] 6.7 7.2 6.5 6.4 6.8 5.7 5.8 6.4 6.5 7.7 7.7 6.0 6.9 5.6 7.7 6.3 6.7 7.2
## [127] 6.2 6.1 6.4 7.2 7.4 7.9 6.4 6.3 6.1 7.7 6.3 6.4 6.0 6.9 6.7 6.9 5.8 6.8
## [145] 6.7 6.7 6.3 6.5 6.2 5.9
##
## $Sepal.Width
## [1] 3.5 3.0 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 3.7 3.4 3.0 3.0 4.0 4.4 3.9 3.5
## [19] 3.8 3.8 3.4 3.7 3.6 3.3 3.4 3.0 3.4 3.5 3.4 3.2 3.1 3.4 4.1 4.2 3.1 3.2
## [37] 3.5 3.6 3.0 3.4 3.5 2.3 3.2 3.5 3.8 3.0 3.8 3.2 3.7 3.3 3.2 3.2 3.1 2.3
## [55] 2.8 2.8 3.3 2.4 2.9 2.7 2.0 3.0 2.2 2.9 2.9 3.1 3.0 2.7 2.2 2.5 3.2 2.8
## [73] 2.5 2.8 2.9 3.0 2.8 3.0 2.9 2.6 2.4 2.4 2.7 2.7 3.0 3.4 3.1 2.3 3.0 2.5
## [91] 2.6 3.0 2.6 2.3 2.7 3.0 2.9 2.9 2.5 2.8 3.3 2.7 3.0 2.9 3.0 3.0 2.5 2.9
## [109] 2.5 3.6 3.2 2.7 3.0 2.5 2.8 3.2 3.0 3.8 2.6 2.2 3.2 2.8 2.8 2.7 3.3 3.2
## [127] 2.8 3.0 2.8 3.0 2.8 3.8 2.8 2.8 2.6 3.0 3.4 3.1 3.0 3.1 3.1 3.1 2.7 3.2
## [145] 3.3 3.0 2.5 3.0 3.4 3.0
##
## $Petal.Length
## [1] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 1.5 1.6 1.4 1.1 1.2 1.5 1.3 1.4
## [19] 1.7 1.5 1.7 1.5 1.0 1.7 1.9 1.6 1.6 1.5 1.4 1.6 1.6 1.5 1.5 1.4 1.5 1.2
## [37] 1.3 1.4 1.3 1.5 1.3 1.3 1.3 1.6 1.9 1.4 1.6 1.4 1.5 1.4 4.7 4.5 4.9 4.0
## [55] 4.6 4.5 4.7 3.3 4.6 3.9 3.5 4.2 4.0 4.7 3.6 4.4 4.5 4.1 4.5 3.9 4.8 4.0
## [73] 4.9 4.7 4.3 4.4 4.8 5.0 4.5 3.5 3.8 3.7 3.9 5.1 4.5 4.5 4.7 4.4 4.1 4.0
## [91] 4.4 4.6 4.0 3.3 4.2 4.2 4.2 4.3 3.0 4.1 6.0 5.1 5.9 5.6 5.8 6.6 4.5 6.3
## [109] 5.8 6.1 5.1 5.3 5.5 5.0 5.1 5.3 5.5 6.7 6.9 5.0 5.7 4.9 6.7 4.9 5.7 6.0
## [127] 4.8 4.9 5.6 5.8 6.1 6.4 5.6 5.1 5.6 6.1 5.6 5.5 4.8 5.4 5.6 5.1 5.1 5.9
## [145] 5.7 5.2 5.0 5.2 5.4 5.1
##
## $Petal.Width
## [1] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 0.2 0.2 0.1 0.1 0.2 0.4 0.4 0.3
## [19] 0.3 0.3 0.2 0.4 0.2 0.5 0.2 0.2 0.4 0.2 0.2 0.2 0.2 0.4 0.1 0.2 0.2 0.2
## [37] 0.2 0.1 0.2 0.2 0.3 0.3 0.2 0.6 0.4 0.3 0.2 0.2 0.2 0.2 1.4 1.5 1.5 1.3
## [55] 1.5 1.3 1.6 1.0 1.3 1.4 1.0 1.5 1.0 1.4 1.3 1.4 1.5 1.0 1.5 1.1 1.8 1.3
## [73] 1.5 1.2 1.3 1.4 1.4 1.7 1.5 1.0 1.1 1.0 1.2 1.6 1.5 1.6 1.5 1.3 1.3 1.3
## [91] 1.2 1.4 1.2 1.0 1.3 1.2 1.3 1.3 1.1 1.3 2.5 1.9 2.1 1.8 2.2 2.1 1.7 1.8
## [109] 1.8 2.5 2.0 1.9 2.1 2.0 2.4 2.3 1.8 2.2 2.3 1.5 2.3 2.0 2.0 1.8 2.1 1.8
## [127] 1.8 1.8 2.1 1.6 1.9 2.0 2.2 1.5 1.4 2.3 2.4 1.8 1.8 2.1 2.4 2.3 1.9 2.3
## [145] 2.5 2.3 1.9 2.0 2.3 1.8
##
## $Species
## [1] "setosa" "setosa" "setosa" "setosa" "setosa"
## [6] "setosa" "setosa" "setosa" "setosa" "setosa"
## [11] "setosa" "setosa" "setosa" "setosa" "setosa"
## [16] "setosa" "setosa" "setosa" "setosa" "setosa"
## [21] "setosa" "setosa" "setosa" "setosa" "setosa"
## [26] "setosa" "setosa" "setosa" "setosa" "setosa"
## [31] "setosa" "setosa" "setosa" "setosa" "setosa"
## [36] "setosa" "setosa" "setosa" "setosa" "setosa"
## [41] "setosa" "setosa" "setosa" "setosa" "setosa"
## [46] "setosa" "setosa" "setosa" "setosa" "setosa"
## [51] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [56] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [61] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [66] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [71] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [76] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [81] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [86] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [91] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [96] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [101] "virginica" "virginica" "virginica" "virginica" "virginica"
## [106] "virginica" "virginica" "virginica" "virginica" "virginica"
## [111] "virginica" "virginica" "virginica" "virginica" "virginica"
## [116] "virginica" "virginica" "virginica" "virginica" "virginica"
## [121] "virginica" "virginica" "virginica" "virginica" "virginica"
## [126] "virginica" "virginica" "virginica" "virginica" "virginica"
## [131] "virginica" "virginica" "virginica" "virginica" "virginica"
## [136] "virginica" "virginica" "virginica" "virginica" "virginica"
## [141] "virginica" "virginica" "virginica" "virginica" "virginica"
## [146] "virginica" "virginica" "virginica" "virginica" "virginica"
The code above converts all values in the Species column to lowercase and leaves the other columns unchanged.
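A shorter sketch of selection by name, using a made-up configuration list cfg:

```r
library(purrr)

# Hypothetical settings list
cfg <- list(host = "LOCALHOST", port = 8080L, user = "ADMIN")

# Only the named elements are transformed; port is passed through as is
out <- map_at(cfg, c("host", "user"), tolower)
```

After the call, out$host is "localhost", out$user is "admin", and out$port is still the integer 8080.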
x <- list(a = list(foo = 1:2, bar = 3:4), b = list(baz = 5:6))
str(x)
## List of 2
## $ a:List of 2
## ..$ foo: int [1:2] 1 2
## ..$ bar: int [1:2] 3 4
## $ b:List of 1
## ..$ baz: int [1:2] 5 6
map_depth(x, 2, paste, collapse = "/")
## $a
## $a$foo
## [1] "1/2"
##
## $a$bar
## [1] "3/4"
##
##
## $b
## $b$baz
## [1] "5/6"
Lists can be nested, and map_depth applies a function at a given level of nesting; in the code above, 2 refers to the second level of the list.
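To make the meaning of the depth argument concrete, the sketch below compares map_depth at depth 2 with two nested map calls. The list nested repeats the structure used above under a different name.

```r
library(purrr)

nested <- list(a = list(foo = 1:2, bar = 3:4), b = list(baz = 5:6))

# Apply sum to every element at the second level
sums <- map_depth(nested, 2, sum)

# The same traversal written as explicitly nested map() calls
same <- map(nested, ~ map(.x, sum))
```

Both calls give the same nested list, with sums$a$foo equal to 3 and sums$b$baz equal to 11.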
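Selecting or removing list elements is a common data-processing task. pluck makes selection easy and supports two kinds of index, the position of an element or its name. For example: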
obj1 <- list("A", list(1, nam = "a"))
obj2 <- list("B", list(2, nam = "b"))
x <- list(obj1, obj2)
pluck(x, 1) # equivalent to x[[1]]
## [[1]]
## [1] "A"
##
## [[2]]
## [[2]][[1]]
## [1] 1
##
## [[2]]$nam
## [1] "a"
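The code above first creates a nested list and then uses pluck to select the first element at the top level. We can also select elements deeper in the structure.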
pluck(x,1,2) # equivalent to x[[1]][[2]]
## [[1]]
## [1] 1
##
## $nam
## [1] "a"
The result is itself a list in which one element is named; named elements can be retrieved by name.
pluck(x,1,2,"nam")
## [1] "a"
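Note that chuck does the same job as pluck; the difference is that when an element does not exist, pluck() returns NULL (or the value supplied as .default), whereas chuck() always throws an error. pluck can also be used to modify the original data.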
pluck(x,1,2,"nam") <- "c"
x
## [[1]]
## [[1]][[1]]
## [1] "A"
##
## [[1]][[2]]
## [[1]][[2]][[1]]
## [1] 1
##
## [[1]][[2]]$nam
## [1] "c"
##
##
##
## [[2]]
## [[2]][[1]]
## [1] "B"
##
## [[2]][[2]]
## [[2]][[2]][[1]]
## [1] 2
##
## [[2]][[2]]$nam
## [1] "b"
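The difference between pluck and chuck on a missing element can be sketched as follows. The list x and the name "zzz" are made up; tryCatch is used only to turn the error into a flag.

```r
library(purrr)

x <- list(a = list(b = 1))

# pluck() returns NULL when the path does not exist
missing_value <- pluck(x, "a", "zzz")

# .default supplies a fallback value instead of NULL
with_default <- pluck(x, "a", "zzz", .default = NA)

# chuck() signals an error for the same path
chuck_failed <- tryCatch(
  {
    chuck(x, "a", "zzz")
    FALSE
  },
  error = function(e) TRUE
)
```

pluck suits optional fields, while chuck is the better choice when a missing element indicates a bug that should not pass silently.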
As the output shows, a has been changed to c in the original data. pluck indexes data by position or name; to select elements by a condition, use keep.
list(1:2,2:3,3:4,4:5,5:6,6:7)%>%
keep(function(x) mean(x) > 6) # equivalent to keep(~ mean(.x) > 6)
## [[1]]
## [1] 6 7
Here we first create a list and then test each element: an element is kept if its mean is greater than 6. To remove the elements that satisfy the condition instead, use discard.
list(1:2,2:3,3:4,4:5,5:6,6:7)%>%
discard(function(x) mean(x) > 6)
## [[1]]
## [1] 1 2
##
## [[2]]
## [1] 2 3
##
## [[3]]
## [1] 3 4
##
## [[4]]
## [1] 4 5
##
## [[5]]
## [1] 5 6
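keep and discard are complements: with the same predicate they split a list into two parts. A small sketch, with variable names chosen for illustration:

```r
library(purrr)

xs <- list(1:2, 2:3, 3:4, 4:5, 5:6, 6:7)

# Elements whose mean exceeds 6
big <- keep(xs, ~ mean(.x) > 6)

# Everything else
small <- discard(xs, ~ mean(.x) > 6)

# Together the two parts account for every element of the input
n_total <- length(big) + length(small)
```

Here big holds one element (6:7), small holds the other five, and n_total equals length(xs).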
To remove the empty elements of a list, use compact. For example:
list(a=1,b=NULL,c=list(),d=NA) %>% compact()
## $a
## [1] 1
##
## $d
## [1] NA
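compact only removes NULL and empty (zero-length) elements such as an empty list; it does not touch NA. head_while is another handy function: it returns the leading elements of a list or vector up to the first one that fails the predicate. For example: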
pos <- function(x) x >= 0
head_while(5:-5, pos)
## [1] 5 4 3 2 1 0
tail_while(5:-5, negate(pos))
## [1] -1 -2 -3 -4 -5
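head_while and tail_while behave like a for loop with an early exit: elements are returned as long as the condition holds, and the traversal stops at the first element that fails it (tail_while scans from the end).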
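The early exit is what separates head_while from keep. The sketch below runs both on the same made-up vector v:

```r
library(purrr)

v <- c(3, 1, -2, 4, -5)

# Stops at the first negative value, so the later 4 is not reached
leading <- head_while(v, ~ .x >= 0)

# keep() tests every element, so 4 is included
all_nonneg <- keep(v, ~ .x >= 0)

# tail_while() scans from the end and stops at the first non-negative value
trailing <- tail_while(v, ~ .x < 0)
```

leading is 3, 1; all_nonneg is 3, 1, 4; trailing is -5.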
flatten unnests a nested list by one level:
x <- rerun(2, sample(4))
x
## [[1]]
## [1] 1 4 3 2
##
## [[2]]
## [1] 1 2 4 3
We first create a list that holds two integer vectors.
x %>% flatten()
## [[1]]
## [1] 1
##
## [[2]]
## [1] 4
##
## [[3]]
## [1] 3
##
## [[4]]
## [1] 2
##
## [[5]]
## [1] 1
##
## [[6]]
## [1] 2
##
## [[7]]
## [1] 4
##
## [[8]]
## [1] 3
As the result shows, the data has been "stretched" into a single flat list. In addition, transpose changes the layout of the data.
x <- rerun(5, x = runif(1), y = runif(5))
x
## [[1]]
## [[1]]$x
## [1] 0.09884474
##
## [[1]]$y
## [1] 0.16572093 0.23756631 0.07406567 0.27570395 0.65871268
##
##
## [[2]]
## [[2]]$x
## [1] 0.9270122
##
## [[2]]$y
## [1] 0.67392157 0.83594149 0.01376925 0.29699908 0.21493212
##
##
## [[3]]
## [[3]]$x
## [1] 0.6002352
##
## [[3]]$y
## [1] 0.37654226 0.70192682 0.02910238 0.42529595 0.81272848
##
##
## [[4]]
## [[4]]$x
## [1] 0.7588233
##
## [[4]]$y
## [1] 0.2300723 0.7944437 0.7304211 0.5757704 0.1007236
##
##
## [[5]]
## [[5]]$x
## [1] 0.3424634
##
## [[5]]$y
## [1] 0.7198985 0.2464443 0.5224496 0.1382845 0.8438813
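Here we first use rerun to generate a list; rerun is similar in spirit to rep in that it produces data repeatedly, but it re-evaluates the expressions on each repetition. We then apply transpose: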
x %>% transpose()
## $x
## $x[[1]]
## [1] 0.09884474
##
## $x[[2]]
## [1] 0.9270122
##
## $x[[3]]
## [1] 0.6002352
##
## $x[[4]]
## [1] 0.7588233
##
## $x[[5]]
## [1] 0.3424634
##
##
## $y
## $y[[1]]
## [1] 0.16572093 0.23756631 0.07406567 0.27570395 0.65871268
##
## $y[[2]]
## [1] 0.67392157 0.83594149 0.01376925 0.29699908 0.21493212
##
## $y[[3]]
## [1] 0.37654226 0.70192682 0.02910238 0.42529595 0.81272848
##
## $y[[4]]
## [1] 0.2300723 0.7944437 0.7304211 0.5757704 0.1007236
##
## $y[[5]]
## [1] 0.7198985 0.2464443 0.5224496 0.1382845 0.8438813
As we can see, the x values and the y values have each been gathered into their own sub-list.
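A deterministic sketch of the same reshaping, using two hand-written records instead of random numbers (the list records is hypothetical):

```r
library(purrr)

# A list of records: each element has the fields x and y
records <- list(
  list(x = 1, y = "a"),
  list(x = 2, y = "b")
)

# After transposing, each field becomes a list of its values
cols <- transpose(records)

# Transposing again goes back to one element per record
back <- transpose(cols)
```

cols$x is list(1, 2) and cols$y is list("a", "b"); back again has two elements, each with the fields x and y.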
every, some, and none test whether all, at least one, or none of the elements satisfy a predicate:
y <- list(0:10, 5.5)
y %>% every(is.numeric)
## [1] TRUE
y %>% every(is.integer)
## [1] FALSE
y %>% some(is.integer)
## [1] TRUE
y %>% none(is.character)
## [1] TRUE
has_element checks whether a list contains a given object:
x <- list(1:10, 5, 9.9)
x %>% has_element(1:10)
## [1] TRUE
x %>% has_element(3)
## [1] FALSE
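detect returns the first element that satisfies a predicate, and detect_index returns its position:
is_even <- function(x) x %% 2 == 0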
3:10 %>% detect(is_even)
## [1] 4
3:10 %>% detect_index(is_even)
## [1] 2
vec_depth computes how deeply a list is nested:
x <- list(
list(),
list(list()),
list(list(list(1)))
)
vec_depth(x)
## [1] 5
x %>% map_int(vec_depth)
## [1] 1 2 4
append (from base R) inserts elements after a given position, and prepend adds them to the front:
append(1:5, 0:1, after = 3)
## [1] 1 2 3 0 1 4 5
x <- as.list(1:3)
x %>% append("a")
## [[1]]
## [1] 1
##
## [[2]]
## [1] 2
##
## [[3]]
## [1] 3
##
## [[4]]
## [1] "a"
x %>% prepend("a")
## [[1]]
## [1] "a"
##
## [[2]]
## [1] 1
##
## [[3]]
## [1] 2
##
## [[4]]
## [1] 3
splice combines lists and other objects into a single flat list:
inputs <- list(arg1 = "a", arg2 = "b")
splice(inputs, arg3 = c("c1", "c2"),inputs)
## $arg1
## [1] "a"
##
## $arg2
## [1] "b"
##
## $arg3
## [1] "c1" "c2"
##
## $arg1
## [1] "a"
##
## $arg2
## [1] "b"
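modify_at works like map_at but returns an object of the same type as its input, here a data frame:
mtcars %>% modify_at(c(1, 4, 5), as.character) # variable names also work: mtcars %>% modify_at(c("cyl", "am"), as.character)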
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21 6 160.0 110 3.9 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21 6 160.0 110 3.9 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
## Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
## Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
## Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
## Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
## Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
## Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
## Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
## Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
## Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
## Lincoln Continental 10.4 8 460.0 215 3 5.424 17.82 0 0 3 4
## Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
## Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## Toyota Corona 21.5 4 120.1 97 3.7 2.465 20.01 1 0 3 1
## Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
## AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
## Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
## Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
## Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## Porsche 914-2 26 4 120.3 91 4.43 2.140 16.70 0 1 5 2
## Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
## Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
## Maserati Bora 15 8 301.0 335 3.54 3.570 14.60 0 1 5 8
## Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
iris %>%
modify_if(is.factor, as.character)
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
## 7 4.6 3.4 1.4 0.3 setosa
## 8 5.0 3.4 1.5 0.2 setosa
## 9 4.4 2.9 1.4 0.2 setosa
## 10 4.9 3.1 1.5 0.1 setosa
## 11 5.4 3.7 1.5 0.2 setosa
## 12 4.8 3.4 1.6 0.2 setosa
## 13 4.8 3.0 1.4 0.1 setosa
## 14 4.3 3.0 1.1 0.1 setosa
## 15 5.8 4.0 1.2 0.2 setosa
## 16 5.7 4.4 1.5 0.4 setosa
## 17 5.4 3.9 1.3 0.4 setosa
## 18 5.1 3.5 1.4 0.3 setosa
## 19 5.7 3.8 1.7 0.3 setosa
## 20 5.1 3.8 1.5 0.3 setosa
## 21 5.4 3.4 1.7 0.2 setosa
## 22 5.1 3.7 1.5 0.4 setosa
## 23 4.6 3.6 1.0 0.2 setosa
## 24 5.1 3.3 1.7 0.5 setosa
## 25 4.8 3.4 1.9 0.2 setosa
## 26 5.0 3.0 1.6 0.2 setosa
## 27 5.0 3.4 1.6 0.4 setosa
## 28 5.2 3.5 1.5 0.2 setosa
## 29 5.2 3.4 1.4 0.2 setosa
## 30 4.7 3.2 1.6 0.2 setosa
## 31 4.8 3.1 1.6 0.2 setosa
## 32 5.4 3.4 1.5 0.4 setosa
## 33 5.2 4.1 1.5 0.1 setosa
## 34 5.5 4.2 1.4 0.2 setosa
## 35 4.9 3.1 1.5 0.2 setosa
## 36 5.0 3.2 1.2 0.2 setosa
## 37 5.5 3.5 1.3 0.2 setosa
## 38 4.9 3.6 1.4 0.1 setosa
## 39 4.4 3.0 1.3 0.2 setosa
## 40 5.1 3.4 1.5 0.2 setosa
## 41 5.0 3.5 1.3 0.3 setosa
## 42 4.5 2.3 1.3 0.3 setosa
## 43 4.4 3.2 1.3 0.2 setosa
## 44 5.0 3.5 1.6 0.6 setosa
## 45 5.1 3.8 1.9 0.4 setosa
## 46 4.8 3.0 1.4 0.3 setosa
## 47 5.1 3.8 1.6 0.2 setosa
## 48 4.6 3.2 1.4 0.2 setosa
## 49 5.3 3.7 1.5 0.2 setosa
## 50 5.0 3.3 1.4 0.2 setosa
## 51 7.0 3.2 4.7 1.4 versicolor
## 52 6.4 3.2 4.5 1.5 versicolor
## 53 6.9 3.1 4.9 1.5 versicolor
## 54 5.5 2.3 4.0 1.3 versicolor
## 55 6.5 2.8 4.6 1.5 versicolor
## 56 5.7 2.8 4.5 1.3 versicolor
## 57 6.3 3.3 4.7 1.6 versicolor
## 58 4.9 2.4 3.3 1.0 versicolor
## 59 6.6 2.9 4.6 1.3 versicolor
## 60 5.2 2.7 3.9 1.4 versicolor
## 61 5.0 2.0 3.5 1.0 versicolor
## 62 5.9 3.0 4.2 1.5 versicolor
## 63 6.0 2.2 4.0 1.0 versicolor
## 64 6.1 2.9 4.7 1.4 versicolor
## 65 5.6 2.9 3.6 1.3 versicolor
## 66 6.7 3.1 4.4 1.4 versicolor
## 67 5.6 3.0 4.5 1.5 versicolor
## 68 5.8 2.7 4.1 1.0 versicolor
## 69 6.2 2.2 4.5 1.5 versicolor
## 70 5.6 2.5 3.9 1.1 versicolor
## 71 5.9 3.2 4.8 1.8 versicolor
## 72 6.1 2.8 4.0 1.3 versicolor
## 73 6.3 2.5 4.9 1.5 versicolor
## 74 6.1 2.8 4.7 1.2 versicolor
## 75 6.4 2.9 4.3 1.3 versicolor
## 76 6.6 3.0 4.4 1.4 versicolor
## 77 6.8 2.8 4.8 1.4 versicolor
## 78 6.7 3.0 5.0 1.7 versicolor
## 79 6.0 2.9 4.5 1.5 versicolor
## 80 5.7 2.6 3.5 1.0 versicolor
## 81 5.5 2.4 3.8 1.1 versicolor
## 82 5.5 2.4 3.7 1.0 versicolor
## 83 5.8 2.7 3.9 1.2 versicolor
## 84 6.0 2.7 5.1 1.6 versicolor
## 85 5.4 3.0 4.5 1.5 versicolor
## 86 6.0 3.4 4.5 1.6 versicolor
## 87 6.7 3.1 4.7 1.5 versicolor
## 88 6.3 2.3 4.4 1.3 versicolor
## 89 5.6 3.0 4.1 1.3 versicolor
## 90 5.5 2.5 4.0 1.3 versicolor
## 91 5.5 2.6 4.4 1.2 versicolor
## 92 6.1 3.0 4.6 1.4 versicolor
## 93 5.8 2.6 4.0 1.2 versicolor
## 94 5.0 2.3 3.3 1.0 versicolor
## 95 5.6 2.7 4.2 1.3 versicolor
## 96 5.7 3.0 4.2 1.2 versicolor
## 97 5.7 2.9 4.2 1.3 versicolor
## 98 6.2 2.9 4.3 1.3 versicolor
## 99 5.1 2.5 3.0 1.1 versicolor
## 100 5.7 2.8 4.1 1.3 versicolor
## 101 6.3 3.3 6.0 2.5 virginica
## 102 5.8 2.7 5.1 1.9 virginica
## 103 7.1 3.0 5.9 2.1 virginica
## 104 6.3 2.9 5.6 1.8 virginica
## 105 6.5 3.0 5.8 2.2 virginica
## 106 7.6 3.0 6.6 2.1 virginica
## 107 4.9 2.5 4.5 1.7 virginica
## 108 7.3 2.9 6.3 1.8 virginica
## 109 6.7 2.5 5.8 1.8 virginica
## 110 7.2 3.6 6.1 2.5 virginica
## 111 6.5 3.2 5.1 2.0 virginica
## 112 6.4 2.7 5.3 1.9 virginica
## 113 6.8 3.0 5.5 2.1 virginica
## 114 5.7 2.5 5.0 2.0 virginica
## 115 5.8 2.8 5.1 2.4 virginica
## 116 6.4 3.2 5.3 2.3 virginica
## 117 6.5 3.0 5.5 1.8 virginica
## 118 7.7 3.8 6.7 2.2 virginica
## 119 7.7 2.6 6.9 2.3 virginica
## 120 6.0 2.2 5.0 1.5 virginica
## 121 6.9 3.2 5.7 2.3 virginica
## 122 5.6 2.8 4.9 2.0 virginica
## 123 7.7 2.8 6.7 2.0 virginica
## 124 6.3 2.7 4.9 1.8 virginica
## 125 6.7 3.3 5.7 2.1 virginica
## 126 7.2 3.2 6.0 1.8 virginica
## 127 6.2 2.8 4.8 1.8 virginica
## 128 6.1 3.0 4.9 1.8 virginica
## 129 6.4 2.8 5.6 2.1 virginica
## 130 7.2 3.0 5.8 1.6 virginica
## 131 7.4 2.8 6.1 1.9 virginica
## 132 7.9 3.8 6.4 2.0 virginica
## 133 6.4 2.8 5.6 2.2 virginica
## 134 6.3 2.8 5.1 1.5 virginica
## 135 6.1 2.6 5.6 1.4 virginica
## 136 7.7 3.0 6.1 2.3 virginica
## 137 6.3 3.4 5.6 2.4 virginica
## 138 6.4 3.1 5.5 1.8 virginica
## 139 6.0 3.0 4.8 1.8 virginica
## 140 6.9 3.1 5.4 2.1 virginica
## 141 6.7 3.1 5.6 2.4 virginica
## 142 6.9 3.1 5.1 2.3 virginica
## 143 5.8 2.7 5.1 1.9 virginica
## 144 6.8 3.2 5.9 2.3 virginica
## 145 6.7 3.3 5.7 2.5 virginica
## 146 6.7 3.0 5.2 2.3 virginica
## 147 6.3 2.5 5.0 1.9 virginica
## 148 6.5 3.0 5.2 2.0 virginica
## 149 6.2 3.4 5.4 2.3 virginica
## 150 5.9 3.0 5.1 1.8 virginica
modify2 iterates over two inputs in parallel and keeps the type of the first:
x <- c(foo = 1L, bar = 2L)
y <- c(TRUE, FALSE)
modify2(x, y, ~ if (.y) .x else 0L)
## foo bar
## 1 0
modify_depth applies a function at a given depth of a nested list:
l1 <- list(
obj1 = list(
prop1 = list(param1 = 1:2, param2 = 3:4),
prop2 = list(param1 = 5:6, param2 = 7:8)
),
obj2 = list(
prop1 = list(param1 = 9:10, param2 = 11:12),
prop2 = list(param1 = 12:14, param2 = 15:17)
)
)
l1 %>% modify_depth(3, sum)
## $obj1
## $obj1$prop1
## $obj1$prop1$param1
## [1] 3
##
## $obj1$prop1$param2
## [1] 7
##
##
## $obj1$prop2
## $obj1$prop2$param1
## [1] 11
##
## $obj1$prop2$param2
## [1] 15
##
##
##
## $obj2
## $obj2$prop1
## $obj2$prop1$param1
## [1] 19
##
## $obj2$prop1$param2
## [1] 23
##
##
## $obj2$prop2
## $obj2$prop2$param1
## [1] 39
##
## $obj2$prop2$param2
## [1] 48
l1 %>% modify_depth(3, `+`, 100L)
## $obj1
## $obj1$prop1
## $obj1$prop1$param1
## [1] 101 102
##
## $obj1$prop1$param2
## [1] 103 104
##
##
## $obj1$prop2
## $obj1$prop2$param1
## [1] 105 106
##
## $obj1$prop2$param2
## [1] 107 108
##
##
##
## $obj2
## $obj2$prop1
## $obj2$prop1$param1
## [1] 109 110
##
## $obj2$prop1$param2
## [1] 111 112
##
##
## $obj2$prop2
## $obj2$prop2$param1
## [1] 112 113 114
##
## $obj2$prop2$param2
## [1] 115 116 117
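The modify functions differ from their map counterparts mainly in the type of the result. The sketch below applies the same transformation with map_if and modify_if to a made-up data frame df:

```r
library(purrr)

# Hypothetical data frame with one factor column
df <- data.frame(a = 1:3, b = factor(c("x", "y", "z")))

# map_if() returns a plain list of columns
as_list <- map_if(df, is.factor, as.character)

# modify_if() keeps the container type, so the result is still a data frame
as_df <- modify_if(df, is.factor, as.character)
```

When the result should stay a data frame for later steps, modify_if avoids a conversion back from a list.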
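array_branch() and array_tree() make arrays usable with purrr's functions by turning them into lists. The details of the coercion are controlled by the margin argument. array_tree() creates a hierarchical list (a tree) with as many levels as there are dimensions specified in margin, while array_branch() creates a flat list (by analogy, a branch) along all the mentioned dimensions.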
x <- array(1:12, c(2, 2, 3))
array_branch(x)
## [[1]]
## [1] 1
##
## [[2]]
## [1] 2
##
## [[3]]
## [1] 3
##
## [[4]]
## [1] 4
##
## [[5]]
## [1] 5
##
## [[6]]
## [1] 6
##
## [[7]]
## [1] 7
##
## [[8]]
## [1] 8
##
## [[9]]
## [1] 9
##
## [[10]]
## [1] 10
##
## [[11]]
## [1] 11
##
## [[12]]
## [1] 12
array_branch(x, 1)
## [[1]]
## [,1] [,2] [,3]
## [1,] 1 5 9
## [2,] 3 7 11
##
## [[2]]
## [,1] [,2] [,3]
## [1,] 2 6 10
## [2,] 4 8 12
array_tree(x)
## [[1]]
## [[1]][[1]]
## [[1]][[1]][[1]]
## [1] 1
##
## [[1]][[1]][[2]]
## [1] 5
##
## [[1]][[1]][[3]]
## [1] 9
##
##
## [[1]][[2]]
## [[1]][[2]][[1]]
## [1] 3
##
## [[1]][[2]][[2]]
## [1] 7
##
## [[1]][[2]][[3]]
## [1] 11
##
##
##
## [[2]]
## [[2]][[1]]
## [[2]][[1]][[1]]
## [1] 2
##
## [[2]][[1]][[2]]
## [1] 6
##
## [[2]][[1]][[3]]
## [1] 10
##
##
## [[2]][[2]]
## [[2]][[2]][[1]]
## [1] 4
##
## [[2]][[2]][[2]]
## [1] 8
##
## [[2]][[2]][[3]]
## [1] 12
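For a two-dimensional case, the margin argument behaves like MARGIN in apply: 1 splits by rows and 2 by columns. A small sketch on a made-up matrix m:

```r
library(purrr)

# 2 x 3 matrix: first row is 1 3 5, second row is 2 4 6
m <- matrix(1:6, nrow = 2)

# Splitting along margin 1 gives one vector per row
rows <- array_branch(m, 1)

# Splitting along margin 2 gives one vector per column
cols <- array_branch(m, 2)

# The pieces can then be fed to any map function
row_sums <- map_dbl(rows, sum)
```

rows has two elements, cols has three, and row_sums is 9, 12.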
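cross2() returns the Cartesian product of the elements of .x and .y. cross3() takes an additional .z argument. cross() takes a list .l and returns the Cartesian product of all its elements as a list, with one combination per element. cross_df() is like cross() but returns a data frame with one combination per row.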
data <- list(
id = c("John", "Jane"),
greeting = c("Hello.", "Bonjour."),
sep = c("! ", "... ")
)
data %>%
cross()
## [[1]]
## [[1]]$id
## [1] "John"
##
## [[1]]$greeting
## [1] "Hello."
##
## [[1]]$sep
## [1] "! "
##
##
## [[2]]
## [[2]]$id
## [1] "Jane"
##
## [[2]]$greeting
## [1] "Hello."
##
## [[2]]$sep
## [1] "! "
##
##
## [[3]]
## [[3]]$id
## [1] "John"
##
## [[3]]$greeting
## [1] "Bonjour."
##
## [[3]]$sep
## [1] "! "
##
##
## [[4]]
## [[4]]$id
## [1] "Jane"
##
## [[4]]$greeting
## [1] "Bonjour."
##
## [[4]]$sep
## [1] "! "
##
##
## [[5]]
## [[5]]$id
## [1] "John"
##
## [[5]]$greeting
## [1] "Hello."
##
## [[5]]$sep
## [1] "... "
##
##
## [[6]]
## [[6]]$id
## [1] "Jane"
##
## [[6]]$greeting
## [1] "Hello."
##
## [[6]]$sep
## [1] "... "
##
##
## [[7]]
## [[7]]$id
## [1] "John"
##
## [[7]]$greeting
## [1] "Bonjour."
##
## [[7]]$sep
## [1] "... "
##
##
## [[8]]
## [[8]]$id
## [1] "Jane"
##
## [[8]]$greeting
## [1] "Bonjour."
##
## [[8]]$sep
## [1] "... "
args <- data %>% cross_df()
args
## # A tibble: 8 x 3
## id greeting sep
## <chr> <chr> <chr>
## 1 John Hello. "! "
## 2 Jane Hello. "! "
## 3 John Bonjour. "! "
## 4 Jane Bonjour. "! "
## 5 John Hello. "... "
## 6 Jane Hello. "... "
## 7 John Bonjour. "... "
## 8 Jane Bonjour. "... "
filter <- function(x, y) x >= y
cross2(1:5, 1:5, .filter = filter) %>% str()
## List of 10
## $ :List of 2
## ..$ : int 1
## ..$ : int 2
## $ :List of 2
## ..$ : int 1
## ..$ : int 3
## $ :List of 2
## ..$ : int 2
## ..$ : int 3
## $ :List of 2
## ..$ : int 1
## ..$ : int 4
## $ :List of 2
## ..$ : int 2
## ..$ : int 4
## $ :List of 2
## ..$ : int 3
## ..$ : int 4
## $ :List of 2
## ..$ : int 1
## ..$ : int 5
## $ :List of 2
## ..$ : int 2
## ..$ : int 5
## $ :List of 2
## ..$ : int 3
## ..$ : int 5
## $ :List of 2
## ..$ : int 4
## ..$ : int 5
seq_len(3) %>%
cross2(., ., .filter = `==`) %>%
map(setNames, c("x", "y"))
## [[1]]
## [[1]]$x
## [1] 2
##
## [[1]]$y
## [1] 1
##
##
## [[2]]
## [[2]]$x
## [1] 3
##
## [[2]]$y
## [1] 1
##
##
## [[3]]
## [[3]]$x
## [1] 1
##
## [[3]]$y
## [1] 2
##
##
## [[4]]
## [[4]]$x
## [1] 3
##
## [[4]]$y
## [1] 2
##
##
## [[5]]
## [[5]]$x
## [1] 1
##
## [[5]]$y
## [1] 3
##
##
## [[6]]
## [[6]]$x
## [1] 2
##
## [[6]]$y
## [1] 3
seq_len(3) %>%
list(x = ., y = .) %>%
cross(.filter = `==`)
## [[1]]
## [[1]]$x
## [1] 2
##
## [[1]]$y
## [1] 1
##
##
## [[2]]
## [[2]]$x
## [1] 3
##
## [[2]]$y
## [1] 1
##
##
## [[3]]
## [[3]]$x
## [1] 1
##
## [[3]]$y
## [1] 2
##
##
## [[4]]
## [[4]]$x
## [1] 3
##
## [[4]]$y
## [1] 2
##
##
## [[5]]
## [[5]]$x
## [1] 1
##
## [[5]]$y
## [1] 3
##
##
## [[6]]
## [[6]]$x
## [1] 2
##
## [[6]]$y
## [1] 3
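The cross family was deprecated in purrr 1.0.0 in favour of tidyr::expand_grid(). As a rough sketch that relies only on base R (expand.grid() is a stand-in chosen here, not something the examples above use), the same combinations can be built as follows. One detail worth noting: .filter removes the combinations for which the predicate returns TRUE, whereas the subsetting below keeps the rows for which the condition is TRUE.

```r
# Base R stand-in for cross_df(): one row per combination,
# with the first variable varying fastest.
data <- list(
  id = c("John", "Jane"),
  greeting = c("Hello.", "Bonjour."),
  sep = c("! ", "... ")
)
combos <- expand.grid(data, stringsAsFactors = FALSE)
print(combos)

# Stand-in for cross2(1:5, 1:5, .filter = function(x, y) x >= y):
# dropping the pairs with x >= y is the same as keeping those with x < y.
pairs <- expand.grid(x = 1:5, y = 1:5)
kept <- pairs[pairs$x < pairs$y, ]
print(nrow(kept))
```

The result is a plain data frame rather than a tibble or a nested list, so this is only an approximation of what cross() and cross_df() return.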
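reduce() combines the elements of a vector into a single value. The combination is driven by .f, a binary function that takes two values and returns one: reducing f over 1:3 computes f(f(1, 2), 3).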
1:3 %>% reduce(`+`)
## [1] 6
paste2 <- function(x, y, sep = ".") paste(x, y, sep = sep)
letters[1:4] %>% reduce(paste2)
## [1] "a.b.c.d"
letters[1:4] %>% reduce2(c("-", ".", "-"), paste2)
## [1] "a-b.c-d"
x <- list(c(0, 1), c(2, 3), c(4, 5))
y <- list(c(6, 7), c(8, 9))
reduce2(x, y, paste)
## [1] "0 2 6 4 8" "1 3 7 5 9"
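To make the nesting order of reduce() visible, the sketch below uses a helper that records each call as text instead of computing a number. The name trace_call is hypothetical and introduced only for this illustration.

```r
library(purrr)

# Hypothetical helper: build a string describing the call that was made
trace_call <- function(acc, x) paste0("f(", acc, ", ", x, ")")

# Left-to-right reduction: f(f(1, 2), 3)
nested <- reduce(1:3, trace_call)
print(nested)

# .init supplies the starting accumulator, so the first call is f(0, 1)
with_init <- reduce(1:3, trace_call, .init = 0)
print(with_init)

# Base R's Reduce() folds in the same left-to-right order by default
base_nested <- Reduce(trace_call, 1:3)
print(base_nested)
```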
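accumulate() sequentially applies a two-argument function to the elements of a vector. Each call receives the initial value or the result of the previous call as its first argument, and the next element of the vector as its second. The result of every call is kept and returned together; in the examples here accumulate() returns an atomic vector and accumulate2() returns a list. The accumulation can optionally stop before the whole vector has been processed, in response to a done() signal returned by the accumulating function.
In contrast to accumulate(), reduce() applies the two-argument function in the same way but discards every result except that of the final call.
accumulate2() sequentially applies a function to the elements of two lists, .x and .y.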
1:5 %>% accumulate(`+`)
## [1] 1 3 6 10 15
accumulate(letters[1:5], paste, sep = ".")
## [1] "a" "a.b" "a.b.c" "a.b.c.d" "a.b.c.d.e"
accumulate(letters[1:5], paste, sep = ".", .dir = "backward")
## [1] "a.b.c.d.e" "b.c.d.e" "c.d.e" "d.e" "e"
paste2 <- function(x, y, sep = ".") paste(x, y, sep = sep)
letters[1:4] %>% accumulate(paste2)
## [1] "a" "a.b" "a.b.c" "a.b.c.d"
letters[1:4] %>% accumulate2(c("-", ".", "-"), paste2)
## [[1]]
## [1] "a"
##
## [[2]]
## [1] "a-b"
##
## [[3]]
## [1] "a-b.c"
##
## [[4]]
## [1] "a-b.c-d"
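The relationship between accumulate() and reduce(), and the done() signal mentioned above, can be sketched as follows. The helper stop_when_long and its length threshold are assumptions made up for this illustration.

```r
library(purrr)

running <- accumulate(1:5, `+`)

# The running totals match base R's cumsum() ...
print(all(running == cumsum(1:5)))
# ... and the last running total is the value reduce() returns
print(running[length(running)] == reduce(1:5, `+`))

# Early termination: once the accumulated string is longer than
# 4 characters, return it wrapped in done() to stop the accumulation.
stop_when_long <- function(acc, x) {
  if (nchar(acc) > 4) {
    return(done(acc))
  }
  paste(acc, x, sep = ".")
}
short <- accumulate(letters, stop_when_long)
print(short)
```

Because the accumulation stops early, short has far fewer elements than letters.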
compose() combines multiple functions into one.
add1 <- function(x) x + 1
compose(add1, add1)(8)
## [1] 10
fn <- compose(~ paste(.x, "foo"), ~ paste(.x, "bar"))
fn("input")
## [1] "input bar foo"
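By default compose() applies its functions from right to left, which is why "bar" is appended before "foo" above. A small sketch of the .dir argument, using two hypothetical helpers add_foo and add_bar, shows both orders side by side:

```r
library(purrr)

# Hypothetical helpers for illustration
add_foo <- function(x) paste(x, "foo")
add_bar <- function(x) paste(x, "bar")

# Default direction ("backward"): the last function runs first
backward <- compose(add_foo, add_bar)
# "forward": functions run in the order they are listed
forward <- compose(add_foo, add_bar, .dir = "forward")

print(backward("input"))  # same as add_foo(add_bar("input"))
print(forward("input"))   # same as add_bar(add_foo("input"))
```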
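The lift functions change the type of input a function accepts, for example letting a function that takes separate arguments take a single list instead.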
x <- list(x = c(1:100, NA, 1000), na.rm = TRUE, trim = 0.9)
lift_dl(mean)(x)
## [1] 51
lift(mean)(x)
## [1] 51
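The lift_*() functions were deprecated in purrr 1.0.0. As a sketch of what lift_dl(mean)(x) does, base R's do.call() spreads the elements of a list into the arguments of a function in the same way. The result is 51 because a trim of 0.5 or more makes mean() return the median.

```r
x <- list(x = c(1:100, NA, 1000), na.rm = TRUE, trim = 0.9)

# do.call() turns the named list elements into named arguments,
# i.e. it calls mean(x = ..., na.rm = TRUE, trim = 0.9)
via_do_call <- do.call(mean, x)
print(via_do_call)

# The same call written out by hand
direct <- mean(c(1:100, NA, 1000), na.rm = TRUE, trim = 0.9)
print(direct)
```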
rerun() evaluates an expression n times and returns the results as a list.
10 %>% rerun(rnorm(5))
## [[1]]
## [1] -0.9056931 1.6306982 1.6524178 0.4251034 1.5473966
##
## [[2]]
## [1] -1.358256 -1.099187 1.141827 1.520544 -1.238645
##
## [[3]]
## [1] -1.2699996 -0.9242835 -0.7316490 -1.5328326 1.1109396
##
## [[4]]
## [1] -0.9821572 -0.5997459 -1.5452152 -0.7241080 0.5180241
##
## [[5]]
## [1] -0.4849522 -1.1122814 0.3323760 -1.7095975 0.8948831
##
## [[6]]
## [1] -0.61528961 0.86061005 -0.08723839 -0.92864749 -0.88771569
##
## [[7]]
## [1] 0.7054922 -0.8543814 0.6537769 -0.3061426 0.4992769
##
## [[8]]
## [1] -0.4998756 -1.4026105 1.9358640 -1.4204936 -1.4207250
##
## [[9]]
## [1] -0.2076087 0.7225889 1.0549358 -0.1766170 1.0105946
##
## [[10]]
## [1] 0.02485606 -0.04419857 1.59732266 1.24072346 -0.76393331
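rerun() was deprecated in purrr 1.0.0. A minimal sketch of two equivalent ways to repeat the expression follows; the seed value is arbitrary and only there to make the draws repeatable.

```r
library(purrr)

set.seed(42)  # arbitrary seed, for reproducibility only

# map() over 1:10 and ignore the index: ten independent draws of 5 values
draws <- map(1:10, ~ rnorm(5))

# Base R equivalent: replicate() with simplify = FALSE also returns a list
draws_base <- replicate(10, rnorm(5), simplify = FALSE)

print(length(draws))
print(lengths(draws_base))
```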
negate() turns a predicate function into its negation.
is.na(NA)
## [1] TRUE
negate(is.na)(NA)
## [1] FALSE
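A negated predicate is most useful when passed to another purrr function. The sketch below, with made-up sample values, keeps the non-missing elements of a list and shows that discard() with the original predicate gives the same result.

```r
library(purrr)

values <- list(1, NA, 3)  # sample data for illustration

# keep() retains the elements for which the predicate is TRUE
not_na <- negate(is.na)
kept <- keep(values, not_na)

# discard() with the un-negated predicate is equivalent
dropped <- discard(values, is.na)

str(kept)
```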
partial() creates a version of a function with some of its arguments pre-filled.
my_long_variable <- 1:10
plot2 <- partial(plot, my_long_variable)
plot2()
plot2(runif(10), type = "l")
In other words, partial() builds a new function from an existing one.
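Since the plot example only produces a figure, here is a small non-graphical sketch of the same idea. The name mean_na_rm is hypothetical.

```r
library(purrr)

# Pre-fill na.rm = TRUE so callers no longer have to pass it
mean_na_rm <- partial(mean, na.rm = TRUE)

result <- mean_na_rm(c(1, NA, 3))
print(result)

# Equivalent call without partial()
print(mean(c(1, NA, 3), na.rm = TRUE))
```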
safely(), quietly(), possibly(), and auto_browse() wrap an existing function to change how it handles errors and other side effects.
safe_log <- safely(log)
safe_log(10)
## $result
## [1] 2.302585
##
## $error
## NULL
safe_log("a")
## $result
## NULL
##
## $error
## <simpleError in .Primitive("log")(x, base): non-numeric argument to mathematical function>
list("a", 10, 100) %>%
map_dbl(possibly(log, 0))
## [1] 0.000000 2.302585 4.605170
As the output shows, the call that raised an error returns 0 instead.
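quietly() is not demonstrated above, so here is a short sketch of it, together with one way of finding which inputs failed under safely(). The variable names are made up for this illustration.

```r
library(purrr)

# quietly() captures printed output, warnings and messages
# instead of emitting them
quiet_sqrt <- quietly(sqrt)
res <- quiet_sqrt(-1)
print(res$result)    # the value sqrt() returned
print(res$warnings)  # the warning text that was captured

# safely() over several inputs: flag the calls that raised an error
inputs <- list("a", 10, 100)
attempts <- map(inputs, safely(log))
failed <- map_lgl(attempts, ~ !is.null(.x$error))
print(failed)
```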
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3 ✓ dplyr 1.0.5
## ✓ tibble 3.1.0 ✓ stringr 1.4.0
## ✓ tidyr 1.1.3 ✓ forcats 0.5.1
## ✓ readr 1.4.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
n_iris <- iris %>% group_by(Species) %>% nest()
n_iris %>% unnest(cols = c(data))
## # A tibble: 150 x 5
## # Groups: Species [3]
## Species Sepal.Length Sepal.Width Petal.Length Petal.Width
## <fct> <dbl> <dbl> <dbl> <dbl>
## 1 setosa 5.1 3.5 1.4 0.2
## 2 setosa 4.9 3 1.4 0.2
## 3 setosa 4.7 3.2 1.3 0.2
## 4 setosa 4.6 3.1 1.5 0.2
## 5 setosa 5 3.6 1.4 0.2
## 6 setosa 5.4 3.9 1.7 0.4
## 7 setosa 4.6 3.4 1.4 0.3
## 8 setosa 5 3.4 1.5 0.2
## 9 setosa 4.4 2.9 1.4 0.2
## 10 setosa 4.9 3.1 1.5 0.1
## # … with 140 more rows
1 make a list column
n_iris <- iris %>% group_by(Species) %>% nest()
2 work with list column
mod_fun <- function(df) lm(Sepal.Length ~ ., data = df)
m_iris <- n_iris %>%
mutate(model = map(data, mod_fun))
3 simplify the list column
b_fun <- function(mod) coefficients(mod)[[1]]
m_iris %>% transmute(Species, beta = map_dbl(model, b_fun))
## # A tibble: 3 x 2
## # Groups: Species [3]
## Species beta
## <fct> <dbl>
## 1 setosa 2.35
## 2 versicolor 1.90
## 3 virginica 0.700
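The same three steps (split the data, fit a model per group, simplify to one number per model) can be sketched without nest(), using split() as in the earlier mtcars example. Extracting R-squared is an extra chosen here for illustration; it is not part of the original workflow.

```r
library(purrr)

# 1. One data frame per species, without the Species column
by_species <- split(iris[, names(iris) != "Species"], iris$Species)

# 2. Fit the same linear model to each piece
models <- map(by_species, ~ lm(Sepal.Length ~ ., data = .x))

# 3. Simplify each model to a single number
intercepts <- map_dbl(models, ~ coef(.x)[[1]])
r_squared <- map_dbl(models, ~ summary(.x)$r.squared)

print(intercepts)
print(r_squared)
```

The nested-tibble version keeps the data, the models and the summaries side by side in one table, which is the main reason to prefer it once there are several list columns.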
## install.packages("miniCRAN")
library(miniCRAN)
tags <- "xts"
pkgDep(tags, availPkgs = cranJuly2014)
## [1] "xts" "zoo" "lattice" "timeDate" "quadprog"
## [6] "Hmisc" "survival" "Formula" "latticeExtra" "cluster"
## [11] "RColorBrewer" "BH" "timeSeries" "tseries" "its"
## [16] "chron" "fts" "tis"
Plot the dependency graph.
dg <- makeDepGraph(tags, enhances = TRUE, availPkgs = cranJuly2014)
plot(dg, legendPosition = c(-1, 1), vertex.size = 20)
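# Repositories configured for the current session
repos <- getOption("repos")
repos
# Build the contrib URL and fetch the index of available packages
curl <- contrib.url(repos)
aps <- available.packages(curl)
# Show the index entry for xts
aps[which(row.names(aps) == "xts"), ]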
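From the index of all available packages, find the list of dependencies for the packages of interest.
The first libs assignment below is overwritten by the second, so only the dependencies of tidymodels are resolved.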
libs<-c("arules","rmarkdown","rJava","tidyverse","data.table","ggplot2","sparklyr","DBI","prophet","h2o","Hmisc","randomForest","scorecard","pROC","RJDBC","RMySQL","rsconnect")
libs <- c("tidymodels")
pkgList<-pkgDep(pkg=libs,availPkgs=aps,repos=repos)
pkgList
Run the download function to download all of the dependencies.
dp<-download.packages(pkgList,"/Users/milin/R语言统计分析/packages/R/tidymodels",type=getOption("pkgType"))
dp