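This document mainly covers the purrr, stringr, tidymodels, dplyr, collapse, and nimble packages.

purrr

Iterating over elements

In the purrr package, the functions for iterating over elements include map, map2, pmap, invoke_map, lmap, and others. We start with map.

map

The map functions transform their input by applying a function to each element of a list or atomic vector and returning an object of the same length as the input. map always returns a list; map_lgl(), map_int(), map_dbl(), and map_chr() return an atomic vector of the corresponding type; map_dfr() and map_dfc() return a data frame built by row-binding and column-binding, respectively. The usage is as follows: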

map(.x, .f, ...)

map_lgl(.x, .f, ...)

map_chr(.x, .f, ...)

map_int(.x, .f, ...)

map_dbl(.x, .f, ...)

map_raw(.x, .f, ...)

map_dfr(.x, .f, ..., .id = NULL)

map_dfc(.x, .f, ...)

walk(.x, .f, ...)
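As a minimal sketch of how the return types differ (the input vector is made up for illustration), the same kind of computation can be run through map, map_dbl, map_chr, and walk:

```r
library(purrr)

x <- c(1, 4, 9)

as_list <- map(x, sqrt)                    # always a list
as_dbl  <- map_dbl(x, sqrt)                # double vector
as_chr  <- map_chr(x, ~ paste0("v", .x))   # character vector

# walk() is called for its side effects and returns its input invisibly
walked <- walk(x, ~ invisible(.x))
```

The typed variants fail with an error when the function returns something of the wrong type or length, which makes them a safer choice than map followed by unlist.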

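About the arguments:

.x: a list or an atomic vector. .f: a function, a formula, or a vector (a character or numeric vector is used as an extractor by name or position).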
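The different forms that .f can take may be easier to see side by side. This is a small sketch; the people list is a hypothetical example and not part of the original text:

```r
library(purrr)

people <- list(
  list(name = "Ann", age = 31),
  list(name = "Bob", age = 45)
)

by_fun     <- map_dbl(people, function(p) p$age)  # ordinary function
by_formula <- map_dbl(people, ~ .x$age)           # formula shorthand
by_name    <- map_dbl(people, "age")              # character: extract by name
by_pos     <- map_chr(people, 1)                  # number: extract by position
```

The first three calls return the same ages; the last one extracts the first component of every element, here the name.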

1 Vectors

library(purrr)

1:5 %>% map(rnorm,n=10)
## [[1]]
##  [1]  0.26183272  0.64823698  1.12379658  1.25297323  1.31303127  1.79512995
##  [7]  0.04887375 -0.03267512  0.49480743  0.42284264
## 
## [[2]]
##  [1] 3.1370780 1.8082809 3.4511962 1.6337153 1.6445375 2.0303527 3.9565843
##  [8] 1.2004453 1.3725841 0.3694713
## 
## [[3]]
##  [1] 3.465118 2.398062 2.550844 2.673526 4.893808 4.612892 2.245932 3.725401
##  [9] 4.893135 1.000897
## 
## [[4]]
##  [1] 4.673522 2.810388 4.588789 2.402792 3.456575 3.852705 4.487871 3.410695
##  [9] 4.851114 4.277678
## 
## [[5]]
##  [1] 5.446313 4.067071 6.363740 5.655202 4.452537 5.426301 4.714484 5.854443
##  [9] 5.133658 5.429843

The code above is equivalent to

1:5 %>%
  map(function(x) rnorm(10, x))

which is also equivalent to

1:5 %>%
  map(~ rnorm(10, .x))
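One way to check that the three spellings really are equivalent is to fix the random seed before each call. This is only a sketch; the seed value and the shorter vectors are arbitrary choices:

```r
library(purrr)

set.seed(1)
r1 <- map(1:3, rnorm, n = 5)                # extra argument passed by name

set.seed(1)
r2 <- map(1:3, function(x) rnorm(5, x))     # anonymous function

set.seed(1)
r3 <- map(1:3, ~ rnorm(5, .x))              # formula shorthand

same <- identical(r1, r2) && identical(r2, r3)
```

In the first form n is matched by name, so each element of 1:3 fills the next free argument of rnorm, which is mean.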

We can then summarise the output of map:

1:5 %>%
  map(~ rnorm(10, .x)) %>% map_dbl(mean)
## [1] 1.084967 1.437935 3.124143 4.075858 4.842703

With map, we can fit a regression model for each group:

mtcars %>%
  split(.$cyl) %>%
  map(~ lm(mpg ~ wt, data = .x)) %>%
  map(summary)
## $`4`
## 
## Call:
## lm(formula = mpg ~ wt, data = .x)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.1513 -1.9795 -0.6272  1.9299  5.2523 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   39.571      4.347   9.104 7.77e-06 ***
## wt            -5.647      1.850  -3.052   0.0137 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.332 on 9 degrees of freedom
## Multiple R-squared:  0.5086, Adjusted R-squared:  0.454 
## F-statistic: 9.316 on 1 and 9 DF,  p-value: 0.01374
## 
## 
## $`6`
## 
## Call:
## lm(formula = mpg ~ wt, data = .x)
## 
## Residuals:
##      Mazda RX4  Mazda RX4 Wag Hornet 4 Drive        Valiant       Merc 280 
##        -0.1250         0.5840         1.9292        -0.6897         0.3547 
##      Merc 280C   Ferrari Dino 
##        -1.0453        -1.0080 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)   28.409      4.184   6.789  0.00105 **
## wt            -2.780      1.335  -2.083  0.09176 . 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.165 on 5 degrees of freedom
## Multiple R-squared:  0.4645, Adjusted R-squared:  0.3574 
## F-statistic: 4.337 on 1 and 5 DF,  p-value: 0.09176
## 
## 
## $`8`
## 
## Call:
## lm(formula = mpg ~ wt, data = .x)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.1491 -1.4664 -0.8458  1.5711  3.7619 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  23.8680     3.0055   7.942 4.05e-06 ***
## wt           -2.1924     0.7392  -2.966   0.0118 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.024 on 12 degrees of freedom
## Multiple R-squared:  0.423,  Adjusted R-squared:  0.3749 
## F-statistic: 8.796 on 1 and 12 DF,  p-value: 0.01179

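Note how the formula is invoked here: inside the ~ lambda, .x refers to the current element, which is the data frame for one cyl group.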
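Printing three full summaries is verbose. As a sketch of a more compact follow-up, the same pipeline can end with a character extractor that pulls a single component, here r.squared, out of each summary:

```r
library(purrr)

by_cyl <- split(mtcars, mtcars$cyl)

r2 <- by_cyl %>%
  map(~ lm(mpg ~ wt, data = .x)) %>%   # one model per cyl group
  map(summary) %>%                     # one summary per model
  map_dbl("r.squared")                 # extract one number per summary
```

The result is a named numeric vector with one entry per cyl group, which is easier to compare than the printed summaries.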

map2

map2 is similar to map, but it iterates over two lists in parallel. Let us look at an example:

x <- list(1, 1, 1)
y <- list(10, 20, 30)

map2(x, y, ~ .x + .y) # equivalent to map2(x, y, `+`)
## [[1]]
## [1] 11
## 
## [[2]]
## [1] 21
## 
## [[3]]
## [1] 31

We can also fit a model per group and then predict per group:

by_cyl <- mtcars %>% split(.$cyl)
mods <- by_cyl %>% map(~ lm(mpg ~ wt, data = .))
map2(mods, by_cyl, predict)
## $`4`
##     Datsun 710      Merc 240D       Merc 230       Fiat 128    Honda Civic 
##       26.47010       21.55719       21.78307       27.14774       30.45125 
## Toyota Corolla  Toyota Corona      Fiat X1-9  Porsche 914-2   Lotus Europa 
##       29.20890       25.65128       28.64420       27.48656       31.02725 
##     Volvo 142E 
##       23.87247 
## 
## $`6`
##      Mazda RX4  Mazda RX4 Wag Hornet 4 Drive        Valiant       Merc 280 
##       21.12497       20.41604       19.47080       18.78968       18.84528 
##      Merc 280C   Ferrari Dino 
##       18.84528       20.70795 
## 
## $`8`
##   Hornet Sportabout          Duster 360          Merc 450SE          Merc 450SL 
##            16.32604            16.04103            14.94481            15.69024 
##         Merc 450SLC  Cadillac Fleetwood Lincoln Continental   Chrysler Imperial 
##            15.58061            12.35773            11.97625            12.14945 
##    Dodge Challenger         AMC Javelin          Camaro Z28    Pontiac Firebird 
##            16.15065            16.33700            15.44907            15.43811 
##      Ford Pantera L       Maserati Bora 
##            16.91800            16.04103

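Here split first breaks the dataset into several parts by cyl. map then fits a regression model on each part, and finally map2 calls predict on each model together with its matching data.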
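The same pairing of model and data can feed a per-group error metric. The following is a sketch that assumes in-sample root mean squared error is the quantity of interest; it is not part of the original walkthrough:

```r
library(purrr)

by_cyl <- split(mtcars, mtcars$cyl)
mods   <- map(by_cyl, ~ lm(mpg ~ wt, data = .x))

# .x is the model, .y is the data frame it was fitted on
rmse <- map2_dbl(mods, by_cyl, ~ sqrt(mean((.y$mpg - predict(.x, .y))^2)))
```

Because map2_dbl returns a named numeric vector, the three groups can be compared directly.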

When more than two lists need to be processed in parallel, use pmap. Example code:

x <- rnorm(10)
y <- rnorm(10)
z <- rnorm(10)
a <- rnorm(10)

pmap(list(x, y, z,a), sum)
## [[1]]
## [1] 3.770395
## 
## [[2]]
## [1] -1.972205
## 
## [[3]]
## [1] -0.2585453
## 
## [[4]]
## [1] -0.03204745
## 
## [[5]]
## [1] -2.732995
## 
## [[6]]
## [1] 0.4160886
## 
## [[7]]
## [1] -1.071157
## 
## [[8]]
## [1] -0.7131349
## 
## [[9]]
## [1] -1.44945
## 
## [[10]]
## [1] -2.542039

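In this example we first create four numeric vectors (not lists) and want the sum of the corresponding elements across them, which is what pmap does. Another way to write it is: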

pmap(list(x,y,z,a),function(a,b,c,d) a+b+c+d)
## [[1]]
## [1] 3.770395
## 
## [[2]]
## [1] -1.972205
## 
## [[3]]
## [1] -0.2585453
## 
## [[4]]
## [1] -0.03204745
## 
## [[5]]
## [1] -2.732995
## 
## [[6]]
## [1] 0.4160886
## 
## [[7]]
## [1] -1.071157
## 
## [[8]]
## [1] -0.7131349
## 
## [[9]]
## [1] -1.44945
## 
## [[10]]
## [1] -2.542039
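When every result is a single number, a typed variant avoids the long list output. A minimal sketch with made-up inputs, also showing that a named list is matched to the function's arguments by name:

```r
library(purrr)

x <- c(1, 2, 3)
y <- c(10, 20, 30)
z <- c(100, 200, 300)

# One double per position instead of a list of length-one vectors
totals <- pmap_dbl(list(x, y, z), sum)

# With a named list, elements are matched to arguments by name
labels <- pmap_chr(
  list(prefix = c("a", "b"), n = c(1, 2)),
  function(prefix, n) paste0(prefix, n)
)
```

For plain element-wise addition of numeric vectors, x + y + z is simpler and vectorised; pmap is more useful when the function is not vectorised.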

invoke_map

map iterates over data; to iterate over functions, use invoke_map. First, invoke calls a single function with a list of arguments:

list(c("A","B","C"), c("a","b","c")) %>%
  invoke(paste, ., sep = "-")
## [1] "A-a" "B-b" "C-c"

If we have two functions, invoke_map calls each of them with the same argument list:

invoke_map(list(runif, rnorm), list(list(n = 10)))
## [[1]]
##  [1] 0.17100382 0.31905889 0.67579353 0.73374089 0.74985077 0.69917388
##  [7] 0.11186007 0.08910544 0.95665860 0.22939053
## 
## [[2]]
##  [1] -0.02762745 -1.73680261 -0.92023588  0.79473687  0.02973006  0.61277657
##  [7]  0.01774265 -1.10955484  0.91303617 -0.56892899
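As far as I know, recent purrr releases (1.0.0 and later) mark invoke() and invoke_map() as deprecated in favour of exec(). The sketch below shows the same two calls in that style, under that assumption; exec is called through rlang, which purrr depends on:

```r
library(purrr)

args <- list(c("A", "B", "C"), c("a", "b", "c"))

# One function, a list of arguments (the job invoke() does above)
joined <- rlang::exec(paste, !!!args, sep = "-")

# Several functions, the same arguments (the job invoke_map() does above)
draws <- map(list(runif, rnorm), function(f) f(n = 3))
```

The !!! operator splices the list into separate arguments, so the first call is the same as paste(args[[1]], args[[2]], sep = "-").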

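Extensions of map

map has several variants: map_if, map_at, and map_depth.

map_if first tests each element with a predicate and then applies the corresponding function. Let us look at an example: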

iris %>% map_if(is.factor,as.character,.else = as.integer)
## $Sepal.Length
##   [1] 5 4 4 4 5 5 4 5 4 4 5 4 4 4 5 5 5 5 5 5 5 5 4 5 4 5 5 5 5 4 4 5 5 5 4 5 5
##  [38] 4 4 5 5 4 4 5 5 4 5 4 5 5 7 6 6 5 6 5 6 4 6 5 5 5 6 6 5 6 5 5 6 5 5 6 6 6
##  [75] 6 6 6 6 6 5 5 5 5 6 5 6 6 6 5 5 5 6 5 5 5 5 5 6 5 5 6 5 7 6 6 7 4 7 6 7 6
## [112] 6 6 5 5 6 6 7 7 6 6 5 7 6 6 7 6 6 6 7 7 7 6 6 6 7 6 6 6 6 6 6 5 6 6 6 6 6
## [149] 6 5
## 
## $Sepal.Width
##   [1] 3 3 3 3 3 3 3 3 2 3 3 3 3 3 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 3 3 3
##  [38] 3 3 3 3 2 3 3 3 3 3 3 3 3 3 3 3 2 2 2 3 2 2 2 2 3 2 2 2 3 3 2 2 2 3 2 2 2
##  [75] 2 3 2 3 2 2 2 2 2 2 3 3 3 2 3 2 2 3 2 2 2 3 2 2 2 2 3 2 3 2 3 3 2 2 2 3 3
## [112] 2 3 2 2 3 3 3 2 2 3 2 2 2 3 3 2 3 2 3 2 3 2 2 2 3 3 3 3 3 3 3 2 3 3 3 2 3
## [149] 3 3
## 
## $Petal.Length
##   [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
##  [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 4 4 4 4 4 4 4 3 4 3 3 4 4 4 3 4 4 4 4 3 4 4 4 4
##  [75] 4 4 4 5 4 3 3 3 3 5 4 4 4 4 4 4 4 4 4 3 4 4 4 4 3 4 6 5 5 5 5 6 4 6 5 6 5
## [112] 5 5 5 5 5 5 6 6 5 5 4 6 4 5 6 4 4 5 5 6 6 5 5 5 6 5 5 4 5 5 5 5 5 5 5 5 5
## [149] 5 5
## 
## $Petal.Width
##   [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
##  [38] 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
##  [75] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1 2 2 1 1 1 2 2
## [112] 1 2 2 2 2 1 2 2 1 2 2 2 1 2 1 1 1 2 1 1 2 2 1 1 2 2 1 1 2 2 2 1 2 2 2 1 2
## [149] 2 1
## 
## $Species
##   [1] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##   [6] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [11] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [16] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [21] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [26] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [31] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [36] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [41] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [46] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [51] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [56] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [61] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [66] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [71] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [76] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [81] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [86] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [91] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [96] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [101] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [106] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [111] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [116] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [121] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [126] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [131] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [136] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [141] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [146] "virginica"  "virginica"  "virginica"  "virginica"  "virginica"

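The code above means: if a column is a factor, convert it to character; otherwise convert it to integer (which truncates the decimal part of the numeric columns, as the output shows).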
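The iris output is long, so the behaviour may be easier to follow on a tiny list. A sketch with made-up values:

```r
library(purrr)

mixed <- list(1, "a", 2.5)

# Elements that pass the predicate are transformed, the rest are left alone
doubled <- map_if(mixed, is.numeric, ~ .x * 2)

# .else supplies a function for the elements that fail the predicate
both <- map_if(mixed, is.numeric, ~ .x * 2, .else = toupper)
```

Without .else, elements that fail the predicate pass through unchanged.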

map_at applies a function only to the elements at specific positions. Let us look at a few examples:

iris %>% map_at(c(4, 5), is.numeric)
## $Sepal.Length
##   [1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4 5.1
##  [19] 5.7 5.1 5.4 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5 4.9 5.0
##  [37] 5.5 4.9 4.4 5.1 5.0 4.5 4.4 5.0 5.1 4.8 5.1 4.6 5.3 5.0 7.0 6.4 6.9 5.5
##  [55] 6.5 5.7 6.3 4.9 6.6 5.2 5.0 5.9 6.0 6.1 5.6 6.7 5.6 5.8 6.2 5.6 5.9 6.1
##  [73] 6.3 6.1 6.4 6.6 6.8 6.7 6.0 5.7 5.5 5.5 5.8 6.0 5.4 6.0 6.7 6.3 5.6 5.5
##  [91] 5.5 6.1 5.8 5.0 5.6 5.7 5.7 6.2 5.1 5.7 6.3 5.8 7.1 6.3 6.5 7.6 4.9 7.3
## [109] 6.7 7.2 6.5 6.4 6.8 5.7 5.8 6.4 6.5 7.7 7.7 6.0 6.9 5.6 7.7 6.3 6.7 7.2
## [127] 6.2 6.1 6.4 7.2 7.4 7.9 6.4 6.3 6.1 7.7 6.3 6.4 6.0 6.9 6.7 6.9 5.8 6.8
## [145] 6.7 6.7 6.3 6.5 6.2 5.9
## 
## $Sepal.Width
##   [1] 3.5 3.0 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 3.7 3.4 3.0 3.0 4.0 4.4 3.9 3.5
##  [19] 3.8 3.8 3.4 3.7 3.6 3.3 3.4 3.0 3.4 3.5 3.4 3.2 3.1 3.4 4.1 4.2 3.1 3.2
##  [37] 3.5 3.6 3.0 3.4 3.5 2.3 3.2 3.5 3.8 3.0 3.8 3.2 3.7 3.3 3.2 3.2 3.1 2.3
##  [55] 2.8 2.8 3.3 2.4 2.9 2.7 2.0 3.0 2.2 2.9 2.9 3.1 3.0 2.7 2.2 2.5 3.2 2.8
##  [73] 2.5 2.8 2.9 3.0 2.8 3.0 2.9 2.6 2.4 2.4 2.7 2.7 3.0 3.4 3.1 2.3 3.0 2.5
##  [91] 2.6 3.0 2.6 2.3 2.7 3.0 2.9 2.9 2.5 2.8 3.3 2.7 3.0 2.9 3.0 3.0 2.5 2.9
## [109] 2.5 3.6 3.2 2.7 3.0 2.5 2.8 3.2 3.0 3.8 2.6 2.2 3.2 2.8 2.8 2.7 3.3 3.2
## [127] 2.8 3.0 2.8 3.0 2.8 3.8 2.8 2.8 2.6 3.0 3.4 3.1 3.0 3.1 3.1 3.1 2.7 3.2
## [145] 3.3 3.0 2.5 3.0 3.4 3.0
## 
## $Petal.Length
##   [1] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 1.5 1.6 1.4 1.1 1.2 1.5 1.3 1.4
##  [19] 1.7 1.5 1.7 1.5 1.0 1.7 1.9 1.6 1.6 1.5 1.4 1.6 1.6 1.5 1.5 1.4 1.5 1.2
##  [37] 1.3 1.4 1.3 1.5 1.3 1.3 1.3 1.6 1.9 1.4 1.6 1.4 1.5 1.4 4.7 4.5 4.9 4.0
##  [55] 4.6 4.5 4.7 3.3 4.6 3.9 3.5 4.2 4.0 4.7 3.6 4.4 4.5 4.1 4.5 3.9 4.8 4.0
##  [73] 4.9 4.7 4.3 4.4 4.8 5.0 4.5 3.5 3.8 3.7 3.9 5.1 4.5 4.5 4.7 4.4 4.1 4.0
##  [91] 4.4 4.6 4.0 3.3 4.2 4.2 4.2 4.3 3.0 4.1 6.0 5.1 5.9 5.6 5.8 6.6 4.5 6.3
## [109] 5.8 6.1 5.1 5.3 5.5 5.0 5.1 5.3 5.5 6.7 6.9 5.0 5.7 4.9 6.7 4.9 5.7 6.0
## [127] 4.8 4.9 5.6 5.8 6.1 6.4 5.6 5.1 5.6 6.1 5.6 5.5 4.8 5.4 5.6 5.1 5.1 5.9
## [145] 5.7 5.2 5.0 5.2 5.4 5.1
## 
## $Petal.Width
## [1] TRUE
## 
## $Species
## [1] FALSE

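This call checks whether the fourth and fifth columns are numeric and leaves the other columns unchanged. Besides column positions, column names can also be used.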

iris %>% map_at("Species", tolower)
## $Sepal.Length
##   [1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4 5.1
##  [19] 5.7 5.1 5.4 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5 4.9 5.0
##  [37] 5.5 4.9 4.4 5.1 5.0 4.5 4.4 5.0 5.1 4.8 5.1 4.6 5.3 5.0 7.0 6.4 6.9 5.5
##  [55] 6.5 5.7 6.3 4.9 6.6 5.2 5.0 5.9 6.0 6.1 5.6 6.7 5.6 5.8 6.2 5.6 5.9 6.1
##  [73] 6.3 6.1 6.4 6.6 6.8 6.7 6.0 5.7 5.5 5.5 5.8 6.0 5.4 6.0 6.7 6.3 5.6 5.5
##  [91] 5.5 6.1 5.8 5.0 5.6 5.7 5.7 6.2 5.1 5.7 6.3 5.8 7.1 6.3 6.5 7.6 4.9 7.3
## [109] 6.7 7.2 6.5 6.4 6.8 5.7 5.8 6.4 6.5 7.7 7.7 6.0 6.9 5.6 7.7 6.3 6.7 7.2
## [127] 6.2 6.1 6.4 7.2 7.4 7.9 6.4 6.3 6.1 7.7 6.3 6.4 6.0 6.9 6.7 6.9 5.8 6.8
## [145] 6.7 6.7 6.3 6.5 6.2 5.9
## 
## $Sepal.Width
##   [1] 3.5 3.0 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 3.7 3.4 3.0 3.0 4.0 4.4 3.9 3.5
##  [19] 3.8 3.8 3.4 3.7 3.6 3.3 3.4 3.0 3.4 3.5 3.4 3.2 3.1 3.4 4.1 4.2 3.1 3.2
##  [37] 3.5 3.6 3.0 3.4 3.5 2.3 3.2 3.5 3.8 3.0 3.8 3.2 3.7 3.3 3.2 3.2 3.1 2.3
##  [55] 2.8 2.8 3.3 2.4 2.9 2.7 2.0 3.0 2.2 2.9 2.9 3.1 3.0 2.7 2.2 2.5 3.2 2.8
##  [73] 2.5 2.8 2.9 3.0 2.8 3.0 2.9 2.6 2.4 2.4 2.7 2.7 3.0 3.4 3.1 2.3 3.0 2.5
##  [91] 2.6 3.0 2.6 2.3 2.7 3.0 2.9 2.9 2.5 2.8 3.3 2.7 3.0 2.9 3.0 3.0 2.5 2.9
## [109] 2.5 3.6 3.2 2.7 3.0 2.5 2.8 3.2 3.0 3.8 2.6 2.2 3.2 2.8 2.8 2.7 3.3 3.2
## [127] 2.8 3.0 2.8 3.0 2.8 3.8 2.8 2.8 2.6 3.0 3.4 3.1 3.0 3.1 3.1 3.1 2.7 3.2
## [145] 3.3 3.0 2.5 3.0 3.4 3.0
## 
## $Petal.Length
##   [1] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 1.5 1.6 1.4 1.1 1.2 1.5 1.3 1.4
##  [19] 1.7 1.5 1.7 1.5 1.0 1.7 1.9 1.6 1.6 1.5 1.4 1.6 1.6 1.5 1.5 1.4 1.5 1.2
##  [37] 1.3 1.4 1.3 1.5 1.3 1.3 1.3 1.6 1.9 1.4 1.6 1.4 1.5 1.4 4.7 4.5 4.9 4.0
##  [55] 4.6 4.5 4.7 3.3 4.6 3.9 3.5 4.2 4.0 4.7 3.6 4.4 4.5 4.1 4.5 3.9 4.8 4.0
##  [73] 4.9 4.7 4.3 4.4 4.8 5.0 4.5 3.5 3.8 3.7 3.9 5.1 4.5 4.5 4.7 4.4 4.1 4.0
##  [91] 4.4 4.6 4.0 3.3 4.2 4.2 4.2 4.3 3.0 4.1 6.0 5.1 5.9 5.6 5.8 6.6 4.5 6.3
## [109] 5.8 6.1 5.1 5.3 5.5 5.0 5.1 5.3 5.5 6.7 6.9 5.0 5.7 4.9 6.7 4.9 5.7 6.0
## [127] 4.8 4.9 5.6 5.8 6.1 6.4 5.6 5.1 5.6 6.1 5.6 5.5 4.8 5.4 5.6 5.1 5.1 5.9
## [145] 5.7 5.2 5.0 5.2 5.4 5.1
## 
## $Petal.Width
##   [1] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 0.2 0.2 0.1 0.1 0.2 0.4 0.4 0.3
##  [19] 0.3 0.3 0.2 0.4 0.2 0.5 0.2 0.2 0.4 0.2 0.2 0.2 0.2 0.4 0.1 0.2 0.2 0.2
##  [37] 0.2 0.1 0.2 0.2 0.3 0.3 0.2 0.6 0.4 0.3 0.2 0.2 0.2 0.2 1.4 1.5 1.5 1.3
##  [55] 1.5 1.3 1.6 1.0 1.3 1.4 1.0 1.5 1.0 1.4 1.3 1.4 1.5 1.0 1.5 1.1 1.8 1.3
##  [73] 1.5 1.2 1.3 1.4 1.4 1.7 1.5 1.0 1.1 1.0 1.2 1.6 1.5 1.6 1.5 1.3 1.3 1.3
##  [91] 1.2 1.4 1.2 1.0 1.3 1.2 1.3 1.3 1.1 1.3 2.5 1.9 2.1 1.8 2.2 2.1 1.7 1.8
## [109] 1.8 2.5 2.0 1.9 2.1 2.0 2.4 2.3 1.8 2.2 2.3 1.5 2.3 2.0 2.0 1.8 2.1 1.8
## [127] 1.8 1.8 2.1 1.6 1.9 2.0 2.2 1.5 1.4 2.3 2.4 1.8 1.8 2.1 2.4 2.3 1.9 2.3
## [145] 2.5 2.3 1.9 2.0 2.3 1.8
## 
## $Species
##   [1] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##   [6] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [11] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [16] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [21] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [26] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [31] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [36] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [41] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [46] "setosa"     "setosa"     "setosa"     "setosa"     "setosa"    
##  [51] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [56] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [61] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [66] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [71] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [76] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [81] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [86] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [91] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
##  [96] "versicolor" "versicolor" "versicolor" "versicolor" "versicolor"
## [101] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [106] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [111] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [116] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [121] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [126] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [131] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [136] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [141] "virginica"  "virginica"  "virginica"  "virginica"  "virginica" 
## [146] "virginica"  "virginica"  "virginica"  "virginica"  "virginica"

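The code above passes the Species column through tolower. Because Species is a factor, the result is a character vector; the values were already lower case.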
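Since the iris values are already lower case, the effect is hard to see there. A sketch on a small hypothetical record shows it more clearly, and confirms that selecting by name and by position gives the same result:

```r
library(purrr)

rec <- list(id = 7, code = "AbC", note = "Keep Me")

by_name <- map_at(rec, "code", tolower)  # select the element by name
by_pos  <- map_at(rec, 2, tolower)       # select the same element by position
```

Only the code element is changed; id and note are returned as they were.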

x <- list(a = list(foo = 1:2, bar = 3:4), b = list(baz = 5:6))
str(x)
## List of 2
##  $ a:List of 2
##   ..$ foo: int [1:2] 1 2
##   ..$ bar: int [1:2] 3 4
##  $ b:List of 1
##   ..$ baz: int [1:2] 5 6
map_depth(x, 2, paste, collapse = "/")
## $a
## $a$foo
## [1] "1/2"
## 
## $a$bar
## [1] "3/4"
## 
## 
## $b
## $b$baz
## [1] "5/6"

Lists can be nested, and map_depth handles nested lists. In the code above, 2 refers to the second level of the list.
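One way to read the depth argument is as a count of nested map calls. The sketch below rebuilds the same list and compares map_depth at depth 2 with two explicit levels of map:

```r
library(purrr)

x <- list(a = list(foo = 1:2, bar = 3:4), b = list(baz = 5:6))

via_depth  <- map_depth(x, 2, paste, collapse = "/")

# The same traversal written as a map inside a map
via_nested <- map(x, function(inner) map(inner, paste, collapse = "/"))
```

Both objects have the same shape as x, with each integer vector replaced by a single string.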

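Working with lists

Selecting or removing list elements is a common data-processing operation. pluck makes selection easy: elements can be selected by position or by name. Let us look at an example.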

obj1 <- list("A", list(1, nam = "a"))
obj2 <- list("B", list(2, nam = "b"))
x <- list(obj1, obj2)

pluck(x, 1) # equivalent to x[[1]]
## [[1]]
## [1] "A"
## 
## [[2]]
## [[2]][[1]]
## [1] 1
## 
## [[2]]$nam
## [1] "a"

The code above first creates a nested list and then uses pluck to select the first element of the top level. We can also select elements at deeper levels.

pluck(x,1,2) # equivalent to x[[1]][[2]]
## [[1]]
## [1] 1
## 
## $nam
## [1] "a"

The result shows that another list is nested here and that one of its elements is named. A named element can be retrieved by its name.

pluck(x,1,2,"nam") 
## [1] "a"
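A path like this can also point at something that does not exist. The following sketch (the cfg list is hypothetical) shows what pluck returns in that case, next to its stricter sibling chuck:

```r
library(purrr)

cfg <- list(db = list(host = "localhost"))

missing_plain   <- pluck(cfg, "db", "port")                   # NULL
missing_default <- pluck(cfg, "db", "port", .default = 5432)  # fallback value

# chuck() refuses to return silently when the path is absent
chuck_result <- tryCatch(
  chuck(cfg, "db", "port"),
  error = function(e) "chuck() raised an error"
)
```

Which of the two to use depends on whether a missing element is an expected case or a bug that should stop the script.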

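Note that chuck does the same job as pluck; the difference is that when an element does not exist, pluck() returns NULL (or the value of its .default argument), whereas chuck() always throws an error. In addition, pluck can be used on the left-hand side of an assignment to modify the original data.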

pluck(x,1,2,"nam") <- "c"

x
## [[1]]
## [[1]][[1]]
## [1] "A"
## 
## [[1]][[2]]
## [[1]][[2]][[1]]
## [1] 1
## 
## [[1]][[2]]$nam
## [1] "c"
## 
## 
## 
## [[2]]
## [[2]][[1]]
## [1] "B"
## 
## [[2]][[2]]
## [[2]][[2]][[1]]
## [1] 2
## 
## [[2]][[2]]$nam
## [1] "b"

As shown, a has become c in the original data. pluck indexes data by position or name; to filter elements by a condition, use keep.

list(1:2,2:3,3:4,4:5,5:6,6:7)%>%
  keep(function(x) mean(x) > 6) # equivalent to keep(~ mean(.x) > 6)
## [[1]]
## [1] 6 7

In this example we first create a list and then test each element: an element is kept if its mean is greater than 6. To remove the elements that satisfy a condition instead, use discard:

list(1:2,2:3,3:4,4:5,5:6,6:7)%>%
  discard(function(x) mean(x) > 6)
## [[1]]
## [1] 1 2
## 
## [[2]]
## [1] 2 3
## 
## [[3]]
## [1] 3 4
## 
## [[4]]
## [1] 4 5
## 
## [[5]]
## [1] 5 6
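keep and discard split a list into two complementary parts. A short sketch using the same list and the formula shorthand:

```r
library(purrr)

pairs <- list(1:2, 2:3, 3:4, 4:5, 5:6, 6:7)

big   <- keep(pairs, ~ mean(.x) > 6)     # elements that pass the test
small <- discard(pairs, ~ mean(.x) > 6)  # elements that fail the test
```

Every element of pairs ends up in exactly one of the two results, so their lengths add up to length(pairs).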

To remove the empty elements of a list, use compact. Example code:

list(a=1,b=NULL,c=list(),d=NA) %>% compact()
## $a
## [1] 1
## 
## $d
## [1] NA

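compact only removes NULL and empty (zero-length) elements such as list(); it does not touch NA. head_while is another very handy function: it returns the leading elements of a list or vector up to, but not including, the first element that fails the condition. Example code: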

pos <- function(x) x >= 0
head_while(5:-5, pos)
## [1] 5 4 3 2 1 0
tail_while(5:-5, negate(pos))
## [1] -1 -2 -3 -4 -5

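head_while and tail_while behave like a for loop with an early exit: head_while scans from the start and tail_while from the end, collecting elements while the condition holds and stopping at the first one that fails it.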
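The early exit is what separates head_while from keep. A sketch on a made-up vector in which a failing value sits in the middle:

```r
library(purrr)

v   <- c(1, 2, -1, 3)
pos <- function(x) x > 0

leading <- head_while(v, pos)  # stops at -1, so the later 3 is not reached
passing <- keep(v, pos)        # tests every element
```

head_while returns only the run of passing values at the front, while keep also returns the 3 that follows the failing value.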

Reshaping lists

flatten removes one level of nesting from a list:

x <- rerun(2, sample(4))
x
## [[1]]
## [1] 1 4 3 2
## 
## [[2]]
## [1] 1 2 4 3

We first create a list that holds two integer vectors, each a random permutation of 1:4.

x %>% flatten()
## [[1]]
## [1] 1
## 
## [[2]]
## [1] 4
## 
## [[3]]
## [1] 3
## 
## [[4]]
## [1] 2
## 
## [[5]]
## [1] 1
## 
## [[6]]
## [1] 2
## 
## [[7]]
## [1] 4
## 
## [[8]]
## [1] 3

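The result shows that the data has been "stretched out": each vector element is now its own list element. In addition, transpose changes the layout of the data.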

x <- rerun(5, x = runif(1), y = runif(5))
x
## [[1]]
## [[1]]$x
## [1] 0.09884474
## 
## [[1]]$y
## [1] 0.16572093 0.23756631 0.07406567 0.27570395 0.65871268
## 
## 
## [[2]]
## [[2]]$x
## [1] 0.9270122
## 
## [[2]]$y
## [1] 0.67392157 0.83594149 0.01376925 0.29699908 0.21493212
## 
## 
## [[3]]
## [[3]]$x
## [1] 0.6002352
## 
## [[3]]$y
## [1] 0.37654226 0.70192682 0.02910238 0.42529595 0.81272848
## 
## 
## [[4]]
## [[4]]$x
## [1] 0.7588233
## 
## [[4]]$y
## [1] 0.2300723 0.7944437 0.7304211 0.5757704 0.1007236
## 
## 
## [[5]]
## [[5]]$x
## [1] 0.3424634
## 
## [[5]]$y
## [1] 0.7198985 0.2464443 0.5224496 0.1382845 0.8438813

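Here we first use rerun to generate a list. rerun is similar to replicate: it re-evaluates the expressions the given number of times, so each repetition draws new random values. We then apply transpose: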

x %>% transpose()
## $x
## $x[[1]]
## [1] 0.09884474
## 
## $x[[2]]
## [1] 0.9270122
## 
## $x[[3]]
## [1] 0.6002352
## 
## $x[[4]]
## [1] 0.7588233
## 
## $x[[5]]
## [1] 0.3424634
## 
## 
## $y
## $y[[1]]
## [1] 0.16572093 0.23756631 0.07406567 0.27570395 0.65871268
## 
## $y[[2]]
## [1] 0.67392157 0.83594149 0.01376925 0.29699908 0.21493212
## 
## $y[[3]]
## [1] 0.37654226 0.70192682 0.02910238 0.42529595 0.81272848
## 
## $y[[4]]
## [1] 0.2300723 0.7944437 0.7304211 0.5757704 0.1007236
## 
## $y[[5]]
## [1] 0.7198985 0.2464443 0.5224496 0.1382845 0.8438813

As shown, the x and y values have each been gathered into their own sublist.
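With random numbers the regrouping is hard to verify by eye. A deterministic sketch (the rows list is made up) shows a list of records becoming a list of fields:

```r
library(purrr)

rows <- list(
  list(id = 1, tag = "a"),
  list(id = 2, tag = "b")
)

cols <- transpose(rows)           # list(id = list(1, 2), tag = list("a", "b"))
ids  <- map_dbl(cols$id, ~ .x)    # collapse one field into a numeric vector

back <- transpose(cols)           # transposing again regroups by record
```

Applying transpose a second time turns the list of fields back into a list of records.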

SUMMARISE LISTS

y <- list(0:10, 5.5)
y %>% every(is.numeric)
## [1] TRUE
y %>% every(is.integer)
## [1] FALSE
y %>% some(is.integer)
## [1] TRUE
y %>% none(is.character)
## [1] TRUE
x <- list(1:10, 5, 9.9)
x %>% has_element(1:10)
## [1] TRUE
x %>% has_element(3)
## [1] FALSE
is_even <- function(x) x %% 2 == 0

3:10 %>% detect(is_even)
## [1] 4
3:10 %>% detect_index(is_even)
## [1] 2
x <- list(
  list(),
  list(list()),
  list(list(list(1)))
)
vec_depth(x)
## [1] 5
x %>% map_int(vec_depth)
## [1] 1 2 4
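The calls in this section come without commentary, so here is a short annotated sketch of the predicate helpers on made-up data. The .dir argument of detect is used on the assumption that the installed purrr is version 0.3.0 or later:

```r
library(purrr)

vals <- list(2L, 4L, 7.5)

all_num <- every(vals, is.numeric)    # do all elements pass the test?
any_int <- some(vals, is.integer)     # does at least one element pass?
no_chr  <- none(vals, is.character)   # do all elements fail?

is_even <- function(x) x %% 2 == 0

first_even <- detect(3:10, is_even)                      # first matching value
first_pos  <- detect_index(3:10, is_even)                # its position
last_even  <- detect(3:10, is_even, .dir = "backward")   # search from the end
```

detect returns the matching value itself, while detect_index returns where it sits in the input.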

JOIN (TO) LISTS

append(1:5, 0:1, after = 3)
## [1] 1 2 3 0 1 4 5
x <- as.list(1:3)

x %>% append("a")
## [[1]]
## [1] 1
## 
## [[2]]
## [1] 2
## 
## [[3]]
## [1] 3
## 
## [[4]]
## [1] "a"
x %>% prepend("a")
## [[1]]
## [1] "a"
## 
## [[2]]
## [1] 1
## 
## [[3]]
## [1] 2
## 
## [[4]]
## [1] 3
inputs <- list(arg1 = "a", arg2 = "b")


splice(inputs, arg3 = c("c1", "c2"),inputs)
## $arg1
## [1] "a"
## 
## $arg2
## [1] "b"
## 
## $arg3
## [1] "c1" "c2"
## 
## $arg1
## [1] "a"
## 
## $arg2
## [1] "b"
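The append used above is base R's append, and its after argument decides where the new values go. A sketch showing that after = 0 gives the same placement as prepend:

```r
x <- as.list(1:3)

# after = 0 inserts before the first element, like prepend()
front  <- append(x, list("a"), after = 0)

# any other position inserts after that element
middle <- append(x, list("a"), after = 1)
```

Wrapping the new value in list() keeps it as a single list element even if it is a longer vector.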

TRANSFORM LIST

mtcars %>% modify_at(c(1, 4, 5), as.character) # column names also work, e.g. mtcars %>% modify_at(c("cyl", "am"), as.character)
##                      mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4             21   6 160.0 110  3.9 2.620 16.46  0  1    4    4
## Mazda RX4 Wag         21   6 160.0 110  3.9 2.875 17.02  0  1    4    4
## Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
## Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
## Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
## Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
## Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
## Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
## Lincoln Continental 10.4   8 460.0 215    3 5.424 17.82  0  0    3    4
## Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
## Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
## Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
## Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
## Toyota Corona       21.5   4 120.1  97  3.7 2.465 20.01  1  0    3    1
## Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
## AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
## Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
## Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
## Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
## Porsche 914-2         26   4 120.3  91 4.43 2.140 16.70  0  1    5    2
## Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
## Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
## Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
## Maserati Bora         15   8 301.0 335 3.54 3.570 14.60  0  1    5    8
## Volvo 142E          21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2
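The output above is still a data frame, which is the main difference between the modify family and the map family. A sketch on a small hypothetical data frame:

```r
library(purrr)

df <- data.frame(n = 1:2, f = factor(c("a", "b")))

as_list  <- map_if(df, is.factor, as.character)     # plain list of columns
as_frame <- modify_if(df, is.factor, as.character)  # same type as the input
```

modify_if is therefore the more convenient choice when the result should stay usable as a data frame.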
iris %>%
  modify_if(is.factor, as.character)
##     Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
## 1            5.1         3.5          1.4         0.2     setosa
## 2            4.9         3.0          1.4         0.2     setosa
## 3            4.7         3.2          1.3         0.2     setosa
## 4            4.6         3.1          1.5         0.2     setosa
## 5            5.0         3.6          1.4         0.2     setosa
## 6            5.4         3.9          1.7         0.4     setosa
## 7            4.6         3.4          1.4         0.3     setosa
## 8            5.0         3.4          1.5         0.2     setosa
## 9            4.4         2.9          1.4         0.2     setosa
## 10           4.9         3.1          1.5         0.1     setosa
## 11           5.4         3.7          1.5         0.2     setosa
## 12           4.8         3.4          1.6         0.2     setosa
## 13           4.8         3.0          1.4         0.1     setosa
## 14           4.3         3.0          1.1         0.1     setosa
## 15           5.8         4.0          1.2         0.2     setosa
## 16           5.7         4.4          1.5         0.4     setosa
## 17           5.4         3.9          1.3         0.4     setosa
## 18           5.1         3.5          1.4         0.3     setosa
## 19           5.7         3.8          1.7         0.3     setosa
## 20           5.1         3.8          1.5         0.3     setosa
## 21           5.4         3.4          1.7         0.2     setosa
## 22           5.1         3.7          1.5         0.4     setosa
## 23           4.6         3.6          1.0         0.2     setosa
## 24           5.1         3.3          1.7         0.5     setosa
## 25           4.8         3.4          1.9         0.2     setosa
## 26           5.0         3.0          1.6         0.2     setosa
## 27           5.0         3.4          1.6         0.4     setosa
## 28           5.2         3.5          1.5         0.2     setosa
## 29           5.2         3.4          1.4         0.2     setosa
## 30           4.7         3.2          1.6         0.2     setosa
## 31           4.8         3.1          1.6         0.2     setosa
## 32           5.4         3.4          1.5         0.4     setosa
## 33           5.2         4.1          1.5         0.1     setosa
## 34           5.5         4.2          1.4         0.2     setosa
## 35           4.9         3.1          1.5         0.2     setosa
## 36           5.0         3.2          1.2         0.2     setosa
## 37           5.5         3.5          1.3         0.2     setosa
## 38           4.9         3.6          1.4         0.1     setosa
## 39           4.4         3.0          1.3         0.2     setosa
## 40           5.1         3.4          1.5         0.2     setosa
## 41           5.0         3.5          1.3         0.3     setosa
## 42           4.5         2.3          1.3         0.3     setosa
## 43           4.4         3.2          1.3         0.2     setosa
## 44           5.0         3.5          1.6         0.6     setosa
## 45           5.1         3.8          1.9         0.4     setosa
## 46           4.8         3.0          1.4         0.3     setosa
## 47           5.1         3.8          1.6         0.2     setosa
## 48           4.6         3.2          1.4         0.2     setosa
## 49           5.3         3.7          1.5         0.2     setosa
## 50           5.0         3.3          1.4         0.2     setosa
## 51           7.0         3.2          4.7         1.4 versicolor
## 52           6.4         3.2          4.5         1.5 versicolor
## 53           6.9         3.1          4.9         1.5 versicolor
## 54           5.5         2.3          4.0         1.3 versicolor
## 55           6.5         2.8          4.6         1.5 versicolor
## 56           5.7         2.8          4.5         1.3 versicolor
## 57           6.3         3.3          4.7         1.6 versicolor
## 58           4.9         2.4          3.3         1.0 versicolor
## 59           6.6         2.9          4.6         1.3 versicolor
## 60           5.2         2.7          3.9         1.4 versicolor
## 61           5.0         2.0          3.5         1.0 versicolor
## 62           5.9         3.0          4.2         1.5 versicolor
## 63           6.0         2.2          4.0         1.0 versicolor
## 64           6.1         2.9          4.7         1.4 versicolor
## 65           5.6         2.9          3.6         1.3 versicolor
## 66           6.7         3.1          4.4         1.4 versicolor
## 67           5.6         3.0          4.5         1.5 versicolor
## 68           5.8         2.7          4.1         1.0 versicolor
## 69           6.2         2.2          4.5         1.5 versicolor
## 70           5.6         2.5          3.9         1.1 versicolor
## 71           5.9         3.2          4.8         1.8 versicolor
## 72           6.1         2.8          4.0         1.3 versicolor
## 73           6.3         2.5          4.9         1.5 versicolor
## 74           6.1         2.8          4.7         1.2 versicolor
## 75           6.4         2.9          4.3         1.3 versicolor
## 76           6.6         3.0          4.4         1.4 versicolor
## 77           6.8         2.8          4.8         1.4 versicolor
## 78           6.7         3.0          5.0         1.7 versicolor
## 79           6.0         2.9          4.5         1.5 versicolor
## 80           5.7         2.6          3.5         1.0 versicolor
## 81           5.5         2.4          3.8         1.1 versicolor
## 82           5.5         2.4          3.7         1.0 versicolor
## 83           5.8         2.7          3.9         1.2 versicolor
## 84           6.0         2.7          5.1         1.6 versicolor
## 85           5.4         3.0          4.5         1.5 versicolor
## 86           6.0         3.4          4.5         1.6 versicolor
## 87           6.7         3.1          4.7         1.5 versicolor
## 88           6.3         2.3          4.4         1.3 versicolor
## 89           5.6         3.0          4.1         1.3 versicolor
## 90           5.5         2.5          4.0         1.3 versicolor
## 91           5.5         2.6          4.4         1.2 versicolor
## 92           6.1         3.0          4.6         1.4 versicolor
## 93           5.8         2.6          4.0         1.2 versicolor
## 94           5.0         2.3          3.3         1.0 versicolor
## 95           5.6         2.7          4.2         1.3 versicolor
## 96           5.7         3.0          4.2         1.2 versicolor
## 97           5.7         2.9          4.2         1.3 versicolor
## 98           6.2         2.9          4.3         1.3 versicolor
## 99           5.1         2.5          3.0         1.1 versicolor
## 100          5.7         2.8          4.1         1.3 versicolor
## 101          6.3         3.3          6.0         2.5  virginica
## 102          5.8         2.7          5.1         1.9  virginica
## 103          7.1         3.0          5.9         2.1  virginica
## 104          6.3         2.9          5.6         1.8  virginica
## 105          6.5         3.0          5.8         2.2  virginica
## 106          7.6         3.0          6.6         2.1  virginica
## 107          4.9         2.5          4.5         1.7  virginica
## 108          7.3         2.9          6.3         1.8  virginica
## 109          6.7         2.5          5.8         1.8  virginica
## 110          7.2         3.6          6.1         2.5  virginica
## 111          6.5         3.2          5.1         2.0  virginica
## 112          6.4         2.7          5.3         1.9  virginica
## 113          6.8         3.0          5.5         2.1  virginica
## 114          5.7         2.5          5.0         2.0  virginica
## 115          5.8         2.8          5.1         2.4  virginica
## 116          6.4         3.2          5.3         2.3  virginica
## 117          6.5         3.0          5.5         1.8  virginica
## 118          7.7         3.8          6.7         2.2  virginica
## 119          7.7         2.6          6.9         2.3  virginica
## 120          6.0         2.2          5.0         1.5  virginica
## 121          6.9         3.2          5.7         2.3  virginica
## 122          5.6         2.8          4.9         2.0  virginica
## 123          7.7         2.8          6.7         2.0  virginica
## 124          6.3         2.7          4.9         1.8  virginica
## 125          6.7         3.3          5.7         2.1  virginica
## 126          7.2         3.2          6.0         1.8  virginica
## 127          6.2         2.8          4.8         1.8  virginica
## 128          6.1         3.0          4.9         1.8  virginica
## 129          6.4         2.8          5.6         2.1  virginica
## 130          7.2         3.0          5.8         1.6  virginica
## 131          7.4         2.8          6.1         1.9  virginica
## 132          7.9         3.8          6.4         2.0  virginica
## 133          6.4         2.8          5.6         2.2  virginica
## 134          6.3         2.8          5.1         1.5  virginica
## 135          6.1         2.6          5.6         1.4  virginica
## 136          7.7         3.0          6.1         2.3  virginica
## 137          6.3         3.4          5.6         2.4  virginica
## 138          6.4         3.1          5.5         1.8  virginica
## 139          6.0         3.0          4.8         1.8  virginica
## 140          6.9         3.1          5.4         2.1  virginica
## 141          6.7         3.1          5.6         2.4  virginica
## 142          6.9         3.1          5.1         2.3  virginica
## 143          5.8         2.7          5.1         1.9  virginica
## 144          6.8         3.2          5.9         2.3  virginica
## 145          6.7         3.3          5.7         2.5  virginica
## 146          6.7         3.0          5.2         2.3  virginica
## 147          6.3         2.5          5.0         1.9  virginica
## 148          6.5         3.0          5.2         2.0  virginica
## 149          6.2         3.4          5.4         2.3  virginica
## 150          5.9         3.0          5.1         1.8  virginica
x <- c(foo = 1L, bar = 2L)
y <- c(TRUE, FALSE)
modify2(x, y, ~ if (.y) .x else 0L)
## foo bar 
##   1   0
l1 <- list(
  obj1 = list(
    prop1 = list(param1 = 1:2, param2 = 3:4),
    prop2 = list(param1 = 5:6, param2 = 7:8)
  ),
  obj2 = list(
    prop1 = list(param1 = 9:10, param2 = 11:12),
    prop2 = list(param1 = 12:14, param2 = 15:17)
  )
)

l1 %>% modify_depth(3, sum) 
## $obj1
## $obj1$prop1
## $obj1$prop1$param1
## [1] 3
## 
## $obj1$prop1$param2
## [1] 7
## 
## 
## $obj1$prop2
## $obj1$prop2$param1
## [1] 11
## 
## $obj1$prop2$param2
## [1] 15
## 
## 
## 
## $obj2
## $obj2$prop1
## $obj2$prop1$param1
## [1] 19
## 
## $obj2$prop1$param2
## [1] 23
## 
## 
## $obj2$prop2
## $obj2$prop2$param1
## [1] 39
## 
## $obj2$prop2$param2
## [1] 48
l1 %>% modify_depth(3, `+`, 100L)
## $obj1
## $obj1$prop1
## $obj1$prop1$param1
## [1] 101 102
## 
## $obj1$prop1$param2
## [1] 103 104
## 
## 
## $obj1$prop2
## $obj1$prop2$param1
## [1] 105 106
## 
## $obj1$prop2$param2
## [1] 107 108
## 
## 
## 
## $obj2
## $obj2$prop1
## $obj2$prop1$param1
## [1] 109 110
## 
## $obj2$prop1$param2
## [1] 111 112
## 
## 
## $obj2$prop2
## $obj2$prop2$param1
## [1] 112 113 114
## 
## $obj2$prop2$param2
## [1] 115 116 117

WORK WITH LISTS

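array_branch() and array_tree() convert an array into a list so that arrays can be used with purrr's functions. The details of the coercion are controlled by the margin argument. array_tree() creates a hierarchical list (a tree) with as many levels as there are dimensions specified in margin, while array_branch() creates a flat list (by analogy, a single branch) along all the specified dimensions.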

x <- array(1:12, c(2, 2, 3))

array_branch(x)
## [[1]]
## [1] 1
## 
## [[2]]
## [1] 2
## 
## [[3]]
## [1] 3
## 
## [[4]]
## [1] 4
## 
## [[5]]
## [1] 5
## 
## [[6]]
## [1] 6
## 
## [[7]]
## [1] 7
## 
## [[8]]
## [1] 8
## 
## [[9]]
## [1] 9
## 
## [[10]]
## [1] 10
## 
## [[11]]
## [1] 11
## 
## [[12]]
## [1] 12
array_branch(x, 1)
## [[1]]
##      [,1] [,2] [,3]
## [1,]    1    5    9
## [2,]    3    7   11
## 
## [[2]]
##      [,1] [,2] [,3]
## [1,]    2    6   10
## [2,]    4    8   12
array_tree(x)
## [[1]]
## [[1]][[1]]
## [[1]][[1]][[1]]
## [1] 1
## 
## [[1]][[1]][[2]]
## [1] 5
## 
## [[1]][[1]][[3]]
## [1] 9
## 
## 
## [[1]][[2]]
## [[1]][[2]][[1]]
## [1] 3
## 
## [[1]][[2]][[2]]
## [1] 7
## 
## [[1]][[2]][[3]]
## [1] 11
## 
## 
## 
## [[2]]
## [[2]][[1]]
## [[2]][[1]][[1]]
## [1] 2
## 
## [[2]][[1]][[2]]
## [1] 6
## 
## [[2]][[1]][[3]]
## [1] 10
## 
## 
## [[2]][[2]]
## [[2]][[2]][[1]]
## [1] 4
## 
## [[2]][[2]][[2]]
## [1] 8
## 
## [[2]][[2]][[3]]
## [1] 12
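
The difference between the two functions is easier to see when margin names only some of the dimensions. The following is a minimal sketch on the same 2 x 2 x 3 array; the variable names flat and tree are only illustrative.

```r
library(purrr)

x <- array(1:12, c(2, 2, 3))

# Flat list: one element per (dimension 1, dimension 3) pair, 2 * 3 = 6 in total
flat <- array_branch(x, c(1, 3))
length(flat)

# Nested list: 2 elements for dimension 1, each holding 3 elements for dimension 3
tree <- array_tree(x, c(1, 3))
length(tree)
length(tree[[1]])
```

In both cases every leaf is the length-2 vector that remains along the dimension not listed in margin.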

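cross2() returns the product set of the elements of .x and .y. cross3() takes an additional .z argument. cross() takes a list .l and returns the Cartesian product of all its elements as a list, with one combination per element. cross_df() is similar to cross() but returns a data frame with one combination per row.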

data <- list(
  id = c("John", "Jane"),
  greeting = c("Hello.", "Bonjour."),
  sep = c("! ", "... ")
)

data %>%
  cross()
## [[1]]
## [[1]]$id
## [1] "John"
## 
## [[1]]$greeting
## [1] "Hello."
## 
## [[1]]$sep
## [1] "! "
## 
## 
## [[2]]
## [[2]]$id
## [1] "Jane"
## 
## [[2]]$greeting
## [1] "Hello."
## 
## [[2]]$sep
## [1] "! "
## 
## 
## [[3]]
## [[3]]$id
## [1] "John"
## 
## [[3]]$greeting
## [1] "Bonjour."
## 
## [[3]]$sep
## [1] "! "
## 
## 
## [[4]]
## [[4]]$id
## [1] "Jane"
## 
## [[4]]$greeting
## [1] "Bonjour."
## 
## [[4]]$sep
## [1] "! "
## 
## 
## [[5]]
## [[5]]$id
## [1] "John"
## 
## [[5]]$greeting
## [1] "Hello."
## 
## [[5]]$sep
## [1] "... "
## 
## 
## [[6]]
## [[6]]$id
## [1] "Jane"
## 
## [[6]]$greeting
## [1] "Hello."
## 
## [[6]]$sep
## [1] "... "
## 
## 
## [[7]]
## [[7]]$id
## [1] "John"
## 
## [[7]]$greeting
## [1] "Bonjour."
## 
## [[7]]$sep
## [1] "... "
## 
## 
## [[8]]
## [[8]]$id
## [1] "Jane"
## 
## [[8]]$greeting
## [1] "Bonjour."
## 
## [[8]]$sep
## [1] "... "
args <- data %>% cross_df()

args
## # A tibble: 8 x 3
##   id    greeting sep   
##   <chr> <chr>    <chr> 
## 1 John  Hello.   "! "  
## 2 Jane  Hello.   "! "  
## 3 John  Bonjour. "! "  
## 4 Jane  Bonjour. "! "  
## 5 John  Hello.   "... "
## 6 Jane  Hello.   "... "
## 7 John  Bonjour. "... "
## 8 Jane  Bonjour. "... "
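
Recent purrr releases (1.0.0 and later) deprecate the cross() family, so it may help to know a base R counterpart. The sketch below uses expand.grid(), which also varies the first variable fastest; the name combos is only illustrative, and the result is a plain data frame rather than a tibble.

```r
data <- list(
  id = c("John", "Jane"),
  greeting = c("Hello.", "Bonjour."),
  sep = c("! ", "... ")
)

# One row per combination; keep the columns as character vectors
combos <- expand.grid(data, stringsAsFactors = FALSE)
nrow(combos)
combos[2, ]
```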
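# .filter removes every combination for which the predicate returns TRUE,
# so only the pairs with x < y remain
filter <- function(x, y) x >= y
cross2(1:5, 1:5, .filter = filter) %>% str()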
## List of 10
##  $ :List of 2
##   ..$ : int 1
##   ..$ : int 2
##  $ :List of 2
##   ..$ : int 1
##   ..$ : int 3
##  $ :List of 2
##   ..$ : int 2
##   ..$ : int 3
##  $ :List of 2
##   ..$ : int 1
##   ..$ : int 4
##  $ :List of 2
##   ..$ : int 2
##   ..$ : int 4
##  $ :List of 2
##   ..$ : int 3
##   ..$ : int 4
##  $ :List of 2
##   ..$ : int 1
##   ..$ : int 5
##  $ :List of 2
##   ..$ : int 2
##   ..$ : int 5
##  $ :List of 2
##   ..$ : int 3
##   ..$ : int 5
##  $ :List of 2
##   ..$ : int 4
##   ..$ : int 5
seq_len(3) %>%
  cross2(., ., .filter = `==`) %>%
  map(setNames, c("x", "y"))
## [[1]]
## [[1]]$x
## [1] 2
## 
## [[1]]$y
## [1] 1
## 
## 
## [[2]]
## [[2]]$x
## [1] 3
## 
## [[2]]$y
## [1] 1
## 
## 
## [[3]]
## [[3]]$x
## [1] 1
## 
## [[3]]$y
## [1] 2
## 
## 
## [[4]]
## [[4]]$x
## [1] 3
## 
## [[4]]$y
## [1] 2
## 
## 
## [[5]]
## [[5]]$x
## [1] 1
## 
## [[5]]$y
## [1] 3
## 
## 
## [[6]]
## [[6]]$x
## [1] 2
## 
## [[6]]$y
## [1] 3
seq_len(3) %>%
  list(x = ., y = .) %>%
  cross(.filter = `==`)
## [[1]]
## [[1]]$x
## [1] 2
## 
## [[1]]$y
## [1] 1
## 
## 
## [[2]]
## [[2]]$x
## [1] 3
## 
## [[2]]$y
## [1] 1
## 
## 
## [[3]]
## [[3]]$x
## [1] 1
## 
## [[3]]$y
## [1] 2
## 
## 
## [[4]]
## [[4]]$x
## [1] 3
## 
## [[4]]$y
## [1] 2
## 
## 
## [[5]]
## [[5]]$x
## [1] 1
## 
## [[5]]$y
## [1] 3
## 
## 
## [[6]]
## [[6]]$x
## [1] 2
## 
## [[6]]$y
## [1] 3

Reduce Lists

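reduce() combines the elements of a vector into a single value. The combination is driven by .f, a binary function that takes two values and returns one: reducing f over 1:3 computes f(f(1, 2), 3).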

1:3 %>% reduce(`+`)
## [1] 6
paste2 <- function(x, y, sep = ".") paste(x, y, sep = sep)
letters[1:4] %>% reduce(paste2)
## [1] "a.b.c.d"
letters[1:4] %>% reduce2(c("-", ".", "-"), paste2)
## [1] "a-b.c-d"
x <- list(c(0, 1), c(2, 3), c(4, 5))
y <- list(c(6, 7), c(8, 9))
reduce2(x, y, paste)
## [1] "0 2 6 4 8" "1 3 7 5 9"
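
With a non-associative function such as subtraction, the order of evaluation becomes visible. The sketch below assumes purrr 0.3.0 or later (for the .dir argument); the variable names are only illustrative.

```r
library(purrr)

# Forward (default): (1 - 2) - 3
left <- reduce(1:3, `-`)

# Backward: 1 - (2 - 3)
right <- reduce(1:3, `-`, .dir = "backward")

# .init supplies the starting value: ((10 - 1) - 2) - 3
with_init <- reduce(1:3, `-`, .init = 10)

c(left = left, right = right, with_init = with_init)
```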

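accumulate() sequentially applies a 2-argument function to the elements of a vector. Each application of the function uses the initial value or the result of the previous application as its first argument; the second argument is the next value of the vector. The result of every application is returned, as a list or simplified to a vector when possible. The accumulation can optionally terminate before the whole vector has been processed, in response to a done() signal returned by the accumulating function.

In contrast to accumulate(), reduce() applies a 2-argument function in the same way but discards all results except that of the final application.

accumulate2() sequentially applies a function to the elements of two lists, .x and .y.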

1:5 %>% accumulate(`+`)
## [1]  1  3  6 10 15
accumulate(letters[1:5], paste, sep = ".")
## [1] "a"         "a.b"       "a.b.c"     "a.b.c.d"   "a.b.c.d.e"
accumulate(letters[1:5], paste, sep = ".",.dir = "backward")
## [1] "a.b.c.d.e" "b.c.d.e"   "c.d.e"     "d.e"       "e"
paste2 <- function(x, y, sep = ".") paste(x, y, sep = sep)
letters[1:4] %>% accumulate(paste2)
## [1] "a"       "a.b"     "a.b.c"   "a.b.c.d"
letters[1:4] %>% accumulate2(c("-", ".", "-"), paste2)
## [[1]]
## [1] "a"
## 
## [[2]]
## [1] "a-b"
## 
## [[3]]
## [1] "a-b.c"
## 
## [[4]]
## [1] "a-b.c-d"
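
The relationship between the two functions can be checked directly: the last accumulated value should equal the reduced value. This is a small sketch; the variable names are only illustrative.

```r
library(purrr)

running <- accumulate(1:5, `+`)  # every intermediate sum
total <- reduce(1:5, `+`)        # only the final sum

# The last accumulated value equals the reduced value
tail(running, 1) == total

# With .init, the initial value becomes the first element of the result,
# so the output is one element longer than the input
with_init <- accumulate(1:3, `+`, .init = 0)
length(with_init)
```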

Modify function behavior

compose() combines multiple functions into one. By default the functions are applied from right to left.

add1 <- function(x) x + 1
compose(add1, add1)(8)
## [1] 10
fn <- compose(~ paste(.x, "foo"), ~ paste(.x, "bar"))
fn("input")
## [1] "input bar foo"
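
The order of application matters as soon as the functions differ. The sketch below assumes purrr 0.3.0 or later (for the .dir argument); times2 is a helper introduced only for this illustration.

```r
library(purrr)

add1 <- function(x) x + 1
times2 <- function(x) x * 2

# Default direction ("backward"): add1(times2(x))
backward <- compose(add1, times2)

# Forward direction: times2(add1(x))
forward <- compose(add1, times2, .dir = "forward")

backward(3)
forward(3)
```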

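Change the type of input that a function accepts. lift_dl(), for example, turns a function that takes its arguments via ... into one that takes a single list.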

x <- list(x = c(1:100, NA, 1000), na.rm = TRUE, trim = 0.9)
lift_dl(mean)(x)
## [1] 51
lift(mean)(x)
## [1] 51
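
For this particular use, base R's do.call() gives the same result, which may be useful because recent purrr releases (1.0.0 and later) deprecate the lift_*() family. A minimal sketch, with illustrative variable names:

```r
args <- list(x = c(1:100, NA, 1000), na.rm = TRUE, trim = 0.9)

# Call mean() with the list elements as its arguments
via_do_call <- do.call(mean, args)

# The same call written out by hand
direct <- mean(x = c(1:100, NA, 1000), na.rm = TRUE, trim = 0.9)

via_do_call
```

Because trim is at least 0.5, mean() falls back to the median of the 101 non-missing values.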

rerun() re-evaluates an expression n times.

10 %>% rerun(rnorm(5))
## [[1]]
## [1] -0.9056931  1.6306982  1.6524178  0.4251034  1.5473966
## 
## [[2]]
## [1] -1.358256 -1.099187  1.141827  1.520544 -1.238645
## 
## [[3]]
## [1] -1.2699996 -0.9242835 -0.7316490 -1.5328326  1.1109396
## 
## [[4]]
## [1] -0.9821572 -0.5997459 -1.5452152 -0.7241080  0.5180241
## 
## [[5]]
## [1] -0.4849522 -1.1122814  0.3323760 -1.7095975  0.8948831
## 
## [[6]]
## [1] -0.61528961  0.86061005 -0.08723839 -0.92864749 -0.88771569
## 
## [[7]]
## [1]  0.7054922 -0.8543814  0.6537769 -0.3061426  0.4992769
## 
## [[8]]
## [1] -0.4998756 -1.4026105  1.9358640 -1.4204936 -1.4207250
## 
## [[9]]
## [1] -0.2076087  0.7225889  1.0549358 -0.1766170  1.0105946
## 
## [[10]]
## [1]  0.02485606 -0.04419857  1.59732266  1.24072346 -0.76393331

negate() turns a predicate function into its negation.

is.na(NA)
## [1] TRUE
negate(is.na)(NA)
## [1] FALSE
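
A negated predicate is most useful when passed to another purrr function. The sketch below combines negate() with keep() to drop missing values; values, not_na, and kept are illustrative names.

```r
library(purrr)

values <- c(1, NA, 3)

# A predicate that is TRUE for non-missing elements
not_na <- negate(is.na)

# Keep only the elements for which the predicate is TRUE
kept <- keep(values, not_na)
kept
```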

partial() creates a version of a function that has some of its arguments pre-filled with values.

my_long_variable <- 1:10
plot2 <- partial(plot, my_long_variable)
plot2()

plot2(runif(10), type = "l")

In other words, partial() builds a new function from an existing one.
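
A non-graphical example may make the idea easier to test. The sketch below pre-fills na.rm = TRUE for mean(); the name mean_na_rm is only illustrative.

```r
library(purrr)

# A version of mean() that always ignores missing values
mean_na_rm <- partial(mean, na.rm = TRUE)

mean_na_rm(c(1, NA, 3))
```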

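safely(), quietly(), possibly(), and auto_browse() wrap an existing function so that errors and other side effects are captured or handled instead of interrupting execution.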

safe_log <- safely(log)
safe_log(10)
## $result
## [1] 2.302585
## 
## $error
## NULL
safe_log("a")
## $result
## NULL
## 
## $error
## <simpleError in .Primitive("log")(x, base): non-numeric argument to mathematical function>
list("a", 10, 100) %>%
  map_dbl(possibly(log, 0))
## [1] 0.000000 2.302585 4.605170

As the output shows, the element that raised an error is replaced with the default value 0.
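
quietly() is mentioned above but not demonstrated. The following sketch shows how it captures a warning rather than printing it; quiet_log and res are illustrative names.

```r
library(purrr)

quiet_log <- quietly(log)

# log(-1) normally emits a warning; here it is captured instead
res <- quiet_log(-1)

names(res)
res$result
res$warnings
```

The returned list has the components result, output, warnings, and messages, so the caller can decide what to do with each.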

Nested Data

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3     ✓ dplyr   1.0.5
## ✓ tibble  3.1.0     ✓ stringr 1.4.0
## ✓ tidyr   1.1.3     ✓ forcats 0.5.1
## ✓ readr   1.4.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
n_iris <- iris %>% group_by(Species) %>% nest()
n_iris %>% unnest(cols = c(data))
## # A tibble: 150 x 5
## # Groups:   Species [3]
##    Species Sepal.Length Sepal.Width Petal.Length Petal.Width
##    <fct>          <dbl>       <dbl>        <dbl>       <dbl>
##  1 setosa           5.1         3.5          1.4         0.2
##  2 setosa           4.9         3            1.4         0.2
##  3 setosa           4.7         3.2          1.3         0.2
##  4 setosa           4.6         3.1          1.5         0.2
##  5 setosa           5           3.6          1.4         0.2
##  6 setosa           5.4         3.9          1.7         0.4
##  7 setosa           4.6         3.4          1.4         0.3
##  8 setosa           5           3.4          1.5         0.2
##  9 setosa           4.4         2.9          1.4         0.2
## 10 setosa           4.9         3.1          1.5         0.1
## # … with 140 more rows

List Column Workflow

1 make a list column

n_iris <- iris %>% group_by(Species) %>% nest()

2 work with list column

mod_fun <- function(df) lm(Sepal.Length ~ ., data = df)
m_iris <- n_iris %>%
  mutate(model = map(data, mod_fun))

3 simplify the list column

b_fun <- function(mod) coefficients(mod)[[1]]
m_iris %>% transmute(Species, beta = map_dbl(model, b_fun))
## # A tibble: 3 x 2
## # Groups:   Species [3]
##   Species     beta
##   <fct>      <dbl>
## 1 setosa     2.35 
## 2 versicolor 1.90 
## 3 virginica  0.700
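
The three steps can also be chained into a single pipeline. The sketch below does so and extracts R-squared instead of the intercept; the names fits and r_squared are illustrative and not part of the original workflow.

```r
library(dplyr)
library(tidyr)
library(purrr)

fits <- iris %>%
  group_by(Species) %>%
  nest() %>%                                                # 1. make a list column
  mutate(
    model = map(data, ~ lm(Sepal.Length ~ ., data = .x)),   # 2. work with it
    r_squared = map_dbl(model, ~ summary(.x)$r.squared)     # 3. simplify it
  )

fits %>% select(Species, r_squared)
```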

Batch-download packages

## install.packages("miniCRAN")



library(miniCRAN)
tags <- "xts"
pkgDep(tags, availPkgs = cranJuly2014)
##  [1] "xts"          "zoo"          "lattice"      "timeDate"     "quadprog"    
##  [6] "Hmisc"        "survival"     "Formula"      "latticeExtra" "cluster"     
## [11] "RColorBrewer" "BH"           "timeSeries"   "tseries"      "its"         
## [16] "chron"        "fts"          "tis"

Plot the dependency graph.

dg <- makeDepGraph(tags, enhances = TRUE, availPkgs = cranJuly2014)
plot(dg, legendPosition = c(-1, 1), vertex.size = 20)


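# Repositories configured for the current session
repos <- getOption("repos")
repos

# Index of all packages available from those repositories
curl <- contrib.url(repos)
aps <- available.packages(curl)

# Inspect the entry for xts
aps[which(row.names(aps) == "xts"), ]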

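From all available packages, find the list of dependencies of the packages of interest.

Note that the second assignment to libs overwrites the first, so only the dependencies of tidymodels are resolved here.

# A longer candidate list; it is overwritten by the next line
libs <- c("arules", "rmarkdown", "rJava", "tidyverse", "data.table", "ggplot2", "sparklyr", "DBI", "prophet", "h2o", "Hmisc", "randomForest", "scorecard", "pROC", "RJDBC", "RMySQL", "rsconnect")
libs <- c("tidymodels")
pkgList <- pkgDep(pkg = libs, availPkgs = aps, repos = repos)
pkgList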

Run the download function to download all of the dependency packages.

dp <- download.packages(pkgList, "/Users/milin/R语言统计分析/packages/R/tidymodels", type = getOption("pkgType"))
dp