knitr::opts_chunk$set(echo = TRUE,cache=F,error=F,warning=F,message=F)
See more on
Non-standard-evaluation and standard evaluation in dplyr
Tidy evaluation, most common actions
Lesser-known dplyr 0.7* tricks
library(tidyverse)
library(magrittr)
#head(mtcars)
dat<-as_tibble(mtcars) # as_data_frame() is deprecated in current tibble releases
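Use purrr::pmap_dbl to apply a multi-input function row by row, referring to the columns by their bare names (non-standard evaluation, NSE).
pmap_dbl iterates over several inputs in parallel and returns a double vector with one value per row.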
knitr::kable(
dat%>%mutate(new=pmap_dbl(list(mpg,cyl),
function(mpg,cyl) {mpg*cyl+cyl^2+2}))%>%head()
)
| mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb | new |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 21.0 | 6 | 160 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 | 164.0 |
| 21.0 | 6 | 160 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 | 164.0 |
| 22.8 | 4 | 108 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 | 109.2 |
| 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 | 166.4 |
| 18.7 | 8 | 360 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 | 215.6 |
| 18.1 | 6 | 225 | 105 | 2.76 | 3.460 | 20.22 | 1 | 0 | 3 | 1 | 146.6 |
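The function's argument names don't need to match the column names. The list passed to pmap_dbl is unnamed, so its elements are matched to the arguments by position.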
knitr::kable(
dat%>%mutate(new=pmap_dbl(list(mpg,cyl),
function(x,y) {x*y+y^2+2}))%>%head()
)
| mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb | new |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 21.0 | 6 | 160 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 | 164.0 |
| 21.0 | 6 | 160 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 | 164.0 |
| 22.8 | 4 | 108 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 | 109.2 |
| 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 | 166.4 |
| 18.7 | 8 | 360 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 | 215.6 |
| 18.1 | 6 | 225 | 105 | 2.76 | 3.460 | 20.22 | 1 | 0 | 3 | 1 | 146.6 |
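
How pmap_dbl pairs list elements with function arguments can be checked on plain vectors, outside of mutate. The sketch below is only illustrative: the helper f and the two input vectors are made up for the example (the values come from the first and third rows above), and it assumes purrr is installed.

```r
library(purrr)

# Hypothetical helper with the same formula as above
f <- function(x, y) x * y + y^2 + 2

mpg_vals <- c(21, 22.8)
cyl_vals <- c(6, 4)

# Unnamed list: elements are matched to the arguments by position
by_position <- pmap_dbl(list(mpg_vals, cyl_vals), f)

# Named list: elements are matched to the arguments by name,
# so the order of the list no longer matters
by_name <- pmap_dbl(list(y = cyl_vals, x = mpg_vals), f)

print(by_position)
print(by_name)
```

Both calls should give the same values as the new column for those two rows. Naming the list is a way to guard against accidentally swapping inputs when the function takes many arguments.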
Now let’s use standard evaluation (SE). dplyr 0.7.0 introduced tidy evaluation, which lets you supply column names programmatically, for example as strings.
Below, I use rlang::syms to convert the strings to symbols and !!! to splice them into the dplyr call.
Here, the variable names come from var<-as.list(names(dat)[1:3]). This can be more convenient than typing bare names (e.g., mpg, cyl, disp), because a data frame may have too many columns to type by hand.
test<-function(df,...){
  # ... is expected to hold a single list of column names (strings)
  vars<-rlang::syms(...)
  #vars<-quos(...)
  print("vars is ")
  print(vars)
  print("... is ")
  print(...)
  #print(paste0("length of vars is ", length(vars)))
  # Splice the symbols into select() and into the list given to pmap_dbl
  select_test<-df%>%select(!!!vars)
  function_test<-df%>%mutate(new=pmap_dbl(list(!!!vars), function(x,y,z){x*y+y^2+2+z}))
  #function_test<-df%>%mutate(new=pmap_dbl(list(mpg,cyl), function(x,y){x*y+y^2+2}))
  print("select_test is ")
  print(head(select_test))
  print("function_test is ")
  print(head(function_test))
}
var<-as.list(names(dat)[1:3]) #var<-list("mpg","cyl","disp")
test(dat,var)
## [1] "vars is "
## [[1]]
## mpg
##
## [[2]]
## cyl
##
## [[3]]
## disp
##
## [1] "... is "
## [[1]]
## [1] "mpg"
##
## [[2]]
## [1] "cyl"
##
## [[3]]
## [1] "disp"
##
## [1] "select_test is "
## # A tibble: 6 x 3
## mpg cyl disp
## <dbl> <dbl> <dbl>
## 1 21.0 6 160
## 2 21.0 6 160
## 3 22.8 4 108
## 4 21.4 6 258
## 5 18.7 8 360
## 6 18.1 6 225
## [1] "function_test is "
## # A tibble: 6 x 12
## mpg cyl disp hp drat wt qsec vs am gear carb new
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 324.0
## 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 324.0
## 3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 217.2
## 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 424.4
## 5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 575.6
## 6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 371.6
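
To see what syms and !!! do to the call, it can help to look at the expression itself instead of its result. The following is a minimal sketch, assuming rlang is installed. The name var_names is introduced here only for illustration. rlang::expr captures code and processes !!! with the same quasiquotation rules that dplyr verbs rely on.

```r
library(rlang)

# Hypothetical input: any column names would do
var_names <- c("mpg", "cyl")

# syms() turns each string into a symbol and returns them as a list
vars <- syms(var_names)

# expr() captures the code and splices the symbols in place of !!!vars
spliced <- expr(list(!!!vars))
print(spliced)
# list(mpg, cyl)
```

So list(!!!vars) inside mutate is rewritten to list(mpg, cyl) before it is evaluated, which is why the SE version behaves like the NSE version above.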
You don’t need to wrap this in a function; the same code works at the top level.
vars<-rlang::syms(var)
dff<-dat%>%
mutate(new=pmap_dbl(list(!!!vars), function(x,y,z){x*y+y^2+2+z}))
knitr::kable(head(dff))
| mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb | new |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 21.0 | 6 | 160 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 | 324.0 |
| 21.0 | 6 | 160 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 | 324.0 |
| 22.8 | 4 | 108 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 | 217.2 |
| 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 | 424.4 |
| 18.7 | 8 | 360 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 | 575.6 |
| 18.1 | 6 | 225 | 105 | 2.76 | 3.460 | 20.22 | 1 | 0 | 3 | 1 | 371.6 |
Finally, you can subset vars by index to pick out individual variables.
dff2<-dat%>%
mutate(new=pmap_dbl(list(!!!vars[1:2]), function(x,y){x+y}))%>%
select(c(!!!vars[c(1,3)],new,!!!vars[2]))
knitr::kable(head(dff2))
| mpg | disp | new | cyl |
|---|---|---|---|
| 21.0 | 160 | 27.0 | 6 |
| 21.0 | 160 | 27.0 | 6 |
| 22.8 | 108 | 26.8 | 4 |
| 21.4 | 258 | 27.4 | 6 |
| 18.7 | 360 | 26.7 | 8 |
| 18.1 | 225 | 24.1 | 6 |
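
When indexing, it matters whether you take one symbol or a list of symbols. As a small sketch (using mtcars directly rather than dat, and assuming current dplyr and rlang): vars[[i]] returns a single symbol, which is unquoted with !!, while vars[i] returns a list, which is spliced with !!!.

```r
library(dplyr)
library(rlang)

vars <- syms(c("mpg", "cyl", "disp"))

# [[ ]] extracts one symbol, so a single unquote (!!) is enough
one <- mtcars %>% select(!!vars[[2]])

# [ ] keeps a list of symbols, so it has to be spliced (!!!)
several <- mtcars %>% select(!!!vars[c(1, 3)])

print(names(one))
print(names(several))
```

Splicing a length-one list, as in !!!vars[2] above, also works; using !! with [[ ]] just makes it explicit that exactly one column is meant.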
```