dtplyr: Data Table Back-End for 'dplyr'
The dtplyr package lets you write dplyr syntax that is translated into data.table calls, which usually improves speed.
Provides a data.table backend for 'dplyr'. The goal of 'dtplyr' is to allow you to write 'dplyr' code that is automatically translated to the equivalent, but usually much faster, data.table code.
See https://cran.r-project.org/web/packages/dtplyr/dtplyr.pdf.
Description
* collect() returns a tibble, grouped if needed
* compute() returns a new lazy_dt
* as.data.table() returns a data.table
* as.data.frame() returns a data frame
* as_tibble() returns a tibble
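The accessors listed above can be tried side by side. The following is a minimal sketch, assuming dplyr, dtplyr, and data.table are installed; the object names (`lazy_result`, `res_tibble`, and so on) are illustrative and do not come from the package documentation.

```r
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

# Build a lazy pipeline; no computation happens at this point.
lazy_result <- lazy_dt(mtcars) %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg))

# Each accessor triggers the computation and returns a different class.
res_collect <- collect(lazy_result)                    # tibble
res_tibble  <- as_tibble(lazy_result)                  # tibble
res_dt      <- data.table::as.data.table(lazy_result)  # data.table
res_df      <- as.data.frame(lazy_result)              # plain data frame
res_lazy    <- compute(lazy_result)                    # still a lazy_dt

class(res_tibble)
class(res_dt)
class(res_df)
class(res_lazy)
```

Which accessor to use depends on what the downstream code expects: `as_tibble()` for further tidyverse work, `as.data.table()` to continue in data.table syntax.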
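library(dplyr, warn.conflicts = FALSE)
# Plain dplyr pipeline on a data frame: count cars per (cyl, rounded mpg, am)
summary <- mtcars %>%
  select(mpg, cyl, hp, am) %>%
  filter(mpg > 15) %>%
  mutate(mpg_round = round(mpg)) %>%
  group_by(cyl, mpg_round, am) %>%
  tally() %>%
  filter(n >= 1)
summary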
## # A tibble: 20 x 4
## # Groups: cyl, mpg_round [17]
## cyl mpg_round am n
## <dbl> <dbl> <dbl> <int>
## 1 4 21 1 1
## 2 4 22 0 1
## 3 4 23 0 1
## 4 4 23 1 1
## 5 4 24 0 1
## 6 4 26 1 1
## 7 4 27 1 1
## 8 4 30 1 2
## 9 4 32 1 1
## 10 4 34 1 1
## 11 6 18 0 2
## 12 6 19 0 1
## 13 6 20 1 1
## 14 6 21 0 1
## 15 6 21 1 2
## 16 8 15 0 2
## 17 8 16 0 2
## 18 8 16 1 1
## 19 8 17 0 1
## 20 8 19 0 2
# lazy_dt: Create a "lazy" data.table for use with dplyr verbs
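library(dplyr, warn.conflicts = FALSE)
library(dtplyr)  # provides lazy_dt()
# Wrap mtcars in a lazy data.table; verbs are recorded and only run when results are requested
mtcars2 <- lazy_dt(mtcars)
mtcars2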
## Source: local data table [32 x 11]
## Call: `_DT1`
##
## mpg cyl disp hp drat wt qsec vs am gear carb
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4
## 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
## 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
## 4 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
## 5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
## 6 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
##
## # Use as.data.table()/as.data.frame()/as_tibble() to access results
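The input for the next output block is not shown in this page. Judging from its `Call:` line, a column selection along the following lines would produce it; this is a sketch, and the name `selected` is only illustrative.

```r
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

mtcars2 <- lazy_dt(mtcars)

# Select three columns; the result stays lazy until it is accessed.
selected <- mtcars2 %>% select(mpg, cyl, disp)

# show_query() prints the data.table call that dtplyr generated.
show_query(selected)

# Materialise the result as a tibble.
as_tibble(selected)
```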
## Source: local data table [?? x 3]
## Call: `_DT1`[, .(mpg, cyl, disp)]
##
## mpg cyl disp
## <dbl> <dbl> <dbl>
## 1 21 6 160
## 2 21 6 160
## 3 22.8 4 108
## 4 21.4 6 258
## 5 18.7 8 360
## 6 18.1 6 225
##
## # Use as.data.table()/as.data.frame()/as_tibble() to access results
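The following output renames columns while selecting them. A sketch of a pipeline that would give such a call, assuming the same `mtcars2` object (the name `renamed` is illustrative):

```r
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

mtcars2 <- lazy_dt(mtcars)

# select() can rename on the fly: new_name = old_name.
renamed <- mtcars2 %>% select(x = mpg, y = cyl)

show_query(renamed)
as_tibble(renamed)
```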
## Source: local data table [?? x 2]
## Call: `_DT1`[, .(x = mpg, y = cyl)]
##
## x y
## <dbl> <dbl>
## 1 21 6
## 2 21 6
## 3 22.8 4
## 4 21.4 6
## 5 18.7 8
## 6 18.1 6
##
## # Use as.data.table()/as.data.frame()/as_tibble() to access results
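When a filter comes before a select, the `Call:` line below shows both folded into a single `[i, j]` expression. A sketch that should reproduce this, with `four_cyl_mpg` as an illustrative name:

```r
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

mtcars2 <- lazy_dt(mtcars)

# Filter rows first, then keep one column.
four_cyl_mpg <- mtcars2 %>%
  filter(cyl == 4) %>%
  select(mpg)

show_query(four_cyl_mpg)
as_tibble(four_cyl_mpg)
```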
## Source: local data table [?? x 1]
## Call: `_DT1`[cyl == 4, .(mpg)]
##
## mpg
## <dbl>
## 1 22.8
## 2 24.4
## 3 22.8
## 4 32.4
## 5 30.4
## 6 33.9
##
## # Use as.data.table()/as.data.frame()/as_tibble() to access results
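Reversing the order (select, then filter) returns the same rows, but the call below chains two `[` steps instead of one. A sketch under the same assumptions, with `four_cyl` as an illustrative name:

```r
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

mtcars2 <- lazy_dt(mtcars)

# Select columns first, then filter on one of the selected columns.
four_cyl <- mtcars2 %>%
  select(mpg, cyl) %>%
  filter(cyl == 4)

show_query(four_cyl)
as_tibble(four_cyl)
```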
## Source: local data table [?? x 2]
## Call: `_DT1`[, .(mpg, cyl)][cyl == 4]
##
## mpg cyl
## <dbl> <dbl>
## 1 22.8 4
## 2 24.4 4
## 3 22.8 4
## 4 32.4 4
## 5 30.4 4
## 6 33.9 4
##
## # Use as.data.table()/as.data.frame()/as_tibble() to access results
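The next output adds two columns, the second of which depends on the first. The `copy()` in its call reflects that `lazy_dt()` treats its input as immutable by default, so `:=` does not modify the original data. A sketch of the likely input follows (`with_cyl` is an illustrative name); the exact generated call may differ between dtplyr versions.

```r
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

mtcars2 <- lazy_dt(mtcars)

# cyl4 refers to cyl2, which is created in the same mutate() call.
with_cyl <- mtcars2 %>% mutate(cyl2 = cyl * 2, cyl4 = cyl2 * 2)

show_query(with_cyl)
as_tibble(with_cyl)

# The original data frame keeps its 11 columns.
ncol(mtcars)
```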
## Source: local data table [?? x 13]
## Call: copy(`_DT1`)[, `:=`(c("cyl2", "cyl4"), {
## cyl2 <- cyl * 2
## cyl4 <- cyl2 * 2
## .(cyl2, cyl4)
## })]
##
## mpg cyl disp hp drat wt qsec vs am gear carb cyl2 cyl4
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4 12 24
## 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4 12 24
## 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1 8 16
## 4 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1 12 24
## 5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2 16 32
## 6 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1 12 24
##
## # Use as.data.table()/as.data.frame()/as_tibble() to access results
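To keep only the newly computed columns, `transmute()` can be used in place of `mutate()`. A sketch consistent with the call shown below (`doubled` is an illustrative name):

```r
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

mtcars2 <- lazy_dt(mtcars)

# transmute() computes new columns and drops all others.
doubled <- mtcars2 %>% transmute(cyl2 = cyl * 2, vs2 = vs * 2)

show_query(doubled)
as_tibble(doubled)
```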
## Source: local data table [?? x 2]
## Call: `_DT1`[, .(cyl2 = cyl * 2, vs2 = vs * 2)]
##
## cyl2 vs2
## <dbl> <dbl>
## 1 12 0
## 2 12 0
## 3 8 2
## 4 12 2
## 5 16 0
## 6 12 2
##
## # Use as.data.table()/as.data.frame()/as_tibble() to access results
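A filter followed by a mutate gives the chained call shown next. Note that it has no `copy()`, presumably because the filter step already produces a new table. A sketch of the likely input, with `eight_cyl` as an illustrative name:

```r
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

mtcars2 <- lazy_dt(mtcars)

# Keep 8-cylinder cars, then add a derived column.
eight_cyl <- mtcars2 %>%
  filter(cyl == 8) %>%
  mutate(cyl2 = cyl * 2)

show_query(eight_cyl)
as_tibble(eight_cyl)
```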
## Source: local data table [?? x 12]
## Call: `_DT1`[cyl == 8][, `:=`(cyl2 = cyl * 2)]
##
## mpg cyl disp hp drat wt qsec vs am gear carb cyl2
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2 16
## 2 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4 16
## 3 16.4 8 276. 180 3.07 4.07 17.4 0 0 3 3 16
## 4 17.3 8 276. 180 3.07 3.73 17.6 0 0 3 3 16
## 5 15.2 8 276. 180 3.07 3.78 18 0 0 3 3 16
## 6 10.4 8 472 205 2.93 5.25 18.0 0 0 3 4 16
##
## # Use as.data.table()/as.data.frame()/as_tibble() to access results
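Grouped operations are translated into data.table's `by`/`keyby` argument. The output below looks like a grouped summary; a sketch that would produce it, where `by_cyl` and `mean_mpg` are illustrative names:

```r
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

mtcars2 <- lazy_dt(mtcars)

# Group once and reuse the grouped lazy table.
by_cyl <- mtcars2 %>% group_by(cyl)

# One row per cylinder count with the mean mpg.
mean_mpg <- by_cyl %>% summarise(mpg = mean(mpg))

show_query(mean_mpg)
as_tibble(mean_mpg)
```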
## Source: local data table [?? x 2]
## Call: `_DT1`[, .(mpg = mean(mpg)), keyby = .(cyl)]
##
## cyl mpg
## <dbl> <dbl>
## 1 4 26.7
## 2 6 19.7
## 3 8 15.1
##
## # Use as.data.table()/as.data.frame()/as_tibble() to access results
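A grouped `mutate()` keeps every row and fills in the per-group value, as in the output that follows. A sketch under the same assumptions (`group_mean` is an illustrative name; newer dtplyr versions may print `by` rather than `keyby` in the call):

```r
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

mtcars2 <- lazy_dt(mtcars)
by_cyl <- mtcars2 %>% group_by(cyl)

# Replace mpg with its group mean; the row count is unchanged.
group_mean <- by_cyl %>% mutate(mpg = mean(mpg))

show_query(group_mean)
as_tibble(group_mean)
```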
## Source: local data table [?? x 11]
## Call: copy(`_DT1`)[, `:=`(mpg = mean(mpg)), keyby = .(cyl)]
##
## mpg cyl disp hp drat wt qsec vs am gear carb
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 26.7 4 108 93 3.85 2.32 18.6 1 1 4 1
## 2 26.7 4 147. 62 3.69 3.19 20 1 0 4 2
## 3 26.7 4 141. 95 3.92 3.15 22.9 1 0 4 2
## 4 26.7 4 78.7 66 4.08 2.2 19.5 1 1 4 1
## 5 26.7 4 75.7 52 4.93 1.62 18.5 1 1 4 2
## 6 26.7 4 71.1 65 4.22 1.84 19.9 1 1 4 1
##
## # Use as.data.table()/as.data.frame()/as_tibble() to access results
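The last output combines a grouped filter with a summary: within each cylinder group, keep cars with below-average mpg, then average their horsepower. A sketch of the likely input, with `low_mpg_hp` as an illustrative name; the generated call may be written differently in other dtplyr versions.

```r
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

mtcars2 <- lazy_dt(mtcars)
by_cyl <- mtcars2 %>% group_by(cyl)

# mean(mpg) is evaluated per group because the table is grouped.
low_mpg_hp <- by_cyl %>%
  filter(mpg < mean(mpg)) %>%
  summarise(hp = mean(hp))

show_query(low_mpg_hp)
as_tibble(low_mpg_hp)
```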
## Source: local data table [?? x 2]
## Call: `_DT1`[, .SD[mpg < mean(mpg), .(hp = mean(hp))], keyby = .(cyl)]
##
## cyl hp
## <dbl> <dbl>
## 1 4 91.2
## 2 6 132.
## 3 8 246.
##
## # Use as.data.table()/as.data.frame()/as_tibble() to access results