Base R vs tidyverse and other notes from useR! 2019 conference

Alexander Matrunich, a data analyst @Bolt
Tallinn R users meetup. 2019-09-24

useR! 2019

conference participants by country

1178 participants total: 351 #RLadies, 5 gender diverse, 5 kids, 184 students, 331 academics and 509 participants from the industry. 151 oral presentations, 71 flash presentations and 94 posters (among 467 submitted abstracts). Travel to the conference emitted 720 kg of CO2 per participant.

Base R and tidyverse

1993: R, a free open source implementation of S.

2005: Hadley Wickham published the reshape package and then ggplot.

The RStudio IDE was released in 2011. Hadley joined RStudio Inc. as Chief Scientist in 2014. The RStudio team contributes to the tidyverse and promotes it.

(reshape, ggplot) -> (reshape2, ggplot2) -> (dplyr, tidyr, purrr, !ggvis) -> tidyverse

Most downloaded R packages

head(iris[iris$Species == "setosa", "Sepal.Length"])
[1] 5.1 4.9 4.7 4.6 5.0 5.4
library(dplyr)
iris %>% 
  filter(Species == "setosa") %>% 
  select(Sepal.Length) %>% 
  sample_n(5)
  Sepal.Length
1          4.7
2          5.4
3          5.4
4          4.9
5          5.1
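
The two snippets above are not strictly equivalent: the base R call returns a numeric vector and shows its first six values, while the dplyr pipeline returns a one-column data frame with five random rows. A minimal base-R-only sketch of closer equivalents follows; the variable names and the seed are illustrative assumptions, not part of the talk.

```r
# Base R drops a single selected column to a plain vector by default.
setosa_vec <- iris[iris$Species == "setosa", "Sepal.Length"]

# drop = FALSE keeps the data frame shape, as dplyr::select() does.
setosa_df <- iris[iris$Species == "setosa", "Sepal.Length", drop = FALSE]

# subset() combines row filtering and column selection in one base R call.
setosa_sub <- subset(iris, Species == "setosa", select = Sepal.Length)

# Five random rows, analogous to sample_n(5); the seed is arbitrary.
set.seed(1)
setosa_sample <- setosa_df[sample(nrow(setosa_df), 5), , drop = FALSE]
```

Which form reads better is largely a matter of taste; the point is that both styles can express the same operation.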

Base R

fit1 <- lm(Sepal.Width ~ Sepal.Length, data = iris)
summary(fit1)

Call:
lm(formula = Sepal.Width ~ Sepal.Length, data = iris)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.1095 -0.2454 -0.0167  0.2763  1.3338 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)   3.41895    0.25356   13.48   <2e-16 ***
Sepal.Length -0.06188    0.04297   -1.44    0.152    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4343 on 148 degrees of freedom
Multiple R-squared:  0.01382,   Adjusted R-squared:  0.007159 
F-statistic: 2.074 on 1 and 148 DF,  p-value: 0.1519
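
The printed summary is meant for reading, but the same numbers can be pulled out of the objects in base R. A small sketch, assuming only the fit1 model from above; the variable names are illustrative.

```r
fit1 <- lm(Sepal.Width ~ Sepal.Length, data = iris)
s <- summary(fit1)

# Coefficient table as a numeric matrix:
# rows are terms, columns are estimate, std. error, t value and p-value.
coef_table <- coef(s)

# Slope p-value, addressed by row and column name.
slope_p <- coef_table["Sepal.Length", "Pr(>|t|)"]

# Model-level statistics stored on the summary object.
r_squared <- s$r.squared
sigma_hat <- s$sigma   # residual standard error

# Per-observation quantities come from the model object itself.
fitted_vals <- fitted(fit1)
resid_vals  <- residuals(fit1)
```

The results are spread over a matrix, list elements and named vectors, which is the inconvenience that data-frame-returning helpers aim to remove.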

Tidyverse style

library(broom)
tidy(fit1)
term          estimate    std.error  statistic  p.value
(Intercept)    3.4189468  0.2535623  13.483658  0.0000000
Sepal.Length  -0.0618848  0.0429670  -1.440287  0.1518983
glance(fit1)
r.squared  adj.r.squared  sigma      statistic  p.value    df  logLik     AIC       BIC       deviance  df.residual
0.0138227  0.0071593      0.4343032  2.074427   0.1518983  2   -86.73221  179.4644  188.4963  27.91566  148
augment(fit1)
Sepal.Width Sepal.Length .fitted .se.fit .resid .hat .sigma .cooksd .std.resid
3.5 5.1 3.103334 0.0477237 0.3966656 0.0120748 0.4345331 0.0051602 0.9189026
3.0 4.9 3.115711 0.0538546 -0.1157113 0.0153766 0.4356718 0.0005629 -0.2685021
3.2 4.7 3.128088 0.0605870 0.0719117 0.0194613 0.4357368 0.0002775 0.1672146
3.1 4.6 3.134277 0.0641202 -0.0342768 0.0217974 0.4357686 0.0000709 -0.0797981
3.6 5.0 3.109523 0.0506998 0.4904772 0.0136278 0.4338701 0.0089324 1.1371174
3.9 5.4 3.084769 0.0402531 0.8152311 0.0085904 0.4305138 0.0153976 1.8852159
3.4 4.6 3.134277 0.0641202 0.2657232 0.0217974 0.4352142 0.0042637 0.6186173
3.4 5.0 3.109523 0.0506998 0.2904772 0.0136278 0.4351098 0.0031329 0.6734394
2.9 4.4 3.146654 0.0714381 -0.2466537 0.0270567 0.4352896 0.0046095 -0.5757724
3.1 4.9 3.115711 0.0538546 -0.0157113 0.0153766 0.4357760 0.0000104 -0.0364573
3.7 5.4 3.084769 0.0402531 0.6152311 0.0085904 0.4327877 0.0087694 1.4227174
3.4 4.8 3.121900 0.0571585 0.2781002 0.0173211 0.4351632 0.0036774 0.6459552
3.0 4.8 3.121900 0.0571585 -0.1218998 0.0173211 0.4356599 0.0007065 -0.2831419
3.0 4.3 3.152842 0.0751984 -0.1528422 0.0299799 0.4355899 0.0019731 -0.3573221
4.0 5.8 3.060015 0.0355096 0.9399850 0.0066850 0.4287788 0.0158692 2.1716227
4.4 5.7 3.066203 0.0359915 1.3337965 0.0068678 0.4215645 0.0328370 3.0817185
3.9 5.4 3.084769 0.0402531 0.8152311 0.0085904 0.4305138 0.0153976 1.8852159
3.5 5.1 3.103334 0.0477237 0.3966656 0.0120748 0.4345331 0.0051602 0.9189026
3.8 5.7 3.066203 0.0359915 0.7337965 0.0068678 0.4315253 0.0099388 1.6954268
3.8 5.1 3.103334 0.0477237 0.6966656 0.0120748 0.4319264 0.0159172 1.6138728
3.4 5.4 3.084769 0.0402531 0.3152311 0.0085904 0.4349949 0.0023022 0.7289696
3.7 5.1 3.103334 0.0477237 0.5966656 0.0120748 0.4329561 0.0116756 1.3822161
3.6 4.6 3.134277 0.0641202 0.4657232 0.0217974 0.4340438 0.0130974 1.0842275
3.3 5.1 3.103334 0.0477237 0.1966656 0.0120748 0.4354723 0.0012685 0.4555892
3.4 4.8 3.121900 0.0571585 0.2781002 0.0173211 0.4351632 0.0036774 0.6459552
3.0 5.0 3.109523 0.0506998 -0.1095228 0.0136278 0.4356830 0.0004454 -0.2539167
3.4 5.0 3.109523 0.0506998 0.2904772 0.0136278 0.4351098 0.0031329 0.6734394
3.5 5.2 3.097146 0.0449616 0.4028541 0.0107176 0.4344956 0.0047113 0.9325983
3.4 5.2 3.097146 0.0449616 0.3028541 0.0107176 0.4350537 0.0026626 0.7011005
3.2 4.7 3.128088 0.0605870 0.0719117 0.0194613 0.4357368 0.0002775 0.1672146
3.1 4.8 3.121900 0.0571585 -0.0218998 0.0173211 0.4357741 0.0000228 -0.0508676
3.4 5.4 3.084769 0.0402531 0.3152311 0.0085904 0.4349949 0.0023022 0.7289696
4.1 5.2 3.097146 0.0449616 1.0028541 0.0107176 0.4277694 0.0291955 2.3215848
4.2 5.5 3.078580 0.0384068 1.1214196 0.0078204 0.4257699 0.0264832 2.5922681
3.1 4.9 3.115711 0.0538546 -0.0157113 0.0153766 0.4357760 0.0000104 -0.0364573
3.2 5.0 3.109523 0.0506998 0.0904772 0.0136278 0.4357132 0.0003040 0.2097613
3.5 5.5 3.078580 0.0384068 0.4214196 0.0078204 0.4343786 0.0037399 0.9741514
3.6 4.9 3.115711 0.0538546 0.4842887 0.0153766 0.4339148 0.0098608 1.1237667
3.0 4.4 3.146654 0.0714381 -0.1466537 0.0270567 0.4356054 0.0016296 -0.3423389
3.4 5.1 3.103334 0.0477237 0.2966656 0.0120748 0.4350821 0.0028864 0.6872459
3.5 5.0 3.109523 0.0506998 0.3904772 0.0136278 0.4345697 0.0056614 0.9052784
2.3 4.5 3.140465 0.0677417 -0.8404652 0.0243291 0.4300899 0.0478568 -1.9591831
3.2 4.4 3.146654 0.0714381 0.0533463 0.0270567 0.4357551 0.0002156 0.1245281
3.5 5.0 3.109523 0.0506998 0.3904772 0.0136278 0.4345697 0.0056614 0.9052784
3.8 5.1 3.103334 0.0477237 0.6966656 0.0120748 0.4319264 0.0159172 1.6138728
3.0 4.8 3.121900 0.0571585 -0.1218998 0.0173211 0.4356599 0.0007065 -0.2831419
3.8 5.1 3.103334 0.0477237 0.6966656 0.0120748 0.4319264 0.0159172 1.6138728
3.2 4.6 3.134277 0.0641202 0.0657232 0.0217974 0.4357435 0.0002608 0.1530071
3.7 5.3 3.090957 0.0424555 0.6090426 0.0095561 0.4328449 0.0095786 1.4090930
3.3 5.0 3.109523 0.0506998 0.1904772 0.0136278 0.4354908 0.0013471 0.4416004
3.2 7.0 2.985753 0.0610524 0.2142467 0.0197615 0.4354123 0.0025025 0.4982592
3.2 6.4 3.022884 0.0427732 0.1771159 0.0096997 0.4355306 0.0008225 0.4098085
3.1 6.9 2.991942 0.0576089 0.1080583 0.0175951 0.4356852 0.0005643 0.2510266
2.3 5.5 3.078580 0.0384068 -0.7785804 0.0078204 0.4309828 0.0127656 -1.7997629
2.8 6.5 3.016696 0.0453161 -0.2166956 0.0108873 0.4354072 0.0013852 -0.5016886
2.8 5.7 3.066203 0.0359915 -0.2662035 0.0068678 0.4352207 0.0013080 -0.6150595
3.3 6.3 3.029073 0.0405274 0.2709274 0.0087079 0.4351996 0.0017242 0.6265547
2.4 4.9 3.115711 0.0538546 -0.7157113 0.0153766 0.4316982 0.0215367 -1.6607710
2.9 6.6 3.010507 0.0481090 -0.1105072 0.0122706 0.4356814 0.0004072 -0.2560227
2.7 5.2 3.097146 0.0449616 -0.3971459 0.0107176 0.4345318 0.0045787 -0.9193838
2.0 5.0 3.109523 0.0506998 -1.1095228 0.0136278 0.4259252 0.0457090 -2.5723069
3.0 5.9 3.053826 0.0355442 -0.0538265 0.0066981 0.4357552 0.0000521 -0.1243548
2.2 6.0 3.047638 0.0360940 -0.8476380 0.0069069 0.4300939 0.0133385 -1.9584947
2.9 6.1 3.041450 0.0371360 -0.1414496 0.0073115 0.4356206 0.0003935 -0.3268903
2.9 5.6 3.072392 0.0369699 -0.1723920 0.0072462 0.4355442 0.0005792 -0.3983852
3.1 6.7 3.004319 0.0511109 0.0956813 0.0138497 0.4357055 0.0003456 0.2218516
3.0 5.6 3.072392 0.0369699 -0.0723920 0.0072462 0.4357367 0.0001021 -0.1672925
2.7 5.8 3.060015 0.0355096 -0.3600150 0.0066850 0.4347583 0.0023279 -0.8317332
2.2 6.2 3.035261 0.0386305 -0.8352611 0.0079118 0.4302541 0.0148663 -1.9308745
2.5 5.6 3.072392 0.0369699 -0.5723920 0.0072462 0.4331944 0.0063856 -1.3227558
3.2 5.9 3.053826 0.0355442 0.1461735 0.0066981 0.4356100 0.0003845 0.3377030
2.8 6.1 3.041450 0.0371360 -0.2414496 0.0073115 0.4353193 0.0011466 -0.5579906
2.5 6.3 3.029073 0.0405274 -0.5290726 0.0087079 0.4335683 0.0065754 -1.2235490
2.8 6.1 3.041450 0.0371360 -0.2414496 0.0073115 0.4353193 0.0011466 -0.5579906
2.9 6.4 3.022884 0.0427732 -0.1228841 0.0096997 0.4356589 0.0003959 -0.2843278
3.0 6.6 3.010507 0.0481090 -0.0105072 0.0122706 0.4357771 0.0000037 -0.0243430
2.8 6.8 2.998130 0.0542871 -0.1981302 0.0156245 0.4354666 0.0016779 -0.4598088
3.0 6.7 3.004319 0.0511109 -0.0043187 0.0138497 0.4357778 0.0000007 -0.0100135
2.9 6.0 3.047638 0.0360940 -0.1476380 0.0069069 0.4356066 0.0004047 -0.3411224
2.6 5.7 3.066203 0.0359915 -0.4662035 0.0068678 0.4340664 0.0040118 -1.0771568
2.4 5.5 3.078580 0.0384068 -0.6785804 0.0078204 0.4321403 0.0096970 -1.5686033
2.4 5.5 3.078580 0.0384068 -0.6785804 0.0078204 0.4321403 0.0096970 -1.5686033
2.7 5.8 3.060015 0.0355096 -0.3600150 0.0066850 0.4347583 0.0023279 -0.8317332
2.7 6.0 3.047638 0.0360940 -0.3476380 0.0069069 0.4348271 0.0022436 -0.8032288
3.0 5.4 3.084769 0.0402531 -0.0847689 0.0085904 0.4357214 0.0001665 -0.1960275
3.4 6.0 3.047638 0.0360940 0.3523620 0.0069069 0.4348010 0.0023050 0.8141435
3.1 6.7 3.004319 0.0511109 0.0956813 0.0138497 0.4357055 0.0003456 0.2218516
2.3 6.3 3.029073 0.0405274 -0.7290726 0.0087079 0.4315724 0.0124863 -1.6860750
3.0 5.6 3.072392 0.0369699 -0.0723920 0.0072462 0.4357367 0.0001021 -0.1672925
2.5 5.5 3.078580 0.0384068 -0.5785804 0.0078204 0.4331365 0.0070495 -1.3374438
2.6 5.5 3.078580 0.0384068 -0.4785804 0.0078204 0.4339724 0.0048233 -1.1062843
3.0 6.1 3.041450 0.0371360 -0.0414496 0.0073115 0.4357644 0.0000338 -0.0957901
2.6 5.8 3.060015 0.0355096 -0.4600150 0.0066850 0.4341120 0.0038007 -1.0627606
2.3 5.0 3.109523 0.0506998 -0.8095228 0.0136278 0.4305611 0.0243325 -1.8767898
2.7 5.6 3.072392 0.0369699 -0.3723920 0.0072462 0.4346863 0.0027028 -0.8605705
3.0 5.7 3.066203 0.0359915 -0.0662035 0.0068678 0.4357435 0.0000809 -0.1529622
2.9 5.7 3.066203 0.0359915 -0.1662035 0.0068678 0.4355608 0.0005099 -0.3840109
2.9 6.2 3.035261 0.0386305 -0.1352611 0.0079118 0.4356340 0.0003899 -0.3126833
2.5 5.1 3.103334 0.0477237 -0.6033344 0.0120748 0.4328925 0.0119381 -1.3976646
2.8 5.7 3.066203 0.0359915 -0.2662035 0.0068678 0.4352207 0.0013080 -0.6150595
3.3 6.3 3.029073 0.0405274 0.2709274 0.0087079 0.4351996 0.0017242 0.6265547
2.7 5.8 3.060015 0.0355096 -0.3600150 0.0066850 0.4347583 0.0023279 -0.8317332
3.0 7.1 2.979565 0.0645983 0.0204352 0.0221236 0.4357746 0.0000256 0.0475822
2.9 6.3 3.029073 0.0405274 -0.1290726 0.0087079 0.4356468 0.0003913 -0.2984972
3.0 6.5 3.016696 0.0453161 -0.0166956 0.0108873 0.4357758 0.0000082 -0.0386534
3.0 7.6 2.948622 0.0833936 0.0513776 0.0368705 0.4357566 0.0002781 0.1205421
2.5 4.9 3.115711 0.0538546 -0.6157113 0.0153766 0.4327623 0.0159389 -1.4287262
2.9 7.3 2.967188 0.0719360 -0.0671878 0.0274351 0.4357417 0.0003471 -0.1568694
2.5 6.7 3.004319 0.0511109 -0.5043187 0.0138497 0.4337602 0.0096017 -1.1693390
3.6 7.2 2.973376 0.0682305 0.6266237 0.0246815 0.4326242 0.0270070 1.4609674
3.2 6.5 3.016696 0.0453161 0.1833044 0.0108873 0.4355127 0.0009912 0.4243819
2.7 6.4 3.022884 0.0427732 -0.3228841 0.0096997 0.4349555 0.0027334 -0.7470853
3.0 6.8 2.998130 0.0542871 0.0018698 0.0156245 0.4357779 0.0000001 0.0043393
2.5 5.7 3.066203 0.0359915 -0.5662035 0.0068678 0.4332511 0.0059174 -1.3082054
2.8 5.8 3.060015 0.0355096 -0.2600150 0.0066850 0.4352464 0.0012143 -0.6007059
3.2 6.4 3.022884 0.0427732 0.1771159 0.0096997 0.4355306 0.0008225 0.4098085
3.0 6.5 3.016696 0.0453161 -0.0166956 0.0108873 0.4357758 0.0000082 -0.0386534
3.8 7.7 2.942434 0.0873016 0.8575661 0.0404072 0.4297545 0.0855468 2.0157239
2.6 7.7 2.942434 0.0873016 -0.3424339 0.0404072 0.4348231 0.0136402 -0.8048968
2.2 6.0 3.047638 0.0360940 -0.8476380 0.0069069 0.4300939 0.0133385 -1.9584947
3.2 6.9 2.991942 0.0576089 0.2080583 0.0175951 0.4354339 0.0020920 0.4833332
2.8 5.6 3.072392 0.0369699 -0.2723920 0.0072462 0.4351942 0.0014461 -0.6294779
2.8 7.7 2.942434 0.0873016 -0.1424339 0.0404072 0.4356129 0.0023599 -0.3347933
2.7 6.3 3.029073 0.0405274 -0.3290726 0.0087079 0.4349245 0.0025438 -0.7610231
3.3 6.7 3.004319 0.0511109 0.2956813 0.0138497 0.4350854 0.0033005 0.6855818
3.2 7.2 2.973376 0.0682305 0.2266237 0.0246815 0.4353667 0.0035324 0.5283711
2.8 6.2 3.035261 0.0386305 -0.2352611 0.0079118 0.4353423 0.0011794 -0.5438535
3.0 6.1 3.041450 0.0371360 -0.0414496 0.0073115 0.4357644 0.0000338 -0.0957901
2.8 6.4 3.022884 0.0427732 -0.2228841 0.0096997 0.4353862 0.0013025 -0.5157065
3.0 7.2 2.973376 0.0682305 0.0266237 0.0246815 0.4357723 0.0000488 0.0620729
2.8 7.4 2.960999 0.0757040 -0.1609993 0.0303845 0.4355692 0.0022207 -0.3764707
3.8 7.9 2.930057 0.0952182 0.8699431 0.0480677 0.4295278 0.1064160 2.0530274
2.8 6.4 3.022884 0.0427732 -0.2228841 0.0096997 0.4353862 0.0013025 -0.5157065
2.8 6.3 3.029073 0.0405274 -0.2290726 0.0087079 0.4353646 0.0012326 -0.5297601
2.6 6.1 3.041450 0.0371360 -0.4414496 0.0073115 0.4342430 0.0038329 -1.0201911
3.0 7.7 2.942434 0.0873016 0.0575661 0.0404072 0.4357510 0.0003855 0.1353101
3.4 6.3 3.029073 0.0405274 0.3709274 0.0087079 0.4346933 0.0032320 0.8578177
3.1 6.4 3.022884 0.0427732 0.0771159 0.0096997 0.4357311 0.0001559 0.1784297
3.0 6.0 3.047638 0.0360940 -0.0476380 0.0069069 0.4357601 0.0000421 -0.1100692
3.1 6.9 2.991942 0.0576089 0.1080583 0.0175951 0.4356852 0.0005643 0.2510266
3.1 6.7 3.004319 0.0511109 0.0956813 0.0138497 0.4357055 0.0003456 0.2218516
3.1 6.9 2.991942 0.0576089 0.1080583 0.0175951 0.4356852 0.0005643 0.2510266
2.7 5.8 3.060015 0.0355096 -0.3600150 0.0066850 0.4347583 0.0023279 -0.8317332
3.2 6.8 2.998130 0.0542871 0.2018698 0.0156245 0.4354547 0.0017419 0.4684874
3.3 6.7 3.004319 0.0511109 0.2956813 0.0138497 0.4350854 0.0033005 0.6855818
3.0 6.7 3.004319 0.0511109 -0.0043187 0.0138497 0.4357778 0.0000007 -0.0100135
2.5 6.3 3.029073 0.0405274 -0.5290726 0.0087079 0.4335683 0.0065754 -1.2235490
3.0 6.5 3.016696 0.0453161 -0.0166956 0.0108873 0.4357758 0.0000082 -0.0386534
3.4 6.2 3.035261 0.0386305 0.3647389 0.0079118 0.4347300 0.0028348 0.8431676
3.0 5.9 3.053826 0.0355442 -0.0538265 0.0066981 0.4357552 0.0000521 -0.1243548

... and other notes

  • ggvega: translator from 'ggplot2' to 'Vega-Lite'
  • tourr: tour methods for multivariate data visualisation
  • colour schemes
  • graphical inference

Graphical inference: maps

Cancer deaths in Texas, darker colors = more deaths. Which of the six plots is made from a real dataset and not simulated under the null hypothesis of spatial independence?


Graphical inference: t-tests

A plot of the real data is hidden among eight innocents, plots of data generated from the null distribution.
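
The lineup idea can be sketched in base R. The example below is an assumption-laden illustration, not the code behind the slide: the two-group data are simulated, the seed and group sizes are arbitrary, and null datasets are produced by permuting group labels, which is valid when the null hypothesis is "no difference between groups".

```r
# Simulated two-group data standing in for the real dataset (illustrative).
set.seed(2019)
real <- data.frame(
  group = rep(c("a", "b"), each = 30),
  value = c(rnorm(30, mean = 0), rnorm(30, mean = 1))
)

n_panels <- 9                     # one real plot plus eight "innocents"
true_pos <- sample(n_panels, 1)   # panel where the real data is hidden

# Under the null the group labels are exchangeable,
# so shuffling them yields a dataset drawn from the null.
panels <- lapply(seq_len(n_panels), function(i) {
  if (i == true_pos) return(real)
  transform(real, group = sample(group))
})

# Draw the lineup as a 3 x 3 grid of boxplots, numbered 1 to 9.
op <- par(mfrow = c(3, 3), mar = c(2, 2, 2, 1))
for (i in seq_len(n_panels)) {
  boxplot(value ~ group, data = panels[[i]], main = i)
}
par(op)
```

A viewer who does not know true_pos is asked to pick the panel that looks different. If the null were true, all nine panels would be exchangeable and a single viewer would pick the real one with probability 1/9.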


Graphical inference: word clouds

Clouds of selected words from the first and sixth editions of Darwin’s “Origin of Species”. Four of them were generated under the null hypothesis of no difference between editions, and one is the true data.


Links

Image credits and sources