RStudio Intro

R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com. See also https://bookdown.org/yihui/rmarkdown/

When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

Note: R is case-sensitive, so Cars and cars are two different names.

summary(cars)
##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00
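To see the case-sensitivity point in action, here is a small sketch using only base R. It assumes a fresh session in which no object named `Cars` has been created.

```r
# `cars` is a built-in dataset; `Cars` (capital C) is not defined
exists("cars")
exists("Cars")

# Calling summary(Cars) directly would stop with an error;
# tryCatch() captures the error message instead of halting
tryCatch(summary(Cars), error = function(e) conditionMessage(e))
```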

YAML options

  • the YAML header is the section at the very top of your R Markdown file; it sets document-level options such as the title and the output format

Chunk options

  • these let you specify the language of the code in a chunk, give the chunk a label, and control whether the code itself is shown alongside its output
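Chunk options can also be set once for the whole document rather than chunk by chunk. A minimal sketch, assuming the knitr package (which R Markdown uses to run chunks) is installed. The option values here are illustrative, not the ones this document uses.

```r
library(knitr)

# Defaults applied to every later chunk unless a chunk overrides them
opts_chunk$set(
  echo = TRUE,      # show the R code in the output
  message = FALSE,  # hide messages such as package startup notes
  warning = FALSE   # hide warnings
)

# Inspect a current default
opts_chunk$get("echo")
```

Setting `message = FALSE` and `warning = FALSE` is one way to keep package startup notes out of the knitted output.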

Including Plots

You can also embed plots, for example:

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.
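The plot itself is not shown in this rendering. As a rough sketch of the kind of code such a chunk might contain, the example below uses the built-in `pressure` dataset. This is an assumption: the original chunk is hidden by `echo = FALSE`, so the actual code is not known.

```r
# Assumed example: scatterplot of the built-in `pressure` data
# (vapor pressure of mercury against temperature)
plot(pressure)
```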

Install packages

Check out https://rladiessydney.org/courses/ryouwithme/01-basicbasics-2/

# Install the tidyverse "meta" package

#install.packages("tidyverse")

# Install the lme4 package

#install.packages("lme4")
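If you are not sure whether a package is already installed, one option is to check first and install only what is missing. This is a sketch rather than a required step. The names `wanted` and `missing_pkgs` are made up for this example, and the install line is left commented out in the same way as above.

```r
wanted <- c("tidyverse", "lme4")

# TRUE for each package that can be loaded, FALSE otherwise
is_installed <- vapply(wanted, requireNamespace, logical(1), quietly = TRUE)

# Packages from `wanted` that are not yet installed
missing_pkgs <- wanted[!is_installed]
missing_pkgs

# Uncomment to install only the missing ones
# if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
```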

Load a package

Hint: Try the Ctrl + Alt + I (PC) or Cmd + Option + I (Mac) shortcut to insert a new code chunk.

Hint: Use Tab for code completion.

# Load tidyverse

library(tidyverse)
## Warning: replacing previous import 'lifecycle::last_warnings' by
## 'rlang::last_warnings' when loading 'pillar'
## Warning: replacing previous import 'lifecycle::last_warnings' by
## 'rlang::last_warnings' when loading 'tibble'
## Warning: replacing previous import 'lifecycle::last_warnings' by
## 'rlang::last_warnings' when loading 'hms'
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.4     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   2.0.1     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
# Load lme4

library(lme4)
## Warning: package 'lme4' was built under R version 4.1.2
## Loading required package: Matrix
## 
## Attaching package: 'Matrix'
## The following objects are masked from 'package:tidyr':
## 
##     expand, pack, unpack

Data Frame

# Load sleepstudy -- what's sleepstudy?

sleepstudy
##     Reaction Days Subject
## 1   249.5600    0     308
## 2   258.7047    1     308
## 3   250.8006    2     308
## 4   321.4398    3     308
## 5   356.8519    4     308
## 6   414.6901    5     308
## 7   382.2038    6     308
## 8   290.1486    7     308
## 9   430.5853    8     308
## 10  466.3535    9     308
## 11  222.7339    0     309
## 12  205.2658    1     309
## 13  202.9778    2     309
## 14  204.7070    3     309
## 15  207.7161    4     309
## 16  215.9618    5     309
## 17  213.6303    6     309
## 18  217.7272    7     309
## 19  224.2957    8     309
## 20  237.3142    9     309
## 21  199.0539    0     310
## 22  194.3322    1     310
## 23  234.3200    2     310
## 24  232.8416    3     310
## 25  229.3074    4     310
## 26  220.4579    5     310
## 27  235.4208    6     310
## 28  255.7511    7     310
## 29  261.0125    8     310
## 30  247.5153    9     310
## 31  321.5426    0     330
## 32  300.4002    1     330
## 33  283.8565    2     330
## 34  285.1330    3     330
## 35  285.7973    4     330
## 36  297.5855    5     330
## 37  280.2396    6     330
## 38  318.2613    7     330
## 39  305.3495    8     330
## 40  354.0487    9     330
## 41  287.6079    0     331
## 42  285.0000    1     331
## 43  301.8206    2     331
## 44  320.1153    3     331
## 45  316.2773    4     331
## 46  293.3187    5     331
## 47  290.0750    6     331
## 48  334.8177    7     331
## 49  293.7469    8     331
## 50  371.5811    9     331
## 51  234.8606    0     332
## 52  242.8118    1     332
## 53  272.9613    2     332
## 54  309.7688    3     332
## 55  317.4629    4     332
## 56  309.9976    5     332
## 57  454.1619    6     332
## 58  346.8311    7     332
## 59  330.3003    8     332
## 60  253.8644    9     332
## 61  283.8424    0     333
## 62  289.5550    1     333
## 63  276.7693    2     333
## 64  299.8097    3     333
## 65  297.1710    4     333
## 66  338.1665    5     333
## 67  332.0265    6     333
## 68  348.8399    7     333
## 69  333.3600    8     333
## 70  362.0428    9     333
## 71  265.4731    0     334
## 72  276.2012    1     334
## 73  243.3647    2     334
## 74  254.6723    3     334
## 75  279.0244    4     334
## 76  284.1912    5     334
## 77  305.5248    6     334
## 78  331.5229    7     334
## 79  335.7469    8     334
## 80  377.2990    9     334
## 81  241.6083    0     335
## 82  273.9472    1     335
## 83  254.4907    2     335
## 84  270.8021    3     335
## 85  251.4519    4     335
## 86  254.6362    5     335
## 87  245.4523    6     335
## 88  235.3110    7     335
## 89  235.7541    8     335
## 90  237.2466    9     335
## 91  312.3666    0     337
## 92  313.8058    1     337
## 93  291.6112    2     337
## 94  346.1222    3     337
## 95  365.7324    4     337
## 96  391.8385    5     337
## 97  404.2601    6     337
## 98  416.6923    7     337
## 99  455.8643    8     337
## 100 458.9167    9     337
## 101 236.1032    0     349
## 102 230.3167    1     349
## 103 238.9256    2     349
## 104 254.9220    3     349
## 105 250.7103    4     349
## 106 269.7744    5     349
## 107 281.5648    6     349
## 108 308.1020    7     349
## 109 336.2806    8     349
## 110 351.6451    9     349
## 111 256.2968    0     350
## 112 243.4543    1     350
## 113 256.2046    2     350
## 114 255.5271    3     350
## 115 268.9165    4     350
## 116 329.7247    5     350
## 117 379.4445    6     350
## 118 362.9184    7     350
## 119 394.4872    8     350
## 120 389.0527    9     350
## 121 250.5265    0     351
## 122 300.0576    1     351
## 123 269.8939    2     351
## 124 280.5891    3     351
## 125 271.8274    4     351
## 126 304.6336    5     351
## 127 287.7466    6     351
## 128 266.5955    7     351
## 129 321.5418    8     351
## 130 347.5655    9     351
## 131 221.6771    0     352
## 132 298.1939    1     352
## 133 326.8785    2     352
## 134 346.8555    3     352
## 135 348.7402    4     352
## 136 352.8287    5     352
## 137 354.4266    6     352
## 138 360.4326    7     352
## 139 375.6406    8     352
## 140 388.5417    9     352
## 141 271.9235    0     369
## 142 268.4369    1     369
## 143 257.2424    2     369
## 144 277.6566    3     369
## 145 314.8222    4     369
## 146 317.2135    5     369
## 147 298.1353    6     369
## 148 348.1229    7     369
## 149 340.2800    8     369
## 150 366.5131    9     369
## 151 225.2640    0     370
## 152 234.5235    1     370
## 153 238.9008    2     370
## 154 240.4730    3     370
## 155 267.5373    4     370
## 156 344.1937    5     370
## 157 281.1481    6     370
## 158 347.5855    7     370
## 159 365.1630    8     370
## 160 372.2288    9     370
## 161 269.8804    0     371
## 162 272.4428    1     371
## 163 277.8989    2     371
## 164 281.7895    3     371
## 165 279.1705    4     371
## 166 284.5120    5     371
## 167 259.2658    6     371
## 168 304.6306    7     371
## 169 350.7807    8     371
## 170 369.4692    9     371
## 171 269.4117    0     372
## 172 273.4740    1     372
## 173 297.5968    2     372
## 174 310.6316    3     372
## 175 287.1726    4     372
## 176 329.6076    5     372
## 177 334.4818    6     372
## 178 343.2199    7     372
## 179 369.1417    8     372
## 180 364.1236    9     372
summary(sleepstudy)
##     Reaction          Days        Subject   
##  Min.   :194.3   Min.   :0.0   308    : 10  
##  1st Qu.:255.4   1st Qu.:2.0   309    : 10  
##  Median :288.7   Median :4.5   310    : 10  
##  Mean   :298.5   Mean   :4.5   330    : 10  
##  3rd Qu.:336.8   3rd Qu.:7.0   331    : 10  
##  Max.   :466.4   Max.   :9.0   332    : 10  
##                                (Other):120
glimpse(sleepstudy)
## Rows: 180
## Columns: 3
## $ Reaction <dbl> 249.5600, 258.7047, 250.8006, 321.4398, 356.8519, 414.6901, 3…
## $ Days     <dbl> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0…
## $ Subject  <fct> 308, 308, 308, 308, 308, 308, 308, 308, 308, 308, 309, 309, 3…
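Printing a whole data frame, as above, fills the page quickly. A short sketch of some lighter ways to preview it, assuming lme4 is installed (it provides `sleepstudy`). Running `?sleepstudy` after loading lme4 opens the help page that describes the data.

```r
library(lme4)  # provides the sleepstudy data

head(sleepstudy)     # first 6 rows
head(sleepstudy, 3)  # first 3 rows
dim(sleepstudy)      # number of rows and columns
```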
# Extract one column

sleepstudy$Reaction
##   [1] 249.5600 258.7047 250.8006 321.4398 356.8519 414.6901 382.2038 290.1486
##   [9] 430.5853 466.3535 222.7339 205.2658 202.9778 204.7070 207.7161 215.9618
##  [17] 213.6303 217.7272 224.2957 237.3142 199.0539 194.3322 234.3200 232.8416
##  [25] 229.3074 220.4579 235.4208 255.7511 261.0125 247.5153 321.5426 300.4002
##  [33] 283.8565 285.1330 285.7973 297.5855 280.2396 318.2613 305.3495 354.0487
##  [41] 287.6079 285.0000 301.8206 320.1153 316.2773 293.3187 290.0750 334.8177
##  [49] 293.7469 371.5811 234.8606 242.8118 272.9613 309.7688 317.4629 309.9976
##  [57] 454.1619 346.8311 330.3003 253.8644 283.8424 289.5550 276.7693 299.8097
##  [65] 297.1710 338.1665 332.0265 348.8399 333.3600 362.0428 265.4731 276.2012
##  [73] 243.3647 254.6723 279.0244 284.1912 305.5248 331.5229 335.7469 377.2990
##  [81] 241.6083 273.9472 254.4907 270.8021 251.4519 254.6362 245.4523 235.3110
##  [89] 235.7541 237.2466 312.3666 313.8058 291.6112 346.1222 365.7324 391.8385
##  [97] 404.2601 416.6923 455.8643 458.9167 236.1032 230.3167 238.9256 254.9220
## [105] 250.7103 269.7744 281.5648 308.1020 336.2806 351.6451 256.2968 243.4543
## [113] 256.2046 255.5271 268.9165 329.7247 379.4445 362.9184 394.4872 389.0527
## [121] 250.5265 300.0576 269.8939 280.5891 271.8274 304.6336 287.7466 266.5955
## [129] 321.5418 347.5655 221.6771 298.1939 326.8785 346.8555 348.7402 352.8287
## [137] 354.4266 360.4326 375.6406 388.5417 271.9235 268.4369 257.2424 277.6566
## [145] 314.8222 317.2135 298.1353 348.1229 340.2800 366.5131 225.2640 234.5235
## [153] 238.9008 240.4730 267.5373 344.1937 281.1481 347.5855 365.1630 372.2288
## [161] 269.8804 272.4428 277.8989 281.7895 279.1705 284.5120 259.2658 304.6306
## [169] 350.7807 369.4692 269.4117 273.4740 297.5968 310.6316 287.1726 329.6076
## [177] 334.4818 343.2199 369.1417 364.1236
select(sleepstudy,Days)
##     Days
## 1      0
## 2      1
## 3      2
## 4      3
## 5      4
## 6      5
## 7      6
## 8      7
## 9      8
## 10     9
## 11     0
## 12     1
## 13     2
## 14     3
## 15     4
## 16     5
## 17     6
## 18     7
## 19     8
## 20     9
## 21     0
## 22     1
## 23     2
## 24     3
## 25     4
## 26     5
## 27     6
## 28     7
## 29     8
## 30     9
## 31     0
## 32     1
## 33     2
## 34     3
## 35     4
## 36     5
## 37     6
## 38     7
## 39     8
## 40     9
## 41     0
## 42     1
## 43     2
## 44     3
## 45     4
## 46     5
## 47     6
## 48     7
## 49     8
## 50     9
## 51     0
## 52     1
## 53     2
## 54     3
## 55     4
## 56     5
## 57     6
## 58     7
## 59     8
## 60     9
## 61     0
## 62     1
## 63     2
## 64     3
## 65     4
## 66     5
## 67     6
## 68     7
## 69     8
## 70     9
## 71     0
## 72     1
## 73     2
## 74     3
## 75     4
## 76     5
## 77     6
## 78     7
## 79     8
## 80     9
## 81     0
## 82     1
## 83     2
## 84     3
## 85     4
## 86     5
## 87     6
## 88     7
## 89     8
## 90     9
## 91     0
## 92     1
## 93     2
## 94     3
## 95     4
## 96     5
## 97     6
## 98     7
## 99     8
## 100    9
## 101    0
## 102    1
## 103    2
## 104    3
## 105    4
## 106    5
## 107    6
## 108    7
## 109    8
## 110    9
## 111    0
## 112    1
## 113    2
## 114    3
## 115    4
## 116    5
## 117    6
## 118    7
## 119    8
## 120    9
## 121    0
## 122    1
## 123    2
## 124    3
## 125    4
## 126    5
## 127    6
## 128    7
## 129    8
## 130    9
## 131    0
## 132    1
## 133    2
## 134    3
## 135    4
## 136    5
## 137    6
## 138    7
## 139    8
## 140    9
## 141    0
## 142    1
## 143    2
## 144    3
## 145    4
## 146    5
## 147    6
## 148    7
## 149    8
## 150    9
## 151    0
## 152    1
## 153    2
## 154    3
## 155    4
## 156    5
## 157    6
## 158    7
## 159    8
## 160    9
## 161    0
## 162    1
## 163    2
## 164    3
## 165    4
## 166    5
## 167    6
## 168    7
## 169    8
## 170    9
## 171    0
## 172    1
## 173    2
## 174    3
## 175    4
## 176    5
## 177    6
## 178    7
## 179    8
## 180    9
# Extract column by index (index=position)

select(sleepstudy, 2)
##     Days
## 1      0
## 2      1
## 3      2
## 4      3
## 5      4
## 6      5
## 7      6
## 8      7
## 9      8
## 10     9
## 11     0
## 12     1
## 13     2
## 14     3
## 15     4
## 16     5
## 17     6
## 18     7
## 19     8
## 20     9
## 21     0
## 22     1
## 23     2
## 24     3
## 25     4
## 26     5
## 27     6
## 28     7
## 29     8
## 30     9
## 31     0
## 32     1
## 33     2
## 34     3
## 35     4
## 36     5
## 37     6
## 38     7
## 39     8
## 40     9
## 41     0
## 42     1
## 43     2
## 44     3
## 45     4
## 46     5
## 47     6
## 48     7
## 49     8
## 50     9
## 51     0
## 52     1
## 53     2
## 54     3
## 55     4
## 56     5
## 57     6
## 58     7
## 59     8
## 60     9
## 61     0
## 62     1
## 63     2
## 64     3
## 65     4
## 66     5
## 67     6
## 68     7
## 69     8
## 70     9
## 71     0
## 72     1
## 73     2
## 74     3
## 75     4
## 76     5
## 77     6
## 78     7
## 79     8
## 80     9
## 81     0
## 82     1
## 83     2
## 84     3
## 85     4
## 86     5
## 87     6
## 88     7
## 89     8
## 90     9
## 91     0
## 92     1
## 93     2
## 94     3
## 95     4
## 96     5
## 97     6
## 98     7
## 99     8
## 100    9
## 101    0
## 102    1
## 103    2
## 104    3
## 105    4
## 106    5
## 107    6
## 108    7
## 109    8
## 110    9
## 111    0
## 112    1
## 113    2
## 114    3
## 115    4
## 116    5
## 117    6
## 118    7
## 119    8
## 120    9
## 121    0
## 122    1
## 123    2
## 124    3
## 125    4
## 126    5
## 127    6
## 128    7
## 129    8
## 130    9
## 131    0
## 132    1
## 133    2
## 134    3
## 135    4
## 136    5
## 137    6
## 138    7
## 139    8
## 140    9
## 141    0
## 142    1
## 143    2
## 144    3
## 145    4
## 146    5
## 147    6
## 148    7
## 149    8
## 150    9
## 151    0
## 152    1
## 153    2
## 154    3
## 155    4
## 156    5
## 157    6
## 158    7
## 159    8
## 160    9
## 161    0
## 162    1
## 163    2
## 164    3
## 165    4
## 166    5
## 167    6
## 168    7
## 169    8
## 170    9
## 171    0
## 172    1
## 173    2
## 174    3
## 175    4
## 176    5
## 177    6
## 178    7
## 179    8
## 180    9
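The outputs above hint at a difference worth spelling out: `$` returns a plain vector, while `select()` returns a data frame with one column. A small sketch, assuming dplyr and lme4 are installed. The names `days_df` and `days_vec` are made up for this example, and `pull()` is dplyr's function for extracting a column as a vector.

```r
library(dplyr)
library(lme4)  # provides the sleepstudy data

days_df  <- select(sleepstudy, Days)  # data frame with one column
days_vec <- pull(sleepstudy, Days)    # plain numeric vector

is.data.frame(days_df)
is.numeric(days_vec)

# pull() gives the same result as the `$` operator
identical(days_vec, sleepstudy$Days)
```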
# Extract two rows

slice(sleepstudy, 1:2)
##   Reaction Days Subject
## 1 249.5600    0     308
## 2 258.7047    1     308
# Compute the mean and sd, and chain them together

c(mean(sleepstudy$Reaction), sd(sleepstudy$Reaction))
## [1] 298.50789  56.32876
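Another way to combine the two statistics is to chain the steps with the pipe and `summarise()`. This is a sketch of an alternative, not what the line above does (that one simply joins two numbers with `c()`). It assumes dplyr and lme4 are installed, and the names `overall`, `by_subject`, `mean_reaction`, and `sd_reaction` are made up for this example.

```r
library(dplyr)
library(lme4)  # provides the sleepstudy data

# Overall mean and SD of Reaction, chained with the pipe
overall <- sleepstudy %>%
  summarise(mean_reaction = mean(Reaction), sd_reaction = sd(Reaction))
overall

# The same summary computed separately for each Subject
by_subject <- sleepstudy %>%
  group_by(Subject) %>%
  summarise(mean_reaction = mean(Reaction), sd_reaction = sd(Reaction))
head(by_subject, 3)
```

The grouped version returns one row per subject, which is often the more useful form for repeated-measures data like these.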
# Correlation matrix with psych::pairs.panels()

psych::pairs.panels(sleepstudy)

# Find out what a function does (use `?function_name`; for a package that is not loaded, add the package name, e.g., `?psych::pairs.panels`)

?psych::pairs.panels

Basic Markdown Elements

Italic and bolded texts

italics

bold

Lists (Ordered and Unordered)

(list_data <- c(4, 2, 1, 3, 5))
## [1] 4 2 1 3 5
(ordered <- sort(list_data))
## [1] 1 2 3 4 5
(unordered <- list_data)
## [1] 4 2 1 3 5

Equations (LaTeX)

Inline: The correlation between . . .

$y = mx + b$

Display:

\begin{equation} y = mx + b \end{equation}

Cheatsheet

More detailed cheatsheet https://raw.githubusercontent.com/rstudio/cheatsheets/master/rmarkdown-2.0.pdf


Exercises

Complete the following in this R Markdown document:

  1. Copy the following LaTeX equation to below: A_1 = \pi r^2. What does this say about writing Greek letters and subscripts/superscripts? \[A_1 = \pi r^2\]

  2. Install and then load the modelsummary package, and run the following. You’ll need to remove eval=FALSE so that it runs. Find out what this code chunk does.

    # Install and load the modelsummary package first; otherwise it won't run
    library(modelsummary)
    ## Warning: package 'modelsummary' was built under R version 4.1.2
    (fm1 <- lm(dist ~ speed, data = cars))
    ## 
    ## Call:
    ## lm(formula = dist ~ speed, data = cars)
    ## 
    ## Coefficients:
    ## (Intercept)        speed  
    ##     -17.579        3.932
    (fm2 <- lm(dist ~ poly(speed, 2), data = cars))
    ## 
    ## Call:
    ## lm(formula = dist ~ poly(speed, 2), data = cars)
    ## 
    ## Coefficients:
    ##     (Intercept)  poly(speed, 2)1  poly(speed, 2)2  
    ##           42.98           145.55            23.00
    (fm3 <- lm(log(dist) ~ log(speed), data = cars))
    ## 
    ## Call:
    ## lm(formula = log(dist) ~ log(speed), data = cars)
    ## 
    ## Coefficients:
    ## (Intercept)   log(speed)  
    ##     -0.7297       1.6024
    msummary(list(fm1, fm2, fm3))

                     Model 1     Model 2     Model 3
    (Intercept)      −17.579     42.980      −0.730
                     (6.758)     (2.146)     (0.376)
    speed            3.932
                     (0.416)
    poly(speed, 2)1              145.552
                                 (15.176)
    poly(speed, 2)2              22.996
                                 (15.176)
    log(speed)                               1.602
                                             (0.140)
    Num.Obs.         50          50          50
    R2               0.651       0.667       0.733
    R2 Adj.          0.644       0.653       0.728
    AIC              419.2       418.8       55.5
    BIC              424.9       426.4       61.3
    Log.Lik.         −206.578    −205.386    −24.766
    F                89.567      47.141      131.873
    RMSE             15.38       15.18       0.41

  3. Run the following. You’ll need to remove eval=FALSE so that it runs. Find out what this code chunk does.

    ggplot(cars, aes(x = log(speed), y = log(dist))) + 
      geom_point() + 
      geom_smooth()
    ## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

  4. Knit the document to HTML, PDF, and Word. If you run into an error when knitting to any one of the formats, record the error message. Which format do you prefer?

  5. Go to the top of this Rmd file, and change the line inside the YAML header

     ```
       html_document: default
     ```
    
     to
    
     ```
       html_document: 
           toc: TRUE
     ```
    
     Knit the document again. What does it do?

I got this error:

Error in yaml::yaml.load(…, eval.expr = TRUE) : Scanner error: mapping values are not allowed in this context at line 6, column 17 Calls: … parse_yaml_front_matter -> yaml_load -> Execution halted
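A scanner error like this usually points to malformed nesting or indentation in the YAML header, although the exact cause cannot be determined from the message alone. One way to experiment is to parse a header directly in R. The sketch below assumes the yaml package (the one named in the error message) is installed. The `good` and `bad` strings are made up for this example, with the title taken from this document's heading.

```r
library(yaml)

# Well-formed: `toc` is nested under `html_document`,
# which is nested under `output`, each level indented further
good <- '
title: "RStudio Intro"
output:
  html_document:
    toc: true
'
str(yaml.load(good))

# Malformed (hypothetical): several `key: value` pairs on one line
bad <- "output: html_document: toc: true"

# tryCatch() returns the parser's error message instead of halting
result <- tryCatch(yaml.load(bad), error = function(e) conditionMessage(e))
result
```

If the header in the Rmd file parses cleanly this way, the indentation is probably fine and the problem lies elsewhere.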

  6. Think about some things you might want to accomplish with R and R Markdown using RStudio, and Google to see if people have made packages that will help with that process, or just Google around about these topics and see what you find. Either way, describe 5 interesting things that you found and include URLs.

    1. cleaning outliers from big datasets - the filter function looks like a good way to establish a threshold for identifying outliers (https://stackoverflow.com/questions/65320747/removing-outliers-within-each-group-of-a-dataframe)

    2. getting a nice figure for doing factor analysis - the fa.stats function seems to cover this (https://rpubs.com/vsi/factor_analysis)

    3. better way to visualize big correlation matrix - the corrgram function gives a good exploratory look at big correlation matrices (https://cran.r-project.org/web/packages/corrgram/vignettes/corrgram_examples.html)

    4. way to handle missing data on big datasets - seems there are many options for addressing this, including using the is.na function to identify rows with missing values. Other tips are also included on this page (https://towardsdatascience.com/data-cleaning-with-r-and-the-tidyverse-detecting-missing-values-ea23c519bc62)

    5. more straightforward way to combine data from different datasets - there are left join and right join options that allow you to stitch together columns from different datasets (https://datascienceplus.com/merging-datasets-with-tidyverse/)
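As a small illustration of the last two items in the list above, here is a sketch of a left join followed by a missing-value check. It assumes dplyr is installed. The data frames `scores` and `groups` are toy data invented for this example.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration
scores <- data.frame(id = c(1, 2, 3), score = c(10, 20, 30))
groups <- data.frame(id = c(1, 2, 4), group = c("a", "b", "c"))

# Keep every row of `scores`; add `group` wherever `id` matches
joined <- left_join(scores, groups, by = "id")
joined

# id 3 has no match in `groups`, so its `group` is NA;
# is.na() picks out those unmatched rows
filter(joined, is.na(group))
```

A right join would do the reverse and keep every row of `groups` instead.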