R for Economics and Social Sciences Research

class: center, middle

# R for Economics and Social Science Research

#### Norberto E. Milla, Jr.

---

class: center, middle

# Day 1: Introduction to R

---

#  Overview of R and RStudio

R:

* is open source and freely available

* is a cross-platform language

* has an extensive and coherent set of tools for statistical analysis

* has an extensive and highly flexible graphical facility for producing publication-ready graphics

* has an expanding set of freely available ‘packages’ to extend R’s capabilities

* has an extensive support network with numerous online and freely available documents

---
#  Overview of R and RStudio

* RStudio is a user-friendly interface (an integrated development environment) for R, installed separately from R itself

* It incorporates the R Console, a script editor and other useful functionality

---
# Installing and loading packages

* R packages are collections of functions, data sets, and documentation that extend R's functionality (e.g. <tt>tidyverse</tt>, <tt>ggplot2</tt>, <tt>readxl</tt>)

* They help users perform specific tasks more efficiently

* Many R packages can be downloaded from the Comprehensive R Archive Network (CRAN): *>22,000 packages*

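* To install a package, type in the Console <tt>install.packages("*packagename*")</tt>

* Alternative: Click **Tools** in the Menu bar and select *Install Packages...*

* To load an installed package, type <tt>library(*packagename*)</tt>; this is needed once in every new R session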
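
The install-then-load workflow can be sketched roughly as follows. To keep the example runnable without internet access, the <tt>install.packages()</tt> line is commented out (<tt>readxl</tt> is only an example name) and the base package <tt>stats</tt> stands in for the package being loaded.

``` r
# One-time step: download and install a package from CRAN (needs internet access)
# install.packages("readxl")

# Check whether a package is installed without attaching it
has_stats <- requireNamespace("stats", quietly = TRUE)
print(has_stats)

# Attach an installed package so its functions can be called by name;
# this has to be repeated in every new R session
library(stats)
print("package:stats" %in% search())
```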

---

# Data types and basic operations

* R **objects** are fundamental data containers: <tt>vectors</tt>, <tt>matrices</tt>, <tt>data frames</tt>, <tt>lists</tt>, and <tt>functions</tt>

* Objects are created using the assignment operator: <tt>*<-*</tt>

``` r
a <- 10
b <- 5
c <- sqrt(a^2 + b^2)
print(c)
```

```
## [1] 11.18034
```

* A few rules for naming objects in R:

   - R is case-sensitive: <tt>Weight</tt> is different from <tt>weight</tt>

   - Object names should be explicit and not too long

   - Do not start a name with a number (e.g. <tt>2cm</tt> is not a valid name)
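
A small sketch of the naming rules in action. The object names and values are made up for illustration; <tt>make.names()</tt> is a base R helper that shows how R would repair an invalid name.

``` r
weight <- 60   # lower-case name
Weight <- 72   # a different object, because R is case-sensitive
print(weight == Weight)

# A name starting with a digit is invalid; make.names() repairs it
# by adding an "X" prefix
fixed_name <- make.names("2cm")
print(fixed_name)
```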

---

# Data types and basic operations: vector

* one-dimensional arrays that hold elements of the same type, such as numbers, characters, or logical values

``` r
numeric_vector <- c(1, 2, 3, 4)          # Numeric vector
character_vector <- c("apple", "banana") # Character vector
logical_vector <- c(TRUE, FALSE, TRUE)   # Logical vector
```

``` r
numeric_vector
```

```
## [1] 1 2 3 4
```

``` r
character_vector
```

```
## [1] "apple"  "banana"
```

``` r
logical_vector
```

```
## [1]  TRUE FALSE  TRUE
```
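
Because a vector holds a single type, mixing types inside <tt>c()</tt> makes R convert every element to one common type. A minimal sketch with made-up values:

``` r
mixed <- c(1, "apple", TRUE)        # numbers and logicals are converted to text
print(mixed)
print(class(mixed))

num_logical <- c(1.5, TRUE, FALSE)  # logicals become 1 and 0
print(num_logical)
print(class(num_logical))
```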

---

# Data types and basic operations: vector

``` r
x <- c(1, 2, 3) 
y <- c(4, 5, 6)
x + y # Element-wise sum of x and y
```

```
## [1] 5 7 9
```

``` r
y^x # Each element of y raised to the corresponding element of x
```

```
## [1]   4  25 216
```

``` r
x * y # Element-wise product of x and y
```

```
## [1]  4 10 18
```

---
# Data types and basic operations: vector

* Elements of a vector can be accessed as follows:

``` r
set.seed(1234) # Fixes the seed so the same random numbers are generated on every run
z <- rnorm(n = 6, mean = 0, sd = 1)
z # Prints all elements of z
```

```
## [1] -1.2070657  0.2774292  1.0844412 -2.3456977  0.4291247  0.5060559
```

``` r
z[4] # Extracts the 4th element of z
```

```
## [1] -2.345698
```

``` r
z[4:6] # Extracts the 4th through the 6th elements of z
```

```
## [1] -2.3456977  0.4291247  0.5060559
```

``` r
z[c(1, 3, 5)] # Extracts the 1st, 3rd, and 5th elements of z
```

```
## [1] -1.2070657  1.0844412  0.4291247
```
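
Two further indexing styles may also be useful: negative indices drop elements, and a logical condition keeps only the elements for which it is <tt>TRUE</tt>. The sketch below uses a made-up vector <tt>v</tt> rather than <tt>z</tt>.

``` r
v <- c(10, -3, 7, 0, -8, 4)   # made-up values

dropped_first <- v[-1]        # everything except the 1st element
positives <- v[v > 0]         # only the positive elements
negative_pos <- which(v < 0)  # positions of the negative elements

print(dropped_first)
print(positives)
print(negative_pos)
```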
---
# Data types and basic operations: vector

* Basic functions in working with vectors: <tt>length()</tt>, <tt>sum()</tt>, <tt>mean()</tt>, <tt>sd()</tt>

``` r
length(z) # Determines the number of elements of z
```

```
## [1] 6
```

``` r
sum(z) # Determines the sum of the elements of z
```

```
## [1] -1.255712
```

``` r
mean(z) # Calculates the mean/average of the elements of z
```

```
## [1] -0.2092854
```

``` r
sd(z) # Calculates the standard deviation of the elements of z
```

```
## [1] 1.295355
```

---
# Data types and basic operations: matrix

* a two-dimensional arrangement of data (all of the same type) in rows and columns

* can be thought of as a collection of equal-length vectors arranged as rows or columns

``` r
A <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE) # 3x3 matrix filled row by row
print(A)
```

```
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
```

``` r
B <- matrix(1:9, nrow = 3, ncol = 3, byrow = FALSE) # 3x3 matrix filled column by column
print(B)
```

```
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
```

---
# Data types and basic operations: matrix

* Addition, subtraction and multiplication of matrices are shown below:

``` r
A + B # Addition
```

```
##      [,1] [,2] [,3]
## [1,]    2    6   10
## [2,]    6   10   14
## [3,]   10   14   18
```

``` r
A - B # Subtraction
```

```
##      [,1] [,2] [,3]
## [1,]    0   -2   -4
## [2,]    2    0   -2
## [3,]    4    2    0
```

``` r
A %*% B # Matrix multiplication (rows of A times columns of B)
```

```
##      [,1] [,2] [,3]
## [1,]   14   32   50
## [2,]   32   77  122
## [3,]   50  122  194
```
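
A common source of confusion is that <tt>*</tt> multiplies matrices entry by entry, while <tt>%*%</tt> performs matrix multiplication. The sketch below rebuilds the same <tt>A</tt> and <tt>B</tt> as above so that it runs on its own.

``` r
A <- matrix(1:9, nrow = 3, byrow = TRUE)   # filled row by row
B <- matrix(1:9, nrow = 3, byrow = FALSE)  # filled column by column

elementwise <- A * B    # multiplies matching entries
matrix_prod <- A %*% B  # rows of A times columns of B

print(elementwise)
print(matrix_prod)
print(all(B == t(A)))   # B happens to be the transpose of A
```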

---
# Data types and basic operations: list

* Used to store mixtures of data types

``` r
mylist <- list(char_vector = c("black", "yellow", "orange"),
               logic_vector = c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE),
               num_mat = matrix(1:6, nrow = 3))
print(mylist)
```

```
## $char_vector
## [1] "black"  "yellow" "orange"
## 
## $logic_vector
## [1]  TRUE  TRUE FALSE  TRUE FALSE FALSE
## 
## $num_mat
##      [,1] [,2]
## [1,]    1    4
## [2,]    2    5
## [3,]    3    6
```

---
# Data types and basic operations: list

* We can apply indexing to extract one or more elements of a list just like in a vector

``` r
print(mylist[2]) # Prints a one-element list holding the logical vector
```

```
## $logic_vector
## [1]  TRUE  TRUE FALSE  TRUE FALSE FALSE
```

``` r
print(mylist[3]) # Prints a one-element list holding the numeric matrix
```

```
## $num_mat
##      [,1] [,2]
## [1,]    1    4
## [2,]    2    5
## [3,]    3    6
```
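
Single brackets return a smaller list. To get the element itself, double brackets or <tt>$</tt> can be used. A minimal sketch, rebuilding <tt>mylist</tt> so that it runs on its own (the names <tt>sub_list</tt>, <tt>logic_only</tt>, and <tt>colour_names</tt> are made up here):

``` r
mylist <- list(char_vector = c("black", "yellow", "orange"),
               logic_vector = c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE),
               num_mat = matrix(1:6, nrow = 3))

sub_list <- mylist[2]               # single brackets: still a list
logic_only <- mylist[[2]]           # double brackets: the logical vector itself
colour_names <- mylist$char_vector  # $ extracts an element by name

print(class(sub_list))
print(class(logic_only))
print(sum(logic_only))              # number of TRUE values
print(colour_names[2])
```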

---
# Data types and basic operations: data frame

.pull-left[
* a two-dimensional, tabular data structure where each column can contain elements of different data types (numeric, character, logical, etc.)

* every row corresponds to an observation or case (e.g. student, firm, university); every column corresponds to a variable (e.g. age, sex, marital status)

* most commonly used data structure for statistical analysis and data manipulation
]

.pull-right[
<img src="pic4.png" width="100%" style="display: block; margin: auto;" />
]

---
# Data types and basic operations: data frame

``` r
stud_height <- c(180, 155, 160, 167, 181, 165)
stud_weight <- c(65, 50, 52, 58, 70, 60)
stud_names <- c("Theo", "Anthony", "Vincent", "Angelo", "Lee", "Antonette")

stud_record <- data.frame(Names = stud_names, 
                          Height = stud_height, 
                          Weight = stud_weight)

print(stud_record)
```

```
##       Names Height Weight
## 1      Theo    180     65
## 2   Anthony    155     50
## 3   Vincent    160     52
## 4    Angelo    167     58
## 5       Lee    181     70
## 6 Antonette    165     60
```
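
Once a data frame exists, its columns and rows can be inspected and subset. The sketch below rebuilds <tt>stud_record</tt> so that it runs on its own; the 165 cm cut-off and the names <tt>mean_height</tt> and <tt>tall</tt> are arbitrary choices for illustration.

``` r
stud_record <- data.frame(
  Names  = c("Theo", "Anthony", "Vincent", "Angelo", "Lee", "Antonette"),
  Height = c(180, 155, 160, 167, 181, 165),
  Weight = c(65, 50, 52, 58, 70, 60)
)

print(dim(stud_record))                          # number of rows and columns
mean_height <- mean(stud_record$Height)          # $ extracts one column as a vector
tall <- stud_record[stud_record$Height > 165, ]  # rows meeting a condition

print(mean_height)
print(tall)
str(stud_record)                                 # compact overview of column types
```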
---
# R Markdown

- a simple, easy-to-use plain-text format in which one can write R code and see the results (e.g. plots and tables) in the same document

- useful for generating a single, nicely formatted, reproducible document (such as a report, publication, thesis chapter, web page, or slide deck)

- the document can be rendered to HTML, PDF, or Word format

---
# Data management: importing data

* We can import data sets using various functions in R:

   - <tt>read.table()</tt> for text (*.txt*) files

   - <tt>read.csv()</tt> for comma-separated values (*.csv*) files

   - <tt>read_dta()</tt> or <tt>read_stata()</tt> for Stata (*.dta*) files [**haven** package]

   - <tt>read_excel()</tt> for Excel (*.xls* or *.xlsx*) files [**readxl** package]
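
As a rough, self-contained sketch of importing, the code below writes a tiny made-up data set to a temporary *.csv* file and reads it back with <tt>read.csv()</tt>. The commented lines show the analogous calls for Stata and Excel files; the file names in them are hypothetical.

``` r
# A tiny made-up data set, saved to a temporary .csv file
toy <- data.frame(id = 1:3, score = c(85, 90.5, 78))
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)

# Import the file and inspect the result
imported <- read.csv(csv_path)
print(head(imported))
str(imported)

# Analogous calls for other formats (hypothetical file names):
# haven::read_dta("survey.dta")
# readxl::read_excel("survey.xlsx")
```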

---
# Data management: importing data

**Setting up the working directory**

- Type in the Console <tt>setwd("path")</tt>

- Click **Session** in the menu bar, select **Set Working Directory**, then **Choose Directory...**

- Click **Files** in the <tt>Files, Plots, Packages, Help</tt> pane in RStudio. Then click the icon (encircled) shown below. Browse to the desired folder.

- Then click the gear icon (encircled) shown below and select *Set As Working Directory*
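
The Console route can be sketched as follows. The temporary folder from <tt>tempdir()</tt> stands in for a real project folder, which is an assumption made only so that the example runs anywhere.

``` r
old_dir <- getwd()    # remember the current working directory
setwd(tempdir())      # switch to another folder (here: R's temporary folder)
new_dir <- getwd()    # confirm where R now reads and writes files
print(new_dir)
print(list.files())   # files visible in the working directory

setwd(old_dir)        # switch back
```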

---
# Data management: importing data

``` r
library(haven) # provides read_stata()
ndhs <- read_stata("PHBR82FL.DTA")
head(ndhs)
```

```
## # A tibble: 6 × 1,252
##   caseid       bidx v000   v001  v002  v003  v004   v005  v006  v007  v008 v008a
##   <chr>       <dbl> <chr> <dbl> <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 "       1 …     1 PH8       1     4     2     1 116381     5  2022  1469 44691
## 2 "       1 …     2 PH8       1     4     2     1 116381     5  2022  1469 44691
## 3 "       1 …     1 PH8       1     6     2     1 116381     5  2022  1469 44686
## 4 "       1 …     2 PH8       1     6     2     1 116381     5  2022  1469 44686
## 5 "       1 …     1 PH8       1     7     6     1 116381     5  2022  1469 44693
## 6 "       1 …     2 PH8       1     7     6     1 116381     5  2022  1469 44693
## # ℹ 1,240 more variables: v009 <dbl>, v010 <dbl>, v011 <dbl>, v012 <dbl>,
## #   v013 <dbl+lbl>, v014 <dbl+lbl>, v015 <dbl+lbl>, v016 <dbl>, v017 <dbl>,
## #   v018 <dbl+lbl>, v019 <dbl+lbl>, v019a <dbl+lbl>, v020 <dbl+lbl>,
## #   v021 <dbl>, v022 <dbl>, v023 <dbl>, v024 <dbl+lbl>, v025 <dbl+lbl>,
## #   v026 <dbl+lbl>, v027 <dbl>, v028 <dbl>, v029 <dbl>, v030 <dbl>, v031 <dbl>,
## #   v032 <dbl>, v034 <dbl+lbl>, v040 <dbl>, v042 <dbl+lbl>, v044 <dbl+lbl>,
## #   v045a <dbl+lbl>, v045b <dbl+lbl>, v045c <dbl+lbl>, v046 <dbl+lbl>, …
```

---
# Data management: importing data

``` r
library(haven)
library(readxl)
ets1 <- read_excel("Profile of on-going students (Region 1).xlsx")
head(ets1)
```

```
## # A tibble: 6 × 19
##    ...1 Region   SUC Target Student_Cat_Final   Age   Sex Civil_stats HHsize
##   <dbl>  <dbl> <dbl>  <dbl>             <dbl> <dbl> <dbl>       <dbl>  <dbl>
## 1     1      1     2      1                 2    18     1           1      2
## 2     2      1     1      0                 2    24     1           1      2
## 3     3      1     2      0                 2    19     1           1      2
## 4     4      1     1      1                 2    21     0           1      1
## 5     5      1     4      1                 2    20     0           1      2
## 6     6      1     5      0                 2    19     0           1      1
## # ℹ 10 more variables: Religion <dbl>, `Monthly Income` <chr>,
## #   Birth_Order <chr>, Living_arrange <dbl>, Listahanan <dbl>,
## #   `4P's beneficiary` <dbl>, `Pre-college Ed` <dbl>, HS_type <dbl>,
## #   Strand <dbl>, GWA <dbl>
```

---
# Data management: data wrangling

Common verbs in the **dplyr** package:

- <tt>select()</tt>: picks variables based on their names

- <tt>filter()</tt>: picks cases based on their values

- <tt>mutate()</tt>: adds new variables that are functions of existing variables

- <tt>summarize()</tt>: generates summary statistics such as mean, median, and SD

- <tt>group_by()</tt>: groups rows so that later operations (e.g. summaries) are carried out by group

- <tt>arrange()</tt>: changes the ordering of the rows

- <tt>rename()</tt>: changes the name of a variable
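
A minimal sketch of how several of these verbs chain together, assuming the **dplyr** package is installed. The small data set repeats the student heights and weights from the data frame slide; the 165 cm cut-off and the new column names are arbitrary choices for illustration.

``` r
library(dplyr)

students <- data.frame(
  Name   = c("Theo", "Anthony", "Vincent", "Angelo", "Lee", "Antonette"),
  Height = c(180, 155, 160, 167, 181, 165),
  Weight = c(65, 50, 52, 58, 70, 60)
)

result <- students %>%
  rename(Height_cm = Height, Weight_kg = Weight) %>%          # new_name = old_name
  mutate(Tall = Height_cm > 165) %>%                          # add a logical variable
  group_by(Tall) %>%                                          # later steps run per group
  summarize(n = n(), mean_weight = mean(Weight_kg)) %>%       # one row per group
  arrange(desc(mean_weight))                                  # heaviest group first

print(result)
```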

---
# Data management: data wrangling

The **tidyverse** package: a collection of R packages designed for data science (including **dplyr**, **tidyr**, and **ggplot2**)

---

# Data management: data wrangling

``` r
library(tidyverse) # loads dplyr, tidyr, and the %>% pipe
ets1 %>% # pipe operator (Ctrl + Shift + M in RStudio)
  select(Target, Age, Sex, HHsize, GWA, HS_type, Strand) %>% 
  head()
```

```
## # A tibble: 6 × 7
##   Target   Age   Sex HHsize   GWA HS_type Strand
##    <dbl> <dbl> <dbl>  <dbl> <dbl>   <dbl>  <dbl>
## 1      1    18     1      2  85         0      0
## 2      0    24     1      2  85         1      1
## 3      0    19     1      2  87         0      0
## 4      1    21     0      1  90         0      0
## 5      1    20     0      2  NA         0      0
## 6      0    19     0      1  92.0       0      0
```

---
# Data management: data wrangling

``` r
ets1 %>%
  select(Target, Age, Sex, HHsize, GWA, HS_type, Strand) %>% 
  mutate(Sex_recode = recode(Sex,
                             "0" = "Male",
                             "1" = "Female")) %>%
  head(n=10)
```

```
## # A tibble: 10 × 8
##    Target   Age   Sex HHsize   GWA HS_type Strand Sex_recode
##     <dbl> <dbl> <dbl>  <dbl> <dbl>   <dbl>  <dbl> <chr>     
##  1      1    18     1      2  85         0      0 Female    
##  2      0    24     1      2  85         1      1 Female    
##  3      0    19     1      2  87         0      0 Female    
##  4      1    21     0      1  90         0      0 Male      
##  5      1    20     0      2  NA         0      0 Male      
##  6      0    19     0      1  92.0       0      0 Male      
##  7      1    20     1      3  96         0      1 Female    
##  8      1    23     0      3  NA         1      1 Male      
##  9      1    21     1      2  93         1      0 Female    
## 10      1    19     1      3  80         0      0 Female
```

---

# Data management: data wrangling

``` r
ets1 %>%
  select(Target, Age, Sex, HHsize, GWA, HS_type, Strand) %>% 
  mutate(Sex_recode = recode(Sex,
                             "0" = "Male",
                             "1" = "Female")) %>%
  filter(GWA>90 & Sex_recode == "Male")
```

```
## # A tibble: 218 × 8
##    Target   Age   Sex HHsize   GWA HS_type Strand Sex_recode
##     <dbl> <dbl> <dbl>  <dbl> <dbl>   <dbl>  <dbl> <chr>     
##  1      0    19     0      1  92.0       0      0 Male      
##  2      0    19     0      2  94         0      0 Male      
##  3      0    21     0      2  91         0      1 Male      
##  4      0    20     0      2  94.4       0      0 Male      
##  5      0    20     0      3  92         0      0 Male      
##  6      1    21     0      2  95         0      0 Male      
##  7      1    25     0      2  92         0      0 Male      
##  8      0    19     0      2  91         1      1 Male      
##  9      0    21     0      2  91.5       1      1 Male      
## 10      0    21     0      1  91         0      0 Male      
## # ℹ 208 more rows
```

---

# Data management: data wrangling

``` r
ets1 %>%
  select(Target, Age, Sex, HHsize, GWA, HS_type, Strand) %>% 
  mutate(Sex_recode = recode(Sex,
                             "0" = "Male",
                             "1" = "Female")) %>%
  drop_na(Age) %>% 
  mutate(Age_Cat = if_else(Age <= 20, "20 and below", # age 20 falls in this group
                           if_else(Age <= 30, "21-30",
                                   "31 & up")))
```

```
## # A tibble: 1,749 × 9
##    Target   Age   Sex HHsize   GWA HS_type Strand Sex_recode Age_Cat     
##     <dbl> <dbl> <dbl>  <dbl> <dbl>   <dbl>  <dbl> <chr>      <chr>       
##  1      1    18     1      2  85         0      0 Female     20 and below
##  2      0    24     1      2  85         1      1 Female     21-30       
##  3      0    19     1      2  87         0      0 Female     20 and below
##  4      1    21     0      1  90         0      0 Male       21-30       
##  5      1    20     0      2  NA         0      0 Male       20 and below
##  6      0    19     0      1  92.0       0      0 Male       20 and below
##  7      1    20     1      3  96         0      1 Female     20 and below
##  8      1    23     0      3  NA         1      1 Male       21-30       
##  9      1    21     1      2  93         1      0 Female     21-30       
## 10      1    19     1      3  80         0      0 Female     20 and below
## # ℹ 1,739 more rows
```
---

# Data management: data wrangling

``` r
ets1 %>%
  select(Target, Age, Sex, HHsize, GWA, HS_type, Strand) %>% 
  mutate(Sex_recode = recode(Sex,
                             "0" = "Male",
                             "1" = "Female")) %>%
  drop_na(Age) %>% 
  arrange(Age) %>% 
  mutate(Age_Cat = cut(Age,
                       breaks = 3,
                       labels = c("AgeGrp1", "AgeGrp2", "AgeGrp3"))) %>% 
  group_by(Age_Cat) %>% 
  count()
```

```
## # A tibble: 3 × 2
## # Groups:   Age_Cat [3]
##   Age_Cat     n
##   <fct>   <int>
## 1 AgeGrp1  1705
## 2 AgeGrp2    41
## 3 AgeGrp3     3
```
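
With <tt>breaks = 3</tt>, <tt>cut()</tt> splits the observed range into three intervals of equal width, which is why a few high ages can leave most cases in the first group. Supplying explicit cut points is one alternative. The sketch below uses made-up ages and base R only.

``` r
ages <- c(17, 18, 19, 20, 21, 24, 30, 35, 52)  # made-up ages

# Three equal-width intervals over the range 17 to 52
equal_width <- cut(ages, breaks = 3,
                   labels = c("AgeGrp1", "AgeGrp2", "AgeGrp3"))
print(table(equal_width))

# Explicit cut points; each interval includes its upper limit
explicit <- cut(ages, breaks = c(-Inf, 20, 30, Inf),
                labels = c("20 and below", "21-30", "31 & up"))
print(table(explicit))
```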

---
class: center, middle

## LUNCH BREAK
---
# Descriptive statistics: frequency tables

``` r
etsdata <- ets1 %>%
  select(Target, Age, Sex, HHsize, GWA, HS_type, Strand) %>% 
  mutate(Target = recode(Target,
                         "0" = "General Students",
                         "1" = "Equity Target Students"),
         Sex = recode(Sex,
                      "0" = "Male",
                      "1" = "Female"),
         HHsize = recode(HHsize,
                         "1" = "Small",
                         "2" = "Medium",
                         "3" = "Large"),
         HS_type = recode(HS_type,
                          "0" = "Public",
                          "1" = "Private"),
         Strand = recode(Strand,
                         "0" = "Non-STEM",
                         "1" = "STEM"))
```

---
# Descriptive statistics: frequency tables

``` r
library(gtsummary) # provides tbl_summary()
etsdata %>% 
  select(Sex, Strand) %>% 
  tbl_summary()
```

<div id="bgvzfontnx" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
<table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
  <thead>
    <tr class="gt_col_headings">
      <th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" scope="col" id="label"><span class="gt_from_md"><strong>Characteristic</strong></span></th>
      <th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="stat_0"><span class="gt_from_md"><strong>N = 1,750</strong></span><span class="gt_footnote_marks" style="white-space:nowrap;font-style:italic;font-weight:normal;line-height:0;"><sup>1</sup></span></th>
    </tr>
  </thead>
  <tbody class="gt_table_body">
    <tr><td headers="label" class="gt_row gt_left">Sex</td>
<td headers="stat_0" class="gt_row gt_center"><br /></td></tr>
    <tr><td headers="label" class="gt_row gt_left">    Female</td>
<td headers="stat_0" class="gt_row gt_center">1,105 (63%)</td></tr>
    <tr><td headers="label" class="gt_row gt_left">    Male</td>
<td headers="stat_0" class="gt_row gt_center">645 (37%)</td></tr>
    <tr><td headers="label" class="gt_row gt_left">Strand</td>
<td headers="stat_0" class="gt_row gt_center"><br /></td></tr>
    <tr><td headers="label" class="gt_row gt_left">    Non-STEM</td>
<td headers="stat_0" class="gt_row gt_center">1,385 (79%)</td></tr>
    <tr><td headers="label" class="gt_row gt_left">    STEM</td>
<td headers="stat_0" class="gt_row gt_center">365 (21%)</td></tr>
  </tbody>
  
  <tfoot class="gt_footnotes">
    <tr>
      <td class="gt_footnote" colspan="2"><span class="gt_footnote_marks" style="white-space:nowrap;font-style:italic;font-weight:normal;line-height:0;"><sup>1</sup></span> <span class="gt_from_md">n (%)</span></td>
    </tr>
  </tfoot>
</table>
</div>
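
For comparison, the same kind of count-and-percentage table can be sketched in base R with <tt>table()</tt> and <tt>prop.table()</tt>. The vector <tt>sex</tt> below is made up and is not the ETS data.

``` r
sex <- c("Female", "Male", "Female", "Female", "Male")  # made-up values

counts <- table(sex)                           # frequency of each category
percent <- round(100 * prop.table(counts), 1)  # share of each category in percent

freq_table <- data.frame(category = names(counts),
                         n = as.integer(counts),
                         percent = as.numeric(percent))
print(freq_table)
```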

---
# Descriptive statistics: frequency tables

``` r
etsdata %>%
  select(Sex, Strand) %>% 
  tbl_summary(by = Strand)
```

<div id="svdhjkbuov" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
<style>#svdhjkbuov table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
#svdhjkbuov thead, #svdhjkbuov tbody, #svdhjkbuov tfoot, #svdhjkbuov tr, #svdhjkbuov td, #svdhjkbuov th {
border-style: none;
}
#svdhjkbuov p {
margin: 0;
padding: 0;
}
#svdhjkbuov .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
#svdhjkbuov .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
#svdhjkbuov .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
#svdhjkbuov .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
#svdhjkbuov .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
#svdhjkbuov .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
#svdhjkbuov .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
#svdhjkbuov .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
#svdhjkbuov .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
#svdhjkbuov .gt_column_spanner_outer:first-child {
padding-left: 0;
}
#svdhjkbuov .gt_column_spanner_outer:last-child {
padding-right: 0;
}
#svdhjkbuov .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
#svdhjkbuov .gt_spanner_row {
border-bottom-style: hidden;
}
#svdhjkbuov .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
#svdhjkbuov .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
#svdhjkbuov .gt_from_md > :first-child {
margin-top: 0;
}
#svdhjkbuov .gt_from_md > :last-child {
margin-bottom: 0;
}
#svdhjkbuov .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
#svdhjkbuov .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
#svdhjkbuov .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
#svdhjkbuov .gt_row_group_first td {
border-top-width: 2px;
}
#svdhjkbuov .gt_row_group_first th {
border-top-width: 2px;
}
#svdhjkbuov .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
#svdhjkbuov .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
#svdhjkbuov .gt_first_summary_row.thick {
border-top-width: 2px;
}
#svdhjkbuov .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
#svdhjkbuov .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
#svdhjkbuov .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
#svdhjkbuov .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
#svdhjkbuov .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
#svdhjkbuov .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
#svdhjkbuov .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
#svdhjkbuov .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
#svdhjkbuov .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
#svdhjkbuov .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
#svdhjkbuov .gt_left {
text-align: left;
}
#svdhjkbuov .gt_center {
text-align: center;
}
#svdhjkbuov .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
#svdhjkbuov .gt_font_normal {
font-weight: normal;
}
#svdhjkbuov .gt_font_bold {
font-weight: bold;
}
#svdhjkbuov .gt_font_italic {
font-style: italic;
}
#svdhjkbuov .gt_super {
font-size: 65%;
}
#svdhjkbuov .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
#svdhjkbuov .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
#svdhjkbuov .gt_indent_1 {
text-indent: 5px;
}
#svdhjkbuov .gt_indent_2 {
text-indent: 10px;
}
#svdhjkbuov .gt_indent_3 {
text-indent: 15px;
}
#svdhjkbuov .gt_indent_4 {
text-indent: 20px;
}
#svdhjkbuov .gt_indent_5 {
text-indent: 25px;
}
#svdhjkbuov .katex-display {
display: inline-flex !important;
margin-bottom: 0.75em !important;
}
#svdhjkbuov div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
height: 0px !important;
}
</style>
<table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
  <thead>
    <tr class="gt_col_headings">
      <th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" scope="col" id="label"><span class="gt_from_md"><strong>Characteristic</strong></span></th>
      <th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="stat_1"><span class="gt_from_md"><strong>Non-STEM</strong><br />
N = 1,385</span><span class="gt_footnote_marks" style="white-space:nowrap;font-style:italic;font-weight:normal;line-height:0;"><sup>1</sup></span></th>
      <th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="stat_2"><span class="gt_from_md"><strong>STEM</strong><br />
N = 365</span><span class="gt_footnote_marks" style="white-space:nowrap;font-style:italic;font-weight:normal;line-height:0;"><sup>1</sup></span></th>
    </tr>
  </thead>
  <tbody class="gt_table_body">
    <tr><td headers="label" class="gt_row gt_left">Sex</td>
<td headers="stat_1" class="gt_row gt_center"><br /></td>
<td headers="stat_2" class="gt_row gt_center"><br /></td></tr>
    <tr><td headers="label" class="gt_row gt_left">    Female</td>
<td headers="stat_1" class="gt_row gt_center">882 (64%)</td>
<td headers="stat_2" class="gt_row gt_center">223 (61%)</td></tr>
    <tr><td headers="label" class="gt_row gt_left">    Male</td>
<td headers="stat_1" class="gt_row gt_center">503 (36%)</td>
<td headers="stat_2" class="gt_row gt_center">142 (39%)</td></tr>
  </tbody>
  
  <tfoot class="gt_footnotes">
    <tr>
      <td class="gt_footnote" colspan="3"><span class="gt_footnote_marks" style="white-space:nowrap;font-style:italic;font-weight:normal;line-height:0;"><sup>1</sup></span> <span class="gt_from_md">n (%)</span></td>
    </tr>
  </tfoot>
</table>
</div>
---
# Descriptive statistics: frequency tables

``` r
etsdata %>%
  select(Sex, Strand) %>% 
  tbl_cross(percent = "row") %>% 
  bold_labels() %>% 
  add_p(source_note=TRUE)
```

<div id="vyjwvxmycs" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
<style>#vyjwvxmycs table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
#vyjwvxmycs thead, #vyjwvxmycs tbody, #vyjwvxmycs tfoot, #vyjwvxmycs tr, #vyjwvxmycs td, #vyjwvxmycs th {
border-style: none;
}
#vyjwvxmycs p {
margin: 0;
padding: 0;
}
#vyjwvxmycs .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
#vyjwvxmycs .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
#vyjwvxmycs .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
#vyjwvxmycs .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
#vyjwvxmycs .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
#vyjwvxmycs .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
#vyjwvxmycs .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
#vyjwvxmycs .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
#vyjwvxmycs .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
#vyjwvxmycs .gt_column_spanner_outer:first-child {
padding-left: 0;
}
#vyjwvxmycs .gt_column_spanner_outer:last-child {
padding-right: 0;
}
#vyjwvxmycs .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
#vyjwvxmycs .gt_spanner_row {
border-bottom-style: hidden;
}
#vyjwvxmycs .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
#vyjwvxmycs .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
#vyjwvxmycs .gt_from_md > :first-child {
margin-top: 0;
}
#vyjwvxmycs .gt_from_md > :last-child {
margin-bottom: 0;
}
#vyjwvxmycs .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
#vyjwvxmycs .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
#vyjwvxmycs .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
#vyjwvxmycs .gt_row_group_first td {
border-top-width: 2px;
}
#vyjwvxmycs .gt_row_group_first th {
border-top-width: 2px;
}
#vyjwvxmycs .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
#vyjwvxmycs .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
#vyjwvxmycs .gt_first_summary_row.thick {
border-top-width: 2px;
}
#vyjwvxmycs .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
#vyjwvxmycs .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
#vyjwvxmycs .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
#vyjwvxmycs .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
#vyjwvxmycs .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
#vyjwvxmycs .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
#vyjwvxmycs .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
#vyjwvxmycs .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
#vyjwvxmycs .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
#vyjwvxmycs .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
#vyjwvxmycs .gt_left {
text-align: left;
}
#vyjwvxmycs .gt_center {
text-align: center;
}
#vyjwvxmycs .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
#vyjwvxmycs .gt_font_normal {
font-weight: normal;
}
#vyjwvxmycs .gt_font_bold {
font-weight: bold;
}
#vyjwvxmycs .gt_font_italic {
font-style: italic;
}
#vyjwvxmycs .gt_super {
font-size: 65%;
}
#vyjwvxmycs .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
#vyjwvxmycs .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
#vyjwvxmycs .gt_indent_1 {
text-indent: 5px;
}
#vyjwvxmycs .gt_indent_2 {
text-indent: 10px;
}
#vyjwvxmycs .gt_indent_3 {
text-indent: 15px;
}
#vyjwvxmycs .gt_indent_4 {
text-indent: 20px;
}
#vyjwvxmycs .gt_indent_5 {
text-indent: 25px;
}
#vyjwvxmycs .katex-display {
display: inline-flex !important;
margin-bottom: 0.75em !important;
}
#vyjwvxmycs div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
height: 0px !important;
}
</style>
<table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
  <thead>
    <tr class="gt_col_headings gt_spanner_row">
      <th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="2" colspan="1" scope="col" id="label"></th>
      <th class="gt_center gt_columns_top_border gt_column_spanner_outer" rowspan="1" colspan="2" scope="colgroup" id="**Strand**">
        <div class="gt_column_spanner"><span class="gt_from_md"><strong>Strand</strong></span></div>
      </th>
      <th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="2" colspan="1" scope="col" id="stat_0"><span class="gt_from_md"><strong>Total</strong></span></th>
    </tr>
    <tr class="gt_col_headings">
      <th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="stat_1"><span class="gt_from_md">Non-STEM</span></th>
      <th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="stat_2"><span class="gt_from_md">STEM</span></th>
    </tr>
  </thead>
  <tbody class="gt_table_body">
    <tr><td headers="label" class="gt_row gt_left" style="font-weight: bold;">Sex</td>
<td headers="stat_1" class="gt_row gt_center"><br /></td>
<td headers="stat_2" class="gt_row gt_center"><br /></td>
<td headers="stat_0" class="gt_row gt_center"><br /></td></tr>
    <tr><td headers="label" class="gt_row gt_left">    Female</td>
<td headers="stat_1" class="gt_row gt_center">882 (80%)</td>
<td headers="stat_2" class="gt_row gt_center">223 (20%)</td>
<td headers="stat_0" class="gt_row gt_center">1,105 (100%)</td></tr>
    <tr><td headers="label" class="gt_row gt_left">    Male</td>
<td headers="stat_1" class="gt_row gt_center">503 (78%)</td>
<td headers="stat_2" class="gt_row gt_center">142 (22%)</td>
<td headers="stat_0" class="gt_row gt_center">645 (100%)</td></tr>
    <tr><td headers="label" class="gt_row gt_left" style="font-weight: bold;">Total</td>
<td headers="stat_1" class="gt_row gt_center">1,385 (79%)</td>
<td headers="stat_2" class="gt_row gt_center">365 (21%)</td>
<td headers="stat_0" class="gt_row gt_center">1,750 (100%)</td></tr>
  </tbody>
  <tfoot class="gt_sourcenotes">
    <tr>
      <td class="gt_sourcenote" colspan="4"><span class="gt_from_md">Pearson’s Chi-squared test, p=0.4</span></td>
    </tr>
  </tfoot>
  
</table>
</div>
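
As a rough cross-check, the counts displayed in the table above can be passed to base R's `chisq.test()`. This is only a sketch: the matrix `sex_by_strand` is typed in by hand from the displayed cells, and it assumes that the reported p-value comes from a Pearson test without continuity correction.

``` r
# Counts copied by hand from the cross-tabulation above
# (rows: Sex, columns: Strand); the object name is hypothetical
sex_by_strand <- matrix(c(882, 223,
                          503, 142),
                        nrow = 2, byrow = TRUE,
                        dimnames = list(Sex = c("Female", "Male"),
                                        Strand = c("Non-STEM", "STEM")))

# Pearson's chi-squared test of independence;
# correct = FALSE switches off Yates' continuity correction (an assumption)
chi_res <- chisq.test(sex_by_strand, correct = FALSE)

chi_res$statistic  # chi-squared statistic
chi_res$p.value    # compare with the p-value in the source note
```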
---

# Descriptive statistics: numerical summaries

``` r
etsdata %>%
  select(Age, Sex) %>% 
  group_by(Sex) %>% 
  drop_na(Age) %>% 
  summarize(N=n(),
            Mean=mean(Age),
            Median=median(Age),
            SD=sd(Age),
            Min=min(Age),
            Max=max(Age))
```

```
## # A tibble: 2 × 7
##   Sex        N  Mean Median    SD   Min   Max
##   <chr>  <int> <dbl>  <dbl> <dbl> <dbl> <dbl>
## 1 Female  1104  20.6     20  1.75    18    40
## 2 Male     645  20.8     21  1.81    17    31
```

---
# Descriptive statistics: numerical summaries

``` r
etsdata %>%
  select(HHsize, Age, GWA) %>% 
  drop_na(Age,GWA) %>% 
  tbl_summary(by = HHsize,
              include = c(Age,GWA),
              statistic = list(all_continuous() ~ "{mean} ({sd})"))
```

<div id="pkdczdujff" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
<style>#pkdczdujff table {
font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
#pkdczdujff thead, #pkdczdujff tbody, #pkdczdujff tfoot, #pkdczdujff tr, #pkdczdujff td, #pkdczdujff th {
border-style: none;
}
#pkdczdujff p {
margin: 0;
padding: 0;
}
#pkdczdujff .gt_table {
display: table;
border-collapse: collapse;
line-height: normal;
margin-left: auto;
margin-right: auto;
color: #333333;
font-size: 16px;
font-weight: normal;
font-style: normal;
background-color: #FFFFFF;
width: auto;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #A8A8A8;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #A8A8A8;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
}
#pkdczdujff .gt_caption {
padding-top: 4px;
padding-bottom: 4px;
}
#pkdczdujff .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
border-bottom-color: #FFFFFF;
border-bottom-width: 0;
}
#pkdczdujff .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
padding-top: 3px;
padding-bottom: 5px;
padding-left: 5px;
padding-right: 5px;
border-top-color: #FFFFFF;
border-top-width: 0;
}
#pkdczdujff .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
#pkdczdujff .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
#pkdczdujff .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
}
#pkdczdujff .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 6px;
padding-left: 5px;
padding-right: 5px;
overflow-x: hidden;
}
#pkdczdujff .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: normal;
text-transform: inherit;
padding-top: 0;
padding-bottom: 0;
padding-left: 4px;
padding-right: 4px;
}
#pkdczdujff .gt_column_spanner_outer:first-child {
padding-left: 0;
}
#pkdczdujff .gt_column_spanner_outer:last-child {
padding-right: 0;
}
#pkdczdujff .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: bottom;
padding-top: 5px;
padding-bottom: 5px;
overflow-x: hidden;
display: inline-block;
width: 100%;
}
#pkdczdujff .gt_spanner_row {
border-bottom-style: hidden;
}
#pkdczdujff .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
text-align: left;
}
#pkdczdujff .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
vertical-align: middle;
}
#pkdczdujff .gt_from_md > :first-child {
margin-top: 0;
}
#pkdczdujff .gt_from_md > :last-child {
margin-bottom: 0;
}
#pkdczdujff .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
margin: 10px;
border-top-style: solid;
border-top-width: 1px;
border-top-color: #D3D3D3;
border-left-style: none;
border-left-width: 1px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 1px;
border-right-color: #D3D3D3;
vertical-align: middle;
overflow-x: hidden;
}
#pkdczdujff .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
}
#pkdczdujff .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
font-weight: initial;
text-transform: inherit;
border-right-style: solid;
border-right-width: 2px;
border-right-color: #D3D3D3;
padding-left: 5px;
padding-right: 5px;
vertical-align: top;
}
#pkdczdujff .gt_row_group_first td {
border-top-width: 2px;
}
#pkdczdujff .gt_row_group_first th {
border-top-width: 2px;
}
#pkdczdujff .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
#pkdczdujff .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
#pkdczdujff .gt_first_summary_row.thick {
border-top-width: 2px;
}
#pkdczdujff .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
#pkdczdujff .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
}
#pkdczdujff .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-top-style: double;
border-top-width: 6px;
border-top-color: #D3D3D3;
}
#pkdczdujff .gt_last_grand_summary_row_top {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
padding-right: 5px;
border-bottom-style: double;
border-bottom-width: 6px;
border-bottom-color: #D3D3D3;
}
#pkdczdujff .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
#pkdczdujff .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
#pkdczdujff .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
#pkdczdujff .gt_footnote {
margin: 0px;
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
#pkdczdujff .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
border-left-style: none;
border-left-width: 2px;
border-left-color: #D3D3D3;
border-right-style: none;
border-right-width: 2px;
border-right-color: #D3D3D3;
}
#pkdczdujff .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
padding-left: 5px;
padding-right: 5px;
}
#pkdczdujff .gt_left {
text-align: left;
}
#pkdczdujff .gt_center {
text-align: center;
}
#pkdczdujff .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
#pkdczdujff .gt_font_normal {
font-weight: normal;
}
#pkdczdujff .gt_font_bold {
font-weight: bold;
}
#pkdczdujff .gt_font_italic {
font-style: italic;
}
#pkdczdujff .gt_super {
font-size: 65%;
}
#pkdczdujff .gt_footnote_marks {
font-size: 75%;
vertical-align: 0.4em;
position: initial;
}
#pkdczdujff .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
#pkdczdujff .gt_indent_1 {
text-indent: 5px;
}
#pkdczdujff .gt_indent_2 {
text-indent: 10px;
}
#pkdczdujff .gt_indent_3 {
text-indent: 15px;
}
#pkdczdujff .gt_indent_4 {
text-indent: 20px;
}
#pkdczdujff .gt_indent_5 {
text-indent: 25px;
}
#pkdczdujff .katex-display {
display: inline-flex !important;
margin-bottom: 0.75em !important;
}
#pkdczdujff div.Reactable > div.rt-table > div.rt-thead > div.rt-tr.rt-tr-group-header > div.rt-th-group:after {
height: 0px !important;
}
</style>
<table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
  <thead>
    <tr class="gt_col_headings">
      <th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" scope="col" id="label"><span class="gt_from_md"><strong>Characteristic</strong></span></th>
      <th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="stat_1"><span class="gt_from_md"><strong>Large</strong><br />
N = 335</span><span class="gt_footnote_marks" style="white-space:nowrap;font-style:italic;font-weight:normal;line-height:0;"><sup>1</sup></span></th>
      <th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="stat_2"><span class="gt_from_md"><strong>Medium</strong><br />
N = 1,062</span><span class="gt_footnote_marks" style="white-space:nowrap;font-style:italic;font-weight:normal;line-height:0;"><sup>1</sup></span></th>
      <th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1" scope="col" id="stat_3"><span class="gt_from_md"><strong>Small</strong><br />
N = 174</span><span class="gt_footnote_marks" style="white-space:nowrap;font-style:italic;font-weight:normal;line-height:0;"><sup>1</sup></span></th>
    </tr>
  </thead>
  <tbody class="gt_table_body">
    <tr><td headers="label" class="gt_row gt_left">Age</td>
<td headers="stat_1" class="gt_row gt_center">20.83 (1.71)</td>
<td headers="stat_2" class="gt_row gt_center">20.52 (1.77)</td>
<td headers="stat_3" class="gt_row gt_center">20.62 (1.55)</td></tr>
    <tr><td headers="label" class="gt_row gt_left">GWA</td>
<td headers="stat_1" class="gt_row gt_center">89.6 (4.2)</td>
<td headers="stat_2" class="gt_row gt_center">90.2 (4.1)</td>
<td headers="stat_3" class="gt_row gt_center">90.0 (4.2)</td></tr>
  </tbody>
  
  <tfoot class="gt_footnotes">
    <tr>
      <td class="gt_footnote" colspan="4"><span class="gt_footnote_marks" style="white-space:nowrap;font-style:italic;font-weight:normal;line-height:0;"><sup>1</sup></span> <span class="gt_from_md">Mean (SD)</span></td>
    </tr>
  </tfoot>
</table>
</div>

---

# Quick review of hypothesis testing

- **Testing hypotheses**: a procedure used to decide which of two competing hypotheses is consistent with the data observed in a random sample

   - **Null hypothesis** (`$H_0$`): hypothesis indicating no "effect" (no change, no improvement, no correlation)
   
   - **Alternative hypothesis** (`$H_1$`): researcher's hypothesis that indicates an "effect"
   
   - **Test statistic**: a summary of the observed data in the random sample that is used as evidence *for* or *against* the null hypothesis
   
   - **Level of significance**: the probability of wrongly rejecting a true null hypothesis (typically `$\alpha = 0.01$` or `$\mathbf{0.05}$`)
   
   - **p-value**: the probability that the observed results (or more extreme results) would occur **IF** the null hypothesis were true
   
      - Smaller p-values indicate *disagreement* between the observed data and the null hypothesis: **Reject `$H_0$` if p-value `$\leq \alpha$`**
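
The decision rule can be sketched in a few lines of base R. The data here are simulated for illustration only (the vectors `group_a` and `group_b` are hypothetical and are not the workshop data), and Welch's t test is used simply as one convenient example of a test.

``` r
# Simulated scores for two hypothetical groups (not the workshop data)
set.seed(123)
group_a <- rnorm(50, mean = 90, sd = 4)
group_b <- rnorm(50, mean = 88, sd = 4)

alpha <- 0.05  # level of significance chosen before looking at the data

# H0: the two population means are equal; H1: they differ
test_result <- t.test(group_a, group_b)

test_result$statistic  # test statistic (t)
test_result$p.value    # p-value

# Decision rule: reject H0 if p-value <= alpha
reject_h0 <- test_result$p.value <= alpha
reject_h0
```

The value of `reject_h0` is `TRUE` when the data disagree with `$H_0$` strongly enough at the chosen `$\alpha$`, and `FALSE` otherwise.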

---

# Statistical tests on means

- Tests on means of two independent groups

   - <tt>Student's t test</tt>: *normal distributions with equal variances*
   
   - <tt>Welch's t test</tt>: *normal distributions with unequal variances*
   
   - <tt>Mann-Whitney U test</tt>: *non-normal distributions* (a rank-based alternative)

- Tests on means of two matched/paired groups

   - <tt>Paired t test</tt>: *normally distributed pairwise differences*
   
   - <tt>Wilcoxon signed-rank test</tt>: *non-normal pairwise differences*
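
In base R, each of the tests listed above corresponds to a call to `t.test()` or `wilcox.test()`. The sketch below uses simulated vectors (`x`, `y`, `before`, and `after` are hypothetical and are not the workshop data) only to show which arguments select which test.

``` r
set.seed(2024)

# Two independent groups (simulated)
x <- rnorm(30, mean = 90, sd = 4)
y <- rnorm(30, mean = 88, sd = 6)

student      <- t.test(x, y, var.equal = TRUE)  # Student's t test
welch        <- t.test(x, y)                    # Welch's t test (the default)
mann_whitney <- wilcox.test(x, y)               # Mann-Whitney U (rank sum) test

# Two paired measurements on the same units (simulated)
before <- rnorm(30, mean = 85, sd = 5)
after  <- before + rnorm(30, mean = 1, sd = 2)

paired_t    <- t.test(before, after, paired = TRUE)       # paired t test
signed_rank <- wilcox.test(before, after, paired = TRUE)  # Wilcoxon signed-rank test

# Collect the p-values side by side
p_values <- sapply(list(student = student, welch = welch,
                        mann_whitney = mann_whitney,
                        paired_t = paired_t, signed_rank = signed_rank),
                   function(res) res$p.value)
p_values
```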
---
# Statistical tests on means

``` r
etsdata %>%
  select(Strand, GWA) %>% 
  drop_na(GWA) %>% 
  tbl_summary(by = Strand,
              include = GWA,
              statistic = list(all_continuous() ~ "{mean} ({sd})"))
```

| **Characteristic** | **Non-STEM**, N = 1,229 | **STEM**, N = 342 |
|:---|:---:|:---:|
| GWA | 89.5 (4.1) | 92.1 (3.2) |

Values are Mean (SD).

- `$H_0: \mu_S = \mu_N$`, where S = STEM and N = Non-STEM

- `$H_1: \mu_S > \mu_N$`
---
# Statistical tests on means

``` r
# Test of Normality
etsdata %>% 
  select(Strand, GWA) %>% 
  group_by(Strand) %>% 
  shapiro_test(GWA)
```

```
## # A tibble: 2 × 4
##   Strand   variable statistic        p
##   <chr>    <chr>        <dbl>    <dbl>
## 1 Non-STEM GWA          0.965 1.06e-16
## 2 STEM     GWA          0.962 8.25e- 8
```

``` r
# Test of Equal Variance
etsdata %>% 
  select(Strand, GWA) %>% 
  levene_test(GWA ~ Strand)
```

```
## # A tibble: 1 × 4
##     df1   df2 statistic        p
##   <int> <int>     <dbl>    <dbl>
## 1     1  1569      12.8 0.000352
```

---
# Statistical tests on means

``` r
etsdata %>% 
  select(Strand, GWA) %>% 
  drop_na(GWA) %>% 
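  # Groups are ordered alphabetically (Non-STEM, STEM), so "less" tests
  # whether Non-STEM GWA tends to be lower than STEM GWA (H1: mu_S > mu_N)
  wilcox.test(GWA ~ Strand, 
         data = .,
         alternative = "less")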
```

```
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  GWA by Strand
## W = 129227, p-value < 2.2e-16
## alternative hypothesis: true location shift is less than 0
```

---
# Statistical tests on means

``` r
etsdata %>%
  select(Strand, Age, GWA) %>% 
  drop_na(Age, GWA) %>% 
  tbl_summary(by = Strand,
              include = c(Age, GWA),
              statistic = list(all_continuous() ~ "{mean} ({sd})")) %>% 
  add_p(test = list(all_continuous() ~ "wilcox.test"))
```

| **Characteristic** | **Non-STEM**, N = 1,229 | **STEM**, N = 342 | **p-value** |
|:---|:---:|:---:|:---:|
| Age | 20.65 (1.85) | 20.41 (1.27) | 0.11 |
| GWA | 89.5 (4.1) | 92.1 (3.2) | <0.001 |

Values are Mean (SD); p-values are from the Wilcoxon rank sum test.

---
# Statistical tests on means

``` r
etsdata %>% 
  select(Strand, GWA) %>% 
  ggbetweenstats(x = Strand,
                 y = GWA,
                 violin.args = list(width = 0),
                 type = "nonparametric",
                 bayes.args = list(width=0))
```

---
# Statistical tests on means
<img src="Day-1--Slide-presentation_files/figure-html/unnamed-chunk-39-1.png" width="65%" style="display: block; margin: auto;" />

---
# Statistical tests on means
.pull-left[

``` r
etsdata %>% 
  select(HHsize, GWA) %>% 
  ggbetweenstats(x = HHsize,
                 y = GWA,
                 violin.args = list(width = 0),
                 type = "nonparametric",
                 bayes.args = list(width=0))
```
]

.pull-right[
![](Day-1--Slide-presentation_files/figure-html/unnamed-chunk-41-1.png)
]
---
# Correlation analysis: basic ideas

- Correlation analysis examines the linear relationship between two or more variables

- It is used to determine the strength and direction, as well as statistical significance, of the correlation between variables

- The correlation between two variables could be positive or negative

- Positive correlation: `$X\uparrow$` and `$Y\uparrow$` or `$X\downarrow$` and `$Y\downarrow$`

- Negative correlation: `$X\uparrow$` and `$Y\downarrow$` or `$X\downarrow$` and `$Y\uparrow$`
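
A minimal sketch of the two directions, using made-up vectors: the sign of the value returned by `cor()` gives the direction of the relationship.

``` r
x <- 1:10
y_up   <- 2 * x + 1     # Y rises as X rises
y_down <- 20 - 3 * x    # Y falls as X rises

cor(x, y_up)            # positive correlation
cor(x, y_down)          # negative correlation
```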

---

# Correlation analysis: scatter plot
.pull-left[
- It plots each observation as a point, with the x-values on the X-axis and the y-values on the Y-axis

- It is a visual representation of the relationship of X and Y

- Also known as *scatter diagram*
]

.pull-right[
<img src="Day-1--Slide-presentation_files/figure-html/unnamed-chunk-42-1.png" width="100%" />
]

---
# Correlation analysis: correlation coefficient

- measures the strength or magnitude of the correlation between the variables

   - **Pearson r**: both variables are measured on at least an interval scale; bivariate normal distribution
   
   - **Spearman rho**: both variables are measured on at least an ordinal scale; non-normal data
   
   - **Point-biserial**: one variable is binary and the other is interval or ratio
   
   - **Rank-biserial**: one variable is binary and the other is ordinal

- the value of a correlation coefficient ranges from -1 to +1
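
As an illustration, some of these coefficients can be computed with `cor()`. The sketch uses the built-in `mtcars` data rather than the workshop data, and it treats the point-biserial coefficient as Pearson's r computed with a 0/1 variable.

``` r
# mtcars ships with R; it is used here only as a convenient example
pearson  <- cor(mtcars$mpg, mtcars$wt, method = "pearson")
spearman <- cor(mtcars$mpg, mtcars$wt, method = "spearman")

# Point-biserial: Pearson r where one variable is binary
# (am: 0 = automatic, 1 = manual transmission)
point_biserial <- cor(mtcars$am, mtcars$mpg)

c(pearson = pearson, spearman = spearman, point_biserial = point_biserial)
```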

---

# Correlation analysis: correlation coefficient

- a correlation coefficient of zero indicates that the variables are NOT LINEARLY related; it does not by itself imply that the variables are independent
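
A small made-up example of this point: below, `y` is completely determined by `x`, yet the Pearson correlation is zero because the relationship is not linear.

``` r
x <- -3:3
y <- x^2          # y depends entirely on x, but not linearly

r <- cor(x, y)    # Pearson correlation
r
```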

---
# Correlation analysis: test of significance

- `$H_0$`: Correlation coefficient is equal to zero. (There is no linear relationship between the variables.) `$\Longrightarrow H_0: \rho = 0$`
   
- `$H_1$`: Correlation coefficient is not equal to zero. (There is a linear relationship between the variables.) `$\Longrightarrow H_1: \rho \neq 0$`

- Test statistic:

$$
t = \frac{r \sqrt{n-2}}{\sqrt{1-r^2}}
$$

- Reject `$H_0$` if the p-value associated with `$t$` (from a t distribution with `$n-2$` degrees of freedom) is less than or equal to the significance level (`$\alpha$`)
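
To see how the formula relates to R's output, the sketch below computes `$t$` by hand for Pearson's r and compares it with `cor.test()`. It uses the built-in `mtcars` data as a stand-in; the variable names are illustrative.

``` r
x <- mtcars$wt
y <- mtcars$mpg
n <- length(x)

r <- cor(x, y)                                   # Pearson r
t_manual <- r * sqrt(n - 2) / sqrt(1 - r^2)      # test statistic from the formula
p_manual <- 2 * pt(abs(t_manual), df = n - 2,    # two-sided p-value
                   lower.tail = FALSE)

ct <- cor.test(x, y)                             # built-in test (Pearson by default)

all.equal(t_manual, unname(ct$statistic))        # same t
all.equal(p_manual, ct$p.value)                  # same p-value
```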

---
# Correlation analysis

``` r
shapiro.test(College$Top10perc)
```

```
## 
##  Shapiro-Wilk normality test
## 
## data:  College$Top10perc
## W = 0.88742, p-value < 2.2e-16
```

``` r
shapiro.test(College$Grad.Rate)
```

```
## 
##  Shapiro-Wilk normality test
## 
## data:  College$Grad.Rate
## W = 0.9948, p-value = 0.009424
```
---
# Correlation analysis

``` r
cor.test(x = College$Top10perc,
         y = College$Grad.Rate,
         method = "spearman")
```

```
## 
##  Spearman's rank correlation rho
## 
## data:  College$Top10perc and College$Grad.Rate
## S = 40500256, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##       rho 
## 0.4819798
```
---

# Correlation analysis

``` r
College %>% 
  select(Top10perc, Grad.Rate) %>% 
  ggscatterstats(x = Top10perc, y = Grad.Rate, type = "nonparametric",
                 bf.message = FALSE)
```

---
# Correlation analysis: visualization

- <tt>corrplot()</tt> from the **corrplot** package

- <tt>ggpairs()</tt> from the **GGally** package

- <tt>ggcorr()</tt> also from the **GGally** package

- <tt>pairs.panels()</tt> from the **psych** package

---
# Correlation analysis

``` r
College %>% 
  select(Top10perc, PhD, S.F.Ratio, Expend, Grad.Rate) %>% 
  cor() %>% 
  corrplot(type = "lower", tl.cex = .75, tl.col = "black")
```

---
# Correlation analysis

``` r
College %>% 
  select(Private, Top10perc, PhD, S.F.Ratio, Expend, Grad.Rate) %>% 
  mutate(Private = recode(Private, "No" = "Public", "Yes" = "Private")) %>% 
  ggpairs(columns = 2:6, aes(colour = Private))
```