Ruoxi_Lin_Homework

packages <- c("tidyverse", "infer", "fst", "modelsummary", "broom") 

new_packages <- packages[!(packages %in% installed.packages()[,"Package"])]
if(length(new_packages)) install.packages(new_packages)

lapply(packages, library, character.only = TRUE)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

## [[1]]
##  [1] "lubridate" "forcats"   "stringr"   "dplyr"     "purrr"     "readr"    
##  [7] "tidyr"     "tibble"    "ggplot2"   "tidyverse" "stats"     "graphics" 
## [13] "grDevices" "utils"     "datasets"  "methods"   "base"     
## 
## [[2]]
##  [1] "infer"     "lubridate" "forcats"   "stringr"   "dplyr"     "purrr"    
##  [7] "readr"     "tidyr"     "tibble"    "ggplot2"   "tidyverse" "stats"    
## [13] "graphics"  "grDevices" "utils"     "datasets"  "methods"   "base"     
## 
## [[3]]
##  [1] "fst"       "infer"     "lubridate" "forcats"   "stringr"   "dplyr"    
##  [7] "purrr"     "readr"     "tidyr"     "tibble"    "ggplot2"   "tidyverse"
## [13] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"  
## [19] "base"     
## 
## [[4]]
##  [1] "modelsummary" "fst"          "infer"        "lubridate"    "forcats"     
##  [6] "stringr"      "dplyr"        "purrr"        "readr"        "tidyr"       
## [11] "tibble"       "ggplot2"      "tidyverse"    "stats"        "graphics"    
## [16] "grDevices"    "utils"        "datasets"     "methods"      "base"        
## 
## [[5]]
##  [1] "broom"        "modelsummary" "fst"          "infer"        "lubridate"   
##  [6] "forcats"      "stringr"      "dplyr"        "purrr"        "readr"       
## [11] "tidyr"        "tibble"       "ggplot2"      "tidyverse"    "stats"       
## [16] "graphics"     "grDevices"    "utils"        "datasets"     "methods"     
## [21] "base"

ess <- read_fst("All-ESS-Data8.fst")

Task 1

belgium_data <- ess %>%
  filter(cntry == "BE", wrkprty %in% c(1, 2)) %>%
  mutate(
    trstep = as.numeric(trstep),
    wrkprty = factor(case_when(
      wrkprty == 1 ~ "Yes",
      wrkprty == 2 ~ "No"
    ))
  )

model <- lm(trstep ~ wrkprty, data = belgium_data)

coefficients <- coef(model)
print(coefficients)

## (Intercept)  wrkprtyYes 
##   7.5491077  -0.9610318

(Intercept) 7.5491077:

The intercept represents the predicted value of the outcome variable (in this case, Trust in the European Parliament, 0-10) when the explanatory variable is zero or, in the case of categorical variables, at their reference level.

Put differently, for respondents in the reference category of wrkprty (“No”), the expected value of trstep is about 7.55.

(wrkprtyYes) -0.9610318:

This coefficient represents the difference in the predicted value of trust in the European Parliament between the “Yes” category and the reference category (“No”) for the wrkprty variable.

Specifically, having worked in a political party (“Yes”) is associated with predicted trust in the European Parliament that is about 0.96 points lower than for those who have not (“No”).

In summary:

R orders factor levels alphabetically, so “No” is the reference category. If a person has not worked in a political party (“No”), their predicted average trust in the European Parliament is 7.5491077. If a person has worked in a political party (“Yes”), their predicted average trust is approximately 6.5880759 (7.5491077 - 0.9610318).
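As a quick check on this reading of the coefficients, the sketch below uses a small made-up data frame (hypothetical values, not the ESS sample) to show that, with a single two-level predictor, the intercept equals the mean of the reference group and the slope equals the difference between the two group means.

```r
# Hypothetical toy data, not the ESS sample: trust scores (0-10) for
# respondents who did ("Yes") or did not ("No") work in a political party.
toy <- data.frame(
  trust   = c(4, 6, 5, 7, 3, 5),
  wrkprty = factor(c("No", "No", "No", "No", "Yes", "Yes"))
)

fit <- lm(trust ~ wrkprty, data = toy)
b   <- coef(fit)

# Group means computed directly from the data
mean_no  <- mean(toy$trust[toy$wrkprty == "No"])
mean_yes <- mean(toy$trust[toy$wrkprty == "Yes"])

# Intercept should match the mean of the reference group ("No")
all.equal(as.numeric(b["(Intercept)"]), mean_no)

# Slope should match mean("Yes") - mean("No")
all.equal(as.numeric(b["wrkprtyYes"]), mean_yes - mean_no)
```

The same relationship is what allows the predicted value for the “Yes” group to be read off as the intercept plus the wrkprtyYes coefficient.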

Task 2

bulgaria_data <- ess %>%
  # Keep Bulgarian respondents with a valid answer to "born in country"
  filter(cntry == "BG", brncntr %in% c(1, 2)) %>%
  mutate(
    stfdem = as.numeric(stfdem),
    # brncntr: 1 = born in country, 2 = not born in country
    native = factor(case_when(
      brncntr == 1 ~ "Yes",
      brncntr == 2 ~ "No"
    ))
  )

model <- lm(stfdem ~ native, data = bulgaria_data)
summary(model)

## 
## Call:
## lm(formula = stfdem ~ native, data = bulgaria_data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -8.704 -7.704 -5.704 -3.704 90.296 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   8.5385     2.1074   4.052 5.12e-05 ***
## nativeYes     0.1654     2.1158   0.078    0.938    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 21.49 on 13236 degrees of freedom
## Multiple R-squared:  4.615e-07,  Adjusted R-squared:  -7.509e-05 
## F-statistic: 0.006108 on 1 and 13236 DF,  p-value: 0.9377

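In this model, “No” (not born in Bulgaria) is the reference category, so the expected average level of satisfaction with democracy among non-Bulgarian-born respondents is the intercept, 8.5385. For Bulgarian-born respondents it is 8.5385 + 0.1654 = 8.7039. The difference between the two groups is not statistically significant (p = 0.938).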
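One caveat about these estimates: the largest residual in the summary is 90.296, which together with a fitted value of about 8.704 implies an observed stfdem of 99, well outside the 0-10 scale. This suggests that non-response codes were still in the data when the model was fitted. The sketch below uses made-up values and assumes that 77, 88 and 99 are the missing-data codes for this item. It shows one way to set such codes to NA before modelling, and how much they can distort the fit.

```r
# Hypothetical toy responses; 77, 88 and 99 are assumed missing-data codes
toy <- data.frame(
  stfdem = c(3, 5, 99, 4, 88, 6, 2, 77),
  native = factor(c("Yes", "Yes", "Yes", "Yes", "No", "No", "No", "No"))
)

# Treat anything outside the 0-10 scale as missing before modelling
toy$stfdem_clean <- ifelse(toy$stfdem %in% 0:10, toy$stfdem, NA)

fit_raw   <- lm(stfdem ~ native, data = toy)        # missing codes left in
fit_clean <- lm(stfdem_clean ~ native, data = toy)  # lm() drops NA rows by default

max(resid(fit_raw))    # large, driven by the out-of-range codes
max(resid(fit_clean))  # stays within the span of the 0-10 scale
```

If the same recoding were applied to the ESS data, the coefficients reported above would change, so they are best treated as provisional.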

Task 3

remotes::install_github("datalorax/equatiomatic")

## Skipping install of 'equatiomatic' from a github remote, the SHA1 (29ff168f) has not changed since last install.
##   Use `force = TRUE` to force installation

equatiomatic::extract_eq(model, use_coefs = TRUE)

\[ \operatorname{\widehat{stfdem}} = 8.54 + 0.17(\operatorname{native}_{\operatorname{Yes}}) \]
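To make the equation concrete, the two possible values of the dummy variable can be substituted in. The unrounded coefficients from the summary output are used here, so the second result differs slightly from what the rounded equation would give (8.54 + 0.17 = 8.71).

\[ \operatorname{native}_{\operatorname{Yes}} = 0: \quad \operatorname{\widehat{stfdem}} = 8.5385 + 0.1654(0) = 8.5385 \]

\[ \operatorname{native}_{\operatorname{Yes}} = 1: \quad \operatorname{\widehat{stfdem}} = 8.5385 + 0.1654(1) = 8.7039 \]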

Ruoxi_Lin_Homework_8

2023-11-19
