packages <- c("tidyverse", "infer", "fst", "modelsummary", "broom") 

new_packages <- packages[!(packages %in% installed.packages()[,"Package"])]
if(length(new_packages)) install.packages(new_packages)

# invisible() suppresses the list of attached packages that lapply() would otherwise print
invisible(lapply(packages, library, character.only = TRUE))
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
ess <- read_fst("All-ESS-Data8.fst")

Task 1

belgium_data <- ess %>%
  filter(cntry == "BE", wrkprty %in% c(1, 2)) %>%
  mutate(
    trstep = as.numeric(trstep),
    wrkprty = factor(case_when(
      wrkprty == 1 ~ "Yes",
      wrkprty == 2 ~ "No"
    ))
  )

model <- lm(trstep ~ wrkprty, data = belgium_data)

coefficients <- coef(model)
print(coefficients)
## (Intercept)  wrkprtyYes 
##   7.5491077  -0.9610318

(Intercept) 7.5491077:

The intercept is the predicted value of the outcome variable (here, trust in the European Parliament, measured on a 0-10 scale) when the explanatory variable is zero or, for a categorical predictor, at its reference level.

The reference level of wrkprty is “No” (factor levels are ordered alphabetically, and the coefficient is labelled wrkprtyYes), so the expected value of trstep for respondents who answered “No” is approximately 7.5491077.

(wrkprtyYes) -0.9610318:

This coefficient is the difference in predicted trust in the European Parliament between the “Yes” category and the reference category (“No”) of wrkprty.

Specifically, answering “Yes” is associated with a predicted trust score about 0.9610318 points lower than answering “No”.

In summary:

If a person has not worked in a political party (“No”, the reference category), their predicted average trust in the European Parliament is 7.5491077. If a person has worked in a political party (“Yes”), their predicted average trust in the European Parliament is approximately 6.5880759 (that is, 7.5491077 - 0.9610318).
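
The reading of the two coefficients can be checked on a small made-up data set. With a single two-level factor as the predictor, the intercept from lm() equals the mean of the reference group and the slope equals the difference between the two group means. The data frame `toy`, its column names, and its values are invented for illustration and are not ESS data.

```r
# Hypothetical toy data (not ESS): trust scores for two groups
toy <- data.frame(
  trust  = c(6, 7, 8, 4, 5, 6),
  worked = factor(c("No", "No", "No", "Yes", "Yes", "Yes"))
)

# "No" sorts before "Yes", so it becomes the reference level
fit <- lm(trust ~ worked, data = toy)

# Group means computed directly, for comparison with the coefficients
group_means <- tapply(toy$trust, toy$worked, mean)

# Intercept should equal the mean of the reference group ("No")
intercept_matches <- isTRUE(all.equal(
  coef(fit)[["(Intercept)"]],
  group_means[["No"]]
))

# Slope should equal mean("Yes") - mean("No")
slope_matches <- isTRUE(all.equal(
  coef(fit)[["workedYes"]],
  group_means[["Yes"]] - group_means[["No"]]
))

print(coef(fit))
print(group_means)
print(c(intercept_matches, slope_matches))
```

The same logic gives the “Yes” prediction in the summary above: intercept plus the wrkprtyYes coefficient.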

Task 2

bulgaria_data <- ess %>%
  # keep Bulgaria and only the valid answers to brncntr (1 = Yes, 2 = No)
  filter(cntry == "BG", brncntr %in% c(1, 2)) %>%
  mutate(
    stfdem = as.numeric(stfdem),
    # native: born in the country ("Yes") or not ("No"); other codes were
    # already dropped by filter(), so no further recoding is needed
    native = factor(case_when(
      brncntr == 1 ~ "Yes",
      brncntr == 2 ~ "No"
    ))
  )

model <- lm(stfdem ~ native, data = bulgaria_data)
summary(model)
## 
## Call:
## lm(formula = stfdem ~ native, data = bulgaria_data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -8.704 -7.704 -5.704 -3.704 90.296 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   8.5385     2.1074   4.052 5.12e-05 ***
## nativeYes     0.1654     2.1158   0.078    0.938    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 21.49 on 13236 degrees of freedom
## Multiple R-squared:  4.615e-07,  Adjusted R-squared:  -7.509e-05 
## F-statistic: 0.006108 on 1 and 13236 DF,  p-value: 0.9377

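In this model the reference category of native is “No”, so the intercept, 8.5385, is the expected average level of satisfaction with democracy among respondents not born in Bulgaria. For respondents born in Bulgaria the expected average is 8.5385 + 0.1654 = 8.7039. The difference of 0.1654 is small relative to its standard error (p = 0.938).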
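
The residuals reported above reach 90.296, which with a fitted value of about 8.704 corresponds to an observed stfdem of 99, far outside a 0-10 scale. This suggests that non-response codes are still present in the outcome. The following is a minimal sketch of setting such values to NA before fitting. It assumes the codes are 77, 88 and 99, which should be checked against the ESS codebook, and the data frame `toy` is invented for illustration.

```r
# Hypothetical toy data (not ESS); 77, 88 and 99 stand in for non-response codes
toy <- data.frame(
  stfdem = c(3, 5, 99, 4, 88, 6, 2, 77),
  native = factor(c("Yes", "Yes", "Yes", "No", "No", "No", "Yes", "No"))
)

# Assumed missing-value codes: confirm against the codebook before reusing
missing_codes <- c(77, 88, 99)

# Replace the assumed codes with NA, leaving valid answers untouched
toy_clean <- transform(
  toy,
  stfdem = replace(stfdem, stfdem %in% missing_codes, NA)
)

# lm() drops rows with a missing outcome under the default na.action
fit_raw   <- lm(stfdem ~ native, data = toy)
fit_clean <- lm(stfdem ~ native, data = toy_clean)

print(coef(fit_raw))    # inflated by the 77/88/99 values
print(coef(fit_clean))  # based on valid answers only
```

Inside a dplyr pipeline the same recoding could be written in mutate(), for example `stfdem = if_else(stfdem %in% c(77, 88, 99), NA_real_, stfdem)` after the as.numeric() conversion.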

Task 3

remotes::install_github("datalorax/equatiomatic")
## Skipping install of 'equatiomatic' from a github remote, the SHA1 (29ff168f) has not changed since last install.
##   Use `force = TRUE` to force installation
equatiomatic::extract_eq(model, use_coefs = TRUE)

\[ \operatorname{\widehat{stfdem}} = 8.54 + 0.17(\operatorname{native}_{\operatorname{Yes}}) \]
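
To read the fitted equation, assume that native_Yes is an indicator equal to 1 for respondents born in the country and 0 otherwise. Substituting each value gives the two predictions. The coefficients here are rounded to two decimals, so the second value differs slightly from the unrounded sum 8.5385 + 0.1654 = 8.7039.

\[ \begin{aligned} \operatorname{native}_{\operatorname{Yes}} = 0 &: \quad \operatorname{\widehat{stfdem}} = 8.54 + 0.17(0) = 8.54 \\ \operatorname{native}_{\operatorname{Yes}} = 1 &: \quad \operatorname{\widehat{stfdem}} = 8.54 + 0.17(1) = 8.71 \end{aligned} \]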