This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
summary(cars)
## speed dist
## Min. : 4.0 Min. : 2.00
## 1st Qu.:12.0 1st Qu.: 26.00
## Median :15.0 Median : 36.00
## Mean :15.4 Mean : 42.98
## 3rd Qu.:19.0 3rd Qu.: 56.00
## Max. :25.0 Max. :120.00
You can also embed plots, for example:
Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.
packages <- c("tidyverse", "infer", "fst", "modelsummary", "broom")
new_packages <- packages[!(packages %in% installed.packages()[,"Package"])]
if(length(new_packages)) install.packages(new_packages)
# invisible() suppresses the list of attached packages that lapply() would otherwise print
invisible(lapply(packages, library, character.only = TRUE))
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.3 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.4 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
ess <- read_fst("All-ESS-Data8.fst")
Task 1
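belgium_data <- ess %>%
  # Keep Belgian respondents with a valid answer to wrkprty (1 = Yes, 2 = No)
  filter(cntry == "BE", wrkprty %in% c(1, 2)) %>%
  mutate(
    trstep = as.numeric(trstep),
    # factor() sorts levels alphabetically, so "No" becomes the reference category
    wrkprty = factor(case_when(
      wrkprty == 1 ~ "Yes",
      wrkprty == 2 ~ "No"
    ))
  )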
model <- lm(trstep ~ wrkprty, data = belgium_data)
coefficients <- coef(model)
print(coefficients)
## (Intercept) wrkprtyYes
## 7.5491077 -0.9610318
(Intercept) 7.5491077:
The intercept represents the predicted value of the outcome variable (in this case, Trust in the European Parliament, 0-10) when the explanatory variable is zero or, in the case of categorical variables, at their reference level.
Put differently, because “No” is the reference level of wrkprty, the expected value of trstep for respondents who have not worked in a political party is approximately 7.5491077.
(wrkprtyYes) -0.9610318:
This coefficient represents the difference in the predicted value of trust in the European Parliament between the “Yes” category and the reference category (“No”) for the wrkprty variable.
Specifically, having worked in a political party (“Yes”) is associated with a predicted value of trust in the European Parliament that is approximately 0.9610318 lower than for those who have not (“No”).
In summary:
If a person has not worked in a political party (“No”, the reference category), their predicted average trust in the European Parliament is 7.5491077. If a person has worked in a political party (“Yes”), their predicted average trust is approximately 6.5880759 (7.5491077 - 0.9610318).
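The link between the coefficients of a model with one two-level factor and the group means can be checked directly. The sketch below uses R's built-in mtcars data rather than the ESS file, and the column names manual and manual_yes_ref are hypothetical names introduced only for this illustration.

```r
# Illustration on built-in data (mtcars), not on the ESS data used above
cars_data <- mtcars

# Hypothetical two-level factor; levels sort alphabetically, so "No" is the reference
cars_data$manual <- factor(ifelse(cars_data$am == 1, "Yes", "No"))
levels(cars_data$manual)  # the first level listed is the reference category

fit <- lm(mpg ~ manual, data = cars_data)
group_means <- tapply(cars_data$mpg, cars_data$manual, mean)

# The intercept is the mean of the reference group ("No")
intercept <- coef(fit)[["(Intercept)"]]

# Intercept plus the "Yes" coefficient is the mean of the "Yes" group
yes_prediction <- intercept + coef(fit)[["manualYes"]]

# Changing the reference level flips the sign of the difference
cars_data$manual_yes_ref <- relevel(cars_data$manual, ref = "Yes")
fit_yes_ref <- lm(mpg ~ manual_yes_ref, data = cars_data)

print(group_means)
print(c(intercept = intercept, yes_prediction = yes_prediction))
print(coef(fit_yes_ref))
```

Calling levels() on the factor before fitting is a quick way to confirm which category the intercept describes, instead of assuming it.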
Task 2
bulgaria_data <- ess %>%
  # Keep Bulgarian respondents with a valid answer to brncntr (1 = Yes, 2 = No)
  filter(cntry == "BG", brncntr %in% c(1, 2)) %>%
  mutate(
    stfdem = as.numeric(stfdem),
    # The filter above already drops codes 7, 8 and 9, so one recode is enough;
    # "No" is the reference category because factor levels sort alphabetically
    native = factor(case_when(
      brncntr == 1 ~ "Yes",
      brncntr == 2 ~ "No"
    ))
  )
model <- lm(stfdem ~ native, data = bulgaria_data)
summary(model)
##
## Call:
## lm(formula = stfdem ~ native, data = bulgaria_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.704 -7.704 -5.704 -3.704 90.296
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.5385 2.1074 4.052 5.12e-05 ***
## nativeYes 0.1654 2.1158 0.078 0.938
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 21.49 on 13236 degrees of freedom
## Multiple R-squared: 4.615e-07, Adjusted R-squared: -7.509e-05
## F-statistic: 0.006108 on 1 and 13236 DF, p-value: 0.9377
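For this model, “No” (not born in Bulgaria) is the reference category of native, so the expected average level of satisfaction with democracy among non-Bulgarian-born respondents is the intercept, 8.5385. For Bulgarian-born respondents it is 8.5385 + 0.1654 = 8.7039. Both values should be read with caution: the maximum residual of 90.296 implies an observed stfdem value of about 99, which is not a point on the 0-10 scale and looks like a missing-value code that was not set to NA before fitting.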
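A minimal sketch of how out-of-range codes can distort such a model, and of one way to set them to NA before fitting. The data frame toy is made up for illustration and is not ESS data. Treating every value outside 0-10 as missing is an assumption about how stfdem is coded, which should be checked against the ESS codebook.

```r
# Toy data (hypothetical values, not ESS data): 88 and 99 stand in for missing-value codes
toy <- data.frame(
  stfdem = c(3, 5, 4, 99, 6, 88, 2, 7),
  native = factor(c("No", "No", "Yes", "Yes", "Yes", "No", "Yes", "Yes"))
)

# Assumption: only 0-10 are valid answers; everything else becomes NA
toy$stfdem_clean <- ifelse(toy$stfdem %in% 0:10, toy$stfdem, NA)

raw_fit   <- lm(stfdem ~ native, data = toy)        # missing codes treated as real scores
clean_fit <- lm(stfdem_clean ~ native, data = toy)  # lm() drops rows with NA by default

# With the codes left in, fitted values and residuals fall far outside the 0-10 scale
print(coef(raw_fit))
print(max(residuals(raw_fit)))

# After recoding, the predictions stay inside the valid range
print(coef(clean_fit))
print(range(fitted(clean_fit)))
```

A residual far larger than the width of the scale, as in the summary output above, is a useful warning sign that such codes are still in the data.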
Task 3
remotes::install_github("datalorax/equatiomatic")
## Skipping install of 'equatiomatic' from a github remote, the SHA1 (29ff168f) has not changed since last install.
## Use `force = TRUE` to force installation
equatiomatic::extract_eq(model, use_coefs = TRUE)
\[ \operatorname{\widehat{stfdem}} = 8.54 + 0.17(\operatorname{native}_{\operatorname{Yes}}) \]
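As a worked reading of this equation, the dummy variable for native equals 0 for respondents not born in Bulgaria and 1 for those born there. Substituting the unrounded estimates from the summary output (rather than the rounded 8.54 and 0.17) gives the two predicted values:

\[ \operatorname{\widehat{stfdem}}_{\operatorname{native} = \operatorname{No}} = 8.5385 + 0.1654 \times 0 = 8.5385 \]

\[ \operatorname{\widehat{stfdem}}_{\operatorname{native} = \operatorname{Yes}} = 8.5385 + 0.1654 \times 1 = 8.7039 \]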