R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

summary(cars)

##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00

Including Plots

You can also embed plots, for example:

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.

packages <- c("tidyverse", "infer", "fst", "modelsummary", "effects", "survey", "MASS", "aod", "interactions", "kableExtra", "flextable", "scales") # add any you need here

# Install packages if they aren't installed already
new_packages <- packages[!(packages %in% installed.packages()[,"Package"])]
if(length(new_packages)) install.packages(new_packages)

# Load the packages
lapply(packages, library, character.only = TRUE)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
## Loading required package: carData
## 
## lattice theme set by effectsTheme()
## See ?effectsTheme for details.
## 
## Loading required package: grid
## 
## Loading required package: Matrix
## 
## 
## Attaching package: 'Matrix'
## 
## 
## The following objects are masked from 'package:tidyr':
## 
##     expand, pack, unpack
## 
## 
## Loading required package: survival
## 
## 
## Attaching package: 'survey'
## 
## 
## The following object is masked from 'package:graphics':
## 
##     dotchart
## 
## 
## 
## Attaching package: 'MASS'
## 
## 
## The following object is masked from 'package:dplyr':
## 
##     select
## 
## 
## 
## Attaching package: 'aod'
## 
## 
## The following object is masked from 'package:survival':
## 
##     rats
## 
## 
## 
## Attaching package: 'kableExtra'
## 
## 
## The following object is masked from 'package:dplyr':
## 
##     group_rows
## 
## 
## 
## Attaching package: 'flextable'
## 
## 
## The following objects are masked from 'package:kableExtra':
## 
##     as_image, footnote
## 
## 
## The following object is masked from 'package:purrr':
## 
##     compose
## 
## 
## 
## Attaching package: 'scales'
## 
## 
## The following object is masked from 'package:purrr':
## 
##     discard
## 
## 
## The following object is masked from 'package:readr':
## 
##     col_factor

## [[1]]
##  [1] "lubridate" "forcats"   "stringr"   "dplyr"     "purrr"     "readr"    
##  [7] "tidyr"     "tibble"    "ggplot2"   "tidyverse" "stats"     "graphics" 
## [13] "grDevices" "utils"     "datasets"  "methods"   "base"     
## 
## [[2]]
##  [1] "infer"     "lubridate" "forcats"   "stringr"   "dplyr"     "purrr"    
##  [7] "readr"     "tidyr"     "tibble"    "ggplot2"   "tidyverse" "stats"    
## [13] "graphics"  "grDevices" "utils"     "datasets"  "methods"   "base"     
## 
## [[3]]
##  [1] "fst"       "infer"     "lubridate" "forcats"   "stringr"   "dplyr"    
##  [7] "purrr"     "readr"     "tidyr"     "tibble"    "ggplot2"   "tidyverse"
## [13] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"  
## [19] "base"     
## 
## [[4]]
##  [1] "modelsummary" "fst"          "infer"        "lubridate"    "forcats"     
##  [6] "stringr"      "dplyr"        "purrr"        "readr"        "tidyr"       
## [11] "tibble"       "ggplot2"      "tidyverse"    "stats"        "graphics"    
## [16] "grDevices"    "utils"        "datasets"     "methods"      "base"        
## 
## [[5]]
##  [1] "effects"      "carData"      "modelsummary" "fst"          "infer"       
##  [6] "lubridate"    "forcats"      "stringr"      "dplyr"        "purrr"       
## [11] "readr"        "tidyr"        "tibble"       "ggplot2"      "tidyverse"   
## [16] "stats"        "graphics"     "grDevices"    "utils"        "datasets"    
## [21] "methods"      "base"        
## 
## [[6]]
##  [1] "survey"       "survival"     "Matrix"       "grid"         "effects"     
##  [6] "carData"      "modelsummary" "fst"          "infer"        "lubridate"   
## [11] "forcats"      "stringr"      "dplyr"        "purrr"        "readr"       
## [16] "tidyr"        "tibble"       "ggplot2"      "tidyverse"    "stats"       
## [21] "graphics"     "grDevices"    "utils"        "datasets"     "methods"     
## [26] "base"        
## 
## [[7]]
##  [1] "MASS"         "survey"       "survival"     "Matrix"       "grid"        
##  [6] "effects"      "carData"      "modelsummary" "fst"          "infer"       
## [11] "lubridate"    "forcats"      "stringr"      "dplyr"        "purrr"       
## [16] "readr"        "tidyr"        "tibble"       "ggplot2"      "tidyverse"   
## [21] "stats"        "graphics"     "grDevices"    "utils"        "datasets"    
## [26] "methods"      "base"        
## 
## [[8]]
##  [1] "aod"          "MASS"         "survey"       "survival"     "Matrix"      
##  [6] "grid"         "effects"      "carData"      "modelsummary" "fst"         
## [11] "infer"        "lubridate"    "forcats"      "stringr"      "dplyr"       
## [16] "purrr"        "readr"        "tidyr"        "tibble"       "ggplot2"     
## [21] "tidyverse"    "stats"        "graphics"     "grDevices"    "utils"       
## [26] "datasets"     "methods"      "base"        
## 
## [[9]]
##  [1] "interactions" "aod"          "MASS"         "survey"       "survival"    
##  [6] "Matrix"       "grid"         "effects"      "carData"      "modelsummary"
## [11] "fst"          "infer"        "lubridate"    "forcats"      "stringr"     
## [16] "dplyr"        "purrr"        "readr"        "tidyr"        "tibble"      
## [21] "ggplot2"      "tidyverse"    "stats"        "graphics"     "grDevices"   
## [26] "utils"        "datasets"     "methods"      "base"        
## 
## [[10]]
##  [1] "kableExtra"   "interactions" "aod"          "MASS"         "survey"      
##  [6] "survival"     "Matrix"       "grid"         "effects"      "carData"     
## [11] "modelsummary" "fst"          "infer"        "lubridate"    "forcats"     
## [16] "stringr"      "dplyr"        "purrr"        "readr"        "tidyr"       
## [21] "tibble"       "ggplot2"      "tidyverse"    "stats"        "graphics"    
## [26] "grDevices"    "utils"        "datasets"     "methods"      "base"        
## 
## [[11]]
##  [1] "flextable"    "kableExtra"   "interactions" "aod"          "MASS"        
##  [6] "survey"       "survival"     "Matrix"       "grid"         "effects"     
## [11] "carData"      "modelsummary" "fst"          "infer"        "lubridate"   
## [16] "forcats"      "stringr"      "dplyr"        "purrr"        "readr"       
## [21] "tidyr"        "tibble"       "ggplot2"      "tidyverse"    "stats"       
## [26] "graphics"     "grDevices"    "utils"        "datasets"     "methods"     
## [31] "base"        
## 
## [[12]]
##  [1] "scales"       "flextable"    "kableExtra"   "interactions" "aod"         
##  [6] "MASS"         "survey"       "survival"     "Matrix"       "grid"        
## [11] "effects"      "carData"      "modelsummary" "fst"          "infer"       
## [16] "lubridate"    "forcats"      "stringr"      "dplyr"        "purrr"       
## [21] "readr"        "tidyr"        "tibble"       "ggplot2"      "tidyverse"   
## [26] "stats"        "graphics"     "grDevices"    "utils"        "datasets"    
## [31] "methods"      "base"

##task 1

## Loading dataset
germany_data <- read_fst("germany_data.fst")

#renaming
df <- germany_data

df$weight <- df$dweight * df$pweight
survey_design <- svydesign(ids = ~1, data = df, weights = ~weight)

# Setting values greater than 10 to NA
df$trstplt <- ifelse(df$trstplt > 10, NA, df$trstplt)
df$trstprl <- ifelse(df$trstprl > 10, NA, df$trstprl)
df$trstprt <- ifelse(df$trstprt > 10, NA, df$trstprt)

# Creating and rescaling the trust variable
df$trust <- scales::rescale(df$trstplt + df$trstprl + df$trstprt, na.rm = TRUE, to = c(0, 100))

# Here's how you would "flip it" because populist is conceived by N + I as 'anti-trust'
# As Schafer explains there are some issues with that conceptualization, but N + I is also widely applied
df$populist <- scales::rescale(df$trust, na.rm = TRUE, to=c(100,0))

df <- df %>%
  mutate(behave = ipbhprp,
         secure = impsafe,
         safety = ipstrgv,
         tradition = imptrad,
         rules = ipfrule) %>%
  mutate(across(c("behave", "secure", "safety", "tradition", "rules"),
                ~ na_if(.x, 7) %>% na_if(8) %>% na_if(9))) %>%
  # Apply the reverse coding
  mutate(across(c("behave", "secure", "safety", "tradition", "rules"), ~ 7 - .x ))

# Now you can calculate 'schwartzauth' after the NA recoding
df$auth <- scales::rescale(df$behave + 
                      df$secure + 
                      df$safety + 
                      df$tradition + 
                      df$rules, to=c(0,100), na.rm=TRUE)


df <- df %>% filter(!is.na(auth))

df <- df %>%
  mutate(
    polID = case_when(
      lrscale %in% 0:3 ~ "Left",                    
      lrscale %in% 7:10 ~ "Right",                     
      lrscale %in% 4:6 ~ "Moderate",                  
      lrscale %in% c(77, 88, 99) ~ NA_character_      
    ),
   religious = case_when(
      rlgdgr %in% c(77, 88, 99) ~ NA_real_,
      TRUE ~ rlgdgr
    )
  )

model3 <- lm(auth ~ polID + religious, data = df, weights = weight)
model4 <- lm(auth ~ polID + religious + polID*religious, data = df, weights = weight)

modelsummary(
  list(model3, model4),
  fmt = 1,
  estimate  = c( "{estimate} ({std.error}){stars}",
                "{estimate} ({std.error}){stars}"),
  statistic = NULL)

	(1)	(2)
(Intercept)	55.3 (0.3)***	56.4 (0.3)***
polIDModerate	4.6 (0.3)***	3.4 (0.4)***
polIDRight	7.7 (0.4)***	4.2 (0.7)***
religious	0.9 (0.0)***	0.6 (0.1)***
polIDModerate × religious		0.3 (0.1)***
polIDRight × religious		0.8 (0.1)***
Num.Obs.	23373	23373
R2	0.048	0.050
R2 Adj.	0.048	0.050
AIC	200594.1	200558.6
BIC	200634.4	200615.0
Log.Lik.	−100292.049	−100272.298
RMSE	17.27	17.24

Interpretation of coefficients for MLR without the interaction term (model3):

The intercept (57.6 ***): This indicates that the expected authoritarian score is 57.6 for individuals

on the ‘left’ with a religiousness score of 0, with high statistical significance.

religionYes (8.9 ***): Being religious is associated with an 8.9 unit increase in the authoritarian score

compared to non-religious individuals, holding political identification constant.

IDRight (3.9 ***): Identifying as politically right is associated with a 3.9 unit increase in the

authoritarian score compared to those on the left, holding religiousness constant.

Interpretation of model fit metrics:

Num.Obs.: The model was fitted using 1176 observations.

R2 (0.095): The model explains 9.5% of the variance in the authoritarian score, which is relatively low.

R2 Adj. (0.093): After adjusting for the number of predictors, the model explains 9.3% of the variance.

AIC (9760.7 vs. 9762.6) and BIC (9780.9 vs. 9788.0): The AIC and BIC are slightly higher for model4,

suggesting that the interaction term does not substantially improve the model fit.

Log.Lik. (−4876.334): The log-likelihood is nearly identical between model3 and model4, indicating similar fit.

RMSE (15.28): The root mean square error is the same for both models, indicating similar predictive accuracy.

##task two

interaction_plot <- effect("polID*religious", model4, na.rm=TRUE)

# Plotting the interaction effect
plot(interaction_plot,
     main="Interaction effect",
     xlab="Religiousness",
     ylab="Authoritarian attitudes scale")

interaction_plot

## 
##  polID*religious effect
##           religious
## polID             0        2        5        8       10
##   Left     56.40706 57.56775 59.30879 61.04983 62.21053
##   Moderate 59.85063 61.66176 64.37844 67.09513 68.90626
##   Right    60.60237 63.35032 67.47224 71.59416 74.34211

#Examining the model 4 interaction plot: # The interaction effect plot illustrates how political affiliation (Left, Moderate, Right) differentiates the association between religiousness and authoritarian attitudes. The slopes of the lines indicate that the authoritarian views scale rises with religiousness, with a stronger correlation for politically right-leaning persons. In particular, the ‘Right’ category’s sharpest slope suggests a higher correlation between religiousness # and authoritarian views for this demographic. ‘Left’ or ‘Moderate’ identity holders, on the other hand, have a less pronounced but still beneficial association. # The lines’ non-parallel character suggests that political affiliation influences the association between religiousness and authoritarian views, supporting the existence of an interaction effect.

##task 3

netherlands_data <- read_fst("netherlands_data.fst")

dn <- netherlands_data

dn$weight <- dn$dweight * dn$pweight
survey_design <- svydesign(ids = ~1, data = dn, weights = ~weight)

# Setting values greater than 10 to NA
dn$trstplt <- ifelse(dn$trstplt > 10, NA, dn$trstplt)
dn$trstprl <- ifelse(dn$trstprl > 10, NA, dn$trstprl)
dn$trstprt <- ifelse(dn$trstprt > 10, NA, dn$trstprt)


# Creating and rescaling the trust variable
dn$trust <- scales::rescale(dn$trstplt + dn$trstprl + dn$trstprt, na.rm = TRUE, to = c(0, 100))
dn$populist <- scales::rescale(dn$trust, na.rm = TRUE, to=c(100,0))

dn <- dn %>%
  mutate(
   
    Capital = case_when(
      ppltrst %in% c(77, 88, 99) ~ NA_character_,
      ppltrst >= 0 & ppltrst <= 3 ~ "Low",
      ppltrst >= 4 & ppltrst <= 6 ~ "Mid",
      ppltrst >= 7 & ppltrst <= 10 ~ "High",
      TRUE ~ as.character(ppltrst)
    ),
    
    Politically = case_when(
      polintr %in% c(1, 2) ~ "Interested",
      polintr %in% c(3, 4) ~ "Not Interested",
      polintr %in% c(7, 8, 9) ~ NA_character_
    )
  )

dn <- dn %>%
  mutate(
    behave = ipbhprp,
         secure = impsafe,
         safety = ipstrgv,
         tradition = imptrad,
         rules = ipfrule) %>%
  mutate(across(c("behave", "secure", "safety", "tradition", "rules"),
                ~ na_if(.x, 7) %>% na_if(8) %>% na_if(9))) %>%
  mutate(across(c("behave", "secure", "safety", "tradition", "rules"), ~ 7 - .x ))

dn$auth <- scales::rescale(dn$behave + 
                      dn$secure + 
                      dn$safety + 
                      dn$tradition + 
                      dn$rules, to=c(0,100), na.rm=TRUE)


dn <- dn %>% filter(!is.na(auth))

dn <- dn %>%
  mutate(religion = case_when(
    rlgblg == 2 ~ "No",
    rlgblg == 1 ~ "Yes",
    rlgblg %in% c(7, 8, 9) ~ NA_character_,
    TRUE ~ as.character(rlgblg)
  ))

table(dn$religion)

## 
##    No   Yes 
## 10913  6750

dn <- dn %>%
  mutate(ID = case_when(
    lrscale >= 0 & lrscale <= 4 ~ "Left",
    lrscale >= 6 & lrscale <= 10 ~ "Right",
    lrscale > 10 ~ NA_character_,  
    TRUE ~ NA_character_  
  ))
table(dn$ID)

## 
##  Left Right 
##  5449  7010

dn <- dn %>%
  mutate(
    cohort = ifelse(yrbrn < 1930 | yrbrn > 2000, NA, yrbrn),

    gen = case_when(
      yrbrn %in% 1900:1945 ~ "1",
      yrbrn %in% 1946:1964 ~ "2",
      yrbrn %in% 1965:1979 ~ "3",
      yrbrn %in% 1980:1996 ~ "4",
      TRUE ~ as.character(yrbrn)  
    ),
   
    gen = factor(gen,
                 levels = c("1", "2", "3", "4"),
                 labels = c("Interwar", "Baby Boomers", "Gen X", "Millennials"))
  )
table(dn$gen)

## 
##     Interwar Baby Boomers        Gen X  Millennials 
##         3771         6147         4548         2730

model5 <- lm(auth ~ religion + ID + gen, data = dn, weights = weight)
model6 <- lm(auth ~ religion + ID + gen + religion*gen, data = dn, weights = weight)

modelsummary(
  list(model5, model6),
  fmt = 1,
  estimate  = c( "{estimate} ({std.error}){stars}",
                "{estimate} ({std.error}){stars}"),
  statistic = NULL,
  coef_omit = "Intercept")

	(1)	(2)
religionYes	7.2 (0.3)***	5.7 (0.7)***
IDRight	4.2 (0.3)***	4.3 (0.3)***
genBaby Boomers	−5.2 (0.4)***	−5.9 (0.6)***
genGen X	−7.7 (0.4)***	−8.8 (0.6)***
genMillennials	−8.4 (0.5)***	−9.7 (0.6)***
religionYes × genBaby Boomers		1.1 (0.8)
religionYes × genGen X		2.3 (0.9)**
religionYes × genMillennials		2.9 (1.0)**
Num.Obs.	12116	12116
R2	0.124	0.125
R2 Adj.	0.124	0.125
AIC	101219.3	101213.5
BIC	101271.2	101287.5
Log.Lik.	−50602.672	−50596.757
RMSE	15.27	15.26

#Coefficient interpretation for MLR without interaction (model 5): # religionYes (7.2 ): Keeping political identification and generational category constant, there is a 7.2 unit increase in authoritarian views when a person is religious. This relationship has significant significance. #IDRight (4.2 ): Keeping other variables equal, political identification on the right is highly significant in relation to a 4.2 unit increase in authoritarian views as compared to the left. # genBaby Boomers (-5.2 ): Among other things being equal, there is a strong significant correlation between being a Baby Boomer with a 5.2 unit reduction in authoritarian views relative to the Interwar generation. # genGen X (-7.7 ): Among strong significance, belonging to Gen X is linked to a 7.7 unit decline in authoritarian views # as compared to the Interwar generation. #Model fit measures for both models are interpreted as follows: # Num.Obs.: A total of 12116 observations were used to fit both models. # R2 (0.124): The models modestly account for 12.4% of the variation in authoritarian views. # R2 Adj. (0.124 vs. 0.125): Adding the interaction terms almost completely preserves the adjusted R-squared, indicating that the interaction terms do not significantly increase the model’s explanatory power. #despite its complexity, the model with interactions (model 6) shows a little lower AIC (101219.3 vs. 101213.5) #showing a significantly better fit to the data.

##task 4

dn <- dn %>%
  mutate(
    cohort = ifelse(yrbrn < 1930 | yrbrn > 2000, NA, yrbrn),
    gen = case_when(
      yrbrn %in% 1900:1945 ~ "1",
      yrbrn %in% 1946:1964 ~ "2",
      yrbrn %in% 1965:1979 ~ "3",
      yrbrn %in% 1980:1996 ~ "4",
      TRUE ~ as.character(yrbrn)  
    ),
    # After recoding, the gen variable is converted into a factor with labels for clearer interpretation.
    # Factors are used in R to handle categorical variables.
    gen = factor(gen,
                 levels = c("1", "2", "3", "4"),
                 labels = c("Interwar", "Baby Boomers", "Gen X", "Millennials"))
  )
table(dn$gen)

## 
##     Interwar Baby Boomers        Gen X  Millennials 
##         3771         6147         4548         2730

dn <- dn %>%
  mutate(
    polID = case_when(
      lrscale %in% 0:3 ~ "Left",                    
      lrscale %in% 7:10 ~ "Right",                     
      lrscale %in% 4:6 ~ "Moderate",                  
      lrscale %in% c(77, 88, 99) ~ NA_character_      
    ),
   religious = case_when(
      rlgdgr %in% c(77, 88, 99) ~ NA_real_,
      TRUE ~ rlgdgr
    )
  )

model7 <- lm(auth ~ cohort + polID + cohort*polID, data = dn, weights = weight)

interact_plot(model7, pred = cohort, modx = polID, jnplot = TRUE)

# The trend lines show that authoritarian attitudes tend to decline with more recent birth cohorts for all political identifications (Left, Moderate, Right). When it comes to views towards authority, younger generations are less so than older generations. # Nevertheless, the lines’ slopes vary according to political identity. The group labelled as ‘Left’ exhibits the shallowest slope, # suggesting a less abrupt decrease in authoritarian sentiments among different cohorts. # Whereas the ‘Moderate’ and ‘Right’ groups have steeper slopes, younger members of these political groupings exhibit a more marked decline in #authoritarian views. # The ‘Right’ and ‘Moderate’ lines crossing indicates that older cohorts tend to have more authoritarian attitudes than the ‘Right,’ but this tendency reverses with younger cohorts, when the ‘Moderate’ shows a minor increase in comparison to the ‘Right’.

##task 5

 model8 <- lm(auth ~ religious + ID + religious*ID, data = dn, weights = weight)
interact_plot(model8, pred = religious, modx = ID, jnplot = TRUE)

# differs depending on one’s political affiliation (Left or Right). The positive slopes for both groups #indicate that the authoritarian attitude score rises in tandem with religiousness. #The ‘Right’s’ steeper slope suggests that those who identify as politically right-leaning see a more marked rise in authoritarian tendencies along with religiousness. # Although there is still a positive association, the influence of religiousness on authoritarian tendencies is weaker for individuals on the ‘Left’. This implies that the association between religiosity and authoritarian beliefs may be moderated by political identity. #The lack of crossing lines suggests that, throughout the observed range of religiousness, the relative influence of religiousness on authoritarian views does not reverse between the political identifications of the left and right.

afana_xena_homework_5

Xena Afana

2024-03-25