Kim_Eric_Homework

Setting up environment

# List of packages
packages <- c("tidyverse", "infer", "fst", "modelsummary", "effects", "survey", "interactions", "tidycat", "broom")

# Install packages if they aren't installed already
new_packages <- packages[!(packages %in% installed.packages()[,"Package"])]
if(length(new_packages)) install.packages(new_packages)

# Load the packages
lapply(packages, library, character.only = TRUE)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

## Warning: 패키지 'infer'는 R 버전 4.3.3에서 작성되었습니다

## Warning: 패키지 'effects'는 R 버전 4.3.3에서 작성되었습니다

## 필요한 패키지를 로딩중입니다: carData

## Warning: 패키지 'carData'는 R 버전 4.3.3에서 작성되었습니다

## lattice theme set by effectsTheme()
## See ?effectsTheme for details.

## Warning: 패키지 'survey'는 R 버전 4.3.3에서 작성되었습니다

## 필요한 패키지를 로딩중입니다: grid
## 필요한 패키지를 로딩중입니다: Matrix
## 
## 다음의 패키지를 부착합니다: 'Matrix'
## 
## The following objects are masked from 'package:tidyr':
## 
##     expand, pack, unpack
## 
## 필요한 패키지를 로딩중입니다: survival
## 
## 다음의 패키지를 부착합니다: 'survey'
## 
## The following object is masked from 'package:graphics':
## 
##     dotchart

## Warning: 패키지 'interactions'는 R 버전 4.3.3에서 작성되었습니다

## Warning: 패키지 'tidycat'는 R 버전 4.3.3에서 작성되었습니다

## Registered S3 methods overwritten by 'broom':
##   method            from  
##   tidy.glht         jtools
##   tidy.summary.glht jtools

## [[1]]
##  [1] "lubridate" "forcats"   "stringr"   "dplyr"     "purrr"     "readr"    
##  [7] "tidyr"     "tibble"    "ggplot2"   "tidyverse" "stats"     "graphics" 
## [13] "grDevices" "utils"     "datasets"  "methods"   "base"     
## 
## [[2]]
##  [1] "infer"     "lubridate" "forcats"   "stringr"   "dplyr"     "purrr"    
##  [7] "readr"     "tidyr"     "tibble"    "ggplot2"   "tidyverse" "stats"    
## [13] "graphics"  "grDevices" "utils"     "datasets"  "methods"   "base"     
## 
## [[3]]
##  [1] "fst"       "infer"     "lubridate" "forcats"   "stringr"   "dplyr"    
##  [7] "purrr"     "readr"     "tidyr"     "tibble"    "ggplot2"   "tidyverse"
## [13] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"  
## [19] "base"     
## 
## [[4]]
##  [1] "modelsummary" "fst"          "infer"        "lubridate"    "forcats"     
##  [6] "stringr"      "dplyr"        "purrr"        "readr"        "tidyr"       
## [11] "tibble"       "ggplot2"      "tidyverse"    "stats"        "graphics"    
## [16] "grDevices"    "utils"        "datasets"     "methods"      "base"        
## 
## [[5]]
##  [1] "effects"      "carData"      "modelsummary" "fst"          "infer"       
##  [6] "lubridate"    "forcats"      "stringr"      "dplyr"        "purrr"       
## [11] "readr"        "tidyr"        "tibble"       "ggplot2"      "tidyverse"   
## [16] "stats"        "graphics"     "grDevices"    "utils"        "datasets"    
## [21] "methods"      "base"        
## 
## [[6]]
##  [1] "survey"       "survival"     "Matrix"       "grid"         "effects"     
##  [6] "carData"      "modelsummary" "fst"          "infer"        "lubridate"   
## [11] "forcats"      "stringr"      "dplyr"        "purrr"        "readr"       
## [16] "tidyr"        "tibble"       "ggplot2"      "tidyverse"    "stats"       
## [21] "graphics"     "grDevices"    "utils"        "datasets"     "methods"     
## [26] "base"        
## 
## [[7]]
##  [1] "interactions" "survey"       "survival"     "Matrix"       "grid"        
##  [6] "effects"      "carData"      "modelsummary" "fst"          "infer"       
## [11] "lubridate"    "forcats"      "stringr"      "dplyr"        "purrr"       
## [16] "readr"        "tidyr"        "tibble"       "ggplot2"      "tidyverse"   
## [21] "stats"        "graphics"     "grDevices"    "utils"        "datasets"    
## [26] "methods"      "base"        
## 
## [[8]]
##  [1] "tidycat"      "interactions" "survey"       "survival"     "Matrix"      
##  [6] "grid"         "effects"      "carData"      "modelsummary" "fst"         
## [11] "infer"        "lubridate"    "forcats"      "stringr"      "dplyr"       
## [16] "purrr"        "readr"        "tidyr"        "tibble"       "ggplot2"     
## [21] "tidyverse"    "stats"        "graphics"     "grDevices"    "utils"       
## [26] "datasets"     "methods"      "base"        
## 
## [[9]]
##  [1] "broom"        "tidycat"      "interactions" "survey"       "survival"    
##  [6] "Matrix"       "grid"         "effects"      "carData"      "modelsummary"
## [11] "fst"          "infer"        "lubridate"    "forcats"      "stringr"     
## [16] "dplyr"        "purrr"        "readr"        "tidyr"        "tibble"      
## [21] "ggplot2"      "tidyverse"    "stats"        "graphics"     "grDevices"   
## [26] "utils"        "datasets"     "methods"      "base"

Task 1

Graph the histogram distribution using the populist scale (as we did in the tutorial) for the country of Sweden. What do you note?

Loading data for Sweden and recoding the variables

sweden_data <- read.fst("sweden_data.fst")

sweden_data$trstplt <- ifelse(sweden_data$trstplt > 10, NA, sweden_data$trstplt)
sweden_data$trstprl <- ifelse(sweden_data$trstprl > 10, NA, sweden_data$trstprl)
sweden_data$trstprt <- ifelse(sweden_data$trstprt > 10, NA, sweden_data$trstprt)

sweden_data$trust <- scales::rescale(sweden_data$trstplt + sweden_data$trstprl + sweden_data$trstprt, na.rm = TRUE, to = c(0, 100))

sweden_data$populist <- scales::rescale(sweden_data$trust, na.rm = TRUE, to=c(100,0))

Creating the distribution of populist scale for Sweden

ggplot(sweden_data, aes(x = populist)) +
  geom_histogram(bins = 30, fill = "blue", color = "black") +
  theme_minimal() +
  labs(title = "Distribution of Populist Scale for Sweden",
       x = "Populist Scale",
       y = "Count")

## Warning: Removed 2503 rows containing non-finite values (`stat_bin()`).

What do I note from the historgram distribution of populism in Sweden?

I note that the distribution is unimodal; between 47.5 and 50 of populist scale is where the values occur most frequently in this histogram. Additionally, there are more people below 50 of populist scale compared to those above 50 of populist scale, indicating that there are more people that are less interested populism than those who are more interested in populism, for the country of Sweden.

Task 2

Run a linear regression with populist attitudes as the outcome, educational attainment (recoded as a binary “BA or more” and “No BA”), and using the Swedish data. Print the intercept and coefficients only and interpret. Then do a tidy model table displaying the p-value (with digits = 3) and interpret.

Loading data for Sweden and recoding the variables we need

df <- read.fst("sweden_data.fst")

model_data <- df %>%
  mutate(
     educ.ba = case_when(
      essround < 5 & edulvla == 5 ~ "BA or more",
      essround >= 5 & edulvlb > 600 ~ "BA or more",
      TRUE ~ "No BA"),
  edulvla = ifelse(edulvla %in% c(77, 88, 99), NA_integer_, edulvla),
  edulvlb = ifelse(edulvlb %in% c(5555, 7777, 8888), NA_integer_, edulvlb),
  educ.ba = factor(educ.ba, levels = c("No BA", "BA or more")),
  
    ID = case_when(
      lrscale %in% 0:4 ~ "Left",
      lrscale %in% 6:10 ~ "Right",
      lrscale %in% c(77, 88, 99) ~ NA_character_,
      TRUE ~ NA_character_
    ),
  )
model_data$trstplt <- ifelse(model_data$trstplt > 10, NA, model_data$trstplt)
model_data$trstprl <- ifelse(model_data$trstprl > 10, NA, model_data$trstprl)
model_data$trstprt <- ifelse(model_data$trstprt > 10, NA, model_data$trstprt)

model_data$trust <- scales::rescale(model_data$trstplt + model_data$trstprl + model_data$trstprt, na.rm = TRUE, to = c(0, 100))

model_data$populist <- scales::rescale(model_data$trust, na.rm = TRUE, to=c(100,0))

table(model_data$populist)

## 
##                0 3.33333333333333 6.66666666666667               10 
##               46               15               47               69 
## 13.3333333333333 16.6666666666667               20 23.3333333333333 
##              123              180              398              505 
## 26.6666666666667               30 33.3333333333333 36.6666666666667 
##              806              879              931             1011 
##               40 43.3333333333333 46.6666666666667               50 
##             1143             1118              956             1161 
## 53.3333333333333 56.6666666666667               60 63.3333333333333 
##              804              841              683              631 
## 66.6666666666667               70 73.3333333333333 76.6666666666667 
##              544              596              431              364 
##               80 83.3333333333333 86.6666666666667               90 
##              331              257              200              164 
## 93.3333333333333 96.6666666666667              100 
##              155               80              244

table(model_data$educ.ba)

## 
##      No BA BA or more 
##      13477       4739

Running a linear regression with populist attitudes as the outcome and educational attainment as a single predictor

model1 <- lm(populist ~ educ.ba, data = model_data)

Printing the intercept and coefficients only and interpreting them

coefficients_sweden <- coef(model1)
print(coefficients_sweden)

##       (Intercept) educ.baBA or more 
##         51.358680         -7.868121

The intercept value of 51.4 represents the average score on the populist scale for the “no BA” group. Simply put, the expected average populist scale score would be 51.4 for those who identify as “no BA” in Sweden.

The coefficient of -7.9 associated with the “BA or more” group indicates that, on average, those in in the “BA or more” group have populist scale scores that are 7.9 points lower than those who identify as “No BA” (based on the re-coding of variables) in Sweden.

According to these outputs, for Sweden, people who identify as “No BA”, on average, possess stronger populists attitudes than those who identify as “BA or more”.

Doing a tidy model table, displaying and interpreting p-value (with digits = 3)

tidy(model1) |>
knitr::kable(digits = 3)

term	estimate	std.error	statistic	p.value
(Intercept)	51.359	0.181	283.363	0
educ.baBA or more	-7.868	0.350	-22.494	0

The p-value of 0 indicates that it is statistically significant. Thus, identifying as ‘No BA’ is statistically associated with stronger populist attitudes and vice versa, for the country of Sweden.

Task 3

Now using the Italian dataset and the authoritarian values scale (as we did in the tutorial), graph the average by survey year. Interpret.

Loading data and creating authoritarian values scale on human modules items

df <- read.fst("italy_data.fst")

df <- df %>%
  mutate(behave = ipbhprp,
         secure = impsafe,
         safety = ipstrgv,
         tradition = imptrad,
         rules = ipfrule) %>%
  mutate(across(c("behave", "secure", "safety", "tradition", "rules"),
                ~ na_if(.x, 7) %>% na_if(8) %>% na_if(9))) %>%
  
  mutate(across(c("behave", "secure", "safety", "tradition", "rules"), ~ 7 - .x ))

df$auth <- scales::rescale(df$behave + 
                      df$secure + 
                      df$safety + 
                      df$tradition + 
                      df$rules, to=c(0,100), na.rm=TRUE)

df <- df %>% filter(!is.na(auth))
table(df$secure)

## 
##    1    2    3    4    5    6 
##   54  139  626 1856 2876 2881

table(df$auth)

## 
##   0  12  16  20  24  28  32  36  40  44  48  52  56  60  64  68  72  76  80  84 
##   7   4   4   9  13  20  27  57 113 155 163 323 367 596 647 789 823 897 992 733 
##  88  92  96 100 
## 632 465 299 297

df$year <- NA
replacements <- c(2002, 2004, 2006, 2008, 2010, 2012, 2014, 2016, 2018, 2020)
for(i in 1:10){
  df$year[df$essround == i] <- replacements[i]
}
auth_avg <- df %>%
  group_by(year) %>%
  summarize(auth_avg = mean(auth, na.rm = TRUE))

plot_auth <- ggplot(auth_avg, aes(x = year, y = auth_avg)) +
  geom_point(aes(color = auth_avg), alpha = 1.0) + 
  geom_line(aes(group = 1), color = "blue", linetype = "dashed") +  
  labs(title = "Authoritarian Scale Average by Year (Italy)",
       x = "Survey Year",
       y = "Authoritarian Scale Average") +
  theme_minimal() +
  theme(legend.position = "none") +
  scale_y_continuous(limits = c(0, 100))

print(plot_auth)

table(auth_avg)

##       auth_avg
## year   72.6634225466342 72.780300115429 73.2129963898917 75.1807780320366
##   2012                0               0                0                1
##   2016                1               0                0                0
##   2018                0               1                0                0
##   2020                0               0                1                0

Interpreting the graph

Between 2012 and 2016, there is a small decrease (75.2 for 2012, 72.6 for 2016) in the authoritarian scale average in Italy. Then, beginning from the year of 2016, there is a very small, but still noticeable, increasing trend in the authoritarian scale average in Italy (72.7 for 2016, 72.8 for 2018, and 73.2 for 2020).

Task 4

Now run a linear regression model, using the modelsummary package, with authoritarian values as the outcome, and generations as the predictor/potential explanatory variable, and using the Italian data again. Rename the coefficients using the coef_rename function. Interpret the coefficients, whether they are statistically significant, and the adjusted R-squared.

Loading data for Italy and recoding the variables I need

df <- read.fst("italy_data.fst")

model_data <- df %>%
  mutate(
    age = ifelse(agea == 999, NA, agea),
    cohort = ifelse(yrbrn < 1930 | yrbrn > 2000, NA, yrbrn),
    gen = case_when(
      yrbrn %in% 1900:1945 ~ "1",
      yrbrn %in% 1946:1964 ~ "2",
      yrbrn %in% 1965:1979 ~ "3",
      yrbrn %in% 1980:1996 ~ "4",
      TRUE ~ as.character(cohort)
    ),
    gen = factor(gen,
                 levels = c("1", "2", "3", "4"),
                 labels = c("Interwar", "Baby Boomers", "Gen X", "Millennials"))
  )
table(model_data$gen)

## 
##     Interwar Baby Boomers        Gen X  Millennials 
##         1470         3179         2636         2031

model_data <- model_data %>%
  mutate(behave = ipbhprp,
         secure = impsafe,
         safety = ipstrgv,
         tradition = imptrad,
         rules = ipfrule) %>%
  mutate(across(c("behave", "secure", "safety", "tradition", "rules"),
                ~ na_if(.x, 7) %>% na_if(8) %>% na_if(9))) %>%
  
  mutate(across(c("behave", "secure", "safety", "tradition", "rules"), ~ 7 - .x ))

model_data$auth <- scales::rescale(model_data$behave + 
                      model_data$secure + 
                      model_data$safety + 
                      model_data$tradition + 
                      model_data$rules, to=c(0,100), na.rm=TRUE)

model_data <- model_data %>% filter(!is.na(auth))
table(model_data$secure)

## 
##    1    2    3    4    5    6 
##   54  139  626 1856 2876 2881

Running a linear regression model, using the modelsummary package

model2 <- lm(auth ~ gen, data = model_data)

modelsummary(
  list(model2),
  fmt = 1,
  estimate  = c( "{estimate} ({std.error}){stars}"),
  statistic = NULL,
  coef_omit = "Intercept",
  coef_rename = c("genBaby Boomers" = "Baby Boomers", "genGen X" = "Gen X", "genMillennials" = "Millennials")
)

	(1)
Baby Boomers	-1.1 (0.6)*
Gen X	-2.1 (0.6)***
Millennials	-3.6 (0.6)***
Num.Obs.	7631
R2	0.006
R2 Adj.	0.006
AIC	62934.0
BIC	62968.7
Log.Lik.	-31462.009
RMSE	14.94

tidy(model2)

## # A tibble: 4 × 5
##   term            estimate std.error statistic  p.value
##   <chr>              <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)        75.3      0.467    162.   0       
## 2 genBaby Boomers    -1.12     0.552     -2.04 4.14e- 2
## 3 genGen X           -2.08     0.564     -3.69 2.29e- 4
## 4 genMillennials     -3.59     0.584     -6.15 8.01e-10

Interpreting the coefficients and the adjusted R-squared

Coefficients interpretation:

Baby Boomers (-1.1, p<0.05): Compared to the Interwar generation (reference category that is omitted), Baby Boomers present scores that are, on average, 1.1 points lower on the authoritarian values scale (0-100). The single asterisk (*) denotes statistical significance at a lower p-value than the alpha threshold of 0.05 (p = 0.04142871).

Gen X (-2.1, p<0.001): Generation X shows an even more pronounced decrease, with scores 2.1 points lower than the Interwar generation on the authoritarian values scale. The triple asterisks (***) denote statistical significance at a very lower p-value than the alpha threshold of 0.05 (p = 0.00002291382).

Millennials (-3.6, p<0.001): Millennials exhibit the largest decrease in authoritarian values, with their scores being 3.6 points lower than those of the Interwar generation. This marked difference is statistically significant, indicating that Millennials diverge most strongly from the Interwar generation in comparing their authoritarian values. Again, the triple asterisks (***) denote statistical significance at a very, very, very lower p-value than the alpha threshold of 0.05 (p = 0.0000000008013214).

Adjusted R-Squared interpretation (R2 Adj. = 0.006):

The adjusted R-squared value of 0.006 means that the ‘gen’ variable (with the four coded generations) explain approximately 0.6% of the variance in the authoritarian values scale for Italy, after accounting for generations as the predictor. While statistically significant, this suggests that the model explains only a small portion of the variability in authoritarian values, pointing to other factors outside generational differences that may play critical roles in shaping these attitudes.

Task 5

Now visualize, using the modelplot() function from the modelsummary package, coefficient estimates and 95% confidence intervals for the following three models:

Model 1 = populist attitudes scale as the outcome; left/right as the single predictor (omitting moderates as we did in the tutorial), and using data for Greece.

Model 2 = populist attitudes scale as the outcome; gender as the single predictor, and using data for Greece.

Model 3 = populist attitudes scale as the outcome; generations as the single predictor (using the four generational category recode we did in the tutorial), and using data for Greece. Interpret.

Loading data for Greece and recoding the variables I need

df <- read.fst("greece_data.fst")

model_data <- df %>%
  mutate(
    ID = case_when(
      lrscale %in% 0:4 ~ "Left",
      lrscale %in% 6:10 ~ "Right", 
      lrscale %in% c(77, 88, 99) ~ NA_character_,
      TRUE ~ NA_character_,
    ),

    gndr = case_when(
      gndr == 1 ~ "Male",
      gndr == 2 ~ "Female",
      gndr == 9 ~ NA_character_,
      TRUE ~ as.character(gndr)
    ),
    
    age = ifelse(agea == 999, NA, agea),

    cohort = ifelse(yrbrn < 1930 | yrbrn > 2000, NA, yrbrn),

    gen = case_when(
      yrbrn %in% 1900:1945 ~ "1",
      yrbrn %in% 1946:1964 ~ "2",
      yrbrn %in% 1965:1979 ~ "3",
      yrbrn %in% 1980:1996 ~ "4",
      TRUE ~ as.character(cohort)
    ),
    
    gen = factor(gen,
                 levels = c("1", "2", "3", "4"),
                 labels = c("Interwar", "Baby Boomers", "Gen X", "Millennials"))
    )
model_data$trstplt <- ifelse(model_data$trstplt > 10, NA, model_data$trstplt)
model_data$trstprl <- ifelse(model_data$trstprl > 10, NA, model_data$trstprl)
model_data$trstprt <- ifelse(model_data$trstprt > 10, NA, model_data$trstprt)

model_data$trust <- scales::rescale(model_data$trstplt + model_data$trstprl + model_data$trstprt, na.rm = TRUE, to = c(0, 100))

model_data$populist <- scales::rescale(model_data$trust, na.rm = TRUE, to=c(100,0))

table(model_data$ID)

## 
##  Left Right 
##  2688  3522

table(model_data$gndr)

## 
## Female   Male 
##   6929   5629

table(model_data$gen)

## 
##     Interwar Baby Boomers        Gen X  Millennials 
##         2978         3484         3498         2330

Creating models

model3 <- lm(populist ~ ID, data = model_data)
model4 <- lm(populist ~ gndr, data = model_data)
model5 <- lm(populist ~ gen, data = model_data)

models2 <- list(
   "Model 1" = lm(populist ~ ID, data = model_data),
   "Model 2" = lm(populist ~ gndr, data = model_data),
   "Model 3" = lm(populist ~ gen, data = model_data))

modelsummary(models2, fmt = 1,
  estimate  = c("{estimate} ({std.error}){stars}",
                "{estimate} ({std.error}){stars}",
                "{estimate} ({std.error}){stars}"),
  coef_rename = c("genBaby Boomers" = "Baby Boomers", "genGen X" = "Gen X", "genMillennials" = "Millennials"),
  statistic = NULL,
  coef_omit = "Intercept")

	Model 1	Model 2	Model 3
IDRight	-8.8 (0.6)***
gndrMale		-1.0 (0.5)*
Baby Boomers			3.5 (0.7)***
Gen X			5.6 (0.7)***
Millennials			5.8 (0.7)***
Num.Obs.	4936	9823	9562
R2	0.037	0.000	0.009
R2 Adj.	0.036	0.000	0.009
AIC	44723.8	88873.2	86473.9
BIC	44743.3	88894.8	86509.7
Log.Lik.	-22358.894	-44433.590	-43231.944
RMSE	22.44	22.30	22.25

tidy(model3)

## # A tibble: 2 × 5
##   term        estimate std.error statistic  p.value
##   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)    72.0      0.478     151.  0       
## 2 IDRight        -8.80     0.643     -13.7 6.33e-42

tidy(model4)

## # A tibble: 2 × 5
##   term        estimate std.error statistic p.value
##   <chr>          <dbl>     <dbl>     <dbl>   <dbl>
## 1 (Intercept)   70.0       0.305    230.    0     
## 2 gndrMale      -0.967     0.452     -2.14  0.0324

tidy(model5)

## # A tibble: 4 × 5
##   term            estimate std.error statistic  p.value
##   <chr>              <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)        65.8      0.508    130.   0       
## 2 genBaby Boomers     3.55     0.661      5.37 7.97e- 8
## 3 genGen X            5.58     0.659      8.47 2.85e-17
## 4 genMillennials      5.79     0.706      8.21 2.51e-16

Visualizing coefficient estimates and 95% confidence intervals for the models and interpretting

modelplot(models2, coef_omit = 'Interc', coef_rename = c("genBaby Boomers" = "Baby Boomers", "genGen X" = "Gen X", "genMillennials" = "Millennials"))

Coefficient estimates interpretation:

Millennials (5.8, p<0.001): Millennials present the largest increase in populist values, with their scores being 5.8 points higher than those of the Interwar generation (as the reference group). This marked difference is statistically significant, indicating that Millennials, on average, possess the strongest populist attitudes among the four types of generations in the dataset, for the country of Greece. The triple asterisks (***) denote statistical significance at a very, very, very lower p-value than the alpha threshold of 0.05 (p = 0.0000000000000002513449).

Gen X (5.6, p<0.001): Generation X presents a slightly less pronounced increase compared to Millennials, with their scores being 5.6 points higher than the Interwar generation (the reference group) on the populist values scale. Again, the triple asterisks (***) denote statistical significance at a very, very, very lower p-value than the alpha threshold of 0.05 (p = 0.00000000000000002853823).

Baby Boomers (3.6, p<0.001): Compared to the Interwar generation (reference category that is omitted), Baby Boomers present scores that are, on average, 3.6 points higher on the populist values scale. Again, the triple asterisks (***) denote statistical significance at a very, very lower p-value than the alpha threshold of 0.05 (p = 0.00000007974151).

gnderMale (-1.0, p<0.05): Male is statistically significant as their scores are 1.0 point lower than female on the populist values scale, with the p-value of 0.032, indicating that males, on average, possess a slightly weaker populist attitudes than females, for the country of Greece. The single asterisk (*) denotes statistical significance at a lower p-value than the alpha threshold of 0.05 (p = 0.03244758).

IDRight (-8.8, p<0.001): Compared to those who lean politically left, those who lean politically right present scores that are, on average, 8.8 points lower on the populist values scale. This marked difference is statistically significant, indicating that those who lean politically right, on average, possess quite weaker populist attitudes than those who lean politically left, for the country of Greece. The triple asterisks (***) denote statistical significance at an exceedingly lower p-value than the alpha threshold of 0.05 (p = 0.000000000000000000000000000000000000000006333249).

confint(model3, level=0.95)

##                2.5 %    97.5 %
## (Intercept)  71.0653 72.938927
## IDRight     -10.0591 -7.539754

confint(model4, level=0.95)

##                 2.5 %      97.5 %
## (Intercept) 69.416151 70.60995391
## gndrMale    -1.852559 -0.08087043

confint(model5, level=0.95)

##                     2.5 %    97.5 %
## (Intercept)     64.777052 66.768087
## genBaby Boomers  2.254559  4.845308
## genGen X         4.288926  6.872488
## genMillennials   4.409253  7.175453

95% confidence interval interpretation:

For the variable IDRight, the confidence interval ranging from 61.0 (71.1-10.1) to 65.4 (72.9-7.5) of populist scores scale is expected to situate the true population parameter, across 95% of random samples.

For the variable gndrMale, the confidence interval ranging from 67.5 (69.4-1.9) to 70.5 (70.6-0.1) of populist scores scale is expected to situate the true population parameter, across 95% of random samples.

For the variable genBaby Boomers, the confidence interval ranging from 67.0 (64.8+2.2) to 71.6 (66.8+4.8) of populist scores scale is expected to situate the true population parameter, across 95% of random samples.

For the variable genGen X, the confidence interval ranging from 69.1 (64.8+4.3) to 73.7 (66.8+6.9) of populist scores scale is expected to situate the true population parameter, across 95% of random samples.

For the variable genMillennials, the confidence interval ranging from 69.2 (64.8+4.4) to 74.0 (66.8+7.2) of populist scores scale is expected to situate the true population parameter, across 95% of random samples.

Kim_Eric_Homework_4

Seokang Eric Kim

2024-03-13