library(foreign)
# Variable name                     Label
# -----------------------------------------------------------------------------------------------
# ess11_reg                         ESS11REG – Regional-level data (identifier)
#
# reg11_area_2023                   Area (km²) – 2023
# reg11_tpopsz_2023                 Population size – Total – 2023
# reg11_fpopsz_2023                 Population size – Female – 2023
# reg11_mpopsz_2023                 Population size – Male – 2023
# reg11_pode_2023                   Population density – 2023
#
# reg11_lbirth_2023                 Live births (total) – 2023
# reg11_death_2023                  Deaths (total) – 2023
# reg11_natgrow_2023                Natural change of population – 2023
# reg11_cnmigrat_2023               Net migration plus statistical adjustment – 2023
# reg11_grow_2023                   Total population change – 2023
#
# reg11_gbirthrt_2023               Crude birth rate – 2023
# reg11_gdeathrt_2023               Crude death rate – 2023
# reg11_natgrowrt_2023              Crude rate of natural change of population – 2023
# reg11_cnmigratrt_2023             Crude rate of net migration plus statistical adjustment – 2023
# reg11_growrt_2023                 Crude rate of total population change – 2023
#
# reg11_gdp_eurhab_2023             GDP at current market prices – Euro per inhabitant – 2023
# reg11_gdp_mio_eur_2023            GDP at current market prices – Million euro – 2023
# reg11_gdp_eurhab_eu27_2020_2023   GDP per inhabitant in % of EU27 (from 2020) average – 2023
# reg11_gdp_mio_nac_2023            GDP at current market prices – Million units of national currency – 2023
# reg11_gdp_mio_pps_eu27_2020_2023  GDP at current market prices – Million PPS (EU27 from 2020) – 2023
# reg11_gdp_pps_eu27_2020_hab_2023  GDP at current market prices – PPS per inhabitant – 2023
# reg11_gdp_pps_hab_eu27_2020_2023  GDP per inhabitant in % of EU27 (from 2020) average (PPS) – 2023
#
# read region data (see above)
dfreg <- read.spss("ESS11_ML_Region.sav",
to.data.frame = TRUE)
names(dfreg)
## [1] "region" "reg11_area_2023"
## [3] "reg11_tpopsz_2023" "reg11_fpopsz_2023"
## [5] "reg11_mpopsz_2023" "reg11_pode_2023"
## [7] "reg11_lbirth_2023" "reg11_death_2023"
## [9] "reg11_natgrow_2023" "reg11_cnmigrat_2023"
## [11] "reg11_grow_2023" "reg11_gbirthrt_2023"
## [13] "reg11_gdeathrt_2023" "reg11_natgrowrt_2023"
## [15] "reg11_cnmigratrt_2023" "reg11_growrt_2023"
## [17] "reg11_gdp_eurhab_2023" "reg11_gdp_mio_eur_2023"
## [19] "reg11_gdp_eurhab_eu27_2020_2023" "reg11_gdp_mio_nac_2023"
## [21] "reg11_gdp_mio_pps_eu27_2020_2023" "reg11_gdp_pps_eu27_2020_hab_2023"
## [23] "reg11_gdp_pps_hab_eu27_2020_2023"
dfreg$region = trimws(dfreg$region)
# check
unique(dfreg$region)
## [1] "AT11 " "AT12 " "AT13 " "AT21 " "AT22 " "AT31 " "AT32 " "AT33 " "AT34 "
## [10] "BE10 " "BE21 " "BE22 " "BE23 " "BE24 " "BE25 " "BE31 " "BE32 " "BE33 "
## [19] "BE34 " "BE35 " "BG311" "BG312" "BG313" "BG314" "BG315" "BG321" "BG322"
## [28] "BG323" "BG324" "BG325" "BG331" "BG332" "BG333" "BG334" "BG341" "BG342"
## [37] "BG343" "BG344" "BG411" "BG412" "BG413" "BG414" "BG415" "BG421" "BG422"
## [46] "BG423" "BG424" "BG425" "CY0 " "DE1 " "DE2 " "DE3 " "DE4 " "DE6 "
## [55] "DE7 " "DE8 " "DE9 " "DEA " "DEB " "DEC " "DED " "DEE " "DEF "
## [64] "DEG " "EL30 " "EL41 " "EL42 " "EL43 " "EL51 " "EL52 " "EL53 " "EL54 "
## [73] "EL61 " "EL62 " "EL63 " "EL64 " "EL65 " "ES11 " "ES12 " "ES13 " "ES21 "
## [82] "ES22 " "ES23 " "ES24 " "ES30 " "ES41 " "ES42 " "ES43 " "ES51 " "ES52 "
## [91] "ES53 " "ES61 " "ES62 " "ES64 " "ES70 " "FI196" "FI1B1" "FI1C1" "FI1C2"
## [100] "FI1C5" "FI1D5" "FI1D7" "FI200" "FR10 " "FRB0 " "FRC1 " "FRC2 " "FRD1 "
## [109] "FRD2 " "FRE1 " "FRE2 " "FRF1 " "FRF2 " "FRF3 " "FRG0 " "FRH0 " "FRI1 "
## [118] "FRI2 " "FRI3 " "FRJ1 " "FRJ2 " "FRK1 " "FRK2 " "FRL0 " "HR021" "HR022"
## [127] "HR023" "HR024" "HR025" "HR026" "HR027" "HR028" "HR031" "HR032" "HR033"
## [136] "HR034" "HR035" "HR036" "HR037" "HU110" "HU120" "HU211" "HU212" "HU213"
## [145] "HU221" "HU222" "HU223" "HU231" "HU232" "HU233" "HU311" "HU312" "HU313"
## [154] "HU321" "HU322" "HU323" "HU331" "HU332" "HU333" "IE041" "IE042" "IE051"
## [163] "IE052" "IE053" "IE061" "IE062" "IE063" "ITC " "ITF " "ITG " "ITH "
## [172] "ITI " "LT011" "LT021" "LT022" "LT023" "LT024" "LT025" "LT026" "LT027"
## [181] "LT028" "LT029" "ME0 " "NL11 " "NL12 " "NL13 " "NL21 " "NL22 " "NL23 "
# still white spaces at the end of region code.
# so we have to trim non-regular white space manually:
dfreg$region = gsub("^[[:space:]\u00A0]+|[[:space:]\u00A0]+$", "", dfreg$region)
# check
unique(dfreg$region) # yep, much better now
## [1] "AT11" "AT12" "AT13" "AT21" "AT22" "AT31" "AT32" "AT33" "AT34"
## [10] "BE10" "BE21" "BE22" "BE23" "BE24" "BE25" "BE31" "BE32" "BE33"
## [19] "BE34" "BE35" "BG311" "BG312" "BG313" "BG314" "BG315" "BG321" "BG322"
## [28] "BG323" "BG324" "BG325" "BG331" "BG332" "BG333" "BG334" "BG341" "BG342"
## [37] "BG343" "BG344" "BG411" "BG412" "BG413" "BG414" "BG415" "BG421" "BG422"
## [46] "BG423" "BG424" "BG425" "CY0" "DE1" "DE2" "DE3" "DE4" "DE6"
## [55] "DE7" "DE8" "DE9" "DEA" "DEB" "DEC" "DED" "DEE" "DEF"
## [64] "DEG" "EL30" "EL41" "EL42" "EL43" "EL51" "EL52" "EL53" "EL54"
## [73] "EL61" "EL62" "EL63" "EL64" "EL65" "ES11" "ES12" "ES13" "ES21"
## [82] "ES22" "ES23" "ES24" "ES30" "ES41" "ES42" "ES43" "ES51" "ES52"
## [91] "ES53" "ES61" "ES62" "ES64" "ES70" "FI196" "FI1B1" "FI1C1" "FI1C2"
## [100] "FI1C5" "FI1D5" "FI1D7" "FI200" "FR10" "FRB0" "FRC1" "FRC2" "FRD1"
## [109] "FRD2" "FRE1" "FRE2" "FRF1" "FRF2" "FRF3" "FRG0" "FRH0" "FRI1"
## [118] "FRI2" "FRI3" "FRJ1" "FRJ2" "FRK1" "FRK2" "FRL0" "HR021" "HR022"
## [127] "HR023" "HR024" "HR025" "HR026" "HR027" "HR028" "HR031" "HR032" "HR033"
## [136] "HR034" "HR035" "HR036" "HR037" "HU110" "HU120" "HU211" "HU212" "HU213"
## [145] "HU221" "HU222" "HU223" "HU231" "HU232" "HU233" "HU311" "HU312" "HU313"
## [154] "HU321" "HU322" "HU323" "HU331" "HU332" "HU333" "IE041" "IE042" "IE051"
## [163] "IE052" "IE053" "IE061" "IE062" "IE063" "ITC" "ITF" "ITG" "ITH"
## [172] "ITI" "LT011" "LT021" "LT022" "LT023" "LT024" "LT025" "LT026" "LT027"
## [181] "LT028" "LT029" "ME0" "NL11" "NL12" "NL13" "NL21" "NL22" "NL23"
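The first `trimws()` call left the codes unchanged because, by default, it strips only spaces, tabs, carriage returns, and newlines, so trailing non-breaking spaces (U+00A0) survive. A minimal sketch with made-up codes (the vector `codes` is illustrative and not read from the file):

```r
# Illustrative region codes; the first two end in non-breaking spaces (U+00A0)
codes <- c("AT11\u00A0", "CY0\u00A0\u00A0", "BG311")

# Default trimws() only removes [ \t\r\n], so the padding stays in place
after_trimws <- trimws(codes)
nchar(after_trimws)

# Adding U+00A0 to the character class removes the padding as well
cleaned <- gsub("^[[:space:]\u00A0]+|[[:space:]\u00A0]+$", "", codes)
cleaned
```

Comparing `nchar()` before and after is a quick way to spot invisible padding that `unique()` output does not reveal clearly.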
nrow(dfreg)
## [1] 189
ncol(dfreg)
## [1] 23
table(complete.cases(dfreg))
##
## TRUE
## 189
# number of regions
length(unique(dfreg$region))
## [1] 189
# read original data
df10 = read.spss("ESS10e03_3.sav", to.data.frame = TRUE, use.value.labels = FALSE)
df11 = read.spss("ESS11_unlabeled.0-10.sav", to.data.frame = TRUE, use.value.labels = FALSE)
df10$region = trimws(df10$region)
df11$region = trimws(df11$region)
nrow(df10)
## [1] 37611
ncol(df10)
## [1] 621
nrow(df11)
## [1] 40156
ncol(df11)
## [1] 640
table(complete.cases(df10))
##
## FALSE
## 37611
table(complete.cases(df11))
##
## FALSE
## 40156
# number of regions in df10
length(unique(df10$region))
## [1] 248
# number of regions in df11
length(unique(df11$region))
## [1] 264
# regions in df10 but not in df11
unique(df10$region)[!(unique(df10$region) %in% unique(df11$region))]
## [1] "BG331" "BG334" "BG411" "BG423" "BG415" "BG344" "BG421" "BG342" "BG425"
## [10] "BG323" "BG321" "BG312" "BG322" "BG413" "BG332" "BG343" "BG314" "BG422"
## [19] "BG333" "BG341" "BG325" "BG324" "BG315" "BG414" "BG311" "BG412" "BG424"
## [28] "CZ051" "CZ071" "CZ042" "CZ010" "CZ031" "CZ032" "CZ080" "CZ063" "CZ053"
## [37] "CZ052" "CZ020" "CZ072" "CZ041" "CZ064" "EE004" "EE001" "EE008" "EE00A"
## [46] "EE009" "FI1D9" "FI1D8" "IS002" "IS001" NA "ME0" "MK002" "MK008"
## [55] "MK007" "MK005" "MK006" "MK003" "MK001" "MK004"
# regions in df11 but not in df10
unique(df11$region)[!(unique(df11$region) %in% unique(df10$region))]
## [1] "AT31" "AT22" "AT33" "AT32" "AT12" "AT11" "AT13" "AT34" "AT21"
## [10] "CY0" "DEF" "DE2" "DEB" "DE8" "DEA" "DE7" "DE9" "DE1"
## [19] "DED" "DE4" "DEG" "DE3" "DE6" "DEE" "DEC" "ES21" "ES61"
## [28] "ES12" "ES52" "ES30" "ES11" "ES51" "ES24" "ES42" "ES41" "ES64"
## [37] "ES13" "ES43" "ES23" "ES70" "ES53" "ES62" "ES22" "FI1D6" "FI1D4"
## [46] "IS01" "IS02" "PL91" "PL71" "PL81" "PL61" "PL21" "PL41" "PL51"
## [55] "PL42" "PL63" "PL92" "PL82" "PL52" "PL72" "PL22" "PL43" "PL84"
## [64] "PL62" "RS21" "RS22" "RS11" "RS12" "SE22" "SE23" "SE12" "SE31"
## [73] "SE33" "SE21" "SE11" "SE32"
# number of regions in df10 but not in df11
length(unique(df10$region)[!(unique(df10$region) %in% unique(df11$region))])
## [1] 60
#length(setdiff(df10$region, df11$region))
# number of regions in df11 but not in df10
length(unique(df11$region)[!(unique(df11$region) %in% unique(df10$region))])
## [1] 76
#length(setdiff(df11$region, df10$region))
# number of regions in either df10 or df11 (union)
length(union(df10$region, df11$region))
## [1] 324
# number of regions in both df10 and df11 (intersection)
length(intersect(df10$region, df11$region))
## [1] 188
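The counts above fit together by inclusion-exclusion: 248 + 264 - 188 = 324. The sketch below, on made-up codes (`r10` and `r11` are illustrative vectors, not the ESS lists), shows that `setdiff()` returns the same result as the `%in%` idiom and that the union size can be checked the same way. Note that `NA` is treated as an ordinary value by these functions.

```r
# Illustrative region codes for two rounds (not the real ESS lists)
r10 <- c("AT11", "BG311", NA, "ME0", "BG311")
r11 <- c("AT11", "IS01", "IS01")

# The %in% idiom and setdiff() give the same answer (duplicates removed)
only_r10_idiom   <- unique(r10)[!(unique(r10) %in% unique(r11))]
only_r10_setdiff <- setdiff(r10, r11)

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
n_union     <- length(union(r10, r11))
n_intersect <- length(intersect(r10, r11))
n_union == length(unique(r10)) + length(unique(r11)) - n_intersect
```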
# list of all variable names from both rounds
allvars = union(names(df10), names(df11))
# variable names from one round which do not exist in the other round
vars_not_in_df10 = setdiff(allvars, names(df10))
vars_not_in_df11 = setdiff(allvars, names(df11))
# add missing columns to both rounds and fill with NA
if (length(vars_not_in_df10) > 0) df10[vars_not_in_df10] = NA
if (length(vars_not_in_df11) > 0) df11[vars_not_in_df11] = NA
# align order of columns in both rounds
df10 = df10[allvars]
df11 = df11[allvars]
nrow(df10)
## [1] 37611
ncol(df10)
## [1] 912
nrow(df11)
## [1] 40156
ncol(df11)
## [1] 912
# number of regions in df10
length(unique(df10$region))
## [1] 248
# number of regions in df11
length(unique(df11$region))
## [1] 264
# append both data sets
df = rbind(df10, df11)
nrow(df)
## [1] 77767
ncol(df)
## [1] 912
# number of regions in df
length(unique(df$region))
## [1] 324
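The column alignment used before `rbind()` can be illustrated on two tiny data frames. This is only a sketch; `a` and `b` are made-up stand-ins for the two rounds.

```r
# Two toy "rounds" that share only the id column
a <- data.frame(id = 1:2, x = c(10, 20))
b <- data.frame(id = 3:4, y = c("u", "v"))

# All column names, in a fixed order
allvars <- union(names(a), names(b))

# Add the columns each data frame lacks, filled with NA
a[setdiff(allvars, names(a))] <- NA
b[setdiff(allvars, names(b))] <- NA

# Same columns in the same order, so rbind() can stack the rows
stacked <- rbind(a[allvars], b[allvars])
stacked
```

Every row keeps its own values, and variables that were not asked in a round appear as `NA` for that round's respondents.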
#### "Right-wing Populist Voter", ESS10
# Country          ESS10_variable  Right-wing_populist_party_included                     ESS10_Code(s)
# -----------------------------------------------------------------------------------------------------
# Belgium          prtvtebe        Vlaams Belang                                          6
# Bulgaria         prtvtebg        VMRO, Ataka, Vazrazhdane                               6, 8, 9
# Switzerland      prtvthch        Swiss People's Party (SVP)                             1
# Croatia          prtvtbhr        DP / Hrvatski suverenisti / Hrast (nationalist bloc)   3
# Czechia          prtvtecz        Svoboda a přímá demokracie (SPD)                       8
# Estonia          prtvthee        EKRE (Eesti Konservatiivne Rahvaerakond)               6
# Finland          prtvtefi        True Finns                                             5
# France           prtvtefr        Front National (FN)                                    11
# Greece           prtvtdgr        Ελληνική Λύση, Χρυσή Αυγή                              5, 7
# Hungary          prtvtghu        Fidesz, Jobbik                                         3, 4
# Iceland          prtvtdis        Miðflokkurinn                                          6
# Ireland          prtvtdie        (none included)                                        —
# Italy            prtvtdit        Lega, Fratelli d’Italia, CasaPound                     3, 5, 10
# Lithuania        prtvclt1–3      National Alliance (NS)                                 7
# Montenegro       prtvtame        Democratic Front (DF)                                  9
# Netherlands      prtvthnl        Party for Freedom (PVV), Forum for Democracy, JA21     3, 13, 16
# North Macedonia  prtvtmk         VMRO-DPMNE, Levica                                     2, 5
# Norway           prtvtbno        Fremskrittspartiet (Progress Party)                    8
# Portugal         prtvtdpt        CHEGA, PNR                                             4, 15
# Slovenia         prtvtfsi        SDS, SNS                                               8, 11
# Slovakia         prtvtesk        ĽS Naše Slovensko                                      4
# United Kingdom   prtvtdgb        UK Independence Party, Brexit Party                    7, 8
#### "Right-wing Populist Voter", ESS11
# Country          ESS11_variable      Right-wing_populist_party_included                                    ESS11_Code(s)
# ------------------------------------------------------------------------------------------------------------------------
# Austria          prtvtdat            FPÖ                                                                   3
# Croatia          prtvtchr            DP Miroslava Škore / Hrvatski suverenisti / Hrast (nationalist bloc)  3
# Finland          prtvtffi            True Finns                                                            8
# Germany          prtvgde1, prtvgde2  Alternative for Germany (AfD)                                         6
# Hungary          prtvthhu            Mi Hazánk                                                             5
# Ireland          prtvteie            (none included)                                                       —
# Lithuania        prtvclt1–3          National Alliance (NS)                                                7
# Netherlands      prtvtinl            Party for Freedom (PVV)                                               3
# Norway           prtvtcno            Fremskrittspartiet (Progress Party)                                   8
# Slovakia         prtvtesk            ĽS Naše Slovensko                                                     4
# Slovenia         prtvtgsi            SDS – Slovenska demokratska stranka                                   8
# Switzerland      prtvthch            Swiss People’s Party (SVP)                                            1
# United Kingdom   prtvtdgb            UKIP, Brexit Party                                                    7, 8
# there is voting information for
countries = c("AT", "BE", "BG", "CH", "CZ", "DE", "EE", "FI", "FR",
"GB", "GR", "HR", "HU", "IS", "IT", "LT", "ME", "MK",
"NL", "NO", "PT", "SI", "SK")
# there is no voting information for
setdiff(unique(df$cntry), countries)
## [1] "IE" "CY" "ES" "PL" "RS" "SE"
# reduce df to countries providing voting info
df = df[!(df$cntry %in% setdiff(unique(df$cntry), countries)),]
# check
length(unique(df$cntry)) == length(countries)
## [1] TRUE
# define right-wing populist voters
df$rwpop =
(df$cntry == "AT" & df$prtvtdat == 3) |
(df$cntry == "BE" & df$prtvtebe == 6) |
(df$cntry == "BG" & df$prtvtebg %in% c(6, 8, 9)) |
(df$cntry == "CH" & df$prtvthch == 1) |
(df$cntry == "CZ" & df$prtvtecz == 8) |
(df$cntry == "DE" & (df$prtvgde1 %in% c(6) | df$prtvgde2 %in% c(6))) |
(df$cntry == "EE" & df$prtvthee == 6) |
(df$cntry == "FI" & df$prtvtffi %in% c(8)) | # ESS11
(df$cntry == "FI" & df$prtvtefi == 5) | # ESS10
(df$cntry == "FR" & df$prtvtefr == 11) |
(df$cntry == "GB" & df$prtvtdgb %in% c(7,8)) |
(df$cntry == "GR" & df$prtvtdgr %in% c(5, 7)) |
(df$cntry == "HR" & df$prtvtchr == 3) |
(df$cntry == "HU" & df$prtvthhu == 5) | # ESS11
(df$cntry == "HU" & df$prtvtghu %in% c(3, 4)) | # ESS10
(df$cntry == "IS" & df$prtvtdis == 6) |
(df$cntry == "IT" & df$prtvtdit %in% c(3, 5, 10)) |
(df$cntry == "LT" & (
df$prtvclt1 == 7 |
df$prtvclt2 == 7 |
df$prtvclt3 == 7 )) |
(df$cntry == "ME" & df$prtvtame == 9) |
(df$cntry == "MK" & df$prtvtmk %in% c(2, 5)) |
(df$cntry == "NL" & df$prtvtinl == 3) | # ESS11
(df$cntry == "NL" & df$prtvthnl %in% c(3, 13, 16)) | # ESS10
(df$cntry == "NO" & df$prtvtbno == 8) |
(df$cntry == "PT" & df$prtvtdpt %in% c(4, 15)) |
(df$cntry == "SK" & df$prtvtesk == 4) |
(df$cntry == "SI" & df$prtvtgsi == 8) | # ESS11
(df$cntry == "SI" & df$prtvtfsi %in% c(8, 11)) # ESS10
# df$rwpop[is.na(df$rwpop)] = FALSE
table(df$rwpop, useNA="always")
##
## FALSE TRUE <NA>
## 38496 3922 24798
tapply(df$rwpop, df$cntry, mean, na.rm = TRUE)
## AT BE BG CH CZ DE
## 0.154471545 0.087710084 0.028329654 0.226429102 0.064139942 0.042561983
## EE FI FR GB GR HR
## 0.130663857 0.239906832 0.122425629 0.004941758 0.015118790 0.054862843
## HU IS IT LT ME MK
## 0.394225983 0.045171340 0.048864668 1.000000000 0.245635910 0.158152554
## NL NO PT SI SK
## 0.148014440 0.083333333 0.006228589 0.377445339 0.053929122
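One possible reading of the LT share of exactly 1 above is R's three-valued logic: `FALSE | NA` evaluates to `NA`, so a respondent who named another party in one vote variable and left the others unanswered ends up as `NA` instead of `FALSE`. The sketch below illustrates this on made-up values. The helper `voted_for()` is hypothetical, not part of the analysis, and encodes one possible rule, which is an assumption about how missing answers should be treated.

```r
# Illustrative vote codes for three respondents; 7 is the party of interest
v1 <- c(7, 3, NA)    # respondent 3 did not report a vote
v2 <- c(NA, NA, NA)  # second vote variable, unanswered by everyone here

# Chained == comparisons: FALSE | NA is NA, so a clear "no" is lost
naive <- (v1 == 7) | (v2 == 7)

# Hypothetical helper: TRUE on any match, FALSE if at least one variable
# was answered without a match, NA only if every variable is missing
voted_for <- function(codes, ...) {
  m <- cbind(...)
  hit <- rowSums(matrix(m %in% codes, nrow = nrow(m))) > 0
  answered <- rowSums(!is.na(m)) > 0
  ifelse(answered, hit, NA)
}
safe <- voted_for(7, v1, v2)

mean(naive, na.rm = TRUE)  # respondent 2 is dropped as NA
mean(safe, na.rm = TRUE)   # respondent 2 counts as not voting for the party
```

The same distinction applies when `==` and `%in%` are mixed: `x == 3` yields `NA` for a missing `x`, whereas `x %in% 3` yields `FALSE`.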
#### Define complex Variable "Depression Score"
# CES-D8 items
df$d1 = as.numeric(df$fltdpr)
df$d2 = as.numeric(df$flteeff)
df$d3 = as.numeric(df$fltlnl)
df$d4 = 5 - as.numeric(df$enjlf)
df$d5 = as.numeric(df$fltsd)
df$d6 = 5 - as.numeric(df$wrhpp)
df$d7 = as.numeric(df$slprl)
df$d8 = as.numeric(df$cldgng)
# CES-D8 score
df$cesd8 = rowMeans(df[, paste0("d", 1:8)])
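The `5 - x` step reverses the positively worded items so that higher values always mean more depressive symptoms. It assumes, as the code above implies, that the items are coded 1 to 4. A small sketch on made-up answers (the data frame `items` is illustrative):

```r
# Illustrative answers on a 1-4 scale for three respondents
items <- data.frame(
  fltdpr = c(1, 4, 2),   # negatively worded item, used as is
  enjlf  = c(4, 1, NA)   # positively worded item, to be reversed
)
items$d1 <- items$fltdpr
items$d4 <- 5 - items$enjlf   # maps 1 <-> 4 and 2 <-> 3

# rowMeans() returns NA when any item is missing (default na.rm = FALSE)
score <- rowMeans(items[, c("d1", "d4")])
score
```

With the default `na.rm = FALSE`, a respondent with a single missing item gets no score, which is one reason a composite can have more missing values than any individual item.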
#### Define complex Variable "BMI"
grep("weight", names(df))
## [1] 7 9 10 695
names(df)[grep("weight", names(df))] # list all variable names containing "weight"
## [1] "dweight" "pweight" "anweight" "weighta"
names(df)[694:696] # look up adjacent variable names
## [1] "height" "weighta" "dshltgp"
class(df$height)
## [1] "numeric"
#df$height = as.numeric(as.character(df$height))
mean(df$height, na.rm = TRUE)
## [1] 171.082
df$height = df$height / 100 # convert body height from cm to metres
class(df$weighta)
## [1] "numeric"
#df$weighta = as.numeric(as.character(df$weighta))
df$bmi = df$weighta / (df$height^2) # kg / m^2
# https://apps.who.int/nutrition/landscape/help.aspx?menu=0&helpid=420
# BMI < 17.0 indicates moderate and severe thinness
# BMI < 18.5 indicates underweight
# BMI 18.5–24.9 indicates normal weight
# BMI ≥ 25.0 indicates overweight
# BMI ≥ 30.0 indicates obesity
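The thresholds listed above can be turned into a categorical variable with `cut()`. This is a sketch on made-up heights and weights. The four labels are illustrative, and the "moderate and severe thinness" band below 17.0 is folded into "underweight" here for brevity.

```r
# Illustrative respondents: height in cm, weight in kg
height_cm <- c(180, 165, 170)
weight_kg <- c(60, 70, 90)
bmi <- weight_kg / (height_cm / 100)^2

# right = FALSE makes the intervals [lower, upper), so 25.0 is "overweight"
bmi_class <- cut(
  bmi,
  breaks = c(-Inf, 18.5, 25, 30, Inf),
  labels = c("underweight", "normal", "overweight", "obese"),
  right  = FALSE
)
data.frame(bmi = round(bmi, 1), bmi_class)
```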
##################################################
##### SEMINAR PAPER STUDENT INPUT ################
##################################################
# define list of variable names to be used for aggregated data
#varnames = c("cntry", "region", "gndr", "agea", "health", "rwpop", "cesd8", "bmi", "eisced",
#"hinctnta")
# expanded list of variable names
varnames = c(
"cntry",
"region",
"gndr",
"agea",
"health",
"rwpop",
"eisced", # education
"hinctnta", # household income (deciles)
"cesd8", # depression score
"bmi" # body mass index
)
Hypothesis 1 (Age): Regions with a higher average age exhibit a higher share of right-wing populist voters.
Hypothesis 2 (Education): Regions with lower average levels of education exhibit a higher share of right-wing populist voters.
Hypothesis 3 (Health): Regions with poorer average health exhibit a higher share of right-wing populist voters.
Hypothesis 4 (Depression / Mental Health): Regions with higher average depression scores (CES-D8) exhibit a higher share of right-wing populist voters.
Hypothesis 5 (Gender): Regions with a higher proportion of men exhibit a higher share of right-wing populist voters.
##################################################
##### END SEMINAR PAPER STUDENT INPUT ############
##################################################
# and limit the dataset to complete cases for these variables
summary(df[,varnames]) ## check for NAs
## cntry region gndr agea
## Length:67216 Length:67216 Min. :1.000 Min. :15.00
## Class :character Class :character 1st Qu.:1.000 1st Qu.:36.00
## Mode :character Mode :character Median :2.000 Median :52.00
## Mean :1.535 Mean :51.13
## 3rd Qu.:2.000 3rd Qu.:66.00
## Max. :2.000 Max. :90.00
## NA's :446
## health rwpop eisced hinctnta
## Min. :1.000 Mode :logical Min. : 1.000 Min. : 1.000
## 1st Qu.:1.000 FALSE:38496 1st Qu.: 3.000 1st Qu.: 3.000
## Median :2.000 TRUE :3922 Median : 4.000 Median : 5.000
## Mean :2.152 NA's :24798 Mean : 4.241 Mean : 5.488
## 3rd Qu.:3.000 3rd Qu.: 6.000 3rd Qu.: 8.000
## Max. :5.000 Max. :55.000 Max. :10.000
## NA's :82 NA's :253 NA's :14217
## cesd8 bmi
## Min. :1.000 Min. :16.00
## 1st Qu.:1.375 1st Qu.:22.76
## Median :1.625 Median :25.39
## Mean :1.703 Mean :25.78
## 3rd Qu.:2.000 3rd Qu.:28.23
## Max. :4.000 Max. :40.00
## NA's :36423 NA's :37615
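The maximum of 55 for `eisced` in the summary above suggests that a non-ordinal category code is still in the data. To my understanding of the ESS coding, 55 stands for "other" and 0 for "not possible to harmonise", but this should be checked against the codebook. Under that assumption, a sketch of recoding such values to `NA` before averaging, on made-up values:

```r
# Illustrative education codes; 55 and 0 are assumed to be non-ordinal categories
eisced <- c(1, 4, 7, 55, 3, 0)

# Keep only the ordinal levels 1-7; everything else becomes NA
eisced_clean <- ifelse(eisced %in% 1:7, eisced, NA)

mean(eisced)                       # inflated by the code 55
mean(eisced_clean, na.rm = TRUE)   # mean over the ordinal levels only
```

Even a few such codes can shift a regional mean noticeably when regions have only around 30 respondents.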
nrow(df)
## [1] 67216
df = df[complete.cases(df[,varnames]), varnames]
nrow(df)
## [1] 14128
df = df[df$cntry %in% countries, ]
nrow(df)
## [1] 14128
# recode regions with fewer than 30 respondents into a pooled category
# "other regions in country" (<cntry>_OTH)
# get small regions
region_n = aggregate(cbind(n = !is.na(region)) ~ cntry+region, df, sum, na.rm = TRUE)
region_n = region_n[region_n$n < 30, ]
# create a new region code comprising all small regions of a single country
region_n$newRegion = paste0(region_n$cntry, "_OTH")
# replace region in original data with new region code
tmp = merge(df, region_n[, c("region", "newRegion")],
by = c("region"), all.x = TRUE, sort = FALSE)
unique(tmp$region)
## [1] "AT34" "BE34" "BE31" "CH07" "DE8" "DEC" "FI1D5" "FI197" "FI1C1"
## [10] "FI194" "FI1D7" "FI193" "FI1D1" "FI195" "FI196" "FI1D6" "FI1C3" "FI1C4"
## [19] "FI1D2" "FI1C2" "FI1D4" "FI1C5" "FI1D3" "UKN" "EL62" "EL42" "EL41"
## [28] "HR023" "HR021" "HR022" "HR036" "HR035" "HR026" "HR033" "HR032" "HR063"
## [37] "HR024" "HR034" "HR061" "HR027" "HR037" "HU232" "HU211" "HU333" "HU212"
## [46] "HU312" "HU223" "HU213" "HU221" "HU332" "HU313" "LT023" "LT011" "LT022"
## [55] "NL23" "SI038" "SI036" "SI035" "SI033" "AT31" "AT22" "AT33" "AT12"
## [64] "AT11" "AT13" "AT32" "DEA" "DE1" "AT21" "DEF" "DE2" "DE3"
## [73] "DE7" "DEB" "DE4" "DEG" "BE24" "BE23" "BE25" "BE32" "BE10"
## [82] "BE21" "BE33" "BE22" "DE6" "BE35" "HU120" "HU231" "HU311" "HU323"
## [91] "HU321" "HU110" "HU331" "NL31" "UKK" "UKL" "UKD" "UKG" "UKH"
## [100] "UKC" "UKF" "NL42" "UKJ" "UKM" "CH02" "CH03" "CH05" "CH01"
## [109] "CH06" "CH04" "UKI" "EL64" "EL63" "EL52" "EL61" "DE9" "DED"
## [118] "DEE" "EL30" "EL51" "EL43" "EL65" "EL53" "EL54" "ITI" "HR025"
## [127] "HR050" "NL32" "HR062" "NL33" "NL34" "HR065" "NL13" "HR064" "NL41"
## [136] "FI1B1" "NL21" "HU222" "HU322" "NL22" "NL12" "UKE" "PT11" "PT18"
## [145] "PT17" "ITF" "ITC" "ITG" "ITH" "HR031" "SK023" "HR028" "SK010"
## [154] "SK031" "SK042" "NL11" "SK032" "SK021" "SK041" "SK022" "PT16" "PT15"
## [163] "SI031" "SI041" "SI042" "SI034" "SI043" "SI044" "SI032" "SI037"
length(unique(tmp$region))
## [1] 170
# replace region where a newRegion is available
tmp$region = ifelse(!is.na(tmp$newRegion), tmp$newRegion, tmp$region)
unique(tmp$region)
## [1] "AT_OTH" "BE_OTH" "CH_OTH" "DE_OTH" "FI_OTH" "GB_OTH" "GR_OTH" "HR_OTH"
## [9] "HU_OTH" "LT_OTH" "NL_OTH" "SI_OTH" "AT31" "AT22" "AT33" "AT12"
## [17] "AT11" "AT13" "AT32" "DEA" "DE1" "AT21" "DEF" "DE2"
## [25] "DE3" "DE7" "DEB" "DE4" "DEG" "BE24" "BE23" "BE25"
## [33] "BE32" "BE10" "BE21" "BE33" "BE22" "DE6" "BE35" "HU120"
## [41] "HU231" "HU311" "HU323" "HU321" "HU110" "HU331" "NL31" "UKK"
## [49] "UKL" "UKD" "UKG" "UKH" "UKC" "UKF" "NL42" "UKJ"
## [57] "UKM" "CH02" "CH03" "CH05" "CH01" "CH06" "CH04" "UKI"
## [65] "EL64" "EL63" "EL52" "EL61" "DE9" "DED" "DEE" "EL30"
## [73] "EL51" "EL43" "EL65" "EL53" "EL54" "ITI" "HR025" "HR050"
## [81] "NL32" "HR062" "NL33" "NL34" "HR065" "NL13" "HR064" "NL41"
## [89] "FI1B1" "NL21" "HU222" "HU322" "NL22" "NL12" "UKE" "PT11"
## [97] "PT18" "PT17" "ITF" "ITC" "ITG" "ITH" "HR031" "SK023"
## [105] "HR028" "SK010" "SK031" "SK042" "NL11" "SK032" "SK021" "SK041"
## [113] "SK022" "PT16" "PT15" "SI031" "SI041" "SI042" "SI034" "SI043"
## [121] "SI044" "SI032" "SI037"
length(unique(tmp$region))
## [1] 123
# drop helper column (optional)
tmp$newRegion = NULL
df = tmp
unique(df$region)
## [1] "AT_OTH" "BE_OTH" "CH_OTH" "DE_OTH" "FI_OTH" "GB_OTH" "GR_OTH" "HR_OTH"
## [9] "HU_OTH" "LT_OTH" "NL_OTH" "SI_OTH" "AT31" "AT22" "AT33" "AT12"
## [17] "AT11" "AT13" "AT32" "DEA" "DE1" "AT21" "DEF" "DE2"
## [25] "DE3" "DE7" "DEB" "DE4" "DEG" "BE24" "BE23" "BE25"
## [33] "BE32" "BE10" "BE21" "BE33" "BE22" "DE6" "BE35" "HU120"
## [41] "HU231" "HU311" "HU323" "HU321" "HU110" "HU331" "NL31" "UKK"
## [49] "UKL" "UKD" "UKG" "UKH" "UKC" "UKF" "NL42" "UKJ"
## [57] "UKM" "CH02" "CH03" "CH05" "CH01" "CH06" "CH04" "UKI"
## [65] "EL64" "EL63" "EL52" "EL61" "DE9" "DED" "DEE" "EL30"
## [73] "EL51" "EL43" "EL65" "EL53" "EL54" "ITI" "HR025" "HR050"
## [81] "NL32" "HR062" "NL33" "NL34" "HR065" "NL13" "HR064" "NL41"
## [89] "FI1B1" "NL21" "HU222" "HU322" "NL22" "NL12" "UKE" "PT11"
## [97] "PT18" "PT17" "ITF" "ITC" "ITG" "ITH" "HR031" "SK023"
## [105] "HR028" "SK010" "SK031" "SK042" "NL11" "SK032" "SK021" "SK041"
## [113] "SK022" "PT16" "PT15" "SI031" "SI041" "SI042" "SI034" "SI043"
## [121] "SI044" "SI032" "SI037"
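The pooling of small regions can also be written without `merge()`, which keeps the original row order. The following is a sketch of that alternative on a toy data frame, not the code used above. The threshold `min_n` is set to 2 only to keep the example small.

```r
toy <- data.frame(
  cntry  = c("AT", "AT", "AT", "AT", "BE", "BE"),
  region = c("AT11", "AT11", "AT12", "AT13", "BE10", "BE10"),
  stringsAsFactors = FALSE
)
min_n <- 2  # the analysis uses 30; 2 keeps the toy example readable

# Respondents per country-region combination, repeated on every row
n_region <- ave(rep(1, nrow(toy)), toy$cntry, toy$region, FUN = sum)

# Regions below the threshold are pooled into "<cntry>_OTH"
toy$region_pooled <- ifelse(n_region < min_n,
                            paste0(toy$cntry, "_OTH"),
                            toy$region)
table(toy$region_pooled)
```

A pooled category may itself remain below the threshold when a country has few small regions, so it can be worth re-checking the group sizes after pooling.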
##################################################
##### SEMINAR PAPER STUDENT INPUT ################
##################################################
# aggregate data by country and region
# expanded aggregation
dfa = aggregate(
cbind(
pct_male = (gndr == 1),
mean_age = agea,
pct_good_health = (health %in% c(1, 2)),
mean_education = eisced,
mean_income = hinctnta,
mean_cesd8 = cesd8,
mean_bmi = bmi,
pct_rwpop = rwpop
) ~ cntry + region,
df, mean, na.rm = TRUE
)
The individual-level survey responses are aggregated to the regional level to compute regional means and proportions.
Each region is assigned the proportion of men, the mean age, the proportion of respondents with `health` coded 1 or 2 (`pct_good_health`), the mean level of education, the mean household income decile, the mean depression score (CES-D8), the mean body mass index (BMI), and the proportion of right-wing populist voters.
Because the data were already restricted to complete cases on these variables, `na.rm = TRUE` excludes no further observations.
##################################################
##### END SEMINAR PAPER STUDENT INPUT ############
##################################################
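The aggregation relies on the fact that the mean of a TRUE/FALSE indicator is a proportion. A small sketch on made-up data (`toy` is illustrative) using the same `cbind(...) ~ group` pattern:

```r
toy <- data.frame(
  region = c("A", "A", "A", "B", "B"),
  gndr   = c(1, 2, 1, 2, 2),
  agea   = c(30, 50, 40, 60, 70)
)

# (gndr == 1) is coerced to 0/1 inside cbind(), so its mean is the share of men
agg <- aggregate(cbind(pct_male = (gndr == 1), mean_age = agea) ~ region,
                 toy, mean)
agg
```

Region A has two men among three respondents, so `pct_male` is 2/3, while region B has none.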
nrow(dfa)
## [1] 123
summary(dfa)
## cntry region pct_male mean_age
## Length:123 Length:123 Min. :0.2388 Min. :45.95
## Class :character Class :character 1st Qu.:0.4364 1st Qu.:51.93
## Mode :character Mode :character Median :0.4878 Median :54.25
## Mean :0.4890 Mean :54.23
## 3rd Qu.:0.5488 3rd Qu.:56.55
## Max. :0.6935 Max. :61.97
## pct_good_health mean_education mean_income mean_cesd8
## Min. :0.3846 Min. :2.531 Min. :3.609 Min. :1.430
## 1st Qu.:0.6200 1st Qu.:3.807 1st Qu.:5.006 1st Qu.:1.573
## Median :0.6848 Median :4.330 Median :5.560 Median :1.681
## Mean :0.6789 Mean :4.328 Mean :5.568 Mean :1.695
## 3rd Qu.:0.7599 3rd Qu.:4.727 3rd Qu.:6.158 3rd Qu.:1.784
## Max. :1.0000 Max. :6.686 Max. :7.420 Max. :2.098
## mean_bmi pct_rwpop
## Min. :23.21 Min. :0.00000
## 1st Qu.:25.53 1st Qu.:0.00000
## Median :26.10 Median :0.03974
## Mean :26.11 Mean :0.09149
## 3rd Qu.:26.74 3rd Qu.:0.11396
## Max. :27.93 Max. :1.00000
# should be equal
unique(df$region)
## [1] "AT_OTH" "BE_OTH" "CH_OTH" "DE_OTH" "FI_OTH" "GB_OTH" "GR_OTH" "HR_OTH"
## [9] "HU_OTH" "LT_OTH" "NL_OTH" "SI_OTH" "AT31" "AT22" "AT33" "AT12"
## [17] "AT11" "AT13" "AT32" "DEA" "DE1" "AT21" "DEF" "DE2"
## [25] "DE3" "DE7" "DEB" "DE4" "DEG" "BE24" "BE23" "BE25"
## [33] "BE32" "BE10" "BE21" "BE33" "BE22" "DE6" "BE35" "HU120"
## [41] "HU231" "HU311" "HU323" "HU321" "HU110" "HU331" "NL31" "UKK"
## [49] "UKL" "UKD" "UKG" "UKH" "UKC" "UKF" "NL42" "UKJ"
## [57] "UKM" "CH02" "CH03" "CH05" "CH01" "CH06" "CH04" "UKI"
## [65] "EL64" "EL63" "EL52" "EL61" "DE9" "DED" "DEE" "EL30"
## [73] "EL51" "EL43" "EL65" "EL53" "EL54" "ITI" "HR025" "HR050"
## [81] "NL32" "HR062" "NL33" "NL34" "HR065" "NL13" "HR064" "NL41"
## [89] "FI1B1" "NL21" "HU222" "HU322" "NL22" "NL12" "UKE" "PT11"
## [97] "PT18" "PT17" "ITF" "ITC" "ITG" "ITH" "HR031" "SK023"
## [105] "HR028" "SK010" "SK031" "SK042" "NL11" "SK032" "SK021" "SK041"
## [113] "SK022" "PT16" "PT15" "SI031" "SI041" "SI042" "SI034" "SI043"
## [121] "SI044" "SI032" "SI037"
unique(dfa$region)
## [1] "AT_OTH" "AT11" "AT12" "AT13" "AT21" "AT22" "AT31" "AT32"
## [9] "AT33" "BE_OTH" "BE10" "BE21" "BE22" "BE23" "BE24" "BE25"
## [17] "BE32" "BE33" "BE35" "CH_OTH" "CH01" "CH02" "CH03" "CH04"
## [25] "CH05" "CH06" "DE_OTH" "DE1" "DE2" "DE3" "DE4" "DE6"
## [33] "DE7" "DE9" "DEA" "DEB" "DED" "DEE" "DEF" "DEG"
## [41] "EL30" "EL43" "EL51" "EL52" "EL53" "EL54" "EL61" "EL63"
## [49] "EL64" "EL65" "FI_OTH" "FI1B1" "GB_OTH" "GR_OTH" "HR_OTH" "HR025"
## [57] "HR028" "HR031" "HR050" "HR062" "HR064" "HR065" "HU_OTH" "HU110"
## [65] "HU120" "HU222" "HU231" "HU311" "HU321" "HU322" "HU323" "HU331"
## [73] "ITC" "ITF" "ITG" "ITH" "ITI" "LT_OTH" "NL_OTH" "NL11"
## [81] "NL12" "NL13" "NL21" "NL22" "NL31" "NL32" "NL33" "NL34"
## [89] "NL41" "NL42" "PT11" "PT15" "PT16" "PT17" "PT18" "SI_OTH"
## [97] "SI031" "SI032" "SI034" "SI037" "SI041" "SI042" "SI043" "SI044"
## [105] "SK010" "SK021" "SK022" "SK023" "SK031" "SK032" "SK041" "SK042"
## [113] "UKC" "UKD" "UKE" "UKF" "UKG" "UKH" "UKI" "UKJ"
## [121] "UKK" "UKL" "UKM"
length(unique(df$region))
## [1] 123
length(unique(dfa$region))
## [1] 123
# aggregate() applies a single FUN to every column, so the group sizes
# are computed in a separate call and merged in afterwards
#dfn = aggregate(cbind(n = !is.na(cntry)) ~ cntry, df, sum, na.rm = TRUE)
dfn = aggregate(cbind(n = !is.na(cntry)) ~ cntry+region, df, sum, na.rm = TRUE)
nrow(dfn)
## [1] 123
# merge group sizes with data
#dfa = merge(dfa, dfn, by = c("cntry"))
dfa = merge(dfa, dfn, by = c("cntry","region"))
nrow(dfa)
## [1] 123
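As an alternative to a separate call plus `merge()`, `FUN` may return a named vector, which gives the mean and the group size in one step. This is a sketch on made-up data, not the approach used above. The result holds a matrix column, which `do.call(data.frame, ...)` flattens into ordinary columns.

```r
toy <- data.frame(g = c("a", "a", "b"), x = c(1, 3, 5))

# FUN returns two named values per group
res <- aggregate(x ~ g, toy, function(v) c(mean = mean(v), n = length(v)))

# res$x is a matrix; flatten it into the columns x.mean and x.n
flat <- do.call(data.frame, res)
flat
```

This works best when the same summaries are wanted for every variable. With a mix of means and proportions plus one shared group size, the separate count followed by `merge()` stays easier to read.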
dfa
## cntry region pct_male mean_age pct_good_health mean_education mean_income
## 1 AT AT_OTH 0.4827586 57.44828 0.7586207 4.137931 5.896552
## 2 AT AT11 0.4571429 59.54286 0.6571429 3.542857 5.985714
## 3 AT AT12 0.4456929 59.82397 0.7003745 3.764045 5.393258
## 4 AT AT13 0.3785047 61.11215 0.7009346 5.948598 4.985981
## 5 AT AT21 0.2941176 61.00000 0.7450980 3.588235 4.568627
## 6 AT AT22 0.4878049 54.67683 0.6768293 4.006098 4.628049
## 7 AT AT31 0.4730539 58.86228 0.6886228 4.485030 5.083832
## 8 AT AT32 0.4615385 59.95385 0.6153846 4.523077 5.323077
## 9 AT AT33 0.4365079 61.53968 0.7063492 4.460317 5.301587
## 10 BE BE_OTH 0.5227273 51.70455 0.8636364 5.795455 6.477273
## 11 BE BE10 0.5416667 56.62500 0.7083333 5.416667 5.375000
## 12 BE BE21 0.5956284 52.46448 0.7267760 5.502732 6.415301
## 13 BE BE22 0.5212766 57.44681 0.6914894 4.585106 5.680851
## 14 BE BE23 0.5620438 53.71533 0.7080292 4.708029 6.408759
## 15 BE BE24 0.6086957 55.03261 0.6956522 6.293478 6.804348
## 16 BE BE25 0.5714286 55.38393 0.6696429 5.321429 6.258929
## 17 BE BE32 0.4117647 54.98824 0.6470588 4.800000 5.517647
## 18 BE BE33 0.3734940 52.46988 0.7108434 5.734940 5.710843
## 19 BE BE35 0.5151515 54.60606 0.7272727 5.393939 5.030303
## 20 CH CH_OTH 0.6666667 54.25000 0.8333333 5.166667 5.333333
## 21 CH CH01 0.5894737 55.40000 0.8421053 5.042105 5.642105
## 22 CH CH02 0.5228758 54.84967 0.8039216 4.647059 5.830065
## 23 CH CH03 0.5584416 59.53247 0.8311688 4.519481 5.493506
## 24 CH CH04 0.5116279 59.46512 0.8488372 5.883721 6.081395
## 25 CH CH05 0.6290323 54.61290 0.8709677 4.709677 5.790323
## 26 CH CH06 0.6086957 54.62319 0.8115942 4.478261 5.289855
## 27 DE DE_OTH 0.4318182 52.40909 0.6818182 4.409091 4.500000
## 28 DE DE1 0.4930556 48.90278 0.6527778 4.565972 6.663194
## 29 DE DE2 0.4941520 49.45906 0.6578947 4.312865 6.479532
## 30 DE DE3 0.3918919 49.58108 0.6081081 4.918919 6.094595
## 31 DE DE4 0.6307692 49.27692 0.5230769 4.000000 5.692308
## 32 DE DE6 0.4411765 50.23529 0.8823529 5.205882 7.235294
## 33 DE DE7 0.5424837 48.15033 0.6209150 4.254902 5.921569
## 34 DE DE9 0.5529412 53.34706 0.6058824 4.400000 6.270588
## 35 DE DEA 0.5208333 51.07765 0.6136364 4.329545 6.380682
## 36 DE DEB 0.5063291 45.94937 0.6708861 4.405063 6.113924
## 37 DE DED 0.5625000 56.60417 0.5937500 4.697917 5.947917
## 38 DE DEE 0.5689655 57.27586 0.4482759 3.896552 4.810345
## 39 DE DEF 0.5301205 55.24096 0.6265060 4.530120 6.361446
## 40 DE DEG 0.5416667 53.33333 0.6250000 4.333333 5.437500
## 41 FI FI_OTH 0.6935484 49.24194 0.6370968 4.516129 6.016129
## 42 FI FI1B1 0.5588235 53.58824 0.8235294 4.882353 6.411765
## 43 GB GB_OTH 0.4482759 54.10345 0.6896552 3.551724 4.517241
## 44 GB UKC 0.5681818 49.97727 0.5681818 3.977273 4.568182
## 45 GB UKD 0.5980392 52.60784 0.6764706 4.539216 5.303922
## 46 GB UKE 0.5921053 53.03947 0.6447368 5.197368 4.881579
## 47 GB UKF 0.5148515 52.48515 0.6633663 4.405941 4.356436
## 48 GB UKG 0.5108696 51.56522 0.6847826 5.413043 5.152174
## 49 GB UKH 0.5688073 55.94495 0.7706422 4.532110 5.944954
## 50 GB UKI 0.4683544 51.07595 0.7594937 5.367089 6.050633
## 51 GB UKJ 0.5384615 55.22436 0.6538462 6.685897 6.435897
## 52 GB UKK 0.4150943 55.33962 0.6603774 5.037736 5.330189
## 53 GB UKL 0.5000000 51.96875 0.6875000 5.656250 4.500000
## 54 GB UKM 0.5588235 55.54412 0.6617647 4.514706 5.029412
## 55 GR EL30 0.4365482 52.13706 0.8147208 4.124365 6.200508
## 56 GR EL43 0.5272727 56.59091 0.6727273 3.363636 5.500000
## 57 GR EL51 0.4756098 49.51220 0.8536585 3.939024 5.012195
## 58 GR EL52 0.4714286 50.29714 0.8657143 3.991429 5.100000
## 59 GR EL53 0.4262295 54.60656 0.7704918 3.245902 5.065574
## 60 GR EL54 0.5079365 61.96825 0.7936508 2.920635 5.412698
## 61 GR EL61 0.4800000 46.90667 0.9000000 4.633333 5.066667
## 62 GR EL63 0.3636364 52.86364 0.7272727 3.659091 4.568182
## 63 GR EL64 0.4339623 60.03774 0.5471698 2.735849 4.113208
## 64 GR EL65 0.4363636 48.96364 0.7636364 3.945455 5.272727
## 65 GR GR_OTH 0.6857143 49.51429 0.7428571 4.800000 7.142857
## 66 HR HR_OTH 0.5550239 58.03349 0.5311005 3.607656 4.870813
## 67 HR HR025 0.4603175 56.01587 0.4285714 3.873016 4.809524
## 68 HR HR028 0.3902439 57.53659 0.3902439 3.853659 4.634146
## 69 HR HR031 0.5476190 58.42857 0.6190476 4.476190 5.928571
## 70 HR HR050 0.3888889 56.51852 0.6851852 4.287037 5.574074
## 71 HR HR062 0.4047619 57.78571 0.5000000 3.571429 6.142857
## 72 HR HR064 0.3714286 54.80000 0.6285714 3.714286 5.914286
## 73 HR HR065 0.3404255 53.91489 0.7872340 3.808511 6.000000
## 74 HU HU_OTH 0.4271845 56.07282 0.6165049 3.626214 4.932039
## 75 HU HU110 0.3056995 49.90155 0.7823834 4.186528 7.419689
## 76 HU HU120 0.2388060 49.95522 0.8358209 3.813433 7.320896
## 77 HU HU222 0.4358974 59.66667 0.6666667 3.538462 6.564103
## 78 HU HU231 0.3877551 53.69388 0.5918367 3.428571 4.326531
## 79 HU HU311 0.3916667 54.10000 0.5000000 3.291667 3.975000
## 80 HU HU321 0.4642857 50.35714 0.6964286 3.642857 4.392857
## 81 HU HU322 0.3888889 51.69444 0.7500000 3.750000 3.666667
## 82 HU HU323 0.4844720 51.56522 0.5403727 3.366460 4.850932
## 83 HU HU331 0.4523810 54.78571 0.6428571 3.595238 5.714286
## 84 IT ITC 0.4605010 52.40848 0.7398844 3.803468 5.190751
## 85 IT ITF 0.4699739 54.02089 0.6240209 3.151436 4.146214
## 86 IT ITG 0.4076433 53.35032 0.6815287 3.324841 4.980892
## 87 IT ITH 0.5295858 52.36686 0.7603550 3.857988 5.997041
## 88 IT ITI 0.4887640 52.04213 0.6601124 3.500000 5.887640
## 89 LT LT_OTH 0.3333333 49.33333 1.0000000 6.333333 6.333333
## 90 NL NL_OTH 0.3684211 54.42105 0.6315789 4.526316 5.578947
## 91 NL NL11 0.5128205 54.41026 0.7948718 4.743590 5.769231
## 92 NL NL12 0.5208333 57.62500 0.8125000 4.833333 5.750000
## 93 NL NL13 0.6086957 56.21739 0.7826087 4.695652 6.586957
## 94 NL NL21 0.5049505 51.51485 0.7920792 4.346535 6.356436
## 95 NL NL22 0.5695364 53.62252 0.8079470 4.960265 6.370861
## 96 NL NL31 0.4807692 50.64423 0.7884615 4.865385 6.567308
## 97 NL NL32 0.5789474 54.35789 0.7631579 5.078947 6.968421
## 98 NL NL33 0.4802260 53.97740 0.7401130 4.768362 6.468927
## 99 NL NL34 0.5428571 54.71429 0.7428571 3.971429 6.714286
## 100 NL NL41 0.4569536 51.95364 0.7152318 4.927152 6.649007
## 101 NL NL42 0.5492958 53.22535 0.6901408 4.478873 6.309859
## 102 PT PT11 0.4562842 52.17486 0.5573770 3.262295 4.937158
## 103 PT PT15 0.5483871 51.54839 0.5806452 3.000000 4.580645
## 104 PT PT16 0.3768116 57.25725 0.4601449 3.021739 3.800725
## 105 PT PT17 0.4457364 53.05039 0.6550388 4.120155 5.465116
## 106 PT PT18 0.3906250 59.78125 0.4218750 2.531250 3.609375
## 107 SI SI_OTH 0.5294118 51.90196 0.8235294 4.235294 5.000000
## 108 SI SI031 0.5833333 54.61111 0.5000000 3.805556 5.166667
## 109 SI SI032 0.4854369 54.53398 0.6601942 4.330097 5.213592
## 110 SI SI034 0.4444444 49.74074 0.7283951 4.518519 6.000000
## 111 SI SI037 0.5652174 50.73913 0.6739130 4.195652 6.173913
## 112 SI SI041 0.5031056 53.77640 0.7204969 4.850932 6.192547
## 113 SI SI042 0.6491228 50.73684 0.7368421 4.210526 6.105263
## 114 SI SI043 0.5750000 54.37500 0.6500000 4.050000 4.925000
## 115 SI SI044 0.3947368 60.39474 0.5000000 3.552632 4.947368
## 116 SK SK010 0.4565217 58.97826 0.4565217 4.217391 7.108696
## 117 SK SK021 0.4428571 58.57143 0.4285714 3.942857 5.414286
## 118 SK SK022 0.4728682 55.74419 0.5038760 4.317829 5.286822
## 119 SK SK023 0.5600000 53.36667 0.6933333 3.966667 5.586667
## 120 SK SK031 0.4405594 57.73427 0.4545455 4.335664 5.314685
## 121 SK SK032 0.4065934 56.98901 0.3846154 3.626374 4.307692
## 122 SK SK041 0.4029851 54.53731 0.5671642 4.283582 5.388060
## 123 SK SK042 0.5000000 57.33333 0.5476190 4.011905 5.559524
## mean_cesd8 mean_bmi pct_rwpop n
## 1 1.530172 25.65850 0.17241379 29
## 2 1.571429 25.47077 0.14285714 70
## 3 1.562266 25.70209 0.11610487 267
## 4 1.568341 25.56164 0.10747664 214
## 5 1.607843 26.15031 0.15686275 51
## 6 1.529726 26.50661 0.18292683 164
## 7 1.690120 25.99666 0.19760479 167
## 8 1.665385 25.18255 0.15384615 65
## 9 1.681548 25.45578 0.16666667 126
## 10 1.480114 24.63566 0.00000000 44
## 11 1.789062 24.07798 0.00000000 48
## 12 1.599044 26.12873 0.13661202 183
## 13 1.595745 26.12325 0.18085106 94
## 14 1.562956 26.02955 0.10218978 137
## 15 1.600543 25.68808 0.07608696 92
## 16 1.575893 26.44825 0.13392857 112
## 17 1.794118 26.98694 0.00000000 85
## 18 1.722892 24.33762 0.00000000 83
## 19 1.768939 24.75450 0.03030303 33
## 20 1.614583 24.26184 0.16666667 12
## 21 1.515789 25.20338 0.09473684 95
## 22 1.526961 25.20670 0.26143791 153
## 23 1.521104 25.34956 0.24675325 77
## 24 1.578488 24.98325 0.15116279 86
## 25 1.469758 24.79281 0.37096774 62
## 26 1.465580 25.77022 0.17391304 69
## 27 1.789773 24.93724 0.11363636 44
## 28 1.661458 25.63838 0.02430556 288
## 29 1.637061 25.85977 0.02631579 342
## 30 1.746622 25.44850 0.04054054 74
## 31 1.788462 27.23272 0.10769231 65
## 32 1.452206 25.95352 0.00000000 34
## 33 1.755719 26.60990 0.01960784 153
## 34 1.687500 26.71238 0.04705882 170
## 35 1.680161 26.28106 0.05303030 528
## 36 1.759494 25.32294 0.00000000 79
## 37 1.683594 26.27621 0.08333333 96
## 38 1.801724 27.77265 0.06896552 58
## 39 1.656627 26.20553 0.02409639 83
## 40 1.679688 26.99110 0.10416667 48
## 41 1.571573 27.72343 1.00000000 124
## 42 1.602941 27.15768 1.00000000 34
## 43 1.706897 27.00836 0.00000000 29
## 44 2.008523 27.16156 0.00000000 44
## 45 1.685049 25.86093 0.00000000 102
## 46 1.588816 27.34275 0.00000000 76
## 47 1.730198 26.28055 0.01980198 101
## 48 1.748641 26.25703 0.01086957 92
## 49 1.663991 25.89544 0.00000000 109
## 50 1.634494 25.59192 0.00000000 79
## 51 1.698718 25.91027 0.01282051 156
## 52 1.700472 25.82941 0.00000000 106
## 53 1.792969 27.04563 0.00000000 32
## 54 1.696691 26.59329 0.01470588 68
## 55 1.945749 26.03364 0.00000000 394
## 56 1.879545 27.45669 0.00000000 110
## 57 1.766768 25.14961 0.00000000 82
## 58 1.720000 25.14670 0.00000000 350
## 59 1.928279 25.76392 0.00000000 61
## 60 2.043651 26.70398 0.00000000 63
## 61 2.098333 26.59869 0.00000000 150
## 62 1.963068 26.89429 0.00000000 44
## 63 1.889151 27.46300 0.00000000 53
## 64 1.970455 26.41764 0.00000000 55
## 65 1.939286 26.09709 0.00000000 35
## 66 1.711722 27.58606 0.08133971 209
## 67 1.777778 27.06480 0.09523810 63
## 68 1.789634 27.93299 0.02439024 41
## 69 1.565476 27.47418 0.00000000 42
## 70 1.515046 25.95715 0.03703704 108
## 71 1.970238 27.80422 0.00000000 42
## 72 1.664286 26.54907 0.02857143 35
## 73 1.534574 25.44099 0.06382979 47
## 74 1.805825 26.62052 0.11165049 206
## 75 1.672927 25.80950 0.15025907 193
## 76 1.669776 26.32125 0.04477612 134
## 77 1.663462 27.33137 0.05128205 39
## 78 1.681122 27.04244 0.02040816 49
## 79 1.968750 25.80854 0.03333333 120
## 80 1.801339 26.14935 0.07142857 56
## 81 2.059028 25.81821 0.08333333 36
## 82 1.909161 27.29849 0.03726708 161
## 83 2.014881 27.12768 0.02380952 42
## 84 1.697495 24.25050 0.00000000 519
## 85 1.933094 25.82954 0.00000000 383
## 86 1.771497 23.94052 0.00000000 157
## 87 1.689719 24.64319 0.00000000 338
## 88 1.749298 24.95986 0.00000000 356
## 89 1.750000 23.20759 1.00000000 3
## 90 1.473684 24.44203 0.10526316 19
## 91 1.493590 25.75522 0.02564103 39
## 92 1.479167 26.13090 0.06250000 48
## 93 1.510870 26.79257 0.02173913 46
## 94 1.558168 25.39421 0.06930693 101
## 95 1.488411 25.42422 0.06622517 151
## 96 1.627404 24.84196 0.01923077 104
## 97 1.561184 25.26850 0.01052632 190
## 98 1.473164 25.67125 0.05649718 177
## 99 1.446429 25.68057 0.11428571 35
## 100 1.603477 25.65819 0.03973510 151
## 101 1.501761 25.69814 0.08450704 71
## 102 1.899249 26.13351 0.00000000 366
## 103 1.633065 25.04001 0.00000000 31
## 104 1.938859 26.65494 0.00000000 276
## 105 1.764050 26.05313 0.00000000 258
## 106 1.929688 26.18804 0.00000000 64
## 107 1.600490 26.43409 0.23529412 51
## 108 1.572917 26.10682 0.19444444 36
## 109 1.648058 26.45109 0.22330097 103
## 110 1.577160 26.66520 0.30864198 81
## 111 1.573370 27.56286 0.23913043 46
## 112 1.561335 25.50544 0.18012422 161
## 113 1.429825 27.44512 0.29824561 57
## 114 1.515625 26.09102 0.12500000 40
## 115 1.585526 26.85990 0.10526316 38
## 116 2.005435 27.88829 0.00000000 46
## 117 1.994643 27.87327 0.08571429 70
## 118 1.696705 27.29013 0.03875969 129
## 119 1.610000 25.69103 0.04000000 150
## 120 1.698427 26.42991 0.05594406 143
## 121 1.883242 26.47323 0.14285714 91
## 122 1.779851 26.77478 0.02985075 67
## 123 1.702381 27.90144 0.02380952 84
unique(dfa$region)
## [1] "AT_OTH" "AT11" "AT12" "AT13" "AT21" "AT22" "AT31" "AT32"
## [9] "AT33" "BE_OTH" "BE10" "BE21" "BE22" "BE23" "BE24" "BE25"
## [17] "BE32" "BE33" "BE35" "CH_OTH" "CH01" "CH02" "CH03" "CH04"
## [25] "CH05" "CH06" "DE_OTH" "DE1" "DE2" "DE3" "DE4" "DE6"
## [33] "DE7" "DE9" "DEA" "DEB" "DED" "DEE" "DEF" "DEG"
## [41] "FI_OTH" "FI1B1" "GB_OTH" "UKC" "UKD" "UKE" "UKF" "UKG"
## [49] "UKH" "UKI" "UKJ" "UKK" "UKL" "UKM" "EL30" "EL43"
## [57] "EL51" "EL52" "EL53" "EL54" "EL61" "EL63" "EL64" "EL65"
## [65] "GR_OTH" "HR_OTH" "HR025" "HR028" "HR031" "HR050" "HR062" "HR064"
## [73] "HR065" "HU_OTH" "HU110" "HU120" "HU222" "HU231" "HU311" "HU321"
## [81] "HU322" "HU323" "HU331" "ITC" "ITF" "ITG" "ITH" "ITI"
## [89] "LT_OTH" "NL_OTH" "NL11" "NL12" "NL13" "NL21" "NL22" "NL31"
## [97] "NL32" "NL33" "NL34" "NL41" "NL42" "PT11" "PT15" "PT16"
## [105] "PT17" "PT18" "SI_OTH" "SI031" "SI032" "SI034" "SI037" "SI041"
## [113] "SI042" "SI043" "SI044" "SK010" "SK021" "SK022" "SK023" "SK031"
## [121] "SK032" "SK041" "SK042"
unique(dfreg$region)
## [1] "AT11" "AT12" "AT13" "AT21" "AT22" "AT31" "AT32" "AT33" "AT34"
## [10] "BE10" "BE21" "BE22" "BE23" "BE24" "BE25" "BE31" "BE32" "BE33"
## [19] "BE34" "BE35" "BG311" "BG312" "BG313" "BG314" "BG315" "BG321" "BG322"
## [28] "BG323" "BG324" "BG325" "BG331" "BG332" "BG333" "BG334" "BG341" "BG342"
## [37] "BG343" "BG344" "BG411" "BG412" "BG413" "BG414" "BG415" "BG421" "BG422"
## [46] "BG423" "BG424" "BG425" "CY0" "DE1" "DE2" "DE3" "DE4" "DE6"
## [55] "DE7" "DE8" "DE9" "DEA" "DEB" "DEC" "DED" "DEE" "DEF"
## [64] "DEG" "EL30" "EL41" "EL42" "EL43" "EL51" "EL52" "EL53" "EL54"
## [73] "EL61" "EL62" "EL63" "EL64" "EL65" "ES11" "ES12" "ES13" "ES21"
## [82] "ES22" "ES23" "ES24" "ES30" "ES41" "ES42" "ES43" "ES51" "ES52"
## [91] "ES53" "ES61" "ES62" "ES64" "ES70" "FI196" "FI1B1" "FI1C1" "FI1C2"
## [100] "FI1C5" "FI1D5" "FI1D7" "FI200" "FR10" "FRB0" "FRC1" "FRC2" "FRD1"
## [109] "FRD2" "FRE1" "FRE2" "FRF1" "FRF2" "FRF3" "FRG0" "FRH0" "FRI1"
## [118] "FRI2" "FRI3" "FRJ1" "FRJ2" "FRK1" "FRK2" "FRL0" "HR021" "HR022"
## [127] "HR023" "HR024" "HR025" "HR026" "HR027" "HR028" "HR031" "HR032" "HR033"
## [136] "HR034" "HR035" "HR036" "HR037" "HU110" "HU120" "HU211" "HU212" "HU213"
## [145] "HU221" "HU222" "HU223" "HU231" "HU232" "HU233" "HU311" "HU312" "HU313"
## [154] "HU321" "HU322" "HU323" "HU331" "HU332" "HU333" "IE041" "IE042" "IE051"
## [163] "IE052" "IE053" "IE061" "IE062" "IE063" "ITC" "ITF" "ITG" "ITH"
## [172] "ITI" "LT011" "LT021" "LT022" "LT023" "LT024" "LT025" "LT026" "LT027"
## [181] "LT028" "LT029" "ME0" "NL11" "NL12" "NL13" "NL21" "NL22" "NL23"
# finally merge with regional data
ncol(dfa)
## [1] 11
nrow(dfa)
## [1] 123
ncol(dfreg)
## [1] 23
nrow(dfreg)
## [1] 189
# recode small regions before merge (see above)
tmp = merge(dfreg, region_n[, c("region", "newRegion")],
by = c("region"), all.x = TRUE, sort = FALSE)
unique(tmp$region)
## [1] "AT34" "BE31" "BE34" "DE8" "DEC" "EL41" "EL42" "EL62" "FI196"
## [10] "FI1C1" "FI1C2" "FI1C5" "FI1D5" "FI1D7" "HR021" "HR022" "HR023" "HR024"
## [19] "HR026" "HR027" "HR032" "HR033" "HR034" "HR035" "HR036" "HR037" "HU211"
## [28] "HU212" "HU213" "HU221" "HU223" "HU232" "HU312" "HU313" "HU332" "HU333"
## [37] "LT011" "LT022" "LT023" "NL23" "AT11" "AT12" "AT13" "AT21" "AT22"
## [46] "AT31" "AT32" "AT33" "CY0" "BE10" "BE21" "BE22" "BE23" "BE24"
## [55] "BE25" "BG324" "BE32" "BE33" "DEB" "BE35" "BG311" "BG312" "BG313"
## [64] "BG314" "BG315" "BG321" "BG322" "BG323" "BG414" "BG325" "BG331" "BG332"
## [73] "BG333" "BG334" "BG341" "BG342" "BG343" "BG344" "BG411" "BG412" "BG413"
## [82] "DE7" "BG415" "BG421" "BG422" "BG423" "BG424" "BG425" "ES51" "DE1"
## [91] "DE2" "DE3" "DE4" "DE6" "EL43" "EL51" "DE9" "DEA" "LT026"
## [100] "HU110" "DED" "DEE" "DEF" "DEG" "EL30" "FRC1" "FRC2" "ES21"
## [109] "ES22" "EL52" "EL53" "EL54" "EL61" "FRF3" "EL63" "EL64" "EL65"
## [118] "ES11" "ES12" "ES13" "ES64" "ES70" "ES23" "ES24" "ES30" "ES41"
## [127] "ES42" "ES43" "HR025" "ES52" "ES53" "ES61" "ES62" "NL21" "FRD1"
## [136] "FRD2" "FI1B1" "LT025" "FRF1" "LT027" "HU120" "LT029" "FI200" "FR10"
## [145] "FRB0" "HU222" "FRJ1" "FRJ2" "FRK1" "FRE1" "FRE2" "IE061" "FRF2"
## [154] "HU321" "FRG0" "FRH0" "FRI1" "FRI2" "FRI3" "HR031" "LT021" "HU231"
## [163] "FRK2" "FRL0" "HU311" "IE062" "IE063" "ITC" "ITF" "ITG" "ITH"
## [172] "HR028" "NL13" "IE041" "IE042" "IE051" "LT024" "IE053" "ME0" "NL11"
## [181] "LT028" "HU322" "IE052" "HU323" "NL12" "ITI" "NL22" "HU233" "HU331"
length(unique(tmp$region))
## [1] 189
# replace region where a newRegion is available
tmp$region = ifelse(!is.na(tmp$newRegion), tmp$newRegion, tmp$region)
unique(tmp$region)
## [1] "AT_OTH" "BE_OTH" "DE_OTH" "GR_OTH" "FI_OTH" "HR_OTH" "HU_OTH" "LT_OTH"
## [9] "NL_OTH" "AT11" "AT12" "AT13" "AT21" "AT22" "AT31" "AT32"
## [17] "AT33" "CY0" "BE10" "BE21" "BE22" "BE23" "BE24" "BE25"
## [25] "BG324" "BE32" "BE33" "DEB" "BE35" "BG311" "BG312" "BG313"
## [33] "BG314" "BG315" "BG321" "BG322" "BG323" "BG414" "BG325" "BG331"
## [41] "BG332" "BG333" "BG334" "BG341" "BG342" "BG343" "BG344" "BG411"
## [49] "BG412" "BG413" "DE7" "BG415" "BG421" "BG422" "BG423" "BG424"
## [57] "BG425" "ES51" "DE1" "DE2" "DE3" "DE4" "DE6" "EL43"
## [65] "EL51" "DE9" "DEA" "LT026" "HU110" "DED" "DEE" "DEF"
## [73] "DEG" "EL30" "FRC1" "FRC2" "ES21" "ES22" "EL52" "EL53"
## [81] "EL54" "EL61" "FRF3" "EL63" "EL64" "EL65" "ES11" "ES12"
## [89] "ES13" "ES64" "ES70" "ES23" "ES24" "ES30" "ES41" "ES42"
## [97] "ES43" "HR025" "ES52" "ES53" "ES61" "ES62" "NL21" "FRD1"
## [105] "FRD2" "FI1B1" "LT025" "FRF1" "LT027" "HU120" "LT029" "FI200"
## [113] "FR10" "FRB0" "HU222" "FRJ1" "FRJ2" "FRK1" "FRE1" "FRE2"
## [121] "IE061" "FRF2" "HU321" "FRG0" "FRH0" "FRI1" "FRI2" "FRI3"
## [129] "HR031" "LT021" "HU231" "FRK2" "FRL0" "HU311" "IE062" "IE063"
## [137] "ITC" "ITF" "ITG" "ITH" "HR028" "NL13" "IE041" "IE042"
## [145] "IE051" "LT024" "IE053" "ME0" "NL11" "LT028" "HU322" "IE052"
## [153] "HU323" "NL12" "ITI" "NL22" "HU233" "HU331"
length(unique(tmp$region))
## [1] 158
# drop helper column (optional)
tmp$newRegion = NULL
dfreg = tmp
unique(dfreg$region)
## [1] "AT_OTH" "BE_OTH" "DE_OTH" "GR_OTH" "FI_OTH" "HR_OTH" "HU_OTH" "LT_OTH"
## [9] "NL_OTH" "AT11" "AT12" "AT13" "AT21" "AT22" "AT31" "AT32"
## [17] "AT33" "CY0" "BE10" "BE21" "BE22" "BE23" "BE24" "BE25"
## [25] "BG324" "BE32" "BE33" "DEB" "BE35" "BG311" "BG312" "BG313"
## [33] "BG314" "BG315" "BG321" "BG322" "BG323" "BG414" "BG325" "BG331"
## [41] "BG332" "BG333" "BG334" "BG341" "BG342" "BG343" "BG344" "BG411"
## [49] "BG412" "BG413" "DE7" "BG415" "BG421" "BG422" "BG423" "BG424"
## [57] "BG425" "ES51" "DE1" "DE2" "DE3" "DE4" "DE6" "EL43"
## [65] "EL51" "DE9" "DEA" "LT026" "HU110" "DED" "DEE" "DEF"
## [73] "DEG" "EL30" "FRC1" "FRC2" "ES21" "ES22" "EL52" "EL53"
## [81] "EL54" "EL61" "FRF3" "EL63" "EL64" "EL65" "ES11" "ES12"
## [89] "ES13" "ES64" "ES70" "ES23" "ES24" "ES30" "ES41" "ES42"
## [97] "ES43" "HR025" "ES52" "ES53" "ES61" "ES62" "NL21" "FRD1"
## [105] "FRD2" "FI1B1" "LT025" "FRF1" "LT027" "HU120" "LT029" "FI200"
## [113] "FR10" "FRB0" "HU222" "FRJ1" "FRJ2" "FRK1" "FRE1" "FRE2"
## [121] "IE061" "FRF2" "HU321" "FRG0" "FRH0" "FRI1" "FRI2" "FRI3"
## [129] "HR031" "LT021" "HU231" "FRK2" "FRL0" "HU311" "IE062" "IE063"
## [137] "ITC" "ITF" "ITG" "ITH" "HR028" "NL13" "IE041" "IE042"
## [145] "IE051" "LT024" "IE053" "ME0" "NL11" "LT028" "HU322" "IE052"
## [153] "HU323" "NL12" "ITI" "NL22" "HU233" "HU331"
dfa = merge(dfa, dfreg, by = "region")
ncol(dfa)
## [1] 33
nrow(dfa)
## [1] 103
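The row count of 103 reflects how `merge()` treats repeated keys: after recoding, `dfreg` still has 189 rows but only 158 distinct `region` labels, so every `_OTH` row in `dfa` is paired with each matching row in `dfreg`. The sketch below reproduces this behavior on toy data and shows one possible remedy, collapsing the regional indicators to one row per recoded label before merging. The toy data frames, the `gdp` column, and the use of a simple mean are illustrative assumptions, not part of the original analysis.

```r
# Toy survey aggregates: one row per (recoded) region
survey <- data.frame(region = c("LT_OTH", "AT11"), n = c(3, 70))

# Toy regional data after recoding: "LT_OTH" occurs three times
regional <- data.frame(
  region = c("LT_OTH", "LT_OTH", "LT_OTH", "AT11"),
  gdp    = c(10, 20, 30, 40)   # hypothetical indicator
)

# merge() returns one row per matching key pair, so LT_OTH is tripled
dup <- merge(survey, regional, by = "region")
nrow(dup)

# Possible remedy: collapse the regional data to one row per label first
regional_agg <- aggregate(gdp ~ region, data = regional, FUN = mean)
clean <- merge(survey, regional_agg, by = "region")
nrow(clean)
```

For extensive quantities such as population size or total GDP, a sum would likely be more appropriate than an unweighted mean, and for rates a population-weighted mean.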
# Last clean up
# remove still existing very small regions
dfa[dfa$n < 30, c("region", "n")]
## region n
## 1 AT_OTH 29
## 95 LT_OTH 3
## 96 LT_OTH 3
## 97 LT_OTH 3
## 98 NL_OTH 19
dfa = dfa[dfa$n >= 28, ]  # cutoff is 28, not 30: keeps AT_OTH (n = 29), drops LT_OTH and NL_OTH
nrow(dfa)
## [1] 99
##################################################
##### SEMINAR PAPER STUDENT INPUT ################
##################################################
# exclude regions with 0 or 100% right-wing populist vote
dfa <- dfa[dfa$pct_rwpop > 0 & dfa$pct_rwpop < 1, ]
#####end of student input
nrow(dfa)
## [1] 66
Prior to the analysis, regions in which the proportion of right-wing populist voters was 0% or 100% were excluded. Such extreme values could distort the estimates and compromise the reliability of statistical analyses, including correlation and regression models.
### Analysis
cor(dfa[,3:(ncol(dfa)-1)], dfa$pct_rwpop, use = "complete.obs") # correlation of each numeric column with pct_rwpop
## [,1]
## pct_male -0.14233860
## mean_age 0.39581613
## pct_good_health 0.17119316
## mean_education 0.01898300
## mean_income -0.14752417
## mean_cesd8 -0.17813497
## mean_bmi -0.20696289
## pct_rwpop 1.00000000
## n 0.01401506
## reg11_area_2023 -0.21491942
## reg11_tpopsz_2023 -0.29314018
## reg11_fpopsz_2023 -0.29287131
## reg11_mpopsz_2023 -0.29340672
## reg11_pode_2023 0.02869454
## reg11_lbirth_2023 -0.29797517
## reg11_death_2023 -0.29643683
## reg11_natgrow_2023 0.26166004
## reg11_cnmigrat_2023 -0.28420719
## reg11_grow_2023 -0.25274052
## reg11_gbirthrt_2023 -0.13695615
## reg11_gdeathrt_2023 -0.22940311
## reg11_natgrowrt_2023 0.14367709
## reg11_cnmigratrt_2023 0.08277691
## reg11_growrt_2023 0.12868925
## reg11_gdp_eurhab_2023 0.18434332
## reg11_gdp_mio_eur_2023 -0.28938147
## reg11_gdp_eurhab_eu27_2020_2023 0.18551842
## reg11_gdp_mio_nac_2023 0.08874371
## reg11_gdp_mio_pps_eu27_2020_2023 -0.28718035
## reg11_gdp_pps_eu27_2020_hab_2023 0.21739747
# start with standard model (only structural data)
model = lm(pct_rwpop ~ pct_male + mean_age , dfa)
summary(model)
##
## Call:
## lm(formula = pct_rwpop ~ pct_male + mean_age, data = dfa)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.084537 -0.023343 -0.004767 0.014157 0.098654
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.172536 0.090454 -1.907 0.061026 .
## pct_male -0.091189 0.064193 -1.421 0.160380
## mean_age 0.005510 0.001556 3.540 0.000757 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.04107 on 63 degrees of freedom
## Multiple R-squared: 0.1828, Adjusted R-squared: 0.1569
## F-statistic: 7.048 on 2 and 63 DF, p-value: 0.001728
# positive and highly significant age effect
# negative but non-significant gender effect (p = 0.16)
First, pairwise correlations were calculated between the share of right-wing populist voters and all explanatory variables. This step serves to identify initial associations and to detect potential factors influencing voting behavior.
# Step 1: Bivariate correlations with the dependent variable
biv_cor <- cor(
dfa[, c("mean_age", "pct_male", "mean_education", "mean_income",
"pct_good_health", "mean_cesd8", "mean_bmi")],
dfa$pct_rwpop,
use = "complete.obs"
)
round(biv_cor, 3)
## [,1]
## mean_age 0.396
## pct_male -0.142
## mean_education 0.019
## mean_income -0.148
## pct_good_health 0.171
## mean_cesd8 -0.178
## mean_bmi -0.207
The bivariate correlations show a moderate positive association between regional mean age and the share of right-wing populist voters (r = 0.40). Weaker associations are observed for BMI (r = -0.21), depression scores (CES-D8; r = -0.18), and the proportion of individuals reporting good health (r = 0.17); education is essentially uncorrelated with the outcome (r = 0.02). Education and income are moderately correlated with each other (r = 0.51); therefore, only one of these variables is included in subsequent analyses to limit redundancy and multicollinearity.
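To judge whether a correlation of this size is distinguishable from zero, the coefficient can be converted into a t statistic, t = r · sqrt(n − 2) / sqrt(1 − r²). The sketch below does this for the age correlation, assuming that all 66 regions remaining after the exclusion step enter the calculation. With the data at hand, `cor.test(dfa$mean_age, dfa$pct_rwpop)` would report the same test directly.

```r
r <- 0.396   # correlation of mean_age with pct_rwpop (from the output above)
n <- 66      # assumed number of complete observations

# Test statistic and two-sided p-value under H0: rho = 0
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_val  <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

round(c(t = t_stat, p = p_val), 4)
```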
# Step 1b: Bivariate scatterplots
plot(dfa$mean_age, dfa$pct_rwpop,
xlab = "Mean age",
ylab = "RW populist vote share",
main = "Age and RW vote",
pch = 16)
abline(lm(pct_rwpop ~ mean_age, data = dfa), col = "red")
plot(dfa$pct_male, dfa$pct_rwpop,
xlab = "Share of men",
ylab = "RW populist vote share",
main = "Gender and RW vote",
pch = 16)
abline(lm(pct_rwpop ~ pct_male, data = dfa), col = "red")
##### Including Plots
# Graphical representation of the relationships
my_col <- rgb(0, 0, 1, 0.4)  # semi-transparent blue so overlapping points stay visible
# 1. BMI and Right-Wing Populist Vote
plot(dfa$mean_bmi, dfa$pct_rwpop,
xlab = "Mean BMI", ylab = "Share of RW Populist Voters",
main = "BMI and Right-Wing Populist Vote",
pch = 16, col = my_col)
abline(lm(pct_rwpop ~ mean_bmi, data = dfa), col = "red", lwd = 2)
# 2. Depression and Right-Wing Populist Vote
plot(dfa$mean_cesd8, dfa$pct_rwpop,
xlab = "Mean Depression Score (CES-D8)", ylab = "Share of RW Populist Voters",
main = "Depression and Right-Wing Populist Vote",
pch = 16, col = my_col)
abline(lm(pct_rwpop ~ mean_cesd8, data = dfa), col = "red", lwd = 2)
# 3. Good Health and Right-Wing Populist Vote
plot(dfa$pct_good_health, dfa$pct_rwpop,
xlab = "Share of Good Health", ylab = "Share of RW Populist Voters",
main = "Good Health and Right-Wing Populist Vote",
pch = 16, col = my_col)
abline(lm(pct_rwpop ~ pct_good_health, data = dfa), col = "red", lwd = 2)
# 4. Education and Right-Wing Populist Vote
plot(dfa$mean_education, dfa$pct_rwpop,
xlab = "Mean Education Level", ylab = "Share of RW Populist Voters",
main = "Education and Right-Wing Populist Vote",
pch = 16, col = my_col)
abline(lm(pct_rwpop ~ mean_education, data = dfa), col = "red", lwd = 2)
# 5. Income and Right-Wing Populist Vote
plot(dfa$mean_income, dfa$pct_rwpop,
xlab = "Mean Income (Deciles)", ylab = "Share of RW Populist Voters",
main = "Income and Right-Wing Populist Vote",
pch = 16, col = my_col)
abline(lm(pct_rwpop ~ mean_income, data = dfa), col = "red", lwd = 2)
# 6. Age and Right-Wing Populist Vote
plot(dfa$mean_age, dfa$pct_rwpop,
xlab = "Mean Age", ylab = "Share of RW Populist Voters",
main = "Age and Right-Wing Populist Vote",
pch = 16, col = my_col)
abline(lm(pct_rwpop ~ mean_age, data = dfa), col = "red", lwd = 2)
# 7. Gender (Male Share) and Right-Wing Populist Vote
plot(dfa$pct_male, dfa$pct_rwpop,
xlab = "Share of Men", ylab = "Share of RW Populist Voters",
main = "Gender and Right-Wing Populist Vote",
pch = 16, col = my_col)
abline(lm(pct_rwpop ~ pct_male, data = dfa), col = "red", lwd = 2)
The scatterplots visualize the bivariate relationships and help identify potential outliers. For example, the plot for mean age shows a clear positive trend, whereas the association with the proportion of men appears much weaker. The plots provide an initial exploratory basis for the subsequent multivariate analysis.
# Step 2: Correlations among the predictors
pred_vars <- dfa[, c("mean_age", "pct_male", "mean_education",
"mean_income", "pct_good_health", "mean_cesd8")]
cor_matrix <- cor(pred_vars, use = "complete.obs")
round(cor_matrix, 2)
## mean_age pct_male mean_education mean_income pct_good_health
## mean_age 1.00 0.05 -0.16 -0.35 -0.22
## pct_male 0.05 1.00 0.26 0.14 -0.24
## mean_education -0.16 0.26 1.00 0.51 0.49
## mean_income -0.35 0.14 0.51 1.00 0.50
## pct_good_health -0.22 -0.24 0.49 0.50 1.00
## mean_cesd8 -0.18 -0.29 -0.51 -0.55 -0.48
## mean_cesd8
## mean_age -0.18
## pct_male -0.29
## mean_education -0.51
## mean_income -0.55
## pct_good_health -0.48
## mean_cesd8 1.00
The correlation matrix shows the relationships among the explanatory variables. Moderate correlations, such as between education and income (r = 0.51) or between income and depression scores (r = -0.55), indicate partial redundancy. To limit multicollinearity, only education was retained as an indicator of socioeconomic status in the regression model.
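Pairwise correlations only reveal collinearity between two predictors at a time. A variance inflation factor (VIF) summarizes how well each predictor is explained by all the others, VIF_j = 1 / (1 − R_j²), and it equals the j-th diagonal element of the inverse of the predictor correlation matrix. The sketch below illustrates the calculation on R's built-in `mtcars` data as a stand-in; applying the same idea to `cor_matrix` is an assumption about how one might proceed, not a step taken in the original analysis.

```r
# Stand-in predictors from a built-in dataset (not the ESS data)
X <- mtcars[, c("wt", "hp", "disp")]

# VIFs from the inverse of the predictor correlation matrix
vif_inv <- diag(solve(cor(X)))

# Same quantity from the definition: regress each predictor on the others
vif_def <- sapply(names(X), function(v) {
  f  <- reformulate(setdiff(names(X), v), response = v)
  r2 <- summary(lm(f, data = X))$r.squared
  1 / (1 - r2)
})

round(cbind(vif_inv, vif_def), 2)
```

Values near 1 indicate little redundancy; common rules of thumb treat values above roughly 5 to 10 as problematic.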
# Step 3: Hierarchical (blockwise) regression analysis
# Baseline model: Age + Gender
m1 <- lm(
pct_rwpop ~ scale(mean_age) + scale(pct_male),
data = dfa,
weights = n
)
summary(m1)
##
## Call:
## lm(formula = pct_rwpop ~ scale(mean_age) + scale(pct_male), data = dfa,
## weights = n)
##
## Weighted Residuals:
## Min 1Q Median 3Q Max
## -0.68414 -0.23467 -0.06358 0.13154 1.25331
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.090085 0.004277 21.063 < 2e-16 ***
## scale(mean_age) 0.019856 0.004148 4.787 1.06e-05 ***
## scale(pct_male) -0.011754 0.004566 -2.574 0.0124 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4201 on 63 degrees of freedom
## Multiple R-squared: 0.3059, Adjusted R-squared: 0.2839
## F-statistic: 13.88 on 2 and 63 DF, p-value: 1.01e-05
Result: Regional mean age shows a positive and statistically significant association with the share of right-wing populist voters (p < 0.001). The proportion of men shows a negative and statistically significant association (p = 0.012). This baseline model serves as a reference for comparison with the extended model.
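Passing `weights = n` makes `lm()` minimize the weighted residual sum of squares, sum(n_i · e_i²), so regions with more respondents pull the fit more strongly. This is also why the weighted estimates differ from those of the unweighted model shown earlier. A minimal sketch on simulated data (all values hypothetical) checks the `lm()` result against the closed-form weighted least squares solution (XᵀWX)⁻¹XᵀWy:

```r
set.seed(1)
toy <- data.frame(x = rnorm(20), n = sample(30:300, 20))  # hypothetical regions
toy$y <- 0.1 + 0.02 * toy$x + rnorm(20, sd = 0.05)

# Weighted fit as in the models above
fit <- lm(y ~ x, data = toy, weights = n)

# Closed-form weighted least squares with W = diag(n)
X <- cbind(1, toy$x)
W <- diag(toy$n)
beta <- solve(t(X) %*% W %*% X, t(X) %*% W %*% toy$y)

cbind(lm = coef(fit), manual = as.vector(beta))
```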
# Final model: Age + Gender + Education + Health
m2 <- lm(
pct_rwpop ~ scale(mean_age) + scale(pct_male) +
scale(mean_education) + scale(pct_good_health),
data = dfa,
weights = n
)
summary(m2)
##
## Call:
## lm(formula = pct_rwpop ~ scale(mean_age) + scale(pct_male) +
## scale(mean_education) + scale(pct_good_health), data = dfa,
## weights = n)
##
## Weighted Residuals:
## Min 1Q Median 3Q Max
## -0.72772 -0.24498 -0.08542 0.18652 1.12733
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.091945 0.004077 22.549 < 2e-16 ***
## scale(mean_age) 0.023708 0.004106 5.774 2.8e-07 ***
## scale(pct_male) -0.005136 0.005275 -0.974 0.3340
## scale(mean_education) -0.002035 0.005757 -0.353 0.7250
## scale(pct_good_health) 0.018422 0.006994 2.634 0.0107 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3956 on 61 degrees of freedom
## Multiple R-squared: 0.4039, Adjusted R-squared: 0.3648
## F-statistic: 10.33 on 4 and 61 DF, p-value: 1.869e-06
Result: The positive effect of age remains robust and significant. A higher proportion of the population reporting good health is associated with a higher share of votes for right-wing populist parties (b = 0.018, p = 0.011), which runs against the expected direction. Education does not show a significant effect once age and health are accounted for, and the gender effect is no longer significant (p = 0.33).
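Because the predictors enter as `scale(...)`, each coefficient gives the expected change in `pct_rwpop` for a one-standard-deviation increase in that predictor, which makes effect sizes comparable across predictors. The estimate of 0.0237 for `scale(mean_age)`, for example, corresponds to roughly 2.4 percentage points per standard deviation of regional mean age. The relation to the raw-unit slope is b_scaled = b_raw · sd(x), as the following sketch with simulated values (hypothetical, not the ESS data) illustrates:

```r
set.seed(42)
toy <- data.frame(age = rnorm(50, mean = 53, sd = 3))  # hypothetical regional mean ages
toy$share <- 0.09 + 0.006 * (toy$age - 53) + rnorm(50, sd = 0.03)

b_raw    <- coef(lm(share ~ age, data = toy))[["age"]]
b_scaled <- coef(lm(share ~ scale(age), data = toy))[["scale(age)"]]

# The standardized slope equals the raw slope times the predictor's SD
c(b_scaled = b_scaled, b_raw_times_sd = b_raw * sd(toy$age))
```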
# Comparing models
anova(m1, m2)
## Analysis of Variance Table
##
## Model 1: pct_rwpop ~ scale(mean_age) + scale(pct_male)
## Model 2: pct_rwpop ~ scale(mean_age) + scale(pct_male) + scale(mean_education) +
## scale(pct_good_health)
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 63 11.1163
## 2 61 9.5471 2 1.5692 5.0129 0.009646 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Result: The model comparison using ANOVA indicates that the extended model fits the data significantly better than the baseline model (F(2, 61) = 5.01, p = 0.010); the adjusted R² rises from 0.28 to 0.36. Age remains the consistently strongest predictor of the share of right-wing populist voters.
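The F statistic reported by `anova()` can be reproduced from the residual sums of squares in the table: F = ((RSS₁ − RSS₂) / Δdf) / (RSS₂ / df₂). The sketch below plugs in the rounded values printed in the table, so the result agrees with the reported statistic only up to rounding.

```r
rss1 <- 11.1163; df1 <- 63   # baseline model m1
rss2 <- 9.5471;  df2 <- 61   # extended model m2

# Nested-model F test from residual sums of squares
f_stat <- ((rss1 - rss2) / (df1 - df2)) / (rss2 / df2)
p_val  <- pf(f_stat, df1 - df2, df2, lower.tail = FALSE)

round(c(F = f_stat, p = p_val), 4)
```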
##################################################
##### END SEMINAR PAPER STUDENT INPUT ############
##################################################
Seminar Paper, Final Version, Magdalena Fink
Right-wing populist parties have steadily gained support across
many European countries over the past decade. This trend has raised
concerns about democratic stability and about the broader social
conditions that can encourage political discontent. Research indicates
that support for populist parties is often linked to perceived social
decline, economic insecurity, and feelings of marginalization among
certain population groups (Inglehart & Norris, 2016; Gidron &
Hall, 2017).
More recently, health in particular has been discussed as a relevant
determinant of political behavior. Poor mental and physical health can
increase feelings of insecurity and pessimism. These feelings may lead
to dissatisfaction with political institutions and increase support for
exclusionary or anti-establishment parties (Marmot et al., 2012; Case
& Deaton, 2020). Empirical studies also suggest that health-related
disadvantage is socially patterned and closely linked with education,
income, and labor market status (Wilkinson & Pickett, 2009). From a
public health perspective, understanding the relationship between health
indicators and political preferences is therefore highly relevant.
Laverty and Hopkinson (2025) demonstrate that poorer population health
is associated with higher levels of right-wing populist voting at the
national level in Europe. Building on this work, the present study aims
to replicate and extend their findings using data from the European
Social Survey (ESS). Specifically, it examines whether regional
differences in health and socio-demographic characteristics are
related to the share of right-wing populist voters across European
countries. By considering both physical and mental health indicators,
the study adds to existing research on the link between health
inequalities and political preferences.
Previous studies have shown that health outcomes are socially patterned and closely linked to education, income, and labor market position. Poor health is more prevalent in socioeconomically disadvantaged groups, which are also more likely to experience political alienation and distrust in institutions (Kavanagh, 2021).
Mental health, especially depressive symptoms, may help explain the link between health and political behavior. Depression is often related to pessimism, feelings of helplessness, and low trust in society, which can make people more receptive to populist messages. Previous studies show that regions with higher psychological distress tend to display stronger support for radical or anti-establishment parties (Gidron & Hall, 2017; Schraff, 2019).
Physical health indicators, such as obesity measured via Body Mass Index (BMI), can be interpreted as markers of cumulative disadvantage and unhealthy living conditions. Higher BMI levels are often associated with lower socioeconomic status and limited access to health-promoting resources, which may indirectly contribute to political dissatisfaction (Marmot et al., 2010).
In addition to health-related factors, socio-demographic characteristics are important confounders in explaining variation in right-wing populist support. Many studies include age, education, income, and gender as predictors of radical right-wing voting behavior (Stockemer, Lentz, & Mayer, 2018).
Based on previous research, several hypotheses are formulated to examine the relationship between regional socio-demographic and health characteristics and support for right-wing populist parties. First, it is expected that regions with an older population structure show higher proportions of right-wing populist voters (H1). Age has repeatedly been identified as an important determinant of voting behavior, with older populations tending to display stronger support for right-wing populist parties.
Second, education is assumed to be negatively associated with right-wing populist voting. Regions with lower average levels of education are therefore expected to exhibit higher proportions of right-wing populist voters (H2). Third, physical health is hypothesized to play an important role. Regions characterized by poorer average health outcomes are expected to show higher levels of support for right-wing populist parties (H3). Fourth, mental health is considered an additional explanatory factor. Regions with higher average levels of depressive symptoms, measured by the CES-D8 scale, are expected to have higher proportions of right-wing populist voters (H4). Finally, gender composition is included as a socio-demographic factor. Regions with a higher proportion of men are expected to exhibit higher right-wing populist vote shares (H5).
The study uses data from the European Social Survey (ESS). Individual-level responses were aggregated to the level of regions within countries. Countries without comparable voting information were excluded. The dependent variable is the proportion of right-wing populist voters (pct_rwpop), constructed by identifying country-specific right-wing populist parties based on ESS voting variables. Independent variables include:
• Average age (mean_age)
• Proportion of males (pct_male)
• Proportion reporting good or very good health (pct_good_health)
• Mean depression score based on the CES-D8 scale (mean_cesd8)
• Mean Body Mass Index (mean_bmi)
• Mean education level (mean_education)
• Mean household income (mean_income)
Regions with fewer than 30 respondents were aggregated into country-specific “other regions” to ensure sufficient sample sizes at the regional level. The analysis includes descriptive statistics, bivariate correlations, and multivariate linear regression models. In addition, graphical analyses were used to visually assess and illustrate the relationships implied by the hypotheses.
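The aggregation step described above can be sketched in base R. This is a minimal illustration on synthetic data, not the authors' actual code: the column names `country`, `region`, `age`, `male`, and `rwpop` and the helper column `unit` are assumptions made for the example, and only three of the regional indicators are computed.

```r
# Synthetic individual-level data; all values are random and purely illustrative.
set.seed(1)
n <- 200
ind <- data.frame(
  country = sample(c("AT", "DE"), n, replace = TRUE),
  region  = sample(c("R1", "R2", "R3"), n, replace = TRUE, prob = c(0.6, 0.3, 0.1)),
  age     = round(runif(n, 18, 90)),
  male    = rbinom(n, 1, 0.5),
  rwpop   = rbinom(n, 1, 0.2),  # hypothetical 0/1 flag: voted for a right-wing populist party
  stringsAsFactors = FALSE
)

# Count respondents per country-region unit.
unit   <- paste(ind$country, ind$region, sep = "_")
n_unit <- table(unit)

# Collapse units with fewer than 30 respondents into a country-specific "other" unit.
min_n    <- 30
small    <- as.vector(n_unit[unit]) < min_n
ind$unit <- ifelse(small, paste(ind$country, "other", sep = "_"), unit)

# Aggregate to the regional level; the mean of a 0/1 flag times 100 is a percentage.
reg <- aggregate(cbind(age, male, rwpop) ~ country + unit, data = ind, FUN = mean)
names(reg)[names(reg) == "age"] <- "mean_age"
reg$pct_male  <- 100 * reg$male
reg$pct_rwpop <- 100 * reg$rwpop
reg$n <- as.vector(table(ind$unit)[as.character(reg$unit)])
reg <- reg[, c("country", "unit", "n", "mean_age", "pct_male", "pct_rwpop")]
reg
```

Note that in this sketch a collapsed "other" unit can itself contain fewer than 30 respondents; the text does not say how such cases were handled.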
4.1 Sample Description
The final dataset comprises 123 country–region units across several European countries. The proportion of right-wing populist voters varies considerably across regions. Substantial regional differences are observed in age structure, health indicators, educational attainment, and income levels, indicating pronounced social and health-related heterogeneity.
4.2 Bivariate Associations
4.2.1 Bivariate Correlations
Pairwise correlations between the share of right-wing populist voters and the potential explanatory variables were first examined. Regional mean age shows a relatively strong positive association with right-wing populist voting. Weaker positive associations are observed for BMI and depression scores (CES-D8), while good self-rated health and educational attainment are negatively associated. Education and income are highly correlated; to avoid redundancy and multicollinearity, only education is retained for the multivariate analysis.
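A pairwise correlation screen of this kind could look as follows. The sketch uses the variable names from the data description but fills them with random numbers, so the coefficients it prints say nothing about the study's results; the object names `reg` and `r_with_dv` are assumptions.

```r
# Synthetic regional data; the number of units matches the text, the values do not.
set.seed(2)
n_reg <- 123
reg <- data.frame(
  mean_age        = rnorm(n_reg, 50, 4),
  pct_male        = rnorm(n_reg, 48, 3),
  pct_good_health = rnorm(n_reg, 65, 8),
  mean_cesd8      = rnorm(n_reg, 6, 1),
  mean_bmi        = rnorm(n_reg, 26, 1),
  mean_education  = rnorm(n_reg, 4, 0.6)
)
# Built-in dependencies so the toy data shows some structure.
reg$mean_income <- 1.5 * reg$mean_education + rnorm(n_reg, 0, 0.3)
reg$pct_rwpop   <- 0.8 * reg$mean_age + rnorm(n_reg, 0, 5)

# Pearson correlation of each candidate predictor with the dependent variable.
predictors <- setdiff(names(reg), "pct_rwpop")
r_with_dv <- sapply(predictors, function(v) {
  cor(reg[[v]], reg$pct_rwpop, use = "pairwise.complete.obs")
})
round(sort(r_with_dv, decreasing = TRUE), 2)
```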
4.2.2 Scatterplots
Scatterplots were used to visualize the bivariate relationships and to detect potential outliers. The plot for mean age displays a clear positive trend, whereas the association with the proportion of men is weaker and less consistent. These plots provide an initial exploratory basis for subsequent hypothesis testing in the multivariate analysis.
4.3 Predictor Correlations
The correlation matrix shows the relationships among the explanatory variables. Strong correlations, such as between education and income, indicate potential redundancy. To avoid multicollinearity, only education was retained as an indicator of socioeconomic status in the regression model.
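One way to screen the predictors for redundancy is to list every pair whose absolute correlation exceeds a cutoff. The sketch below again relies on synthetic data, and the cutoff of 0.8 is a hypothetical choice for illustration; the text does not state which threshold, if any, was applied.

```r
# Synthetic predictors; income is constructed to track education closely.
set.seed(3)
n_reg <- 123
mean_education <- rnorm(n_reg, 4, 0.6)
preds <- data.frame(
  mean_age        = rnorm(n_reg, 50, 4),
  pct_male        = rnorm(n_reg, 48, 3),
  pct_good_health = rnorm(n_reg, 65, 8),
  mean_education  = mean_education,
  mean_income     = 1.5 * mean_education + rnorm(n_reg, 0, 0.3)
)

# Correlation matrix among the explanatory variables.
cm <- cor(preds)

# Report each pair above the (hypothetical) cutoff once, using the upper triangle.
threshold <- 0.8
idx <- which(abs(cm) > threshold & upper.tri(cm), arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = cm[idx]
)
high_pairs
```

Dropping one variable from each flagged pair, as done here with income, is a simple remedy; variance inflation factors would be a more formal check.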
4.4 Multivariate Regression Analysis
4.4.1 Baseline Model: Age + Gender
A baseline regression model including regional mean age and the proportion of men was estimated. Results indicate that age has a positive and statistically significant association with the share of right-wing populist voters, while gender does not show a significant effect. This baseline model provides a reference for comparison with the extended model.
4.4.2 Final Model: Age + Gender + Education + Health
The extended model additionally includes education and the proportion of individuals reporting good or very good health. All predictors were standardized. The positive effect of age remains robust and significant. A higher proportion of healthy individuals is associated with a lower share of votes for right-wing populist parties, although the effect is moderate. Education does not have a significant effect once age and health are accounted for.
4.4.3 Model Comparison
ANOVA model comparisons indicate that the extended model explains the data slightly better than the baseline model. Age remains the consistently strongest predictor of the share of right-wing populist voters.
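The model sequence in Section 4.4 can be sketched with `lm()` and `anova()`. The data are synthetic and the object names (`reg_z`, `m_base`, `m_full`) are assumptions; the sketch also standardizes the predictors in both models, which the text states only for the extended model.

```r
# Synthetic regional data; coefficients below are not the study's estimates.
set.seed(4)
n_reg <- 123
reg <- data.frame(
  mean_age        = rnorm(n_reg, 50, 4),
  pct_male        = rnorm(n_reg, 48, 3),
  mean_education  = rnorm(n_reg, 4, 0.6),
  pct_good_health = rnorm(n_reg, 65, 8)
)
reg$pct_rwpop <- 10 + 0.8 * (reg$mean_age - 50) -
  0.2 * (reg$pct_good_health - 65) + rnorm(n_reg, 0, 5)

# Standardize predictors (mean 0, SD 1) so coefficients are comparable in size.
pred_names <- c("mean_age", "pct_male", "mean_education", "pct_good_health")
reg_z <- reg
reg_z[pred_names] <- lapply(reg[pred_names], function(x) as.numeric(scale(x)))

# Baseline model: age + gender.
m_base <- lm(pct_rwpop ~ mean_age + pct_male, data = reg_z)

# Extended model: adds education and self-rated health.
m_full <- lm(pct_rwpop ~ mean_age + pct_male + mean_education + pct_good_health,
             data = reg_z)

summary(m_full)$coefficients

# F-test for the nested models: do the two added predictors improve the fit?
cmp <- anova(m_base, m_full)
cmp
```

Because the baseline model is nested in the extended one, the F-test in `anova()` directly tests the joint contribution of education and health.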
4.5 Summary of Hypothesis Testing
The following section summarizes the results of the empirical hypothesis testing, highlighting the extent to which each proposed relationship is supported by the statistical analyses.
• H1 (Age): Strongly supported. Regions with higher average age consistently exhibit higher shares of right-wing populist voters.
• H2 (Education): Limited support. Negative association at the bivariate level weakens in the multivariate model.
• H3 (Health): Partially supported. Poorer health is associated with higher right-wing populist voting, particularly in the extended model.
• H4 (Depression): Supported at the bivariate level, but not in the multivariate analysis.
• H5 (Gender): Not supported. No significant effect of the proportion of men on voting patterns.
The results support the proposed hypotheses only in part. Age is the most consistent predictor: regions with a higher mean age show higher shares of right-wing populist voters in both the baseline and the extended model. Regions in which fewer respondents report good or very good health also show higher shares of right-wing populist voters after controlling for age, gender, and education, although the effect is moderate. This finding aligns with prior research suggesting that health-related disadvantage can contribute to political dissatisfaction and increase support for populist parties. The positive bivariate association between BMI and populist voting may reflect broader structural inequalities, including access to health resources and socioeconomic disadvantage. The bivariate relationship with depressive symptoms points to mental health as a potentially relevant political factor, but it is not supported in the multivariate analysis. Education is negatively associated with right-wing populist voting at the bivariate level, but the association is no longer significant once age and health are accounted for. The proportion of men shows no significant association.
Several limitations should be acknowledged. First, the analysis relies on aggregated cross-sectional data, which restricts causal inference. Second, regional aggregation may obscure individual-level mechanisms and heterogeneity. Finally, unmeasured contextual variables could also contribute to observed regional differences. Despite these limitations, the findings provide evidence for an association between health and political behavior at the regional level. Future research could use longitudinal or individual-level data to better disentangle causal pathways and explore the interaction between health, socioeconomic status, and political preferences.
This study extends previous findings by showing that self-rated health is associated with right-wing populist voting at the regional level in Europe, net of age, gender, and education, whereas the associations for depressive symptoms and BMI are observed at the bivariate level. Health disparities thus appear to be related to regional variation in populist support. From a public health perspective, improving population health may have broader societal benefits beyond individual well-being, potentially fostering political stability and social cohesion. Addressing health inequalities may therefore not only improve quality of life but could also help reduce political polarization and populist support, although the cross-sectional design does not permit causal conclusions.
Case, A., & Deaton, A. (2020). Deaths of despair and the future of capitalism. Princeton University Press.
Gidron, N., & Hall, P. A. (2017). The politics of social status: Economic and cultural roots of the populist right. The British Journal of Sociology, 68(S1), S57–S84. https://doi.org/10.1111/1468-4446.12319
Inglehart, R., & Norris, P. (2016). Trump, Brexit, and the rise of populism: Economic have-nots and cultural backlash (Faculty Research Working Paper Series RWP16-026). Harvard Kennedy School.
Kavanagh, N. M., Menon, A., & Heinze, J. E. (2021). Does health vulnerability predict voting for right-wing populist parties in Europe? American Political Science Review, 115(3), 1104–1109. https://doi.org/10.1017/S0003055421000265
Laverty, A. A., & Hopkinson, N. S. (2025). What is the relationship between population health and voting patterns? An ecological study in England. BMJ Open Respiratory Research, 12, e003526. https://doi.org/10.1136/bmjresp-2025-003526
Marmot, M., Allen, J., Bell, R., Bloomer, E., & Goldblatt, P. (2012). WHO European review of social determinants of health and the health divide. The Lancet, 380(9846), 1011–1029. https://doi.org/10.1016/S0140-6736(12)61228-8
Marmot, M., Allen, J., Goldblatt, P., Boyce, T., & McNeish, D. (2010). Fair society, healthy lives: The Marmot review. Institute of Health Equity. https://www.instituteofhealthequity.org/resources-reports/fair-society-healthy-lives-the-marmot-review
Schraff, D. (2019). Political trust during the COVID-19 pandemic: Rally around the flag or lockdown effects? European Journal of Political Research, 60(4), 1007–1017. https://doi.org/10.1111/1475-6765.12425
Stockemer, D., Lentz, T., & Mayer, D. (2018). Individual predictors of the radical right-wing vote in Europe: A meta-analysis of articles in peer-reviewed journals (1995–2016). Government and Opposition, 53(3), 569–593. https://doi.org/10.1017/gov.2018.2
Wilkinson, R. G., & Pickett, K. E. (2009). The spirit level: Why more equal societies almost always do better. Allen Lane.