This script analyzes how social determinants (perceived income, satisfaction with health services, childhood household conflict, and working hours) relate to depression (CES-D8 score) in Austria, using ESS11 data.
# Load required packages
library(foreign) # for reading SPSS files
library(likert) # for Likert plots and summaries
library(kableExtra) # for nicer tables
First of all, the CES-D8 variables were recoded as follows:
# Convert selected items to numeric scale
cesd_items = c("fltdpr", "flteeff", "slprl", "wrhpp", "fltlnl", "enjlf", "fltsd", "cldgng")
for (item in cesd_items) {
df_aus[[paste0(item, "_n")]] = as.numeric(NA)
df_aus[[paste0(item, "_n")]][df_aus[[item]] == "None or almost none of the time"] = 1
df_aus[[paste0(item, "_n")]][df_aus[[item]] == "Some of the time"] = 2
df_aus[[paste0(item, "_n")]][df_aus[[item]] == "Most of the time"] = 3
df_aus[[paste0(item, "_n")]][df_aus[[item]] == "All or almost all of the time"] = 4
}
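# Reverse-code the two positively worded items (was happy, enjoyed life)
df_aus$wrhpp_n = 5 - df_aus$wrhpp_n
df_aus$enjlf_n = 5 - df_aus$enjlf_n
# CES-D8 score: mean of the available items per respondent (range 1 to 4)
df_aus$CESD_TOTAL = rowMeans(df_aus[, paste0(cesd_items, "_n")], na.rm = TRUE)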
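The recoding can be checked on a small toy data frame. The following is a minimal sketch, assuming three made-up respondents and only two of the eight items (`fltdpr` and `wrhpp`). The names `toy`, `labels`, and `score` are hypothetical, and the values are not taken from ESS11.

```r
# Toy data (hypothetical, not ESS11): one negatively and one positively worded item
labels = c("None or almost none of the time", "Some of the time",
           "Most of the time", "All or almost all of the time")
toy = data.frame(
  fltdpr = c(labels[1], labels[4], NA),
  wrhpp  = c(labels[4], labels[1], labels[2])
)

# Map the four response labels to 1..4 by their position in `labels`
toy$fltdpr_n = match(toy$fltdpr, labels)
toy$wrhpp_n  = match(toy$wrhpp, labels)

# Reverse-code the positively worded item so that higher always means more depressed
toy$wrhpp_n = 5 - toy$wrhpp_n

# Mean over the available items; a row with one missing item uses the other one
toy$score = rowMeans(toy[, c("fltdpr_n", "wrhpp_n")], na.rm = TRUE)
toy$score
```

Because of `na.rm = TRUE`, the third respondent still receives a score based on the single answered item. Whether that is desirable for the real data depends on how many missing items one is willing to tolerate per respondent.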
# Extract relevant items for Likert analysis
likert_df_aus = df_aus[, cesd_items ]
# Create and display Likert summary
likert_obj = likert(likert_df_aus)
summary(likert_obj)
## Item low neutral high mean sd
## 4 wrhpp 31.05218 0 68.947819 2.872541 0.7882364
## 6 enjlf 36.49979 0 63.500214 2.812580 0.8249923
## 3 slprl 89.25831 0 10.741688 1.632992 0.7562128
## 2 flteeff 90.84327 0 9.156729 1.590716 0.7119641
## 5 fltlnl 95.36170 0 4.638298 1.286383 0.5930864
## 8 cldgng 95.78185 0 4.218151 1.362591 0.6056129
## 7 fltsd 96.58849 0 3.411514 1.360768 0.5931890
## 1 fltdpr 96.80851 0 3.191489 1.358723 0.5727590
# Plot the Likert-scale responses
plot(likert_obj)
# Convert the original (not reverse-coded) items to their numeric level codes
likert_numeric_df_aus = as.data.frame(lapply(df_aus[, cesd_items], as.numeric))
# Calculate means for each item
likert_means = sapply(likert_numeric_df_aus, mean, na.rm = TRUE)
# Print the means
likert_means
## fltdpr flteeff slprl wrhpp fltlnl enjlf fltsd cldgng
## 1.358723 1.590716 1.632992 2.872541 1.286383 2.812580 1.360768 1.362591
# Unweighted multiple linear regression of the CES-D8 score on the four predictors
mlrm = lm(CESD_TOTAL ~ hincfel_num + cnfpplh_num + wkhtot + stfhlth_num, data = df_aus)
summary(mlrm)
##
## Call:
## lm(formula = CESD_TOTAL ~ hincfel_num + cnfpplh_num + wkhtot +
## stfhlth_num, data = df_aus)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.00641 -0.27287 -0.07585 0.20838 2.29300
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.6477758 0.0639004 41.436 < 2e-16 ***
## hincfel_num -0.1532889 0.0128009 -11.975 < 2e-16 ***
## cnfpplh_num -0.0912248 0.0092222 -9.892 < 2e-16 ***
## wkhtot -0.0003108 0.0008385 -0.371 0.711
## stfhlth_num -0.0240280 0.0038789 -6.194 7.02e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4118 on 2093 degrees of freedom
## (256 observations deleted due to missingness)
## Multiple R-squared: 0.1422, Adjusted R-squared: 0.1406
## F-statistic: 86.76 on 4 and 2093 DF, p-value: < 2.2e-16
# Weighted regression using the ESS design weight (dweight)
df_aus$dweight = as.numeric(df_aus$dweight)
mlrm_weighted = lm(CESD_TOTAL ~ hincfel_num + cnfpplh_num + wkhtot + stfhlth_num,
data = df_aus,
weights = df_aus$dweight)
summary(mlrm_weighted)
##
## Call:
## lm(formula = CESD_TOTAL ~ hincfel_num + cnfpplh_num + wkhtot +
## stfhlth_num, data = df_aus, weights = df_aus$dweight)
##
## Weighted Residuals:
## Min 1Q Median 3Q Max
## -1.31976 -0.23404 -0.04067 0.21687 2.55893
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.6461481 0.0628021 42.135 < 2e-16 ***
## hincfel_num -0.1606669 0.0124434 -12.912 < 2e-16 ***
## cnfpplh_num -0.0903881 0.0090008 -10.042 < 2e-16 ***
## wkhtot -0.0012564 0.0007874 -1.596 0.111
## stfhlth_num -0.0207764 0.0036831 -5.641 1.92e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3945 on 2093 degrees of freedom
## (256 observations deleted due to missingness)
## Multiple R-squared: 0.1477, Adjusted R-squared: 0.1461
## F-statistic: 90.68 on 4 and 2093 DF, p-value: < 2.2e-16
We ran two regression models to explore what influences depression levels (CES-D8) in Austria: one without survey weights and one using the design weight (dweight) to better reflect the general population.
In both models, the perceived financial situation (hincfel_num) had the largest coefficient and t-statistic: the more difficult people felt their financial situation was, the higher their depression scores. The negative sign is consistent with this reading only if higher values of hincfel_num indicate a more comfortable situation. The coefficient became slightly larger in magnitude when weights were applied (from -0.153 to -0.161). Childhood household conflict (cnfpplh_num) also kept a clearly negative coefficient that barely changed (from -0.091 to -0.090), and satisfaction with health services (stfhlth_num) showed a smaller but still significant negative coefficient that weakened slightly (from -0.024 to -0.021). Working hours per week (wkhtot) showed no significant association in either model (-0.0003, p = 0.711, and -0.0013, p = 0.111).
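Because the predictors are measured on different scales, raw coefficients alone do not settle which predictor is strongest. One common way to make them comparable is to standardize the outcome and the predictors before fitting. The following is a minimal sketch on simulated data. The names `sim`, `x_short`, `x_long`, `raw_fit`, and `std_fit` are hypothetical, and the numbers are not ESS11 results.

```r
# Simulated data (hypothetical, not ESS11): two predictors on different scales
set.seed(1)
n = 500
sim = data.frame(
  x_short = sample(1:4, n, replace = TRUE),   # 4-point scale
  x_long  = sample(0:10, n, replace = TRUE)   # 11-point scale
)
sim$y = 2.5 - 0.15 * sim$x_short - 0.02 * sim$x_long + rnorm(n, sd = 0.4)

# Raw coefficients: change in y per one scale point of each predictor
raw_fit = lm(y ~ x_short + x_long, data = sim)

# Standardized coefficients: z-score outcome and predictors, then refit
sim_z = as.data.frame(scale(sim))
std_fit = lm(y ~ x_short + x_long, data = sim_z)

round(coef(raw_fit), 3)
round(coef(std_fit), 3)
```

Each standardized coefficient equals the raw coefficient multiplied by `sd(x) / sd(y)`, so a predictor with a wide scale can look weak in raw units and still matter in standardized units.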
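Overall, the two models lead to the same substantive conclusions. The weighted model reports a slightly higher adjusted R² (0.146 vs. 0.141) and a lower residual standard error (0.3945 vs. 0.4118), but both statistics are computed from weighted residuals in the second model, so they are not strictly comparable with the unweighted values and should not be read as proof of a better fit. The reason to prefer the weighted estimates is that the design weight corrects for unequal selection probabilities, which makes the results more representative of the population.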
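To make explicit what the `weights` argument of `lm()` does, the sketch below fits a weighted model on simulated data and reproduces its coefficients from the weighted normal equations. This is only an illustration under stated assumptions: `sim`, `w`, `fit_w`, and `b_manual` are hypothetical names, and the data are simulated rather than taken from ESS11.

```r
# Simulated data (hypothetical, not ESS11) with a made-up weight column `w`
set.seed(42)
n = 200
sim = data.frame(x = rnorm(n), w = runif(n, 0.5, 2))
sim$y = 1 + 0.5 * sim$x + rnorm(n)

# Weighted fit as in the script: lm() minimizes sum(w * residual^2)
fit_w = lm(y ~ x, data = sim, weights = w)

# Same estimate from the weighted normal equations (X'WX) b = X'Wy
X = cbind(1, sim$x)
b_manual = solve(t(X) %*% (sim$w * X), t(X) %*% (sim$w * sim$y))

cbind(lm = coef(fit_w), manual = as.vector(b_manual))
```

The point estimates agree, but `lm()` treats the weights as precision weights, so its standard errors are not design-based. A design-based alternative would be `svyglm()` from the `survey` package, which is not used in this script.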