This paper analyzes depressive symptoms in Austria using data from the European Social Survey (ESS11). The analysis includes descriptive statistics, in-depth data visualizations and regression models for the CES-D8 depression scale.
library(foreign)
library(ggplot2)
library(dplyr)
library(tidyr)
library(psych)
library(pscl)
library(likert)
library(kableExtra)
# Load the ESS11 .sav file (adjust the path below to the local copy of the data)
suppressWarnings(df <- read.spss("C:/Users/Vinicius Alpha/Downloads/ESS11.sav", to.data.frame = TRUE))
# Filter Austria
df_AT <- subset(df, cntry == "Austria")
# Item responses for the Likert-style plot and table shown later
df_likert <- df %>%
subset(cntry == "Austria") %>%
select(fltdpr, flteeff, slprl, wrhpp, fltlnl, enjlf, fltsd, cldgng)
# Print the size of the sample
cat(paste("The Austrian sample consists of", nrow(df_AT), "respondents.\n"))
## The Austrian sample consists of 2354 respondents.
We built the CES-D8 scale using eight self-reported items related to depression. Two positively worded items are inversely recoded so that higher scores indicate greater depression.
# Convert items to numeric
df_AT$d1 <- as.numeric(df_AT$fltdpr)
df_AT$d2 <- as.numeric(df_AT$flteeff)
df_AT$d3 <- as.numeric(df_AT$slprl)
df_AT$d4 <- as.numeric(df_AT$wrhpp)
df_AT$d5 <- as.numeric(df_AT$fltlnl)
df_AT$d6 <- as.numeric(df_AT$enjlf)
df_AT$d7 <- as.numeric(df_AT$fltsd)
df_AT$d8 <- as.numeric(df_AT$cldgng)
# Recode positive items inversely
df_AT$d4 <- 5 - df_AT$d4
df_AT$d6 <- 5 - df_AT$d6
# Ensure that the numerical conversion is strict
scale_items <- df_AT[, c("d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8")]
scale_items[] <- lapply(scale_items, function(x) as.numeric(as.character(x)))
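# Calculate the depression score: sum of the eight items (each coded 1-4), shifted to a 0-24 range
# Note: with na.rm = TRUE a missing item adds nothing to the sum, so respondents with missing items are scored too low
df_AT$CES_D8 <- rowSums(scale_items, na.rm = TRUE) - 8
# Create binary outcome variable for clinically significant depression (CES-D8 >= 2)
df_AT$depr_clinical <- ifelse(df_AT$CES_D8 >= 2.0, 1, 0)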
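The scoring logic can be illustrated on a small toy data set. This is a minimal sketch, not the code used for the reported results: the three respondents are invented, and, unlike the sum above, it returns NA for any respondent with a missing item so that the 0-24 range is preserved.

```r
# Toy data: three hypothetical respondents, items coded 1 (none of the time) to 4 (all of the time)
items <- data.frame(
  fltdpr = c(1, 2, 4), flteeff = c(1, 3, 4), slprl  = c(2, 2, NA),
  wrhpp  = c(4, 3, 1), fltlnl  = c(1, 2, 4), enjlf  = c(4, 2, 1),
  fltsd  = c(1, 2, 4), cldgng  = c(1, 1, 4)
)

# Reverse the two positively worded items so that high values mean more symptoms
positive <- c("wrhpp", "enjlf")
items[positive] <- 5 - items[positive]

# Shift each item to 0-3 and sum across items; rows with a missing item stay NA
ces_d8 <- rowSums(items - 1)
print(ces_d8)
```

Whether incomplete respondents should be dropped, prorated, or imputed is a design choice that this sketch leaves open.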
# Transform variables to factors
df_AT$gndr <- factor(df_AT$gndr)
df_AT$marsts <- factor(df_AT$marsts)
df_AT$uempla <- factor(df_AT$uempla)
# Classify age into groups
df_AT$agea <- as.numeric(as.character(df_AT$agea))
df_AT$age_group <- factor(NA, levels = c("<30", "30-45", "46-55", "56-65", ">66"))
df_AT$age_group[df_AT$agea < 30] <- "<30"
df_AT$age_group[df_AT$agea >= 30 & df_AT$agea <= 45] <- "30-45"
df_AT$age_group[df_AT$agea >= 46 & df_AT$agea <= 55] <- "46-55"
df_AT$age_group[df_AT$agea >= 56 & df_AT$agea <=65] <- "56-65"
df_AT$age_group[df_AT$agea >= 66] <- ">66"
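The same grouping can be written more compactly with `cut()`. The sketch below assumes ages are recorded as whole years (so "<30" means 29 or younger); the `agea` vector is made up for illustration.

```r
# Hypothetical ages, including boundary values and one missing value
agea <- c(18, 29, 30, 45, 46, 55, 56, 65, 66, 90, NA)

# Right-closed intervals: (-Inf,29], (29,45], (45,55], (55,65], (65,Inf)
age_group <- cut(
  agea,
  breaks = c(-Inf, 29, 45, 55, 65, Inf),
  labels = c("<30", "30-45", "46-55", "56-65", ">66"),
  right  = TRUE
)
print(table(age_group, useNA = "ifany"))
```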
# Ensure that hincfel is a factor with short labels (the four substantive ESS categories, codes 1-4)
df_AT$hincfel_factor <- factor(as.numeric(df_AT$hincfel), levels = 1:4, labels = c("Living comfortably", "Coping", "Difficult", "Very difficult"))
# Transform predictor variables to numeric codes and impute missing values with the mode
replace_with_mode <- function(x) {
x <- as.character(x)
non_na <- x[!is.na(x)]
if (length(non_na) == 0) return(x)
mode_value <- names(sort(table(non_na), decreasing = TRUE))[1]
x[is.na(x)] <- mode_value
return(as.numeric(as.character(x)))
}
df_AT$fltlnl <- as.numeric(as.factor(df_AT$fltlnl))
df_AT$sclmeet <- as.numeric(as.factor(df_AT$sclmeet))
df_AT$health <- as.numeric(as.factor(df_AT$health))
df_AT$dosprt <- as.numeric(as.factor(df_AT$dosprt))
df_AT$fltlnl <- replace_with_mode(df_AT$fltlnl)
df_AT$sclmeet <- replace_with_mode(df_AT$sclmeet)
df_AT$health <- replace_with_mode(df_AT$health)
df_AT$dosprt <- replace_with_mode(df_AT$dosprt)
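To make the effect of mode imputation concrete, here is a small stand-alone sketch. The helper `impute_mode()` and the `health` vector are hypothetical and only mirror the idea of `replace_with_mode()` for numeric input.

```r
# Replace missing values with the most frequent observed value
impute_mode <- function(x) {
  counts <- table(x)                       # table() ignores NA by default
  if (length(counts) == 0) return(x)       # nothing observed: leave as is
  mode_value <- as.numeric(names(counts)[which.max(counts)])
  x[is.na(x)] <- mode_value
  x
}

health <- c(2, 2, 3, NA, 1, 2, NA)
print(impute_mode(health))
```

Mode imputation keeps the full sample but pulls the distribution towards its most common value, which is worth bearing in mind when reading the regression results.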
This section presents a series of meaningful graphs, generated with
the ggplot2 package, to visualise important aspects of the
data and analyses.
These graphs illustrate the distribution of the main demographic variables in the Austrian sample.
This histogram shows the distribution of the calculated CES-D8 scale scores in the sample, providing an overview of the prevalence of depressive symptoms.
The distribution is right-skewed, with most respondents scoring low on the scale, particularly between 2 and 6, suggesting low levels of depressive symptoms. However, a considerable number of individuals score above the clinical cut-off line, indicating the presence of clinically relevant depressive symptoms in a portion of the population. While the majority fall below the threshold, the tail on the right highlights that depressive symptoms are still a notable concern in this sample.
This visualisation details the distribution of responses for each of the eight items on the CES-D8 scale.
The table below lists the eight items of the CES-D8 depression scale with the percentage of responses in each category:
| Item | All or almost all of the time (%) | Most of the time (%) | None or almost none of the time (%) | Some of the time (%) |
|---|---|---|---|---|
| fltdpr | 0.9 | 2.3 | 68.2 | 28.6 |
| flteeff | 2.0 | 7.1 | 52.1 | 38.7 |
| slprl | 3.1 | 7.6 | 50.6 | 38.7 |
| wrhpp | 21.9 | 47.0 | 3.6 | 27.5 |
| fltlnl | 77.4 | 18.0 | 3.3 | 1.4 |
| enjlf | 22.1 | 41.4 | 4.3 | 32.2 |
| fltsd | 1.3 | 2.1 | 68.7 | 27.9 |
| cldgng | 1.3 | 2.9 | 69.2 | 26.5 |
Items such as fltlnl (loneliness) and wrhpp (happiness) stand out. For fltlnl, a striking 77.4% of respondents reported feeling lonely all or almost all of the time, indicating high perceived loneliness in the sample. For wrhpp, 21.9% of respondents reported feeling happy all or almost all of the time, and nearly half (47.0%) said they felt happy most of the time.
Items like fltdpr (felt depressed), fltsd (felt sad), and cldgng (felt unable to get going) had the majority of responses in the none or almost none of the time category, suggesting lower frequency of these symptoms across the sample.
enjlf (enjoyed life) and flteeff (everything was an effort) show more varied distributions, indicating moderate symptom expression among participants.
Overall, the table highlights which depressive symptoms are most and least prevalent in the Austrian sample, with loneliness emerging as a particularly frequent concern.
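A percentage table of this kind can be produced with `table()` and `prop.table()`. The sketch below uses simulated responses, not ESS data, and fixes the factor levels explicitly so that the columns follow the substantive order of the response scale rather than alphabetical order; the object names are illustrative.

```r
# Response categories in their substantive order
resp_levels <- c("None or almost none of the time", "Some of the time",
                 "Most of the time", "All or almost all of the time")

# Simulated answers for two items (hypothetical probabilities)
set.seed(1)
toy <- data.frame(
  fltdpr = factor(sample(resp_levels, 200, replace = TRUE, prob = c(.70, .25, .03, .02)),
                  levels = resp_levels),
  wrhpp  = factor(sample(resp_levels, 200, replace = TRUE, prob = c(.05, .25, .45, .25)),
                  levels = resp_levels)
)

# Row percentages per item, rounded to one decimal
pct <- t(sapply(toy, function(x) round(100 * prop.table(table(x)), 1)))
print(pct)
```

Setting `levels` explicitly also guards against percentages ending up under the wrong column label when factor order and label order differ.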
These graphs explore the relationship between depression scores (CES-D8) and key demographic variables such as gender and age group.
The plot shows the distribution of CES-D8 depression scores by gender. Each dot represents an individual score, with a black line indicating the group mean. While both males and females display a wide range of scores, the average score is slightly higher among females, suggesting that women report more depressive symptoms on average. This pattern aligns with existing evidence on gender differences in mental health.
This plot shows the distribution of CES-D8 depression scores by age
group, with each point representing an individual and the thick
horizontal lines indicating group means.
Overall, there is a visible trend of increasing depression scores with age. Older age groups, particularly those over 66, tend to have higher average scores, suggesting a greater presence of depressive symptoms. Younger groups (<30 and 30–45) show lower mean scores. However, there is substantial variability within each group.
In summary, depressive symptoms appear to increase slightly with age, though individual differences are present across all groups.
These graphs visualise the association between important predictors (self-reported health, financial difficulty) and the occurrence of clinically significant depression.
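We now evaluate potential predictors of depressive symptoms using a linear regression model. The predictors are perceived loneliness (fltlnl), frequency of social meetings (sclmeet), self-reported health (health), and physical activity (dosprt).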
We tested unweighted and weighted models.
## fltlnl sclmeet health dosprt
## fltlnl 1.00000000 -0.04745841 0.2541255 -0.09371122
## sclmeet -0.04745841 1.00000000 -0.1943385 0.09867391
## health 0.25412550 -0.19433852 1.0000000 -0.27171649
## dosprt -0.09371122 0.09867391 -0.2717165 1.00000000
##
## Call:
## lm(formula = CES_D8 ~ fltlnl + sclmeet + health + dosprt, data = df_AT)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.9441 -1.8440 -0.3798 1.5815 13.4618
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.61754 0.32469 -4.982 6.76e-07 ***
## fltlnl 3.03215 0.09436 32.135 < 2e-16 ***
## sclmeet -0.03232 0.04544 -0.711 0.477
## health 1.40031 0.06754 20.733 < 2e-16 ***
## dosprt -0.03547 0.02322 -1.527 0.127
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.623 on 2349 degrees of freedom
## Multiple R-squared: 0.464, Adjusted R-squared: 0.4631
## F-statistic: 508.3 on 4 and 2349 DF, p-value: < 2.2e-16
The unweighted linear regression model identifies perceived loneliness (fltlnl) and self-reported health (health) as strong and statistically significant predictors of CES-D8 depression scores. Specifically, increased feelings of loneliness are associated with substantially higher depression scores (β = 3.03, p < 0.001), and poorer self-reported health is also strongly associated with increased depression (β = 1.40, p < 0.001).
In contrast, frequency of social gatherings (sclmeet) and physical activity (dosprt) were not significant predictors in this model (p = 0.48 and p = 0.13, respectively). The overall model explains approximately 46.4% of the variance in depression scores (Adjusted R² = 0.463), indicating a good model fit.
These findings highlight the critical role of subjective loneliness and perceived health status in understanding depressive symptoms, while suggesting that frequency-based behavioral factors like social and sport activities may not independently predict depression levels when accounting for loneliness and health perception.
##
## Call:
## lm(formula = CES_D8 ~ fltlnl + sclmeet + health + dosprt, data = df_AT,
## weights = dweight)
##
## Weighted Residuals:
## Min 1Q Median 3Q Max
## -13.2070 -1.7339 -0.2716 1.3998 18.9456
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.80321 0.32269 -5.588 2.56e-08 ***
## fltlnl 3.10303 0.09745 31.842 < 2e-16 ***
## sclmeet -0.01644 0.04493 -0.366 0.7145
## health 1.42413 0.06784 20.992 < 2e-16 ***
## dosprt -0.04803 0.02294 -2.094 0.0364 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.573 on 2349 degrees of freedom
## Multiple R-squared: 0.4532, Adjusted R-squared: 0.4523
## F-statistic: 486.7 on 4 and 2349 DF, p-value: < 2.2e-16
The weighted regression model, which adjusts for sampling weights, reinforces the findings from the unweighted model. Perceived loneliness (fltlnl) and self-reported health (health) remain strong and statistically significant predictors of CES-D8 depression scores (β = 3.10 and β = 1.42, respectively; p < 0.001 for both).
Unlike the unweighted model, physical activity (dosprt) becomes a significant predictor (p = 0.036), showing a small but statistically significant negative association with depression scores. Meanwhile, frequency of social gatherings (sclmeet) remains non-significant (p = 0.71).
The model explains about 45.3% of the variance in depression scores (Adjusted R² = 0.452), indicating a strong fit. These results suggest that loneliness and perceived health are robust predictors of depression even after accounting for sampling design, while physical activity may also play a modest protective role.
## Unweighted R-squared: 0.464
## Weighted R-squared: 0.453
Both the unweighted and weighted regression models show similar patterns in terms of significant predictors and model performance. The unweighted model explains slightly more variance in CES-D8 scores (R² = 0.464) compared to the weighted model (R² = 0.453), though the difference is minimal.
Key predictors—perceived loneliness and self-reported health—remain highly significant in both models. However, physical activity only becomes statistically significant in the weighted model (p = 0.036), suggesting that accounting for sampling design may highlight its role in reducing depressive symptoms. Social gatherings, on the other hand, remain non-significant in both models.
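The comparison between the two specifications can be reproduced in outline with simulated data. In this sketch the data frame `sim`, its coefficients, and the `dweight` values are invented. Note also that `lm(..., weights = )` treats the weights as precision weights, so a design-based analysis (for example with the survey package) may give somewhat different standard errors.

```r
set.seed(42)
n <- 500

# Simulated predictors and hypothetical design weights
sim <- data.frame(
  fltlnl  = sample(1:4, n, replace = TRUE),
  health  = sample(1:5, n, replace = TRUE),
  dweight = runif(n, 0.5, 2)
)
sim$CES_D8 <- 3 * sim$fltlnl + 1.4 * sim$health + rnorm(n, sd = 2.5)

# Same formula, fitted without and with weights
fit_unw <- lm(CES_D8 ~ fltlnl + health, data = sim)
fit_w   <- lm(CES_D8 ~ fltlnl + health, data = sim, weights = dweight)

# Side-by-side coefficients and R-squared values
print(round(cbind(unweighted = coef(fit_unw), weighted = coef(fit_w)), 3))
print(c(unweighted = summary(fit_unw)$r.squared,
        weighted   = summary(fit_w)$r.squared))
```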
Overall, the weighted model provides a more accurate reflection of the population by incorporating survey weights, even though it explains slightly less variance. The consistency between models strengthens the robustness of the main findings.
# Logistic Regression
This section explores the relationship between demographic variables and clinically significant depression through logistic regression models.
##
## Male Female
## 0 146 170
## 1 847 1191
## Odds Ratio (Gender): 1.21
##
## Call:
## glm(formula = depr_clinical ~ gndr, family = binomial, data = df_AT)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 1.75809 0.08961 19.619 <2e-16 ***
## gndrFemale 0.18866 0.12146 1.553 0.12
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1856.7 on 2353 degrees of freedom
## Residual deviance: 1854.3 on 2352 degrees of freedom
## AIC: 1858.3
##
## Number of Fisher Scoring iterations: 4
## Exponentiated coefficients (Odds Ratios):
## (Intercept) gndrFemale
## 5.801370 1.207626
## Confidence Intervals (Odds Ratios):
## 2.5 % 97.5 %
## (Intercept) 4.8840119 6.941344
## gndrFemale 0.9511476 1.531711
This simple logistic regression model examined the association between gender and the likelihood of experiencing clinically significant depressive symptoms. The reference category for gender is male.
From the contingency table, 1,191 women and 847 men were classified as “depressed” (outcome = 1), while 170 women and 146 men were classified as “not depressed” (outcome = 0). The manually calculated odds ratio (OR) is (1191/170) / (847/146) ≈ 1.21, indicating that the odds of being classified as clinically depressed are 21% higher for women compared to men.
The model’s exponentiated coefficient for gender (gndrFemale) is 1.21, consistent with the manual calculation. However, the 95% confidence interval for this OR is [0.95, 1.53], and the p-value is 0.12. This suggests that although the odds are higher for women, the effect is not statistically significant at the conventional 0.05 level.
In summary, while women show a higher likelihood of depression in this sample, the association does not reach statistical significance, meaning we cannot rule out the possibility that this difference is due to chance.
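The manual odds ratio can be checked directly from the four cell counts in the contingency table above. The sketch also computes a Wald interval from the reported coefficient and standard error. This is an approximation, so it should be close to, but need not exactly match, the interval printed in the output.

```r
# Cell counts from the contingency table (rows: outcome, columns: gender)
tab <- matrix(c(146, 847, 170, 1191), nrow = 2,
              dimnames = list(depr_clinical = c("0", "1"),
                              gndr = c("Male", "Female")))

# Odds of being classified as depressed within each gender
odds_male   <- tab["1", "Male"]   / tab["0", "Male"]
odds_female <- tab["1", "Female"] / tab["0", "Female"]
or_gender   <- odds_female / odds_male
print(round(c(odds_male = odds_male, odds_female = odds_female, OR = or_gender), 3))

# Wald 95% interval on the odds-ratio scale from the glm estimate and standard error
b  <- 0.18866
se <- 0.12146
wald_ci <- exp(b + c(-1, 1) * qnorm(0.975) * se)
print(round(wald_ci, 3))
```

The odds for men equal the exponentiated intercept of the model, because male is the reference category.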
##
## Call:
## glm(formula = depr_clinical ~ gndr + age_group + hincfel + health +
## marsts + uempla, family = binomial, data = df_AT)
##
## Coefficients:
## Estimate
## (Intercept) 0.036739
## gndrFemale 0.009086
## age_group30-45 0.202469
## age_group46-55 0.238523
## age_group56-65 -0.711495
## age_group>66 -0.650695
## hincfelCoping on present income 0.629054
## hincfelDifficult on present income 0.967438
## hincfelVery difficult on present income 1.503356
## health 1.178852
## marstsIn a legally registered civil union 11.279587
## marstsLegally divorced/Civil union dissolved -0.554954
## marstsWidowed/Civil partner died 0.283693
## marstsNone of these (NEVER married or in legally registered civil union) -0.463885
## uemplaMarked -0.354214
## Std. Error
## (Intercept) 0.635099
## gndrFemale 0.181179
## age_group30-45 0.280703
## age_group46-55 0.340847
## age_group56-65 0.310699
## age_group>66 0.335948
## hincfelCoping on present income 0.188381
## hincfelDifficult on present income 0.347306
## hincfelVery difficult on present income 0.765141
## health 0.148644
## marstsIn a legally registered civil union 590.509545
## marstsLegally divorced/Civil union dissolved 0.575598
## marstsWidowed/Civil partner died 0.635907
## marstsNone of these (NEVER married or in legally registered civil union) 0.577188
## uemplaMarked 0.582109
## z value
## (Intercept) 0.058
## gndrFemale 0.050
## age_group30-45 0.721
## age_group46-55 0.700
## age_group56-65 -2.290
## age_group>66 -1.937
## hincfelCoping on present income 3.339
## hincfelDifficult on present income 2.786
## hincfelVery difficult on present income 1.965
## health 7.931
## marstsIn a legally registered civil union 0.019
## marstsLegally divorced/Civil union dissolved -0.964
## marstsWidowed/Civil partner died 0.446
## marstsNone of these (NEVER married or in legally registered civil union) -0.804
## uemplaMarked -0.609
## Pr(>|z|)
## (Intercept) 0.95387
## gndrFemale 0.96000
## age_group30-45 0.47073
## age_group46-55 0.48406
## age_group56-65 0.02202
## age_group>66 0.05276
## hincfelCoping on present income 0.00084
## hincfelDifficult on present income 0.00534
## hincfelVery difficult on present income 0.04944
## health 2.18e-15
## marstsIn a legally registered civil union 0.98476
## marstsLegally divorced/Civil union dissolved 0.33498
## marstsWidowed/Civil partner died 0.65551
## marstsNone of these (NEVER married or in legally registered civil union) 0.42157
## uemplaMarked 0.54285
##
## (Intercept)
## gndrFemale
## age_group30-45
## age_group46-55
## age_group56-65 *
## age_group>66 .
## hincfelCoping on present income ***
## hincfelDifficult on present income **
## hincfelVery difficult on present income *
## health ***
## marstsIn a legally registered civil union
## marstsLegally divorced/Civil union dissolved
## marstsWidowed/Civil partner died
## marstsNone of these (NEVER married or in legally registered civil union)
## uemplaMarked
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1004.42 on 1341 degrees of freedom
## Residual deviance: 858.91 on 1327 degrees of freedom
## (1012 observations deleted due to missingness)
## AIC: 888.91
##
## Number of Fisher Scoring iterations: 13
The multivariate logistic regression model aimed to identify sociodemographic and health-related factors associated with clinically significant depression (CES-D8 ≥ 2). The results are interpreted using odds ratios (ORs), where values above 1 indicate increased likelihood of depression.
Gender: Being female was not associated with the odds of depression once the other predictors were included (OR = 1.01, p = 0.96).
Age: The odds of depression were lower in the older age groups. Participants aged 56–65 had about half the odds of those under 30 (OR = 0.49, p = 0.022), and the estimate for those aged 66 and over was similar but only marginally significant (OR = 0.52, p = 0.053). The 30–45 and 46–55 groups did not differ significantly from the reference group.
Financial Difficulties: A graded association was found between perceived income difficulty and depression. Compared to those living comfortably, the odds were higher for respondents coping on their present income (OR = 1.88, p < 0.001), finding it difficult (OR = 2.63, p = 0.005), and finding it very difficult (OR = 4.50, p = 0.049).
Self-Rated Health: Poorer self-rated health significantly increased the odds of depression. Each one-step worsening on the health scale multiplied the odds by about 3.25 (OR = 3.25, p < 0.001), highlighting the close relationship between physical and mental health.
Marital Status: None of the marital status categories differed significantly from the reference category (widowed: OR = 1.33; divorced: OR = 0.57; never married: OR = 0.63; all p > 0.3). The estimate for registered civil unions (11.28, standard error 590.5) reflects a sparsely populated category and should not be interpreted.
Employment Status: Being unemployed was not significantly associated with depression in this model (OR = 0.70, p = 0.54).
Overall, the model reinforces the impact of social determinants on mental health, emphasizing the importance of targeted interventions for vulnerable populations.
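Odds ratios follow from the coefficient table by exponentiation. The sketch below copies a subset of the estimates and standard errors from the output above and adds Wald 95% intervals. The interval method is an assumption of this sketch, and profile-likelihood intervals from `confint()` on the fitted model could differ, especially for imprecise estimates.

```r
# Selected estimates and standard errors copied from the model output
est <- c(
  "gndrFemale"                              =  0.009086,
  "age_group56-65"                          = -0.711495,
  "age_group>66"                            = -0.650695,
  "hincfelVery difficult on present income" =  1.503356,
  "health"                                  =  1.178852,
  "uemplaMarked"                            = -0.354214
)
se <- c(0.181179, 0.310699, 0.335948, 0.765141, 0.148644, 0.582109)

# Exponentiate the estimate and the Wald limits to obtain odds ratios
z <- qnorm(0.975)
or_table <- data.frame(
  OR    = exp(est),
  lower = exp(est - z * se),
  upper = exp(est + z * se)
)
print(round(or_table, 2))
```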
## fitting null model for pseudo-r2
## McFadden's R-squared: 0.145
## Nagelkerke's R-squared: 0.195
The model fit indicators suggest a moderate explanatory power. McFadden’s R-squared is 0.145, indicating that the logistic regression model improves the prediction of clinically significant depressive symptoms compared to a null model. Nagelkerke’s R-squared is slightly higher at 0.195. Read loosely as an analogue of explained variance, this corresponds to roughly 20%, although pseudo-R-squared measures are not literal proportions of variance. While these values are not exceptionally high, they are acceptable in social science research where psychological outcomes are influenced by numerous factors.
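Both fit indices can be recovered from the deviances in the model output. The sketch assumes ungrouped binary data, for which the deviance equals minus twice the log-likelihood, and takes the sample size as the null degrees of freedom plus one.

```r
# Values taken from the logistic regression output
null_dev  <- 1004.42
resid_dev <- 858.91
n         <- 1341 + 1   # null degrees of freedom + 1

# McFadden: relative reduction in deviance
mcfadden <- 1 - resid_dev / null_dev

# Cox-Snell, then Nagelkerke's rescaling to a maximum of 1
cox_snell  <- 1 - exp((resid_dev - null_dev) / n)
nagelkerke <- cox_snell / (1 - exp(-null_dev / n))

print(round(c(McFadden = mcfadden, Nagelkerke = nagelkerke), 3))
```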
The multivariate logistic regression analysis identified poor self-reported health and economic hardship as the clearest predictors of clinically significant depressive symptoms. Respondents who found it very difficult to live on their current income had about 4.5 times the odds of depression of those living comfortably (OR = 4.50, Wald 95% CI: 1.00–20.15), although this estimate is imprecise. Each one-step worsening in self-rated health roughly tripled the odds (OR = 3.25, Wald 95% CI: 2.43–4.35).
Age appeared to be a protective factor, particularly in the 56–65 age group (OR = 0.49, p = 0.022), indicating lower odds of depression compared to adults under 30. Female gender was not associated with the odds of depression in the adjusted model (OR = 1.01, p = 0.96).
These findings are consistent with the linear regression models, where perceived loneliness and self-reported health were the most significant continuous predictors of CES-D8 scores, regardless of weighting. Together, these results reinforce the importance of social determinants—particularly health status, perceived isolation, and economic vulnerability—in shaping mental health outcomes in the Austrian sample.