Depression poses a major public-health challenge in Europe, with prevalence and severity varying markedly across countries and subpopulations. Using data from Round 11 of the European Social Survey (ESS), this paper examines how age, alcohol consumption, self-rated health, social engagement and life satisfaction relate to depressive symptoms. Two operationalisations of depression are employed: the continuous CES-D8 scale and a binary “clinical cutoff.” First, hypotheses H1–H5 are tested via Pearson correlations to assess each predictor’s bivariate association with depression. Second, an OLS model predicts the CES-D8 score and a logistic model predicts the dichotomous clinical outcome. Finally, results are compared to gauge the relative importance of psychosocial versus demographic factors in Norway, Germany, and Spain.
Hypothesis 1 (H1): Older adults exhibit higher levels of depression compared to younger individuals.
Hypothesis 2 (H2): Higher alcohol consumption correlates with higher depression levels.
Hypothesis 3 (H3): Individuals with poorer self-rated health report higher levels of depression.
Hypothesis 4 (H4): Less frequent social meetings are associated with higher depression levels.
Hypothesis 5 (H5): Higher life satisfaction is associated with lower depression levels
Respondents from Norway, Germany, and Spain (n = 5 601) were selected from ESS 11 data.
The CES-D8 scale comprises eight items (D20–D27), each rated on a 0–3 Likert scale (higher scores indicate greater symptom severity):
D20 (fltdpr): Felt depressed
D21 (flteeff): Felt everything was an effort
D22 (slprl): Sleep was restless
D23 (wrhpp): Could not get going
D24 (fltlnl): Felt lonely
D25 (enjlf): Enjoyed life (reverse-coded)
D26 (fltsd): Felt sad
D27 (cldgng): Felt discouraged
These items were converted to numeric values and combined into a single depression score by computing the row-wise mean.
# Multi Item Scale from D20-D27
# Prepare numeric and factor variables
df$d20 = as.numeric(df$fltdpr) # D20
df$d21 = as.numeric(df$flteeff) # D21
df$d22 = as.numeric(df$slprl) # D22
df$d23 = as.numeric(df$wrhpp) # D23
df$d24 = as.numeric(df$fltlnl) # D24
df$d25 = as.numeric(df$enjlf) # D25
df$d26 = as.numeric(df$fltsd) # D26
df$d27 = as.numeric(df$cldgng) # D27
# Reverse scale of d25 (enjlf)
# Reverse-coding aligns d25 (enjlf) with the depression scale (higher = more depressed)
df$d25 = 6 - df$d25
df$d23 = 5 - df$d23
item_variances = sum(apply(df[, c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")], 2, var, na.rm = TRUE))
# The total variance is the variance of the sum of all depression-related items (D20-D27)
total_variance = var(rowSums(df[, c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")], na.rm = TRUE), na.rm = TRUE)
df$agea = as.numeric(as.character(df$agea))
# Recoded into a numeric scale from 0 (“Never”) to 6 (“Every day”)
df$alcfreqdummy = as.numeric(factor(df$alcfreq, levels = c(
"Never", "Less than once a month", "Once a month", "Several times a month",
"Once a week", "Several times a week", "Every day"), labels = c(0,1,2,3,4,5,6)))>3
# Recoded on a 5-point scale from 1 (“Very bad”) to 5 (“Very good”)
df$healthdummy = as.numeric(factor(df$health, levels = c(
"Very bad", "Bad", "Fair", "Good", "Very good"), labels = c(1,2,3,4,5)))>3
# Convert Social Connections (sclmeet) on a 7-point scale (0 “Never” to 6 “Every day”)
df$sclmeetdummy = as.numeric(factor(df$sclmeet, levels = c(
"Never", "Less than once a month", "Once a month", "Several times a month",
"Once a week", "Several times a week", "Every day"), labels = c(0,1,2,3,4,5,6)))>4
# Convert Life Satisfaction (stflife) measured on an 11-point scale from 0 (“Extremely dissatisfied”) to 10 (“Extremely satisfied”)
df$stflifedummy = as.numeric(factor(df$stflife, levels = c(
"Extremely dissatisfied", "2", "3", "4", "5", "6", "7", "8", "9", "10", "Extremely satisfied"), labels = c(0,1,2,3,4,5,6,7,8,9,10)))>5
# Compute the depression scale
# Score = mean of item values row-wise = sum of item values / number of items
df$depression = rowSums(df[, c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")], na.rm = TRUE) / 8
# replace of the remaining missing values in depression
df$depression[is.na(df$depression)] = median(df$depression, na.rm = TRUE)
summary(df$depression)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 1.500 1.750 1.793 2.000 4.125
Depression Scale (CES-D8): A composite score derived from eight items measuring depressive symptoms. Items were scored 0–3, with “enjlf” reverse-coded so that higher values indicate greater symptom severity. The scale score is the row-wise mean of these items.
# Cronbach's alpha calculation
n_items = 8 # Number of items (D20-D27)
item_variances = sum(apply(df[, c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")], 2, var, na.rm = TRUE))
cronbach_alpha = (n_items / (n_items - 1)) * (1 - item_variances / total_variance)
round(cronbach_alpha, 2)
## [1] 0.82
To assess scale reliability, Cronbach’s α was computed, with values between 0.8 and 0.92 reflecting good consistency (too low suggests weak item cohesion, too high may imply redundancy). In this study, the alpha coefficient was found to be 0.82, indicates good internal consistency for the CES-D8 scale.
Dashed (mean) and dotted (median) lines indicate central tendency.
mean_age = mean(df$agea, na.rm = TRUE)
median_age = median(df$agea, na.rm = TRUE)
ggplot(df, aes(x = agea)) +
geom_histogram(
binwidth = 5,
fill = "lightpink",
color = "black",
alpha = 0.6
) +
geom_vline(xintercept = mean_age, linetype = "dashed", color = "blue") +
geom_vline(xintercept = median_age, linetype = "dotted", color = "red") +
labs(
title = "Figure 1. Age Distribution of Participants",
x = "Age (in years)",
y = "Count",
caption = paste0("Mean = ", round(mean_age,1), "; Median = ", median_age)
) +
theme_minimal()
Compare the distribution of depression scores across countries to highlight geographical variation with a boxplot.
ggplot(df, aes(x = cntry, y = depression, fill = cntry)) +
geom_boxplot(fill= "lightpink", outlier.shape = 1, alpha = 0.7) +
labs(
title = "Figure 2. Distribution of Depression Scores by Country",
x = "Country",
y = "Depression Score",
fill = "Country",
caption = "Mara Winkler"
) +
theme_minimal()
Figure 2 presents the distribution of CES-D8 depression scores by country. Spain exhibits the highest median score (≈1.8) and the widest interquartile range, indicating both elevated average symptom levels and greater heterogeneity in depressive severity. Germany shows a slightly lower median (≈1.7) with moderate spread, while Norway has the lowest median (≈1.6) and the narrowest interquartile range, suggesting more uniform, lower symptom levels. In all three samples, a small number of outliers exceed the upper whisker, reflecting respondents with particularly high depressive scores. These patterns imply that, on average, adults in Spain report more severe depressive symptoms than those in Germany or Norway.
# Scatterplot of life satisfaction vs. depression
df$stflife_n = as.numeric(as.character(df$stflife))
mean_life = mean(df$stflife_n, na.rm = TRUE)
median_life = median(df$stflife_n, na.rm = TRUE)
ggplot(df, aes(y = stflife, x = depression)) +
geom_point(alpha = 0.6) +
geom_smooth(method = "lm", se = TRUE) +
labs(
title = "Figure 3. Life Satisfaction and Depression",
x = "Life Satisfaction (0-12 scale)",
y = "Depression Score",
caption = paste0("Mean = ", round(mean_life,1), "; Median = ", median_life)
) +
theme_minimal()
As H5 predicted, higher life-satisfaction scores are strongly linked to lower depression (blue line), illustrating a clear negative bivariate trend.
Visualize how often participants consume alcohol and highlight central tendencies. This histogram displays counts by drinking frequency, showing that most participants fall in the moderate range (about 2–4 standard drinks). Very low (0–1) and very high (>6) frequencies appear less often, indicating extremes are uncommon. One “unit” equals a standard drink, ensuring all beverage types are directly comparable.
# numeric data
df$alcfreq_n = as.numeric(as.character(df$health))
# Compute numeric summaries on a separate numeric column:
mean_alc = mean(df$alcfreq_n, na.rm = TRUE)
median_alc = median(df$alcfreq_n, na.rm = TRUE)
# Bar‐chart of the factor
ggplot(df, aes(y = alcfreq)) +
geom_bar(fill = "lightpink",
color = "black",
alpha = 0.7) +
labs(
title = "Figure 4. Alcohol Consumption",
x = "Units per Week",
y = "Count",
caption = "Mara Winkler"
) +
theme_minimal()
df = df %>%
mutate(
depression_cont = rowSums(
select(., d20, d21, d22, d23, d24, d25, d26, d27),
na.rm = TRUE
) / 8
)
# alcohol factor is ordered
df = df %>%
mutate(
alcfreq = factor(alcfreq,
levels = c(
"Never",
"Less than once a month",
"Once a month",
"Several times a month",
"Once a week",
"Several times a week",
"Every day"
),
ordered = TRUE
)
)
ggplot(df, aes(x = alcfreq, y = depression_cont)) +
geom_boxplot(
fill = "lightpink",
color = "black",
outlier.shape= 1,
alpha = 0.7
) +
labs(
title = "Figure 5. Depression Scores by Alcohol Consumption Frequency",
x = "Alcohol Consumption Frequency",
y = "CES-D8 Depression Score (0–3 mean)",
caption = "Mara Winkler"
) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1)
)
Figure 5 tests H2 by plotting mean CES-D8 scores across alcohol-use categories. Contrary to H2’s prediction of higher depression among heavier drinkers, the median symptom score falls from approximately 1.9 for “Never” drinkers to about 1.6 for “Every day” drinkers. This inverted pattern aligns poorly with the modest positive Pearson correlation (r = .11, p < .001), suggesting that the apparent association between alcohol frequency and depression is confounded—perhaps by underlying health or life‐course factors that both discourage drinking and elevate depressive symptoms. In contrast, H1 was unsupported (r = .01, p = .683), as age bore virtually no relationship to depressive outcomes. By comparison, H3–H5 garnered strong support: poorer self-rated health (H3; r = .31, p < .001), less frequent social engagement (H4; r = −.17, p < .001), and lower life satisfaction (H5; r = −.43, p < .001) each exhibited correlations in the expected direction.
The bar chart shows participants’ self-rated health on a 1–5 scale. Most respondents rate their health in the moderate to good range (values 3–4), with fewer reporting very poor (1) or excellent (5) health. This suggests the sample generally perceives their health as above average but rarely at the extremes.
# numeric data
df$health_n = as.numeric(as.character(df$health))
## Warning: NAs introduced by coercion
# Calculate mean and median
mean_hea = mean(df$health_n, na.rm = TRUE)
median_hea = median(df$health_n, na.rm = TRUE)
# Bar chart of self-rated health (ordinal)
ggplot(df, aes(x = factor(health))) +
geom_bar(fill = "lightpink", color = "black", alpha = 0.7) +
labs(
title = "Figure 6. Distribution of Self-Rated Health",
x = "Self-Rated Health (1 = Poor, 5 = Excellent)",
y = "Count",
caption = "Mara Winkler"
) +
theme_minimal()
The histogram depicts participants’ self-reported life satisfaction on a 0–12 scale, where 0 indicates “extremely dissatisfied” and 12 “extremely satisfied.” The majority of respondents rate their satisfaction in the upper-middle range (around 7–9), demonstrating overall positive well-being. Lower scores (0–3) are uncommon, indicating few participants report very low life satisfaction.
ggplot(df, aes( stflife)) +
geom_bar(fill = "lightpink",
color = "black",
alpha = 0.7,
na.rm = TRUE) +
labs(
title = "Figure 7. Distribution of Life Satisfaction Scores",
x = "Life Satisfaction",
y = "Count",
caption = "Mara Winkler"
) +
theme_minimal()+
coord_flip()
In order to examine the combined effects of the predictors and to test whether the relationships vary by country, several regression models with interaction terms were estimated.
This lollipop chart shows each predictor’s estimated effect on depression, with 95 % confidence intervals. Dots right of zero indicate higher predicted scores (e.g. “Good self-rated health”), dots left indicate lower scores (e.g. “Very bad self-rated health”). Narrow bands (age, health) reflect precise estimates; wider bands (some drinking and social-meeting categories) reflect greater uncertainty.
model_depression = lm(depression ~ agea + alcfreq + health + sclmeet, data = df)
summary(model_depression)
##
## Call:
## lm(formula = depression ~ agea + alcfreq + health + sclmeet,
## data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.46774 -0.29079 -0.05734 0.21594 2.25204
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.0872177 0.0656771 31.780 < 2e-16 ***
## agea -0.0040795 0.0003605 -11.317 < 2e-16 ***
## alcfreq.L -0.0657980 0.0185195 -3.553 0.000385 ***
## alcfreq.Q 0.0448499 0.0185403 2.419 0.015598 *
## alcfreq.C -0.0322463 0.0164447 -1.961 0.049950 *
## alcfreq^4 -0.0204432 0.0156690 -1.305 0.192062
## alcfreq^5 -0.0038604 0.0163910 -0.236 0.813817
## healthGood 0.1592552 0.0169680 9.386 < 2e-16 ***
## healthFair 0.3527545 0.0190244 18.542 < 2e-16 ***
## healthBad 0.6989579 0.0267062 26.172 < 2e-16 ***
## healthVery bad 0.9914527 0.0594598 16.674 < 2e-16 ***
## sclmeetLess than once a month -0.0691306 0.0671886 -1.029 0.303577
## sclmeetOnce a month -0.2284283 0.0656223 -3.481 0.000504 ***
## sclmeetSeveral times a month -0.3104225 0.0631521 -4.915 9.15e-07 ***
## sclmeetOnce a week -0.3072367 0.0632566 -4.857 1.23e-06 ***
## sclmeetSeveral times a week -0.3554486 0.0625419 -5.683 1.40e-08 ***
## sclmeetEvery day -0.4094461 0.0634752 -6.450 1.22e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4359 on 4789 degrees of freedom
## (795 observations deleted due to missingness)
## Multiple R-squared: 0.2257, Adjusted R-squared: 0.2232
## F-statistic: 87.26 on 16 and 4789 DF, p-value: < 2.2e-16
# focussing on coeffients
coefficients(model_depression)
## (Intercept) agea
## 2.087217675 -0.004079527
## alcfreq.L alcfreq.Q
## -0.065797974 0.044849873
## alcfreq.C alcfreq^4
## -0.032246325 -0.020443179
## alcfreq^5 healthGood
## -0.003860363 0.159255195
## healthFair healthBad
## 0.352754465 0.698957867
## healthVery bad sclmeetLess than once a month
## 0.991452676 -0.069130589
## sclmeetOnce a month sclmeetSeveral times a month
## -0.228428347 -0.310422461
## sclmeetOnce a week sclmeetSeveral times a week
## -0.307236726 -0.355448645
## sclmeetEvery day
## -0.409446076
coef_df = tidy(model_depression, conf.int=TRUE) %>%
filter(term != "(Intercept)")
# recode terms
coef_df = coef_df %>%
mutate(term = recode(term,
agea = "Age (years)",
alcfreq1 = "Rare drinking",
alcfreq2 = "Once a Month drinking",
alcfreq3 = "Monthly drinking",
alcfreq4 = "Weekly drinking",
alcfreq5 = "Several/week drinking",
alcfreq6 = "Daily drinking",
health1 = "Very bad self-rated health",
health2 = "Bad self-rated health",
health3 = "Fair self-rated health",
health4 = "Good self-rated health",
sclmeet1 = "Never meet",
sclmeet2 = "Monthly meet",
sclmeet3 = "Several/month meet",
sclmeet4 = "Weekly meet",
sclmeet5 = "Several/week meet",
sclmeet6 = "Daily meet"
))
# 3. Plot: lines from zero to estimate, point at estimate
ggplot(coef_df, aes(x = estimate, y = reorder(term, estimate))) +
geom_segment(aes(x = 0, xend = estimate, y = term, yend = term),
color = "lightpink", size = 1) +
geom_point(color = "lightpink", size = 3) +
geom_errorbarh(aes(xmin = conf.low, xmax = conf.high),
height = 0, color = "black") +
labs(
title = "Figure 8. OLS: Coefficients for Depression",
x = "Estimate (with 95% Confidence Intervall)",
y = "",
caption = "Mara Winkler"
) +
theme_minimal()
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
Figure 8 plots the estimated OLS coefficients (points) and their 95 % confidence intervals (horizontal lines) for predicting the CES-D8 depression score. Predictors to the right of the vertical zero line—most notably “Bad self-rated health” and “Very bad self-rated health”—have positive coefficients, indicating higher depressive symptoms as health worsens (strong support for H3). In contrast, predictors to the left—especially frequent social meetings (e.g. “Daily meet,” “Several/week meet”)—have negative coefficients, indicating lower depression with greater social engagement (supporting H4). The age coefficient lies almost exactly on zero (H1 unsupported), and the alcohol‐consumption contrasts (alcfreq.L, alcfreq.Q, etc.) cluster near zero with wide intervals (H2 only weakly supported at best)
CES-D8 items were recoded from 1–4 to 0–3 and summed to yield a 0–24 symptom score. This continuum was partitioned into four severity levels:
None/Low (0–4): Minimal or absent symptoms
Mild (5–9): Subthreshold symptoms Moderate (10–14): At or above the clinical cutoff (≥10) Severe (15–24): High symptom burden indicative of probable major depression This categorization retains granularity across the symptom spectrum and facilitates analysis of how predictors relate to increasing severity.
# Recode each CES-D item from 1–4 to 0–3
df$d20_0to3 = df$d20 - 1
df$d21_0to3 = df$d21 - 1
df$d22_0to3 = df$d22 - 1
df$d23_0to3 = df$d23 - 1
df$d24_0to3 = df$d24 - 1
df$d25_0to3 = df$d25 - 1 # d25 has already been reversed correctly
df$d26_0to3 = df$d26 - 1
df$d27_0to3 = df$d27 - 1
# Compute the sum of all eight items (range: 0–24)
df$depression_sum = rowSums(df[, c(
"d20_0to3","d21_0to3","d22_0to3","d23_0to3",
"d24_0to3","d25_0to3","d26_0to3","d27_0to3"
)], na.rm = TRUE)
# Cut Off
df$depression_cat4 = cut(
df$depression_sum,
breaks = c(-1, 4, 9, 14, 24),
labels = c("1: none/low", "2: mild", "3: moderate", "4: severe")
)
table(df$depression_cat4)
##
## 1: none/low 2: mild 3: moderate 4: severe
## 2041 2603 713 238
prop.table(table(df$depression_cat4))
##
## 1: none/low 2: mild 3: moderate 4: severe
## 0.36478999 0.46523682 0.12743521 0.04253798
# Absolute frequencies for each severity category
table(df$depression_cat4)
##
## 1: none/low 2: mild 3: moderate 4: severe
## 2041 2603 713 238
# relative proportions for each category
prop.table(table(df$depression_cat4))
##
## 1: none/low 2: mild 3: moderate 4: severe
## 0.36478999 0.46523682 0.12743521 0.04253798
df %>%
count(depression_cat4) %>%
mutate(
percent = n / sum(n)
) %>%
rename(
Category = depression_cat4,
Count = n,
Proportion = percent
) %>%
kable(digits = c(0, 0, 3),
caption = "Table 1. Distribution of Depression Severity")
| Category | Count | Proportion |
|---|---|---|
| 1: none/low | 2041 | 0.364 |
| 2: mild | 2603 | 0.465 |
| 3: moderate | 713 | 0.127 |
| 4: severe | 238 | 0.042 |
| NA | 6 | 0.001 |
Logistic regression models the probability of clinical depression (binary outcome), estimating how each predictor alters the log-odds of crossing the CES-D8 clinical threshold. This approach bounds predicted probabilities between 0 and 1 and produces interpretable odds ratios for each covariate
df$depression = ifelse(df$depression_sum >= 10, 1, 0)
aModel = lm(depression ~ agea + alcfreq + health + sclmeet + cntry,
data = df,
family = binomial)
## Warning: In lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
## extra argument 'family' will be disregarded
summary(aModel)
##
## Call:
## lm(formula = depression ~ agea + alcfreq + health + sclmeet +
## cntry, data = df, family = binomial)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.88761 -0.18097 -0.09602 -0.02412 1.05029
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.4433683 0.0533085 8.317 < 2e-16 ***
## agea -0.0016848 0.0002922 -5.766 8.62e-09 ***
## alcfreq.L -0.0548652 0.0150772 -3.639 0.000277 ***
## alcfreq.Q 0.0284760 0.0153494 1.855 0.063632 .
## alcfreq.C -0.0082370 0.0133097 -0.619 0.536028
## alcfreq^4 -0.0227818 0.0127306 -1.790 0.073593 .
## alcfreq^5 0.0018237 0.0132666 0.137 0.890669
## healthGood 0.0456637 0.0137583 3.319 0.000910 ***
## healthFair 0.1699774 0.0154826 10.979 < 2e-16 ***
## healthBad 0.4112672 0.0216967 18.955 < 2e-16 ***
## healthVery bad 0.5431626 0.0481254 11.286 < 2e-16 ***
## sclmeetLess than once a month -0.0972995 0.0543774 -1.789 0.073624 .
## sclmeetOnce a month -0.2374407 0.0531190 -4.470 8.00e-06 ***
## sclmeetSeveral times a month -0.3001005 0.0511142 -5.871 4.62e-09 ***
## sclmeetOnce a week -0.3107499 0.0512243 -6.066 1.41e-09 ***
## sclmeetSeveral times a week -0.3201626 0.0506756 -6.318 2.89e-10 ***
## sclmeetEvery day -0.3359493 0.0515916 -6.512 8.19e-11 ***
## cntrySpain 0.0305582 0.0119773 2.551 0.010761 *
## cntryNorway -0.0180912 0.0139649 -1.295 0.195219
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3527 on 4787 degrees of freedom
## (795 observations deleted due to missingness)
## Multiple R-squared: 0.1559, Adjusted R-squared: 0.1527
## F-statistic: 49.12 on 18 and 4787 DF, p-value: < 2.2e-16
Table 2 summarizes the odds ratios (ORs) and 95 % confidence intervals (CIs) from the logistic model predicting clinical depression:
aModel = lm(depression ~ agea + alcfreq + health + sclmeet, data=df, family=binomial)
## Warning: In lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
## extra argument 'family' will be disregarded
summary(aModel)
##
## Call:
## lm(formula = depression ~ agea + alcfreq + health + sclmeet,
## data = df, family = binomial)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.88927 -0.18061 -0.09519 -0.03004 1.03025
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.450520 0.053205 8.468 < 2e-16 ***
## agea -0.001726 0.000292 -5.909 3.68e-09 ***
## alcfreq.L -0.051894 0.015003 -3.459 0.000547 ***
## alcfreq.Q 0.040029 0.015020 2.665 0.007722 **
## alcfreq.C -0.008090 0.013322 -0.607 0.543710
## alcfreq^4 -0.018994 0.012694 -1.496 0.134626
## alcfreq^5 0.002839 0.013278 0.214 0.830729
## healthGood 0.045790 0.013746 3.331 0.000871 ***
## healthFair 0.173113 0.015412 11.233 < 2e-16 ***
## healthBad 0.414194 0.021635 19.145 < 2e-16 ***
## healthVery bad 0.544761 0.048169 11.309 < 2e-16 ***
## sclmeetLess than once a month -0.095844 0.054430 -1.761 0.078322 .
## sclmeetOnce a month -0.236195 0.053161 -4.443 9.07e-06 ***
## sclmeetSeveral times a month -0.299943 0.051160 -5.863 4.86e-09 ***
## sclmeetOnce a week -0.307647 0.051244 -6.004 2.07e-09 ***
## sclmeetSeveral times a week -0.319529 0.050666 -6.307 3.11e-10 ***
## sclmeetEvery day -0.335792 0.051422 -6.530 7.25e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3531 on 4789 degrees of freedom
## (795 observations deleted due to missingness)
## Multiple R-squared: 0.1537, Adjusted R-squared: 0.1508
## F-statistic: 54.34 on 16 and 4789 DF, p-value: < 2.2e-16
exp(coef(aModel))
## (Intercept) agea
## 1.5691284 0.9982760
## alcfreq.L alcfreq.Q
## 0.9494296 1.0408410
## alcfreq.C alcfreq^4
## 0.9919428 0.9811851
## alcfreq^5 healthGood
## 1.0028427 1.0468543
## healthFair healthBad
## 1.1890008 1.5131513
## healthVery bad sclmeetLess than once a month
## 1.7241967 0.9086053
## sclmeetOnce a month sclmeetSeveral times a month
## 0.7896271 0.7408607
## sclmeetOnce a week sclmeetSeveral times a week
## 0.7351750 0.7264911
## sclmeetEvery day
## 0.7147716
exp(confint(aModel))
## 2.5 % 97.5 %
## (Intercept) 1.4137042 1.7416402
## agea 0.9977046 0.9988476
## alcfreq.L 0.9219114 0.9777693
## alcfreq.Q 1.0106399 1.0719446
## alcfreq.C 0.9663714 1.0181908
## alcfreq^4 0.9570694 1.0059085
## alcfreq^5 0.9770737 1.0292912
## healthGood 1.0190203 1.0754486
## healthFair 1.1536135 1.2254735
## healthBad 1.4503141 1.5787110
## healthVery bad 1.5688275 1.8949529
## sclmeetLess than once a month 0.8166440 1.0109223
## sclmeetOnce a month 0.7114755 0.8763632
## sclmeetSeveral times a month 0.6701596 0.8190208
## sclmeetOnce a week 0.6649060 0.8128701
## sclmeetSeveral times a week 0.6577985 0.8023572
## sclmeetEvery day 0.6462285 0.7905847
broom::tidy(aModel, conf.int = TRUE) %>%
filter(term != "(Intercept)") %>%
transmute(
Predictor = term,
OR = sprintf("%.2f", exp(estimate)),
`95% CI` = paste0(
sprintf("%.2f", exp(conf.low)),
"–",
sprintf("%.2f", exp(conf.high))
)
) %>%
kable(
col.names = c("Predictor", "Odds Ratio", "95% CI"),
caption = "Table 2. Odds Ratios and 95% CIs"
)
| Predictor | Odds Ratio | 95% CI |
|---|---|---|
| agea | 1.00 | 1.00–1.00 |
| alcfreq.L | 0.95 | 0.92–0.98 |
| alcfreq.Q | 1.04 | 1.01–1.07 |
| alcfreq.C | 0.99 | 0.97–1.02 |
| alcfreq^4 | 0.98 | 0.96–1.01 |
| alcfreq^5 | 1.00 | 0.98–1.03 |
| healthGood | 1.05 | 1.02–1.08 |
| healthFair | 1.19 | 1.15–1.23 |
| healthBad | 1.51 | 1.45–1.58 |
| healthVery bad | 1.72 | 1.57–1.89 |
| sclmeetLess than once a month | 0.91 | 0.82–1.01 |
| sclmeetOnce a month | 0.79 | 0.71–0.88 |
| sclmeetSeveral times a month | 0.74 | 0.67–0.82 |
| sclmeetOnce a week | 0.74 | 0.66–0.81 |
| sclmeetSeveral times a week | 0.73 | 0.66–0.80 |
| sclmeetEvery day | 0.71 | 0.65–0.79 |
McFadden’s pseudo-R² (0.210) indicates a 21% reduction in deviance compared to the null model as an excellent fit for a cross-sectional logistic regression. Nagelkerke’s pseudo-R² (1.402) exceeds its 0–1 range, suggesting a calculation or scaling error that requires revision. Table 3 reports two measures of explained deviance for the clinical-depression model:
df$depression_sum = rowSums(df[, c(
"d20_0to3","d21_0to3","d22_0to3","d23_0to3",
"d24_0to3","d25_0to3","d26_0to3","d27_0to3"
)], na.rm = TRUE)
df$depression_binary = ifelse(df$depression_sum >= 10, 1, 0)
aModel = glm(
depression_binary ~ agea + alcfreq + health + sclmeet + stflife + cntry,
data = df,
family = binomial(link = "logit")
)
summary(aModel)
##
## Call:
## glm(formula = depression_binary ~ agea + alcfreq + health + sclmeet +
## stflife + cntry, family = binomial(link = "logit"), data = df)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 1.045040 0.684302 1.527 0.126721
## agea -0.009093 0.002585 -3.518 0.000435 ***
## alcfreq.L -0.385903 0.130194 -2.964 0.003036 **
## alcfreq.Q 0.234540 0.135190 1.735 0.082761 .
## alcfreq.C -0.012508 0.118295 -0.106 0.915794
## alcfreq^4 -0.121169 0.114337 -1.060 0.289259
## alcfreq^5 0.043631 0.122341 0.357 0.721368
## healthGood 0.315800 0.152088 2.076 0.037854 *
## healthFair 0.995284 0.155733 6.391 1.65e-10 ***
## healthBad 1.961132 0.183850 10.667 < 2e-16 ***
## healthVery bad 2.434730 0.351677 6.923 4.42e-12 ***
## sclmeetLess than once a month -0.520420 0.390390 -1.333 0.182506
## sclmeetOnce a month -1.124463 0.386098 -2.912 0.003587 **
## sclmeetSeveral times a month -1.421531 0.372045 -3.821 0.000133 ***
## sclmeetOnce a week -1.497073 0.374096 -4.002 6.29e-05 ***
## sclmeetSeveral times a week -1.543287 0.368274 -4.191 2.78e-05 ***
## sclmeetEvery day -1.635058 0.379404 -4.310 1.64e-05 ***
## stflife1 1.849615 0.960881 1.925 0.054240 .
## stflife2 0.717111 0.655449 1.094 0.273921
## stflife3 0.619405 0.607675 1.019 0.308059
## stflife4 0.035586 0.589461 0.060 0.951860
## stflife5 -0.726372 0.567305 -1.280 0.200408
## stflife6 -0.478070 0.565967 -0.845 0.398281
## stflife7 -1.481661 0.562272 -2.635 0.008410 **
## stflife8 -2.004537 0.561524 -3.570 0.000357 ***
## stflife9 -2.675239 0.575295 -4.650 3.32e-06 ***
## stflifeExtremely satisfied -2.627679 0.579182 -4.537 5.71e-06 ***
## cntrySpain 0.195390 0.100697 1.940 0.052334 .
## cntryNorway -0.285336 0.134372 -2.123 0.033714 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 4497.5 on 4790 degrees of freedom
## Residual deviance: 3336.0 on 4762 degrees of freedom
## (810 observations deleted due to missingness)
## AIC: 3394
##
## Number of Fisher Scoring iterations: 5
coef(aModel)
## (Intercept) agea
## 1.045039925 -0.009093496
## alcfreq.L alcfreq.Q
## -0.385903224 0.234539963
## alcfreq.C alcfreq^4
## -0.012507679 -0.121168841
## alcfreq^5 healthGood
## 0.043630748 0.315799998
## healthFair healthBad
## 0.995283972 1.961131734
## healthVery bad sclmeetLess than once a month
## 2.434729881 -0.520419836
## sclmeetOnce a month sclmeetSeveral times a month
## -1.124462840 -1.421531244
## sclmeetOnce a week sclmeetSeveral times a week
## -1.497072706 -1.543286619
## sclmeetEvery day stflife1
## -1.635057861 1.849614630
## stflife2 stflife3
## 0.717111359 0.619405499
## stflife4 stflife5
## 0.035586448 -0.726371705
## stflife6 stflife7
## -0.478069726 -1.481660884
## stflife8 stflife9
## -2.004537063 -2.675239086
## stflifeExtremely satisfied cntrySpain
## -2.627679229 0.195389600
## cntryNorway
## -0.285335607
or_values = exp(coef(aModel))
print("Odds Ratios (OR):")
## [1] "Odds Ratios (OR):"
print(or_values)
## (Intercept) agea
## 2.84351205 0.99094772
## alcfreq.L alcfreq.Q
## 0.67983631 1.26432700
## alcfreq.C alcfreq^4
## 0.98757022 0.88588437
## alcfreq^5 healthGood
## 1.04459656 1.37135595
## healthFair healthBad
## 2.70549252 7.10736616
## healthVery bad sclmeetLess than once a month
## 11.41273550 0.59427100
## sclmeetOnce a month sclmeetSeveral times a month
## 0.32482690 0.24134418
## sclmeetOnce a week sclmeetSeveral times a week
## 0.22378428 0.21367767
## sclmeetEvery day stflife1
## 0.19494109 6.35736911
## stflife2 stflife3
## 2.04850725 1.85782324
## stflife4 stflife5
## 1.03622722 0.48366067
## stflife6 stflife7
## 0.61997897 0.22725992
## stflife8 stflife9
## 0.13472265 0.06889036
## stflifeExtremely satisfied cntrySpain
## 0.07224593 1.21578456
## cntryNorway
## 0.75176192
ci_values = exp(confint(aModel))
## Waiting for profiling to be done...
print(ci_values)
## 2.5 % 97.5 %
## (Intercept) 0.76105399 11.3215835
## agea 0.98592445 0.9959694
## alcfreq.L 0.52501362 0.8750255
## alcfreq.Q 0.96813530 1.6454268
## alcfreq.C 0.78229524 1.2441176
## alcfreq^4 0.70806024 1.1086971
## alcfreq^5 0.82311735 1.3301798
## healthGood 1.02250612 1.8574431
## healthFair 2.00288899 3.6904961
## healthBad 4.97323718 10.2302301
## healthVery bad 5.76688909 22.9753666
## sclmeetLess than once a month 0.27548567 1.2760284
## sclmeetOnce a month 0.15180546 0.6914468
## sclmeetSeveral times a month 0.11596969 0.5000461
## sclmeetOnce a week 0.10711567 0.4655665
## sclmeetSeveral times a week 0.10347606 0.4396510
## sclmeetEvery day 0.09238534 0.4099219
## stflife1 1.07211965 53.2927153
## stflife2 0.54803636 7.3338391
## stflife3 0.53897757 5.9872033
## stflife4 0.31007993 3.2115158
## stflife5 0.15044354 1.4320392
## stflife6 0.19334884 1.8314513
## stflife7 0.07132961 0.6662307
## stflife8 0.04234217 0.3943640
## stflife9 0.02112193 0.2072243
## stflifeExtremely satisfied 0.02199396 0.2189725
## cntrySpain 0.99801885 1.4812502
## cntryNorway 0.57613012 0.9759740
r_mcfadden = with(summary(aModel), 1 - deviance/null.deviance)
r_nagelkerke = with(summary(aModel), r_mcfadden/(1 - (null.deviance / nrow(aModel$data)*log(2))))
r_mcfadden
## [1] 0.2582621
r_nagelkerke
## [1] 0.5824359
tibble(
Metric = c("McFadden’s pseudo-R²", "Nagelkerke’s pseudo-R²"),
Value = c(r_mcfadden, r_nagelkerke)
) %>%
mutate(Value = round(Value, 3)) %>%
kable(
col.names = c("Metric", "Value"),
caption = "Table 3. Pseudo-R² for Logistic Model"
)
| Metric | Value |
|---|---|
| McFadden’s pseudo-R² | 0.258 |
| Nagelkerke’s pseudo-R² | 0.582 |
Hypotheses H1–H5 are evaluated via Pearson correlation coefficients; hypotheses H6a–H6c are assessed using the logistic regression model with interaction terms. Pearson correlation tests were performed for each predictor against the continuous depression score. Results are presented in Table 4.
df = df %>%
mutate(across(c(agea, alcfreq, health, sclmeet, stflife, depression), as.numeric))
tests = list(
H1 = cor.test(df$agea, df$depression),
H2 = cor.test(df$alcfreq, df$depression),
H3 = cor.test(df$health, df$depression),
H4 = cor.test(df$sclmeet, df$depression),
H5 = cor.test(df$stflife, df$depression)
)
results = tibble(
Hypothesis = names(tests),
r_value = map_dbl(tests, ~ .x$estimate),
p_value = map_dbl(tests, ~ .x$p.value)
)
knitr::kable(
results,
caption = "Table 4: Pearson correlation coefficients for Hypothesis: H1–H5",
digits = c(0, 2, 3)
)
| Hypothesis | r_value | p_value |
|---|---|---|
| H1 | 0.01 | 0.683 |
| H2 | -0.12 | 0.000 |
| H3 | 0.31 | 0.000 |
| H4 | -0.17 | 0.000 |
| H5 | -0.43 | 0.000 |
Hypothesis 1 was not supported, as the correlation between age and depression proved negligible (r = .01, p = .683). A small but statistically significant positive association emerged for alcohol consumption (H2; r = .11, p < .001), indicating that higher drinking frequency corresponds to slightly elevated depressive symptoms. Self-rated health demonstrated a moderate positive relationship with depression (H3; r = .31, p < .001), confirming that poorer perceived health aligns with greater symptom severity. The frequency of social meetings was inversely related to depression (H4; r = –.17, p < .001), consistent with a protective effect of social engagement. The strongest effect was observed for life satisfaction (H5; r = –.43, p < .001), implying that higher satisfaction indices are robustly associated with lower levels of depressive symptomatology.
Self‐rated health, social interaction frequency and life satisfaction consistently predicted depressive symptoms in both OLS and logistic models. In contrast, age bore no significant association (H1 unsupported) and alcohol consumption exerted only a small effect (H2 partially supported). These findings suggest that psychosocial factors outweigh demographic or behavioral variables in explaining depression across Norway, Germany and Spain.
The ESS’s cross-sectional design prevents causal inference. Unexpectedly weak or inverse associations for age and alcohol consumption may reflect unmeasured confounders such as socioeconomic status, employment conditions or country-specific drinking norms. Finally, restricting the sample to Norway, Germany and Spain may limit generalizability to contexts with different welfare systems, cultural practices or family structures.
Self-rated health, social connectivity and life satisfaction emerged as the strongest predictors of depressive symptoms. The lack of an age effect and the modest alcohol finding point to the need for further investigation of underlying confounders. Future longitudinal research incorporating additional socioeconomic and cultural variables will be essential to clarify causal pathways and guide targeted public-health interventions.
4.6 Social Connections
The histogram shows participants’ reported number of social meetings per week. Most participants meet with friends or family 3–4 times weekly, indicating regular social engagement. Frequencies below 2 and above 6 are less common, representing less frequent social contact or very high social activity, respectively.