Seminar Paper: Depression among Adults in Europe (Norway, Germany and Spain)

1. Introduction

Depression poses a major public-health challenge in Europe, with prevalence and severity varying markedly across countries and subpopulations. Using data from Round 11 of the European Social Survey (ESS), this paper examines how age, alcohol consumption, self-rated health, social engagement and life satisfaction relate to depressive symptoms. Two operationalisations of depression are employed: the continuous CES-D8 scale and a binary “clinical cutoff.” First, hypotheses H1–H5 are tested via Pearson correlations to assess each predictor’s bivariate association with depression. Second, an OLS model predicts the CES-D8 score and a logistic model predicts the dichotomous clinical outcome. Finally, results are compared to gauge the relative importance of psychosocial versus demographic factors in Norway, Germany, and Spain.

2. Hypotheses H1 - H5

Hypothesis 1 (H1): Older adults exhibit higher levels of depression compared to younger individuals.

Hypothesis 2 (H2): Higher alcohol consumption correlates with higher depression levels.

Hypothesis 3 (H3): Individuals with poorer self-rated health report higher levels of depression.

Hypothesis 4 (H4): Less frequent social meetings are associated with higher depression levels.

Hypothesis 5 (H5): Higher life satisfaction is associated with lower depression levels

3. Data and Methods

3.1 Data Source and Sample

Respondents from Norway, Germany, and Spain (n = 5 601) were selected from ESS 11 data.

3.2 Variable Preparation

The CES-D8 scale comprises eight items (D20–D27), each rated on a 0–3 Likert scale (higher scores indicate greater symptom severity):

D20 (fltdpr): Felt depressed

D21 (flteeff): Felt everything was an effort

D22 (slprl): Sleep was restless

D23 (wrhpp): Could not get going

D24 (fltlnl): Felt lonely

D25 (enjlf): Enjoyed life (reverse-coded)

D26 (fltsd): Felt sad

D27 (cldgng): Felt discouraged

These items were converted to numeric values and combined into a single depression score by computing the row-wise mean.

# Multi Item Scale from D20-D27

# Prepare numeric and factor variables 
df$d20 = as.numeric(df$fltdpr)    # D20
df$d21 = as.numeric(df$flteeff)   # D21
df$d22 = as.numeric(df$slprl)     # D22
df$d23 = as.numeric(df$wrhpp)     # D23
df$d24 = as.numeric(df$fltlnl)    # D24
df$d25 = as.numeric(df$enjlf)     # D25
df$d26 = as.numeric(df$fltsd)     # D26
df$d27 = as.numeric(df$cldgng)    # D27

# Reverse scale of d25 (enjlf)
# Reverse-coding aligns d25 (enjlf) with the depression scale (higher = more depressed)
df$d25 = 6 - df$d25
df$d23 = 5 - df$d23

item_variances = sum(apply(df[, c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")], 2, var, na.rm = TRUE))
# The total variance is the variance of the sum of all depression-related items (D20-D27)
total_variance = var(rowSums(df[, c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")], na.rm = TRUE), na.rm = TRUE)

df$agea = as.numeric(as.character(df$agea))

# Recoded into a numeric scale from 0 (“Never”)  to 6 (“Every day”) 
df$alcfreqdummy = as.numeric(factor(df$alcfreq, levels = c(
  "Never", "Less than once a month", "Once a month", "Several times a month", 
  "Once a week", "Several times a week", "Every day"), labels = c(0,1,2,3,4,5,6)))>3

# Recoded on a 5-point scale from 1 (“Very bad”) to 5 (“Very good”) 
df$healthdummy = as.numeric(factor(df$health, levels = c(
  "Very bad", "Bad", "Fair", "Good", "Very good"), labels = c(1,2,3,4,5)))>3

# Convert Social Connections (sclmeet) on a 7-point scale (0 “Never” to 6 “Every day”) 
df$sclmeetdummy = as.numeric(factor(df$sclmeet, levels = c(
  "Never", "Less than once a month", "Once a month", "Several times a month",
  "Once a week", "Several times a week", "Every day"), labels = c(0,1,2,3,4,5,6)))>4

# Convert Life Satisfaction (stflife) measured on an 11-point scale from 0 (“Extremely dissatisfied”) to 10 (“Extremely satisfied”) 
df$stflifedummy = as.numeric(factor(df$stflife, levels = c(
  "Extremely dissatisfied", "2", "3", "4", "5", "6", "7", "8", "9", "10", "Extremely satisfied"), labels = c(0,1,2,3,4,5,6,7,8,9,10)))>5

# Compute the depression scale
# Score = mean of item values row-wise = sum of item values / number of items 
df$depression = rowSums(df[, c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")], na.rm = TRUE) / 8

# replace of the remaining missing values in depression
df$depression[is.na(df$depression)] = median(df$depression, na.rm = TRUE)

summary(df$depression)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   1.500   1.750   1.793   2.000   4.125

3.3 Scale Reliability

Depression Scale (CES-D8): A composite score derived from eight items measuring depressive symptoms. Items were scored 0–3, with “enjlf” reverse-coded so that higher values indicate greater symptom severity. The scale score is the row-wise mean of these items.

# Cronbach's alpha calculation
n_items = 8  # Number of items (D20-D27)
item_variances = sum(apply(df[, c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")], 2, var, na.rm = TRUE))
cronbach_alpha = (n_items / (n_items - 1)) * (1 - item_variances / total_variance)
round(cronbach_alpha, 2)

## [1] 0.82

To assess scale reliability, Cronbach’s α was computed, with values between 0.8 and 0.92 reflecting good consistency (too low suggests weak item cohesion, too high may imply redundancy). In this study, the alpha coefficient was found to be 0.82, indicates good internal consistency for the CES-D8 scale.

4. Descriptive Statistics

4.1 Age Distribution

Dashed (mean) and dotted (median) lines indicate central tendency.

mean_age   = mean(df$agea, na.rm = TRUE)
median_age = median(df$agea, na.rm = TRUE)


ggplot(df, aes(x = agea)) +
  geom_histogram(
    binwidth = 5,
    fill     = "lightpink",
    color    = "black",
    alpha    = 0.6
  ) +
  geom_vline(xintercept = mean_age,   linetype = "dashed", color = "blue") +
  geom_vline(xintercept = median_age, linetype = "dotted", color = "red") +
  labs(
    title   = "Figure 1. Age Distribution of Participants",
    x       = "Age (in years)",
    y       = "Count",
   caption = paste0("Mean = ", round(mean_age,1), "; Median = ", median_age)
  ) +
  theme_minimal()

4.2 Distribution of Depressions Scores by Country

Compare the distribution of depression scores across countries to highlight geographical variation with a boxplot.

ggplot(df, aes(x = cntry, y = depression, fill = cntry)) +
  geom_boxplot(fill= "lightpink", outlier.shape = 1, alpha = 0.7) +
  labs(
    title = "Figure 2. Distribution of Depression Scores by Country",
    x = "Country",
    y = "Depression Score",
    fill = "Country",
    caption = "Mara Winkler"
  ) +
  theme_minimal()

Figure 2 presents the distribution of CES-D8 depression scores by country. Spain exhibits the highest median score (≈1.8) and the widest interquartile range, indicating both elevated average symptom levels and greater heterogeneity in depressive severity. Germany shows a slightly lower median (≈1.7) with moderate spread, while Norway has the lowest median (≈1.6) and the narrowest interquartile range, suggesting more uniform, lower symptom levels. In all three samples, a small number of outliers exceed the upper whisker, reflecting respondents with particularly high depressive scores. These patterns imply that, on average, adults in Spain report more severe depressive symptoms than those in Germany or Norway.

4.3 Relationship between Life Satisfication and Depression

# Scatterplot of life satisfaction vs. depression
df$stflife_n = as.numeric(as.character(df$stflife))
mean_life = mean(df$stflife_n, na.rm = TRUE)
median_life = median(df$stflife_n, na.rm = TRUE)

ggplot(df, aes(y = stflife, x = depression)) +
  geom_point(alpha = 0.6) +
  geom_smooth(method = "lm", se = TRUE) +
  labs(
    title = "Figure 3. Life Satisfaction and Depression",
    x = "Life Satisfaction (0-12 scale)",
    y = "Depression Score",
    caption = paste0("Mean = ", round(mean_life,1), "; Median = ", median_life)
  ) +
  theme_minimal()

As H5 predicted, higher life-satisfaction scores are strongly linked to lower depression (blue line), illustrating a clear negative bivariate trend.

4.4 Self Reported Alcohol Consumption

Visualize how often participants consume alcohol and highlight central tendencies. This histogram displays counts by drinking frequency, showing that most participants fall in the moderate range (about 2–4 standard drinks). Very low (0–1) and very high (>6) frequencies appear less often, indicating extremes are uncommon. One “unit” equals a standard drink, ensuring all beverage types are directly comparable.

# numeric data
df$alcfreq_n = as.numeric(as.character(df$health))

# Compute numeric summaries on a separate numeric column:
mean_alc   = mean(df$alcfreq_n, na.rm = TRUE)
median_alc = median(df$alcfreq_n, na.rm = TRUE)

# Bar‐chart of the factor
ggplot(df, aes(y = alcfreq)) +
  geom_bar(fill  = "lightpink",
           color = "black",
           alpha = 0.7) +
  labs(
    title   = "Figure 4. Alcohol Consumption",
    x       = "Units per Week",
    y       = "Count",
      caption = "Mara Winkler"
  ) +
  theme_minimal()

df = df %>%
  mutate(
    depression_cont = rowSums(
      select(., d20, d21, d22, d23, d24, d25, d26, d27),
      na.rm = TRUE
    ) / 8
  )

# alcohol factor is ordered
df = df %>%
  mutate(
    alcfreq = factor(alcfreq,
      levels = c(
        "Never",
        "Less than once a month",
        "Once a month",
        "Several times a month",
        "Once a week",
        "Several times a week",
        "Every day"
      ),
      ordered = TRUE
    )
  )



ggplot(df, aes(x = alcfreq, y = depression_cont)) +
  geom_boxplot(
    fill         = "lightpink",
    color        = "black",
    outlier.shape= 1,
    alpha        = 0.7
  ) +
  labs(
    title   = "Figure 5. Depression Scores by Alcohol Consumption Frequency",
    x       = "Alcohol Consumption Frequency",
    y       = "CES-D8 Depression Score (0–3 mean)",
    caption = "Mara Winkler"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1)
  )

Figure 5 tests H2 by plotting mean CES-D8 scores across alcohol-use categories. Contrary to H2’s prediction of higher depression among heavier drinkers, the median symptom score falls from approximately 1.9 for “Never” drinkers to about 1.6 for “Every day” drinkers. This inverted pattern aligns poorly with the modest positive Pearson correlation (r = .11, p < .001), suggesting that the apparent association between alcohol frequency and depression is confounded—perhaps by underlying health or life‐course factors that both discourage drinking and elevate depressive symptoms. In contrast, H1 was unsupported (r = .01, p = .683), as age bore virtually no relationship to depressive outcomes. By comparison, H3–H5 garnered strong support: poorer self-rated health (H3; r = .31, p < .001), less frequent social engagement (H4; r = −.17, p < .001), and lower life satisfaction (H5; r = −.43, p < .001) each exhibited correlations in the expected direction.

4.5 Self-Rated Health

The bar chart shows participants’ self-rated health on a 1–5 scale. Most respondents rate their health in the moderate to good range (values 3–4), with fewer reporting very poor (1) or excellent (5) health. This suggests the sample generally perceives their health as above average but rarely at the extremes.

# numeric data
df$health_n = as.numeric(as.character(df$health))

## Warning: NAs introduced by coercion

# Calculate mean and median
mean_hea = mean(df$health_n, na.rm = TRUE)
median_hea = median(df$health_n, na.rm = TRUE)

# Bar chart of self-rated health (ordinal)
ggplot(df, aes(x = factor(health))) +
  geom_bar(fill = "lightpink", color = "black", alpha = 0.7) +
  labs(
    title   = "Figure 6. Distribution of Self-Rated Health",
    x       = "Self-Rated Health (1 = Poor, 5 = Excellent)",
    y       = "Count",
    caption = "Mara Winkler"
  ) +
  theme_minimal()

4.6 Social Connections

The histogram shows participants’ reported number of social meetings per week. Most participants meet with friends or family 3–4 times weekly, indicating regular social engagement. Frequencies below 2 and above 6 are less common, representing less frequent social contact or very high social activity, respectively.

# numeric data
df$sclmeet_n = as.numeric(as.character(df$sclmeet))

## Warning: NAs introduced by coercion

# Calculate mean and median
mean_soc = mean(df$sclmeet_n, na.rm = TRUE)
median_soc = median(df$sclmeet_n, na.rm = TRUE)

ggplot(df, aes(y = sclmeet)) +
  geom_histogram(stat = "count",
                 fill  = "lightpink",
                 color = "black",
                 alpha = 0.7) +
  labs(
    title   = "Figure 7. Frequency of Social Meetings per Week",
    x       = "Number of Meetings",
    y       = "Count",
    caption = "Mara Winkler"
  ) +
  theme_minimal()

## Warning in geom_histogram(stat = "count", fill = "lightpink", color = "black",
## : Ignoring unknown parameters: `binwidth`, `bins`, and `pad`

4.7 Self-Rated Life Satisfaction

The histogram depicts participants’ self-reported life satisfaction on a 0–12 scale, where 0 indicates “extremely dissatisfied” and 12 “extremely satisfied.” The majority of respondents rate their satisfaction in the upper-middle range (around 7–9), demonstrating overall positive well-being. Lower scores (0–3) are uncommon, indicating few participants report very low life satisfaction.

 ggplot(df, aes( stflife)) +
   geom_bar(fill  = "lightpink",
           color = "black",
            alpha = 0.7,
            na.rm = TRUE) +
   labs(
    title   = "Figure 7. Distribution of Life Satisfaction Scores",
    x       = "Life Satisfaction",
      y       = "Count",
    caption = "Mara Winkler"
  ) +
  theme_minimal()+
 coord_flip()

5. Regression Models

In order to examine the combined effects of the predictors and to test whether the relationships vary by country, several regression models with interaction terms were estimated.

5.1 OLS on Coefficients CES-D8

This lollipop chart shows each predictor’s estimated effect on depression, with 95 % confidence intervals. Dots right of zero indicate higher predicted scores (e.g. “Good self-rated health”), dots left indicate lower scores (e.g. “Very bad self-rated health”). Narrow bands (age, health) reflect precise estimates; wider bands (some drinking and social-meeting categories) reflect greater uncertainty.

model_depression = lm(depression ~ agea + alcfreq + health + sclmeet, data = df)
summary(model_depression)

## 
## Call:
## lm(formula = depression ~ agea + alcfreq + health + sclmeet, 
##     data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.46774 -0.29079 -0.05734  0.21594  2.25204 
## 
## Coefficients:
##                                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    2.0872177  0.0656771  31.780  < 2e-16 ***
## agea                          -0.0040795  0.0003605 -11.317  < 2e-16 ***
## alcfreq.L                     -0.0657980  0.0185195  -3.553 0.000385 ***
## alcfreq.Q                      0.0448499  0.0185403   2.419 0.015598 *  
## alcfreq.C                     -0.0322463  0.0164447  -1.961 0.049950 *  
## alcfreq^4                     -0.0204432  0.0156690  -1.305 0.192062    
## alcfreq^5                     -0.0038604  0.0163910  -0.236 0.813817    
## healthGood                     0.1592552  0.0169680   9.386  < 2e-16 ***
## healthFair                     0.3527545  0.0190244  18.542  < 2e-16 ***
## healthBad                      0.6989579  0.0267062  26.172  < 2e-16 ***
## healthVery bad                 0.9914527  0.0594598  16.674  < 2e-16 ***
## sclmeetLess than once a month -0.0691306  0.0671886  -1.029 0.303577    
## sclmeetOnce a month           -0.2284283  0.0656223  -3.481 0.000504 ***
## sclmeetSeveral times a month  -0.3104225  0.0631521  -4.915 9.15e-07 ***
## sclmeetOnce a week            -0.3072367  0.0632566  -4.857 1.23e-06 ***
## sclmeetSeveral times a week   -0.3554486  0.0625419  -5.683 1.40e-08 ***
## sclmeetEvery day              -0.4094461  0.0634752  -6.450 1.22e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4359 on 4789 degrees of freedom
##   (795 observations deleted due to missingness)
## Multiple R-squared:  0.2257, Adjusted R-squared:  0.2232 
## F-statistic: 87.26 on 16 and 4789 DF,  p-value: < 2.2e-16

# focussing on coeffients
coefficients(model_depression)

##                   (Intercept)                          agea 
##                   2.087217675                  -0.004079527 
##                     alcfreq.L                     alcfreq.Q 
##                  -0.065797974                   0.044849873 
##                     alcfreq.C                     alcfreq^4 
##                  -0.032246325                  -0.020443179 
##                     alcfreq^5                    healthGood 
##                  -0.003860363                   0.159255195 
##                    healthFair                     healthBad 
##                   0.352754465                   0.698957867 
##                healthVery bad sclmeetLess than once a month 
##                   0.991452676                  -0.069130589 
##           sclmeetOnce a month  sclmeetSeveral times a month 
##                  -0.228428347                  -0.310422461 
##            sclmeetOnce a week   sclmeetSeveral times a week 
##                  -0.307236726                  -0.355448645 
##              sclmeetEvery day 
##                  -0.409446076

coef_df = tidy(model_depression, conf.int=TRUE) %>%
  filter(term != "(Intercept)")

# recode terms
coef_df = coef_df %>%
  mutate(term = recode(term,
    agea           = "Age (years)",
    alcfreq1       = "Rare drinking",
    alcfreq2       = "Once a Month drinking",
    alcfreq3       = "Monthly drinking",
    alcfreq4       = "Weekly drinking",
    alcfreq5       = "Several/week drinking",
    alcfreq6       = "Daily drinking",
    health1        = "Very bad self-rated health",
    health2        = "Bad self-rated health",
    health3        = "Fair self-rated health",
    health4        = "Good self-rated health",
    sclmeet1       = "Never meet",
    sclmeet2       = "Monthly meet",
    sclmeet3       = "Several/month meet",
    sclmeet4       = "Weekly meet",
    sclmeet5       = "Several/week meet",
    sclmeet6       = "Daily meet"
  ))

# 3. Plot: lines from zero to estimate, point at estimate
ggplot(coef_df, aes(x = estimate, y = reorder(term, estimate))) +
  geom_segment(aes(x = 0, xend = estimate, y = term, yend = term),
               color = "lightpink", size = 1) +
  geom_point(color = "lightpink", size = 3) +
  geom_errorbarh(aes(xmin = conf.low, xmax = conf.high),
                 height = 0, color = "black") +
  labs(
    title = "Figure 8. OLS: Coefficients for Depression",
    x     = "Estimate (with 95% Confidence Intervall)",
    y     = "",
    caption = "Mara Winkler"
  ) +
  theme_minimal()

## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

Figure 8 plots the estimated OLS coefficients (points) and their 95 % confidence intervals (horizontal lines) for predicting the CES-D8 depression score. Predictors to the right of the vertical zero line—most notably “Bad self-rated health” and “Very bad self-rated health”—have positive coefficients, indicating higher depressive symptoms as health worsens (strong support for H3). In contrast, predictors to the left—especially frequent social meetings (e.g. “Daily meet,” “Several/week meet”)—have negative coefficients, indicating lower depression with greater social engagement (supporting H4). The age coefficient lies almost exactly on zero (H1 unsupported), and the alcohol‐consumption contrasts (alcfreq.L, alcfreq.Q, etc.) cluster near zero with wide intervals (H2 only weakly supported at best)

5.2 Predictors of Clinically Significant Depression

CES-D8 items were recoded from 1–4 to 0–3 and summed to yield a 0–24 symptom score. This continuum was partitioned into four severity levels:

None/Low (0–4): Minimal or absent symptoms

Mild (5–9): Subthreshold symptoms Moderate (10–14): At or above the clinical cutoff (≥10) Severe (15–24): High symptom burden indicative of probable major depression This categorization retains granularity across the symptom spectrum and facilitates analysis of how predictors relate to increasing severity.

# Recode each CES-D item from 1–4 to 0–3
df$d20_0to3 = df$d20 - 1
df$d21_0to3 = df$d21 - 1
df$d22_0to3 = df$d22 - 1
df$d23_0to3 = df$d23 - 1
df$d24_0to3 = df$d24 - 1
df$d25_0to3 = df$d25 - 1  # d25 has already been reversed correctly
df$d26_0to3 = df$d26 - 1
df$d27_0to3 = df$d27 - 1

#  Compute the sum of all eight items (range: 0–24)
df$depression_sum = rowSums(df[, c(
  "d20_0to3","d21_0to3","d22_0to3","d23_0to3",
  "d24_0to3","d25_0to3","d26_0to3","d27_0to3"
)], na.rm = TRUE)

# Cut Off
df$depression_cat4 = cut(
  df$depression_sum,
  breaks = c(-1, 4, 9, 14, 24),
  labels = c("1: none/low", "2: mild", "3: moderate", "4: severe")
)

table(df$depression_cat4)

## 
## 1: none/low     2: mild 3: moderate   4: severe 
##        2041        2603         713         238

prop.table(table(df$depression_cat4))

## 
## 1: none/low     2: mild 3: moderate   4: severe 
##  0.36478999  0.46523682  0.12743521  0.04253798

# Absolute frequencies for each severity category
table(df$depression_cat4)

## 
## 1: none/low     2: mild 3: moderate   4: severe 
##        2041        2603         713         238

# relative proportions for each category
prop.table(table(df$depression_cat4))

## 
## 1: none/low     2: mild 3: moderate   4: severe 
##  0.36478999  0.46523682  0.12743521  0.04253798

df %>% 
  count(depression_cat4) %>% 
  mutate(
    percent = n / sum(n)
  ) %>% 
  rename(
    Category    = depression_cat4,
    Count       = n,
    Proportion  = percent
  ) %>% 
  kable(digits = c(0, 0, 3),
        caption = "Table 1. Distribution of Depression Severity")

Table 1. Distribution of Depression Severity
Category	Count	Proportion
1: none/low	2041	0.364
2: mild	2603	0.465
3: moderate	713	0.127
4: severe	238	0.042
NA	6	0.001

5.3 Logistic Regression

Logistic regression models the probability of clinical depression (binary outcome), estimating how each predictor alters the log-odds of crossing the CES-D8 clinical threshold. This approach bounds predicted probabilities between 0 and 1 and produces interpretable odds ratios for each covariate

df$depression = ifelse(df$depression_sum >= 10, 1, 0)
aModel = lm(depression ~ agea + alcfreq + health + sclmeet + cntry,
              data = df,
              family = binomial)

## Warning: In lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
##  extra argument 'family' will be disregarded

summary(aModel)

## 
## Call:
## lm(formula = depression ~ agea + alcfreq + health + sclmeet + 
##     cntry, data = df, family = binomial)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.88761 -0.18097 -0.09602 -0.02412  1.05029 
## 
## Coefficients:
##                                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    0.4433683  0.0533085   8.317  < 2e-16 ***
## agea                          -0.0016848  0.0002922  -5.766 8.62e-09 ***
## alcfreq.L                     -0.0548652  0.0150772  -3.639 0.000277 ***
## alcfreq.Q                      0.0284760  0.0153494   1.855 0.063632 .  
## alcfreq.C                     -0.0082370  0.0133097  -0.619 0.536028    
## alcfreq^4                     -0.0227818  0.0127306  -1.790 0.073593 .  
## alcfreq^5                      0.0018237  0.0132666   0.137 0.890669    
## healthGood                     0.0456637  0.0137583   3.319 0.000910 ***
## healthFair                     0.1699774  0.0154826  10.979  < 2e-16 ***
## healthBad                      0.4112672  0.0216967  18.955  < 2e-16 ***
## healthVery bad                 0.5431626  0.0481254  11.286  < 2e-16 ***
## sclmeetLess than once a month -0.0972995  0.0543774  -1.789 0.073624 .  
## sclmeetOnce a month           -0.2374407  0.0531190  -4.470 8.00e-06 ***
## sclmeetSeveral times a month  -0.3001005  0.0511142  -5.871 4.62e-09 ***
## sclmeetOnce a week            -0.3107499  0.0512243  -6.066 1.41e-09 ***
## sclmeetSeveral times a week   -0.3201626  0.0506756  -6.318 2.89e-10 ***
## sclmeetEvery day              -0.3359493  0.0515916  -6.512 8.19e-11 ***
## cntrySpain                     0.0305582  0.0119773   2.551 0.010761 *  
## cntryNorway                   -0.0180912  0.0139649  -1.295 0.195219    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3527 on 4787 degrees of freedom
##   (795 observations deleted due to missingness)
## Multiple R-squared:  0.1559, Adjusted R-squared:  0.1527 
## F-statistic: 49.12 on 18 and 4787 DF,  p-value: < 2.2e-16

5.4 Odds Ratio and Confidence Interval

Table 2 summarizes the odds ratios (ORs) and 95 % confidence intervals (CIs) from the logistic model predicting clinical depression:

aModel = lm(depression ~ agea + alcfreq + health + sclmeet, data=df, family=binomial)

## Warning: In lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
##  extra argument 'family' will be disregarded

summary(aModel)

## 
## Call:
## lm(formula = depression ~ agea + alcfreq + health + sclmeet, 
##     data = df, family = binomial)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.88927 -0.18061 -0.09519 -0.03004  1.03025 
## 
## Coefficients:
##                                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    0.450520   0.053205   8.468  < 2e-16 ***
## agea                          -0.001726   0.000292  -5.909 3.68e-09 ***
## alcfreq.L                     -0.051894   0.015003  -3.459 0.000547 ***
## alcfreq.Q                      0.040029   0.015020   2.665 0.007722 ** 
## alcfreq.C                     -0.008090   0.013322  -0.607 0.543710    
## alcfreq^4                     -0.018994   0.012694  -1.496 0.134626    
## alcfreq^5                      0.002839   0.013278   0.214 0.830729    
## healthGood                     0.045790   0.013746   3.331 0.000871 ***
## healthFair                     0.173113   0.015412  11.233  < 2e-16 ***
## healthBad                      0.414194   0.021635  19.145  < 2e-16 ***
## healthVery bad                 0.544761   0.048169  11.309  < 2e-16 ***
## sclmeetLess than once a month -0.095844   0.054430  -1.761 0.078322 .  
## sclmeetOnce a month           -0.236195   0.053161  -4.443 9.07e-06 ***
## sclmeetSeveral times a month  -0.299943   0.051160  -5.863 4.86e-09 ***
## sclmeetOnce a week            -0.307647   0.051244  -6.004 2.07e-09 ***
## sclmeetSeveral times a week   -0.319529   0.050666  -6.307 3.11e-10 ***
## sclmeetEvery day              -0.335792   0.051422  -6.530 7.25e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3531 on 4789 degrees of freedom
##   (795 observations deleted due to missingness)
## Multiple R-squared:  0.1537, Adjusted R-squared:  0.1508 
## F-statistic: 54.34 on 16 and 4789 DF,  p-value: < 2.2e-16

exp(coef(aModel))

##                   (Intercept)                          agea 
##                     1.5691284                     0.9982760 
##                     alcfreq.L                     alcfreq.Q 
##                     0.9494296                     1.0408410 
##                     alcfreq.C                     alcfreq^4 
##                     0.9919428                     0.9811851 
##                     alcfreq^5                    healthGood 
##                     1.0028427                     1.0468543 
##                    healthFair                     healthBad 
##                     1.1890008                     1.5131513 
##                healthVery bad sclmeetLess than once a month 
##                     1.7241967                     0.9086053 
##           sclmeetOnce a month  sclmeetSeveral times a month 
##                     0.7896271                     0.7408607 
##            sclmeetOnce a week   sclmeetSeveral times a week 
##                     0.7351750                     0.7264911 
##              sclmeetEvery day 
##                     0.7147716

exp(confint(aModel))

##                                   2.5 %    97.5 %
## (Intercept)                   1.4137042 1.7416402
## agea                          0.9977046 0.9988476
## alcfreq.L                     0.9219114 0.9777693
## alcfreq.Q                     1.0106399 1.0719446
## alcfreq.C                     0.9663714 1.0181908
## alcfreq^4                     0.9570694 1.0059085
## alcfreq^5                     0.9770737 1.0292912
## healthGood                    1.0190203 1.0754486
## healthFair                    1.1536135 1.2254735
## healthBad                     1.4503141 1.5787110
## healthVery bad                1.5688275 1.8949529
## sclmeetLess than once a month 0.8166440 1.0109223
## sclmeetOnce a month           0.7114755 0.8763632
## sclmeetSeveral times a month  0.6701596 0.8190208
## sclmeetOnce a week            0.6649060 0.8128701
## sclmeetSeveral times a week   0.6577985 0.8023572
## sclmeetEvery day              0.6462285 0.7905847

broom::tidy(aModel, conf.int = TRUE) %>%
  filter(term != "(Intercept)") %>%
  transmute(
    Predictor   = term,
    OR          = sprintf("%.2f", exp(estimate)),
    `95% CI`    = paste0(
                    sprintf("%.2f", exp(conf.low)), 
                    "–", 
                    sprintf("%.2f", exp(conf.high))
                  )
  ) %>%
  kable(
    col.names = c("Predictor", "Odds Ratio", "95% CI"),
    caption   = "Table 2. Odds Ratios and 95% CIs"
  )

Table 2. Odds Ratios and 95% CIs
Predictor	Odds Ratio	95% CI
agea	1.00	1.00–1.00
alcfreq.L	0.95	0.92–0.98
alcfreq.Q	1.04	1.01–1.07
alcfreq.C	0.99	0.97–1.02
alcfreq^4	0.98	0.96–1.01
alcfreq^5	1.00	0.98–1.03
healthGood	1.05	1.02–1.08
healthFair	1.19	1.15–1.23
healthBad	1.51	1.45–1.58
healthVery bad	1.72	1.57–1.89
sclmeetLess than once a month	0.91	0.82–1.01
sclmeetOnce a month	0.79	0.71–0.88
sclmeetSeveral times a month	0.74	0.67–0.82
sclmeetOnce a week	0.74	0.66–0.81
sclmeetSeveral times a week	0.73	0.66–0.80
sclmeetEvery day	0.71	0.65–0.79

5.5 Pseudo R-Squared

McFadden’s pseudo-R² (0.210) indicates a 21% reduction in deviance compared to the null model as an excellent fit for a cross-sectional logistic regression. Nagelkerke’s pseudo-R² (1.402) exceeds its 0–1 range, suggesting a calculation or scaling error that requires revision. Table 3 reports two measures of explained deviance for the clinical-depression model:

df$depression_sum = rowSums(df[, c(
  "d20_0to3","d21_0to3","d22_0to3","d23_0to3",
  "d24_0to3","d25_0to3","d26_0to3","d27_0to3"
)], na.rm = TRUE)
df$depression_binary = ifelse(df$depression_sum >= 10, 1, 0)
aModel = glm(
  depression_binary ~ agea + alcfreq + health + sclmeet + stflife + cntry,
  data   = df,
  family = binomial(link = "logit")
)

summary(aModel)

## 
## Call:
## glm(formula = depression_binary ~ agea + alcfreq + health + sclmeet + 
##     stflife + cntry, family = binomial(link = "logit"), data = df)
## 
## Coefficients:
##                                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                    1.045040   0.684302   1.527 0.126721    
## agea                          -0.009093   0.002585  -3.518 0.000435 ***
## alcfreq.L                     -0.385903   0.130194  -2.964 0.003036 ** 
## alcfreq.Q                      0.234540   0.135190   1.735 0.082761 .  
## alcfreq.C                     -0.012508   0.118295  -0.106 0.915794    
## alcfreq^4                     -0.121169   0.114337  -1.060 0.289259    
## alcfreq^5                      0.043631   0.122341   0.357 0.721368    
## healthGood                     0.315800   0.152088   2.076 0.037854 *  
## healthFair                     0.995284   0.155733   6.391 1.65e-10 ***
## healthBad                      1.961132   0.183850  10.667  < 2e-16 ***
## healthVery bad                 2.434730   0.351677   6.923 4.42e-12 ***
## sclmeetLess than once a month -0.520420   0.390390  -1.333 0.182506    
## sclmeetOnce a month           -1.124463   0.386098  -2.912 0.003587 ** 
## sclmeetSeveral times a month  -1.421531   0.372045  -3.821 0.000133 ***
## sclmeetOnce a week            -1.497073   0.374096  -4.002 6.29e-05 ***
## sclmeetSeveral times a week   -1.543287   0.368274  -4.191 2.78e-05 ***
## sclmeetEvery day              -1.635058   0.379404  -4.310 1.64e-05 ***
## stflife1                       1.849615   0.960881   1.925 0.054240 .  
## stflife2                       0.717111   0.655449   1.094 0.273921    
## stflife3                       0.619405   0.607675   1.019 0.308059    
## stflife4                       0.035586   0.589461   0.060 0.951860    
## stflife5                      -0.726372   0.567305  -1.280 0.200408    
## stflife6                      -0.478070   0.565967  -0.845 0.398281    
## stflife7                      -1.481661   0.562272  -2.635 0.008410 ** 
## stflife8                      -2.004537   0.561524  -3.570 0.000357 ***
## stflife9                      -2.675239   0.575295  -4.650 3.32e-06 ***
## stflifeExtremely satisfied    -2.627679   0.579182  -4.537 5.71e-06 ***
## cntrySpain                     0.195390   0.100697   1.940 0.052334 .  
## cntryNorway                   -0.285336   0.134372  -2.123 0.033714 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 4497.5  on 4790  degrees of freedom
## Residual deviance: 3336.0  on 4762  degrees of freedom
##   (810 observations deleted due to missingness)
## AIC: 3394
## 
## Number of Fisher Scoring iterations: 5

coef(aModel)

##                   (Intercept)                          agea 
##                   1.045039925                  -0.009093496 
##                     alcfreq.L                     alcfreq.Q 
##                  -0.385903224                   0.234539963 
##                     alcfreq.C                     alcfreq^4 
##                  -0.012507679                  -0.121168841 
##                     alcfreq^5                    healthGood 
##                   0.043630748                   0.315799998 
##                    healthFair                     healthBad 
##                   0.995283972                   1.961131734 
##                healthVery bad sclmeetLess than once a month 
##                   2.434729881                  -0.520419836 
##           sclmeetOnce a month  sclmeetSeveral times a month 
##                  -1.124462840                  -1.421531244 
##            sclmeetOnce a week   sclmeetSeveral times a week 
##                  -1.497072706                  -1.543286619 
##              sclmeetEvery day                      stflife1 
##                  -1.635057861                   1.849614630 
##                      stflife2                      stflife3 
##                   0.717111359                   0.619405499 
##                      stflife4                      stflife5 
##                   0.035586448                  -0.726371705 
##                      stflife6                      stflife7 
##                  -0.478069726                  -1.481660884 
##                      stflife8                      stflife9 
##                  -2.004537063                  -2.675239086 
##    stflifeExtremely satisfied                    cntrySpain 
##                  -2.627679229                   0.195389600 
##                   cntryNorway 
##                  -0.285335607

or_values = exp(coef(aModel))
print("Odds Ratios (OR):")

## [1] "Odds Ratios (OR):"

print(or_values)

##                   (Intercept)                          agea 
##                    2.84351205                    0.99094772 
##                     alcfreq.L                     alcfreq.Q 
##                    0.67983631                    1.26432700 
##                     alcfreq.C                     alcfreq^4 
##                    0.98757022                    0.88588437 
##                     alcfreq^5                    healthGood 
##                    1.04459656                    1.37135595 
##                    healthFair                     healthBad 
##                    2.70549252                    7.10736616 
##                healthVery bad sclmeetLess than once a month 
##                   11.41273550                    0.59427100 
##           sclmeetOnce a month  sclmeetSeveral times a month 
##                    0.32482690                    0.24134418 
##            sclmeetOnce a week   sclmeetSeveral times a week 
##                    0.22378428                    0.21367767 
##              sclmeetEvery day                      stflife1 
##                    0.19494109                    6.35736911 
##                      stflife2                      stflife3 
##                    2.04850725                    1.85782324 
##                      stflife4                      stflife5 
##                    1.03622722                    0.48366067 
##                      stflife6                      stflife7 
##                    0.61997897                    0.22725992 
##                      stflife8                      stflife9 
##                    0.13472265                    0.06889036 
##    stflifeExtremely satisfied                    cntrySpain 
##                    0.07224593                    1.21578456 
##                   cntryNorway 
##                    0.75176192

ci_values = exp(confint(aModel))

## Waiting for profiling to be done...

print(ci_values)

##                                    2.5 %     97.5 %
## (Intercept)                   0.76105399 11.3215835
## agea                          0.98592445  0.9959694
## alcfreq.L                     0.52501362  0.8750255
## alcfreq.Q                     0.96813530  1.6454268
## alcfreq.C                     0.78229524  1.2441176
## alcfreq^4                     0.70806024  1.1086971
## alcfreq^5                     0.82311735  1.3301798
## healthGood                    1.02250612  1.8574431
## healthFair                    2.00288899  3.6904961
## healthBad                     4.97323718 10.2302301
## healthVery bad                5.76688909 22.9753666
## sclmeetLess than once a month 0.27548567  1.2760284
## sclmeetOnce a month           0.15180546  0.6914468
## sclmeetSeveral times a month  0.11596969  0.5000461
## sclmeetOnce a week            0.10711567  0.4655665
## sclmeetSeveral times a week   0.10347606  0.4396510
## sclmeetEvery day              0.09238534  0.4099219
## stflife1                      1.07211965 53.2927153
## stflife2                      0.54803636  7.3338391
## stflife3                      0.53897757  5.9872033
## stflife4                      0.31007993  3.2115158
## stflife5                      0.15044354  1.4320392
## stflife6                      0.19334884  1.8314513
## stflife7                      0.07132961  0.6662307
## stflife8                      0.04234217  0.3943640
## stflife9                      0.02112193  0.2072243
## stflifeExtremely satisfied    0.02199396  0.2189725
## cntrySpain                    0.99801885  1.4812502
## cntryNorway                   0.57613012  0.9759740

r_mcfadden = with(summary(aModel), 1 - deviance/null.deviance)
r_nagelkerke = with(summary(aModel), r_mcfadden/(1 - (null.deviance / nrow(aModel$data)*log(2))))
r_mcfadden

## [1] 0.2582621

r_nagelkerke

## [1] 0.5824359

tibble(
Metric          = c("McFadden’s pseudo-R²", "Nagelkerke’s pseudo-R²"),
 Value           = c(r_mcfadden,      r_nagelkerke)
) %>%
 mutate(Value = round(Value, 3)) %>%
 kable(
   col.names = c("Metric", "Value"),
    caption   = "Table 3. Pseudo-R² for Logistic Model"
 )

Table 3. Pseudo-R² for Logistic Model
Metric	Value
McFadden’s pseudo-R²	0.258
Nagelkerke’s pseudo-R²	0.582

6. Results

6.1 Pearson Correlation: Hypotheses Testing

Hypotheses H1–H5 are evaluated via Pearson correlation coefficients; hypotheses H6a–H6c are assessed using the logistic regression model with interaction terms. Pearson correlation tests were performed for each predictor against the continuous depression score. Results are presented in Table 4.

df = df %>%
  mutate(across(c(agea, alcfreq, health, sclmeet, stflife, depression), as.numeric))

tests = list(
  H1 = cor.test(df$agea,    df$depression),
  H2 = cor.test(df$alcfreq, df$depression),
  H3 = cor.test(df$health,  df$depression),
  H4 = cor.test(df$sclmeet, df$depression),
  H5 = cor.test(df$stflife, df$depression)
)

results = tibble(
  Hypothesis = names(tests),
  r_value    = map_dbl(tests, ~ .x$estimate),
  p_value    = map_dbl(tests, ~ .x$p.value)
)

knitr::kable(
  results,
  caption = "Table 4: Pearson correlation coefficients for Hypothesis: H1–H5",
  digits  = c(0, 2, 3)
)

Table 4: Pearson correlation coefficients for Hypothesis: H1–H5
Hypothesis	r_value	p_value
H1	0.01	0.683
H2	-0.12	0.000
H3	0.31	0.000
H4	-0.17	0.000
H5	-0.43	0.000

6.2 Hypotheses Interpretation

Hypothesis 1 was not supported, as the correlation between age and depression proved negligible (r = .01, p = .683). A small but statistically significant positive association emerged for alcohol consumption (H2; r = .11, p < .001), indicating that higher drinking frequency corresponds to slightly elevated depressive symptoms. Self-rated health demonstrated a moderate positive relationship with depression (H3; r = .31, p < .001), confirming that poorer perceived health aligns with greater symptom severity. The frequency of social meetings was inversely related to depression (H4; r = –.17, p < .001), consistent with a protective effect of social engagement. The strongest effect was observed for life satisfaction (H5; r = –.43, p < .001), implying that higher satisfaction indices are robustly associated with lower levels of depressive symptomatology.

7. Discussion

Self‐rated health, social interaction frequency and life satisfaction consistently predicted depressive symptoms in both OLS and logistic models. In contrast, age bore no significant association (H1 unsupported) and alcohol consumption exerted only a small effect (H2 partially supported). These findings suggest that psychosocial factors outweigh demographic or behavioral variables in explaining depression across Norway, Germany and Spain.

8. Limitations

The ESS’s cross-sectional design prevents causal inference. Unexpectedly weak or inverse associations for age and alcohol consumption may reflect unmeasured confounders such as socioeconomic status, employment conditions or country-specific drinking norms. Finally, restricting the sample to Norway, Germany and Spain may limit generalizability to contexts with different welfare systems, cultural practices or family structures.

9. Conclusion

Self-rated health, social connectivity and life satisfaction emerged as the strongest predictors of depressive symptoms. The lack of an age effect and the modest alcohol finding point to the need for further investigation of underlying confounders. Future longitudinal research incorporating additional socioeconomic and cultural variables will be essential to clarify causal pathways and guide targeted public-health interventions.