Depression in Europe

Define Depression Scale CES-D8

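In the European Social Survey (ESS), depression is measured with the CES-D8 scale, an eight-item short form of the Center for Epidemiologic Studies Depression scale. First we load the data, check the reliability of the scale, and compute the score.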
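The reliability check and score computation could look roughly like the sketch below. It is not the code used for this report. The item names and the choice of reverse-coded items are assumptions based on the usual ESS naming, the toy data frame stands in for the real survey file, and the item mean is used as the score because the reported 1 to 4 range suggests it.

```r
# Assumed ESS-style item names; adjust to the actual data set
items   <- c("fltdpr", "flteeff", "slprl", "wrhpp",
             "fltlnl", "enjlf", "fltsd", "cldgng")
reverse <- c("wrhpp", "enjlf")   # positively worded items (assumption)

# Toy data standing in for the survey file: 200 respondents, answers 1-4
set.seed(1)
latent <- rnorm(200)
toy <- as.data.frame(sapply(items, function(i) {
  pmin(4, pmax(1, round(2 + 0.8 * latent + rnorm(200, sd = 0.7))))
}))
toy[reverse] <- 5 - toy[reverse]   # store positive items in their raw direction

# Reverse-code positive items so that higher always means more symptoms
recode_items <- function(data, items, reverse, max_value = 4) {
  out <- data[items]
  out[reverse] <- (max_value + 1) - out[reverse]
  out
}

# Cronbach's alpha computed from complete cases
cronbach_alpha <- function(x) {
  x <- x[complete.cases(x), , drop = FALSE]
  k <- ncol(x)
  (k / (k - 1)) * (1 - sum(apply(x, 2, var)) / var(rowSums(x)))
}

scored    <- recode_items(toy, items, reverse)
alpha     <- cronbach_alpha(scored)
toy$cesd8 <- rowMeans(scored)   # NA if any item is missing
```

With `rowMeans()` and no `na.rm`, a respondent with any missing item gets a missing score. Whether the report handled partial responses this way is not stated.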

Depression values across Europe can then be summarized as follows:

summary(df$cesd8)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   1.000   1.375   1.625   1.695   2.000   4.000     799

In general, the CES-D8 scores show a unimodal, slightly right-skewed distribution. The summary above is consistent with this: the mean (1.695) lies above the median (1.625), and the maximum (4.000) is much farther from the median than the minimum (1.000).
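The shape of the distribution can be inspected with a histogram and a simple skewness statistic. The sketch below runs on simulated scores, not on the ESS data. `toy_scores` and the helper `skewness()` are illustrative names introduced here.

```r
# Toy scores standing in for df$cesd8 (item means on a 1-4 scale, steps of 1/8)
set.seed(42)
toy_scores <- pmin(4, 1 + round(rgamma(1000, shape = 2, rate = 3) * 8) / 8)

# Moment-based sample skewness; positive values indicate a longer right tail
skewness <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

skew <- skewness(toy_scores)

# Histogram with one bin per possible score step
hist(toy_scores, breaks = seq(0.9375, 4.0625, by = 0.125),
     xlab = "CES-D8 score", main = "Distribution of CES-D8 scores")
```

Applied to the real data, the same two calls would take `df$cesd8` in place of `toy_scores`.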

Hypothesis

Depression varies with age because biological, psychological, and social conditions change over the life course: it often rises in adolescence due to developmental and social stress, stabilizes or declines in midlife as coping improves, and may increase again in older age due to illness, loss, and loneliness.

Overall, we expect depression scores to increase with age.

To test our hypothesis, we estimate a linear regression model.

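# Convert age to numeric; values that cannot be parsed become NA
df$agea_num <- suppressWarnings(as.numeric(as.character(df$agea)))
# Bivariate linear regression of the CES-D8 score on age
model <- lm(cesd8 ~ agea_num, data = df)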
summary(model)
## 
## Call:
## lm(formula = cesd8 ~ agea_num, data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.80824 -0.36165 -0.09376  0.26661  2.37073 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.5397891  0.0073480  209.55   <2e-16 ***
## agea_num    0.0029828  0.0001339   22.28   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4958 on 39126 degrees of freedom
##   (1028 observations deleted due to missingness)
## Multiple R-squared:  0.01253,    Adjusted R-squared:  0.01251 
## F-statistic: 496.6 on 1 and 39126 DF,  p-value: < 2.2e-16

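The bivariate linear regression model shows that age is positively associated with depressive symptoms, but the estimated effect is very small: each additional year of age corresponds to an increase of about 0.003 points on the CES-D8 scale, or roughly 0.03 points per decade. Although the coefficient is statistically significant (p < 2e-16), the model explains only about 1.3% of the variance (R² = 0.0125), so age alone has very limited explanatory power for depression. This is consistent with the correlation reported in the next section.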
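To make the size of the effect concrete, the fitted line can be evaluated at two ages. The sketch below only reuses the coefficients and the residual standard error printed in the regression output. The ages 20 and 80 and the helper `predict_cesd8()` are chosen here for illustration.

```r
# Coefficients copied from the regression output
intercept <- 1.5397891
slope     <- 0.0029828
resid_sd  <- 0.4958      # residual standard error

# Predicted CES-D8 score at a given age under the fitted line
predict_cesd8 <- function(age) intercept + slope * age

pred_20 <- predict_cesd8(20)   # about 1.60
pred_80 <- predict_cesd8(80)   # about 1.78

# Difference across 60 years of age, in scale points and in residual SDs
diff_points <- pred_80 - pred_20        # about 0.18
diff_in_sd  <- diff_points / resid_sd   # about 0.36
```

Across 60 years of age the predicted score differs by less than a fifth of a scale point, which is small compared with the spread of individual scores around the line.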

Checking the correlation

cor(df$agea_num, df$cesd8, use = "complete.obs")
## [1] 0.111947
cor.test(df$agea_num, df$cesd8)
## 
##  Pearson's product-moment correlation
## 
## data:  df$agea_num and df$cesd8
## t = 22.284, df = 39126, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.1021518 0.1217204
## sample estimates:
##      cor 
## 0.111947

The correlation analysis shows a weak positive association between age and depressive symptoms (r = 0.112, 95% confidence interval 0.102 to 0.122). As age increases, CES-D8 scores tend to increase slightly, but the relationship is very weak.
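In a regression with a single predictor, the squared Pearson correlation equals the model's R², which is why the correlation and the bivariate regression tell the same story (0.111947² is about 0.0125, and the t value of 22.28 appears in both outputs). The sketch below checks the identity on simulated data. The variables `age` and `score` are made up for the illustration.

```r
# Reported correlation, squared
r <- 0.111947
r_squared_reported <- r^2      # about 0.0125

# Simulated check that cor()^2 equals R-squared from lm() with one predictor
set.seed(123)
age   <- runif(500, 15, 90)
score <- 1.5 + 0.003 * age + rnorm(500, sd = 0.5)
fit   <- lm(score ~ age)

r_sim  <- cor(age, score)
r2_sim <- summary(fit)$r.squared
same   <- isTRUE(all.equal(r_sim^2, r2_sim))
```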

Develop a multivariate model

# Dummy variable: 1 for respondents younger than 40, 0 otherwise (NA if age is missing)
df$young <- ifelse(df$agea_num < 40, 1, 0)

model_multi <- lm(cesd8 ~ agea_num + young, data = df)
summary(model_multi)
## 
## Call:
## lm(formula = cesd8 ~ agea_num + young, data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.83501 -0.36497 -0.09018  0.26556  2.36332 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.4631369  0.0138173 105.892  < 2e-16 ***
## agea_num    0.0041319  0.0002206  18.727  < 2e-16 ***
## young       0.0595321  0.0090898   6.549 5.85e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4955 on 39125 degrees of freedom
##   (1028 observations deleted due to missingness)
## Multiple R-squared:  0.01361,    Adjusted R-squared:  0.01356 
## F-statistic:   270 on 2 and 39125 DF,  p-value: < 2.2e-16

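The multiple regression model includes age and a dummy variable for respondents younger than 40. Age remains positively related to CES-D8 scores (about 0.004 points per year). The dummy is statistically significant but small: the fitted line for respondents under 40 is shifted upward by about 0.06 points relative to the line for older respondents. Because the dummy is itself derived from age, it captures a departure from a strictly linear age trend rather than a separate demographic characteristic. The increase in explained variance is marginal (R² rises from 0.0125 to 0.0136), indicating that age explains only a small share of individual differences in depressive symptoms.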
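Whether the added dummy improves on the age-only model can be tested formally with a partial F-test, since the two models are nested and fitted to the same observations. The sketch below uses simulated data. `sim`, `m1`, and `m2` are illustrative names and the simulated coefficients are not ESS results.

```r
# Simulated data standing in for df (hypothetical values, not ESS results)
set.seed(2024)
n   <- 2000
sim <- data.frame(agea_num = round(runif(n, 15, 90)))
sim$young <- ifelse(sim$agea_num < 40, 1, 0)
sim$cesd8 <- 1.45 + 0.004 * sim$agea_num + 0.06 * sim$young +
  rnorm(n, sd = 0.5)

m1 <- lm(cesd8 ~ agea_num, data = sim)
m2 <- lm(cesd8 ~ agea_num + young, data = sim)

# Partial F-test for the added dummy
cmp    <- anova(m1, m2)
f_stat <- cmp$F[2]

# With one added term, the F statistic equals the squared t value of that term
t_young <- summary(m2)$coefficients["young", "t value"]
```

For the models reported here, the same identity implies a partial F statistic of about 6.549² ≈ 42.9 for the dummy.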

Plot Data

plot(df$agea_num, df$cesd8,
     xlab = "Age",
     ylab = "CES-D8 score")
abline(lm(cesd8 ~ agea_num, data = df), col = "red")
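With tens of thousands of respondents, a raw scatter plot tends to overplot heavily. One option is to draw semi-transparent points and overlay mean scores per age band next to the regression line. The sketch below runs on simulated data. The ten-year bands and the names `sim`, `age_band`, and `band_means` are choices made for this illustration.

```r
# Simulated data standing in for df (hypothetical values, not ESS results)
set.seed(7)
sim <- data.frame(agea_num = round(runif(3000, 15, 90)))
sim$cesd8 <- pmin(4, pmax(1, 1.54 + 0.003 * sim$agea_num +
                             rnorm(3000, sd = 0.5)))

# Ten-year age bands: [15,25), [25,35), ..., [85,95)
breaks <- seq(15, 95, by = 10)
sim$age_band <- cut(sim$agea_num, breaks = breaks, right = FALSE)

# Mean score per band, plotted at the band midpoint
band_means <- tapply(sim$cesd8, sim$age_band, mean, na.rm = TRUE)
midpoints  <- head(breaks, -1) + 5

plot(sim$agea_num, sim$cesd8,
     col = adjustcolor("black", alpha.f = 0.1), pch = 16,
     xlab = "Age", ylab = "CES-D8 score")
lines(midpoints, band_means, type = "b", col = "blue", lwd = 2)
abline(lm(cesd8 ~ agea_num, data = sim), col = "red")
```

If the band means bend away from the red line, that would point to the kind of non-linear age pattern described in the hypothesis section.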