Depression in Europe

Define Depression Scale CES-D8

In the ESS (European Social Survey) data, depression is measured with the eight-item CES-D8 scale. After loading the data, we check the reliability of the eight items and compute the score:

library(ltm)  # provides cronbach.alpha()
cronbach.alpha(df[, paste0("d", 1:8)], na.rm = TRUE)
## 
## Cronbach's alpha for the 'df[, paste0("d", 1:8)]' data-set
## 
## Items: 8
## Sample units: 40156
## alpha: 0.823
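
The code that builds `cesd8` is not shown here, so the following is only a minimal sketch of the two steps on simulated data. It assumes that the score is the row mean of the eight items and that positively worded items have already been reverse-coded. The data frame `toy` and its values are hypothetical; only the column names `d1` to `d8` and `cesd8` are taken from the code above.

```r
set.seed(1)
n <- 200
latent <- rnorm(n)  # hypothetical underlying depression level

# Toy stand-in for the eight items d1..d8, each on a 1-4 scale
toy <- as.data.frame(sapply(1:8, function(i) {
  pmin(4, pmax(1, round(2 + 0.8 * latent + rnorm(n, sd = 0.7))))
}))
names(toy) <- paste0("d", 1:8)
items <- toy[, paste0("d", 1:8)]

# Cronbach's alpha by hand: k/(k-1) * (1 - sum of item variances / variance of sum score)
k <- ncol(items)
alpha <- k / (k - 1) * (1 - sum(apply(items, 2, var)) / var(rowSums(items)))
alpha

# Assumption: the scale score is the mean of the items per respondent
toy$cesd8 <- rowMeans(items, na.rm = TRUE)
summary(toy$cesd8)
```

With `na.rm = TRUE` a respondent gets a score as long as at least one item is answered; whether the original score was built that way is not stated, so treat this choice as an assumption.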

Depression values across Europe can then be summarized as follows:

summary(df$cesd8)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   1.000   1.375   1.625   1.698   2.000   4.000      52

In general, the CES-D8 scores show a unimodal, slightly right-skewed distribution: the mean (1.698) lies above the median (1.625), and three quarters of respondents score 2 or below while the scores extend up to the scale maximum of 4.
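
The shape of the distribution is easiest to check with a histogram. The sketch below uses simulated, hypothetical scores (`toy_scores`) so that it runs on its own; with the real data one would pass `df$cesd8` instead.

```r
set.seed(42)

# Hypothetical right-skewed scores restricted to the 1-4 range of the scale
toy_scores <- pmin(4, 1 + rgamma(1000, shape = 2, scale = 0.35))

# Histogram with bins of width 0.25 across the scale range
h <- hist(toy_scores,
          breaks = seq(1, 4, by = 0.25),
          main = "Distribution of (simulated) CES-D8 scores",
          xlab = "CES-D8 score")

# A mean above the median is a simple numeric indication of right skew
c(mean = mean(toy_scores), median = median(toy_scores))
```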

Hypothesis

Depression varies with age because biological, psychological, and social conditions change over the life course: it often rises in adolescence due to developmental and social stress, stabilizes or declines in midlife as coping improves, and may increase again in older age due to illness, loss, and loneliness.

Overall, we expect depression scores to increase with age.

To test our hypothesis, we estimate a linear regression model.

model <- lm(cesd8 ~ age, data = df)
summary(model)
## 
## Call:
## lm(formula = cesd8 ~ age, data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.81341 -0.36524 -0.09391  0.26890  2.36951 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.539036   0.007314  210.43   <2e-16 ***
## age         0.003049   0.000133   22.92   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4975 on 39843 degrees of freedom
##   (311 observations deleted due to missingness)
## Multiple R-squared:  0.01301,    Adjusted R-squared:  0.01299 
## F-statistic: 525.3 on 1 and 39843 DF,  p-value: < 2.2e-16
plot(df$age,df$cesd8)
abline(model, col="red")
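
The age coefficient is positive and statistically significant, which is in line with the hypothesis, although age alone explains only about 1.3% of the variance (R-squared of 0.013). To get a feeling for the size of the effect, the fitted line can be evaluated at a few ages. The sketch below simply copies the two coefficients from the output above; the chosen ages are illustrative.

```r
# Coefficients copied from the summary(model) output
b0 <- 1.539036  # intercept
b1 <- 0.003049  # change in CES-D8 score per year of age

# Predicted scores at three illustrative ages
ages <- c(20, 50, 80)
pred <- b0 + b1 * ages
round(pred, 3)
# → 1.600 1.691 1.783

# Difference implied by a 60-year age gap
round(b1 * 60, 3)
# → 0.183
```

On a scale from 1 to 4, a difference of about 0.18 points between a 20-year-old and an 80-year-old is small in absolute terms.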

Multiple regression model with a dummy variable
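
The construction of the dummy `female` is not shown. A minimal sketch, assuming a hypothetical raw gender column `gndr` coded 1 = male and 2 = female (this coding is an assumption and should be checked against the codebook):

```r
# Hypothetical raw gender column: 1 = male, 2 = female, NA = no answer
toy <- data.frame(gndr = c(1, 2, 2, 1, NA, 2))

# Dummy variable: 1 = female, 0 = male; missing values stay missing
toy$female <- as.integer(toy$gndr == 2)
table(toy$female, useNA = "ifany")
```
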

model2 <- lm(cesd8 ~ age + female, data = df)
summary(model2)
## 
## Call:
## lm(formula = cesd8 ~ age + female, data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.86750 -0.35248 -0.08461  0.27188  2.43263 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.4779938  0.0076760  192.55   <2e-16 ***
## age         0.0029792  0.0001321   22.56   <2e-16 ***
## female      0.1213847  0.0049592   24.48   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4938 on 39842 degrees of freedom
##   (311 observations deleted due to missingness)
## Multiple R-squared:  0.02763,    Adjusted R-squared:  0.02758 
## F-statistic: 566.1 on 2 and 39842 DF,  p-value: < 2.2e-16
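
Holding age constant, women score about 0.12 points higher than men, and the age coefficient barely changes compared with the first model. The following sketch puts the two effects side by side, using only the coefficients copied from the output above; the age of 40 is an arbitrary illustration.

```r
# Coefficients copied from the summary(model2) output
b0       <- 1.4779938  # intercept (men, age 0)
b_age    <- 0.0029792  # change per year of age
b_female <- 0.1213847  # shift for women relative to men

# Predicted scores for a 40-year-old man and woman
pred_man   <- b0 + b_age * 40
pred_woman <- pred_man + b_female
round(c(man = pred_man, woman = pred_woman), 3)

# Number of years of age that corresponds to the gender gap
gap_in_years <- b_female / b_age
round(gap_in_years, 1)
```

In this model the gender gap is roughly as large as the difference associated with about 40 years of age.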