In the ESS data, depression is measured with the CES-D8 scale. First we load the data, check the reliability of the eight items, and compute the score:
library(ltm)  # provides cronbach.alpha()
cronbach.alpha(df[, paste0("d", 1:8)], na.rm = TRUE)
##
## Cronbach's alpha for the 'df[, paste0("d", 1:8)]' data-set
##
## Items: 8
## Sample units: 40156
## alpha: 0.823
# CES-D8 score
summary(df$cesd8)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
##   1.000   1.375   1.625   1.698   2.000   4.000      52
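The code that builds `df$cesd8` is not shown. The quartiles are multiples of 1/8 and the range is 1 to 4, which is consistent with a mean over the eight items. The following is a minimal sketch of such a computation, not necessarily the one used here. The helper `cesd8_score()`, the toy data, the choice of `d4` and `d6` as the positively worded items to reverse, and the use of `na.rm = TRUE` are all assumptions for illustration.

```r
# Hypothetical helper: mean of the eight CES-D8 items, each coded 1-4.
# `reverse` names the positively worded items that must be flipped so that
# higher values always indicate more depressive symptoms.
cesd8_score <- function(items, reverse = character(0)) {
  if (length(reverse) > 0) {
    items[reverse] <- 5 - items[reverse]  # 1 <-> 4, 2 <-> 3
  }
  rowMeans(items, na.rm = TRUE)           # assumption: average available items
}

# Toy data (hypothetical values, not ESS data).
toy <- data.frame(
  d1 = c(1, 2, 4), d2 = c(1, 2, 4), d3 = c(1, 2, 4), d4 = c(4, 3, 1),
  d5 = c(1, 2, 4), d6 = c(4, 3, 1), d7 = c(1, 2, 4), d8 = c(1, 2, 4)
)

# Assumption: d4 and d6 are the positively worded items.
toy$cesd8 <- cesd8_score(toy[, paste0("d", 1:8)], reverse = c("d4", "d6"))
toy$cesd8
```

If the items in `df` are already recoded in the same direction, the `reverse` argument would simply be left empty.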
Depression values across Europe can then be read from the summary of df$cesd8: scores range from 1 to 4, half of all respondents score between 1.375 and 2.000, and 52 scores are missing.
In general, the CES-D8 scores show a unimodal, slightly right-skewed distribution: the mean (1.698) lies above the median (1.625), and the upper tail reaches the scale maximum of 4.
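The shape of the distribution is easiest to judge from a histogram. The sketch below uses simulated scores as a stand-in, because the ESS data are not reproduced here. The simulated values, the item probabilities, and the bin width are assumptions chosen for illustration only.

```r
# Simulated stand-in for df$cesd8 (hypothetical values, not ESS data):
# eight items coded 1-4, with low categories more likely than high ones.
set.seed(42)
items <- matrix(
  sample(1:4, 8 * 1000, replace = TRUE, prob = c(0.5, 0.3, 0.15, 0.05)),
  ncol = 8
)
cesd8 <- rowMeans(items)

# An 8-item mean can only take values in steps of 1/8, so bins of width
# 0.125 centred on those values give one bar per possible score.
h <- hist(
  cesd8,
  breaks = seq(1 - 0.0625, 4 + 0.0625, by = 0.125),
  main = "CES-D8 score (simulated)",
  xlab = "CES-D8 score"
)
abline(v = mean(cesd8), col = "red")   # mean
abline(v = median(cesd8), lty = 2)     # median
```

With the real data, the same call would take `df$cesd8` in place of the simulated vector. `hist()` drops missing values on its own.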
Depression varies with age because biological, psychological, and social conditions change over the life course: it often rises in adolescence due to developmental and social stress, stabilizes or declines in midlife as coping improves, and may increase again in older age due to illness, loss, and loneliness.
Overall, we expect depression scores to increase with age.
To test this hypothesis, we estimate a linear regression of the CES-D8 score on age:
model = lm(cesd8 ~ age, data=df)
summary(model)
##
## Call:
## lm(formula = cesd8 ~ age, data = df)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.81341 -0.36524 -0.09391  0.26890  2.36951
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.539036   0.007314  210.43   <2e-16 ***
## age         0.003049   0.000133   22.92   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4975 on 39843 degrees of freedom
## (311 observations deleted due to missingness)
## Multiple R-squared: 0.01301, Adjusted R-squared: 0.01299
## F-statistic: 525.3 on 1 and 39843 DF, p-value: < 2.2e-16
plot(df$age,df$cesd8)
abline(model, col="red")
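The age coefficient is positive and statistically significant, which is in line with the hypothesis, but the effect is small: the model explains about 1.3% of the variance in CES-D8 scores (R-squared 0.013). To get a feel for its size, the sketch below plugs the reported coefficients into the regression equation for two ages. The ages 20 and 80 and the helper name `predict_m1()` are illustrative choices.

```r
# Coefficients copied from the summary(model) output.
b0 <- 1.539036   # intercept: predicted score at age 0 (an extrapolation)
b1 <- 0.003049   # change in CES-D8 score per additional year of age

# Hypothetical helper: fitted value of the simple regression for a given age.
predict_m1 <- function(age) b0 + b1 * age

pred_20 <- predict_m1(20)
pred_80 <- predict_m1(80)
age_gap <- pred_80 - pred_20   # equals 60 * b1

round(c(age_20 = pred_20, age_80 = pred_80, difference = age_gap), 3)
```

Across 60 years of age the predicted score rises by roughly 0.18 points on the 1-to-4 scale. With the fitted object at hand, `predict(model, newdata = data.frame(age = c(20, 80)))` would give the same fitted values up to rounding of the printed coefficients.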
Next, we add gender (the variable female) as a second predictor:
model2 = lm(cesd8 ~ age + female, data=df)
summary(model2)
##
## Call:
## lm(formula = cesd8 ~ age + female, data = df)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.86750 -0.35248 -0.08461  0.27188  2.43263
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.4779938  0.0076760  192.55   <2e-16 ***
## age         0.0029792  0.0001321   22.56   <2e-16 ***
## female      0.1213847  0.0049592   24.48   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4938 on 39842 degrees of freedom
## (311 observations deleted due to missingness)
## Multiple R-squared: 0.02763, Adjusted R-squared: 0.02758
## F-statistic: 566.1 on 2 and 39842 DF, p-value: < 2.2e-16
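The age coefficient barely changes once gender is included (0.00298 compared with 0.00305), and R-squared rises from 0.013 to 0.028. The sketch below works through the reported coefficients to compare predicted scores for men and women of the same age. It assumes that `female` is coded 1 for women and 0 for men, which the output does not state. The age of 50 and the helper name `predict_m2()` are illustrative.

```r
# Coefficients copied from the summary(model2) output.
b <- c(intercept = 1.4779938, age = 0.0029792, female = 0.1213847)

# Hypothetical helper: fitted value of model2.
# Assumption: female = 1 for women, 0 for men.
predict_m2 <- function(age, female) {
  unname(b["intercept"] + b["age"] * age + b["female"] * female)
}

men_50   <- predict_m2(50, 0)
women_50 <- predict_m2(50, 1)

# The gender gap expressed in "years of age": how many additional years
# produce the same predicted increase as the female coefficient.
gap_in_years <- unname(b["female"] / b["age"])

# Gain in explained variance from adding gender (values from both summaries).
r2_gain <- 0.02763 - 0.01301

round(c(men_50 = men_50, women_50 = women_50,
        gap_in_years = gap_in_years, r2_gain = r2_gain), 4)
```

Under this coding, the predicted difference between women and men (about 0.12 points) corresponds to roughly 40 years of age in the same model. Both models use the same 39,845 complete cases, so `anova(model, model2)` could be used to test the improvement formally.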