Depression in Europe

Define Depression Scale CES-D8

In the ESS data, depression is measured with the CES-D8 scale. First we load the data, check the reliability of the scale, and compute the score.
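
The loading and scoring code is not shown here. As a minimal sketch of what this step could look like, the block below scores the scale on simulated data. It assumes the eight ESS items are named fltdpr, flteeff, slprl, wrhpp, fltlnl, enjlf, fltsd and cldgng, that they are coded 1 to 4, and that the two positively worded items (wrhpp, enjlf) are reverse-coded before averaging. These are assumptions, not the code that produced the results in this document. Cronbach's alpha is computed from its textbook formula so that no extra package is needed.

```r
# Simulated stand-in for the ESS data (assumption: items coded 1-4)
set.seed(1)
n <- 500
latent <- rnorm(n)  # shared factor so that the items correlate

make_item <- function(sign = 1) {
  raw <- sign * latent + rnorm(n)
  as.integer(cut(raw, breaks = c(-Inf, -0.5, 0.5, 1.5, Inf)))  # values 1..4
}

items <- data.frame(
  fltdpr = make_item(), flteeff = make_item(),
  slprl  = make_item(), wrhpp   = make_item(-1),  # positively worded
  fltlnl = make_item(), enjlf   = make_item(-1),  # positively worded
  fltsd  = make_item(), cldgng  = make_item()
)

# Reverse-code the positive items so that high values mean more depression
items$wrhpp <- 5 - items$wrhpp
items$enjlf <- 5 - items$enjlf

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of sum)
k <- ncol(items)
alpha <- k / (k - 1) * (1 - sum(sapply(items, var)) / var(rowSums(items)))

# Scale score: mean of the eight items, so it stays on the 1-4 range
cesd8 <- rowMeans(items)

round(alpha, 2)
summary(cesd8)
```

On the real data, missing item responses need an explicit decision, for example averaging over the answered items with `rowMeans(items, na.rm = TRUE)` or keeping complete cases only.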

Depression values across Europe can then be summarized as follows:

summary(df$cesd8)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   1.000   1.375   1.625   1.695   2.000   4.000     799

Use a regression model to describe the effect of gender on depression in Europe. To insert an R code chunk (```{r}) in RStudio, press Ctrl+Alt+I.

tapply(df$cesd8, df$gndr, mean, na.rm = TRUE)
##     Male   Female 
## 1.628996 1.752677
df$female = as.numeric(df$gndr == "Female")
table(df$female)
## 
##     0     1 
## 18760 21396
model = lm(cesd8 ~ female, data=df)
summary(model)
## 
## Call:
## lm(formula = cesd8 ~ female, data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.7527 -0.3777 -0.1277  0.2473  2.3710 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.628996   0.003654   445.8   <2e-16 ***
## female      0.123682   0.005007    24.7   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4956 on 39355 degrees of freedom
##   (799 observations deleted due to missingness)
## Multiple R-squared:  0.01527,    Adjusted R-squared:  0.01524 
## F-statistic: 610.2 on 1 and 39355 DF,  p-value: < 2.2e-16
tapply(df$cesd8, df$health, mean, na.rm = TRUE)
## Very good      Good      Fair       Bad  Very bad 
##  1.472164  1.625812  1.867098  2.265404  2.616609
lm(cesd8 ~ health, data=df)
## 
## Call:
## lm(formula = cesd8 ~ health, data = df)
## 
## Coefficients:
##    (Intercept)      healthGood      healthFair       healthBad  healthVery bad  
##         1.4722          0.1536          0.3949          0.7932          1.1444
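
The coefficients above can be read against the group means from tapply(): the intercept is the mean of the reference category ("Very good"), and each further coefficient is the difference between a category's mean and that reference. For example, 1.4722 + 0.1536 = 1.6258 reproduces the mean for "Good". The same holds for gender, where 1.628996 + 0.123682 gives the female mean. A small sketch on made-up data (the data frame `toy` and its values are hypothetical) shows the identity:

```r
# Hypothetical toy data: a score and a three-level factor
toy <- data.frame(
  score  = c(1.2, 1.4, 1.6, 1.8, 2.4, 2.6),
  health = factor(c("Good", "Good", "Fair", "Fair", "Bad", "Bad"),
                  levels = c("Good", "Fair", "Bad"))
)

group_means <- tapply(toy$score, toy$health, mean)
fit <- lm(score ~ health, data = toy)

# Intercept = mean of the reference level ("Good")
# Other coefficients = difference from the reference mean
rebuilt <- c(coef(fit)[1], coef(fit)[1] + coef(fit)[-1])

all.equal(unname(rebuilt), unname(as.vector(group_means)))
```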

In general, the CES-D8 scores show a unimodal, slightly right-skewed distribution:

summary(df$cesd8)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   1.000   1.375   1.625   1.695   2.000   4.000     799
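
The summary alone only hints at the shape: the mean (1.695) lies above the median (1.625). A histogram shows it more directly. The sketch below uses simulated scores in place of df$cesd8, since the data are not available here. The bin width of 0.125 assumes the score is the mean of eight items coded 1 to 4, and the response probabilities are arbitrary.

```r
# Simulated stand-in for df$cesd8: mean of eight items coded 1-4,
# with low categories more likely (assumption, for illustration only)
set.seed(42)
sim_items <- matrix(
  sample(1:4, size = 8 * 2000, replace = TRUE,
         prob = c(0.55, 0.30, 0.10, 0.05)),
  ncol = 8
)
sim_cesd8 <- rowMeans(sim_items)

# Bins of width 1/8, matching the spacing of the possible score values
h <- hist(sim_cesd8, breaks = seq(1, 4, by = 0.125),
          main = "Simulated CES-D8 scores", xlab = "CES-D8 score")
```

On the real data the corresponding call would be `hist(df$cesd8, breaks = seq(1, 4, by = 0.125))`.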

Hypothesis

Depression varies with age because biological, psychological, and social conditions change over the life course: it often rises in adolescence due to developmental and social stress, stabilizes or declines in midlife as coping improves, and may increase again in older age due to illness, loss, and loneliness.

Overall, we expect depression scores to increase with age.

To test our hypothesis, we estimate a linear regression model.

model_1= lm(cesd8 ~ agea, data=df)
summary(model_1)
## 
## Call:
## lm(formula = cesd8 ~ agea, data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.03759 -0.36137 -0.07893  0.26964  2.39464 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.53726    0.04843  31.744  < 2e-16 ***
## agea16       0.07411    0.05472   1.354 0.175597    
## agea17       0.03023    0.05493   0.550 0.582030    
## agea18       0.10798    0.05461   1.977 0.048013 *  
## agea19       0.12499    0.05370   2.328 0.019935 *  
## agea20       0.11668    0.05428   2.150 0.031578 *  
## agea21       0.09496    0.05439   1.746 0.080804 .  
## agea22       0.09448    0.05425   1.742 0.081582 .  
## agea23       0.10096    0.05440   1.856 0.063471 .  
## agea24       0.10552    0.05365   1.967 0.049232 *  
## agea25       0.15896    0.05388   2.950 0.003177 ** 
## agea26       0.11634    0.05404   2.153 0.031342 *  
## agea27       0.09818    0.05426   1.809 0.070394 .  
## agea28       0.13110    0.05380   2.437 0.014825 *  
## agea29       0.13511    0.05347   2.527 0.011508 *  
## agea30       0.12470    0.05335   2.338 0.019416 *  
## agea31       0.09238    0.05364   1.722 0.085059 .  
## agea32       0.09954    0.05306   1.876 0.060645 .  
## agea33       0.12745    0.05249   2.428 0.015183 *  
## agea34       0.10673    0.05269   2.026 0.042797 *  
## agea35       0.12177    0.05272   2.310 0.020900 *  
## agea36       0.05820    0.05297   1.099 0.271892    
## agea37       0.11597    0.05299   2.188 0.028642 *  
## agea38       0.07890    0.05232   1.508 0.131556    
## agea39       0.06810    0.05239   1.300 0.193618    
## agea40       0.11542    0.05273   2.189 0.028618 *  
## agea41       0.12506    0.05237   2.388 0.016955 *  
## agea42       0.09132    0.05228   1.747 0.080680 .  
## agea43       0.10377    0.05231   1.984 0.047285 *  
## agea44       0.14334    0.05220   2.746 0.006032 ** 
## agea45       0.14633    0.05240   2.792 0.005235 ** 
## agea46       0.12806    0.05253   2.438 0.014774 *  
## agea47       0.10534    0.05234   2.012 0.044179 *  
## agea48       0.09292    0.05229   1.777 0.075555 .  
## agea49       0.11420    0.05197   2.197 0.028005 *  
## agea50       0.11947    0.05216   2.291 0.021992 *  
## agea51       0.15321    0.05214   2.938 0.003301 ** 
## agea52       0.15323    0.05194   2.950 0.003181 ** 
## agea53       0.15182    0.05185   2.928 0.003413 ** 
## agea54       0.14496    0.05202   2.786 0.005331 ** 
## agea55       0.18683    0.05210   3.586 0.000336 ***
## agea56       0.16667    0.05199   3.206 0.001349 ** 
## agea57       0.18987    0.05188   3.660 0.000252 ***
## agea58       0.17107    0.05163   3.314 0.000921 ***
## agea59       0.20262    0.05194   3.901 9.59e-05 ***
## agea60       0.17041    0.05196   3.280 0.001040 ** 
## agea61       0.13789    0.05201   2.651 0.008022 ** 
## agea62       0.16134    0.05230   3.085 0.002038 ** 
## agea63       0.14219    0.05194   2.737 0.006195 ** 
## agea64       0.15870    0.05160   3.076 0.002102 ** 
## agea65       0.13682    0.05200   2.631 0.008510 ** 
## agea66       0.16096    0.05196   3.098 0.001950 ** 
## agea67       0.15599    0.05187   3.007 0.002636 ** 
## agea68       0.16188    0.05178   3.126 0.001773 ** 
## agea69       0.15090    0.05208   2.898 0.003762 ** 
## agea70       0.17359    0.05228   3.321 0.000899 ***
## agea71       0.14735    0.05257   2.803 0.005067 ** 
## agea72       0.15880    0.05256   3.021 0.002519 ** 
## agea73       0.20567    0.05269   3.904 9.49e-05 ***
## agea74       0.22546    0.05259   4.287 1.81e-05 ***
## agea75       0.23595    0.05273   4.475 7.68e-06 ***
## agea76       0.24728    0.05320   4.648 3.36e-06 ***
## agea77       0.26888    0.05350   5.026 5.02e-07 ***
## agea78       0.33565    0.05412   6.202 5.62e-10 ***
## agea79       0.21659    0.05503   3.936 8.30e-05 ***
## agea80       0.25237    0.05483   4.603 4.18e-06 ***
## agea81       0.29689    0.05606   5.296 1.19e-07 ***
## agea82       0.37678    0.05617   6.707 2.01e-11 ***
## agea83       0.29440    0.05688   5.176 2.28e-07 ***
## agea84       0.43039    0.05746   7.490 7.02e-14 ***
## agea85       0.45192    0.05931   7.620 2.60e-14 ***
## agea86       0.42201    0.06095   6.923 4.47e-12 ***
## agea87       0.50033    0.06364   7.861 3.90e-15 ***
## agea88       0.27524    0.06725   4.093 4.27e-05 ***
## agea89       0.30149    0.06917   4.359 1.31e-05 ***
## agea90       0.46923    0.05832   8.046 8.78e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4939 on 39052 degrees of freedom
##   (1028 observations deleted due to missingness)
## Multiple R-squared:  0.02197,    Adjusted R-squared:  0.02009 
## F-statistic:  11.7 on 75 and 39052 DF,  p-value: < 2.2e-16
table(df$agea)
## 
##  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34 
## 107 380 371 389 460 409 404 410 397 461 439 429 415 447 479 496 468 523 604 579 
##  35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54 
## 571 541 540 633 625 567 625 637 633 659 615 602 633 644 698 662 666 700 721 684 
##  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72  73  74 
## 674 688 720 776 705 696 689 637 704 783 694 710 714 738 671 649 599 600 587 594 
##  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90 
## 575 518 485 428 375 377 314 308 282 265 212 184 148 118 103 243
class(df$agea)
## [1] "factor"
# Why so many parameters? agea is stored as a factor (see class() above), so lm() creates one dummy variable per age value.
# Fix: convert age to numeric. as.numeric() applied directly to a factor returns the internal level codes, so convert to character first: as.numeric(as.character(df$agea))
df$age_num = as.numeric(as.character(df$agea))  # store numeric age in the data frame
model_2 = lm(cesd8 ~ age_num, data = df)
plot(df$age_num, df$cesd8)
abline(model_2, col = "red")  # fitted regression line

plot(df$agea,df$cesd8)

# What is happening here? df$agea is a factor, so plot() draws one boxplot per age level instead of a scatterplot.
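
The comments on the age conversion point at a common pitfall with factors. As a small sketch (the vector `age_f` is made up for illustration), `as.numeric()` applied directly to a factor returns the internal level codes rather than the labels:

```r
# Made-up factor whose labels look like numbers
age_f <- factor(c("15", "16", "90", "16"))

as.numeric(age_f)                 # level codes: 1 2 3 2
as.numeric(as.character(age_f))   # actual ages: 15 16 90 16
as.numeric(levels(age_f))[age_f]  # same result, converts each level only once
```

With the level codes, age 90 would silently become 3 here, and a regression on the codes would still run without any warning.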

Describe the association between depression and the age of respondents using correlation and regression analysis. Then add gender to build a multiple regression model for depression.
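
For the correlation part, one option is Pearson's r via `cor.test()`. The sketch below runs it on simulated data, because the ESS file is not available here. The variable names `age_num` and `cesd8` mirror the ones used in this document, and the simulated intercept, slope and noise level are arbitrary choices.

```r
# Simulated stand-in for the ESS variables (illustration only)
set.seed(7)
sim <- data.frame(age_num = sample(15:90, 1000, replace = TRUE))
sim$cesd8 <- 1.5 + 0.003 * sim$age_num + rnorm(1000, sd = 0.5)

# Pearson correlation with significance test and confidence interval
ct <- cor.test(sim$age_num, sim$cesd8)
ct$estimate
ct$conf.int

# With a single predictor, r squared equals the R-squared of the regression
fit <- lm(cesd8 ~ age_num, data = sim)
all.equal(unname(ct$estimate)^2, summary(fit)$r.squared)
```

On the real data the call would be `cor.test(as.numeric(as.character(df$agea)), df$cesd8)`.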

model_3 = lm(cesd8 ~ age_num+female, data=df)
summary(model_3)
## 
## Call:
## lm(formula = cesd8 ~ age_num + female, data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.86258 -0.35092 -0.08342  0.26825  2.43407 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.4784555  0.0077136  191.67   <2e-16 ***
## age_num     0.0029157  0.0001329   21.94   <2e-16 ***
## female      0.1217120  0.0049864   24.41   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.492 on 39125 degrees of freedom
##   (1028 observations deleted due to missingness)
## Multiple R-squared:  0.02734,    Adjusted R-squared:  0.02729 
## F-statistic: 549.9 on 2 and 39125 DF,  p-value: < 2.2e-16
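{
  # Use numeric age on the x-axis; the factor df$agea would give boxplots at positions 1..76
  age_x = as.numeric(as.character(df$agea))
  plot(age_x, df$cesd8, ylim = c(1, 2.5), pch = 16, col = "#ff000033")
  # abline() needs exactly one intercept and one slope: draw the fitted line for men (female = 0)
  abline(a = coef(model_3)["(Intercept)"], b = coef(model_3)["age_num"])
}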

# Age is significantly associated with depression (p < 0.01)
# Gender is significantly associated with depression (p < 0.01)
# Converting the age values to numeric did not seem to improve the plot
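
To make the size of these effects concrete, the coefficients of model_3 reported above can be plugged into the regression equation by hand. The example below does this for a 60-year-old woman and a 60-year-old man. The age of 60 and the helper `predict_cesd8` are arbitrary illustrations.

```r
# Coefficients copied from the summary(model_3) output above
b0       <- 1.4784555  # intercept
b_age    <- 0.0029157  # change in CES-D8 per year of age
b_female <- 0.1217120  # difference female vs. male at the same age

# Hypothetical helper: predicted score from the fitted equation
predict_cesd8 <- function(age, female) b0 + b_age * age + b_female * female

predict_cesd8(60, female = 1)  # 1.7751
predict_cesd8(60, female = 0)  # 1.6534

# Age difference (in years) that corresponds to the gender gap
b_female / b_age
```

The last line comes out at roughly 42 years, so in this model the gender gap is about as large as the predicted difference between respondents four decades apart in age. With the fitted object available, `predict(model_3, newdata = data.frame(age_num = 60, female = 1))` gives the same prediction.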