Introduction

This document analyzes depressive symptoms in Austria using data from the European Social Survey (ESS11). The analysis includes descriptive statistics and a regression model of the CES-D8 depression scale, using both unweighted and weighted approaches.

Loading Required Libraries

library(foreign)
library(likert)

## Warning: package 'likert' was built under R version 4.4.3

## Loading required package: ggplot2

## Warning: package 'ggplot2' was built under R version 4.4.3

## Loading required package: xtable

## Warning: package 'xtable' was built under R version 4.4.3

library(kableExtra)

## Warning: package 'kableExtra' was built under R version 4.4.3

library(psych)

## Warning: package 'psych' was built under R version 4.4.3

## 
## Attaching package: 'psych'

## The following objects are masked from 'package:ggplot2':
## 
##     %+%, alpha

library(effects)

## Warning: package 'effects' was built under R version 4.4.3

## Loading required package: carData

## Warning: package 'carData' was built under R version 4.4.3

## lattice theme set by effectsTheme()
## See ?effectsTheme for details.

library(pscl)

## Warning: package 'pscl' was built under R version 4.4.3

## Classes and Methods for R originally developed in the
## Political Science Computational Laboratory
## Department of Political Science
## Stanford University (2002-2015),
## by and under the direction of Simon Jackman.
## hurdle and zeroinfl functions by Achim Zeileis.

Loading and Preparing the Data

setwd("C:/Users/Vinicius Alpha/Downloads/")

suppressWarnings(df <- read.spss("ESS11.sav", to.data.frame = TRUE))

# Filter for Austria only
df_AT <- subset(df, cntry == "Austria")

# Print sample size
nrow(df_AT)

## [1] 2354

The Austrian sample consists of 2354 respondents.

Depression Scale (CES-D8)

We construct the CES-D8 scale by using eight self-reported depression-related items. Two positively worded items are reverse-coded.

# Convert items to numeric
df_AT$d1 <- as.numeric(df_AT$fltdpr)
df_AT$d2 <- as.numeric(df_AT$flteeff)
df_AT$d3 <- as.numeric(df_AT$slprl)
df_AT$d4 <- as.numeric(df_AT$wrhpp)
df_AT$d5 <- as.numeric(df_AT$fltlnl)
df_AT$d6 <- as.numeric(df_AT$enjlf)
df_AT$d7 <- as.numeric(df_AT$fltsd)
df_AT$d8 <- as.numeric(df_AT$cldgng)

# Reverse score the positive items
df_AT$d4 <- 5 - df_AT$d4
df_AT$d6 <- 5 - df_AT$d6

# Ensure numeric conversion is strict
scale_items <- df_AT[, c("d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8")]
scale_items[] <- lapply(scale_items, function(x) as.numeric(as.character(x)))

# Compute depression score (mean)
df_AT$CES_D8 <- rowMeans(scale_items, na.rm = TRUE)

Likert Table and Plot

Below we describe the eight items included in the CES-D8 depression scale:

fltdpr – How much of the time during the past week did you feel depressed?
flteeff – … did you feel that everything you did was an effort?
slprl – … was your sleep restless?
wrhpp – … did you feel happy? (positive, reverse-coded)
fltlnl – … did you feel lonely?
enjlf – … did you enjoy life? (positive, reverse-coded)
fltsd – … did you feel sad?
cldgng – … did you feel you could not get going?

These items were rated on a Likert scale and are used to assess the presence and intensity of depressive symptoms experienced in the past week.

We now present the individual items as a table and visualize the results using the likert package.

likert_items <- df_AT[, c("fltdpr", "flteeff", "slprl", "wrhpp", "fltlnl", "enjlf", "fltsd", "cldgng")]

# Convert all to ordered factors 
likert_items[] <- lapply(likert_items, function(x) factor(as.character(x), ordered = TRUE))

# Create Likert object
likert_obj <- likert(likert_items)

# Table of results
summary_likert <- likert_obj$results
kable(summary_likert, caption = "Distribution of Depression Scale Responses") %>% kable_styling()

Distribution of Depression Scale Responses
Item	All or almost all of the time	Most of the time	None or almost none of the time	Some of the time
fltdpr	0.8510638	2.340425	68.170213	28.63830
flteeff	2.0442930	7.112436	52.129472	38.71380
slprl	3.1116795	7.630008	50.554135	38.70418
wrhpp	21.8990590	47.048760	3.592814	27.45937
fltlnl	1.3617021	3.276596	77.361702	18.00000
enjlf	22.0795892	41.420625	4.321780	32.17801
fltsd	1.3219616	2.089552	68.656716	27.93177
cldgng	1.2782275	2.939923	69.237324	26.54452

# Plot
plot(likert_obj)

Regression Analysis: CES-D8 Predictors

We now assess potential predictors of depressive symptoms using a linear regression model. The predictors include:

fltlnl: Perceived loneliness
sclmeet: Frequency of social meetings
health: Self-reported health
dosprt: Frequency of physical activity

We test both unweighted and weighted models.

# Transform to numeric
df_AT$fltlnl <- as.numeric(as.factor(df_AT$fltlnl))
df_AT$sclmeet <- as.numeric(as.factor(df_AT$sclmeet))
df_AT$health <- as.numeric(as.factor(df_AT$health))
df_AT$dosprt <- as.numeric(as.factor(df_AT$dosprt))

# Replace function
replace_with_mode <- function(x) {
  x <- as.character(x)
  non_na <- x[!is.na(x)]
  if (length(non_na) == 0) return(x)
  mode_value <- names(sort(table(non_na), decreasing = TRUE))[1]
  x[is.na(x)] <- mode_value
  return(as.numeric(as.character(x)))
}

# Apply to predictors
df_AT$fltlnl <- replace_with_mode(df_AT$fltlnl)
df_AT$sclmeet <- replace_with_mode(df_AT$sclmeet)
df_AT$health <- replace_with_mode(df_AT$health)
df_AT$dosprt <- replace_with_mode(df_AT$dosprt)

# Checking again if missing values are fixed
colSums(is.na(df_AT[, c("fltlnl", "sclmeet", "health", "dosprt")]))

##  fltlnl sclmeet  health  dosprt 
##       0       0       0       0

summary(df_AT$fltlnl)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   1.000   1.000   1.286   1.000   4.000

summary(df_AT$sclmeet)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   4.000   5.000   4.892   6.000   7.000

summary(df_AT$health)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   1.000   2.000   2.064   3.000   5.000

### Unweighted Model

# Subset complete cases for model variables
subset_vars = df_AT[, c("fltlnl", "sclmeet", "health", "dosprt")]
correlation = cor(subset_vars, use = "complete.obs")
print(correlation)

##              fltlnl     sclmeet     health      dosprt
## fltlnl   1.00000000 -0.04745841  0.2541255 -0.09371122
## sclmeet -0.04745841  1.00000000 -0.1943385  0.09867391
## health   0.25412550 -0.19433852  1.0000000 -0.27171649
## dosprt  -0.09371122  0.09867391 -0.2717165  1.00000000

# Fit model
model_unweighted <- lm(CES_D8 ~ fltlnl + sclmeet + health + dosprt, data = df_AT)
summary(model_unweighted)

## 
## Call:
## lm(formula = CES_D8 ~ fltlnl + sclmeet + health + dosprt, data = df_AT)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.00555 -0.23609 -0.05078  0.19148  1.67036 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.795920   0.039992  19.902   <2e-16 ***
## fltlnl       0.381485   0.011619  32.832   <2e-16 ***
## sclmeet     -0.004519   0.005597  -0.807    0.419    
## health       0.177576   0.008319  21.346   <2e-16 ***
## dosprt      -0.003870   0.002861  -1.353    0.176    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3229 on 2346 degrees of freedom
##   (3 observations deleted due to missingness)
## Multiple R-squared:  0.4759, Adjusted R-squared:  0.475 
## F-statistic: 532.5 on 4 and 2346 DF,  p-value: < 2.2e-16

Weighted Model (using `dweight`)

# Fit weighted model
model_weighted <- lm(CES_D8 ~ fltlnl + sclmeet + health + dosprt, data = df_AT, weights = dweight)
summary(model_weighted)

## 
## Call:
## lm(formula = CES_D8 ~ fltlnl + sclmeet + health + dosprt, data = df_AT, 
##     weights = dweight)
## 
## Weighted Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.66127 -0.21712 -0.03533  0.17408  2.35889 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.773847   0.039772  19.457   <2e-16 ***
## fltlnl       0.388493   0.012009  32.350   <2e-16 ***
## sclmeet     -0.002070   0.005538  -0.374   0.7086    
## health       0.180318   0.008363  21.561   <2e-16 ***
## dosprt      -0.005710   0.002828  -2.019   0.0436 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3171 on 2346 degrees of freedom
##   (3 observations deleted due to missingness)
## Multiple R-squared:  0.463,  Adjusted R-squared:  0.462 
## F-statistic: 505.6 on 4 and 2346 DF,  p-value: < 2.2e-16

Model Comparison

cat("R-squared unweighted: ", round(summary(model_unweighted)$r.squared, 3), "\n")

## R-squared unweighted:  0.476

cat("R-squared weighted: ", round(summary(model_weighted)$r.squared, 3), "\n")

## R-squared weighted:  0.463

The weighted model shows a slightly different R-squared, indicating the influence of survey weights on model performance.

Preparing Data for Logistc Regression

# Create binary outcome for clinically significant depression (CES-D >= 2)
hist(df_AT$CES_D8, main = "Distibution of CES_08", xlab = "CES-D Score", col = "brown")

df_AT$depr_clinical <- ifelse(df_AT$CES_D8 >= 2.0, 1, 0)

df_AT$gndr <- factor(df_AT$gndr)
df_AT$marsts <- factor(df_AT$marsts)
df_AT$uempla <- factor(df_AT$uempla)

# classify age into four group

df_AT$agea <- as.numeric(as.character(df_AT$agea))
quantile(df_AT$agea, c(1/4, 1/2, 3/4), na.rm=T)

## 25% 50% 75% 
##  42  57  70

df_AT$age_group <- factor(NA, levels = c("<30", "30-45", "46-55", "56-65", ">65"))
df_AT$age_group[df_AT$agea < 30] <- "<30"
df_AT$age_group[df_AT$agea >= 30 & df_AT$agea <= 45] <- "30-45"
df_AT$age_group[df_AT$agea >= 46 & df_AT$agea <= 55] <- "46-55"
df_AT$age_group[df_AT$agea >= 56 & df_AT$agea <=65] <- "56-65"
df_AT$age_group[df_AT$agea > 66] <- ">66"

## Warning in `[<-.factor`(`*tmp*`, df_AT$agea > 66, value = structure(c(4L, :
## invalid factor level, NA generated

Frequency Distribution

table(df_AT$depr_clinical)

## 
##    0    1 
## 1864  487

table(df_AT$gndr)

## 
##   Male Female 
##    993   1361

table(df_AT$age_group)

## 
##   <30 30-45 46-55 56-65   >65 
##   243   458   413   447     0

prop.table(table(df_AT$depr_clinical))

## 
##         0         1 
## 0.7928541 0.2071459

Manual Odds Ratio and Simple Logistic Regression

table(df_AT$depr_clinical, df_AT$gndr)

##    
##     Male Female
##   0  828   1036
##   1  162    325

t <- table(df_AT$depr_clinical, df_AT$gndr)
rownames(t)

## [1] "0" "1"

colnames(t)

## [1] "Male"   "Female"

odds_female <- t["1", "Female"] / t["0", "Female"]
odds_male   <- t["1", "Male"] / t["0", "Male"]
OR_gender   <- odds_female / odds_male
OR_gender

## [1] 1.603389

# Simple logistic regression with only gender
aModel_simple <- glm(depr_clinical ~ gndr, data = df_AT, family = binomial)
summary(aModel_simple)

## 
## Call:
## glm(formula = depr_clinical ~ gndr, family = binomial, data = df_AT)
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -1.63142    0.08591 -18.990  < 2e-16 ***
## gndrFemale   0.47212    0.10688   4.417 9.99e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 2398.7  on 2350  degrees of freedom
## Residual deviance: 2378.6  on 2349  degrees of freedom
##   (3 observations deleted due to missingness)
## AIC: 2382.6
## 
## Number of Fisher Scoring iterations: 4

exp(coef(aModel_simple))

## (Intercept)  gndrFemale 
##   0.1956522   1.6033891

exp(confint(aModel_simple))

## Waiting for profiling to be done...

##                 2.5 %    97.5 %
## (Intercept) 0.1647876 0.2308224
## gndrFemale  1.3022718 1.9803814

Multivariate Logistic Regression

aModel <- glm(depr_clinical ~ gndr + age_group + hincfel + health + marsts + uempla,
              data = df_AT, family = binomial)
summary(aModel)

## 
## Call:
## glm(formula = depr_clinical ~ gndr + age_group + hincfel + health + 
##     marsts + uempla, family = binomial, data = df_AT)
## 
## Coefficients:
##                                                                           Estimate
## (Intercept)                                                               -3.41619
## gndrFemale                                                                 0.27175
## age_group30-45                                                            -0.24285
## age_group46-55                                                            -0.85104
## age_group56-65                                                            -1.50075
## hincfelCoping on present income                                           -0.06307
## hincfelDifficult on present income                                         0.70677
## hincfelVery difficult on present income                                    1.44516
## health                                                                     1.13408
## marstsIn a legally registered civil union                                -13.36590
## marstsLegally divorced/Civil union dissolved                               0.37492
## marstsWidowed/Civil partner died                                           1.45960
## marstsNone of these (NEVER married or in legally registered civil union)  -0.11352
## uemplaMarked                                                               0.41621
##                                                                          Std. Error
## (Intercept)                                                                 0.73865
## gndrFemale                                                                  0.18969
## age_group30-45                                                              0.26021
## age_group46-55                                                              0.32076
## age_group56-65                                                              0.34917
## hincfelCoping on present income                                             0.23736
## hincfelDifficult on present income                                          0.28224
## hincfelVery difficult on present income                                     0.48160
## health                                                                      0.12523
## marstsIn a legally registered civil union                                 598.90712
## marstsLegally divorced/Civil union dissolved                                0.68018
## marstsWidowed/Civil partner died                                            0.80662
## marstsNone of these (NEVER married or in legally registered civil union)    0.67484
## uemplaMarked                                                                0.39768
##                                                                          z value
## (Intercept)                                                               -4.625
## gndrFemale                                                                 1.433
## age_group30-45                                                            -0.933
## age_group46-55                                                            -2.653
## age_group56-65                                                            -4.298
## hincfelCoping on present income                                           -0.266
## hincfelDifficult on present income                                         2.504
## hincfelVery difficult on present income                                    3.001
## health                                                                     9.056
## marstsIn a legally registered civil union                                 -0.022
## marstsLegally divorced/Civil union dissolved                               0.551
## marstsWidowed/Civil partner died                                           1.810
## marstsNone of these (NEVER married or in legally registered civil union)  -0.168
## uemplaMarked                                                               1.047
##                                                                          Pr(>|z|)
## (Intercept)                                                              3.75e-06
## gndrFemale                                                                0.15197
## age_group30-45                                                            0.35067
## age_group46-55                                                            0.00797
## age_group56-65                                                           1.72e-05
## hincfelCoping on present income                                           0.79044
## hincfelDifficult on present income                                        0.01227
## hincfelVery difficult on present income                                   0.00269
## health                                                                    < 2e-16
## marstsIn a legally registered civil union                                 0.98219
## marstsLegally divorced/Civil union dissolved                              0.58149
## marstsWidowed/Civil partner died                                          0.07037
## marstsNone of these (NEVER married or in legally registered civil union)  0.86642
## uemplaMarked                                                              0.29529
##                                                                             
## (Intercept)                                                              ***
## gndrFemale                                                                  
## age_group30-45                                                              
## age_group46-55                                                           ** 
## age_group56-65                                                           ***
## hincfelCoping on present income                                             
## hincfelDifficult on present income                                       *  
## hincfelVery difficult on present income                                  ** 
## health                                                                   ***
## marstsIn a legally registered civil union                                   
## marstsLegally divorced/Civil union dissolved                                
## marstsWidowed/Civil partner died                                         .  
## marstsNone of these (NEVER married or in legally registered civil union)    
## uemplaMarked                                                                
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 938.33  on 894  degrees of freedom
## Residual deviance: 764.30  on 881  degrees of freedom
##   (1459 observations deleted due to missingness)
## AIC: 792.3
## 
## Number of Fisher Scoring iterations: 13

exp(coef(aModel))

##                                                              (Intercept) 
##                                                             3.283725e-02 
##                                                               gndrFemale 
##                                                             1.312260e+00 
##                                                           age_group30-45 
##                                                             7.843905e-01 
##                                                           age_group46-55 
##                                                             4.269698e-01 
##                                                           age_group56-65 
##                                                             2.229637e-01 
##                                          hincfelCoping on present income 
##                                                             9.388731e-01 
##                                       hincfelDifficult on present income 
##                                                             2.027440e+00 
##                                  hincfelVery difficult on present income 
##                                                             4.242540e+00 
##                                                                   health 
##                                                             3.108311e+00 
##                                marstsIn a legally registered civil union 
##                                                             1.567709e-06 
##                             marstsLegally divorced/Civil union dissolved 
##                                                             1.454871e+00 
##                                         marstsWidowed/Civil partner died 
##                                                             4.304237e+00 
## marstsNone of these (NEVER married or in legally registered civil union) 
##                                                             8.926907e-01 
##                                                             uemplaMarked 
##                                                             1.516209e+00

suppressWarnings(exp(confint(aModel)))

## Waiting for profiling to be done...

##                                                                                2.5 %
## (Intercept)                                                              0.006424371
## gndrFemale                                                               0.906242694
## age_group30-45                                                           0.470450554
## age_group46-55                                                           0.225623225
## age_group56-65                                                           0.110927432
## hincfelCoping on present income                                          0.592966778
## hincfelDifficult on present income                                       1.167999603
## hincfelVery difficult on present income                                  1.657105885
## health                                                                   2.444331935
## marstsIn a legally registered civil union                                         NA
## marstsLegally divorced/Civil union dissolved                             0.433769660
## marstsWidowed/Civil partner died                                         0.964083577
## marstsNone of these (NEVER married or in legally registered civil union) 0.268861305
## uemplaMarked                                                             0.681660538
##                                                                                97.5 %
## (Intercept)                                                              1.242233e-01
## gndrFemale                                                               1.908325e+00
## age_group30-45                                                           1.307383e+00
## age_group46-55                                                           7.952167e-01
## age_group56-65                                                           4.370677e-01
## hincfelCoping on present income                                          1.507335e+00
## hincfelDifficult on present income                                       3.539585e+00
## hincfelVery difficult on present income                                  1.105939e+01
## health                                                                   3.996227e+00
## marstsIn a legally registered civil union                                4.298536e+34
## marstsLegally divorced/Civil union dissolved                             6.757216e+00
## marstsWidowed/Civil partner died                                         2.430291e+01
## marstsNone of these (NEVER married or in legally registered civil union) 4.109351e+00
## uemplaMarked                                                             3.273341e+00

Model Fit Assessment

r_mcfadden <- with(summary(aModel), 1 - deviance/null.deviance)
r_nagelkerke <- with(summary(aModel), 
                    r_mcfadden / (1 - (null.deviance / nrow(aModel$data)*log(2))))
r_mcfadden

## [1] 0.185469

r_nagelkerke

## [1] 0.2562777

Conclusion

A multivariate logistic regression revealed that self-rated poor health and economic difficulty are strong predictors of clinically significant depressive symptoms. Participants reporting their income as “very difficult” were over 4 times more likely to report depression (OR = 4.45, 95% CI: 1.66–11.06), and those in poor health had three times the odds (OR = 3.12, 95% CI: 2.44–4.00). Age also played a protective role, especially in the 56–65 group (OR = 0.23). Gender showed a trend toward higher risk among females (OR = 1.60), though not statistically significant at the 95% level.

# Just testing some pltos ;)
png("effects_plot.png", width = 1200, height = 800)
plot(allEffects(aModel))

Homework 4

Vinicius Gramuglia Marin and Alexander Niesner

2025-02-16

Introduction

Loading Required Libraries

Loading and Preparing the Data

Depression Scale (CES-D8)

Likert Table and Plot

Regression Analysis: CES-D8 Predictors

Weighted Model (using `dweight`)

Model Comparison

Preparing Data for Logistc Regression

Frequency Distribution

Manual Odds Ratio and Simple Logistic Regression

Multivariate Logistic Regression

Model Fit Assessment

Conclusion

Homework 4

Vinicius Gramuglia Marin and Alexander Niesner

2025-02-16

Introduction

Loading Required Libraries

Loading and Preparing the Data

Depression Scale (CES-D8)

Likert Table and Plot

Regression Analysis: CES-D8 Predictors

Weighted Model (using dweight)

Model Comparison

Preparing Data for Logistc Regression

Frequency Distribution

Manual Odds Ratio and Simple Logistic Regression

Multivariate Logistic Regression

Model Fit Assessment

Conclusion

Weighted Model (using `dweight`)