1. Load required libraries

library(foreign)  # For reading SPSS files
library(ggplot2)  # For visualizations
library(psych)    # For descriptive statistics and reliability check

2. Data Preparation

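# Adjust the path to your own project folder. R needs forward slashes (or doubled backslashes) in Windows paths.
setwd("C:/1. SemesterResearch Methods")
df = read.spss("ESS11.sav", to.data.frame = TRUE)
df_aus = df[df$cntry == "Austria", ]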

3. Recoding CES-D8 Depression Scale (Dependent Variable)

cesd_items = c("fltdpr", "flteeff", "slprl", "wrhpp", "fltlnl", "enjlf", "fltsd", "cldgng")

for (item in cesd_items) {
  item_n = paste0(item, "_n")  # name of the numeric version of the item
  df_aus[[item_n]] = NA_real_
  df_aus[[item_n]][df_aus[[item]] == "None or almost none of the time"] = 1
  df_aus[[item_n]][df_aus[[item]] == "Some of the time"] = 2
  df_aus[[item_n]][df_aus[[item]] == "Most of the time"] = 3
  df_aus[[item_n]][df_aus[[item]] == "All or almost all of the time"] = 4
}

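# Reverse-score the two positively worded items so that higher values mean more depressive symptoms
df_aus$wrhpp_n = 5 - df_aus$wrhpp_n
df_aus$enjlf_n = 5 - df_aus$enjlf_n

# Despite its name, CESD_TOTAL is the mean of the available items (range 1-4), not their sum
df_aus$CESD_TOTAL = rowMeans(df_aus[, paste0(cesd_items, "_n")], na.rm = TRUE)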
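The same recoding can also be written with a named lookup vector, which keeps the label-to-score mapping in one place. The following is a minimal sketch on made-up toy data, not the ESS file. The names `toy` and `score_map` are illustrative and do not appear in the original script, and the sketch assumes the items arrive as factors or character labels.

```r
# Hypothetical toy data standing in for two CES-D8 items (not ESS data)
toy <- data.frame(
  fltdpr = c("None or almost none of the time", "Some of the time",
             "All or almost all of the time", NA),
  enjlf  = c("Most of the time", "All or almost all of the time",
             "None or almost none of the time", "Some of the time"),
  stringsAsFactors = TRUE
)

# Label-to-score mapping (1 = lowest frequency, 4 = highest)
score_map <- c(
  "None or almost none of the time" = 1,
  "Some of the time"                = 2,
  "Most of the time"                = 3,
  "All or almost all of the time"   = 4
)

for (item in c("fltdpr", "enjlf")) {
  # Index the lookup vector by label; missing or unmatched labels become NA
  toy[[paste0(item, "_n")]] <- unname(score_map[as.character(toy[[item]])])
}

# Reverse-score the positively worded item so that higher = more depressive symptoms
toy$enjlf_n <- 5 - toy$enjlf_n

# Mean score per respondent, ignoring missing items
toy$score <- rowMeans(toy[, c("fltdpr_n", "enjlf_n")], na.rm = TRUE)
print(toy[, c("fltdpr_n", "enjlf_n", "score")])
```

Note that with `na.rm = TRUE` a respondent who skipped every item would receive `NaN` rather than `NA`, so it may be worth checking for such rows before analysis.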

3.1 Reliability Analysis (Cronbach’s Alpha for CES-D8)

# psych::alpha is called explicitly because ggplot2 also exports a function named alpha
alpha_value = psych::alpha(df_aus[, paste0(cesd_items, "_n")])
print(alpha_value$total$raw_alpha)

Interpretation:

Cronbach’s Alpha (α = 0.80) indicates good internal consistency.

Values >0.7 are acceptable, >0.8 are good, and >0.9 are excellent.

Dropping any single item does not meaningfully raise alpha, so all eight CES-D8 items contribute to the scale.
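To make the coefficient concrete, Cronbach's alpha can be computed by hand as α = k/(k-1) · (1 - Σ var(item) / var(sum score)). The sketch below uses simulated items rather than the ESS data. `cronbach_alpha` is a hypothetical helper, not a function from `psych`, and it uses listwise deletion, which may differ from how `psych::alpha` treats missing values.

```r
# Hypothetical helper: raw Cronbach's alpha from complete cases
cronbach_alpha <- function(items) {
  items <- as.matrix(items[complete.cases(items), , drop = FALSE])
  k <- ncol(items)
  item_vars <- apply(items, 2, var)   # variance of each item
  total_var <- var(rowSums(items))    # variance of the sum score
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Simulated data: 8 items that share one latent factor plus independent noise
set.seed(1)
n <- 500
latent <- rnorm(n)
sim_items <- as.data.frame(sapply(1:8, function(j) latent + rnorm(n)))
names(sim_items) <- paste0("item", 1:8)

a_all <- cronbach_alpha(sim_items)

# Alpha when each item is dropped in turn
a_drop <- sapply(names(sim_items), function(nm) {
  cronbach_alpha(sim_items[, names(sim_items) != nm, drop = FALSE])
})

print(round(a_all, 3))
print(round(a_drop, 3))
```

With the real data, the corresponding values are stored in `alpha_value$total$raw_alpha` and in the `raw_alpha` column of `alpha_value$alpha.drop`.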

4. Descriptive Statistics and Normality Test

summary(df_aus$CESD_TOTAL)

ggplot(df_aus, aes(x = CESD_TOTAL)) +
  geom_histogram(bins = 10, fill = "skyblue", color = "black") +
  labs(title = "Distribution of Depression Scores (CES-D8)", x = "CES-D8 Score", y = "Frequency") +
  theme_minimal()

shapiro.test(df_aus$CESD_TOTAL)

Interpretation:

The Shapiro-Wilk p-value is < 0.05, so the null hypothesis of normality is rejected: CESD_TOTAL is not normally distributed.

Because common parametric tests (e.g., Pearson correlation, ANOVA, t-tests) rely on normality assumptions, we use non-parametric alternatives: Spearman correlation and Kruskal-Wallis.
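As a small illustration of this decision rule, the sketch below applies the Shapiro-Wilk test to simulated right-skewed scores and to a simulated normal sample (neither is ESS data, and the variable names are made up). `shapiro.test` accepts between 3 and 5000 non-missing values. With large samples even mild departures from normality tend to give small p-values, so a histogram or Q-Q plot (`qqnorm`, `qqline`) is a useful companion.

```r
set.seed(42)

# Simulated right-skewed scores on a 1-4 range, loosely mimicking a symptom scale
skewed <- pmin(1 + rexp(500, rate = 2), 4)

# Simulated normally distributed scores for comparison
normal <- rnorm(500, mean = 2, sd = 0.5)

sw_skewed <- shapiro.test(skewed)
sw_normal <- shapiro.test(normal)

# A small p-value means the null hypothesis of normality is rejected
print(sw_skewed$p.value)

# The W statistic is closer to 1 the more normal the sample looks
print(c(W_skewed = unname(sw_skewed$statistic),
        W_normal = unname(sw_normal$statistic)))
```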

5. Bivariate Analysis

# Income satisfaction: 1 = Very difficult ... 4 = Living comfortably
df_aus$hincfel_num = NA
df_aus$hincfel_num[df_aus$hincfel == "Living comfortably on present income"] = 4
df_aus$hincfel_num[df_aus$hincfel == "Coping on present income"] = 3
df_aus$hincfel_num[df_aus$hincfel == "Difficult on present income"] = 2
df_aus$hincfel_num[df_aus$hincfel == "Very difficult on present income"] = 1

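# Health services: as.numeric(factor(x)) returns consecutive level codes starting at 1, in level order
df_aus$stfhlth_num = as.numeric(factor(df_aus$stfhlth))

# Childhood conflicts: 1 = Always ... 5 = Never, so higher values mean fewer conflicts
df_aus$cnfpplh_num = as.numeric(factor(df_aus$cnfpplh,
                                       levels = c("Always", "Often", "Sometimes", "Hardly ever", "Never"),
                                       ordered = TRUE))

# Working hours: convert via character so the actual hours are kept, not the level codes
df_aus$wkhtot = as.numeric(as.character(df_aus$wkhtot))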
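The two conversions used here behave differently, which is easy to overlook: `as.numeric()` on a factor returns the internal level codes, while `as.numeric(as.character())` returns the numbers printed in the labels. A toy illustration follows; the vectors `hours` and `conflict` are made up for the example.

```r
# A factor whose labels happen to be numbers
hours <- factor(c("40", "20", "38", "40"))

codes  <- as.numeric(hours)                # level codes: 3 1 2 3
values <- as.numeric(as.character(hours))  # actual values: 40 20 38 40
print(codes)
print(values)

# An ordered factor with explicit levels: the codes follow the order given in `levels`
conflict <- factor(c("Never", "Always", "Sometimes"),
                   levels = c("Always", "Often", "Sometimes", "Hardly ever", "Never"),
                   ordered = TRUE)
conflict_num <- as.numeric(conflict)       # 5 1 3
print(conflict_num)
```

Level codes are appropriate for ordinal categories such as the conflict item. For a variable measured in real units, such as weekly hours, the character route is the one that preserves the values. Labels that are not numbers turn into `NA` with a warning.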

H1: Income satisfaction & depression

cor.test(df_aus$CESD_TOTAL, df_aus$hincfel_num, method = "spearman")
# Spearman's rho = -0.2531, p < 2.2e-16 (significant negative correlation)

kruskal.test(CESD_TOTAL ~ hincfel_num, data = df_aus)
# Kruskal-Wallis χ² = 181.74, p < 2.2e-16 (significant group differences)
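For readers less familiar with the two tests: Spearman's rho is the Pearson correlation of the ranks, and Kruskal-Wallis compares the rank distributions across groups. The sketch below shows both on simulated data. The variables `income_cat` and `score` are invented for illustration and do not reproduce the results reported here.

```r
set.seed(7)
n <- 300

# Simulated 4-category predictor and an outcome that decreases with it
income_cat <- sample(1:4, n, replace = TRUE)
score <- 2.5 - 0.2 * income_cat + rnorm(n, sd = 0.5)

# Spearman's rho two ways: built in, and as Pearson correlation of the ranks
rho_builtin <- cor(score, income_cat, method = "spearman")
rho_manual  <- cor(rank(score), rank(income_cat))
print(c(builtin = rho_builtin, manual = rho_manual))

# exact = FALSE avoids the warning about exact p-values when there are ties
ct <- cor.test(score, income_cat, method = "spearman", exact = FALSE)

# Kruskal-Wallis treats the predictor as unordered groups
kw <- kruskal.test(score ~ factor(income_cat))

print(ct$estimate)
print(kw$p.value)
```

Spearman uses the ordering of the categories and Kruskal-Wallis ignores it, so the two tests answer slightly different questions about the same pair of variables.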

H2: Perceived health services & depression

cor.test(df_aus$CESD_TOTAL, df_aus$stfhlth_num, method = "spearman")
# Spearman's rho = -0.1765157, p < 2.2e-16 (significant negative correlation)

kruskal.test(CESD_TOTAL ~ stfhlth_num, data = df_aus)
# Kruskal-Wallis χ² = 81.03, p = 3.16e-13 (significant group differences)

# Visualization for the report (income satisfaction, see H1)
ggplot(df_aus, aes(x = as.factor(hincfel_num), y = CESD_TOTAL)) +
  geom_boxplot(fill = "lightblue", color = "black") +
  labs(title = "Depressive Symptoms by Income Satisfaction",
       x = "Income Satisfaction (1 = Very difficult, 4 = Living comfortably)",
       y = "CES-D8 Score") +
  theme_minimal()

H3: Childhood conflicts & depression

cor.test(df_aus$CESD_TOTAL, df_aus$cnfpplh_num, method = "spearman")
# Spearman's rho = -0.2790, p < 2.2e-16 (significant negative correlation)

kruskal.test(CESD_TOTAL ~ cnfpplh_num, data = df_aus)
# Kruskal-Wallis χ² = 182.79, p < 2.2e-16 (significant group differences)

H4: Working hours & depression

cor.test(df_aus$CESD_TOTAL, df_aus$wkhtot, method = "spearman")
# Spearman's rho = 0.0097, p = 0.6569 (no significant correlation)

# Visualization for the report
ggplot(df_aus, aes(x = wkhtot, y = CESD_TOTAL)) +
  geom_point(alpha = 0.5, color = "blue") +
  geom_smooth(method = "lm", color = "red", se = FALSE) +
  labs(title = "Depressive Symptoms vs. Weekly Working Hours",
       x = "Total Weekly Working Hours",
       y = "CES-D8 Score") +
  theme_minimal()

6. Multivariate Model

Compute correlation matrix

# cor() computes Pearson correlations by default; pass method = "spearman" for rank correlations
cor_matrix = cor(df_aus[, c("CESD_TOTAL", "hincfel_num", "cnfpplh_num", "wkhtot", "stfhlth_num")],
                 use = "pairwise.complete.obs")
print(cor_matrix)

Multiple linear regression model

mlrm = lm(CESD_TOTAL ~ hincfel_num + cnfpplh_num + wkhtot + stfhlth_num, data = df_aus)
summary(mlrm)

Interpretation:

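The β values below are the unstandardized estimates from summary(mlrm), each holding the other predictors constant.

Income satisfaction (p < 2e-16, β = -0.1533): Significant predictor. Higher income satisfaction → lower depression.

Childhood conflicts (p < 2e-16, β = -0.0912): Significant predictor. Because cnfpplh_num is coded 1 = Always to 5 = Never, the negative coefficient means more frequent conflicts → higher depression.

Health services (p = 7.02e-10, β = -0.0240): Significant predictor. Better rated services → lower depression.

Work hours (p = 0.711, β = -0.0003): Not significant. The model provides no evidence that working hours are associated with depression.

Adjusted R² = 0.1406: The model explains 14.06% of the variance in depression.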
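Slopes from `lm` depend on each predictor's unit (for example, hours versus a 4-point rating), so their sizes are not directly comparable. One common way to put them on a shared scale is to standardize them, either by rescaling each slope with sd(x)/sd(y) or by refitting the model on z-scored variables. The following is a sketch on simulated data. The names `x1`, `x2`, `y` and `sim` are placeholders rather than ESS variables, and the numbers are unrelated to the results reported here.

```r
set.seed(123)
n <- 400

# Simulated predictors on very different scales
x1 <- sample(1:4, n, replace = TRUE)   # e.g. a 4-point rating
x2 <- rnorm(n, mean = 40, sd = 10)     # e.g. weekly hours
y  <- 2 - 0.15 * x1 + 0.001 * x2 + rnorm(n, sd = 0.4)
sim <- data.frame(y, x1, x2)

fit <- lm(y ~ x1 + x2, data = sim)
b <- coef(fit)[-1]                     # unstandardized slopes (intercept dropped)

# Route 1: rescale each slope by sd(x) / sd(y)
beta_rescaled <- b * sapply(sim[names(b)], sd) / sd(sim$y)

# Route 2: refit on z-scored variables
fit_z <- lm(y ~ x1 + x2, data = as.data.frame(scale(sim)))
beta_refit <- coef(fit_z)[-1]

print(round(cbind(b, beta_rescaled, beta_refit), 4))

# Adjusted R-squared is unaffected by standardizing
print(c(raw = summary(fit)$adj.r.squared, z = summary(fit_z)$adj.r.squared))
```

Both routes give the same standardized coefficients. They only apply when all variables are complete and numeric, so with missing values the model's estimation sample would need to be extracted first.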


Seminarpaper Wachter Krabichler

2025-05-05
