library(foreign)  # to read SPSS and other formats

setwd("C:/Users/User/Downloads/ssp saffie")
df = read.spss("ESS11.sav", to.data.frame = TRUE)

INTRODUCTION

Exploring Social Determinants and Depression: A Multivariate Analysis

Depression is one of the leading causes of disability, affecting about 280 million people worldwide (Paiva et al., 2023). It is characterized by persistent feelings of sadness, loss of interest, and cognitive impairment, and it carries significant social and economic consequences. Approximately 4.3% of the global population is affected by depression, which contributes substantially to the financial and health costs of the global burden of disease (Liu et al., 2024). Research highlights that social health determinants such as sleep patterns, social support, loneliness, financial stress, age, gender, and education strongly influence the onset, progression, and severity of depression (Onyekachi et al., 2024). The WHO reports higher rates of depression in high-income countries, while many cases in low- and middle-income countries (LMICs) go underdiagnosed and untreated because of limited mental health resources and stigma (WHO, 2021). Given the growing complexity of mental health issues globally, examining the relationship between social health determinants and depression is essential for designing targeted interventions and policies.

Problem Statement

Depression is a significant and growing health concern that affects millions of people worldwide and contributes to the global disease burden (WHO, 2021). It can lead to self-harm or suicide and is characterized by sadness, apathy, guilt, low self-confidence, poor sleep, fatigue, and difficulty concentrating. Social and emotional factors such as sleeplessness, loneliness, sadness, and unhappiness correlate strongly with the onset and severity of depression. Lack of sleep, loneliness, and unhappiness increase the risk of depression by impairing emotional regulation and reducing social support (Baglioni et al., 2011). Understanding these predictors is important for improving mental health interventions and reducing the global disease burden. Depression was chosen as the dependent variable because of its rising prevalence and the need to understand the social determinants that contribute to its onset and severity.

Literature review

Prevalence and relationship between depression and social health determinants

Social health determinants as explanatory variables

In Finland, the prevalence of depression remains a pressing issue, with approximately 5.9% of the population experiencing depressive symptoms annually (OECD, 2023). Recent studies highlight that social health determinants such as gender, age, income, employment status, education level, and social support play a crucial role in shaping mental health outcomes (WHO, 2022). In this study, we explore how happiness, lack of sleep, sadness, and loneliness relate to levels of depression among the population of Finland.

Sleeplessness with depression

Poor sleep patterns and sleep disturbances are strong predictors of depression. Insufficient sleep intensifies symptoms such as sadness, fatigue, and cognitive impairment (Lee et al., 2024). Dong et al. (2022) show that people who experience chronic sleep disturbances are 1.9 times more likely to develop depression than those with healthy sleep patterns. Studies also show that a supportive social network helps mitigate the adverse effects of stress and reduces the risk of depression (Onyekachi et al., 2024).

Loneliness and social isolation

Loneliness is a powerful social determinant of depression. Research indicates that loneliness increases the risk of depression by 26%, particularly among the elderly and individuals with limited social support (Cacioppo et al., 2015). Social isolation reduces opportunities for emotional validation and meaningful interaction, which are crucial for mental well-being.

Sadness with depression

Persistent sadness is a core symptom of depression, closely associated with chronic stress, loss, and emotional trauma. Studies highlight that sadness leads to prolonged depressive episodes, especially when compounded by poor coping mechanisms and inadequate social support (Kessler et al., 2010).

Happiness as a protective factor

Happiness, while less studied in the context of depression, is known to be a buffer against mental illness (Seo et al., 2018). Positive emotions and strong social connections foster resilience and promote better mental health outcomes. Individuals with higher happiness levels are more likely to engage in self-care behaviors and social activities, which reduces the risk of depression (Luis et al., 2021).

Methods

Depression was operationalized using the d20-d27 variables, which were combined to create the CES-D8 depression scale. The independent variables (sleeplessness, happiness, feelings of depression, sadness, and loneliness) record how frequently each feeling occurred during the past week. Responses fall into four categories: none or almost none of the time, some of the time, most of the time, and all or almost all of the time. Descriptive statistics summarized the sample characteristics, while correlation analysis and multivariate regression analysis explored the relationship between the social health determinants and depression. The analysis used the Finnish subset of the data.
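
The scale construction described above can be sketched on toy data. This is a minimal illustration, not the study's actual computation: the values are invented, the column names simply mirror the ESS item names used in this report, and it assumes that both positively worded items (`wrhpp` and `enjlf`) are reverse-coded before summing.

```r
# Toy data standing in for the eight ESS items (values invented for illustration).
# Each item is coded 1 = "None or almost none of the time" ... 4 = "All or almost all of the time".
toy <- data.frame(
  fltdpr  = c(1, 2, 4, NA),
  flteeff = c(1, 2, 3, 2),
  slprl   = c(2, 2, 4, 1),
  wrhpp   = c(4, 3, 1, 3),  # positively worded item
  fltlnl  = c(1, 1, 3, 1),
  enjlf   = c(4, 3, 2, 4),  # positively worded item
  fltsd   = c(1, 2, 4, 1),
  cldgng  = c(1, 2, 3, 2)
)

# Reverse the two positively worded items so that 4 always means "more symptoms".
positive_items <- c("wrhpp", "enjlf")
toy[positive_items] <- 5 - toy[positive_items]

# Sum the eight items per respondent; rowSums() returns NA for a row with any missing item.
item_names <- c("fltdpr", "flteeff", "slprl", "wrhpp",
                "fltlnl", "enjlf", "fltsd", "cldgng")
toy$CES_D8 <- rowSums(toy[, item_names])
toy$CES_D8
```

With eight items scored 1 to 4, a sum score built this way can range from 8 to 32.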

Hypotheses:

Null hypothesis (H0): There is no relationship between the independent variables and depression.

Alternative hypothesis (H1): There is an association between the independent variables and depression. Specifically:

- H1: More frequent feelings of depression (“fltdpr”) are associated with higher levels of depression.
- H1: Sleeplessness (“slprl”) is associated with higher levels of depression.
- H1: Sadness (“fltsd”) is associated with higher levels of depression.
- H1: Loneliness (“fltlnl”) is associated with higher levels of depression.
- H1: Happiness (“wrhpp”) is associated with lower levels of depression.
- H1: Being unable to get going (“cldgng”) is associated with higher levels of depression.
- H1: Feeling that everything was an effort (“flteeff”) is associated with higher levels of depression.

Results

The dataset used in this analysis is drawn from ESS11, restricted to the Finnish subset. The sample consists of 1563 respondents. The variables considered are the psychosocial determinants related to the CES-D8 depression scale, i.e. sleeplessness, loneliness, sadness, happiness, motivation, and effort, which help describe the emotional and mental health status of the sample population. The results of the analyses are presented below.

We first tested the chosen variables (fltlnl, slprl, fltsd, fltdpr, wrhpp, cldgng, and flteeff) for reliability. Cronbach’s alpha was 0.714, which indicates acceptable internal consistency for the CES-D8 depression scale. The table below shows descriptive statistics summarizing the distribution of the dependent variable (depression), with an interpretation of each value.

knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)

| Statistic | Value | Interpretation |
|---|---|---|
| Minimum | 7.00 | The lowest depression score in the dataset is 7. |
| 1st quartile (Q1) | 9.00 | 25% of the participants scored ≤ 9 on the depression scale. |
| Median (Q2) | 10.00 | The middle value in the dataset (50% of participants scored ≤ 10). |
| Mean | 10.78 | The average depression score is 10.78. |
| 3rd quartile (Q3) | 12.00 | 75% of participants scored ≤ 12 on the depression scale. |
| Maximum | 28.00 | The highest depression score in the dataset is 28. |
| NA's (missing values) | 18 | There are 18 missing values in the dataset. |
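
A table of this kind can be assembled directly in R. The sketch below uses a short invented vector of scores (not the Finnish data) and the hypothetical object names `scores` and `descriptives`; it only illustrates which base R functions yield each row.

```r
# Hypothetical depression scores (invented values, one missing) used only for illustration.
scores <- c(7, 9, 9, 10, 10, 11, 12, 12, 28, NA)

# Collect the same statistics that summary() prints, plus the count of missing values.
descriptives <- data.frame(
  statistic = c("Min", "Q1", "Median", "Mean", "Q3", "Max", "Missing"),
  value = c(
    min(scores, na.rm = TRUE),
    unname(quantile(scores, 0.25, na.rm = TRUE)),
    median(scores, na.rm = TRUE),
    mean(scores, na.rm = TRUE),
    unname(quantile(scores, 0.75, na.rm = TRUE)),
    max(scores, na.rm = TRUE),
    sum(is.na(scores))  # number of missing values
  )
)
descriptives
```

Every statistic except the missing count needs `na.rm = TRUE`, otherwise a single missing score turns the result into `NA`.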

For the bivariate associations, we used correlation analysis and regression analysis between the independent variables and depression (CES-D8).

The correlation analysis shows that CES-D8 is positively associated with loneliness, sadness, sleeplessness, difficulty getting going, and the feeling that everything was an effort, while happiness correlates negatively with depression. The p-value is below 2e-16 for every independent variable, so each one is a statistically significant predictor of depression.
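
A correlation coefficient together with its p-value can be obtained with `cor.test()`. The following is a sketch on simulated data, not a reproduction of the reported results: the variables `loneliness` and `score` and the relationship between them are invented for illustration.

```r
set.seed(42)
n <- 300

# Simulated four-point item (1-4) and an outcome built to rise with it plus noise.
loneliness <- sample(1:4, n, replace = TRUE)
score <- 8 + 1.5 * loneliness + rnorm(n)

# Pearson correlation with its significance test.
pearson <- cor.test(loneliness, score, method = "pearson")
pearson$estimate  # correlation coefficient
pearson$p.value   # p-value for H0: correlation equals zero

# Spearman's rank correlation is a common alternative for ordinal items;
# exact = FALSE avoids the warning about ties in the ranks.
spearman <- cor.test(loneliness, score, method = "spearman", exact = FALSE)
spearman$estimate
```

The sign of the estimate gives the direction of the association, which is what the hypotheses above are about, while the p-value only indicates whether it differs reliably from zero.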

The multivariate regression model

The regression analysis demonstrated a strong relationship between the predictor variables and depression (CES-D8 scale). The intercept (1.035) represents the baseline depression score when all the independent variables are at zero. All the predictor variables have significant coefficients (p < 2e-16), indicating that higher levels of these factors are associated with an increase in depression scores. The R-squared value (0.9468) suggests that the model explains approximately 94.68% of the variation in depression scores, confirming its strong predictive power. The F-statistic (4563, p < 2.2e-16) further reinforces the model’s statistical significance.
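
The quantities quoted in this paragraph (coefficients, R-squared, F-statistic, and its p-value) can all be read from the object returned by `summary()` on an `lm` fit. The sketch below uses simulated data with invented predictor names `sleep` and `lonely`; it shows where each number lives, not the study's model.

```r
set.seed(7)
n <- 250

# Simulated four-point predictors and an outcome that depends on both plus noise.
sleep  <- sample(1:4, n, replace = TRUE)
lonely <- sample(1:4, n, replace = TRUE)
outcome <- 5 + 1.0 * sleep + 0.8 * lonely + rnorm(n, sd = 1)

fit <- lm(outcome ~ sleep + lonely)
fit_summary <- summary(fit)

coef(fit_summary)                      # estimates, standard errors, t values, p-values
r_squared <- fit_summary$r.squared     # share of outcome variance explained
f_stat <- fit_summary$fstatistic       # named vector: value, numdf, dendf

# p-value of the overall F-test, computed from the F distribution.
f_p_value <- pf(f_stat[["value"]], f_stat[["numdf"]], f_stat[["dendf"]],
                lower.tail = FALSE)
```

When the outcome is a sum score and the predictors are the items that make up that score, R-squared is pushed toward 1 by construction, so it is worth keeping that in mind when reading a very high value.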

## Discussion

In testing the hypotheses stated above, the p-values (< 2e-16) fall well below the standard 0.05 significance threshold, indicating a strong relationship between the predictor variables and the dependent variable, depression.

We therefore reject the null hypothesis and accept the alternative hypotheses: the empirical findings support the view that sleeplessness, loneliness, sadness, and general emotional distress are significantly associated with higher levels of depression. The positive and statistically significant coefficients in the regression model indicate that more frequent feelings of sadness, loneliness, or restlessness, as well as struggles with motivation and effort, are strongly associated with higher CES-D8 depression scores. The explained variance of 94.68% confirms that the selected predictors effectively explain variation in depression levels, reinforcing the hypothesis that these factors are strong determinants of mental health outcomes.

## Limitations and quality criteria

1. The analysis was restricted to one country, Finland, which limits the generalizability of the findings to the other countries in the dataset.
2. The study was limited to selected social determinants and did not cover all the determinants that may be important in this context.
3. Cronbach's alpha was used to test the reliability of the selected variables; with an alpha of 0.714, they were considered sufficiently consistent to be analyzed together.
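
The reliability check in point 3 can be reproduced from the textbook definition of Cronbach's alpha. The helper `cronbach_alpha()` below is a hypothetical function written for this illustration (it is not part of base R and not necessarily the routine used in the study), and the items are simulated.

```r
# Cronbach's alpha from its standard formula:
#   alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
cronbach_alpha <- function(items) {
  items <- na.omit(items)           # listwise deletion of incomplete rows
  k <- ncol(items)                  # number of items
  item_vars <- sapply(items, var)   # variance of each item
  total_var <- var(rowSums(items))  # variance of the sum score
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Simulated items that share a common latent factor plus independent noise.
set.seed(1)
n <- 200
latent <- rnorm(n)
toy_items <- data.frame(
  item1 = latent + rnorm(n),
  item2 = latent + rnorm(n),
  item3 = latent + rnorm(n),
  item4 = latent + rnorm(n)
)
alpha_value <- cronbach_alpha(toy_items)
alpha_value
```

Alpha describes internal consistency, that is, how closely the items move together; it does not by itself establish that the scale measures depression validly.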


``` r
#extracting all countries from the data set
t1 = table(df$cntry)
t1
## 
##            Albania            Austria            Belgium           Bulgaria 
##                  0               2354               1594                  0 
##        Switzerland             Cyprus            Czechia            Germany 
##               1384                685                  0               2420 
##            Denmark            Estonia              Spain            Finland 
##                  0                  0               1844               1563 
##             France     United Kingdom            Georgia             Greece 
##               1771               1684                  0               2757 
##            Croatia            Hungary            Ireland             Israel 
##               1563               2118               2017                  0 
##            Iceland              Italy          Lithuania         Luxembourg 
##                842               2865               1365                  0 
##             Latvia         Montenegro    North Macedonia        Netherlands 
##                  0                  0                  0               1695 
##             Norway             Poland           Portugal            Romania 
##               1337               1442               1373                  0 
##             Serbia Russian Federation             Sweden           Slovenia 
##               1563                  0               1230               1248 
##           Slovakia             Turkey            Ukraine             Kosovo 
##               1442                  0                  0                  0
# we decided to subset the Finland data
DataFI = subset(df, cntry == "Finland")
#DataFI

# selected explanatory/predictor variables
# fltlnl  felt lonely, how often past week
# slprl   sleep was restless, how often past week
# fltdpr  felt depressed, how often past week
# wrhpp   felt happy, how often past week
# fltsd   felt sad, how often past week
# cldgng  could not get going, how often past week
# flteeff felt everything did as effort, how often past week

# Now I am going to combine the d20-d27 variables to create the CES-D8 depression scale
# First I have to reverse "wrhpp" and "enjlf", because they indicate wellbeing and so run in the opposite direction to the other variables
# I do this with "5 -", which keeps the numbers in the range 1-4

DataFI$wrhpp_num = as.numeric(DataFI$wrhpp)
DataFI$wrhpp_num = 5 - DataFI$wrhpp_num  # reversed happiness item

DataFI$enjlf_num = as.numeric(DataFI$enjlf)
DataFI$enjlf_num = 5 - DataFI$enjlf_num  # reversed enjoyed-life item (variable number eight)

table(DataFI$wrhpp_num)
## 
##   1   2   3   4 
## 226 910 363  60
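# Now I transform the other scales into numeric ones to calculate with them
DataFI$fltdpr_num = as.numeric(DataFI$fltdpr)
# NOTE: the next line overwrites the reversed enjlf_num created earlier, so enjlf enters the sum unreversed
DataFI$enjlf_num = as.numeric(DataFI$enjlf)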
DataFI$flteeff_num = as.numeric(DataFI$flteeff) 
DataFI$slprl_num = as.numeric(DataFI$slprl) 
DataFI$fltlnl_num = as.numeric(DataFI$fltlnl) 
DataFI$fltsd_num = as.numeric(DataFI$fltsd) 
DataFI$cldgng_num = as.numeric(DataFI$cldgng) 


# Now I can sum the items in each row to get the scale score and summarize it
DataFI$CES_D8 = rowSums(DataFI[, c("fltdpr_num","enjlf_num", "flteeff_num", "slprl_num", "wrhpp_num", "fltlnl_num", "fltsd_num", "cldgng_num")])
summary(DataFI$CES_D8)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    10.0    12.0    13.0    13.7    15.0    29.0      22
# with eight items each scored 1-4, the maximum possible value (if you are "very depressed") is 32
# on the other hand the minimum possible value is 8

# HYPOTHESES: for the 8 independent variables and depression

# H1: more frequent feelings of depression "fltdpr" are associated with higher levels of depression
# H1: enjoying life "enjlf" is associated with lower levels of depression
# H1: sleeplessness "slprl" is associated with higher levels of depression
# H1: sadness "fltsd" is associated with higher levels of depression
# H1: loneliness "fltlnl" is associated with higher levels of depression
# H1: happiness "wrhpp" is associated with lower levels of depression
# H1: "cldgng" a higher frequency of feeling unable to get going is associated with higher levels of depression
# H1: "flteeff" a higher frequency of feeling that everything was an effort is associated with higher levels of depression
# All of them are expected to correlate positively with the depression symptoms, apart from wrhpp and enjlf, which are expected to correlate negatively
# all of these variables share the same four-point frequency scale (coded 1-4)

# BIVARIATE ANALYSIS
# we use correlation and regression analysis techniques
# pairwise correlations between these variables for Finland
# (named cesd_items so that the base R function subset() is not masked)
cesd_items = DataFI[, c("fltdpr_num","enjlf_num", "slprl_num", "wrhpp_num", "fltlnl_num", "fltsd_num", "flteeff_num", "cldgng_num")]
correlation = cor(cesd_items, use = "complete.obs")
correlation
##             fltdpr_num  enjlf_num  slprl_num  wrhpp_num fltlnl_num  fltsd_num
## fltdpr_num   1.0000000 -0.3840703  0.2261975  0.3879171  0.4137524  0.4205361
## enjlf_num   -0.3840703  1.0000000 -0.1913690 -0.6298359 -0.3139027 -0.2978691
## slprl_num    0.2261975 -0.1913690  1.0000000  0.1606253  0.1263291  0.2053358
## wrhpp_num    0.3879171 -0.6298359  0.1606253  1.0000000  0.3292103  0.2962833
## fltlnl_num   0.4137524 -0.3139027  0.1263291  0.3292103  1.0000000  0.3075620
## fltsd_num    0.4205361 -0.2978691  0.2053358  0.2962833  0.3075620  1.0000000
## flteeff_num  0.4505025 -0.3237889  0.2560539  0.2848652  0.2879390  0.2866340
## cldgng_num   0.2839383 -0.2462121  0.2043640  0.2336577  0.2025135  0.1718811
##             flteeff_num cldgng_num
## fltdpr_num    0.4505025  0.2839383
## enjlf_num    -0.3237889 -0.2462121
## slprl_num     0.2560539  0.2043640
## wrhpp_num     0.2848652  0.2336577
## fltlnl_num    0.2879390  0.2025135
## fltsd_num     0.2866340  0.1718811
## flteeff_num   1.0000000  0.4093484
## cldgng_num    0.4093484  1.0000000
# calculate how depression symptoms change by increasing my independent variables by 1

modelFI = lm(DataFI$CES_D8 ~ fltdpr + enjlf + slprl + wrhpp + fltlnl + fltsd + cldgng + flteeff ,data = DataFI)
summary(modelFI)
## 
## Call:
## lm(formula = DataFI$CES_D8 ~ fltdpr + enjlf + slprl + wrhpp + 
##     fltlnl + fltsd + cldgng + flteeff, data = DataFI)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -6.725e-13 -1.600e-16  1.780e-15  3.410e-15  3.518e-13 
## 
## Coefficients:
##                                        Estimate Std. Error    t value Pr(>|t|)
## (Intercept)                           1.100e+01  5.958e-15  1.846e+15   <2e-16
## fltdprSome of the time                1.000e+00  2.762e-15  3.621e+14   <2e-16
## fltdprMost of the time                2.000e+00  7.787e-15  2.568e+14   <2e-16
## fltdprAll or almost all of the time   3.000e+00  1.057e-14  2.838e+14   <2e-16
## enjlfSome of the time                 1.000e+00  5.598e-15  1.786e+14   <2e-16
## enjlfMost of the time                 2.000e+00  5.798e-15  3.450e+14   <2e-16
## enjlfAll or almost all of the time    3.000e+00  6.101e-15  4.917e+14   <2e-16
## slprlSome of the time                 1.000e+00  1.817e-15  5.502e+14   <2e-16
## slprlMost of the time                 2.000e+00  3.358e-15  5.956e+14   <2e-16
## slprlAll or almost all of the time    3.000e+00  5.083e-15  5.902e+14   <2e-16
## wrhppSome of the time                -1.000e+00  5.451e-15 -1.835e+14   <2e-16
## wrhppMost of the time                -2.000e+00  5.674e-15 -3.525e+14   <2e-16
## wrhppAll or almost all of the time   -3.000e+00  6.151e-15 -4.878e+14   <2e-16
## fltlnlSome of the time                1.000e+00  2.408e-15  4.153e+14   <2e-16
## fltlnlMost of the time                2.000e+00  5.736e-15  3.487e+14   <2e-16
## fltlnlAll or almost all of the time   3.000e+00  8.204e-15  3.657e+14   <2e-16
## fltsdSome of the time                 1.000e+00  2.064e-15  4.845e+14   <2e-16
## fltsdMost of the time                 2.000e+00  7.152e-15  2.796e+14   <2e-16
## fltsdAll or almost all of the time    3.000e+00  1.290e-14  2.325e+14   <2e-16
## cldgngSome of the time                1.000e+00  1.837e-15  5.444e+14   <2e-16
## cldgngMost of the time                2.000e+00  3.940e-15  5.076e+14   <2e-16
## cldgngAll or almost all of the time   3.000e+00  6.842e-15  4.385e+14   <2e-16
## flteeffSome of the time               1.000e+00  1.988e-15  5.030e+14   <2e-16
## flteeffMost of the time               2.000e+00  4.270e-15  4.684e+14   <2e-16
## flteeffAll or almost all of the time  3.000e+00  7.048e-15  4.256e+14   <2e-16
##                                         
## (Intercept)                          ***
## fltdprSome of the time               ***
## fltdprMost of the time               ***
## fltdprAll or almost all of the time  ***
## enjlfSome of the time                ***
## enjlfMost of the time                ***
## enjlfAll or almost all of the time   ***
## slprlSome of the time                ***
## slprlMost of the time                ***
## slprlAll or almost all of the time   ***
## wrhppSome of the time                ***
## wrhppMost of the time                ***
## wrhppAll or almost all of the time   ***
## fltlnlSome of the time               ***
## fltlnlMost of the time               ***
## fltlnlAll or almost all of the time  ***
## fltsdSome of the time                ***
## fltsdMost of the time                ***
## fltsdAll or almost all of the time   ***
## cldgngSome of the time               ***
## cldgngMost of the time               ***
## cldgngAll or almost all of the time  ***
## flteeffSome of the time              ***
## flteeffMost of the time              ***
## flteeffAll or almost all of the time ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.281e-14 on 1516 degrees of freedom
##   (22 observations deleted due to missingness)
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 3.418e+29 on 24 and 1516 DF,  p-value: < 2.2e-16
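# the signs of the coefficients agree with the hypotheses stated above;
# the perfect fit (R-squared = 1, residuals around 1e-13) is expected, because CES_D8 is the sum of these same eight items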

# linear regression model (lm)
lm(CES_D8 ~ wrhpp_num + fltdpr_num + enjlf_num + slprl_num + fltsd_num + fltlnl_num + cldgng_num + flteeff_num, data = DataFI)
## 
## Call:
## lm(formula = CES_D8 ~ wrhpp_num + fltdpr_num + enjlf_num + slprl_num + 
##     fltsd_num + fltlnl_num + cldgng_num + flteeff_num, data = DataFI)
## 
## Coefficients:
## (Intercept)    wrhpp_num   fltdpr_num    enjlf_num    slprl_num    fltsd_num  
##   2.961e-13    1.000e+00    1.000e+00    1.000e+00    1.000e+00    1.000e+00  
##  fltlnl_num   cldgng_num  flteeff_num  
##   1.000e+00    1.000e+00    1.000e+00
# save the model to show the extended summary (wrhpp_num is left out of this model)
model = lm(CES_D8 ~ fltdpr_num + enjlf_num + slprl_num + fltsd_num + fltlnl_num + cldgng_num + flteeff_num, data = DataFI)
summary(model)
## 
## Call:
## lm(formula = CES_D8 ~ fltdpr_num + enjlf_num + slprl_num + fltsd_num + 
##     fltlnl_num + cldgng_num + flteeff_num, data = DataFI)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.13736 -0.25381 -0.02128  0.38952  2.08524 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.10474    0.10005   31.03   <2e-16 ***
## fltdpr_num   1.15642    0.03488   33.15   <2e-16 ***
## enjlf_num    0.50543    0.02027   24.93   <2e-16 ***
## slprl_num    1.00177    0.01935   51.76   <2e-16 ***
## fltsd_num    1.07435    0.02914   36.87   <2e-16 ***
## fltlnl_num   1.11412    0.02748   40.53   <2e-16 ***
## cldgng_num   1.04317    0.02202   47.37   <2e-16 ***
## flteeff_num  1.00433    0.02401   41.83   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5357 on 1533 degrees of freedom
##   (22 observations deleted due to missingness)
## Multiple R-squared:  0.9502, Adjusted R-squared:   0.95 
## F-statistic:  4178 on 7 and 1533 DF,  p-value: < 2.2e-16
# every p-value is < 2e-16 (***), well below the standard 0.05 significance threshold, indicating a strong relationship between the predictor variables and the dependent variable, depression.
# so we reject the null hypothesis and accept the alternative hypotheses.
```
