Predictors of Clinically Significant Depression-Homework4

Introduction

Depression is a prevalent mental disorder, experienced by 4-10% of the global population over their lifetime (Chapman et al., 2022). Currently, around 280 million people (3.8%) are affected globally (WHO, 2023), with depression ranked among the top contributors to the global health burden in 2019.

library(foreign)
library(ltm)

setwd("/Users/annarendez/Desktop/Master/1.Semester/Quantitavie Forschung/R Data")
df = read.spss("ESS11.sav", to.data.frame = T)

Hypotheses

H1: The prevalence of depression increases with experienced discrimination based on an individual’s sexuality (LGBQ+).

H2: The prevalence of depression increases with experienced discrimination based on an individual’s skin colour or race.

H3: The prevalence of depression decreases with age (still to be justified by the literature).

H4: The prevalence of depression is higher among females than among males (still to be justified by the literature).

Methods

The present paper aimed to investigate depression in a British population, as 15-30% of individuals do not recover from depression after two or more treatments (Chapman et al., 2022) and therefore a greater understanding of potential contributing factors is crucial for improving recovery outcomes.

df$d20 = as.numeric(df$fltdpr)
df$d21 = as.numeric(df$flteeff)
df$d22 = as.numeric(df$slprl)
df$d23 = as.numeric(df$wrhpp)
df$d24 = as.numeric(df$fltlnl)
df$d25 = as.numeric(df$enjlf)
df$d26 = as.numeric(df$fltsd)
df$d27 = as.numeric(df$cldgng)


# reverse scales of d23 and d25 (negative coding)
df$d23 = 5 - df$d23
df$d25 = 5 - df$d25
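
As a minimal sketch of the reverse coding used above, the following toy example applies the same rule to made-up answers on the 1 to 4 scale. The vector `happy` and its values are assumptions for illustration and are not ESS data.

```r
# Toy responses on the 1-4 answer scale (hypothetical values, not ESS data)
happy <- c(1, 2, 3, 4, NA)

# Reverse-code: (min + max) - x, i.e. 5 - x for a 1-4 scale
happy_rev <- (1 + 4) - happy
happy_rev
```

Missing answers stay missing, and the highest category of a positively worded item becomes the lowest depression value.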


# lookup: existing country names in the dataframe (df)
table(df$cntry)

## 
##            Albania            Austria            Belgium           Bulgaria 
##                  0               2354               1594                  0 
##        Switzerland             Cyprus            Czechia            Germany 
##               1384                685                  0               2420 
##            Denmark            Estonia              Spain            Finland 
##                  0                  0               1844               1563 
##             France     United Kingdom            Georgia             Greece 
##               1771               1684                  0               2757 
##            Croatia            Hungary            Ireland             Israel 
##               1563               2118               2017                  0 
##            Iceland              Italy          Lithuania         Luxembourg 
##                842               2865               1365                  0 
##             Latvia         Montenegro    North Macedonia        Netherlands 
##                  0                  0                  0               1695 
##             Norway             Poland           Portugal            Romania 
##               1337               1442               1373                  0 
##             Serbia Russian Federation             Sweden           Slovenia 
##               1563                  0               1230               1248 
##           Slovakia             Turkey            Ukraine             Kosovo 
##               1442                  0                  0                  0

# selected country: United Kingdom (UK hereafter)
# subset dataset: rows where cntry is "United Kingdom", all columns
# name it "df_uk" (dataset UK)
df_uk = df[df$cntry == "United Kingdom", ]
# check
table(df_uk$cntry)

## 
##            Albania            Austria            Belgium           Bulgaria 
##                  0                  0                  0                  0 
##        Switzerland             Cyprus            Czechia            Germany 
##                  0                  0                  0                  0 
##            Denmark            Estonia              Spain            Finland 
##                  0                  0                  0                  0 
##             France     United Kingdom            Georgia             Greece 
##                  0               1684                  0                  0 
##            Croatia            Hungary            Ireland             Israel 
##                  0                  0                  0                  0 
##            Iceland              Italy          Lithuania         Luxembourg 
##                  0                  0                  0                  0 
##             Latvia         Montenegro    North Macedonia        Netherlands 
##                  0                  0                  0                  0 
##             Norway             Poland           Portugal            Romania 
##                  0                  0                  0                  0 
##             Serbia Russian Federation             Sweden           Slovenia 
##                  0                  0                  0                  0 
##           Slovakia             Turkey            Ukraine             Kosovo 
##                  0                  0                  0                  0
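
The check above still lists every country with a count of 0 because subsetting a data frame keeps all factor levels. A possible way to tidy this, sketched here on a made-up data frame (`toy` and its values are assumptions, not ESS data), is `droplevels()`:

```r
# Toy data frame standing in for the ESS file (hypothetical values)
toy <- data.frame(
  cntry = factor(c("Austria", "United Kingdom", "United Kingdom", "Germany")),
  score = c(3, 9, 1, 5)
)

# Subsetting keeps all factor levels, so table() still lists absent countries
toy_uk <- toy[toy$cntry == "United Kingdom", ]
table(toy_uk$cntry)

# droplevels() removes the unused levels from every factor in the data frame
toy_uk <- droplevels(toy_uk)
uk_counts <- table(toy_uk$cntry)
uk_counts
```

Applied to the real data, this would be `df_uk = droplevels(df_uk)`. Note that it also drops unused levels of all other factor variables, which may or may not be wanted.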

We want to determine the CES-D-8 score at which depression is considered clinically significant. According to Briggs et al., a score of 9 or higher can be used to identify respondents with clinically significant symptoms.

#Not relevant for Homework 4

# calculation of Cronbach's alpha (using df_uk) to check internal consistency ("reliability") of depression items
cronbach.alpha(df_uk[,c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")], na.rm=T)

## 
## Cronbach's alpha for the 'df_uk[, c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")]' data-set
## 
## Items: 8
## Sample units: 1684
## alpha: 0.838

alpha_uk = cronbach.alpha(df_uk[,c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")], na.rm=T)

# Cronbach's alpha is 0.838 (not too high, not too low).
# it indicates to what extent the items measure the same construct (reliability of the scale).
# the goal of 0.7 ≤ α ≤ 0.9 has been achieved.
# a Cronbach's alpha between 0.8 and approximately 0.92 is considered optimal.
# the scale measures the same underlying construct (depression); no item needs to be removed.

We first calculated Cronbach's alpha to check whether our variables are related to one another. The Cronbach's alpha from our calculation is 0.84; this result shows that the variables included are related to one another.
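
To make the reliability coefficient less of a black box, here is a minimal sketch of the standard Cronbach's alpha formula in base R. The item matrix `items` is made up for illustration (not ESS data), and the sketch makes no claim about how `cronbach.alpha()` handles missing values internally.

```r
# Toy item matrix: 5 respondents x 3 items (hypothetical values, not ESS data)
items <- data.frame(
  i1 = c(1, 2, 3, 4, 2),
  i2 = c(1, 3, 3, 4, 1),
  i3 = c(2, 2, 4, 4, 1)
)

k <- ncol(items)                   # number of items
item_vars <- sapply(items, var)    # variance of each item
total_var <- var(rowSums(items))   # variance of the sum score

# alpha = k / (k - 1) * (1 - sum of item variances / variance of sum score)
alpha_manual <- k / (k - 1) * (1 - sum(item_vars) / total_var)
alpha_manual
```

The more strongly the items covary, the larger the variance of the sum score relative to the sum of the item variances, and the closer alpha gets to 1.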

# calculation of the sum score (new variable named "depression")
# score = row-wise sum of the 8 items (each coded 1-4) minus 8, giving a range of 0 to 24
# rows with a missing item get NA, because rowSums() does not drop NAs by default
df_uk$depression = rowSums(df_uk[, c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")]) -8
# optional checks: all columns (d20, d21, d22, etc.) are correctly spelled and exist in df_uk
#names(df_uk)
#df_uk$depression
#table(df_uk$depression)
#table(df$gndr)

#median(df_uk$depression,3)
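
The sum score defined above can be illustrated on a tiny made-up example. The data frame `items` is an assumption for illustration; the point is that `rowSums()` returns `NA` for any respondent with a missing item, which would be consistent with the missing scores reported later in the analysis.

```r
# Toy item scores coded 1-4 (hypothetical values); respondent 3 skipped one item
items <- data.frame(
  a = c(1, 4, 2),
  b = c(1, 4, NA),
  c = c(1, 4, 3)
)

# Sum score shifted to start at 0: subtract the number of items
score <- rowSums(items) - ncol(items)
score
```

With three items the score runs from 0 to 9; with the eight CES-D-8 items it runs from 0 to 24.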

#not relevant for homework 4

library(foreign)
library(ltm)        # required to calculate Cronbach's alpha
#install.packages("likert")
library(likert)     # create basic Likert tables and plots
library(kableExtra)


# for datasets, see release guide pp 22ff
# read data and assign to data frame

setwd("/Users/annarendez/Desktop/Master/1.Semester/Quantitavie Forschung/R Data")
df = read.spss("ESS11.sav", to.data.frame = T)
names(df)

##   [1] "name"      "essround"  "edition"   "proddate"  "idno"      "cntry"    
##   [7] "dweight"   "pspwght"   "pweight"   "anweight"  "nwspol"    "netusoft" 
##  [13] "netustm"   "ppltrst"   "pplfair"   "pplhlp"    "polintr"   "psppsgva" 
##  [19] "actrolga"  "psppipla"  "cptppola"  "trstprl"   "trstlgl"   "trstplc"  
##  [25] "trstplt"   "trstprt"   "trstep"    "trstun"    "vote"      "prtvtdat" 
##  [31] "prtvtebe"  "prtvtchr"  "prtvtccy"  "prtvtffi"  "prtvtffr"  "prtvgde1" 
##  [37] "prtvgde2"  "prtvtegr"  "prtvthhu"  "prtvteis"  "prtvteie"  "prtvteit" 
##  [43] "prtvclt1"  "prtvclt2"  "prtvclt3"  "prtvtinl"  "prtvtcno"  "prtvtfpl" 
##  [49] "prtvtept"  "prtvtbrs"  "prtvtesk"  "prtvtgsi"  "prtvtges"  "prtvtdse" 
##  [55] "prtvthch"  "prtvtdgb"  "contplt"   "donprty"   "badge"     "sgnptit"  
##  [61] "pbldmna"   "bctprd"    "pstplonl"  "volunfp"   "clsprty"   "prtcleat" 
##  [67] "prtclebe"  "prtclbhr"  "prtclccy"  "prtclgfi"  "prtclgfr"  "prtclgde" 
##  [73] "prtclegr"  "prtclihu"  "prtcleis"  "prtclfie"  "prtclfit"  "prtclclt" 
##  [79] "prtclhnl"  "prtclcno"  "prtcljpl"  "prtclgpt"  "prtclbrs"  "prtclesk" 
##  [85] "prtclgsi"  "prtclhes"  "prtcldse"  "prtclhch"  "prtcldgb"  "prtdgcl"  
##  [91] "lrscale"   "stflife"   "stfeco"    "stfgov"    "stfdem"    "stfedu"   
##  [97] "stfhlth"   "gincdif"   "freehms"   "hmsfmlsh"  "hmsacld"   "euftf"    
## [103] "lrnobed"   "loylead"   "imsmetn"   "imdfetn"   "impcntr"   "imbgeco"  
## [109] "imueclt"   "imwbcnt"   "happy"     "sclmeet"   "inprdsc"   "sclact"   
## [115] "crmvct"    "aesfdrk"   "health"    "hlthhmp"   "atchctr"   "atcherp"  
## [121] "rlgblg"    "rlgdnm"    "rlgdnbat"  "rlgdnacy"  "rlgdnafi"  "rlgdnade" 
## [127] "rlgdnagr"  "rlgdnhu"   "rlgdnais"  "rlgdnie"   "rlgdnlt"   "rlgdnanl" 
## [133] "rlgdnno"   "rlgdnapl"  "rlgdnapt"  "rlgdnrs"   "rlgdnask"  "rlgdnase" 
## [139] "rlgdnach"  "rlgdngb"   "rlgblge"   "rlgdnme"   "rlgdebat"  "rlgdeacy" 
## [145] "rlgdeafi"  "rlgdeade"  "rlgdeagr"  "rlgdehu"   "rlgdeais"  "rlgdeie"  
## [151] "rlgdelt"   "rlgdeanl"  "rlgdeno"   "rlgdeapl"  "rlgdeapt"  "rlgders"  
## [157] "rlgdeask"  "rlgdease"  "rlgdeach"  "rlgdegb"   "rlgdgr"    "rlgatnd"  
## [163] "pray"      "dscrgrp"   "dscrrce"   "dscrntn"   "dscrrlg"   "dscrlng"  
## [169] "dscretn"   "dscrage"   "dscrgnd"   "dscrsex"   "dscrdsb"   "dscroth"  
## [175] "dscrdk"    "dscrref"   "dscrnap"   "dscrna"    "ctzcntr"   "brncntr"  
## [181] "cntbrthd"  "livecnta"  "lnghom1"   "lnghom2"   "feethngr"  "facntr"   
## [187] "fbrncntc"  "mocntr"    "mbrncntc"  "ccnthum"   "ccrdprs"   "wrclmch"  
## [193] "admrclc"   "testjc34"  "testjc35"  "testjc36"  "testjc37"  "testjc38" 
## [199] "testjc39"  "testjc40"  "testjc41"  "testjc42"  "vteurmmb"  "vteubcmb" 
## [205] "ctrlife"   "etfruit"   "eatveg"    "dosprt"    "cgtsmok"   "alcfreq"  
## [211] "alcwkdy"   "alcwknd"   "icgndra"   "alcbnge"   "height"    "weighta"  
## [217] "dshltgp"   "dshltms"   "dshltnt"   "dshltref"  "dshltdk"   "dshltna"  
## [223] "medtrun"   "medtrnp"   "medtrnt"   "medtroc"   "medtrnl"   "medtrwl"  
## [229] "medtrnaa"  "medtroth"  "medtrnap"  "medtrref"  "medtrdk"   "medtrna"  
## [235] "medtrnu"   "hlpfmly"   "hlpfmhr"   "trhltacu"  "trhltacp"  "trhltcm"  
## [241] "trhltch"   "trhltos"   "trhltho"   "trhltht"   "trhlthy"   "trhltmt"  
## [247] "trhltpt"   "trhltre"   "trhltsh"   "trhltnt"   "trhltref"  "trhltdk"  
## [253] "trhltna"   "fltdpr"    "flteeff"   "slprl"     "wrhpp"     "fltlnl"   
## [259] "enjlf"     "fltsd"     "cldgng"    "hltprhc"   "hltprhb"   "hltprbp"  
## [265] "hltpral"   "hltprbn"   "hltprpa"   "hltprpf"   "hltprsd"   "hltprsc"  
## [271] "hltprsh"   "hltprdi"   "hltprnt"   "hltprref"  "hltprdk"   "hltprna"  
## [277] "hltphhc"   "hltphhb"   "hltphbp"   "hltphal"   "hltphbn"   "hltphpa"  
## [283] "hltphpf"   "hltphsd"   "hltphsc"   "hltphsh"   "hltphdi"   "hltphnt"  
## [289] "hltphnap"  "hltphref"  "hltphdk"   "hltphna"   "hltprca"   "cancfre"  
## [295] "cnfpplh"   "fnsdfml"   "jbexpvi"   "jbexpti"   "jbexpml"   "jbexpmc"  
## [301] "jbexpnt"   "jbexpnap"  "jbexpref"  "jbexpdk"   "jbexpna"   "jbexevl"  
## [307] "jbexevh"   "jbexevc"   "jbexera"   "jbexecp"   "jbexebs"   "jbexent"  
## [313] "jbexenap"  "jbexeref"  "jbexedk"   "jbexena"   "nobingnd"  "likrisk"  
## [319] "liklead"   "sothnds"   "actcomp"   "mascfel"   "femifel"   "impbemw"  
## [325] "trmedmw"   "trwrkmw"   "trplcmw"   "trmdcnt"   "trwkcnt"   "trplcnt"  
## [331] "eqwrkbg"   "eqpolbg"   "eqmgmbg"   "eqpaybg"   "eqparep"   "eqparlv"  
## [337] "freinsw"   "fineqpy"   "wsekpwr"   "weasoff"   "wlespdm"   "wexashr"  
## [343] "wprtbym"   "wbrgwrm"   "hhmmb"     "gndr"      "gndr2"     "gndr3"    
## [349] "gndr4"     "gndr5"     "gndr6"     "gndr7"     "gndr8"     "gndr9"    
## [355] "gndr10"    "gndr11"    "gndr12"    "yrbrn"     "agea"      "yrbrn2"   
## [361] "yrbrn3"    "yrbrn4"    "yrbrn5"    "yrbrn6"    "yrbrn7"    "yrbrn8"   
## [367] "yrbrn9"    "yrbrn10"   "yrbrn11"   "yrbrn12"   "rshipa2"   "rshipa3"  
## [373] "rshipa4"   "rshipa5"   "rshipa6"   "rshipa7"   "rshipa8"   "rshipa9"  
## [379] "rshipa10"  "rshipa11"  "rshipa12"  "rshpsts"   "rshpsgb"   "lvgptnea" 
## [385] "dvrcdeva"  "marsts"    "marstgb"   "maritalb"  "chldhhe"   "domicil"  
## [391] "paccmoro"  "paccdwlr"  "pacclift"  "paccnbsh"  "paccocrw"  "paccxhoc" 
## [397] "paccnois"  "paccinro"  "paccnt"    "paccref"   "paccdk"    "paccna"   
## [403] "edulvlb"   "eisced"    "edlveat"   "edlvebe"   "edlvehr"   "edlvgcy"  
## [409] "edlvdfi"   "edlvdfr"   "edudde1"   "educde2"   "edlvegr"   "edlvdahu" 
## [415] "edlvdis"   "edlvdie"   "edlvfit"   "edlvdlt"   "edlvenl"   "edlveno"  
## [421] "edlvipl"   "edlvept"   "edlvdrs"   "edlvdsk"   "edlvesi"   "edlvies"  
## [427] "edlvdse"   "edlvdch"   "educgb1"   "edubgb2"   "edagegb"   "eduyrs"   
## [433] "pdwrk"     "edctn"     "uempla"    "uempli"    "dsbld"     "rtrd"     
## [439] "cmsrv"     "hswrk"     "dngoth"    "dngref"    "dngdk"     "dngna"    
## [445] "mainact"   "mnactic"   "crpdwk"    "pdjobev"   "pdjobyr"   "emplrel"  
## [451] "emplno"    "wrkctra"   "estsz"     "jbspv"     "njbspv"    "wkdcorga" 
## [457] "iorgact"   "wkhct"     "wkhtot"    "nacer2"    "tporgwk"   "isco08"   
## [463] "wrkac6m"   "uemp3m"    "uemp12m"   "uemp5yr"   "mbtru"     "hincsrca" 
## [469] "hinctnta"  "hincfel"   "edulvlpb"  "eiscedp"   "edlvpfat"  "edlvpebe" 
## [475] "edlvpehr"  "edlvpgcy"  "edlvpdfi"  "edlvpdfr"  "edupdde1"  "edupcde2" 
## [481] "edlvpegr"  "edlvpdahu" "edlvpdis"  "edlvpdie"  "edlvpfit"  "edlvpdlt" 
## [487] "edlvpenl"  "edlvpeno"  "edlvphpl"  "edlvpept"  "edlvpdrs"  "edlvpdsk" 
## [493] "edlvpesi"  "edlvphes"  "edlvpdse"  "edlvpdch"  "edupcgb1"  "edupbgb2" 
## [499] "edagepgb"  "pdwrkp"    "edctnp"    "uemplap"   "uemplip"   "dsbldp"   
## [505] "rtrdp"     "cmsrvp"    "hswrkp"    "dngothp"   "dngdkp"    "dngnapp"  
## [511] "dngrefp"   "dngnap"    "mnactp"    "crpdwkp"   "isco08p"   "emprelp"  
## [517] "wkhtotp"   "edulvlfb"  "eiscedf"   "edlvfeat"  "edlvfebe"  "edlvfehr" 
## [523] "edlvfgcy"  "edlvfdfi"  "edlvfdfr"  "edufcde1"  "edufbde2"  "edlvfegr" 
## [529] "edlvfdahu" "edlvfdis"  "edlvfdie"  "edlvffit"  "edlvfdlt"  "edlvfenl" 
## [535] "edlvfeno"  "edlvfgpl"  "edlvfept"  "edlvfdrs"  "edlvfdsk"  "edlvfesi" 
## [541] "edlvfges"  "edlvfdse"  "edlvfdch"  "edufcgb1"  "edufbgb2"  "edagefgb" 
## [547] "emprf14"   "occf14b"   "edulvlmb"  "eiscedm"   "edlvmeat"  "edlvmebe" 
## [553] "edlvmehr"  "edlvmgcy"  "edlvmdfi"  "edlvmdfr"  "edumcde1"  "edumbde2" 
## [559] "edlvmegr"  "edlvmdahu" "edlvmdis"  "edlvmdie"  "edlvmfit"  "edlvmdlt" 
## [565] "edlvmenl"  "edlvmeno"  "edlvmgpl"  "edlvmept"  "edlvmdrs"  "edlvmdsk" 
## [571] "edlvmesi"  "edlvmges"  "edlvmdse"  "edlvmdch"  "edumcgb1"  "edumbgb2" 
## [577] "edagemgb"  "emprm14"   "occm14b"   "atncrse"   "anctrya1"  "anctrya2" 
## [583] "regunit"   "region"    "ipcrtiva"  "impricha"  "ipeqopta"  "ipshabta" 
## [589] "impsafea"  "impdiffa"  "ipfrulea"  "ipudrsta"  "ipmodsta"  "ipgdtima" 
## [595] "impfreea"  "iphlppla"  "ipsucesa"  "ipstrgva"  "ipadvnta"  "ipbhprpa" 
## [601] "iprspota"  "iplylfra"  "impenva"   "imptrada"  "impfuna"   "testji1"  
## [607] "testji2"   "testji3"   "testji4"   "testji5"   "testji6"   "testji7"  
## [613] "testji8"   "testji9"   "respc19a"  "symtc19"   "symtnc19"  "vacc19"   
## [619] "recon"     "inwds"     "ainws"     "ainwe"     "binwe"     "cinwe"    
## [625] "dinwe"     "einwe"     "finwe"     "hinwe"     "iinwe"     "kinwe"    
## [631] "rinwe"     "inwde"     "jinws"     "jinwe"     "inwtm"     "mode"     
## [637] "domain"    "prob"      "stratum"   "psu"

vnames = c("fltdpr", "flteeff", "slprl", "wrhpp", "fltlnl", "enjlf", "fltsd", "cldgng")
likert_df = df[,vnames]

# UK data only

#df_uk = df[df$cntry == "United Kingdom", ]
# check

#df_uk$depression = rowSums(df_uk[, c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")]) / 8

#Likert Scale

likert_table = likert(likert_df)$results 
likert_numeric_df = as.data.frame(lapply((df[,vnames]), as.numeric))
likert_table$Mean = unlist(lapply((likert_numeric_df[,vnames]), mean, na.rm=T)) 
# ... and append new columns to the data frame
likert_table$Count = unlist(lapply((likert_numeric_df[,vnames]), function (x) sum(!is.na(x))))
likert_table$Item = c(
  d20="how much of the time during the past week you felt depressed?",
  d21="…you felt that everything you did was an effort?",
  d22="…your sleep was restless?",
  d23="…you were happy?",
  d24="…you felt lonely?",
  d25="…you enjoyed life?",
  d26="…you felt sad?",
  d27="…you could not get going?")
likert_table

##                                                            Item
## 1 how much of the time during the past week you felt depressed?
## 2              …you felt that everything you did was an effort?
## 3                                     …your sleep was restless?
## 4                                              …you were happy?
## 5                                             …you felt lonely?
## 6                                            …you enjoyed life?
## 7                                                …you felt sad?
## 8                                     …you could not get going?
##   None or almost none of the time Some of the time Most of the time
## 1                       64.915835         29.06631         4.557165
## 2                       48.395568         38.42383         9.814171
## 3                       43.873854         39.87056        11.625059
## 4                        4.003510         23.53973        48.886939
## 5                       68.136458         24.27532         5.302253
## 6                        5.338783         24.82572        44.804153
## 7                       52.489933         41.07451         4.859808
## 8                       55.673484         36.10353         6.217928
##   All or almost all of the time     Mean Count
## 1                      1.460694 1.425627 39981
## 2                      3.366431 1.681515 39983
## 3                      4.630532 1.770123 40017
## 4                     23.569817 2.920231 39890
## 5                      2.285972 1.417377 39983
## 6                     25.031346 2.895281 39878
## 7                      1.575748 1.555214 39981
## 8                      2.005056 1.545546 39949
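
As a rough sketch of how the percentages, mean and count for one item can be derived in base R, the example below uses a handful of made-up answers. The vector `answers` is an assumption for illustration, and the sketch does not claim to reproduce the internals of the `likert` package.

```r
# Toy answers to one item (hypothetical values, not ESS data)
levels_4 <- c("None or almost none of the time", "Some of the time",
              "Most of the time", "All or almost all of the time")
answers <- factor(c(levels_4[1], levels_4[1], levels_4[2], levels_4[4], NA),
                  levels = levels_4)

# Percentage distribution over valid answers (table() drops NA by default)
pct <- round(100 * prop.table(table(answers)), 1)
pct

# Mean of the numeric codes 1-4 and number of valid answers
item_mean <- mean(as.numeric(answers), na.rm = TRUE)
item_count <- sum(!is.na(answers))
```

The percentages are based on valid answers only, so they sum to 100 for each item regardless of how many respondents skipped it.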

# round all percentage values (columns 2:5) to 1 decimal digit
likert_table[,2:5] = round(likert_table[,2:5],1)
# round means (column 6) to 3 decimal digits
likert_table[,6] = round(likert_table[,6],3)

# create formatted table
kable_styling(kable(likert_table,
                    format="html",
                    caption = "Distribution of answers to the CES-D-8 depression items (ESS round 11, all countries)"
                    )
              )

Distribution of answers to the CES-D-8 depression items (ESS round 11, all countries)
Item	None or almost none of the time	Some of the time	Most of the time	All or almost all of the time	Mean	Count
how much of the time during the past week you felt depressed?	64.9	29.1	4.6	1.5	1.426	39981
…you felt that everything you did was an effort?	48.4	38.4	9.8	3.4	1.682	39983
…your sleep was restless?	43.9	39.9	11.6	4.6	1.770	40017
…you were happy?	4.0	23.5	48.9	23.6	2.920	39890
…you felt lonely?	68.1	24.3	5.3	2.3	1.417	39983
…you enjoyed life?	5.3	24.8	44.8	25.0	2.895	39878
…you felt sad?	52.5	41.1	4.9	1.6	1.555	39981
…you could not get going?	55.7	36.1	6.2	2.0	1.546	39949

# create basic plot
plot(likert(summary=likert_table[,1:5])) # limit to columns 1:5 (Item plus the four answer categories) to skip mean and count

#Other R Work not relevant for Homework 4

library(kableExtra)
library(knitr)
# check further (frequency table)
table(df_uk$depression)

## 
##   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19 
## 103  98 172 201 167 158 146 144  94  78  55  43  34  35  27  14  15  11  11   9 
##  20  21  22  23  24 
##   9   1   3   3   4

table_dep=data.frame(table(df_uk$depression))


#kable(table_dep,
      #col.names = c("Depression Score","Frequency"),
      #caption = "Frequency Distribution of Depressionscores in the UK")

#kable_styling(
 # kable(table_dep,
     # col.names = c("Depression Score","Frequency"),
      #caption = "Frequency Distribution of Depressionscores in the UK"
      #),full_width = F, font_size = 13, bootstrap_options = c("hover", #"condensed"))

scroll_box(
  kable_styling(
    kable(table_dep,
          col.names = c("Depression Score","Frequency"),
          caption = "Frequency Distribution of Depression Scores in the UK"),
    full_width = F, font_size = 13, bootstrap_options = c("hover", "condensed")),
  height="300px")

Frequency Distribution of Depression Scores in the UK
Depression Score	Frequency
0	103
1	98
2	172
3	201
4	167
5	158
6	146
7	144
8	94
9	78
10	55
11	43
12	34
13	35
14	27
15	14
16	15
17	11
18	11
19	9
20	9
21	1
22	3
23	3
24	4

#depression_table_uk = table(df_uk$depression)
#depression_table_uk 

# Flag respondents with a depression score of 9 or higher (1 = clinically significant, 0 = not)

df_uk$depressed=ifelse(df_uk$depression >= 9,1,0)
df_uk$depressed

##    [1]  0  1  0  0  1  0  0  0  0  1  1  0  0  0  1  0  0  0 NA  0  0  0 NA  0
##   [25]  0  0  0  1  1  1  0  0  1  0  0  0  1  0  0  0  1  0  1  0  0 NA  0  0
##   [49]  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  1  0
##   [73]  1  0  1  1  0  0  0  0 NA  0  1  0  1  0  0  0  0  0  0  0  0  1  0  0
##   [97]  0  1  0  0  0  0  0  0  1  1  0  0  0  1  1  1  0  1  1  0  1  0  0  0
##  [121]  0  1  0  0  0 NA  1  1  0  0  0  0  0  0  1  0  0  0  1  0  1  0  0  0
##  [145]  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  1  0  0  0  1  0  0  1
##  [169]  0  0  0 NA  0  0  0  0  1  0  0  0  0  0  0  1  0  0  1  0  0  0  0  0
##  [193]  0  0  0  0  1  0  0  0  1  0  0  0  1  0  1  0  1  0  0  1  0  0  0  1
##  [217]  0  0  0  1  1  0  1  1  0  0  0  0  1  0  0  1  0  0  0  0  0 NA NA  0
##  [241]  0  1  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  1  1  0  0  1  0  0
##  [265]  0  0  0  0  1  0  1  0  0  0  1  1  0  0  0  1  0  0  1  1  0  0  0  0
##  [289]  0  0  1  1  1  1  0  0  0  0  0  0 NA  0  0  0  0  0  0  0  1  1  0  1
##  [313]  0  0  0  0  1  0  0 NA  0  1  0 NA  0  1  1  1  1  0  1  0  0  0  0  0
##  [337]  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  1  0  0  0  0  0  0  0
##  [361]  1  0  0  1  0  0  1  1  0  0  0  0  1  1  1  1  0 NA  0  0  0  1  0  0
##  [385]  0  0  0  0  0  1  0  0  0  1  0  0  0  0  0  1  0  1  0  0  0  0  0  0
##  [409]  0  0  0  0  1  0  1  0  1  0  0  0  0  1  0  0  0  0  0  0  0  1  0  0
##  [433]  1  0  1  0 NA  1  0 NA  0  0  1  0  0  0  0  1  0  0  1  0  0  0  1  0
##  [457]  0  0  1  1  1  0  1  0  0  0  0  0  1  0  0  0  0  1  0  0  1  0  1  0
##  [481]  0  0  0  0  1  1  0  0  0  1  0  0  0  0  0  0  0  1  0  0  0  0  0  0
##  [505]  0  1  0 NA  0  1  0  0  0  0  0  1  1  0  0  0  0  0  0  0  0  0  0  0
##  [529]  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  1
##  [553]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1 NA  1  0  0  0
##  [577]  1  0  1  0  0  0  0 NA  0  1  1  0  0  1  0  0  0  0  0  1  0  0  1  1
##  [601]  0  0  0  0  0  0  1  0  0  0  0  0  1  0  0  0  0  0  0  1  0  0  1  0
##  [625]  1  1  0  0  0  0  0  0  0 NA  0  1  0  1  1  0  0  0  0  0  0  0  0  0
##  [649]  0  1  0  0  0  0  0  0  0  1  0  0  0  0  0  1  0  0  0  0  0  1 NA  0
##  [673]  0  0  0  0  0  0  0  0  0  0 NA  0  0  1  1  0  0  0  0  1  0  0  1  0
##  [697]  0  1  0  0  0  0  0  0  1  0  0  0  0  0  1  0  0  0  0  0  1  1  0  0
##  [721]  0  1  0  0  1  0  0 NA  1  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0
##  [745]  0  0  0  0  1  0  0  0  1  0  0  0  1  1  0  0  0  0  0  1  0  0  0  0
##  [769]  1  0  0  0  1  1  0  0  0  1  0  0  0  0  1  0  0  0  1  0  0  0 NA  0
##  [793]  0  0  1 NA  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  1  0  0
##  [817]  1  0  0  0  0  0  0  1  0  0  0  0  0  0  1 NA  0  0  0  1 NA  0  0  0
##  [841]  0  0  0  0  0  0  0  1  0  0  0  0  0  1  0  0  0  0  0  1  0  0  1  0
##  [865]  1  0  0  0  0  0  0  0  1 NA  0  0  0 NA  0  0  1  0  1  0  0  1  0  0
##  [889]  1  1  0  1  0  0  0  0  0  0  0  0  0  0  1  0  1 NA  0  1  0  1  0  0
##  [913]  0  0  0  0  1  0  0  0  1  0  0  0  0  0  0  0  1  1  0  0  0  0 NA  0
##  [937]  0  0  1 NA  0  0  0  1  1  0  0  0  0  1  0  0  0  0  0  0  1  0  0  0
##  [961]  0  0  1  0  0  0  0  1  0  0  0  0  0  0  0  1  0  1  0  0  1  1  0  0
##  [985]  0  0  0  0  0  0  0  0  1  1  0  0  0  0  0  0  0  1  0  0  0 NA  0  0
## [1009]  1  1  0  0  1  1  0  1  0  0  0  0  1  0  0  0  0  0  1  0  0  0  0  0
## [1033]  0  1  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1
## [1057]  0  0  0  0  0  0  0  1  0  0  0  1  1  0  0  0  0  0  1  0  1  0  0  0
## [1081]  1  0  1  0  0  0  1  0  0  1  0  1  1  0  1  0  0  1  0  0  1  0  0  0
## [1105] NA  0  0  0  0  1  0  0  1  0  0  0  0  0  0  0  0  1  0  0  0  0  1  1
## [1129]  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  1  0  0  1  0  1  0  0
## [1153]  0  0  1  0  0  1  0  0  1  0  0  1  0  0  0  0  0  1  0  1  0  0  0  0
## [1177]  0  0  0  1  0 NA  0 NA  0  0  1  0  0  0  0  0  0 NA  1  0  0  0  0  0
## [1201]  0  0  1  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  1  0  0  0  0  0
## [1225]  1  0  0  0  1  0  0  1  1  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0
## [1249]  0  0  0  0  1  0  1  0  0  0  0  1  0  0  0  1  0  1  0  0  1  0  1  0
## [1273]  0  0  1  0  0  0  0  0  0  1  0  0 NA  0 NA  0  1  0  0  0  0  0  0  1
## [1297]  0  0 NA  0  0  0  0  0  1  1  0  0  0  1  1 NA  0  0  0  0  0  0  1 NA
## [1321]  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0
## [1345]  0  1  0  0  0  0  1  0  0 NA  0  0  0  0  0  0  1  0  0  1  0  0  0  0
## [1369]  1  0  1  0  1  0  0  0 NA  0  0  1  0  0  1  0  0  0  0  0  0  0  0  0
## [1393]  0  0  1  1 NA  1  0  0  1  0  0  0  0  0  0  0  0  1  0  0  0  1  0  0
## [1417]  0  0  0  1  0  0  1  1  0  0  1  0  0  0 NA  0  0  0  0  0  0 NA NA  0
## [1441]  0  0  0  0  0  0  1  0  1  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0
## [1465]  0  1  0  0  0  0  1  0  0  1  1  0  0  1  1  0  0  0  0  0 NA  0  0  0
## [1489]  0  1  1  0  0  0  0  0  0  0  0  1  1  0  1  0  0  0  0  0  1  0  0  0
## [1513]  1  0  1  0  0  0  0  1  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0
## [1537]  1  0  0  0  1  1  0  0  0  0  0  0  0  1  0  1  0  1  0  0  0  0  0  0
## [1561]  1  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
## [1585]  0  1  0  0  0  0  0  0  0  0  0  0  0  1  1  1  0  1  1  0  1  0  0  0
## [1609]  0  0  1  1  0  0  0  0  0  0  0  0  0  0  1  0  0  0  1  1  1  0  0  0
## [1633]  0  1  0  1  0  1  0  0  0  0  0  0  1  0  1  0  0  0  0  0  0  0  0  0
## [1657] NA  0  0  1  0  0  0  0  0  0  1 NA  0  0  0  0  0  0  0  1  0  0  1  0
## [1681]  0  0  0  0

table(df_uk$depressed)

## 
##    0    1 
## 1283  352

# Calculating the odds of a high score (9 up to 24) versus a low score (0-8)

# People with a depression score between 0 and 8: 1283
# People with a depression score between 9 and 24: 352
# Odds: 352/1283 = 0.274 --> clinically significant depression is less likely than not
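
The prevalence and the odds of clinically significant depression follow directly from the counts printed by `table(df_uk$depressed)` (1283 and 352). The short sketch below recomputes both; the variable names are chosen here for illustration.

```r
# Counts taken from the table(df_uk$depressed) output
n_not_depressed <- 1283
n_depressed <- 352

prevalence <- n_depressed / (n_depressed + n_not_depressed)  # share of valid cases
odds <- n_depressed / n_not_depressed                        # depressed vs. not

round(c(prevalence = prevalence, odds = odds), 3)
```

Odds and prevalence carry the same information: odds = p / (1 - p). A single group only has odds; an odds ratio requires comparing the odds of two groups, as the regression below does for women and men.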

aModel = glm(depressed ~ gndr, data=df_uk, family=binomial) 
# Show summary of regression model
summary(aModel)

## 
## Call:
## glm(formula = depressed ~ gndr, family = binomial, data = df_uk)
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -1.45815    0.09035 -16.139   <2e-16 ***
## gndrFemale   0.30941    0.12131   2.551   0.0108 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1703.3  on 1634  degrees of freedom
## Residual deviance: 1696.7  on 1633  degrees of freedom
##   (49 observations deleted due to missingness)
## AIC: 1700.7
## 
## Number of Fisher Scoring iterations: 4

coef(aModel)

## (Intercept)  gndrFemale 
##  -1.4581529   0.3094088

# Interpretation:
# For a one-unit change in gender (i.e. from male to female), the log odds of depressed
# (clinically significant vs. not clinically significant depression) increase by 0.309.
# This effect is significant at the 5% level (p = 0.011): if the regression coefficient B
# were actually 0 in the population, an estimate at least this large would arise by chance
# in only about 1.1% of samples.
#
# As the "log odds of depressed" are quite hard to interpret, calculate the odds ratios
# OR = exp(B) = e^B
#
exp(coef(aModel))

## (Intercept)  gndrFemale 
##   0.2326656   1.3626193

# Calculate Confidence Intervals for ORs
exp(confint(aModel))

##                2.5 %    97.5 %
## (Intercept) 0.194246 0.2768727
## gndrFemale  1.075026 1.7300513
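
To show how the odds ratio relates to the coefficients, the sketch below starts from the estimates printed by `summary(aModel)` and derives the odds ratio, an approximate Wald confidence interval and the implied probabilities. The variable names are chosen here for illustration. The Wald interval is a simplification: `confint()` on a `glm` reports profile-likelihood intervals, so the limits differ slightly.

```r
# Estimates copied from the summary(aModel) output
b0 <- -1.45815    # intercept: log odds for the reference group (men)
b1 <- 0.30941     # gndrFemale: difference in log odds, women vs. men
se_b1 <- 0.12131  # standard error of b1

or_female <- exp(b1)                                  # odds ratio
wald_ci <- exp(b1 + c(-1, 1) * qnorm(0.975) * se_b1)  # approximate 95% CI

# Predicted probabilities via the inverse logit
p_male <- plogis(b0)
p_female <- plogis(b0 + b1)

round(c(OR = or_female, lower = wald_ci[1], upper = wald_ci[2],
        p_male = p_male, p_female = p_female), 3)
```

Read this way, the model implies that roughly 19% of men and 24% of women in the sample score 9 or higher. The odds for women are about 1.36 times those for men, and the interval excludes 1.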

An alternative way of solving the homework is presented below.

# Alternative approach

# frequency distribution of the new variable (depression)
# interpretation:
# min. is 0, max. is 24
# at least one individual chose the lowest category for all items (score 0, lowest possible depression level) and at least one individual chose the highest category for all items (score 24, highest possible depression level)
# most participants report low to moderate depression

# but there are missing data (49 NAs)
### just for a further check:
# counting participants with depression scores between 0 and 8
# counting participants with depression scores >= 9 (up to 24)
# store the frequency table (UK, depression) in a new variable

depression_table_uk = table(df_uk$depression)
depression_table_uk

## 
##   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19 
## 103  98 172 201 167 158 146 144  94  78  55  43  34  35  27  14  15  11  11   9 
##  20  21  22  23  24 
##   9   1   3   3   4

# sum of participants between 0 and 8 (inclusive)
# table names are character strings, so convert them to numbers before comparing
score_values = as.numeric(names(depression_table_uk))
depression_score_0_to_8 = sum(depression_table_uk[score_values >= 0 & score_values <= 8])
depression_score_0_to_8 # print

## [1] 1283

# sum of participants >= 9 (up to 24)
depression_score_ge9 = sum(depression_table_uk[score_values >= 9])
depression_score_ge9 # print

## [1] 352
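
Comparisons on `names()` of a table deserve care, because the names are character strings and R then compares them as text rather than as numbers. The toy example below (made-up scores, not ESS data) sketches the pitfall and one way around it.

```r
# Toy frequency table with scores below and above 10 (hypothetical values)
scores <- c(2, 8, 9, 10, 10, 24)
tab <- table(scores)

# Names are character strings, so the comparison is done as text:
# "10" and "24" sort before "9", so nothing counts as greater than 9
names(tab) > 9
wrong <- sum(tab[names(tab) > 9])

# Converting the names to numbers gives the intended comparison
right <- sum(tab[as.numeric(names(tab)) >= 9])
c(wrong = wrong, right = right)
```

A simpler alternative is to skip the frequency table and count on the variable itself, for example `sum(df_uk$depression >= 9, na.rm = TRUE)`.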

# double-check
sum(1283, 352) # 1635 participants (uk - depression), excluding 49 NA's

## [1] 1635

# Calculating the odds of a high score (9 up to 24) versus a low score (0-8)

# People with a depression score between 0 and 8: 1283
# People with a depression score between 9 and 24: 352
# Odds: 352/1283 = 0.274 --> clinically significant depression is less likely than not

Discussion

Our results mirror those reported in other papers, for example that LGBTQ+ people are more likely to suffer from depression than straight people. The United Kingdom Survey on the Mental Health of LGBTQ+ (2024) highlighted this problem before us and reported that victimization, discrimination, and lack of access to affirming spaces result in poorer mental health. With our data we can confirm those findings.

Likewise, our finding that people of colour have higher depression scores than white people could be linked to higher rates of discrimination and victimization and a lack of affirming spaces. According to Stop Hate UK, a UK organization that supports victims of hate crime, 43% of all hate crimes reported to its helpline were motivated by racism. This could result from the historical legacy of colonialism and empire, in which racism is deeply rooted. Another possible explanation is a lack of representation: ethnic minorities are underrepresented in positions of power across politics, media, and business.

Our results concerning the correlation between age and depression showed little to no significance. Age does not seem to influence depression scores. The slight negative correlation could be interpreted as resilience rising with age, leaving older people better equipped to withstand depression.

The gender gap between men and women extends to depression scores: we found a significant difference between men and women. A possible explanation is the greater strain women face in our society, including lower pay and greater responsibility for housework and parenting.

Nevertheless, our regression analysis showed little impact of sexuality and gender on depression. Further research is therefore needed to identify stronger drivers of depression. According to the Mental Health Foundation (UK), “People living in the lowest socioeconomic groups are more likely to experience common mental health problems such as depression and anxiety.” Loneliness is another strong driver of depression, especially in elderly people (Sheffield Hallam University, 2025). Furthermore, a lack of access to and inequalities in health care services in the UK account for higher depression rates (Royal College of Psychiatrists, 2025). These variables, as well as exercise, diet, and lifestyle choices, could be more dominant determinants of depression. Further research is needed to verify these associations.

Predictors of Clinically Significant Depression-Homework4

2025-05-07
