Depression is a prevalent mental disorder, experienced by 4-10% of the global population over their lifetime (Chapman et al., 2022). Currently, around 280 million people (3.8%) are affected globally (WHO, 2023), with depression ranked among the top contributors to the global health burden in 2019.
library(foreign)
library(ltm)
setwd("/Users/annarendez/Desktop/Master/1.Semester/Quantitavie Forschung/R Data")
df = read.spss("ESS11.sav", to.data.frame = T)
H1: The prevalence of depression increases with experienced discrimination based on an individual’s sexuality (LGBQ+).
H2: The prevalence of depression increases with experienced discrimination based on an individual’s skin colour or race.
H3: The prevalence of depression decreases with age (still to be justified by the literature).
H4: The prevalence of depression is higher among females than among males (still to be justified by the literature).
The present paper aimed to investigate depression in a British population, as 15-30% of individuals do not recover from depression after two or more treatments (Chapman et al., 2022) and therefore a greater understanding of potential contributing factors is crucial for improving recovery outcomes.
df$d20 = as.numeric(df$fltdpr)
df$d21 = as.numeric(df$flteeff)
df$d22 = as.numeric(df$slprl)
df$d23 = as.numeric(df$wrhpp)
df$d24 = as.numeric(df$fltlnl)
df$d25 = as.numeric(df$enjlf)
df$d26 = as.numeric(df$fltsd)
df$d27 = as.numeric(df$cldgng)
# reverse scales of d23 and d25 (negative coding)
df$d23 = 5 - df$d23
df$d25 = 5 - df$d25
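The reverse coding above can be checked on a few toy values. This is a minimal sketch that assumes the items are coded 1 to 4 after `as.numeric()`; the vector `happy_raw` is hypothetical and not taken from the ESS data.

```r
# Toy responses on the 1-4 answer scale (illustrative values, not ESS data)
happy_raw <- c(1, 2, 3, 4, NA)

# Reverse coding: (scale minimum + scale maximum) - value = 5 - value
happy_rev <- 5 - happy_raw
print(happy_rev)
```

Missing answers stay missing, and the lowest and highest categories swap places, so that a high value indicates more depressive symptoms on every item.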
# lookup: existing country names in the dataframe (df)
table(df$cntry)
##
## Albania Austria Belgium Bulgaria
## 0 2354 1594 0
## Switzerland Cyprus Czechia Germany
## 1384 685 0 2420
## Denmark Estonia Spain Finland
## 0 0 1844 1563
## France United Kingdom Georgia Greece
## 1771 1684 0 2757
## Croatia Hungary Ireland Israel
## 1563 2118 2017 0
## Iceland Italy Lithuania Luxembourg
## 842 2865 1365 0
## Latvia Montenegro North Macedonia Netherlands
## 0 0 0 1695
## Norway Poland Portugal Romania
## 1337 1442 1373 0
## Serbia Russian Federation Sweden Slovenia
## 1563 0 1230 1248
## Slovakia Turkey Ukraine Kosovo
## 1442 0 0 0
# selected country: United Kingdom (UK hereafter)
# subset dataset: rows where cntry is "United Kingdom", all columns
# name it "df_uk" (dataset UK)
df_uk = df[df$cntry == "United Kingdom", ]
# check
table(df_uk$cntry)
##
## Albania Austria Belgium Bulgaria
## 0 0 0 0
## Switzerland Cyprus Czechia Germany
## 0 0 0 0
## Denmark Estonia Spain Finland
## 0 0 0 0
## France United Kingdom Georgia Greece
## 0 1684 0 0
## Croatia Hungary Ireland Israel
## 0 0 0 0
## Iceland Italy Lithuania Luxembourg
## 0 0 0 0
## Latvia Montenegro North Macedonia Netherlands
## 0 0 0 0
## Norway Poland Portugal Romania
## 0 0 0 0
## Serbia Russian Federation Sweden Slovenia
## 0 0 0 0
## Slovakia Turkey Ukraine Kosovo
## 0 0 0 0
We want to determine the CES-D-8 score at which depression is clinically significant. According to Briggs et al., a cut-off score of 9 can be used to identify those with clinically significant symptoms.
#Not relevant for Homework 4
# calculation of Cronbach's alpha (using df_uk) to check internal consistency ("reliability") of depression items
cronbach.alpha(df_uk[,c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")], na.rm=T)
##
## Cronbach's alpha for the 'df_uk[, c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")]' data-set
##
## Items: 8
## Sample units: 1684
## alpha: 0.838
alpha_uk=cronbach.alpha(df_uk[,c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")], na.rm=T)
# Cronbach's alpha is 0.838 (neither too low nor too high).
# to what extent are the items measuring the same construct (reliability of the scale)?
# the goal of 0.7 ≤ α ≤ 0.9 has been achieved.
# Cronbach's alpha also falls between 0.8 and approximately 0.92, a range that is considered optimal.
# the items measure the same underlying construct (depression) - no item needs to be removed.
First, we calculated Cronbach's alpha to check whether our variables are related to one another. Cronbach's alpha for our data is 0.84, which shows that the items are related to one another.
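To make the reliability check more transparent, the following sketch computes Cronbach's alpha directly from the variance formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the sum score). The helper `cronbach_alpha_manual` and the toy data are hypothetical, and the listwise deletion of incomplete rows is an assumption that may differ from how `cronbach.alpha()` in `ltm` treats missing values.

```r
# Hypothetical helper: Cronbach's alpha from the variance formula
cronbach_alpha_manual <- function(items) {
  items <- na.omit(as.matrix(items))   # assumption: drop rows with any missing item
  k <- ncol(items)                     # number of items
  item_vars <- apply(items, 2, var)    # variance of each item
  total_var <- var(rowSums(items))     # variance of the sum score
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Toy data: three items that move together (illustrative values, not ESS data)
toy_items <- data.frame(
  a = c(1, 2, 3, 4, 2, 3),
  b = c(1, 2, 4, 4, 2, 3),
  c = c(2, 2, 3, 4, 1, 3)
)
alpha_toy <- cronbach_alpha_manual(toy_items)
print(alpha_toy)
```

The more strongly the items co-vary, the larger the variance of the sum score becomes relative to the sum of the item variances, and the closer alpha gets to 1.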
# calculation of the sum score (new variable named "depression")
# score = sum of the eight item values (each coded 1-4) minus 8, so that the score ranges from 0 to 24
df_uk$depression = rowSums(df_uk[, c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")]) -8
# check that all columns (d20, d21, d22, etc.) are correctly spelled and exist in df_uk (not meaningful).
#names(df_uk)
#df_uk$depression
#table(df_uk$depression)
#table(df$gndr)
#median(df_uk$depression,3)
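The sum score can be illustrated with a toy example. This sketch assumes eight items coded 1 to 4 that are already reverse-coded where necessary; the data frame `toy` is hypothetical. It also shows that `rowSums()` without `na.rm = TRUE` returns `NA` as soon as one item is missing, which is presumably why some respondents have no depression score.

```r
# Toy item data, already reverse-coded where needed (illustrative values, not ESS data)
toy <- data.frame(
  d20 = c(1, 4, 2), d21 = c(1, 4, NA), d22 = c(1, 4, 2), d23 = c(1, 4, 2),
  d24 = c(1, 4, 2), d25 = c(1, 4, 2), d26 = c(1, 4, 2), d27 = c(1, 4, 2)
)

# Sum of eight items coded 1-4, shifted by 8 so that the score runs from 0 to 24
toy$depression <- rowSums(toy) - 8
print(toy$depression)
```

The first row answers every item with the lowest category, the second with the highest, and the third has one missing item.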
#not relevant for homework 4
library(foreign)
#install.packages("likert")
library(ltm) # required to calculate Cronbach's alpha
library(likert) # create basic Likert tables and plots
library(kableExtra)
# for datasets, see release guide pp 22ff
# read data and assign to data frame
setwd("/Users/annarendez/Desktop/Master/1.Semester/Quantitavie Forschung/R Data")
df = read.spss("ESS11.sav", to.data.frame = T)
names(df)
## [1] "name" "essround" "edition" "proddate" "idno" "cntry"
## [7] "dweight" "pspwght" "pweight" "anweight" "nwspol" "netusoft"
## [13] "netustm" "ppltrst" "pplfair" "pplhlp" "polintr" "psppsgva"
## [19] "actrolga" "psppipla" "cptppola" "trstprl" "trstlgl" "trstplc"
## [25] "trstplt" "trstprt" "trstep" "trstun" "vote" "prtvtdat"
## [31] "prtvtebe" "prtvtchr" "prtvtccy" "prtvtffi" "prtvtffr" "prtvgde1"
## [37] "prtvgde2" "prtvtegr" "prtvthhu" "prtvteis" "prtvteie" "prtvteit"
## [43] "prtvclt1" "prtvclt2" "prtvclt3" "prtvtinl" "prtvtcno" "prtvtfpl"
## [49] "prtvtept" "prtvtbrs" "prtvtesk" "prtvtgsi" "prtvtges" "prtvtdse"
## [55] "prtvthch" "prtvtdgb" "contplt" "donprty" "badge" "sgnptit"
## [61] "pbldmna" "bctprd" "pstplonl" "volunfp" "clsprty" "prtcleat"
## [67] "prtclebe" "prtclbhr" "prtclccy" "prtclgfi" "prtclgfr" "prtclgde"
## [73] "prtclegr" "prtclihu" "prtcleis" "prtclfie" "prtclfit" "prtclclt"
## [79] "prtclhnl" "prtclcno" "prtcljpl" "prtclgpt" "prtclbrs" "prtclesk"
## [85] "prtclgsi" "prtclhes" "prtcldse" "prtclhch" "prtcldgb" "prtdgcl"
## [91] "lrscale" "stflife" "stfeco" "stfgov" "stfdem" "stfedu"
## [97] "stfhlth" "gincdif" "freehms" "hmsfmlsh" "hmsacld" "euftf"
## [103] "lrnobed" "loylead" "imsmetn" "imdfetn" "impcntr" "imbgeco"
## [109] "imueclt" "imwbcnt" "happy" "sclmeet" "inprdsc" "sclact"
## [115] "crmvct" "aesfdrk" "health" "hlthhmp" "atchctr" "atcherp"
## [121] "rlgblg" "rlgdnm" "rlgdnbat" "rlgdnacy" "rlgdnafi" "rlgdnade"
## [127] "rlgdnagr" "rlgdnhu" "rlgdnais" "rlgdnie" "rlgdnlt" "rlgdnanl"
## [133] "rlgdnno" "rlgdnapl" "rlgdnapt" "rlgdnrs" "rlgdnask" "rlgdnase"
## [139] "rlgdnach" "rlgdngb" "rlgblge" "rlgdnme" "rlgdebat" "rlgdeacy"
## [145] "rlgdeafi" "rlgdeade" "rlgdeagr" "rlgdehu" "rlgdeais" "rlgdeie"
## [151] "rlgdelt" "rlgdeanl" "rlgdeno" "rlgdeapl" "rlgdeapt" "rlgders"
## [157] "rlgdeask" "rlgdease" "rlgdeach" "rlgdegb" "rlgdgr" "rlgatnd"
## [163] "pray" "dscrgrp" "dscrrce" "dscrntn" "dscrrlg" "dscrlng"
## [169] "dscretn" "dscrage" "dscrgnd" "dscrsex" "dscrdsb" "dscroth"
## [175] "dscrdk" "dscrref" "dscrnap" "dscrna" "ctzcntr" "brncntr"
## [181] "cntbrthd" "livecnta" "lnghom1" "lnghom2" "feethngr" "facntr"
## [187] "fbrncntc" "mocntr" "mbrncntc" "ccnthum" "ccrdprs" "wrclmch"
## [193] "admrclc" "testjc34" "testjc35" "testjc36" "testjc37" "testjc38"
## [199] "testjc39" "testjc40" "testjc41" "testjc42" "vteurmmb" "vteubcmb"
## [205] "ctrlife" "etfruit" "eatveg" "dosprt" "cgtsmok" "alcfreq"
## [211] "alcwkdy" "alcwknd" "icgndra" "alcbnge" "height" "weighta"
## [217] "dshltgp" "dshltms" "dshltnt" "dshltref" "dshltdk" "dshltna"
## [223] "medtrun" "medtrnp" "medtrnt" "medtroc" "medtrnl" "medtrwl"
## [229] "medtrnaa" "medtroth" "medtrnap" "medtrref" "medtrdk" "medtrna"
## [235] "medtrnu" "hlpfmly" "hlpfmhr" "trhltacu" "trhltacp" "trhltcm"
## [241] "trhltch" "trhltos" "trhltho" "trhltht" "trhlthy" "trhltmt"
## [247] "trhltpt" "trhltre" "trhltsh" "trhltnt" "trhltref" "trhltdk"
## [253] "trhltna" "fltdpr" "flteeff" "slprl" "wrhpp" "fltlnl"
## [259] "enjlf" "fltsd" "cldgng" "hltprhc" "hltprhb" "hltprbp"
## [265] "hltpral" "hltprbn" "hltprpa" "hltprpf" "hltprsd" "hltprsc"
## [271] "hltprsh" "hltprdi" "hltprnt" "hltprref" "hltprdk" "hltprna"
## [277] "hltphhc" "hltphhb" "hltphbp" "hltphal" "hltphbn" "hltphpa"
## [283] "hltphpf" "hltphsd" "hltphsc" "hltphsh" "hltphdi" "hltphnt"
## [289] "hltphnap" "hltphref" "hltphdk" "hltphna" "hltprca" "cancfre"
## [295] "cnfpplh" "fnsdfml" "jbexpvi" "jbexpti" "jbexpml" "jbexpmc"
## [301] "jbexpnt" "jbexpnap" "jbexpref" "jbexpdk" "jbexpna" "jbexevl"
## [307] "jbexevh" "jbexevc" "jbexera" "jbexecp" "jbexebs" "jbexent"
## [313] "jbexenap" "jbexeref" "jbexedk" "jbexena" "nobingnd" "likrisk"
## [319] "liklead" "sothnds" "actcomp" "mascfel" "femifel" "impbemw"
## [325] "trmedmw" "trwrkmw" "trplcmw" "trmdcnt" "trwkcnt" "trplcnt"
## [331] "eqwrkbg" "eqpolbg" "eqmgmbg" "eqpaybg" "eqparep" "eqparlv"
## [337] "freinsw" "fineqpy" "wsekpwr" "weasoff" "wlespdm" "wexashr"
## [343] "wprtbym" "wbrgwrm" "hhmmb" "gndr" "gndr2" "gndr3"
## [349] "gndr4" "gndr5" "gndr6" "gndr7" "gndr8" "gndr9"
## [355] "gndr10" "gndr11" "gndr12" "yrbrn" "agea" "yrbrn2"
## [361] "yrbrn3" "yrbrn4" "yrbrn5" "yrbrn6" "yrbrn7" "yrbrn8"
## [367] "yrbrn9" "yrbrn10" "yrbrn11" "yrbrn12" "rshipa2" "rshipa3"
## [373] "rshipa4" "rshipa5" "rshipa6" "rshipa7" "rshipa8" "rshipa9"
## [379] "rshipa10" "rshipa11" "rshipa12" "rshpsts" "rshpsgb" "lvgptnea"
## [385] "dvrcdeva" "marsts" "marstgb" "maritalb" "chldhhe" "domicil"
## [391] "paccmoro" "paccdwlr" "pacclift" "paccnbsh" "paccocrw" "paccxhoc"
## [397] "paccnois" "paccinro" "paccnt" "paccref" "paccdk" "paccna"
## [403] "edulvlb" "eisced" "edlveat" "edlvebe" "edlvehr" "edlvgcy"
## [409] "edlvdfi" "edlvdfr" "edudde1" "educde2" "edlvegr" "edlvdahu"
## [415] "edlvdis" "edlvdie" "edlvfit" "edlvdlt" "edlvenl" "edlveno"
## [421] "edlvipl" "edlvept" "edlvdrs" "edlvdsk" "edlvesi" "edlvies"
## [427] "edlvdse" "edlvdch" "educgb1" "edubgb2" "edagegb" "eduyrs"
## [433] "pdwrk" "edctn" "uempla" "uempli" "dsbld" "rtrd"
## [439] "cmsrv" "hswrk" "dngoth" "dngref" "dngdk" "dngna"
## [445] "mainact" "mnactic" "crpdwk" "pdjobev" "pdjobyr" "emplrel"
## [451] "emplno" "wrkctra" "estsz" "jbspv" "njbspv" "wkdcorga"
## [457] "iorgact" "wkhct" "wkhtot" "nacer2" "tporgwk" "isco08"
## [463] "wrkac6m" "uemp3m" "uemp12m" "uemp5yr" "mbtru" "hincsrca"
## [469] "hinctnta" "hincfel" "edulvlpb" "eiscedp" "edlvpfat" "edlvpebe"
## [475] "edlvpehr" "edlvpgcy" "edlvpdfi" "edlvpdfr" "edupdde1" "edupcde2"
## [481] "edlvpegr" "edlvpdahu" "edlvpdis" "edlvpdie" "edlvpfit" "edlvpdlt"
## [487] "edlvpenl" "edlvpeno" "edlvphpl" "edlvpept" "edlvpdrs" "edlvpdsk"
## [493] "edlvpesi" "edlvphes" "edlvpdse" "edlvpdch" "edupcgb1" "edupbgb2"
## [499] "edagepgb" "pdwrkp" "edctnp" "uemplap" "uemplip" "dsbldp"
## [505] "rtrdp" "cmsrvp" "hswrkp" "dngothp" "dngdkp" "dngnapp"
## [511] "dngrefp" "dngnap" "mnactp" "crpdwkp" "isco08p" "emprelp"
## [517] "wkhtotp" "edulvlfb" "eiscedf" "edlvfeat" "edlvfebe" "edlvfehr"
## [523] "edlvfgcy" "edlvfdfi" "edlvfdfr" "edufcde1" "edufbde2" "edlvfegr"
## [529] "edlvfdahu" "edlvfdis" "edlvfdie" "edlvffit" "edlvfdlt" "edlvfenl"
## [535] "edlvfeno" "edlvfgpl" "edlvfept" "edlvfdrs" "edlvfdsk" "edlvfesi"
## [541] "edlvfges" "edlvfdse" "edlvfdch" "edufcgb1" "edufbgb2" "edagefgb"
## [547] "emprf14" "occf14b" "edulvlmb" "eiscedm" "edlvmeat" "edlvmebe"
## [553] "edlvmehr" "edlvmgcy" "edlvmdfi" "edlvmdfr" "edumcde1" "edumbde2"
## [559] "edlvmegr" "edlvmdahu" "edlvmdis" "edlvmdie" "edlvmfit" "edlvmdlt"
## [565] "edlvmenl" "edlvmeno" "edlvmgpl" "edlvmept" "edlvmdrs" "edlvmdsk"
## [571] "edlvmesi" "edlvmges" "edlvmdse" "edlvmdch" "edumcgb1" "edumbgb2"
## [577] "edagemgb" "emprm14" "occm14b" "atncrse" "anctrya1" "anctrya2"
## [583] "regunit" "region" "ipcrtiva" "impricha" "ipeqopta" "ipshabta"
## [589] "impsafea" "impdiffa" "ipfrulea" "ipudrsta" "ipmodsta" "ipgdtima"
## [595] "impfreea" "iphlppla" "ipsucesa" "ipstrgva" "ipadvnta" "ipbhprpa"
## [601] "iprspota" "iplylfra" "impenva" "imptrada" "impfuna" "testji1"
## [607] "testji2" "testji3" "testji4" "testji5" "testji6" "testji7"
## [613] "testji8" "testji9" "respc19a" "symtc19" "symtnc19" "vacc19"
## [619] "recon" "inwds" "ainws" "ainwe" "binwe" "cinwe"
## [625] "dinwe" "einwe" "finwe" "hinwe" "iinwe" "kinwe"
## [631] "rinwe" "inwde" "jinws" "jinwe" "inwtm" "mode"
## [637] "domain" "prob" "stratum" "psu"
vnames = c("fltdpr", "flteeff", "slprl", "wrhpp", "fltlnl", "enjlf", "fltsd", "cldgng")
likert_df = df[,vnames]
#Nur UK Daten
#df_uk = df[df$cntry == "United Kingdom", ]
# check
#df_uk$depression = rowSums(df_uk[, c("d20", "d21", "d22", "d23", "d24", "d25", "d26", "d27")]) / 8
#Likert Scale
likert_table = likert(likert_df)$results
likert_numeric_df = as.data.frame(lapply((df[,vnames]), as.numeric))
likert_table$Mean = unlist(lapply((likert_numeric_df[,vnames]), mean, na.rm=T))
# ... and append new columns to the data frame
likert_table$Count = unlist(lapply((likert_numeric_df[,vnames]), function (x) sum(!is.na(x))))
likert_table$Item = c(
d20="how much of the time during the past week you felt depressed?",
d21="…you felt that everything you did was an effort?",
d22="…your sleep was restless?",
d23="…you were happy?",
d24="…you felt lonely?",
d25="…you enjoyed life?",
d26="…you felt sad?",
d27="…you could not get going?")
likert_table
## Item
## 1 how much of the time during the past week you felt depressed?
## 2 …you felt that everything you did was an effort?
## 3 …your sleep was restless?
## 4 …you were happy?
## 5 …you felt lonely?
## 6 …you enjoyed life?
## 7 …you felt sad?
## 8 …you could not get going?
## None or almost none of the time Some of the time Most of the time
## 1 64.915835 29.06631 4.557165
## 2 48.395568 38.42383 9.814171
## 3 43.873854 39.87056 11.625059
## 4 4.003510 23.53973 48.886939
## 5 68.136458 24.27532 5.302253
## 6 5.338783 24.82572 44.804153
## 7 52.489933 41.07451 4.859808
## 8 55.673484 36.10353 6.217928
## All or almost all of the time Mean Count
## 1 1.460694 1.425627 39981
## 2 3.366431 1.681515 39983
## 3 4.630532 1.770123 40017
## 4 23.569817 2.920231 39890
## 5 2.285972 1.417377 39983
## 6 25.031346 2.895281 39878
## 7 1.575748 1.555214 39981
## 8 2.005056 1.545546 39949
# round all percentage values (columns 2 to 5) to 1 decimal digit
likert_table[,2:5] = round(likert_table[,2:5],1)
# round means (column 6) to 3 decimal digits
likert_table[,6] = round(likert_table[,6],3)
# create formatted table
kable_styling(kable(likert_table,
format="html",
caption = "Distribution of answers to the CES-D-8 depression items (ESS round 11, all countries)"
)
)
| Item | None or almost none of the time | Some of the time | Most of the time | All or almost all of the time | Mean | Count |
|---|---|---|---|---|---|---|
| how much of the time during the past week you felt depressed? | 64.9 | 29.1 | 4.6 | 1.5 | 1.426 | 39981 |
| …you felt that everything you did was an effort? | 48.4 | 38.4 | 9.8 | 3.4 | 1.682 | 39983 |
| …your sleep was restless? | 43.9 | 39.9 | 11.6 | 4.6 | 1.770 | 40017 |
| …you were happy? | 4.0 | 23.5 | 48.9 | 23.6 | 2.920 | 39890 |
| …you felt lonely? | 68.1 | 24.3 | 5.3 | 2.3 | 1.417 | 39983 |
| …you enjoyed life? | 5.3 | 24.8 | 44.8 | 25.0 | 2.895 | 39878 |
| …you felt sad? | 52.5 | 41.1 | 4.9 | 1.6 | 1.555 | 39981 |
| …you could not get going? | 55.7 | 36.1 | 6.2 | 2.0 | 1.546 | 39949 |
# create basic plot (code also valid)
plot(likert(summary=likert_table[,1:5])) # limit to columns 1:5 (item and four answer categories) to skip mean and count
#Other R Work not relevant for Homework 4
library(kableExtra)
library(knitr)
# check further (frequency table)
table(df_uk$depression)
##
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
## 103 98 172 201 167 158 146 144 94 78 55 43 34 35 27 14 15 11 11 9
## 20 21 22 23 24
## 9 1 3 3 4
table_dep=data.frame(table(df_uk$depression))
#kable(table_dep,
#col.names = c("Depression Score","Frequency"),
#caption = "Frequency Distribution of Depressionscores in the UK")
#kable_styling(
# kable(table_dep,
# col.names = c("Depression Score","Frequency"),
#caption = "Frequency Distribution of Depressionscores in the UK"
#),full_width = F, font_size = 13, bootstrap_options = c("hover", #"condensed"))
scroll_box(
  kable_styling(
    kable(table_dep, col.names = c("Depression Score","Frequency"),
          caption = "Frequency distribution of depression scores in the UK"
    ), full_width = F, font_size = 13, bootstrap_options = c("hover", "condensed")),
  height="300px")
| Depression Score | Frequency |
|---|---|
| 0 | 103 |
| 1 | 98 |
| 2 | 172 |
| 3 | 201 |
| 4 | 167 |
| 5 | 158 |
| 6 | 146 |
| 7 | 144 |
| 8 | 94 |
| 9 | 78 |
| 10 | 55 |
| 11 | 43 |
| 12 | 34 |
| 13 | 35 |
| 14 | 27 |
| 15 | 14 |
| 16 | 15 |
| 17 | 11 |
| 18 | 11 |
| 19 | 9 |
| 20 | 9 |
| 21 | 1 |
| 22 | 3 |
| 23 | 3 |
| 24 | 4 |
#depression_table_uk = table(df_uk$depression)
#depression_table_uk
# flag respondents with a depression score of 9 or higher (1 = depressed, 0 = not depressed)
df_uk$depressed=ifelse(df_uk$depression >= 9,1,0)
df_uk$depressed
## [1] 0 1 0 0 1 0 0 0 0 1 1 0 0 0 1 0 0 0 NA 0 0 0 NA 0
## [25] 0 0 0 1 1 1 0 0 1 0 0 0 1 0 0 0 1 0 1 0 0 NA 0 0
## [49] 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0
## [73] 1 0 1 1 0 0 0 0 NA 0 1 0 1 0 0 0 0 0 0 0 0 1 0 0
## [97] 0 1 0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 1 1 0 1 0 0 0
## [121] 0 1 0 0 0 NA 1 1 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0
## [145] 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1
## [169] 0 0 0 NA 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0
## [193] 0 0 0 0 1 0 0 0 1 0 0 0 1 0 1 0 1 0 0 1 0 0 0 1
## [217] 0 0 0 1 1 0 1 1 0 0 0 0 1 0 0 1 0 0 0 0 0 NA NA 0
## [241] 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 1 0 0
## [265] 0 0 0 0 1 0 1 0 0 0 1 1 0 0 0 1 0 0 1 1 0 0 0 0
## [289] 0 0 1 1 1 1 0 0 0 0 0 0 NA 0 0 0 0 0 0 0 1 1 0 1
## [313] 0 0 0 0 1 0 0 NA 0 1 0 NA 0 1 1 1 1 0 1 0 0 0 0 0
## [337] 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0
## [361] 1 0 0 1 0 0 1 1 0 0 0 0 1 1 1 1 0 NA 0 0 0 1 0 0
## [385] 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0 0 0 0
## [409] 0 0 0 0 1 0 1 0 1 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0
## [433] 1 0 1 0 NA 1 0 NA 0 0 1 0 0 0 0 1 0 0 1 0 0 0 1 0
## [457] 0 0 1 1 1 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 1 0
## [481] 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0
## [505] 0 1 0 NA 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0
## [529] 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1
## [553] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 NA 1 0 0 0
## [577] 1 0 1 0 0 0 0 NA 0 1 1 0 0 1 0 0 0 0 0 1 0 0 1 1
## [601] 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0
## [625] 1 1 0 0 0 0 0 0 0 NA 0 1 0 1 1 0 0 0 0 0 0 0 0 0
## [649] 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 NA 0
## [673] 0 0 0 0 0 0 0 0 0 0 NA 0 0 1 1 0 0 0 0 1 0 0 1 0
## [697] 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 1 0 0
## [721] 0 1 0 0 1 0 0 NA 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
## [745] 0 0 0 0 1 0 0 0 1 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0
## [769] 1 0 0 0 1 1 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 NA 0
## [793] 0 0 1 NA 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0
## [817] 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 NA 0 0 0 1 NA 0 0 0
## [841] 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0
## [865] 1 0 0 0 0 0 0 0 1 NA 0 0 0 NA 0 0 1 0 1 0 0 1 0 0
## [889] 1 1 0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 NA 0 1 0 1 0 0
## [913] 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 NA 0
## [937] 0 0 1 NA 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0
## [961] 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 0
## [985] 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 0 0 0 NA 0 0
## [1009] 1 1 0 0 1 1 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0
## [1033] 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
## [1057] 0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 0 1 0 1 0 0 0
## [1081] 1 0 1 0 0 0 1 0 0 1 0 1 1 0 1 0 0 1 0 0 1 0 0 0
## [1105] NA 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1
## [1129] 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0
## [1153] 0 0 1 0 0 1 0 0 1 0 0 1 0 0 0 0 0 1 0 1 0 0 0 0
## [1177] 0 0 0 1 0 NA 0 NA 0 0 1 0 0 0 0 0 0 NA 1 0 0 0 0 0
## [1201] 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0
## [1225] 1 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
## [1249] 0 0 0 0 1 0 1 0 0 0 0 1 0 0 0 1 0 1 0 0 1 0 1 0
## [1273] 0 0 1 0 0 0 0 0 0 1 0 0 NA 0 NA 0 1 0 0 0 0 0 0 1
## [1297] 0 0 NA 0 0 0 0 0 1 1 0 0 0 1 1 NA 0 0 0 0 0 0 1 NA
## [1321] 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
## [1345] 0 1 0 0 0 0 1 0 0 NA 0 0 0 0 0 0 1 0 0 1 0 0 0 0
## [1369] 1 0 1 0 1 0 0 0 NA 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0
## [1393] 0 0 1 1 NA 1 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0
## [1417] 0 0 0 1 0 0 1 1 0 0 1 0 0 0 NA 0 0 0 0 0 0 NA NA 0
## [1441] 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
## [1465] 0 1 0 0 0 0 1 0 0 1 1 0 0 1 1 0 0 0 0 0 NA 0 0 0
## [1489] 0 1 1 0 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 1 0 0 0
## [1513] 1 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
## [1537] 1 0 0 0 1 1 0 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0
## [1561] 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [1585] 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 1 1 0 1 0 0 0
## [1609] 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 1 0 0 0
## [1633] 0 1 0 1 0 1 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0
## [1657] NA 0 0 1 0 0 0 0 0 0 1 NA 0 0 0 0 0 0 0 1 0 0 1 0
## [1681] 0 0 0 0
table(df_uk$depressed)
##
## 0 1
## 1283 352
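The odds of clinically significant symptoms follow directly from the two counts in the frequency table above. This is a minimal sketch; the variable names are illustrative, and the counts are copied from the `table(df_uk$depressed)` output.

```r
# Counts taken from the table(df_uk$depressed) output
n_not_depressed <- 1283   # score 0-8
n_depressed     <- 352    # score 9-24

# Odds: depressed vs. not depressed
odds <- n_depressed / n_not_depressed
# Proportion: depressed among all respondents with a valid score
prob <- n_depressed / (n_depressed + n_not_depressed)
print(round(c(odds = odds, probability = prob), 3))
```

Note that this is an odds, not an odds ratio: an odds ratio compares the odds of two groups, for example females and males.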
#Calculating the odds of a higher score (9 up to 24) versus a lower score (0-8)
#People with a depression score between 0-8: 1283
#People with a depression score between 9-24: 352
#Odds: 352/1283=0.274 --> the odds are below 1, i.e. clinically significant symptoms are less likely than no clinically significant symptoms
aModel = glm(depressed ~ gndr, data=df_uk, family=binomial)
# Show summary of regression model
summary(aModel)
##
## Call:
## glm(formula = depressed ~ gndr, family = binomial, data = df_uk)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.45815 0.09035 -16.139 <2e-16 ***
## gndrFemale 0.30941 0.12131 2.551 0.0108 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1703.3 on 1634 degrees of freedom
## Residual deviance: 1696.7 on 1633 degrees of freedom
## (49 observations deleted due to missingness)
## AIC: 1700.7
##
## Number of Fisher Scoring iterations: 4
coef(aModel)
## (Intercept) gndrFemale
## -1.4581529 0.3094088
# Interpretation:
# For a one-unit change in gender (i.e. from male to female), the log odds of depressed
# (clinically significant symptoms vs. no clinically significant symptoms) increase by 0.309.
# This effect is significant at the 5% level (p = 0.011): if the regression coefficient B were
# actually 0 in the population, an estimate this far from 0 would be expected in only about 1% of samples.
#
# As "log odds of depressed" is quite hard to interpret, calculate the Odds Ratios
# OR = exp(B) = e**B
#
exp(coef(aModel))
## (Intercept) gndrFemale
## 0.2326656 1.3626193
# Calculate Confidence Intervals for ORs
exp(confint(aModel))
## 2.5 % 97.5 %
## (Intercept) 0.194246 0.2768727
## gndrFemale 1.075026 1.7300513
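The coefficients can also be turned into model-implied probabilities, which are often easier to communicate than odds. This sketch copies the two coefficients from the `coef(aModel)` output; treating male as the reference category is an inference from the coefficient name `gndrFemale`, and the variable names are illustrative.

```r
# Coefficients copied from the coef(aModel) output
b0 <- -1.4581529   # intercept: log odds for the reference category (assumed: male)
b1 <-  0.3094088   # gndrFemale: change in log odds for females

odds_ratio <- exp(b1)          # odds ratio, female vs. male
p_male     <- plogis(b0)       # inverse logit: 1 / (1 + exp(-b0))
p_female   <- plogis(b0 + b1)  # log odds for females, converted to a probability
print(round(c(OR = odds_ratio, p_male = p_male, p_female = p_female), 3))
```

An odds ratio above 1 whose confidence interval excludes 1 is consistent with H4, although the model is unweighted and contains no control variables.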
Another way of trying to solve the homework is presented below.
#Other way of trying it out
# frequency distribution of the new variable (depression)
# interpretation:
# min. is 0, max. is 24
# at least one individual has a total score of 0 (lowest possible depression level) and at least one individual has a total score of 24 (highest possible depression level)
# most participants report low to moderate depression
# but there are missing data (49 NAs)
### just for a further check:
# counting participants with depression scores between 0 and 8
# counting participants with depression scores >= 9 (up to 24)
# store the frequency table uk - depression in a new variable
depression_table_uk = table(df_uk$depression)
depression_table_uk
##
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
## 103 98 172 201 167 158 146 144 94 78 55 43 34 35 27 14 15 11 11 9
## 20 21 22 23 24
## 9 1 3 3 4
# print
# sum of participants between 0 and 8 (inclusive)
# names() returns character strings, so convert them to numbers before comparing
scores_uk = as.numeric(names(depression_table_uk))
depression_scale_1_to_2 = sum(depression_table_uk[scores_uk >= 0 & scores_uk <= 8])
depression_scale_1_to_2 # print
## [1] 1283
# sum of participants >= 9 (up to 24)
depression_scale_gt2 = sum(depression_table_uk[scores_uk >= 9])
depression_scale_gt2 # print
## [1] 352
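Because `names()` on a table returns character strings, comparing them with a number triggers a string comparison rather than a numeric one. The following sketch with a toy table (illustrative scores, not the ESS data) shows how the two comparisons can disagree; the result of the string comparison relies on the usual collation order, in which "10" and "24" sort before "8".

```r
# Toy frequency table (illustrative scores, not the ESS data)
toy_tab <- table(c(0, 8, 8, 9, 10, 10, 24))

# String comparison: the number 8 is coerced to "8" and compared character by character
n_string <- sum(toy_tab[names(toy_tab) >= 0 & names(toy_tab) <= 8])

# Numeric comparison after converting the names to numbers
scores <- as.numeric(names(toy_tab))
n_numeric <- sum(toy_tab[scores >= 0 & scores <= 8])

print(c(string = n_string, numeric = n_numeric))
```

Only the numeric version counts the respondents with a score of 0 to 8; the string version also counts the scores 10 and 24.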
# double-check
sum(1283, 352) # 1635 participants (uk - depression), excluding 49 NA's
## [1] 1635
#Calculating the odds of a higher score (9 up to 24) versus a lower score (0-8)
#People with a depression score between 0-8: 1283
#People with a depression score between 9-24: 352
#Odds: 352/1283=0.274 --> the odds are below 1, i.e. clinically significant symptoms are less likely than no clinically significant symptoms
Our results mirror those of other papers, for example the finding that LGBTQ+ people are more likely to suffer from depression than straight people. The United Kingdom Survey on the Mental Health of LGBTQ+ (2024) highlighted this problem before us and claimed that victimization, discrimination, and a lack of access to affirming spaces result in poorer mental health. Our data confirm those findings.
Similarly, our finding that people with a skin colour other than white have higher depression scores than white people could be linked to higher rates of discrimination and victimization and to a lack of affirming spaces. According to Stop Hate UK, an organization that supports victims of hate crime in the UK, 43% of all hate crimes reported to its helpline were motivated by racism. This could result from the historical legacy of colonialism and empire, in which racism is deeply rooted. Another possible explanation is a lack of representation: ethnic minorities are underrepresented in positions of power across politics, media, and business.
Our results concerning the correlation between age and depression showed little to no significance. Age does not seem to influence depression scores. The slight negative correlation could be interpreted as resilience increasing with age, with older people being more settled and better able to withstand depression.
The gender gap between men and women extends to depression scores. We found a significant difference in depression scores between men and women. Possible explanations include the greater strain women face in our society, such as lower pay and greater responsibility for housework and parenting.
Nevertheless, our regression analysis showed little impact of sexuality and gender on depression. Therefore, further research is needed to identify stronger drivers of depression. According to the Mental Health Foundation (UK), "People living in the lowest socioeconomic groups are more likely to experience common mental health problems such as depression and anxiety." Loneliness is another strong driver of depression, especially among elderly people (Sheffield Hallam University, 2025). Furthermore, a lack of access to health care services and inequalities in those services in the UK account for higher depression rates (Royal College of Psychiatrists, 2025). These variables, along with exercise, diet and lifestyle choices, could be more dominant determinants of depression. Further research is needed to verify these associations.