The data for assignment 2 used in my analysis was gathered from the paper “The Association Between Nocturnal Panic Attacks and Suicidal Ideation, Plans and Attempts.” published by Nicole S. Smith, Rachel L. Martin, Brian W. Bauer, Shelby L. Brandel, and Daniel W. Capron. This paper hoped to expand upon existing literature regarding the relationships between noctural panic and suicidal ideation, plans, and attempts. Using the data, my hypothesis is that individuals with nocturnal panic are more likely to develop suicidal ideations when compared to individuals without a history of nocturnal panic. The outcome of interest, suicidal ideation, is measured by the number of suicidal episodes within the previous year. The primary exposure of interest, nocturnal panic attacks, is identified as having panic attacks occurring out of sleeping status. Confounders identified and controlled for are gender, age group, and level of education. After analysis, I was able to conclude statistically significant findings that those who experience nocturnal panic attacks have 2.80 times the rate of suicide ideation when compared to those who do not experience nocturnal panic attacks.
download.file("https://md-datasets-cache-zipfiles-prod.s3.eu-west-1.amazonaws.com/9p4vv275hk-1.zip", destfile = "9p4vv275hk-1.zip")
unzip("9p4vv275hk-1.zip")
library(haven)
rawdat <- haven::read_sav("NPs_DeidentifiedData.sav")
r = getOption("repos")
r["CRAN"] = "http://cran.us.r-project.org"
options(repos = r)
install.packages("tidyverse")
##
## The downloaded binary packages are in
## /var/folders/ny/q_wrjbts20x38z4_vr94qp0c0000gn/T//RtmpAC8kft/downloaded_packages
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.3 ✓ purrr 0.3.4
## ✓ tibble 3.1.5 ✓ dplyr 1.0.7
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
cleandat <- rawdat %>% mutate(
agegroup = factor(case_when(
Age >= 19 & Age <= 29 ~ "19-29",
Age >= 30 & Age <= 39 ~ "30-39",
Age >= 40 & Age <= 49 ~ "40-49",
Age >= 50 ~ "50+"
))
) %>%
mutate(
education1 = recode_factor(
EducationCollapsed,
"1" = "High school graduate",
"2" = "Some college",
"3" = "College graduate or above"
)
) %>%
mutate(gender = recode_factor(Sex,
"0" = "Male",
"1" = "Female")) %>%
mutate(NPS1 = recode_factor(NPS1,
"0" = "No nocturnal panic attacks",
"1" = "Nocturnal panic attacks"))
install.packages("table1")
##
## The downloaded binary packages are in
## /var/folders/ny/q_wrjbts20x38z4_vr94qp0c0000gn/T//RtmpAC8kft/downloaded_packages
library(table1)
##
## Attaching package: 'table1'
## The following objects are masked from 'package:base':
##
## units, units<-
cleandat <- within(cleandat, {
label(gender) <- "Biological sex"
label(agegroup) <- "Age group (years)"
label(education1) <- "Education"
label(SITBI5) <- "Episodes of suicide ideation in past year"
label(SITBI18) <- "Episodes of suicide planning in past year"
label(SITBI41) <- "Episodes of attempted suicide in past year"
})
View(cleandat)
table1(~ Sex + agegroup + education1 + SITBI5 + SITBI18 + SITBI41 | NPS1, data=cleandat, overall="Total", rowlabelhead = "Characteristics", caption = "Table 1: Characteristics of the Association Between Nocturnal Panic Attacks and Suicidal Ideation, Plans, and Attempts", topclass="Rtable1-zebra")
| Characteristics | No nocturnal panic attacks (N=160) |
Nocturnal panic attacks (N=161) |
Total (N=327) |
|---|---|---|---|
| Sex | |||
| Mean (SD) | 0.565 (0.498) | 0.573 (0.497) | 0.568 (0.496) |
| Median [Min, Max] | 1.00 [0, 1.00] | 1.00 [0, 1.00] | 1.00 [0, 1.00] |
| Missing | 36 (22.5%) | 58 (36.0%) | 100 (30.6%) |
| Age group (years) | |||
| 19-29 | 35 (21.9%) | 43 (26.7%) | 78 (23.9%) |
| 30-39 | 50 (31.3%) | 31 (19.3%) | 81 (24.8%) |
| 40-49 | 24 (15.0%) | 17 (10.6%) | 41 (12.5%) |
| 50+ | 15 (9.4%) | 12 (7.5%) | 27 (8.3%) |
| Missing | 36 (22.5%) | 58 (36.0%) | 100 (30.6%) |
| Education | |||
| High school graduate | 16 (10.0%) | 10 (6.2%) | 26 (8.0%) |
| Some college | 19 (11.9%) | 23 (14.3%) | 42 (12.8%) |
| College graduate or above | 88 (55.0%) | 70 (43.5%) | 158 (48.3%) |
| Missing | 37 (23.1%) | 58 (36.0%) | 101 (30.9%) |
| Episodes of suicide ideation in past year | |||
| Mean (SD) | 2.01 (10.2) | 8.39 (34.9) | 4.76 (24.3) |
| Median [Min, Max] | 0 [0, 100] | 0 [0, 300] | 0 [0, 300] |
| Missing | 49 (30.6%) | 77 (47.8%) | 132 (40.4%) |
| Episodes of suicide planning in past year | |||
| Mean (SD) | 0.0360 (0.299) | 1.71 (6.30) | 0.759 (4.21) |
| Median [Min, Max] | 0 [0, 3.00] | 0 [0, 55.0] | 0 [0, 55.0] |
| Missing | 49 (30.6%) | 77 (47.8%) | 132 (40.4%) |
| Episodes of attempted suicide in past year | |||
| Mean (SD) | 0 (0) | 1.04 (5.32) | 0.471 (3.61) |
| Median [Min, Max] | 0 [0, 0] | 0 [0, 52.0] | 0 [0, 52.0] |
| Missing | 37 (23.1%) | 59 (36.6%) | 102 (31.2%) |
Our sample data set included a total of 327 participants, approximately equally distributed between historically having nocturnal panic attacks and no reported nocturnal panic. Our study population was mostly between age 19-29 and 30-39.
Each variable of interest contains missing data. Age, sex and education were not reported for 100 people. Episodes of suicide ideation & planning were not reported for 132 people. Attempted suicide also was not reported for 102 people.
library(ggplot2)
ggplot(cleandat, aes(x=SITBI5)) + geom_histogram(binwidth = 1)
## Warning: Removed 132 rows containing non-finite values (stat_bin).
The histogram depicts # of suicide ideation episodes. From the histogram we can see there was a large right skew in our data, with many reports of 0 episodes of suicide ideation.
mm <- model.matrix(~ NPS1 + Sex + agegroup + education1, data = cleandat)
plot(hclust(as.dist(1 - cor(mm[, -1]))))
library(pheatmap)
pheatmap(cor(mm[, -1]), display_numbers = TRUE, number_color = "black", fontsize = 9, angel_col = 315, labels_row = c("female", "30-39", "40-49", "50+", "some college", "college education and above", "nocturnal panic attacks"), labels_col = c("female", "30-39", "40-49", "50+", "some college", "college education and above", "nocturnal panic attacks"), main = "Heatmap Assessing Multicollinearity")
The predictors (NPS1, sex, agegroup, and education) for the error models were assesed for multicollinearity. Model matrix was created using the variables and then embedded in a pheatmap. Variables were checked for correlations with each other and it was observed that none of the variables have any significant correlation.
poi <- glm(SITBI5 ~ NPS1 + Sex + agegroup + education1, family="poisson", data=cleandat)
summary(poi)
##
## Call:
## glm(formula = SITBI5 ~ NPS1 + Sex + agegroup + education1, family = "poisson",
## data = cleandat)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -10.9968 -2.4663 -1.5086 -0.7216 27.9509
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 3.12217 0.08651 36.091 < 2e-16 ***
## NPS1Nocturnal panic attacks 1.47492 0.07781 18.954 < 2e-16 ***
## Sex -1.50651 0.07418 -20.308 < 2e-16 ***
## agegroup30-39 -0.49502 0.08061 -6.141 8.19e-10 ***
## agegroup40-49 -0.38384 0.09575 -4.009 6.10e-05 ***
## agegroup50+ -2.43290 0.28148 -8.643 < 2e-16 ***
## education1Some college -1.41429 0.10272 -13.768 < 2e-16 ***
## education1College graduate or above -2.03498 0.07480 -27.205 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 4662.4 on 193 degrees of freedom
## Residual deviance: 3098.9 on 186 degrees of freedom
## (133 observations deleted due to missingness)
## AIC: 3312.5
##
## Number of Fisher Scoring iterations: 8
The poisson model shows very strong associations with all predictor variables (females, all age groups, all education levels, and having nocturnal panic attacks) and their impact on the outcome variable, episodes of suicide ideation in the past year. Women, age, and increased education levels all show significant negative correlations with episodes of suicidal ideation, while nocturnal panic attacks show a positive correlation.
install.packages("MASS")
##
## The downloaded binary packages are in
## /var/folders/ny/q_wrjbts20x38z4_vr94qp0c0000gn/T//RtmpAC8kft/downloaded_packages
library(MASS)
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
negbin <- glm.nb(SITBI5 ~ NPS1 + Sex + agegroup + education1, data=cleandat)
## Warning: glm.fit: algorithm did not converge
## Warning in glm.nb(SITBI5 ~ NPS1 + Sex + agegroup + education1, data = cleandat):
## alternation limit reached
summary(negbin)
##
## Call:
## glm.nb(formula = SITBI5 ~ NPS1 + Sex + agegroup + education1,
## data = cleandat, init.theta = 0.1038776776, link = log)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.0876 -0.8365 -0.7112 -0.4170 2.3618
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.0836 0.8253 2.525 0.0116 *
## NPS1Nocturnal panic attacks 1.0290 0.4714 2.183 0.0290 *
## Sex -1.2504 0.4864 -2.571 0.0102 *
## agegroup30-39 -0.3528 0.5535 -0.637 0.5238
## agegroup40-49 -0.3994 0.6974 -0.573 0.5668
## agegroup50+ -1.8336 0.8352 -2.195 0.0281 *
## education1Some college 0.6656 0.8710 0.764 0.4447
## education1College graduate or above -0.7551 0.7487 -1.009 0.3132
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for Negative Binomial(0.1039) family taken to be 1)
##
## Null deviance: 137.94 on 193 degrees of freedom
## Residual deviance: 116.10 on 186 degrees of freedom
## (133 observations deleted due to missingness)
## AIC: 604.71
##
## Number of Fisher Scoring iterations: 1
##
##
## Theta: 0.1039
## Std. Err.: 0.0170
## Warning while fitting theta: alternation limit reached
##
## 2 x log-likelihood: -586.7140
Next we have the negative binomial model, which shows significant associations with predictor variables of having nocturnal panic attacks, females, and age 50+, and their impact on the outcome variable, episodes of suicide ideation in the past year. Women and age 50+ show significant negative correlations with episodes of suicidal ideation, while nocturnal panic attacks show a positive correlation.
library(pscl)
## Classes and Methods for R developed in the
## Political Science Computational Laboratory
## Department of Political Science
## Stanford University
## Simon Jackman
## hurdle and zeroinfl functions by Achim Zeileis
zeropoi <- zeroinfl(SITBI5 ~ NPS1 + Sex + agegroup + education1 | 1, data=cleandat, dist = "poisson")
summary(zeropoi)
##
## Call:
## zeroinfl(formula = SITBI5 ~ NPS1 + Sex + agegroup + education1 | 1, data = cleandat,
## dist = "poisson")
##
## Pearson residuals:
## Min 1Q Median 3Q Max
## -0.6375 -0.6046 -0.5747 -0.4022 33.2476
##
## Count model coefficients (poisson with log link):
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 4.92912 0.09678 50.933 < 2e-16 ***
## NPS1Nocturnal panic attacks 0.25432 0.08897 2.858 0.004258 **
## Sex -0.78972 0.08243 -9.581 < 2e-16 ***
## agegroup30-39 -0.42085 0.09127 -4.611 4.01e-06 ***
## agegroup40-49 0.03257 0.11145 0.292 0.770116
## agegroup50+ -1.10114 0.30994 -3.553 0.000381 ***
## education1Some college -1.95352 0.11424 -17.100 < 2e-16 ***
## education1College graduate or above -2.32817 0.07883 -29.532 < 2e-16 ***
##
## Zero-inflation model coefficients (binomial with logit link):
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.8904 0.1591 5.598 2.17e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Number of iterations in BFGS optimization: 13
## Log-likelihood: -851.6 on 9 Df
Next, the zero-inflation poisson model shows strong associations with most predictor variables (gender, age (30-39 as well as 50+), all education levels) and their impact on the outcome variable, episodes of suicide ideation in the past year. The model also shows significant, but not as strong association with having nocturnal panic attacks and their impact on the outcome variable. Women, age 30-39 and 50+, as well as increasing education levels all show significant negative correlations with episodes of suicidal ideation, while nocturnal panic attacks show a positive correlation.
zeronegbin <- zeroinfl(SITBI5 ~ NPS1 + Sex + agegroup + education1 | 1, data=cleandat, dist = "negbin")
summary(zeronegbin)
##
## Call:
## zeroinfl(formula = SITBI5 ~ NPS1 + Sex + agegroup + education1 | 1, data = cleandat,
## dist = "negbin")
##
## Pearson residuals:
## Min 1Q Median 3Q Max
## -0.3218 -0.3167 -0.3079 -0.2547 9.7620
##
## Count model coefficients (negbin with log link):
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.0834 0.8484 2.456 0.0141 *
## NPS1Nocturnal panic attacks 1.0290 0.5247 1.961 0.0498 *
## Sex -1.2503 0.6148 -2.034 0.0420 *
## agegroup30-39 -0.3528 0.6491 -0.544 0.5867
## agegroup40-49 -0.3994 0.8402 -0.475 0.6345
## agegroup50+ -1.8335 0.9282 -1.975 0.0482 *
## education1Some college 0.6659 1.1669 0.571 0.5682
## education1College graduate or above -0.7549 0.8473 -0.891 0.3729
## Log(theta) -2.2645 0.1633 -13.868 <2e-16 ***
##
## Zero-inflation model coefficients (binomial with logit link):
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -12.35 362.64 -0.034 0.973
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Theta = 0.1039
## Number of iterations in BFGS optimization: 43
## Log-likelihood: -293.4 on 10 Df
Lastly, we fit the zero-inflation negative binomial model which shows significant associations with having nocturnal panic attacks, females, and age 50+ and the outcome variable, suicidal ideations in the past year. Women and age 50+ show significant negative correlations with episodes of suicidal ideation, while nocturnal panic attacks show a positive correlation.
#Assessing which model is best fit
logLik(poi)
## 'log Lik.' -1648.245 (df=8)
logLik(negbin)
## 'log Lik.' -293.3571 (df=9)
logLik(zeropoi)
## 'log Lik.' -851.6468 (df=9)
logLik(zeronegbin)
## 'log Lik.' -293.3571 (df=10)
So after we compared the log likelihoods for each model of fit, we can conclude that the negative binomial model is the best fit for the outcome variable, episodes of suicide ideation in the past year.
library(gtsummary)
##
## Attaching package: 'gtsummary'
## The following object is masked from 'package:MASS':
##
## select
tbl_regression(negbin, exponentiate = TRUE)
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: algorithm did not converge
| Characteristic | IRR1 | 95% CI1 | p-value |
|---|---|---|---|
| NPS1 | |||
| No nocturnal panic attacks | — | — | |
| Nocturnal panic attacks | 2.80 | 1.00, 8.02 | 0.029 |
| Sex | 0.29 | 0.08, 0.93 | 0.010 |
| Age group (years) | |||
| 19-29 | — | — | |
| 30-39 | 0.70 | 0.19, 2.58 | 0.5 |
| 40-49 | 0.67 | 0.13, 3.67 | 0.6 |
| 50+ | 0.16 | 0.03, 1.17 | 0.028 |
| Education | |||
| High school graduate | — | — | |
| Some college | 1.95 | 0.18, 18.7 | 0.4 |
| College graduate or above | 0.47 | 0.07, 2.23 | 0.3 |
|
1
IRR = Incidence Rate Ratio, CI = Confidence Interval
|
|||
After choosing the negative binomial model, I interpreted the regression results by exponentiating the coefficients to get incidence rate ratios for comparison. The table above shows the incidence rate ratios with 95% confidence intervals and the associated p-values for each coefficient. I can conclude that those who experience nocturnal panic attacks have 2.80 times the rate of suicide ideation when compared to those who do not experience nocturnal panic attacks. The p-value 0.05 shows this model as statistically significant. I can also conclude from the table that being female and being 50 years and older both have statistically significant p values and show lower rates of suicide ideation episodes.
My results are consistent with the paper findings. A difference between my data and the publication that likely gave me different results is that the publication further defined panic attacks into daytime vs nighttime panics vs no panic, rather than my data which only compared panic attacks out of sleep vs no panic. It should be noted that since the data gathered is self-reported episodes of suicide ideation, error is possible. Overall, both the paper and my findings are able to conclude noctural panic attacks are associated with higher rates of suicide ideation.