Introduction

The data for assignment 2 used in my analysis was gathered from the paper “The Association Between Nocturnal Panic Attacks and Suicidal Ideation, Plans and Attempts.” published by Nicole S. Smith, Rachel L. Martin, Brian W. Bauer, Shelby L. Brandel, and Daniel W. Capron. This paper hoped to expand upon existing literature regarding the relationships between noctural panic and suicidal ideation, plans, and attempts. Using the data, my hypothesis is that individuals with nocturnal panic are more likely to develop suicidal ideations when compared to individuals without a history of nocturnal panic. The outcome of interest, suicidal ideation, is measured by the number of suicidal episodes within the previous year. The primary exposure of interest, nocturnal panic attacks, is identified as having panic attacks occurring out of sleeping status. Confounders identified and controlled for are gender, age group, and level of education. After analysis, I was able to conclude statistically significant findings that those who experience nocturnal panic attacks have 2.80 times the rate of suicide ideation when compared to those who do not experience nocturnal panic attacks.

Load Data

download.file("https://md-datasets-cache-zipfiles-prod.s3.eu-west-1.amazonaws.com/9p4vv275hk-1.zip", destfile = "9p4vv275hk-1.zip")
unzip("9p4vv275hk-1.zip")
library(haven)
rawdat <- haven::read_sav("NPs_DeidentifiedData.sav")

Clean Data

r = getOption("repos")
r["CRAN"] = "http://cran.us.r-project.org"
options(repos = r)
install.packages("tidyverse")
## 
## The downloaded binary packages are in
##  /var/folders/ny/q_wrjbts20x38z4_vr94qp0c0000gn/T//RtmpAC8kft/downloaded_packages
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.1.5     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
cleandat <- rawdat %>% mutate(
  agegroup = factor(case_when(
    Age >= 19 & Age <= 29 ~ "19-29",
    Age >= 30 & Age <= 39 ~ "30-39",
    Age >= 40 & Age <= 49 ~ "40-49",
    Age >= 50 ~ "50+"
  ))
) %>%
  mutate(
    education1 = recode_factor(
      EducationCollapsed,
      "1" = "High school graduate",
      "2" = "Some college",
      "3" = "College graduate or above"
    )
  ) %>%
  mutate(gender = recode_factor(Sex,
                                "0" = "Male",
                                "1" = "Female")) %>%
  mutate(NPS1 = recode_factor(NPS1,
         "0" = "No nocturnal panic attacks", 
         "1" = "Nocturnal panic attacks"))

install.packages("table1")
## 
## The downloaded binary packages are in
##  /var/folders/ny/q_wrjbts20x38z4_vr94qp0c0000gn/T//RtmpAC8kft/downloaded_packages
library(table1)
## 
## Attaching package: 'table1'
## The following objects are masked from 'package:base':
## 
##     units, units<-
cleandat <- within(cleandat, {
  label(gender) <- "Biological sex"
  label(agegroup) <- "Age group (years)"
  label(education1)   <- "Education"
  label(SITBI5) <- "Episodes of suicide ideation in past year"
  label(SITBI18) <- "Episodes of suicide planning in past year"
  label(SITBI41) <- "Episodes of attempted suicide in past year"
})
View(cleandat)

Table 1: Characteristics of the Association Between Nocturnal Panic Attacks and Suicidal Ideation, Plans, and Attempts

table1(~ Sex + agegroup + education1 + SITBI5 + SITBI18 + SITBI41 | NPS1, data=cleandat, overall="Total", rowlabelhead = "Characteristics", caption = "Table 1: Characteristics of the Association Between Nocturnal Panic Attacks and Suicidal Ideation, Plans, and Attempts", topclass="Rtable1-zebra")
Table 1: Characteristics of the Association Between Nocturnal Panic Attacks and Suicidal Ideation, Plans, and Attempts
Characteristics No nocturnal panic attacks
(N=160)
Nocturnal panic attacks
(N=161)
Total
(N=327)
Sex
Mean (SD) 0.565 (0.498) 0.573 (0.497) 0.568 (0.496)
Median [Min, Max] 1.00 [0, 1.00] 1.00 [0, 1.00] 1.00 [0, 1.00]
Missing 36 (22.5%) 58 (36.0%) 100 (30.6%)
Age group (years)
19-29 35 (21.9%) 43 (26.7%) 78 (23.9%)
30-39 50 (31.3%) 31 (19.3%) 81 (24.8%)
40-49 24 (15.0%) 17 (10.6%) 41 (12.5%)
50+ 15 (9.4%) 12 (7.5%) 27 (8.3%)
Missing 36 (22.5%) 58 (36.0%) 100 (30.6%)
Education
High school graduate 16 (10.0%) 10 (6.2%) 26 (8.0%)
Some college 19 (11.9%) 23 (14.3%) 42 (12.8%)
College graduate or above 88 (55.0%) 70 (43.5%) 158 (48.3%)
Missing 37 (23.1%) 58 (36.0%) 101 (30.9%)
Episodes of suicide ideation in past year
Mean (SD) 2.01 (10.2) 8.39 (34.9) 4.76 (24.3)
Median [Min, Max] 0 [0, 100] 0 [0, 300] 0 [0, 300]
Missing 49 (30.6%) 77 (47.8%) 132 (40.4%)
Episodes of suicide planning in past year
Mean (SD) 0.0360 (0.299) 1.71 (6.30) 0.759 (4.21)
Median [Min, Max] 0 [0, 3.00] 0 [0, 55.0] 0 [0, 55.0]
Missing 49 (30.6%) 77 (47.8%) 132 (40.4%)
Episodes of attempted suicide in past year
Mean (SD) 0 (0) 1.04 (5.32) 0.471 (3.61)
Median [Min, Max] 0 [0, 0] 0 [0, 52.0] 0 [0, 52.0]
Missing 37 (23.1%) 59 (36.6%) 102 (31.2%)

Our sample data set included a total of 327 participants, approximately equally distributed between historically having nocturnal panic attacks and no reported nocturnal panic. Our study population was mostly between age 19-29 and 30-39.

Each variable of interest contains missing data. Age, sex and education were not reported for 100 people. Episodes of suicide ideation & planning were not reported for 132 people. Attempted suicide also was not reported for 102 people.

Histogram

library(ggplot2)
ggplot(cleandat, aes(x=SITBI5)) + geom_histogram(binwidth = 1)
## Warning: Removed 132 rows containing non-finite values (stat_bin).

The histogram depicts # of suicide ideation episodes. From the histogram we can see there was a large right skew in our data, with many reports of 0 episodes of suicide ideation.

Assess Multicollinearity

mm <- model.matrix(~ NPS1 + Sex + agegroup + education1, data = cleandat)
plot(hclust(as.dist(1 - cor(mm[, -1]))))

library(pheatmap)
pheatmap(cor(mm[, -1]), display_numbers = TRUE, number_color = "black", fontsize = 9, angel_col = 315, labels_row = c("female", "30-39", "40-49", "50+", "some college", "college education and above", "nocturnal panic attacks"), labels_col = c("female", "30-39", "40-49", "50+", "some college", "college education and above", "nocturnal panic attacks"), main = "Heatmap Assessing Multicollinearity")

The predictors (NPS1, sex, agegroup, and education) for the error models were assesed for multicollinearity. Model matrix was created using the variables and then embedded in a pheatmap. Variables were checked for correlations with each other and it was observed that none of the variables have any significant correlation.

Poisson

poi <- glm(SITBI5 ~ NPS1 + Sex + agegroup + education1, family="poisson", data=cleandat)
summary(poi)
## 
## Call:
## glm(formula = SITBI5 ~ NPS1 + Sex + agegroup + education1, family = "poisson", 
##     data = cleandat)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -10.9968   -2.4663   -1.5086   -0.7216   27.9509  
## 
## Coefficients:
##                                     Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                          3.12217    0.08651  36.091  < 2e-16 ***
## NPS1Nocturnal panic attacks          1.47492    0.07781  18.954  < 2e-16 ***
## Sex                                 -1.50651    0.07418 -20.308  < 2e-16 ***
## agegroup30-39                       -0.49502    0.08061  -6.141 8.19e-10 ***
## agegroup40-49                       -0.38384    0.09575  -4.009 6.10e-05 ***
## agegroup50+                         -2.43290    0.28148  -8.643  < 2e-16 ***
## education1Some college              -1.41429    0.10272 -13.768  < 2e-16 ***
## education1College graduate or above -2.03498    0.07480 -27.205  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 4662.4  on 193  degrees of freedom
## Residual deviance: 3098.9  on 186  degrees of freedom
##   (133 observations deleted due to missingness)
## AIC: 3312.5
## 
## Number of Fisher Scoring iterations: 8

The poisson model shows very strong associations with all predictor variables (females, all age groups, all education levels, and having nocturnal panic attacks) and their impact on the outcome variable, episodes of suicide ideation in the past year. Women, age, and increased education levels all show significant negative correlations with episodes of suicidal ideation, while nocturnal panic attacks show a positive correlation.

Negative Binomial

install.packages("MASS")
## 
## The downloaded binary packages are in
##  /var/folders/ny/q_wrjbts20x38z4_vr94qp0c0000gn/T//RtmpAC8kft/downloaded_packages
library(MASS)
## 
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
## 
##     select
negbin <- glm.nb(SITBI5 ~ NPS1 + Sex + agegroup + education1, data=cleandat)
## Warning: glm.fit: algorithm did not converge
## Warning in glm.nb(SITBI5 ~ NPS1 + Sex + agegroup + education1, data = cleandat):
## alternation limit reached
summary(negbin)
## 
## Call:
## glm.nb(formula = SITBI5 ~ NPS1 + Sex + agegroup + education1, 
##     data = cleandat, init.theta = 0.1038776776, link = log)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.0876  -0.8365  -0.7112  -0.4170   2.3618  
## 
## Coefficients:
##                                     Estimate Std. Error z value Pr(>|z|)  
## (Intercept)                           2.0836     0.8253   2.525   0.0116 *
## NPS1Nocturnal panic attacks           1.0290     0.4714   2.183   0.0290 *
## Sex                                  -1.2504     0.4864  -2.571   0.0102 *
## agegroup30-39                        -0.3528     0.5535  -0.637   0.5238  
## agegroup40-49                        -0.3994     0.6974  -0.573   0.5668  
## agegroup50+                          -1.8336     0.8352  -2.195   0.0281 *
## education1Some college                0.6656     0.8710   0.764   0.4447  
## education1College graduate or above  -0.7551     0.7487  -1.009   0.3132  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for Negative Binomial(0.1039) family taken to be 1)
## 
##     Null deviance: 137.94  on 193  degrees of freedom
## Residual deviance: 116.10  on 186  degrees of freedom
##   (133 observations deleted due to missingness)
## AIC: 604.71
## 
## Number of Fisher Scoring iterations: 1
## 
## 
##               Theta:  0.1039 
##           Std. Err.:  0.0170 
## Warning while fitting theta: alternation limit reached 
## 
##  2 x log-likelihood:  -586.7140

Next we have the negative binomial model, which shows significant associations with predictor variables of having nocturnal panic attacks, females, and age 50+, and their impact on the outcome variable, episodes of suicide ideation in the past year. Women and age 50+ show significant negative correlations with episodes of suicidal ideation, while nocturnal panic attacks show a positive correlation.

Zero Inflation Pois (Intercept Only)

library(pscl)
## Classes and Methods for R developed in the
## Political Science Computational Laboratory
## Department of Political Science
## Stanford University
## Simon Jackman
## hurdle and zeroinfl functions by Achim Zeileis
zeropoi <- zeroinfl(SITBI5 ~ NPS1 + Sex + agegroup + education1 | 1, data=cleandat, dist = "poisson")
summary(zeropoi)
## 
## Call:
## zeroinfl(formula = SITBI5 ~ NPS1 + Sex + agegroup + education1 | 1, data = cleandat, 
##     dist = "poisson")
## 
## Pearson residuals:
##     Min      1Q  Median      3Q     Max 
## -0.6375 -0.6046 -0.5747 -0.4022 33.2476 
## 
## Count model coefficients (poisson with log link):
##                                     Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                          4.92912    0.09678  50.933  < 2e-16 ***
## NPS1Nocturnal panic attacks          0.25432    0.08897   2.858 0.004258 ** 
## Sex                                 -0.78972    0.08243  -9.581  < 2e-16 ***
## agegroup30-39                       -0.42085    0.09127  -4.611 4.01e-06 ***
## agegroup40-49                        0.03257    0.11145   0.292 0.770116    
## agegroup50+                         -1.10114    0.30994  -3.553 0.000381 ***
## education1Some college              -1.95352    0.11424 -17.100  < 2e-16 ***
## education1College graduate or above -2.32817    0.07883 -29.532  < 2e-16 ***
## 
## Zero-inflation model coefficients (binomial with logit link):
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   0.8904     0.1591   5.598 2.17e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## Number of iterations in BFGS optimization: 13 
## Log-likelihood: -851.6 on 9 Df

Next, the zero-inflation poisson model shows strong associations with most predictor variables (gender, age (30-39 as well as 50+), all education levels) and their impact on the outcome variable, episodes of suicide ideation in the past year. The model also shows significant, but not as strong association with having nocturnal panic attacks and their impact on the outcome variable. Women, age 30-39 and 50+, as well as increasing education levels all show significant negative correlations with episodes of suicidal ideation, while nocturnal panic attacks show a positive correlation.

Zero Inflation Negative Binomial (Intercept Only)

zeronegbin <- zeroinfl(SITBI5 ~ NPS1 + Sex + agegroup + education1 | 1, data=cleandat, dist = "negbin")
summary(zeronegbin)
## 
## Call:
## zeroinfl(formula = SITBI5 ~ NPS1 + Sex + agegroup + education1 | 1, data = cleandat, 
##     dist = "negbin")
## 
## Pearson residuals:
##     Min      1Q  Median      3Q     Max 
## -0.3218 -0.3167 -0.3079 -0.2547  9.7620 
## 
## Count model coefficients (negbin with log link):
##                                     Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                           2.0834     0.8484   2.456   0.0141 *  
## NPS1Nocturnal panic attacks           1.0290     0.5247   1.961   0.0498 *  
## Sex                                  -1.2503     0.6148  -2.034   0.0420 *  
## agegroup30-39                        -0.3528     0.6491  -0.544   0.5867    
## agegroup40-49                        -0.3994     0.8402  -0.475   0.6345    
## agegroup50+                          -1.8335     0.9282  -1.975   0.0482 *  
## education1Some college                0.6659     1.1669   0.571   0.5682    
## education1College graduate or above  -0.7549     0.8473  -0.891   0.3729    
## Log(theta)                           -2.2645     0.1633 -13.868   <2e-16 ***
## 
## Zero-inflation model coefficients (binomial with logit link):
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   -12.35     362.64  -0.034    0.973
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## Theta = 0.1039 
## Number of iterations in BFGS optimization: 43 
## Log-likelihood: -293.4 on 10 Df

Lastly, we fit the zero-inflation negative binomial model which shows significant associations with having nocturnal panic attacks, females, and age 50+ and the outcome variable, suicidal ideations in the past year. Women and age 50+ show significant negative correlations with episodes of suicidal ideation, while nocturnal panic attacks show a positive correlation.

#Assessing which model is best fit

logLik(poi)
## 'log Lik.' -1648.245 (df=8)
logLik(negbin)
## 'log Lik.' -293.3571 (df=9)
logLik(zeropoi)
## 'log Lik.' -851.6468 (df=9)
logLik(zeronegbin)
## 'log Lik.' -293.3571 (df=10)

So after we compared the log likelihoods for each model of fit, we can conclude that the negative binomial model is the best fit for the outcome variable, episodes of suicide ideation in the past year.

library(gtsummary)
## 
## Attaching package: 'gtsummary'
## The following object is masked from 'package:MASS':
## 
##     select
tbl_regression(negbin, exponentiate = TRUE)
## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: algorithm did not converge
Characteristic IRR1 95% CI1 p-value
NPS1
No nocturnal panic attacks
Nocturnal panic attacks 2.80 1.00, 8.02 0.029
Sex 0.29 0.08, 0.93 0.010
Age group (years)
19-29
30-39 0.70 0.19, 2.58 0.5
40-49 0.67 0.13, 3.67 0.6
50+ 0.16 0.03, 1.17 0.028
Education
High school graduate
Some college 1.95 0.18, 18.7 0.4
College graduate or above 0.47 0.07, 2.23 0.3

1 IRR = Incidence Rate Ratio, CI = Confidence Interval

Conclusion

After choosing the negative binomial model, I interpreted the regression results by exponentiating the coefficients to get incidence rate ratios for comparison. The table above shows the incidence rate ratios with 95% confidence intervals and the associated p-values for each coefficient. I can conclude that those who experience nocturnal panic attacks have 2.80 times the rate of suicide ideation when compared to those who do not experience nocturnal panic attacks. The p-value 0.05 shows this model as statistically significant. I can also conclude from the table that being female and being 50 years and older both have statistically significant p values and show lower rates of suicide ideation episodes.

My results are consistent with the paper findings. A difference between my data and the publication that likely gave me different results is that the publication further defined panic attacks into daytime vs nighttime panics vs no panic, rather than my data which only compared panic attacks out of sleep vs no panic. It should be noted that since the data gathered is self-reported episodes of suicide ideation, error is possible. Overall, both the paper and my findings are able to conclude noctural panic attacks are associated with higher rates of suicide ideation.