The dataset used for this analysis is the National Health Interview Survey (NHIS) from 1997-2016. The analysis examines the relationship between mental illness, the dependent variable, and gender, the independent variable. Mental illness is indicated using the Kessler score, which was developed by Ronald C. Kessler and is known as the Kessler 6 scale (K6). My hypothesis is that women are more likely to have mental illness, as are people with other indicators such as a high BMI, a non-straight sexual orientation, or poor health status. This analysis fits several logistic regression models and compares them with likelihood ratio tests and AIC/BIC to find the model that best fits having mental illness.
library(readr)
library(dplyr)
library(texreg)
library(Zelig)
library(visreg)
library(lmtest)
load("/Users/Deepakie/Documents/Queens College/SOC712/Data/NHIS_v3.rdata")
head(NHIS_v3)
# Keep the variables of interest and recode them into labelled categories
NHIS <- NHIS_v3 %>%
  select(sex, racenew, sexorien, bmi_7, health, asad, aeffort, ahopeless, aworthless, anervous, arestless) %>%
  mutate(
    sex = ifelse(sex == 1, "Male", "Female"),
    # BMI stays a character variable, so glm() uses the alphabetically first value ("Normal") as the reference
    BMI = ifelse(bmi_7 == 1, "Underweight",
          ifelse(bmi_7 == 2, "Normal",
          ifelse(bmi_7 == 3, "Overweight",
          ifelse(bmi_7 == 4, "Obese30s",
          ifelse(bmi_7 == 5, "Obese40s",
          ifelse(bmi_7 == 6, "Obese50s", NA)))))),
    # The first entry of levels is the reference category ("Straight")
    sexorient = factor(ifelse(sexorien == 1, "Gay/Lesbian",
                       ifelse(sexorien == 2, "Straight",
                       ifelse(sexorien == 3, "Bisexual",
                       ifelse(sexorien >= 4, NA, NA)))),
                       levels = c("Straight", "Bisexual", "Gay/Lesbian")),
    # Labels not listed in levels (here "American Indian/Alaskan Native") become NA
    race = factor(ifelse(racenew == 10, "White",
                  ifelse(racenew == 20, "Black/African American",
                  ifelse(racenew == 30, "American Indian/Alaskan Native",
                  ifelse(racenew == 40, "Asian",
                  ifelse(racenew == 50, "Multiple Race",
                  ifelse(racenew == 60, "Other Race",
                  ifelse(racenew >= 61, NA, NA))))))),
                  levels = c("White", "Black/African American", "Asian", "Multiple Race", "Other Race")),
    # "Poor" is listed first, so poor health is the reference category
    Healthstatus = factor(ifelse(health == 1, "Excellent",
                          ifelse(health == 2, "Very Good",
                          ifelse(health == 3, "Good",
                          ifelse(health == 4, "Fair",
                          ifelse(health == 5, "Poor", NA))))),
                          levels = c("Poor", "Very Good", "Good", "Fair", "Excellent")),
    # Codes above 4 on the six K6 items are treated as missing
    ahopeless  = ifelse(ahopeless > 4, NA, ahopeless),
    asad       = ifelse(asad > 4, NA, asad),
    aworthless = ifelse(aworthless > 4, NA, aworthless),
    aeffort    = ifelse(aeffort > 4, NA, aeffort),
    arestless  = ifelse(arestless > 4, NA, arestless),
    anervous   = ifelse(anervous > 4, NA, anervous),
    # K6 total of 13 or more is coded 1 (serious mental illness), otherwise 0
    Seriousmentalillness = ifelse(ahopeless + asad + aworthless + aeffort + arestless + anervous >= 13, 1, 0)
  ) %>%
  # Drop the raw variables once the recoded versions exist
  select(-asad, -aeffort, -ahopeless, -aworthless, -anervous, -arestless, -bmi_7, -sexorien, -racenew, -health)
As shown above, I recoded and cleaned my variables. For BMI, there are 6 categories: underweight, normal, overweight, Obese30s, Obese40s and Obese50s. A BMI of 30 or above is considered obese. There are three obese categories for the BMI variable in this analysis: Obese30s indicates a BMI from 30-39, Obese40s indicates a BMI from 40-49 and Obese50s indicates a BMI of 50 and above. Sexual orientation is indicated in 3 categories: straight, bisexual, and gay/lesbian. Additionally, race is specified in 5 categories. It is important to note that health status indicates how respondents rate their own health: poor, fair, good, very good or excellent. It is assumed that those who report poor or fair health are more likely to have mental illness. The dependent variable, Seriousmentalillness, is based on the sum of the 6 questions which constitute a scale measuring psychological distress. The Kessler score (K6) is the sum of these 6 mental distress items, each kept on a scale of 0-4 after codes above 4 are set to missing. For the sake of this analysis, those with a total score of 13 or above are considered to have serious mental distress or illness, and those with a score below 13 are considered to have low or no mental illness. Those who have serious mental illness are coded as 1 and those who do not are coded as 0.
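The scoring rule described here can be illustrated on a few made-up respondents. This is only a sketch in base R: the data frame `toy` and its values are hypothetical, and it assumes each item is coded 0-4 with any code above 4 standing for an unknown answer.

```r
# Toy responses for three hypothetical respondents (not NHIS data).
# Assumption: each item is coded 0-4, and codes above 4 mean "unknown".
toy <- data.frame(
  asad       = c(0, 3, 4),
  aeffort    = c(1, 2, 4),
  ahopeless  = c(0, 3, 7),   # 7 stands in for an "unknown" code
  aworthless = c(0, 2, 4),
  anervous   = c(1, 2, 4),
  arestless  = c(0, 2, 4)
)

items <- c("asad", "aeffort", "ahopeless", "aworthless", "anervous", "arestless")

# Set out-of-range codes to NA, mirroring ifelse(x > 4, NA, x) in the recode
toy[items] <- lapply(toy[items], function(x) ifelse(x > 4, NA, x))

# Sum the six items; rowSums() keeps NA for respondents with a missing item
toy$k6 <- rowSums(toy[items])

# Flag a total of 13 or more as serious mental illness (1), otherwise 0
toy$Seriousmentalillness <- ifelse(toy$k6 >= 13, 1, 0)

print(toy[c("k6", "Seriousmentalillness")])
```

A respondent with any missing item ends up with a missing outcome, which is why rows with missing values are filtered out before the models are fitted.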
NHIS <- NHIS%>%filter(!is.na(Seriousmentalillness), !is.na(sex), !is.na(BMI), !is.na(race), !is.na(Healthstatus), !is.na(sexorient))
logit.seriousmentalillness <- glm(Seriousmentalillness ~ sex, family = "binomial", data = NHIS)
coef(logit.seriousmentalillness)
(Intercept) sexMale
-3.0779025 -0.3917858
M0<- glm(Seriousmentalillness ~ sex, family = binomial, data = NHIS)
summary(M0)
Call:
glm(formula = Seriousmentalillness ~ sex, family = binomial,
data = NHIS)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.3001 -0.3001 -0.3001 -0.2476 2.6459
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.07790 0.01866 -164.97 <2e-16 ***
sexMale -0.39179 0.03069 -12.77 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 40384 on 125759 degrees of freedom
Residual deviance: 40217 on 125758 degrees of freedom
AIC: 40221
Number of Fisher Scoring iterations: 6
This is a simple model with one binary dependent variable and one binary independent variable. The results above show that men are less likely than women to have serious mental distress. The intercept of -3.08 is the log odds of serious mental distress for females, the reference group, and being male lowers the log odds by 0.39. Now let's add race to the model and see how serious mental illness varies across races.
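Log odds are hard to read directly, so it can help to convert the two coefficients into an odds ratio and predicted probabilities. The sketch below simply retypes the values printed by `coef()` above; the variable names are my own.

```r
# Coefficients copied from coef(logit.seriousmentalillness) above
b0     <- -3.0779025   # intercept: log odds for females (reference group)
b_male <- -0.3917858   # change in log odds for males

# Odds ratio for males relative to females
or_male <- exp(b_male)

# Predicted probabilities via the inverse logit, plogis(x) = 1 / (1 + exp(-x))
p_female <- plogis(b0)
p_male   <- plogis(b0 + b_male)

round(c(odds_ratio_male = or_male, p_female = p_female, p_male = p_male), 3)
```

Under this model the odds for men are roughly two thirds of the odds for women, which corresponds to a predicted probability of about 4.4% for women and about 3.0% for men.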
M1<- glm(Seriousmentalillness ~ sex + race , family = binomial, data = NHIS)
summary(M1)
Call:
glm(formula = Seriousmentalillness ~ sex + race, family = binomial,
data = NHIS)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.4068 -0.3008 -0.2505 -0.2484 2.8369
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.07310 0.02035 -150.984 < 2e-16 ***
sexMale -0.39023 0.03073 -12.698 < 2e-16 ***
raceBlack/African American 0.01710 0.04268 0.401 0.689
raceAsian -0.54253 0.08075 -6.719 1.83e-11 ***
raceMultiple Race 0.62291 0.08017 7.770 7.86e-15 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 40384 on 125759 degrees of freedom
Residual deviance: 40107 on 125755 degrees of freedom
AIC: 40117
Number of Fisher Scoring iterations: 6
In this model, white is the reference category for race. The log odds of Asians having serious mental illness are 0.54 lower than those of whites. African Americans have log odds 0.02 higher than whites, a difference that is not statistically significant (p = 0.689), whereas those of multiple races have the highest log odds, 0.62 higher than whites. This is interesting, as we see that serious mental illness varies across races and is highest for those of multiple races. The intercept of -3.07 is the log odds of serious mental illness for a white female. Next, I will add sexual orientation and health status to see how they relate to having serious mental illness.
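To put the race coefficients on a more familiar scale, the estimates and standard errors from the `summary(M1)` output can be turned into odds ratios with approximate 95% Wald intervals. This is a sketch that retypes the printed numbers rather than refitting the model; the shortened coefficient names are my own.

```r
# Estimates and standard errors copied from summary(M1) above
est <- c(sexMale = -0.39023, raceBlack = 0.01710,
         raceAsian = -0.54253, raceMultiple = 0.62291)
se  <- c(0.03073, 0.04268, 0.08075, 0.08017)

z <- qnorm(0.975)  # about 1.96 for a 95% interval

# Exponentiate the estimate and the interval limits to get odds ratios
or_table <- data.frame(
  odds_ratio = exp(est),
  ci_lower   = exp(est - z * se),
  ci_upper   = exp(est + z * se),
  row.names  = names(est)
)

round(or_table, 2)
```

An interval that contains 1, as it does for the Black/African American coefficient, matches the non-significant p-value in the model output.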
M2<- glm(Seriousmentalillness ~ sex + race + sexorient + Healthstatus , family = binomial, data = NHIS)
summary(M2)
Call:
glm(formula = Seriousmentalillness ~ sex + race + sexorient +
Healthstatus, family = binomial, data = NHIS)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.4122 -0.2635 -0.1832 -0.1519 3.2205
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.84191 0.03903 -21.572 < 2e-16 ***
sexMale -0.37771 0.03195 -11.823 < 2e-16 ***
raceBlack/African American -0.27049 0.04449 -6.080 1.20e-09 ***
raceAsian -0.36999 0.08271 -4.473 7.71e-06 ***
raceMultiple Race 0.45140 0.08552 5.278 1.30e-07 ***
sexorientBisexual 1.37880 0.10090 13.664 < 2e-16 ***
sexorientGay/Lesbian 0.61112 0.09926 6.157 7.43e-10 ***
HealthstatusVery Good -3.23685 0.05504 -58.813 < 2e-16 ***
HealthstatusGood -2.23065 0.04567 -48.843 < 2e-16 ***
HealthstatusFair -1.14207 0.04524 -25.245 < 2e-16 ***
HealthstatusExcellent -3.59070 0.06588 -54.506 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 40384 on 125759 degrees of freedom
Residual deviance: 34407 on 125749 degrees of freedom
AIC: 34429
Number of Fisher Scoring iterations: 7
Model 3 (M2 in the code) shows how sexual orientation and health status affect one's log odds of having serious mental illness. Those who are bisexual have log odds 1.38 higher than those who are straight, whereas gay/lesbian respondents have log odds 0.61 higher. Interestingly, bisexual respondents have the highest likelihood of having serious mental illness. For health status, the reference category is poor health, so every other category is compared with it. As expected, those who rate their health as poor have the highest log odds of serious mental illness: compared with them, the log odds are 1.14 lower for fair health, 2.23 lower for good health, 3.24 lower for very good health and 3.59 lower for excellent health. The intercept of -0.84 is the log odds of serious mental illness for a straight white female with poor health. As mentioned above, it is important to keep in mind that health status is self-rated, so some respondents may have rated their health as poor because they have some sort of mental illness. Next I look at BMI, which I expect to be strongly associated with serious mental illness, since the obese population tends to have anxiety, depression, etc.
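Because the first entry of `levels` decides the reference category, the health status coefficients are all comparisons with poor health. The sketch below shows how to check the reference level, how `relevel()` could switch it, and how the printed coefficients can be re-expressed relative to excellent health without refitting. The vector `b` retypes the M2 estimates; the other object names are my own.

```r
# A small factor using the same level order as the recode above
health <- factor(
  c("Poor", "Fair", "Good", "Very Good", "Excellent"),
  levels = c("Poor", "Very Good", "Good", "Fair", "Excellent")
)

levels(health)[1]                      # reference category used by glm()

# relevel() moves the chosen level to the front, making it the reference
health_excellent_ref <- relevel(health, ref = "Excellent")
levels(health_excellent_ref)[1]

# Health status estimates from summary(M2); the reference (Poor) is 0 by construction
b <- c(Poor = 0, "Very Good" = -3.23685, Good = -2.23065,
       Fair = -1.14207, Excellent = -3.59070)

# Same contrasts expressed relative to excellent health
b_vs_excellent <- b - b[["Excellent"]]
round(b_vs_excellent, 2)
```

Changing the reference level only changes which comparisons are printed; the fitted probabilities of the model stay the same.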
M3<- glm(Seriousmentalillness ~ sex + sexorient + BMI , family = binomial, data = NHIS)
summary(M3)
Call:
glm(formula = Seriousmentalillness ~ sex + sexorient + BMI, family = binomial,
data = NHIS)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.8487 -0.2730 -0.2622 -0.2274 2.7314
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.35324 0.02991 -112.118 < 2e-16 ***
sexMale -0.35275 0.03118 -11.312 < 2e-16 ***
sexorientBisexual 1.25533 0.09398 13.357 < 2e-16 ***
sexorientGay/Lesbian 0.48733 0.09456 5.154 2.55e-07 ***
BMIObese30s 0.43520 0.03896 11.171 < 2e-16 ***
BMIObese40s 0.91945 0.05960 15.428 < 2e-16 ***
BMIObese50s 1.26198 0.12190 10.352 < 2e-16 ***
BMIOverweight 0.06340 0.03976 1.595 0.111
BMIUnderweight 0.81124 0.08740 9.282 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 40384 on 125759 degrees of freedom
Residual deviance: 39644 on 125751 degrees of freedom
AIC: 39662
Number of Fisher Scoring iterations: 6
This model includes sexual orientation and BMI to see their association with serious mental illness. As assumed, those in the obese categories have higher log odds of serious mental illness than those with a normal BMI. BMI levels of 30-39, in category Obese30s, have log odds 0.44 higher than normal BMI. As the obesity level goes up, the log odds of serious mental illness go up as well: those with a BMI of 40-49 have log odds 0.92 higher and those with a BMI of 50 or above have log odds 1.26 higher. Underweight respondents also have higher log odds (0.81), while the overweight category does not differ significantly from normal BMI (p = 0.111). The intercept of -3.35 is the log odds of serious mental illness for a straight female with a normal BMI.
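The BMI pattern is easier to see as predicted probabilities. As a rough sketch, the code below adds each BMI coefficient from `summary(M3)` to the intercept, which corresponds to a straight female, and applies the inverse logit. The numbers are retyped from the output and the object names are my own.

```r
# Intercept and BMI estimates copied from summary(M3); Normal is the reference (0)
intercept <- -3.35324
bmi_coef <- c(Normal = 0, Overweight = 0.06340, Obese30s = 0.43520,
              Obese40s = 0.91945, Obese50s = 1.26198, Underweight = 0.81124)

# Predicted probability of serious mental illness for a straight female
p_hat <- plogis(intercept + bmi_coef)

# Express as percentages
round(100 * p_hat, 1)
```

For this reference profile the predicted probability rises from roughly 3.4% at a normal BMI to roughly 11% in the Obese50s category.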
M4<- glm(Seriousmentalillness ~ sex*BMI + sexorient + race + Healthstatus , family = binomial, data = NHIS)
summary(M4)
Call:
glm(formula = Seriousmentalillness ~ sex * BMI + sexorient +
race + Healthstatus, family = binomial, data = NHIS)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.5115 -0.2674 -0.1776 -0.1464 3.2468
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.94416 0.05092 -18.540 < 2e-16 ***
sexMale -0.16522 0.06013 -2.748 0.00600 **
BMIObese30s 0.10228 0.05166 1.980 0.04772 *
BMIObese40s 0.34332 0.07464 4.600 4.23e-06 ***
BMIObese50s 0.19268 0.15520 1.241 0.21443
BMIOverweight 0.01505 0.05325 0.283 0.77742
BMIUnderweight 0.54276 0.10834 5.010 5.45e-07 ***
sexorientBisexual 1.35885 0.10107 13.445 < 2e-16 ***
sexorientGay/Lesbian 0.59790 0.09937 6.017 1.78e-09 ***
raceBlack/African American -0.28513 0.04473 -6.374 1.84e-10 ***
raceAsian -0.38402 0.08335 -4.607 4.08e-06 ***
raceMultiple Race 0.44336 0.08561 5.179 2.24e-07 ***
HealthstatusVery Good -3.19783 0.05555 -57.567 < 2e-16 ***
HealthstatusGood -2.20350 0.04593 -47.980 < 2e-16 ***
HealthstatusFair -1.12744 0.04538 -24.845 < 2e-16 ***
HealthstatusExcellent -3.54922 0.06680 -53.131 < 2e-16 ***
sexMale:BMIObese30s -0.27045 0.08375 -3.229 0.00124 **
sexMale:BMIObese40s -0.45991 0.14072 -3.268 0.00108 **
sexMale:BMIObese50s -0.08743 0.27736 -0.315 0.75258
sexMale:BMIOverweight -0.23822 0.08345 -2.855 0.00431 **
sexMale:BMIUnderweight -0.17706 0.21404 -0.827 0.40811
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 40384 on 125759 degrees of freedom
Residual deviance: 34344 on 125739 degrees of freedom
AIC: 34386
Number of Fisher Scoring iterations: 7
When we add an interaction between sex and BMI, we see that the gap between men and women varies across BMI categories. The sexMale coefficient of -0.17 is now the difference between men and women with a normal BMI, not an overall difference. The negative interaction terms mean that the gap widens in the other BMI categories. For example, for those with a BMI of 40-49 the interaction term is -0.46, so men have log odds about 0.63 lower than women in that category (-0.17 - 0.46). The interaction terms for Obese50s and Underweight are not statistically significant, so the gender gap in those categories cannot be distinguished from the gap at a normal BMI, even though the underweight main effect itself is significant. The intercept of -0.94 is the log odds of serious mental illness for the reference group, a straight white female with a normal BMI and poor health.
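With an interaction in the model, the effect of one variable depends on the level of the other, so coefficients have to be added together before they are interpreted. The sketch below combines the estimates from `summary(M4)` to get the male versus female difference within each BMI category and the BMI effect separately for women and men. Values are retyped from the output; the object names are my own.

```r
# Estimates copied from summary(M4); Normal BMI is the reference (0)
b_male <- -0.16522
bmi_main <- c(Normal = 0, Overweight = 0.01505, Obese30s = 0.10228,
              Obese40s = 0.34332, Obese50s = 0.19268, Underweight = 0.54276)
inter <- c(Normal = 0, Overweight = -0.23822, Obese30s = -0.27045,
           Obese40s = -0.45991, Obese50s = -0.08743, Underweight = -0.17706)

# Male minus female log odds within each BMI category
gap_male_vs_female <- b_male + inter

# BMI effect relative to Normal, separately by sex
bmi_effect_women <- bmi_main
bmi_effect_men   <- bmi_main + inter

round(data.frame(gap_male_vs_female, bmi_effect_women, bmi_effect_men), 2)
```

These are point estimates only. Judging whether a combined effect differs from zero would also need the covariance between the coefficients, for example from `vcov(M4)`.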
visreg( M2, "sexorient", scale = "response")
visreg( M3, "BMI", scale = "response")
visreg(M3, "sex", by = "BMI", scale = "response")
anova(M0, M1, M2, M3, M4, test= "Chisq")
Analysis of Deviance Table
Model 1: Seriousmentalillness ~ sex
Model 2: Seriousmentalillness ~ sex + race
Model 3: Seriousmentalillness ~ sex + race + sexorient + Healthstatus
Model 4: Seriousmentalillness ~ sex + sexorient + BMI
Model 5: Seriousmentalillness ~ sex * BMI + sexorient + race + Healthstatus
Resid. Df Resid. Dev Df Deviance Pr(>Chi)
1 125758 40217
2 125755 40107 3 109.5 < 2.2e-16 ***
3 125749 34407 6 5700.7 < 2.2e-16 ***
4 125751 39644 -2 -5237.8 < 2.2e-16 ***
5 125739 34344 12 5300.8 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
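Examining the output of anova, we can see that every comparison is statistically significant. The lowest residual deviance belongs to Model 5 (34344), and the largest single drop in deviance comes from adding sexual orientation and health status in Model 3. Note that Model 4 drops race and health status, so it is not nested in Model 3, and row 4 of the table (with its negative Df) should not be read as a likelihood ratio test. Let's look at the AIC/BIC values to further see which model fits the data best.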
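A likelihood ratio test compares two nested models by referring the drop in deviance to a chi-square distribution with degrees of freedom equal to the number of added parameters. As a sketch, the test can be reproduced by hand for one properly nested pair, M2 and M4, using the deviances reported in the comparison table below and the residual degrees of freedom from the model summaries. The object names are my own.

```r
# Deviance and residual degrees of freedom for M2 and M4
dev_M2 <- 34406.60; df_M2 <- 125749
dev_M4 <- 34343.60; df_M4 <- 125739

# M4 adds BMI (5 terms) and the sex:BMI interaction (5 terms) to M2
lr_stat <- dev_M2 - dev_M4      # likelihood ratio statistic
lr_df   <- df_M2 - df_M4        # number of added parameters

# Upper-tail chi-square probability
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

c(statistic = lr_stat, df = lr_df, p_value = p_value)
```

With the fitted objects available, `anova(M2, M4, test = "Chisq")` or `lrtest(M2, M4)` from lmtest should give the same comparison.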
htmlreg(list(M0,M1,M2,M3,M4))
| | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 |
|---|---|---|---|---|---|
| (Intercept) | -3.08*** | -3.07*** | -0.84*** | -3.35*** | -0.94*** |
| | (0.02) | (0.02) | (0.04) | (0.03) | (0.05) |
| sexMale | -0.39*** | -0.39*** | -0.38*** | -0.35*** | -0.17** |
| | (0.03) | (0.03) | (0.03) | (0.03) | (0.06) |
| raceBlack/African American | | 0.02 | -0.27*** | | -0.29*** |
| | | (0.04) | (0.04) | | (0.04) |
| raceAsian | | -0.54*** | -0.37*** | | -0.38*** |
| | | (0.08) | (0.08) | | (0.08) |
| raceMultiple Race | | 0.62*** | 0.45*** | | 0.44*** |
| | | (0.08) | (0.09) | | (0.09) |
| sexorientBisexual | | | 1.38*** | 1.26*** | 1.36*** |
| | | | (0.10) | (0.09) | (0.10) |
| sexorientGay/Lesbian | | | 0.61*** | 0.49*** | 0.60*** |
| | | | (0.10) | (0.09) | (0.10) |
| HealthstatusVery Good | | | -3.24*** | | -3.20*** |
| | | | (0.06) | | (0.06) |
| HealthstatusGood | | | -2.23*** | | -2.20*** |
| | | | (0.05) | | (0.05) |
| HealthstatusFair | | | -1.14*** | | -1.13*** |
| | | | (0.05) | | (0.05) |
| HealthstatusExcellent | | | -3.59*** | | -3.55*** |
| | | | (0.07) | | (0.07) |
| BMIObese30s | | | | 0.44*** | 0.10* |
| | | | | (0.04) | (0.05) |
| BMIObese40s | | | | 0.92*** | 0.34*** |
| | | | | (0.06) | (0.07) |
| BMIObese50s | | | | 1.26*** | 0.19 |
| | | | | (0.12) | (0.16) |
| BMIOverweight | | | | 0.06 | 0.02 |
| | | | | (0.04) | (0.05) |
| BMIUnderweight | | | | 0.81*** | 0.54*** |
| | | | | (0.09) | (0.11) |
| sexMale:BMIObese30s | | | | | -0.27** |
| | | | | | (0.08) |
| sexMale:BMIObese40s | | | | | -0.46** |
| | | | | | (0.14) |
| sexMale:BMIObese50s | | | | | -0.09 |
| | | | | | (0.28) |
| sexMale:BMIOverweight | | | | | -0.24** |
| | | | | | (0.08) |
| sexMale:BMIUnderweight | | | | | -0.18 |
| | | | | | (0.21) |
| AIC | 40220.76 | 40117.28 | 34428.60 | 39662.42 | 34385.60 |
| BIC | 40240.24 | 40165.99 | 34535.76 | 39750.10 | 34590.18 |
| Log Likelihood | -20108.38 | -20053.64 | -17203.30 | -19822.21 | -17171.80 |
| Deviance | 40216.76 | 40107.28 | 34406.60 | 39644.42 | 34343.60 |
| Num. obs. | 125760 | 125760 | 125760 | 125760 | 125760 |
| ***p < 0.001, **p < 0.01, *p < 0.05 | | | | | |
Overall, we see throughout our analysis that women have higher log odds of serious mental illness than men. Additionally, serious mental illness varies across races, bisexual respondents have higher log odds than the gay/lesbian and straight populations, and people in the obese BMI categories have higher log odds than those with a normal BMI. BMI and sexual orientation have a lot to do with serious mental illness, and the association with BMI differs between males and females. In conclusion, models with lower AIC and BIC values are the better fit. Model 5 in the table above (M4 in the code), which includes all the independent variables and an interaction between BMI and sex, has the lowest AIC (34385.60) and the lowest deviance, so I consider it the best fit. BIC, which penalizes extra parameters more heavily, is slightly lower for Model 3 (34535.76 versus 34590.18), so the gain from adding BMI and the interaction is modest.
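For reference, both criteria can be recomputed from the log likelihoods in the table using AIC = -2 logLik + 2k and BIC = -2 logLik + k log(n), where k is the number of estimated coefficients and n the number of observations. The sketch below retypes the reported log likelihoods and counts k from the coefficient rows of each model; the data frame `fit` is my own construction.

```r
n <- 125760  # number of observations reported in the table

# Log likelihoods from the table; k counts the coefficients in each model
fit <- data.frame(
  model  = c("M0", "M1", "M2", "M3", "M4"),
  logLik = c(-20108.38, -20053.64, -17203.30, -19822.21, -17171.80),
  k      = c(2, 5, 11, 9, 21)
)

# Both criteria penalize -2 * logLik; BIC uses log(n) per parameter instead of 2
fit$AIC <- -2 * fit$logLik + 2 * fit$k
fit$BIC <- -2 * fit$logLik + log(n) * fit$k

print(fit)

# Model preferred by each criterion
fit$model[which.min(fit$AIC)]
fit$model[which.min(fit$BIC)]
```

With n this large, log(n) is close to 11.7, so each additional parameter costs far more under BIC than under AIC, and the two criteria need not pick the same model.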