As medicine has become an important influence on the health of Americans, researchers can examine socioeconomic factors to see who is healthiest in America. The data come from the National Health Interview Survey (NHIS), obtained through ipums.org. I investigate the relationship between poverty and health using selected demographic and socioeconomic variables.
The variables were taken from ipums.org, with a few adjustments.
For this analysis, and for future research on how independent variables influence health, I have chosen a few socioeconomic and demographic variables.
AGE- the respondent's age.
SEX- the respondent's sex, with two categories: Female or Male.
SEXORIEN- the respondent's sexual orientation, recoded into a factor.
Race- the respondent's race. The original variable (RACEA) included many race categories; it has been recoded into three categories: White, Black, and Other.
library(dplyr)

# nhis is the IPUMS NHIS extract, assumed to be loaded already as a data frame
nhis1 <- nhis %>%
  mutate(
    # Collapse the detailed RACEA codes into three groups.
    # %in% tests membership in the range; == would compare element by element.
    Race = ifelse(RACEA == 100, "White",
           ifelse(RACEA == 200, "Black",
           ifelse(RACEA %in% 300:990, "Other", NA))),
    # Self-rated health as an ordered factor running from worst to best
    HEALTH = factor(ifelse(HEALTH == 1, "Excellent",
                    ifelse(HEALTH == 2, "Very Good",
                    ifelse(HEALTH == 3, "Good",
                    ifelse(HEALTH == 4, "Fair",
                    ifelse(HEALTH == 5, "Poor", NA))))),
                    ordered = TRUE,
                    levels = c("Poor", "Fair", "Good", "Very Good", "Excellent")),
    # Poverty status relative to the poverty threshold
    POORYN = factor(ifelse(POORYN == 1, "At or above poverty threshold",
                    ifelse(POORYN == 2, "Below poverty threshold", NA)),
                    ordered = TRUE,
                    levels = c("At or above poverty threshold", "Below poverty threshold")),
    SEXORIEN = factor(ifelse(SEXORIEN == 1, "Lesbian or gay",
                      ifelse(SEXORIEN == 2, "Straight",
                      ifelse(SEXORIEN == 3, "Bisexual", NA))),
                      ordered = TRUE,
                      levels = c("Lesbian or gay", "Straight", "Bisexual")),
    # Codes 10 to 13 are all treated as "Married"
    MARSTAT = factor(ifelse(MARSTAT %in% 10:13, "Married",
                     ifelse(MARSTAT == 20, "Widowed",
                     ifelse(MARSTAT == 30, "Divorced",
                     ifelse(MARSTAT == 40, "Separated",
                     ifelse(MARSTAT == 50, "Never married", NA))))),
                     ordered = TRUE,
                     levels = c("Married", "Widowed", "Divorced", "Separated", "Never married"))
  ) %>%
  select(AGE, HEALTH, SEX, POORYN, MARSTAT, Race, EDUC, SEXORIEN)
# Keep only respondents with complete data on every analysis variable
nhis2 <- nhis1 %>%
  filter(!is.na(HEALTH), !is.na(EDUC), !is.na(SEXORIEN), !is.na(AGE),
         !is.na(MARSTAT), !is.na(SEX), !is.na(Race), !is.na(POORYN))
head(nhis2)
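A note on recoding ranges of codes: in R, `==` compares two vectors position by position, while `%in%` asks whether each value appears anywhere in a set. The sketch below shows the difference on a few made-up codes (the `codes` vector is hypothetical and not drawn from the survey extract).

```r
# Position-by-position comparison versus membership test
x <- c(300, 310, 580)
x == c(310, 580, 300)     # FALSE FALSE FALSE: each position is compared to its partner
x %in% c(310, 580, 300)   # TRUE TRUE TRUE: each value is looked up in the set

# Toy race codes (hypothetical), recoded into three groups
codes <- c(100, 200, 300, 310, 580, 990)
race <- ifelse(codes == 100, "White",
        ifelse(codes == 200, "Black",
        ifelse(codes %in% 300:990, "Other", NA)))
race
```

A range test such as `codes >= 300 & codes <= 990` would be an equivalent way to write the last condition.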
The statistical analysis we use to examine the relationship between poverty status and health is the ordered logit model ("ologit") from the Zelig package. An ordered logit model estimates how the independent variables shift the probability of falling into each category of an ordered dependent variable. We use this model because our dependent variable, HEALTH, is an ordered factor.
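For reference, the ordered logit model can be written in its proportional-odds form. The notation here is my own: I assume the parameterization used by `polr` in the MASS package (which I believe Zelig's "ologit" wraps), with cutpoints \zeta_j between adjacent health categories, covariates x_i (POORYN, AGE, SEX), and coefficients \beta.

```latex
\operatorname{logit} P(Y_i \le j) = \zeta_j - x_i^{\top}\beta, \qquad j = 1, \dots, 4

P(Y_i = j) = \Lambda\!\left(\zeta_j - x_i^{\top}\beta\right) - \Lambda\!\left(\zeta_{j-1} - x_i^{\top}\beta\right),
\qquad \Lambda(t) = \frac{1}{1 + e^{-t}}, \quad \zeta_0 = -\infty, \ \zeta_5 = +\infty
```

Under this sign convention a negative coefficient shifts probability toward the lower categories (Poor, Fair), and the same \beta applies at every cutpoint.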
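The model below reports the effects of POORYN, AGE, and SEX on the ordinal dependent variable HEALTH. The coefficients are on the log-odds scale rather than odds ratios. Because POORYN is stored as an ordered factor, its effect appears as the linear contrast POORYN.L: a one-unit increase in that contrast, toward being below the poverty threshold, changes the log-odds of reporting better health by -0.7176. Each additional year of age changes the log-odds by about -0.03. The coefficient on SEX is small relative to its standard error (t = 0.37), so it is not statistically distinguishable from zero.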
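To read the coefficients as odds ratios, they can be exponentiated. The sketch below copies the estimates and standard errors by hand from the model summary and adds approximate 95% Wald intervals; the 1.96 multiplier assumes a normal approximation, which is my assumption rather than something the summary reports.

```r
# Estimates and standard errors copied from the ologit summary
coefs <- c(POORYN.L = -0.717610, AGE = -0.029780, SEX = 0.005583)
ses   <- c(POORYN.L = 0.0138365, AGE = 0.0004063, SEX = 0.0152721)

# Exponentiate log-odds coefficients to get odds ratios
odds_ratios <- exp(coefs)

# Approximate 95% Wald intervals, computed on the log-odds scale and then exponentiated
lower <- exp(coefs - 1.96 * ses)
upper <- exp(coefs + 1.96 * ses)

round(cbind(OR = odds_ratios, lower = lower, upper = upper), 3)
```

An odds ratio below 1 means lower odds of being in a better health category; an interval that contains 1, as for SEX, indicates no clear effect.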
library(Zelig)

# Ordered logit of self-rated health on poverty status, age, and sex
z.ord <- zelig(HEALTH ~ POORYN + AGE + SEX, model = "ologit",
               data = nhis2, cite = FALSE)
summary(z.ord)
## Model:
## Call:
## z5$zelig(formula = HEALTH ~ POORYN + AGE + SEX, data = nhis2)
##
## Coefficients:
##              Value Std. Error  t value
## POORYN.L -0.717610  0.0138365 -51.8634
## AGE      -0.029780  0.0004063 -73.3006
## SEX       0.005583  0.0152721   0.3656
##
## Intercepts:
##                       Value     Std. Error t value
## Poor|Fair             -4.5864    0.0381    -120.2310
## Fair|Good             -2.9329    0.0329     -89.0230
## Good|Very Good        -1.4077    0.0312     -45.1310
## Very Good|Excellent    0.0760    0.0306       2.4822
##
## Residual Deviance: 161835.84
## AIC: 161849.84
## Next step: Use 'setx' method
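As a rough illustration of how the intercepts (cutpoints) turn into category probabilities, the sketch below applies the logistic function to each cutpoint minus a linear predictor and then takes successive differences. The cutpoints are copied from the summary above; the value of `eta` is a hypothetical linear predictor chosen only for illustration, and the sign convention assumes the `polr`-style parameterization.

```r
# Cutpoints copied from the "Intercepts" section of the summary
zeta <- c("Poor|Fair" = -4.5864, "Fair|Good" = -2.9329,
          "Good|Very Good" = -1.4077, "Very Good|Excellent" = 0.0760)

# Hypothetical linear predictor x'beta for one respondent (illustrative value only)
eta <- -2.17

# Cumulative probabilities P(HEALTH <= category) at each cutpoint
cum_p <- plogis(zeta - eta)

# Category probabilities are differences between adjacent cumulative probabilities
probs <- diff(c(0, cum_p, 1))
names(probs) <- c("Poor", "Fair", "Good", "Very Good", "Excellent")

round(probs, 3)
```

The five probabilities always sum to 1, and lowering `eta` moves probability toward Poor and Fair.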
To evaluate how the poverty threshold influences health, I simulated expected probabilities for two scenarios, summarized below:
sim x- the expected probability of each health category for a respondent below the poverty threshold, reported as the mean over the simulations. Below the threshold, the probability of Poor health is about 16% and the probability of Excellent health is only about 5%. Most of the probability sits at the lower end of the health scale.
sim x1- the expected probability of each health category for a respondent at or above the poverty threshold. Here the probability of Poor health is about 8%, whereas the probability of Excellent health is about 10%.
In both scenarios, most of the probability falls in the middle of the ordered health categories. However, below the poverty threshold the most likely category is Fair (34%), whereas at or above the threshold the most likely category is Good (36%).
The first differences (fd) between sim x and sim x1 make the pattern easier to see. Moving from below to at or above the poverty threshold lowers the probability of Poor health by about 7 percentage points and of Fair health by about 10 points, and the 95% simulation intervals for both differences exclude zero.
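# Scenario x: below the poverty threshold (POORYN code 2)
x.low <- setx(z.ord, POORYN = 2)
# Scenario x1: at or above the poverty threshold (POORYN code 1)
x.high <- setx(z.ord, POORYN = 1)
# Simulate expected probabilities for both scenarios and their first differences
s.ord <- sim(z.ord, x = x.low, x1 = x.high)
summary(s.ord)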
##
## sim x :
## -----
## ev
## mean sd 50% 2.5% 97.5%
## Poor 0.15510825 0.004873649 0.15519073 0.1457933 0.16494593
## Fair 0.33550085 0.015466943 0.33538303 0.3068951 0.36678366
## Good 0.32501307 0.004480583 0.32488135 0.3159497 0.33407120
## Very Good 0.13543946 0.010143712 0.13543063 0.1152706 0.15490633
## Excellent 0.04893837 0.007358501 0.04872303 0.0360599 0.06465709
## pv
## mean sd 50% 2.5% 97.5%
## [1,] 2.585 1.036267 3 1 5
##
## sim x1 :
## -----
## ev
## mean sd 50% 2.5% 97.5%
## Poor 0.08221563 0.002075962 0.08220599 0.07825534 0.08640195
## Fair 0.23769803 0.013573320 0.23717548 0.21242333 0.26606749
## Good 0.36394094 0.010258610 0.36433191 0.34215437 0.38229841
## Very Good 0.22090996 0.011024622 0.22139155 0.19828272 0.24081272
## Excellent 0.09523543 0.013238882 0.09472037 0.07112569 0.12455499
## pv
## mean sd 50% 2.5% 97.5%
## [1,] 3 1.083131 3 1 5
## fd
## mean sd 50% 2.5% 97.5%
## Poor -0.07289262 0.002946661 -0.07290841 -0.07863442 -0.06715169
## Fair -0.09780282 0.002261255 -0.09784356 -0.10191244 -0.09335049
## Good 0.03892787 0.009910997 0.03865725 0.02022460 0.05858020
## Very Good 0.08547050 0.001814674 0.08551428 0.08169674 0.08882290
## Excellent 0.04629707 0.005938204 0.04606813 0.03569137 0.05879240
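The first differences (fd) can be checked by hand: each one is the sim x1 mean minus the sim x mean for the same health category. The sketch below copies the simulated means from the output above and reproduces the fd column up to rounding.

```r
# Mean expected probabilities copied from the sim x and sim x1 output
ev_x  <- c(Poor = 0.15510825, Fair = 0.33550085, Good = 0.32501307,
           "Very Good" = 0.13543946, Excellent = 0.04893837)
ev_x1 <- c(Poor = 0.08221563, Fair = 0.23769803, Good = 0.36394094,
           "Very Good" = 0.22090996, Excellent = 0.09523543)

# First difference: at or above the threshold minus below the threshold
fd <- ev_x1 - ev_x
round(fd, 4)
```

Because each set of probabilities sums to 1, the first differences sum to approximately zero: the probability lost from Poor and Fair is gained by Good, Very Good, and Excellent.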
This short analysis shows that persons below the poverty threshold are the least likely to report excellent health. In both groups, however, most of the probability falls in the middle categories: below the threshold it peaks at Fair, and at or above the threshold it peaks at Good. Although the differences show that poverty status does affect health, other interacting socioeconomic factors likely matter as well and could be evaluated in future work.