For this assignment, I chose to work with data provided by the NHIS (National Health Interview Survey) yet again because it was rich in survey questions that tackled subjects such as: disabilities, sexually trasmitted diseases, immigration, education and public service etc. For this week, I decided to focus on the medical attention people deprive themselves of whether it’s due to fear of losing one’s job, lack of health insurance or medical attention. I chose to highlight the lack of medical attention due to inadequate funds. My research question is as follows:
Does a relationship exist between a respondent’s sex & age as far as lack of medical attention?
The effect that I will be focusing on is, medical care, I chose to work with the file titled Persons for the 2017 calendar year. For your reference, you can download the data set and codebook here
VARIABLES USED
Dependent variable is medical care coded as Med_Care
2 for No- a time when a respondent needed medical care, but did not get it because they couldn’t afford it? (during the past 12 months)
Independent variables are: sex, race & age coded as the same with the exception of an uppercase letter at the beginning of the variable name
Variable Race was coded as 1= White; 2= Black & 3= Asian
Variable Age was a continuous variable and thus left as is
HYPOTHESES
#install.packages("tidyverse", dependencies=TRUE)
#install.packages("readr")
#install.packages("dplyr")
#install.packages("tidyr")
#install.packages("magrittr")
#install.packages("Zelig")
library(tidyr)
library(readr)
library(dplyr)
library(tidyr)
#read_csv("/Users/safiesaf/Downloads/personsx.csv")
TestCare<-read_csv("/Users/safiesaf/Downloads/personsx.csv")
RefCare<-TestCare%>%
rename("Race"=RACRECI3,
"Medic_Care"=PNMED12M,
"Age"=AGE_P,
"Sex"=SEX)%>%
select(Race,
Age,
Medic_Care,
Sex)%>%
mutate(Race=factor(Race),
Sex=factor(Sex),
Medic_Care=ifelse(Medic_Care==1,1,
ifelse(Medic_Care==2,0,NA)), # 1= Yes, 2= No
Age=ifelse(Age==997,NA,Age))
head(RefCare)
## # A tibble: 6 x 4
## Race Age Medic_Care Sex
## <fct> <dbl> <dbl> <fct>
## 1 1 65 0 2
## 2 2 27 0 2
## 3 2 10 0 1
## 4 1 19 0 1
## 5 1 43 0 2
## 6 1 45 0 1
#unique(test$NOTCOV)
#head(test$PNMED12M)
INTERACTION/ESTIMATING THE MODEL
library(Zelig)
# INTERACTION/ESTIMATING THE MODEL:
z4l<- zelig(Medic_Care ~ Age + Sex*Race, model = "logit", data = RefCare, cite = F)
summary(z4l)
## Model:
##
## Call:
## z5$zelig(formula = Medic_Care ~ Age + Sex * Race, data = RefCare)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.4522 -0.3278 -0.2963 -0.2657 2.8964
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.553102 0.044021 -80.714 < 2e-16
## Age 0.008412 0.000747 11.261 < 2e-16
## Sex2 0.213349 0.039569 5.392 6.97e-08
## Race2 0.404150 0.073139 5.526 3.28e-08
## Race3 -0.634758 0.147422 -4.306 1.66e-05
## Race4 0.309048 0.173711 1.779 0.0752
## Sex2:Race2 -0.008486 0.095357 -0.089 0.9291
## Sex2:Race3 -0.014637 0.194251 -0.075 0.9399
## Sex2:Race4 -0.042965 0.232347 -0.185 0.8533
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 28630 on 78071 degrees of freedom
## Residual deviance: 28327 on 78063 degrees of freedom
## (60 observations deleted due to missingness)
## AIC: 28345
##
## Number of Fisher Scoring iterations: 6
##
## Next step: Use 'setx' method
The results above shows significance with age as a respondent increases with 1 age year, they did not get medical care because they couldn’t afford it. The results also showed signifance among gender and race (females and minorities respectively). Females are estimated to not get medical care because they couldn’t afford it at a rate of 0.21 as opposed to males. Blacks are estimated to not get medical care because they could not afford it at a rate of 0.40 as opposed to whites. Asians are estimated to get medical care on the other hand as opposed to whites at a rate of 0.63.
AGE EFFECT
a.range=min(RefCare$Age):max(RefCare$Age)
z4x.out <- setx(z4l, Age= a.range) # setting values of independent variable 'Age'
sim.out <- sim(z4l, x = z4x.out) # simulating my pv & ev
ci.plot(sim.out) # plotting results
#summary(sim.out) # summarizing/presenting results
The above shows a positive linear relationship between age and the probability that a respondent will need medical care but not get it because they couldn’t afford it. Based on the information provided by the graph, between the age range of 40 and 60, I am 95% confident that a responded will need medical care but not get it because they are unable to afford it. This simply means that my graph shows a closer estimation between this age range of respondents.
RACE DIFFERENCE
x <- setx(z4l, Race = 1) # 1= White, 2= Black, 3= Asian
x1 <- setx(z4l, Race = 2)
x2 <- setx(z4l, Race = 3)
s <- sim(z4l, x = x, x1 = x1, x2= x2)
summary(s)
##
## sim x :
## -----
## ev
## mean sd 50% 2.5% 97.5%
## [1,] 0.04705896 0.001241561 0.04704334 0.04472215 0.04955514
## pv
## 0 1
## [1,] 0.953 0.047
##
## sim x1 :
## -----
## ev
## mean sd 50% 2.5% 97.5%
## [1,] 0.06835966 0.003505539 0.06811819 0.0617525 0.07571533
## pv
## 0 1
## [1,] 0.943 0.057
## fd
## mean sd 50% 2.5% 97.5%
## [1,] 0.0213007 0.003655512 0.02111371 0.01474709 0.02907359
The above data shows a difference in race by respondent with respect to their their need for medical care but not getting it due to not being able to afford it. The data shows that on average blacks are estimated to not get medical attention although they need it a rate of 0.02 in relation to asians and whites.
RACE DIFFERENCE PLOT
plot(s)
SEX DIFFERENCE
x <- setx(z4l, Sex = 1) # 1= Male, 2= Female
x1 <- setx(z4l, Race = 2)
s2 <- sim(z4l, x = x, x1 = x1)
summary(s2)
##
## sim x :
## -----
## ev
## mean sd 50% 2.5% 97.5%
## [1,] 0.0383918 0.001103365 0.0384239 0.03603928 0.04053625
## pv
## 0 1
## [1,] 0.961 0.039
##
## sim x1 :
## -----
## ev
## mean sd 50% 2.5% 97.5%
## [1,] 0.06848413 0.003594644 0.06830879 0.06197824 0.07571932
## pv
## 0 1
## [1,] 0.931 0.069
## fd
## mean sd 50% 2.5% 97.5%
## [1,] 0.03009233 0.003745802 0.03018354 0.02287581 0.03773455
The above data shows a difference in sex by respondent with respect to their their need for medical care but not getting it due to not being able to afford it. The data shows that on average females are estimated to not get medical attention although they need it a rate of 0.03 in relation to males.
SEX DIFFERENCE PLOT
plot(s2)
FIRST DIFFERENCE
FirstDiff <- s$get_qi(xvalue="x1", qi="fd") #quantitiy of interest
summary(FirstDiff)
## V1
## Min. :0.01197
## 1st Qu.:0.01880
## Median :0.02111
## Mean :0.02130
## 3rd Qu.:0.02351
## Max. :0.03518
SecDiff <- s2$get_qi(xvalue="x1", qi="fd") #quantitiy of interest
summary(SecDiff)
## V1
## Min. :0.01835
## 1st Qu.:0.02758
## Median :0.03018
## Mean :0.03009
## 3rd Qu.:0.03250
## Max. :0.04089
RACE OF RESPONDENTS SIMULATIONS
# RACE OF FEMALE RESPONDENT
R1 <- setx(z4l, Race = 1, Sex = 2)
R2 <- setx(z4l, Race = 2, Sex = 2)
R3 <- setx(z4l, Race = 3, Sex = 2)
RTot <- sim(z4l, x = R1, x1 = R2, R3)
# RACE OF MALE RESPONDENT
R1x <- setx(z4l, Race = 1, Sex = 1)
R2x <- setx(z4l, Race = 2, Sex = 1)
R3x <- setx(z4l, Race = 3, Sex = 1)
RTot2 <- sim(z4l, x = R1x, x1 = R2x, R3x)
__ SIDE BY SIDE COMPARISON OF QUANTITIES OF INTEREST w/ FIRST DIFFERENCES__
Diff1 <- RTot$get_qi(xvalue="x1", qi="fd")
Diff2 <- RTot2$get_qi(xvalue="x1", qi="fd")
BigDiff <- as.data.frame(cbind(Diff1, Diff2))
head(BigDiff)
## V1 V2
## 1 0.02216218 0.01700240
## 2 0.02156347 0.02537499
## 3 0.02015365 0.01833789
## 4 0.02439898 0.01320822
## 5 0.02317373 0.02172311
## 6 0.02303681 0.01382331
# Vector1 (V1- Race of respondents of the female sex)
# Vector2 (V2- Race of respondents of the male sex)
Grap <- BigDiff %>%
gather(Sex,simv, 1:2)
head(Grap)
## Sex simv
## 1 V1 0.02216218
## 2 V1 0.02156347
## 3 V1 0.02015365
## 4 V1 0.02439898
## 5 V1 0.02317373
## 6 V1 0.02303681
Grap %>%
group_by(Sex) %>%
summarise(mean = mean(simv), sd = sd(simv))
## # A tibble: 2 x 3
## Sex mean sd
## <chr> <dbl> <dbl>
## 1 V1 0.0213 0.00377
## 2 V2 0.0184 0.00389
GG PLOT HISTOGRAMS SHOWING DIFFERENCE BY SEX
library(ggplot2)
ggplot(Grap, aes(simv)) + geom_histogram(fill = "maroon", color = 'green')+
geom_histogram() + facet_grid(~Sex)
CONCLUSION
In conlusion, I saw that my hypothesis was wrong based on the data that was executed. Females are estimated to NOT get medical help although they need it because they cannot afford it in reference to men and blacks are estimated to not seek medical help although they may need it in reference to other races. Essentially, black women are more likely to NOT seek medical help although they need it because they cannot afford it.