Four variables were selected from the data set to conduct this analysis. The dependent variable, “low” is an indicator of birth weight less than 2.5 kg (5 lbs 8 ounces) in which “1” an unhealthy birth weight less than 2.5 kg and “0” represented a healthy birth weight above 2.5 kg. The independent variables used in this study are “smoke”, “race” and “ht”. Smoke measured whether the mother smoked during pregnancy, race was recoded for “1” to represent white and “0” to represent all other variables and “ht” measures whether the mother had a history of hypertension or not.
library(tidyverse)
library(magrittr)
library(dplyr)
library(Zelig)
library(pander)
library(texreg)
library(visreg)
library(lmtest)
library(visreg)
library(sjmisc)
library(MASS)
data("birthwt")
head(birthwt)
summary(birthwt)
low age lwt race smoke ptl
Min. :0.0000 Min. :14.00 Min. : 80.0 Min. :1.000 Min. :0.0000 Min. :0.0000
1st Qu.:0.0000 1st Qu.:19.00 1st Qu.:110.0 1st Qu.:1.000 1st Qu.:0.0000 1st Qu.:0.0000
Median :0.0000 Median :23.00 Median :121.0 Median :1.000 Median :0.0000 Median :0.0000
Mean :0.3122 Mean :23.24 Mean :129.8 Mean :1.847 Mean :0.3915 Mean :0.1958
3rd Qu.:1.0000 3rd Qu.:26.00 3rd Qu.:140.0 3rd Qu.:3.000 3rd Qu.:1.0000 3rd Qu.:0.0000
Max. :1.0000 Max. :45.00 Max. :250.0 Max. :3.000 Max. :1.0000 Max. :3.0000
ht ui ftv bwt
Min. :0.00000 Min. :0.0000 Min. :0.0000 Min. : 709
1st Qu.:0.00000 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:2414
Median :0.00000 Median :0.0000 Median :0.0000 Median :2977
Mean :0.06349 Mean :0.1481 Mean :0.7937 Mean :2945
3rd Qu.:0.00000 3rd Qu.:0.0000 3rd Qu.:1.0000 3rd Qu.:3487
Max. :1.00000 Max. :1.0000 Max. :6.0000 Max. :4990
dim(birthwt)
[1] 189 10
birthwt2 <- birthwt%>%
rename(hypertension = ht)%>%
mutate(race = ifelse(race == 1, "white", "non-white"))
head(birthwt2)
In this first model i looked at whether a child is born with a below normal birthweight is affected by race. Looking at the results from this first model we can see that being white had a negative affect on whether a child is born below normal birth weight (-0.6954) as opposed to other races. Depending on race one may get better care than other races because of income, health, etc. Based on the p value we can see the results are statistically significant.
m0 <- glm(low ~ race, family = binomial, data = birthwt2)
summary(m0)
Call:
glm(formula = low ~ race, family = binomial, data = birthwt2)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.9895 -0.9895 -0.7401 1.3777 1.6905
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.4595 0.2129 -2.159 0.0309 *
racewhite -0.6954 0.3202 -2.172 0.0298 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 234.67 on 188 degrees of freedom
Residual deviance: 229.86 on 187 degrees of freedom
AIC: 233.86
Number of Fisher Scoring iterations: 4
In this second model i added the variable smoke which indicated whether the mother smoked during pregnancy or not. In this model we can see that mothers who smoked during pregnancy had a higher chance (1.1130) of giving birth to a baby that would be below normal birth weight of 2.5 kg. This relationship is significant at a .01 confidence level.
m1 <- glm(low ~ race + smoke, family = binomial, data = birthwt2)
summary(m1)
Call:
glm(formula = low ~ race + smoke, family = binomial, data = birthwt2)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.3402 -0.8840 -0.5433 1.4968 1.9930
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.7382 0.2379 -3.103 0.00191 **
racewhite -1.1003 0.3645 -3.019 0.00254 **
smoke 1.1130 0.3643 3.056 0.00225 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 234.67 on 188 degrees of freedom
Residual deviance: 219.98 on 186 degrees of freedom
AIC: 225.98
Number of Fisher Scoring iterations: 4
In this third model we added another variable called hypertension which recorded whether the mother had a history of high blood pressure. Having a history of smoking and high blood pressure can add to the chances of a child birth weight being below normal. In this model we see that hypertension does increase the likelihood that a childs birth weight will be below 2.5 kg (1.1725). However, this indicator is not statistically significant with a p-value of >.05.
m2 <- glm(low ~ race + smoke + hypertension, family = binomial, data = birthwt2)
summary(m2)
Call:
glm(formula = low ~ race + smoke + hypertension, family = binomial,
data = birthwt2)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.3371 -0.8496 -0.5222 1.0587 2.0297
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.8331 0.2460 -3.386 0.000708 ***
racewhite -1.0904 0.3686 -2.958 0.003095 **
smoke 1.1190 0.3681 3.040 0.002367 **
hypertension 1.1725 0.6225 1.883 0.059633 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 234.67 on 188 degrees of freedom
Residual deviance: 216.38 on 185 degrees of freedom
AIC: 224.38
Number of Fisher Scoring iterations: 4
In this final model we examine race and the interaction between smoking and hypertension to analyze the impact it has on the likelihood a child being born below normal child birth of 2.5 kg. The model shows that people of white race are still more likely to have a child that is of normal birth weight or higher also that smoking negatively affect that but there is no relationship between hypertension and having a child of lower birth weight.
m3 <- glm(low ~ race + smoke * hypertension, family = binomial, data = birthwt2)
summary(m3)
Call:
glm(formula = low ~ race + smoke * hypertension, family = binomial,
data = birthwt2)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.3706 -0.8456 -0.5213 1.0561 2.0314
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.8446 0.2518 -3.354 0.000797 ***
racewhite -1.0828 0.3700 -2.926 0.003431 **
smoke 1.1367 0.3766 3.018 0.002544 **
hypertension 1.2881 0.8128 1.585 0.113016
smoke:hypertension -0.2821 1.2652 -0.223 0.823550
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 234.67 on 188 degrees of freedom
Residual deviance: 216.33 on 184 degrees of freedom
AIC: 226.33
Number of Fisher Scoring iterations: 4
The likelihood ratio test illustrates that Model 4 is the best fit because it has the smallest deviance. Lower deviance in Model 4 means that it is a better fit for the data.
anova(m0, m1, m2, m3, test = "Chisq")
Analysis of Deviance Table
Model 1: low ~ race
Model 2: low ~ race + smoke
Model 3: low ~ race + smoke + hypertension
Model 4: low ~ race + smoke * hypertension
Resid. Df Resid. Dev Df Deviance Pr(>Chi)
1 187 229.86
2 186 219.98 1 9.8802 0.001671 **
3 185 216.38 1 3.5949 0.057956 .
4 184 216.33 1 0.0495 0.823892
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
From this test we see that model 2 is the best fit because it contained the lowest AIC(225.98) and lowest BIC(235.70) this may be due to the fact that we found hypertension was not a serious indicator of whether a child was born with a normal birth weight above 2.5 kg.
screenreg(list(m0, m1, m2, m3))
===================================================================
Model 1 Model 2 Model 3 Model 4
-------------------------------------------------------------------
(Intercept) -0.46 * -0.74 ** -0.83 *** -0.84 ***
(0.21) (0.24) (0.25) (0.25)
racewhite -0.70 * -1.10 ** -1.09 ** -1.08 **
(0.32) (0.36) (0.37) (0.37)
smoke 1.11 ** 1.12 ** 1.14 **
(0.36) (0.37) (0.38)
hypertension 1.17 1.29
(0.62) (0.81)
smoke:hypertension -0.28
(1.27)
-------------------------------------------------------------------
AIC 233.86 225.98 224.38 226.33
BIC 240.34 235.70 237.35 242.54
Log Likelihood -114.93 -109.99 -108.19 -108.17
Deviance 229.86 219.98 216.38 216.33
Num. obs. 189 189 189 189
===================================================================
*** p < 0.001, ** p < 0.01, * p < 0.05
Likelihood of child being born below normal birth weight by race. The graph shows us that whites have a lower chance of their child being born with low normal birth weight than non-whites.
library(visreg)
visreg(m1, "race", scale = "response")
The graphs clearly shows the relationship by which if mothers who responded yes to smoking during pregnancy had a higher chance of being born with a child that was below normal birth weight.
library(visreg)
visreg(m1, "smoke", scale = "response")