library(Zelig)
## Loading required package: survival
data(infert)
str(infert)
## 'data.frame': 248 obs. of 8 variables:
## $ education : Factor w/ 3 levels "0-5yrs","6-11yrs",..: 1 1 1 1 2 2 2 2 2 2 ...
## $ age : num 26 42 39 34 35 36 23 32 21 28 ...
## $ parity : num 6 1 6 4 3 4 1 2 1 2 ...
## $ induced : num 1 1 2 2 1 2 0 0 0 0 ...
## $ case : num 1 1 1 1 1 1 1 1 1 1 ...
## $ spontaneous : num 2 0 0 0 1 1 0 0 1 0 ...
## $ stratum : int 1 2 3 4 5 6 7 8 9 10 ...
## $ pooled.stratum: num 3 1 4 2 32 36 6 22 5 19 ...
library(ggplot2)
ggplot(infert, aes(x = parity)) + geom_histogram(binwidth = 1)
The histogram above shows the distribution of parity, that is, how many times each woman in this data set has been pregnant (including any current pregnancy).
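Because parity is a small integer count, a frequency table gives the exact numbers behind the histogram. The sketch below uses only base R and the built-in `infert` data; the object name `parity_counts` is introduced here for illustration.

```r
# Load the built-in data set used throughout this report
data(infert)

# Exact number of women at each parity value
parity_counts <- table(infert$parity)
parity_counts

# The same counts as proportions of the sample
round(prop.table(parity_counts), 2)
```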
m1 <- lm(spontaneous ~ parity, data = infert)
summary(m1)
##
## Call:
## lm(formula = spontaneous ~ parity, data = infert)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.2911 -0.5597 -0.3768 0.6232 1.4403
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.19396 0.08640 2.245 0.0257 *
## parity 0.18285 0.03545 5.158 5.15e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6973 on 246 degrees of freedom
## Multiple R-squared: 0.09758, Adjusted R-squared: 0.09392
## F-statistic: 26.6 on 1 and 246 DF, p-value: 5.145e-07
On average, each one-unit increase in parity is associated with 0.18 more spontaneous abortions (p < 0.001). A woman with a parity of 1 is predicted to have had about 0.38 spontaneous abortions (0.19 + 0.18). Parity alone explains roughly 10% of the variance in spontaneous abortions (R-squared = 0.098).
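One way to check this reading of the slope is to pull the coefficient out of the model and compare predictions at neighbouring parity values. This is a minimal sketch that refits the same model; the parity values 1 to 4 and the names `b`, `new_women`, and `pred` are illustrative choices, not part of the original analysis.

```r
# Refit the bivariate model from the built-in infert data
data(infert)
m1 <- lm(spontaneous ~ parity, data = infert)

# Slope: expected change in spontaneous abortions per one-unit increase in parity
b <- coef(m1)
b["parity"]

# Predicted values at parity 1 through 4 (illustrative values)
new_women <- data.frame(parity = 1:4)
pred <- predict(m1, newdata = new_women)
cbind(new_women, predicted = pred)

# 95% confidence interval for the slope
confint(m1, "parity")
```

Each step up in parity raises the prediction by exactly the slope, which is what "per one-unit increase" means in a linear model.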
ggplot(infert, aes(x = education)) + geom_bar()
m2 <- lm(spontaneous ~ education, data = infert)
summary(m2)
##
## Call:
## lm(formula = spontaneous ~ education, data = infert)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.6293 -0.5417 -0.5417 0.4583 1.5833
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.4167 0.2117 1.968 0.0502 .
## education6-11yrs 0.1250 0.2220 0.563 0.5740
## education12+ yrs 0.2126 0.2224 0.956 0.3399
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7334 on 245 degrees of freedom
## Multiple R-squared: 0.005852, Adjusted R-squared: -0.002263
## F-statistic: 0.7211 on 2 and 245 DF, p-value: 0.4872
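Women with 6 to 11 years of education have, on average, 0.13 more spontaneous abortions than women with 0 to 5 years of education (the reference group, whose mean is 0.42). Women with 12 or more years of education have, on average, 0.21 more than the reference group. Neither difference is statistically significant (p = 0.57 and p = 0.34), and education alone explains less than 1% of the variance.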
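With a single categorical predictor, the intercept is the mean of the reference group and each coefficient is a difference in group means. A short sketch, assuming the default treatment coding that `lm()` applies to factors, makes this visible; `group_means` is a name introduced here.

```r
# Mean number of spontaneous abortions in each education group
data(infert)
group_means <- tapply(infert$spontaneous, infert$education, mean)
group_means

# Differences from the reference category (0-5yrs)
group_means - group_means["0-5yrs"]

# These match the intercept and the two education coefficients
m2 <- lm(spontaneous ~ education, data = infert)
coef(m2)

# Group sizes, useful for judging how precise each mean is
table(infert$education)
```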
m3 <- lm(spontaneous ~ education + parity, data = infert)
summary(m3)
##
## Call:
## lm(formula = spontaneous ~ education + parity, data = infert)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.1316 -0.5182 -0.2832 0.5732 1.6421
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.5818 0.2546 -2.285 0.023155 *
## education6-11yrs 0.6301 0.2223 2.835 0.004968 **
## education12+ yrs 0.7737 0.2260 3.423 0.000727 ***
## parity 0.2349 0.0379 6.199 2.4e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.683 on 244 degrees of freedom
## Multiple R-squared: 0.1411, Adjusted R-squared: 0.1306
## F-statistic: 13.36 on 3 and 244 DF, p-value: 4.181e-08
The intercept (-0.58) is the predicted number of spontaneous abortions for a woman with 0 to 5 years of education and a parity of zero; it is not an effect of education. Controlling for parity, women with 6 to 11 years of education have 0.63 more spontaneous abortions on average than women with 0 to 5 years, and women with 12 or more years have 0.77 more. Controlling for education, each one-unit increase in parity is associated with 0.23 more spontaneous abortions. All three coefficients are statistically significant at the 0.05 level.
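"Controlling for parity" can be made concrete by predicting for the three education groups at one fixed parity value. The sketch below holds parity at 2, an arbitrary illustrative choice; `grid` is a hypothetical name for the prediction data frame.

```r
# Refit the additive model
data(infert)
m3 <- lm(spontaneous ~ education + parity, data = infert)

# One row per education level, all with the same parity (2 is illustrative)
grid <- data.frame(
  education = factor(levels(infert$education), levels = levels(infert$education)),
  parity = 2
)

# Predicted spontaneous abortions for each group at that parity
grid$predicted <- predict(m3, newdata = grid)
grid
```

The gaps between the predicted values equal the education coefficients, whichever parity value is held fixed.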
m4 <- lm(spontaneous ~ education*parity, data = infert)
summary(m4)
##
## Call:
## lm(formula = spontaneous ~ education * parity, data = infert)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.1319 -0.5106 -0.1999 0.5651 1.5970
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.18408 0.45149 0.408 0.6838
## education6-11yrs -0.29483 0.47172 -0.625 0.5326
## education12+ yrs 0.02538 0.46840 0.054 0.9568
## parity 0.05473 0.09572 0.572 0.5680
## education6-11yrs:parity 0.25595 0.11192 2.287 0.0231 *
## education12+ yrs:parity 0.17075 0.11181 1.527 0.1280
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6785 on 242 degrees of freedom
## Multiple R-squared: 0.1595, Adjusted R-squared: 0.1421
## F-statistic: 9.182 on 5 and 242 DF, p-value: 5.138e-08
The only coefficient that is statistically significant is the interaction between 6 to 11 years of education and parity. Its estimate is 0.26 with a p-value of 0.02, which is below the conventional 0.05 threshold. A coefficient is considered statistically significant at the 5% level when its p-value is less than or equal to 0.05; the t value itself is simply the estimate divided by its standard error. Substantively, the interaction means that the association between parity and spontaneous abortions is 0.26 larger per unit of parity for women with 6 to 11 years of education than for women with 0 to 5 years.
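Interaction coefficients are easier to read once they are turned into a parity slope for each education group. The sketch below adds each interaction term to the baseline parity coefficient, then uses `anova()` to test whether the two interaction terms jointly improve on the additive model. The name `slopes` is introduced here for illustration.

```r
# Refit the additive and interaction models
data(infert)
m3 <- lm(spontaneous ~ education + parity, data = infert)
m4 <- lm(spontaneous ~ education * parity, data = infert)

# Parity slope within each education group:
# baseline slope plus that group's interaction term
b <- coef(m4)
slopes <- c(
  "0-5yrs"  = b[["parity"]],
  "6-11yrs" = b[["parity"]] + b[["education6-11yrs:parity"]],
  "12+ yrs" = b[["parity"]] + b[["education12+ yrs:parity"]]
)
slopes

# F test of the additive model against the interaction model
anova(m3, m4)
```

Because the model interacts education with the only other predictor, each of these slopes is the same as the slope from a separate regression of `spontaneous` on `parity` within that education group.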
library(texreg)
## Version: 1.36.23
## Date: 2017-03-03
## Author: Philip Leifeld (University of Glasgow)
##
## Please cite the JSS article in your publications -- see citation("texreg").
screenreg(list(m1, m2, m3, m4))
##
## ==================================================================
## Model 1 Model 2 Model 3 Model 4
## ------------------------------------------------------------------
## (Intercept) 0.19 * 0.42 -0.58 * 0.18
## (0.09) (0.21) (0.25) (0.45)
## parity 0.18 *** 0.23 *** 0.05
## (0.04) (0.04) (0.10)
## education6-11yrs 0.13 0.63 ** -0.29
## (0.22) (0.22) (0.47)
## education12+ yrs 0.21 0.77 *** 0.03
## (0.22) (0.23) (0.47)
## education6-11yrs:parity 0.26 *
## (0.11)
## education12+ yrs:parity 0.17
## (0.11)
## ------------------------------------------------------------------
## R^2 0.10 0.01 0.14 0.16
## Adj. R^2 0.09 -0.00 0.13 0.14
## Num. obs. 248 248 248 248
## RMSE 0.70 0.73 0.68 0.68
## ==================================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05