Raven Shan
Overpopulation has long been an issue in Indonesia. In the late 1960s, the Indonesian government began taking measures to prevent overpopulation; one of these measures was the implementation of the Family Planning Program. The program’s most notable efforts include community education and the distribution of free contraceptives. Overall, the program proved effective: contraceptive use increased and fertility rates fell substantially between 1976 and 2002 (Putjuk, 2014). In more recent years, however, family planning in Indonesia has reappeared as a topic of discussion. Currently, the Indonesian government is looking to renew and revamp the Family Planning Program in an effort to regain better control of the Indonesian population.
In 1987, the National Indonesia Contraceptive Prevalence Survey was conducted to assess women’s contraceptive use after the implementation of the Family Planning Program. The survey was given to married women who were reportedly not pregnant at the time. For this assignment, I analyze a subset of this survey, which I retrieved from the UCI Machine Learning Repository. The subset contains 1,473 observations of 10 variables. The outcome variable is the type of contraceptive method used: 1 = No contraceptive use, 2 = Long-term contraceptive use, 3 = Short-term contraceptive use. The dataset also includes variables pertaining to the husband, such as his education and occupation, but for the purposes of this assignment I chose to concentrate solely on the child-bearer, the wife.
In this analysis, I fit a multinomial logit regression model using the mlogit model in the ZeligChoice package. The objective is to predict the type of contraceptive method used (no use, long-term methods, or short-term methods) by Indonesian women based on demographic and socio-economic factors. The predictor variables I examine are the woman’s education, her level of media exposure, and her employment status. In sum, this analysis examines the effect of these three factors on the probability of her choosing one contraceptive method over another.
The variables used in this analysis are as follows:
Contraceptive.Method.Used - This variable will act as the dependent variable: 1 = No contraceptive use, 2 = Long-term contraceptive use, 3 = Short-term contraceptive use
Wife.Edu - This measures the level of education reached by the woman: 1 = low, 2, 3, 4 = high. It is important to note that the scale is according to Indonesian standards, and may vary considerably in other countries.
Media.Exposure - This binary variable represents the woman’s amount of media exposure: 0 = Frequent, 1 = Infrequent
Wife.Working - This binary variable records whether the woman is employed or not: 0 = Yes, 1 = No
Observations: 1,473
Variables: 4
$ Contraceptive.Method.Used <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,...
$ Wife.Edu <fct> 2, 1, 2, 3, 3, 4, 2, 3, 2, 1, 1, 1, 4, 2, 3, 2,...
$ Wife.Working <fct> 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1,...
$ Media.Exposure <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0,...
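The structure printed above can be reproduced on a small scale as follows. This is a minimal sketch rather than the original preparation script: the ten rows are simply the first values visible in the printed output, the object name `cmc_head` is hypothetical, and the `Method.Label` column is my own addition for readability.

```r
# First ten observations, copied from the printed output above.
cmc_head <- data.frame(
  Contraceptive.Method.Used = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
  Wife.Edu                  = c(2, 1, 2, 3, 3, 4, 2, 3, 2, 1),
  Wife.Working              = c(1, 1, 1, 1, 1, 1, 1, 0, 1, 1),
  Media.Exposure            = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1)
)

# Hypothetical helper column with readable labels for the outcome.
cmc_head$Method.Label <- factor(
  cmc_head$Contraceptive.Method.Used,
  levels = 1:3,
  labels = c("No use", "Long-term", "Short-term")
)

# Store each coded variable as a factor with its full set of levels,
# so that a model treats the codes as categories rather than numbers.
var_levels <- list(
  Contraceptive.Method.Used = 1:3,
  Wife.Edu                  = 1:4,
  Wife.Working              = 0:1,
  Media.Exposure            = 0:1
)
for (v in names(var_levels)) {
  cmc_head[[v]] <- factor(cmc_head[[v]], levels = var_levels[[v]])
}

str(cmc_head)
```

Declaring all levels up front keeps categories that happen to be absent from a small sample (here, methods 2 and 3) from being dropped.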
Contraceptive Method Used
1 = No contraceptive use, 2 = Long-term contraceptive use, 3 = Short-term contraceptive use
No contraceptive use is the most common category among the women in this dataset, and long-term contraception is the least common.
Education Level
1 = low, 2, 3, 4 = high
The highest level of education is the most common category among the women in this dataset.
Employment Status
Wife.Working: 0=Yes, 1=No
The majority of women in this dataset are not employed, which appears to be common for women in Indonesia.
Media Exposure
0 = Frequent, 1 = Infrequent
Most of the women in this dataset have frequent exposure to the media as depicted by the figure below.
This regression model estimates the effect of a woman’s education, employment status, and media exposure on the probability of her using one contraceptive method over another. As my dependent variable is a categorical variable with three categories, a multinomial logit model will be used for this regression.
First, I tested different interaction terms, as seen in Models 2, 3, and 4. As none of the interactions were significant, the subsequent simulations are based on Model 1. (Note: only the raw output of each model is shown, as ZeligChoice is not yet supported by the texreg package.)
# The "mlogit" model is provided by ZeligChoice, which builds on Zelig.
library(ZeligChoice)
model1 <- zelig(Contraceptive.Method.Used ~ Wife.Edu + Wife.Working + Media.Exposure, model = "mlogit", data = CMCDataset, cite = FALSE)
summary(model1)
Model:
Call:
z5$zelig(formula = Contraceptive.Method.Used ~ Wife.Edu + Wife.Working +
Media.Exposure, data = CMCDataset)
Pearson residuals:
Min 1Q Median 3Q Max
log(mu[,1]/mu[,3]) -2.122 -0.9830 -0.3173 0.99208 1.527
log(mu[,2]/mu[,3]) -1.221 -0.7077 -0.2263 -0.07663 4.391
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept):1 1.0030 0.2344 4.280 1.87e-05
(Intercept):2 -1.3779 0.4110 -3.353 0.000800
Wife.Edu2:1 -0.4090 0.2312 -1.769 0.076887
Wife.Edu2:2 0.3357 0.4252 0.789 0.429837
Wife.Edu3:1 -0.6394 0.2296 -2.785 0.005349
Wife.Edu3:2 0.8630 0.4085 2.113 0.034642
Wife.Edu4:1 -0.8766 0.2296 -3.819 0.000134
Wife.Edu4:2 1.5726 0.4006 3.926 8.65e-05
Wife.Working1:1 -0.3418 0.1426 -2.397 0.016526
Wife.Working1:2 -0.1883 0.1675 -1.125 0.260782
Media.Exposure1:1 0.6017 0.2581 2.331 0.019736
Media.Exposure1:2 0.1189 0.4061 0.293 0.769747
Number of linear predictors: 2
Names of linear predictors: log(mu[,1]/mu[,3]), log(mu[,2]/mu[,3])
Residual deviance: 2985.688 on 2934 degrees of freedom
Log-likelihood: -1492.844 on 2934 degrees of freedom
Number of iterations: 5
No Hauck-Donner effect found in any of the estimates
Reference group is level 3 of the response
Next step: Use 'setx' method
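The coefficients above are log-odds relative to short-term use (level 3), which makes them hard to read directly. As a rough illustration, the sketch below turns the printed Model 1 estimates into predicted probabilities for one profile. The matrix layout, row labels, and the helper `predict_probs()` are my own and are not part of Zelig; the profile shown (highest education, not working, frequent media exposure) is chosen only as an example.

```r
# Model 1 estimates copied from the summary above.
# Column "no_use" is log(P(1)/P(3)); column "long_term" is log(P(2)/P(3)).
coefs <- matrix(
  c( 1.0030, -1.3779,
    -0.4090,  0.3357,
    -0.6394,  0.8630,
    -0.8766,  1.5726,
    -0.3418, -0.1883,
     0.6017,  0.1189),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("intercept", "edu2", "edu3", "edu4", "not_working", "media_infrequent"),
    c("no_use", "long_term")
  )
)

# Hypothetical helper: probabilities for a profile described by the
# dummy terms that are switched on (baseline levels need no term).
predict_probs <- function(active_terms) {
  eta  <- colSums(coefs[c("intercept", active_terms), , drop = FALSE])
  odds <- c(exp(eta), short_term = 1)  # the reference category has log-odds 0
  odds / sum(odds)
}

# Highest education, not working, frequent media exposure.
p_high <- predict_probs(c("edu4", "not_working"))
round(p_high, 3)
```

For this profile the three probabilities come out at roughly .29 (no use), .36 (long-term), and .36 (short-term).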
This model tests the interaction between a woman’s level of education and her employment status.
model2 <- zelig(Contraceptive.Method.Used ~ Wife.Edu * Wife.Working + Media.Exposure, model = "mlogit", data = CMCDataset, cite = FALSE)
summary(model2)
Model:
Call:
z5$zelig(formula = Contraceptive.Method.Used ~ Wife.Edu * Wife.Working +
Media.Exposure, data = CMCDataset)
Pearson residuals:
Min 1Q Median 3Q Max
log(mu[,1]/mu[,3]) -2.202 -0.9945 -0.3147 0.99053 1.539
log(mu[,2]/mu[,3]) -1.281 -0.6800 -0.2302 -0.06796 4.577
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept):1 1.10107 0.43472 2.533 0.0113
(Intercept):2 -0.88041 0.69716 -1.263 0.2066
Wife.Edu2:1 -0.56255 0.50119 -1.122 0.2617
Wife.Edu2:2 -0.36421 0.81521 -0.447 0.6550
Wife.Edu3:1 -0.83054 0.49372 -1.682 0.0925
Wife.Edu3:2 0.21298 0.75998 0.280 0.7793
Wife.Edu4:1 -0.90203 0.47584 -1.896 0.0580
Wife.Edu4:2 1.15103 0.72249 1.593 0.1111
Wife.Working1:1 -0.46327 0.47878 -0.968 0.3332
Wife.Working1:2 -0.86808 0.82114 -1.057 0.2904
Media.Exposure1:1 0.60698 0.25858 2.347 0.0189
Media.Exposure1:2 0.13964 0.40682 0.343 0.7314
Wife.Edu2:Wife.Working1:1 0.19011 0.55906 0.340 0.7338
Wife.Edu2:Wife.Working1:2 0.93138 0.95055 0.980 0.3272
Wife.Edu3:Wife.Working1:1 0.23819 0.54889 0.434 0.6643
Wife.Edu3:Wife.Working1:2 0.86964 0.89054 0.977 0.3288
Wife.Edu4:Wife.Working1:1 0.02104 0.53149 0.040 0.9684
Wife.Edu4:Wife.Working1:2 0.57570 0.85133 0.676 0.4989
Number of linear predictors: 2
Names of linear predictors: log(mu[,1]/mu[,3]), log(mu[,2]/mu[,3])
Residual deviance: 2984.123 on 2928 degrees of freedom
Log-likelihood: -1492.062 on 2928 degrees of freedom
Number of iterations: 5
No Hauck-Donner effect found in any of the estimates
Reference group is level 3 of the response
Next step: Use 'setx' method
This model tests the interaction between a woman’s education and her level of media exposure.
model3 <- zelig(Contraceptive.Method.Used ~ Wife.Edu * Media.Exposure + Wife.Working, model = "mlogit", data = CMCDataset, cite = FALSE)
summary(model3)
Model:
Call:
z5$zelig(formula = Contraceptive.Method.Used ~ Wife.Edu * Media.Exposure +
Wife.Working, data = CMCDataset)
Pearson residuals:
Min 1Q Median 3Q Max
log(mu[,1]/mu[,3]) -2.285 -0.9193 -0.3179 0.98368 1.523
log(mu[,2]/mu[,3]) -1.109 -0.6970 -0.2261 -0.08411 4.194
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept):1 0.93814 0.25015 3.750 0.000177
(Intercept):2 -1.46319 0.46544 -3.144 0.001668
Wife.Edu2:1 -0.33273 0.25621 -1.299 0.194061
Wife.Edu2:2 0.37547 0.48906 0.768 0.442647
Wife.Edu3:1 -0.58345 0.25157 -2.319 0.020382
Wife.Edu3:2 0.94137 0.46874 2.008 0.044609
Wife.Edu4:1 -0.79320 0.24830 -3.194 0.001401
Wife.Edu4:2 1.68044 0.45847 3.665 0.000247
Media.Exposure1:1 0.82769 0.41667 1.986 0.046987
Media.Exposure1:2 0.41720 0.79598 0.524 0.600186
Wife.Working1:1 -0.34535 0.14281 -2.418 0.015597
Wife.Working1:2 -0.18934 0.16796 -1.127 0.259599
Wife.Edu2:Media.Exposure1:1 -0.30652 0.62587 -0.490 0.624309
Wife.Edu2:Media.Exposure1:2 0.24523 1.03265 0.237 0.812286
Wife.Edu3:Media.Exposure1:1 0.08356 0.72711 0.115 0.908504
Wife.Edu3:Media.Exposure1:2 -0.04075 1.11227 -0.037 0.970774
Wife.Edu4:Media.Exposure1:1 -1.43979 0.96930 -1.485 0.137441
Wife.Edu4:Media.Exposure1:2 -15.58008 568.66653 NA NA
Number of linear predictors: 2
Names of linear predictors: log(mu[,1]/mu[,3]), log(mu[,2]/mu[,3])
Residual deviance: 2977.462 on 2928 degrees of freedom
Log-likelihood: -1488.731 on 2928 degrees of freedom
Number of iterations: 14
Warning: Hauck-Donner effect detected in the following estimate(s):
'Wife.Edu4:Media.Exposure1:2'
Reference group is level 3 of the response
Next step: Use 'setx' method
This model tests the interaction between a woman’s employment status and level of media exposure.
model4 <- zelig(Contraceptive.Method.Used ~ Wife.Edu + Wife.Working * Media.Exposure, model = "mlogit", data = CMCDataset, cite = FALSE)
summary(model4)
Model:
Call:
z5$zelig(formula = Contraceptive.Method.Used ~ Wife.Edu + Wife.Working *
Media.Exposure, data = CMCDataset)
Pearson residuals:
Min 1Q Median 3Q Max
log(mu[,1]/mu[,3]) -2.081 -0.9823 -0.3159 0.99310 1.532
log(mu[,2]/mu[,3]) -1.480 -0.7021 -0.2273 -0.06452 4.745
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept):1 1.00665 0.23491 4.285 1.82e-05
(Intercept):2 -1.37992 0.41030 -3.363 0.000770
Wife.Edu2:1 -0.40686 0.23163 -1.756 0.079006
Wife.Edu2:2 0.31402 0.42510 0.739 0.460086
Wife.Edu3:1 -0.63761 0.22990 -2.773 0.005547
Wife.Edu3:2 0.84398 0.40827 2.067 0.038713
Wife.Edu4:1 -0.87575 0.22974 -3.812 0.000138
Wife.Edu4:2 1.55529 0.40008 3.887 0.000101
Wife.Working1:1 -0.34898 0.14732 -2.369 0.017846
Wife.Working1:2 -0.16158 0.17111 -0.944 0.345022
Media.Exposure1:1 0.58517 0.52976 1.105 0.269333
Media.Exposure1:2 0.62547 0.70526 0.887 0.375147
Wife.Working1:Media.Exposure1:1 0.02763 0.59184 0.047 0.962760
Wife.Working1:Media.Exposure1:2 -0.74304 0.84672 -0.878 0.380188
Number of linear predictors: 2
Names of linear predictors: log(mu[,1]/mu[,3]), log(mu[,2]/mu[,3])
Residual deviance: 2984.633 on 2932 degrees of freedom
Log-likelihood: -1492.317 on 2932 degrees of freedom
Number of iterations: 5
No Hauck-Donner effect found in any of the estimates
Reference group is level 3 of the response
Next step: Use 'setx' method
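One way to check whether the interaction terms improve the fit as a group is a likelihood-ratio comparison of each interaction model against Model 1. The sketch below uses only the residual deviances and degrees of freedom printed in the four summaries; the data frame and column names are my own, and the comparison assumes the models are nested and fitted to the same observations. The result for Model 3 should be read with extra caution because of its Hauck-Donner warning.

```r
# Residual deviance and residual degrees of freedom from the summaries above.
fits <- data.frame(
  model    = c("model1", "model2", "model3", "model4"),
  deviance = c(2985.688, 2984.123, 2977.462, 2984.633),
  df_resid = c(2934, 2928, 2928, 2932)
)

base <- fits[fits$model == "model1", ]
lr   <- fits[fits$model != "model1", ]

# Likelihood-ratio statistic: drop in deviance relative to Model 1,
# compared against a chi-squared distribution with df = number of added terms.
lr$lr_stat <- base$deviance - lr$deviance
lr$df      <- base$df_resid - lr$df_resid
lr$p_value <- pchisq(lr$lr_stat, df = lr$df, lower.tail = FALSE)

print(lr[, c("model", "lr_stat", "df", "p_value")], row.names = FALSE)
```

All three p-values are well above .05, which is consistent with basing the simulations on Model 1.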
Here, the two counter-factual situations being compared are the woman’s level of education: high (Wife.Edu = 4) or low (Wife.Edu = 1). I will start by examining the results for the first counter-factual situation, in which the woman’s education level is high. The results suggest that highly educated women are nearly equally likely to use long-term or short-term methods; the probabilities are .358 and .356, respectively. The probability of the woman using no contraception at all is .29.
The next step is to examine the results for the second counter-factual situation, in which the woman’s education is low. The results indicate that among women with low education, by far the most common “method” is using no form of contraception at all. The probability of her forgoing contraception is .61. Meanwhile, the probability of using long-term contraception methods is merely .07, and the probability of using short-term methods is .32.
Turning to the size of the effect, I also examine the first difference (the ‘fd’ value). The first difference is the simulated difference in the probability of each contraceptive outcome between the two counter-factual situations, computed here as low education minus high education. The results indicate that the probability of using no contraception at all is .33 higher for women with low education, which is a considerable difference. The probability of long-term contraception use is .29 lower for women with low education. For short-term use, the probability is only .04 lower among these same women, and the 95% interval for that difference includes zero.
x.low <- setx(model1, Wife.Edu = 1)
x.high <- setx(model1, Wife.Edu = 4)
w.edu <- sim(model1, x = x.high, x1 = x.low)
summary(w.edu)
sim x :
-----
ev
mean sd 50% 2.5% 97.5%
Pr(Y=1) 0.2858052 0.02051591 0.2859207 0.2447516 0.3259854
Pr(Y=2) 0.3577311 0.02295682 0.3577726 0.3164730 0.4036156
Pr(Y=3) 0.3564637 0.02131764 0.3572185 0.3155794 0.3972875
pv
1 2 3
[1,] 0.297 0.334 0.369
sim x1 :
-----
ev
mean sd 50% 2.5% 97.5%
Pr(Y=1) 0.61208612 0.04560393 0.61214054 0.52408448 0.6985460
Pr(Y=2) 0.06924483 0.02356310 0.06467699 0.03506892 0.1270507
Pr(Y=3) 0.31866905 0.04416071 0.31533577 0.24102696 0.4092072
pv
1 2 3
[1,] 0.621 0.071 0.308
fd
mean sd 50% 2.5% 97.5%
Pr(Y=1) 0.32628096 0.04952370 0.32587516 0.2351278 0.42042467
Pr(Y=2) -0.28848628 0.03259555 -0.28945981 -0.3470256 -0.22258086
Pr(Y=3) -0.03779467 0.04928690 -0.03900134 -0.1290733 0.06229219
plot(w.edu)
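To make the direction of the first difference concrete, the sketch below recomputes it from the mean expected values printed above. It only subtracts the reported means (scenario x1, low education, minus scenario x, high education), so it illustrates the definition rather than rerunning the simulation; the object names are my own.

```r
# Mean expected values copied from the summary above.
ev_high <- c(no_use = 0.2858052,  long_term = 0.3577311,  short_term = 0.3564637)  # sim x  (Wife.Edu = 4)
ev_low  <- c(no_use = 0.61208612, long_term = 0.06924483, short_term = 0.31866905) # sim x1 (Wife.Edu = 1)

# First difference: scenario x1 minus scenario x.
fd_edu <- ev_low - ev_high
round(fd_edu, 3)
```

Because the three probabilities sum to one in each scenario, the three first differences sum to zero: the gain in "no use" is offset by the losses in the other two categories.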
Media exposure (binary): 0=Frequent, 1=Infrequent
The two counter-factual situations in this simulation are frequent and infrequent exposure to media. The results indicate that women with infrequent media exposure are most likely to use no contraception at all; the probability of doing so is .40. The first difference values indicate that the probability of forgoing contraception is .12 higher among women with infrequent media exposure than among women with frequent exposure. By comparison, women with frequent exposure are most likely to use either long-term or short-term methods; the probabilities are .357 and .358, respectively. The probability of these women using no contraception at all is .29.
x.frequent <- setx(model1, Media.Exposure = 0)
x.infrequent <- setx(model1, Media.Exposure = 1)
w.med <- sim(model1, x = x.frequent, x1 = x.infrequent)
summary(w.med)
sim x :
-----
ev
mean sd 50% 2.5% 97.5%
Pr(Y=1) 0.2858126 0.02042095 0.2852634 0.2467859 0.3270618
Pr(Y=2) 0.3565281 0.02228227 0.3560594 0.3138779 0.4011361
Pr(Y=3) 0.3576592 0.02195915 0.3583364 0.3138782 0.3975370
pv
1 2 3
[1,] 0.303 0.347 0.35
sim x1 :
-----
ev
mean sd 50% 2.5% 97.5%
Pr(Y=1) 0.4030783 0.06191986 0.4020089 0.2781678 0.5271446
Pr(Y=2) 0.3144288 0.07879188 0.3066767 0.1798903 0.4859188
Pr(Y=3) 0.2824929 0.05817214 0.2795322 0.1746189 0.4010451
pv
1 2 3
[1,] 0.408 0.326 0.266
fd
mean sd 50% 2.5% 97.5%
Pr(Y=1) 0.11726572 0.05831227 0.11586095 0.002960504 0.23257127
Pr(Y=2) -0.04209937 0.07596227 -0.04898901 -0.171548091 0.12152613
Pr(Y=3) -0.07516635 0.05545549 -0.07910464 -0.181335528 0.03624811
plot(w.med)
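A complementary way to read the media-exposure effect is as a relative risk ratio, that is, the exponentiated coefficient. The sketch below uses the two Media.Exposure1 estimates and standard errors from the Model 1 summary and adds approximate 95% Wald intervals. The object names are my own, and the Wald interval is an assumption of this sketch, not something the simulations above rely on.

```r
# Media.Exposure1 estimates and standard errors from the Model 1 summary.
est <- c(no_use_vs_short = 0.6017, long_vs_short = 0.1189)
se  <- c(no_use_vs_short = 0.2581, long_vs_short = 0.4061)

z <- qnorm(0.975)  # about 1.96 for a 95% Wald interval

# Exponentiate the log-odds scale to obtain ratios relative to short-term use.
rrr <- data.frame(
  rrr   = exp(est),
  lower = exp(est - z * se),
  upper = exp(est + z * se)
)
round(rrr, 2)
```

For women with infrequent media exposure, the ratio for no use versus short-term use is about 1.8 with an interval that stays above 1, whereas the interval for long-term versus short-term use spans 1.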
Wife.Working (binary): 0=Yes, 1=No
The two counter-factual situations being compared are whether the woman is currently employed or unemployed. The first differences are very small for every contraceptive method. Compared to their employed counterparts, the probability of using no contraception is .05 lower among unemployed women. For long-term contraception use, the probability is .005 lower among the unemployed, and for short-term contraception use, the probability is .06 higher among this same group. Therefore, although employment status is statistically significant in the model, a woman’s choice of contraception varies only slightly based on whether or not she is employed.
x.yes <- setx(model1, Wife.Working = 0)
x.no <- setx(model1, Wife.Working = 1)
w.work <- sim(model1, x = x.yes, x1 = x.no)
summary(w.work)
sim x :
-----
ev
mean sd 50% 2.5% 97.5%
Pr(Y=1) 0.3385224 0.02820220 0.3371739 0.2871586 0.3967852
Pr(Y=2) 0.3627914 0.03030408 0.3616677 0.3058960 0.4257095
Pr(Y=3) 0.2986862 0.02700949 0.2992076 0.2480362 0.3554056
pv
1 2 3
[1,] 0.325 0.35 0.325
sim x1 :
-----
ev
mean sd 50% 2.5% 97.5%
Pr(Y=1) 0.2874611 0.02037128 0.2867549 0.2488568 0.3276109
Pr(Y=2) 0.3577279 0.02210657 0.3578123 0.3142851 0.4007271
Pr(Y=3) 0.3548110 0.02196311 0.3541075 0.3146654 0.3993760
pv
1 2 3
[1,] 0.266 0.361 0.373
fd
mean sd 50% 2.5% 97.5%
Pr(Y=1) -0.051061268 0.02752600 -0.051284394 -0.105384630 0.0003324921
Pr(Y=2) -0.005063539 0.03293110 -0.004549248 -0.070521160 0.0570594030
Pr(Y=3) 0.056124807 0.02833028 0.054654761 0.004725213 0.1134128544
plot(w.work)
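To compare the three simulations side by side, the sketch below collects the 2.5% and 97.5% bounds of every first difference printed above and flags the intervals that exclude zero. The data frame and its labels are my own; because the bounds come from simulation, borderline cases (such as "no use" by employment status, whose upper bound is barely above zero) could fall on either side in another run.

```r
# 95% simulation intervals of the first differences, copied from the three summaries.
fd_ci <- data.frame(
  comparison = rep(c("education: low - high",
                     "media: infrequent - frequent",
                     "working: no - yes"), each = 3),
  outcome = rep(c("no use", "long-term", "short-term"), times = 3),
  lower = c( 0.2351278,   -0.3470256,   -0.1290733,
             0.002960504, -0.171548091, -0.181335528,
            -0.105384630, -0.070521160,  0.004725213),
  upper = c( 0.42042467,   -0.22258086,   0.06229219,
             0.23257127,    0.12152613,   0.03624811,
             0.0003324921,  0.0570594030, 0.1134128544)
)

# An interval excludes zero when both bounds share the same sign.
fd_ci$excludes_zero <- fd_ci$lower > 0 | fd_ci$upper < 0
fd_ci
```

Read this way, education shows the clearest shifts, while the media and employment effects each have only one interval that excludes zero.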
In this assignment, I examined the effect of Indonesian women’s education level, media exposure, and employment status on the probability of choosing one contraceptive method over another. The results indicate that women with low education are, by far, most likely to use no form of contraception at all. Compared to their highly educated counterparts, the probability of forgoing contraception is .33 higher among women with low education. Because the data do not specify it, it would have been interesting and insightful to know what exactly constitutes “high” and “low” levels of education in a country such as Indonesia. In terms of media exposure, the results suggest that women with infrequent media exposure are most likely to use no contraception at all; the probability of doing so is .40. The first difference values indicate that the probability of not using contraception is .12 higher among women with infrequent media exposure than among women with frequent exposure. Women with frequent media exposure are most likely to use either long-term or short-term methods. This is not surprising as the 1987 Family-Planning campaign, which promoted contraception use, was largely media-based. Lastly, in terms of a woman’s employment status, it turns out that working women are slightly more likely to forgo contraception (by .05). The first difference values suggest that a woman’s choice of contraception varies only slightly based on whether or not she is employed. Furthermore, there is not much variation in which contraception method is preferred among employed and unemployed women. Ultimately, these findings help provide a better understanding of some of the factors that affect a woman’s contraceptive choices. However, there is still a lot of underlying information and many factors left to be uncovered.
References