Introduction

I recently watched the documentary “Motherland” on PBS, described as a “vérité look at the busiest maternity hospital on the planet, in one of the world’s most populous countries: the Philippines”. As a first-generation American-born Filipina, I was left in awe by this hyper-realistic film. It showcased the lives of girls younger than I am, having their first child and caught in what seems to be a never-ending cycle of adolescent birth. One 26-year-old in particular already had six children. I’ve thought about this documentary a lot since I saw it, and about how we can help empower women or give them access to the right tools so they won’t be caught in a cycle of continuous pregnancy. I asked myself: what in particular makes America different from the Philippines? So I decided to look at the countries in which adolescent birth rates are increasing or decreasing, and then determine potential reasons. In essence:

Which countries have the most significant increasing/decreasing adolescent birth rates and why?

Loading Data

# Packages used below: %>% and filter() (dplyr), ggplot() (ggplot2), autoplot() on lm fits (ggfortify)
library(dplyr)
library(tidyr)
library(ggplot2)
library(ggfortify)
gendered_world_indicators <- read.csv("https://raw.githubusercontent.com/Michelebradley/DATA-606/master/Gender_World%20_Indicators.csv", header=TRUE, check.names = FALSE)
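
The code in the following sections relies on `tidy_gendered_world_indicators`, a long table with `Country`, `Indicator`, `year`, and `n` columns whose construction is not shown here. As a rough sketch only, assuming the raw file has one column per year (the toy table, its indicator names, and its values below are made up for illustration), a wide table can be reshaped into that long form in base R:

```r
# Toy wide table: one row per country/indicator, one column per year.
# All values are invented; the real file may be laid out differently.
wide <- data.frame(
  Country   = c("A", "A", "B"),
  Indicator = c("Adolescent fertility rate",
                "Survival to age 65, female",
                "Adolescent fertility rate"),
  `1960` = c(90, 60, 120),
  `1961` = c(88, 61, NA),
  check.names = FALSE
)

# Every column that is not an identifier is treated as a year.
year_cols <- setdiff(names(wide), c("Country", "Indicator"))

# Stack one block of rows per year column.
tidy <- do.call(rbind, lapply(year_cols, function(y) {
  data.frame(Country = wide$Country, Indicator = wide$Indicator,
             year = y, n = wide[[y]], stringsAsFactors = FALSE)
}))

# Drop country/indicator/year combinations with no recorded value.
tidy <- tidy[!is.na(tidy$n), ]
tidy
```

With tidyr loaded, `pivot_longer()` would do the same job in one call; the base R version is used here only to keep the sketch dependency-free.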

I aim to focus primarily on the Philippines, The United States of America, and Zambia: three countries that have varied adolescent birth rates. While the adolescent birth rate in the Philippines has been increasing, in Zambia it is decreasing significantly. I chose The United States of America because it is the country I know best culturally and because it is an incredibly powerful and rich country with decreasing adolescent birth rates.

Philippines

The Philippines is one of the few countries in the world with an increasing adolescent fertility rate.

Philippines <- tidy_gendered_world_indicators %>%
  filter(Country == "Philippines")
fertility_phi <- Philippines %>%
  filter(Indicator == indicators[8])

PHI_fertility <- ggplot(fertility_phi, aes(year, n))
PHI_fertility + geom_jitter() + theme(axis.text.x = element_text(angle = 90, hjust = 1))

United States of America

The adolescent birth rate in the United States of America is generally decreasing, despite a slight uptick from the late 1980s to the early 1990s.

United_States <- tidy_gendered_world_indicators %>%
  filter(Country == "United States")

# Build the indicator lookup before it is used to filter
indicators <- unique(United_States$Indicator)

fertility_usa <- United_States %>%
  filter(Indicator == indicators[8])

USA_fertility <- ggplot(fertility_usa, aes(year, n))
USA_fertility + geom_jitter() + theme(axis.text.x = element_text(angle = 90, hjust = 1))

Zambia

Zambia’s adolescent birth rate has a generally decreasing trend, but it is still very high.

Zambia <- tidy_gendered_world_indicators %>%
  filter(Country == "Zambia")
fertility_zam <- Zambia %>%
  filter(Indicator == indicators[8])

ZAM_fertility <- ggplot(fertility_zam, aes(year, n))
ZAM_fertility + geom_jitter() + theme(axis.text.x = element_text(angle = 90, hjust = 1))
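
The three country blocks above repeat the same filter, and they pick the fertility indicator by position (`indicators[8]`), which depends on the order in which indicators happen to appear. A minimal sketch of an alternative, assuming a long table with `Country` and `Indicator` columns, is a small helper that selects by name. The helper name, the toy table, and the indicator labels are hypothetical:

```r
# Hypothetical helper: rows for one country and one indicator from a long table.
country_indicator <- function(data, country, indicator) {
  data[data$Country == country & data$Indicator == indicator, , drop = FALSE]
}

# Toy long table (made-up values) with the same column names as the tidy data.
toy <- data.frame(
  Country   = c("Philippines", "Philippines", "Zambia", "Zambia"),
  Indicator = c("Adolescent fertility rate", "Other indicator",
                "Adolescent fertility rate", "Other indicator"),
  year      = c("1960", "1960", "1960", "1960"),
  n         = c(55, 1, 150, 2)
)

zam <- country_indicator(toy, "Zambia", "Adolescent fertility rate")
zam
```

Selecting by name makes the code fail loudly (zero rows) if the label is wrong, rather than silently picking a different indicator.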

Taking a Global Perspective

First, let’s take a globalized approach to understanding adolescent birth rates.

The World Bank provides adolescent birth rate data going back to the 1960s, along with data for many other “Indicators”. The available Indicators often vary from country to country, but all countries have these two: Survival to age 65, female (% of cohort) and Survival to age 65, male (% of cohort). Let’s use them to get a bigger picture of how men’s and women’s health relates to teenagers having children. Is there any correlation?

In general, let’s note that the data is skewed to the right. In our data set, going back to the 1960s, most countries have 0-150 births per thousand girls ages 15-19, and very few have more than that. It is very likely that if we included data from the 1800s or early 1900s this would look more like a normal distribution, because teenage births were quite common. Therefore, our data could be slightly biased.
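
The direction of skew can also be checked numerically instead of by eye. A small sketch, using made-up rates and a hand-written moment-based skewness function (base R has no built-in one): a positive value means a long tail toward high values, a negative value a long tail toward low values.

```r
# Moment-based sample skewness (third central moment over variance^1.5).
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Made-up birth rates: most values are low, one is very high.
rates <- c(10, 20, 25, 30, 40, 45, 60, 200)

skewness(rates)   # positive: long right tail
skewness(-rates)  # mirrored data: negative, long left tail
```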

pregnancy <- tidy_gendered_world_indicators %>%
  filter(Indicator == indicators[8])

a <- ggplot(pregnancy, aes(n))
a + geom_density()

If we perform multiple linear regression of adolescent births per 1,000 women ages 15-19 on our Indicators Survival to age 65, female (% of cohort) and Survival to age 65, male (% of cohort), we get the following:

fertility_survival <- summary(lm(fertility$n~ survival_65_female$n + survival_65_male$n))
fertility_survival
## 
## Call:
## lm(formula = fertility$n ~ survival_65_female$n + survival_65_male$n)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -143.012  -18.619   -2.936   18.273  143.989 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          236.65239    1.12107  211.10   <2e-16 ***
## survival_65_female$n   0.90526    0.06529   13.86   <2e-16 ***
## survival_65_male$n    -3.13280    0.05915  -52.96   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 32.31 on 13437 degrees of freedom
## Multiple R-squared:  0.6169, Adjusted R-squared:  0.6168 
## F-statistic: 1.082e+04 on 2 and 13437 DF,  p-value: < 2.2e-16
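
A formula such as `fertility$n ~ survival_65_female$n + survival_65_male$n` pairs values purely by row position, so it is only valid if the three tables list the same country-years in the same order. A hedged sketch of a safer pattern, using toy data with made-up values and hypothetical column names, is to merge on `Country` and `year` first and then fit from the merged table:

```r
# Toy keys: 3 countries x 10 years (all values below are made up).
keys <- expand.grid(Country = c("A", "B", "C"), year = 1960:1969,
                    stringsAsFactors = FALSE)
i <- seq_len(nrow(keys))
female <- 50 + i
male   <- 40 + i + (i %% 3)

# The response is built as an exact linear function so the fit is checkable.
fert <- data.frame(keys, fertility = 200 + 0.5 * female - 2 * male)
fem  <- data.frame(keys, female_65 = female)[rev(i), ]  # deliberately out of order
mal  <- data.frame(keys, male_65 = male)

# Merging on the keys realigns the rows regardless of their original order.
merged <- merge(merge(fert, fem, by = c("Country", "year")),
                mal, by = c("Country", "year"))

fit <- lm(fertility ~ female_65 + male_65, data = merged)
coef(fit)
```

Because the toy response is noise-free, the fit recovers the coefficients 200, 0.5, and -2 even though one input table was reversed; fitting the unmerged vectors by position would not.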

Our equation is

\[ \widehat{\mathrm{births \; per \; 1,000 \; women \; ages \; 15-19}} = \beta_0 + \beta_1(Percent \; survival \; to \; 65, \; female) + \beta_2(Percent \; survival \; to \; 65, \; male)\]

\[ \widehat{\mathrm{births \; per \; 1,000 \; women \; ages \; 15-19}} = 236.65 + 0.90526(Percent \; survival \; to \; 65, \; female) - 3.1328 (Percent \; survival \; to \; 65, \; male)\]

From this, we can already see that both percent survival variables are significantly associated with adolescent birth rates. The intercept, 236.65, is the predicted number of births per 1,000 women ages 15-19 when both survival percentages are zero, so it is an extrapolation rather than a typical value. Holding the other variable fixed, the predicted rate rises by about 0.905 births for each one-point increase in the percent of females surviving to 65, and falls by about 3.133 births for each one-point increase in the percent of males surviving to 65. Interestingly, these signs are the reverse of what the country-level models below show, and when each variable is fitted separately, both female and male survival decrease adolescent births. A hypothesis I return to below is that better care for women leads to fewer teenage births, despite the massive age gap, perhaps through the influence of matronly figures such as mothers, aunts, and grandmothers who are around and can nurture the young women in the family. The pooled model by itself does not establish that.
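
When a coefficient has one sign in the joint model and another when each variable is fitted separately, one thing worth checking is how strongly the two predictors are correlated with each other. The sketch below uses made-up predictors, not the World Bank data, and computes a variance inflation factor by hand as 1 / (1 - R²) from regressing one predictor on the other:

```r
# Two made-up predictors that move almost in lockstep.
x1 <- 1:20
x2 <- x1 + rep(c(-0.5, 0.5), 10)

cor(x1, x2)  # close to 1

# Variance inflation factor for x1 given x2.
r2  <- summary(lm(x1 ~ x2))$r.squared
vif <- 1 / (1 - r2)
vif
```

A large value (a common rule of thumb is above 10) suggests the individual coefficients are estimated imprecisely and their signs should be read with caution, even when the overall fit is good.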

Note that the residuals are approximately normal and we have a large sample size.

qqnorm(fertility_survival$residuals)
qqline(fertility_survival$residuals)

Next, let’s use these same Indicators and run multiple linear regression for our three countries in question: Philippines, United States, and Zambia.

Philippines

The Philippines has a density plot similar to the world’s, and actually doesn’t have a very high maximum in comparison. It shows the same skew as the world distribution.

a <- ggplot(fertility_phi, aes(n))
a + geom_density()

fertility_survival_phi <- summary(lm(fertility_p$n~male_65_percent_phi$n + female_65_percent_phi$n))
fertility_survival_phi
## 
## Call:
## lm(formula = fertility_p$n ~ male_65_percent_phi$n + female_65_percent_phi$n)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -6.519 -2.903 -0.166  2.140  8.649 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             -132.002     52.366  -2.521 0.014754 *  
## male_65_percent_phi$n      8.780      2.145   4.093 0.000146 ***
## female_65_percent_phi$n   -4.544      1.020  -4.454 4.38e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.637 on 53 degrees of freedom
## Multiple R-squared:  0.4336, Adjusted R-squared:  0.4122 
## F-statistic: 20.29 on 2 and 53 DF,  p-value: 2.87e-07

The model has an R2 of .43 (adjusted .41) and a p-value less than .05. The equation is

\[ \widehat{\mathrm{births \; per \; 1,000 \; women \; ages \; 15-19}} = \beta_0 + \beta_1(Percent \; survival \; to \; 65, \; male) + \beta_2(Percent \; survival \; to \; 65, \; female)\] \[ \widehat{\mathrm{births \; per \; 1,000 \; women \; ages \; 15-19}} = -132.002 + 8.780(Percent \; survival \; to \; 65, \; male) - 4.544 (Percent \; survival \; to \; 65, \; female)\]
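
Figures such as R2, the coefficients, and the model p-value can be read straight from the summary object rather than copied by hand from the printed output. A minimal sketch, using R’s built-in `cars` data as a stand-in because the Philippines vectors are not reproduced here:

```r
# Stand-in model on built-in data; any lm fit works the same way.
fit <- lm(dist ~ speed, data = cars)
s   <- summary(fit)

r2     <- s$r.squared       # Multiple R-squared
adj_r2 <- s$adj.r.squared   # Adjusted R-squared
coefs  <- coef(s)           # estimates, std. errors, t values, p-values

# Overall model p-value, recomputed from the stored F statistic.
f <- s$fstatistic
p_model <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)

round(c(r2 = r2, adj_r2 = adj_r2), 4)
coefs
```

Pasting `round(r2, 2)` into the text programmatically keeps the prose and the printed output in agreement.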

Notice how large the positive coefficient on male survival is. An article published in August points us to a potential reason why.

“The Philippines is struggling to manage its soaring teen pregnancy rates. The United Nations Population Fund (UNFPA) has found that teen pregnancies in the Philippines increased by 65% from 2000-2010. An estimated 24 babies are born to teen mothers every hour. Now, advocates and health workers are identifying a new trend: Teenage girls are not just getting pregnant, but doing so with much older men. They say it’s the need for financial security that drives girls into such relationships.”

https://www.globalcitizen.org/en/content/teen-pregnancy-older-men-philippines/

This isn’t a foreign concept for many Filipinos. I know many Filipino American men who travel back to their homeland with promises of money or material goods in exchange for a night or an ongoing relationship with an underage girl. Underage by American standards, that is. The age of consent back home is only 12 years old.

On a more positive note, there is hope. Each one-point increase in the percent of females surviving to 65 is associated with about 4.5 fewer births per 1,000. As mentioned before, perhaps an increase in care for women leads to fewer teenage births. This could be because of influences from matronly figures such as mothers, aunts, and grandmothers who are around and can nurture young women in the family. Family is incredibly important in the Philippines. If women can be role models or nurture their young daughters and sons to be careful, then we could potentially have a positive impact on birth rates in the Philippines.

To view the relationship between the Indicators, we can observe trends in the graph below.

f <- ggplot(data=phi_health, aes(year, n, group=Indicator))
f + geom_line(aes(color=Indicator)) + theme(axis.text.x = element_blank() ) 

United States

Adolescent birth rates in the United States follow an approximately normal distribution, in contrast to the skewed world density plot.

a <- ggplot(fertility_usa, aes(n))
a + geom_density()

fertility_survival_usa <- summary(lm(fertility_u$n~male_65_percent_usa$n + female_65_percent_usa$n))
fertility_survival_usa
## 
## Call:
## lm(formula = fertility_u$n ~ male_65_percent_usa$n + female_65_percent_usa$n)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -15.1155  -3.9812  -0.4512   3.3197  13.1536 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)              543.980     92.907   5.855 3.08e-07 ***
## male_65_percent_usa$n      1.814      1.055   1.719 0.091404 .  
## female_65_percent_usa$n   -7.410      2.006  -3.694 0.000523 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.935 on 53 degrees of freedom
## Multiple R-squared:  0.8282, Adjusted R-squared:  0.8217 
## F-statistic: 127.7 on 2 and 53 DF,  p-value: < 2.2e-16

Here, our R2 is .83 (adjusted .82) and the p-value is less than .05. Our equation is

\[ \widehat{\mathrm{births \; per \; 1,000 \; women \; ages \; 15-19}} = \beta_0 + \beta_1(Percent \; survival \; to \; 65, \; male) + \beta_2(Percent \; survival \; to \; 65, \; female)\]

\[ \widehat{\mathrm{births \; per \; 1,000 \; women \; ages \; 15-19}} = 543.980 + 1.814(Percent \; survival \; to \; 65, \; male) - 7.410 (Percent \; survival \; to \; 65, \; female)\]

The story is similar to the Philippines: the male coefficient is positive and the female coefficient is negative, although here the male coefficient is not significant at the .05 level (p = .09).
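
To make the fitted United States equation concrete, here is a short worked calculation. The coefficients are the ones printed in the regression output above; the two survival percentages are hypothetical inputs chosen only for illustration, not values from the data:

```r
# Coefficients from the printed United States model.
b0       <- 543.980
b_male   <- 1.814
b_female <- -7.410

# Hypothetical inputs: 70% of males and 80% of females survive to age 65.
male_65   <- 70
female_65 <- 80

# Predicted births per 1,000 women ages 15-19.
predicted <- b0 + b_male * male_65 + b_female * female_65
predicted  # 543.980 + 126.98 - 592.8 = 78.16
```

Raising female survival by one point with male survival held fixed lowers the prediction by 7.41, which is how the coefficient should be read.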

To view the relationship between the Indicators, we can observe trends in the graph below.

f <- ggplot(data=usa_health, aes(year, n, group=Indicator))
f + geom_line(aes(color=Indicator)) + theme(axis.text.x = element_blank() ) 

Zambia

Zambia’s density chart is skewed to the left, with most of its mass at high values, likely because for a long period of time adolescent birth rates were quite high, nearing the maximum for the world. Some factor has been pushing the country toward fewer teenage pregnancies very recently.

a <- ggplot(fertility_zam, aes(n))
a + geom_density()

fertility_survival_zam <- summary(lm(fertility_p$n~male_65_percent_zam$n + female_65_percent_zam$n))
fertility_survival_zam
## 
## Call:
## lm(formula = fertility_p$n ~ male_65_percent_zam$n + female_65_percent_zam$n)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -4.595 -1.879 -0.574  1.262  8.248 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)              57.5472     3.2530  17.690  < 2e-16 ***
## male_65_percent_zam$n     1.9532     0.3218   6.071 1.40e-07 ***
## female_65_percent_zam$n  -1.6670     0.3205  -5.200 3.27e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.347 on 53 degrees of freedom
## Multiple R-squared:  0.5205, Adjusted R-squared:  0.5024 
## F-statistic: 28.76 on 2 and 53 DF,  p-value: 3.477e-09

Here, R2 is .52 and the p-value is less than .05. The moderate fit may show us that there are alternative factors that aren’t related to medical advances. We do, however, see a similar trend with male and female survival rates, although there is much less of a gendered distinction. Our equation is

\[ \widehat{\mathrm{births \; per \; 1,000 \; women \; ages \; 15-19}} = \beta_0 + \beta_1(Percent \; survival \; to \; 65, \; male) + \beta_2(Percent \; survival \; to \; 65, \; female)\]

\[ \widehat{\mathrm{births \; per \; 1,000 \; women \; ages \; 15-19}} = 57.5472 + 1.9532(Percent \; survival \; to \; 65, \; male) - 1.6670 (Percent \; survival \; to \; 65, \; female)\]

To view the relationship between the Indicators, we can observe trends in the graph below.

f <- ggplot(data=zam_health, aes(year, n, group=Indicator))
f + geom_line(aes(color=Indicator)) + theme(axis.text.x = element_blank() ) 

Backward Elimination

Since many of the indicators included (161 in total) don’t have complete information, we will only analyze the ones with more than 25 values. In general, factors were broken down into three main categories as data permits:

  1. Employment
  2. Education
  3. Health
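
The “more than 25 values” rule can be applied mechanically by counting non-missing observations per indicator. A minimal sketch, assuming a long table with `Indicator` and `n` columns; the toy table and its indicator names are made up:

```r
# Toy long table: one indicator with 30 values, one with only 3 non-missing.
toy <- data.frame(
  Indicator = rep(c("Well covered", "Sparse"), times = c(30, 5)),
  n         = c(seq_len(30), c(1, NA, 3, NA, 5))
)

# Count non-missing values per indicator.
counts <- tapply(!is.na(toy$n), toy$Indicator, sum)

# Keep only indicators with more than 25 recorded values.
keep <- names(counts)[counts > 25]
keep
```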

Philippines

The following factors are the values that we will be regressing against adolescent pregnancy rates for the Philippines.

indicators_phi_backwards[1:7]
## [1] Labor force participation rate, female (% of female population ages 15+) (modeled ILO estimate)
## [2] Labor force participation rate, male (% of male population ages 15+) (modeled ILO estimate)    
## [3] Labor force, female (% of total labor force)                                                   
## [4] Lifetime risk of maternal death (%)                                                            
## [5] Lifetime risk of maternal death (1 in: rate varies by country)                                 
## [6] Maternal mortality ratio (modeled estimate, per 100,000 live births)                           
## [7] Prevalence of HIV, female (% ages 15-24)                                                       
## 161 Levels: Account at a financial institution, female (% age 15+) ...

Employment

fertility_employment_phi <- summary(lm(fertility_phi_emp$n~ female_agriculture$n + 
                             female_industry$n + female_services$n))
fertility_employment_phi
## 
## Call:
## lm(formula = fertility_phi_emp$n ~ female_agriculture$n + female_industry$n + 
##     female_services$n)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.5812 -1.3614 -0.2444  0.6337  6.3384 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          1327.677    326.365   4.068 0.000223 ***
## female_agriculture$n  -12.723      3.224  -3.946 0.000322 ***
## female_industry$n     -13.546      3.487  -3.885 0.000385 ***
## female_services$n     -12.594      3.242  -3.885 0.000386 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.343 on 39 degrees of freedom
## Multiple R-squared:  0.4194, Adjusted R-squared:  0.3747 
## F-statistic:  9.39 on 3 and 39 DF,  p-value: 8.45e-05

Even after removing the variables with the highest p-values, we do not have a very good model in terms of R2 (.42). I do not believe these employment metrics are very useful for understanding adolescent pregnancy in the Philippines. The model is, however, better than looking at any one sector by itself, which yielded R2 values around .11.
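
The elimination steps themselves are not shown in this write-up. As a rough sketch of p-value-based backward elimination, not necessarily the exact procedure used here, the loop below refits the model and drops the predictor with the largest p-value until every remaining one is below a threshold. The function name and the 0.05 cutoff are assumptions, and the built-in `mtcars` data stands in for the indicator tables:

```r
# Hypothetical helper: drop the least significant predictor until all pass alpha.
backward_eliminate <- function(data, response, predictors, alpha = 0.05) {
  repeat {
    fit <- lm(reformulate(predictors, response = response), data = data)
    # p-values for the predictors (first row is the intercept).
    p <- coef(summary(fit))[-1, "Pr(>|t|)", drop = FALSE]
    worst <- which.max(p[, 1])
    # Stop when everything is significant or only one predictor is left.
    if (p[worst, 1] <= alpha || length(predictors) == 1) return(fit)
    predictors <- setdiff(predictors, rownames(p)[worst])
  }
}

fit <- backward_eliminate(mtcars, "mpg", c("wt", "hp", "qsec", "drat"))
summary(fit)$coefficients
```

Base R’s `step()` offers a related, AIC-based version of backward selection; the two approaches can keep different variables.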

Education

fertility_education_phi <- summary(lm(fertility_phi_edu$n ~ female_primary_education_phi$n + 
                             female_school_enrollment_phi$n + male_school_enrollment_phi$n))
fertility_education_phi
## 
## Call:
## lm(formula = fertility_phi_edu$n ~ female_primary_education_phi$n + 
##     female_school_enrollment_phi$n + male_school_enrollment_phi$n)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.6510 -0.9744 -0.3875  1.0896  2.4366 
## 
## Coefficients:
##                                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    494.8829    90.0198   5.497 9.09e-06 ***
## female_primary_education_phi$n  -8.8507     1.6384  -5.402 1.17e-05 ***
## female_school_enrollment_phi$n   2.0464     0.3832   5.341 1.37e-05 ***
## male_school_enrollment_phi$n    -2.1453     0.4686  -4.578 0.000103 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.485 on 26 degrees of freedom
## Multiple R-squared:  0.6056, Adjusted R-squared:  0.5601 
## F-statistic: 13.31 on 3 and 26 DF,  p-value: 1.86e-05

Since this model doesn’t have a very high R2, let’s run simple linear regression on the variable with the highest R2 and the most likely to affect teenage pregnancy – female school enrollment.

fertility_education_phi <- summary(lm(fertility_phi_edu$n ~ female_school_enrollment_phi$n))
fertility_education_phi
## 
## Call:
## lm(formula = fertility_phi_edu$n ~ female_school_enrollment_phi$n)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.7072 -1.5055 -0.8293  2.0301  4.0793 
## 
## Coefficients:
##                                Estimate Std. Error t value Pr(>|t|)  
## (Intercept)                     19.2076    15.0947   1.272   0.2137  
## female_school_enrollment_phi$n   0.3046     0.1401   2.174   0.0383 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.108 on 28 degrees of freedom
## Multiple R-squared:  0.1444, Adjusted R-squared:  0.1139 
## F-statistic: 4.726 on 1 and 28 DF,  p-value: 0.03831

On its own, female school enrollment is a weak predictor. What does it tell us? The slope is positive (0.30, p = .038) and R2 is only .14: as female school enrollment rises, adolescent births rise slightly rather than fall.

ggplot(schoolphi, aes(x=n.x,y=n.y)) + 
  geom_point() +
  stat_smooth(method = "lm", col = "red") + 
  labs(x="Births per 1000 age 15-19",y="Female School Enrollment") 

I find this a very interesting metric: why is it that female enrollment does not decrease birth rates? One possibility is that schools are not teaching children ways of preventing pregnancy. In fact, the more females are in school, the higher birth rates are (a slight correlation). Schools could even be teaching the opposite, and Catholic schools could be teaching children to be fearful of contraceptives. This is a common narrative in the documentary Motherland: girls were refusing free IUDs that would prevent pregnancy for up to 10 years, out of fear. Schools need to teach that contraceptives are beneficial. I also find it interesting that we can use statistics to explore relationships even when they are weak, like this one, which is significant at the .05 level (p = .038) but explains only about 14% of the variation.

Health

fertility_health_phi <- summary(lm(fertility_phi_health$n~ risk_maternal_death$n + maternal_mortality_ratio$n + 
                             HIV_male_15to24$n))
fertility_health_phi
## 
## Call:
## lm(formula = fertility_phi_health$n ~ risk_maternal_death$n + 
##     maternal_mortality_ratio$n + HIV_male_15to24$n)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.40026 -0.91039 -0.00734  0.45542  2.80484 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                 33.53837    4.30786   7.785 9.24e-08 ***
## risk_maternal_death$n      -43.52231    3.74912 -11.609 7.49e-11 ***
## maternal_mortality_ratio$n   0.27339    0.03856   7.090 4.12e-07 ***
## HIV_male_15to24$n           61.05476    8.59186   7.106 3.98e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.027 on 22 degrees of freedom
## Multiple R-squared:  0.925,  Adjusted R-squared:  0.9148 
## F-statistic:  90.5 on 3 and 22 DF,  p-value: 1.568e-12
plot(fertility_health_phi$residuals ~ fertility_phi_health$n)
abline(h = 0, lty = 3)

Health is by far our best model for understanding adolescent birth rates in the Philippines, with an adjusted R-squared of .91. I think the most interesting metric is the lifetime risk of maternal death. Let’s run a simple regression on this indicator.

fertility_health_phi <- summary(lm(fertility_phi_health$n ~ risk_maternal_death$n))
fertility_health_phi
## 
## Call:
## lm(formula = fertility_phi_health$n ~ risk_maternal_death$n)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.0460 -1.0595  0.0821  0.7907  4.5236 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             70.166      2.660  26.378  < 2e-16 ***
## risk_maternal_death$n  -34.190      5.407  -6.323 1.55e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.199 on 24 degrees of freedom
## Multiple R-squared:  0.6249, Adjusted R-squared:  0.6093 
## F-statistic: 39.99 on 1 and 24 DF,  p-value: 1.547e-06

For a single variable, this has one of the highest correlations. What does it tell us? That when the lifetime risk of maternal death increases, births decrease. While it’s good that maternal death rates are decreasing, it’s not good that teenage girls are having more pregnancies. Perhaps teenage pregnancies were always high, but there are fewer birth-related deaths now? That doesn’t really account for the sudden uptick after rates were decreasing in the 70s.

One thing to note is that access to birth control isn’t very good in the Philippines. In this heavily Christian country, it is widely deemed morally wrong, and abstinence is presented as the only good option.

ggplot(risk, aes(x=n.x,y=n.y)) + 
  geom_point() +
  stat_smooth(method = "lm", col = "red") + 
  labs(x="Births per 1000 age 15-19",y="Rate Maternal Death") 

autoplot(lm(n.y ~ n.x, data = risk), label.size = 3)
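
The plotting data frames used here (`risk`, `schoolphi`, and similar) have columns named `n.x` and `n.y`, but their construction is not shown. One plausible origin, stated as an assumption, is a join of two indicator tables that both have a value column called `n`: base R’s `merge()` then disambiguates the clashing names with `.x` and `.y` suffixes. A toy illustration with made-up numbers:

```r
# Two toy indicator tables sharing a key (year) and a value column (n).
fert     <- data.frame(year = 1990:1994, n = c(50, 51, 52, 53, 54))
risk_ind <- data.frame(year = 1990:1994, n = c(0.9, 0.8, 0.8, 0.7, 0.6))

# Joining on year keeps both n columns, renamed n.x (first table) and n.y (second).
risk_toy <- merge(fert, risk_ind, by = "year")
names(risk_toy)
```

Under that assumption, `n.x` would be the adolescent birth rate and `n.y` the second indicator, which matches the axis labels in the plots.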

United States

The following are factors we are using for understanding adolescent pregnancy in the United States.

USA_indicatiors_backwards

Employment

fertility_employment_usa <- summary(lm(fertility_usa_emp$n~ male_industry_usa$n + 
                             male_services_usa$n + percent_female_family_contributes_usa$n + 
                             male_self_emp_usa$n + male_agriculture_usa$n))
fertility_employment_usa
## 
## Call:
## lm(formula = fertility_usa_emp$n ~ male_industry_usa$n + male_services_usa$n + 
##     percent_female_family_contributes_usa$n + male_self_emp_usa$n + 
##     male_agriculture_usa$n)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.8889 -1.3204  0.0098  1.1188  3.2995 
## 
## Coefficients:
##                                           Estimate Std. Error t value
## (Intercept)                             -20747.372   7939.192  -2.613
## male_industry_usa$n                        208.263     79.437   2.622
## male_services_usa$n                        206.514     79.376   2.602
## percent_female_family_contributes_usa$n    -29.255      2.548 -11.480
## male_self_emp_usa$n                         10.434      1.026  10.166
## male_agriculture_usa$n                     205.526     79.241   2.594
##                                         Pr(>|t|)    
## (Intercept)                               0.0145 *  
## male_industry_usa$n                       0.0142 *  
## male_services_usa$n                       0.0149 *  
## percent_female_family_contributes_usa$n 6.79e-12 ***
## male_self_emp_usa$n                     9.97e-11 ***
## male_agriculture_usa$n                    0.0152 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.922 on 27 degrees of freedom
## Multiple R-squared:  0.9725, Adjusted R-squared:  0.9674 
## F-statistic: 190.6 on 5 and 27 DF,  p-value: < 2.2e-16

For the United States of America, it appears that employment plays a very big role in family planning, a narrative that has been repeated for many years as women are accepted more and more into the workplace. One variable worth isolating is percent_female_workers_usa, which does not appear in the multiple regression above but generates a statistically significant model on its own using simple linear regression.

fertility_employment_usa <- summary(lm(fertility_usa_emp$n ~ percent_female_workers_usa$n  ))
fertility_employment_usa
## 
## Call:
## lm(formula = fertility_usa_emp$n ~ percent_female_workers_usa$n)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.6354 -2.6870  0.3658  1.8562 12.1302 
## 
## Coefficients:
##                              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                  1723.788    137.349   12.55 1.09e-13 ***
## percent_female_workers_usa$n  -17.905      1.465  -12.22 2.18e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.481 on 31 degrees of freedom
## Multiple R-squared:  0.8281, Adjusted R-squared:  0.8225 
## F-statistic: 149.3 on 1 and 31 DF,  p-value: 2.181e-13

As one can see from the graph below, the more females in the workplace, the fewer adolescent births within the United States of America.

ggplot(usawork, aes(x=n.x,y=n.y)) + 
  geom_point() +
  stat_smooth(method = "lm", col = "red") + 
  labs(x="Births per 1000 age 15-19",y="Percent Female Workers in USA") 

Residuals appear nearly normal.

autoplot(lm(n.y ~ n.x, data = usawork), label.size = 3)
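
A visual check of residual normality can be complemented with a formal test. The sketch below runs a Shapiro-Wilk test on the residuals of a stand-in model fitted to R’s built-in `cars` data, since `usawork` is not reproduced here; the same two lines would apply to any `lm` fit:

```r
# Stand-in model; replace with the fit of interest.
fit <- lm(dist ~ speed, data = cars)

# Shapiro-Wilk test: the null hypothesis is that the residuals are normal.
sw <- shapiro.test(residuals(fit))
sw$statistic
sw$p.value
```

A small p-value is evidence against normality. With only a few dozen observations the test has limited power, so it is best read alongside the Q-Q plot rather than instead of it.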

Education

# Stored under a USA-specific name so the Philippines education model is not overwritten
fertility_education_usa <- summary(lm(fertility_usa_edu$n~ female_primary_edu_usa$n + gross_school_enrollment_usa$n + 
                             female_primary_school_enrollment_usa$n + male_primary_school_enrollment_usa$n + gross_secondary_school_enrollment_usa$n +
                             male_secondary_school_enrollment_usa$n + female_secondary_education_usa$n +
                             gross_tertiary_enrollment_usa$n + female_tertiary_enrollment_usa$n + male_tertiary_enrollment_usa$n))
fertility_education_usa
## 
## Call:
## lm(formula = fertility_usa_edu$n ~ female_primary_edu_usa$n + 
##     gross_school_enrollment_usa$n + female_primary_school_enrollment_usa$n + 
##     male_primary_school_enrollment_usa$n + gross_secondary_school_enrollment_usa$n + 
##     male_secondary_school_enrollment_usa$n + female_secondary_education_usa$n + 
##     gross_tertiary_enrollment_usa$n + female_tertiary_enrollment_usa$n + 
##     male_tertiary_enrollment_usa$n)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.86932 -0.92455  0.01899  0.67581  2.85783 
## 
## Coefficients:
##                                           Estimate Std. Error t value
## (Intercept)                             -6321.7290  1871.6004  -3.378
## female_primary_edu_usa$n                  -41.3183     5.2062  -7.936
## gross_school_enrollment_usa$n            8038.3195  1906.9879   4.215
## female_primary_school_enrollment_usa$n    -68.5551    18.5062  -3.704
## male_primary_school_enrollment_usa$n       70.7308    18.5655   3.810
## gross_secondary_school_enrollment_usa$n   352.4974    50.2126   7.020
## male_secondary_school_enrollment_usa$n     -1.9978     0.2214  -9.025
## female_secondary_education_usa$n          -14.4225     1.8847  -7.652
## gross_tertiary_enrollment_usa$n           462.1723    40.3458  11.455
## female_tertiary_enrollment_usa$n           -7.8488     0.6921 -11.341
## male_tertiary_enrollment_usa$n             11.1162     0.9594  11.586
##                                         Pr(>|t|)    
## (Intercept)                             0.002991 ** 
## female_primary_edu_usa$n                1.32e-07 ***
## gross_school_enrollment_usa$n           0.000425 ***
## female_primary_school_enrollment_usa$n  0.001403 ** 
## male_primary_school_enrollment_usa$n    0.001097 ** 
## gross_secondary_school_enrollment_usa$n 8.25e-07 ***
## male_secondary_school_enrollment_usa$n  1.73e-08 ***
## female_secondary_education_usa$n        2.30e-07 ***
## gross_tertiary_enrollment_usa$n         3.08e-10 ***
## female_tertiary_enrollment_usa$n        3.67e-10 ***
## male_tertiary_enrollment_usa$n          2.52e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.599 on 20 degrees of freedom
## Multiple R-squared:  0.9827, Adjusted R-squared:  0.9741 
## F-statistic: 113.7 on 10 and 20 DF,  p-value: 2.245e-15

I think we can see that education, like employment, is strongly related to adolescent birth rates in the USA, which is a big difference from the Philippines. Every variable is statistically significant in our model, although with ten predictors and only 31 observations the very high R2 should be read with some caution.

Female tertiary enrollment (college and university level) was a very important variable, so let’s look at it on its own.

fertility_education_usa <- summary(lm(fertility_usa_edu$n ~   female_tertiary_enrollment_usa$n ))
fertility_education_usa
## 
## Call:
## lm(formula = fertility_usa_edu$n ~ female_tertiary_enrollment_usa$n)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -15.374  -2.824  -1.682   0.756  13.901 
## 
## Coefficients:
##                                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                      78.80793    5.42280  14.533 7.58e-15 ***
## female_tertiary_enrollment_usa$n -0.39050    0.06306  -6.193 9.40e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.627 on 29 degrees of freedom
## Multiple R-squared:  0.5694, Adjusted R-squared:  0.5546 
## F-statistic: 38.35 on 1 and 29 DF,  p-value: 9.402e-07
ggplot(usaedu, aes(x=n.x,y=n.y)) + 
  geom_point() +
  stat_smooth(method = "lm", col = "red") + 
  labs(x="Births per 1000 age 15-19",y="Female Tertiary Enrollment") 

Residuals appear nearly normal.

autoplot(lm(n.y ~ n.x, data = usaedu), label.size = 3)

Zambia

The following factors are the values that we will be regressing against adolescent pregnancy rates for Zambia.

Zambia_indicatiors_backwards

Employment

fertility_employment_zam <- summary(lm(fertility_zam_emp$n~ female_employment_to_pop_15plus_zam$n + male_employment_to_pop_15plus_zam$n + 
                             male_employment_to_pop_teen_zam$n + female_unemployment_zam$n + 
                             male_unemployment_zam$n + female_youth_unemployment_zam$n +
                             male_youth_unemployment_zam$n + female_labor_force_zam$n + male_labor_force_zam$n))
fertility_employment_zam
## 
## Call:
## lm(formula = fertility_zam_emp$n ~ female_employment_to_pop_15plus_zam$n + 
##     male_employment_to_pop_15plus_zam$n + male_employment_to_pop_teen_zam$n + 
##     female_unemployment_zam$n + male_unemployment_zam$n + female_youth_unemployment_zam$n + 
##     male_youth_unemployment_zam$n + female_labor_force_zam$n + 
##     male_labor_force_zam$n)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.27604 -0.73046 -0.02146  0.75876  1.71222 
## 
## Coefficients:
##                                        Estimate Std. Error t value
## (Intercept)                             97.0976   129.7144   0.749
## female_employment_to_pop_15plus_zam$n  104.9413    24.6971   4.249
## male_employment_to_pop_15plus_zam$n   -103.4976    19.9177  -5.196
## male_employment_to_pop_teen_zam$n        1.8607     0.6870   2.708
## female_unemployment_zam$n               76.5178    18.0777   4.233
## male_unemployment_zam$n                -90.9740    17.2214  -5.283
## female_youth_unemployment_zam$n          1.9045     0.6078   3.134
## male_youth_unemployment_zam$n            3.3053     0.6910   4.783
## female_labor_force_zam$n               -83.1157    21.4769  -3.870
## male_labor_force_zam$n                  81.9844    17.5658   4.667
##                                       Pr(>|t|)    
## (Intercept)                           0.465705    
## female_employment_to_pop_15plus_zam$n 0.000700 ***
## male_employment_to_pop_15plus_zam$n   0.000109 ***
## male_employment_to_pop_teen_zam$n     0.016184 *  
## female_unemployment_zam$n             0.000724 ***
## male_unemployment_zam$n                9.2e-05 ***
## female_youth_unemployment_zam$n       0.006833 ** 
## male_youth_unemployment_zam$n         0.000242 ***
## female_labor_force_zam$n              0.001511 ** 
## male_labor_force_zam$n                0.000304 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.276 on 15 degrees of freedom
## Multiple R-squared:  0.9915, Adjusted R-squared:  0.9864 
## F-statistic: 194.1 on 9 and 15 DF,  p-value: 7.134e-14

Employment also appears to be strongly associated with adolescent pregnancy.
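
One caution about the model above: it fits nine predictors to 25 observations, and several coefficients are large with opposite signs (for example 104.9 and -103.5 for female and male employment-to-population). That pattern often indicates collinear predictors, which is plausible here because an employment-to-population ratio is roughly the labor force rate multiplied by one minus the unemployment rate. Below is a minimal sketch of a variance inflation factor (VIF) check in base R. It assumes synthetic data, and the column names are hypothetical, not the Zambia series.

```r
# Synthetic illustration only; these columns are hypothetical.
set.seed(1)
n <- 25
labor_force  <- rnorm(n, mean = 70, sd = 3)   # participation rate, %
unemployment <- rnorm(n, mean = 12, sd = 2)   # unemployment rate, %
# Employment-to-population is nearly determined by the other two columns.
employment <- labor_force * (1 - unemployment / 100) + rnorm(n, sd = 0.1)

X <- data.frame(labor_force, unemployment, employment)

# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# predictor j on all of the other predictors.
vif <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  1 / (1 - r2)
})
print(round(vif, 1))
```

A common rule of thumb treats VIF values above 10 as a sign that individual coefficients are unstable, even when the overall R-squared is very high.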

Let’s take one variable for analysis. The female labor force rate in Zambia appears to be an important factor in the country's decreasing adolescent pregnancy rates, with an R² of 0.7963.

fertility_employment_zam <- summary(lm(fertility_zam_emp$n~ female_labor_force_zam$n ))
fertility_employment_zam
## 
## Call:
## lm(formula = fertility_zam_emp$n ~ female_labor_force_zam$n)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.3473 -4.2701  0.5518  3.2004  8.4212 
## 
## Coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)              -322.8444    38.5769  -8.369 1.96e-08 ***
## female_labor_force_zam$n    5.0321     0.5307   9.482 2.06e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.038 on 23 degrees of freedom
## Multiple R-squared:  0.7963, Adjusted R-squared:  0.7874 
## F-statistic: 89.91 on 1 and 23 DF,  p-value: 2.065e-09

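This appears to follow a logarithmic function. Interestingly, the relationship between female labor force and adolescent pregnancy runs in the opposite direction from the United States: the slope is positive (5.03), so a higher female labor force rate goes with a higher adolescent pregnancy rate. While in America the inclusion of women in the workplace accompanied lower rates of adolescent pregnancy, in Zambia it appears to accompany higher rates.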

ggplot(laborzam, aes(x=n.x,y=n.y)) + 
  geom_point() +
  stat_smooth(method = "lm", col = "red") + 
  labs(x="Births per 1000 age 15-19",y="Female Labor Force") 

Residuals appear nearly normal.

autoplot(lm(n.y ~ n.x, data = laborzam), label.size = 3)
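
The impression that the labor force relationship is logarithmic rather than linear can be tested by fitting both forms and comparing them. Here is a rough sketch, assuming synthetic data (`x` and `y` are hypothetical and do not reproduce the Zambia values).

```r
# Synthetic illustration only; x and y are hypothetical.
set.seed(7)
x <- seq(75, 90, length.out = 25)                 # e.g. a participation rate, %
y <- -1800 + 430 * log(x) + rnorm(25, sd = 3)     # response generated on a log scale

lin <- lm(y ~ x)        # straight-line fit
lg  <- lm(y ~ log(x))   # logarithmic fit

# Lower AIC indicates the better-supported functional form.
cmp <- c(linear = AIC(lin), log = AIC(lg))
print(cmp)
print(c(linear = summary(lin)$r.squared, log = summary(lg)$r.squared))
```

Over a narrow range of the predictor the two curves are almost indistinguishable, so a small AIC difference should not be over-interpreted.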

fertility_employment_zam <- summary(lm(fertility_zam_emp$n~ male_youth_unemployment_zam$n ))
fertility_employment_zam
## 
## Call:
## lm(formula = fertility_zam_emp$n ~ male_youth_unemployment_zam$n)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -13.7068  -2.1080   0.0822   3.7281  10.0004 
## 
## Coefficients:
##                               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                     8.5718     4.1961   2.043   0.0527 .  
## male_youth_unemployment_zam$n   1.4500     0.1714   8.459 1.62e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.506 on 23 degrees of freedom
## Multiple R-squared:  0.7568, Adjusted R-squared:  0.7462 
## F-statistic: 71.56 on 1 and 23 DF,  p-value: 1.625e-08

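Male youth unemployment is also positively associated with adolescent pregnancy: the slope is 1.45 (R² of 0.7568), so the higher the male youth unemployment rate, the higher the adolescent pregnancy rate. Taken together with the female labor force result, this is hard to interpret. Perhaps there is a social stigma against women working in Zambia. Further research is necessary here.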

ggplot(laborzamm, aes(x=n.x,y=n.y)) + 
  geom_point() +
  stat_smooth(method = "lm", col = "red") + 
  labs(x="Births per 1000 age 15-19",y="Male Youth Unemployment") 

Health

fertility_health_zam <- summary(lm(fertility_zam_health$n~ risk_maternal_death_zam$n + maternal_mortality_ratio_zam$n + 
                             female_prevalance_HIV_zam$n + female_life_expectancy_zam$n + male_life_expectancy_zam$n))
fertility_health_zam
## 
## Call:
## lm(formula = fertility_zam_health$n ~ risk_maternal_death_zam$n + 
##     maternal_mortality_ratio_zam$n + female_prevalance_HIV_zam$n + 
##     female_life_expectancy_zam$n + male_life_expectancy_zam$n)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.15146 -0.48991 -0.07492  0.41954  1.35736 
## 
## Coefficients:
##                                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    66.79186   21.53168   3.102  0.00562 ** 
## risk_maternal_death_zam$n      48.18266    4.52672  10.644 1.10e-09 ***
## maternal_mortality_ratio_zam$n -0.25573    0.01678 -15.240 1.80e-12 ***
## female_prevalance_HIV_zam$n     1.37471    0.18792   7.315 4.51e-07 ***
## female_life_expectancy_zam$n   -6.02636    1.44421  -4.173  0.00047 ***
## male_life_expectancy_zam$n      5.55107    1.83912   3.018  0.00679 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7357 on 20 degrees of freedom
## Multiple R-squared:  0.9965, Adjusted R-squared:  0.9956 
## F-statistic:  1127 on 5 and 20 DF,  p-value: < 2.2e-16

Health is also correlated with adolescent pregnancy. Let’s delve deeper to find out how by looking at just one of these variables.

fertility_health_zam <- summary(lm(fertility_zam_health$n~ risk_maternal_death_zam$n ))
fertility_health_zam
## 
## Call:
## lm(formula = fertility_zam_health$n ~ risk_maternal_death_zam$n)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.1419 -2.3684  0.2256  2.2875  5.0295 
## 
## Coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                13.0683     1.8501   7.064 2.66e-07 ***
## risk_maternal_death_zam$n  12.1021     0.6986  17.324 4.53e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.073 on 24 degrees of freedom
## Multiple R-squared:  0.926,  Adjusted R-squared:  0.9229 
## F-statistic: 300.1 on 1 and 24 DF,  p-value: 4.535e-15

Just like in the Philippines, the risk of maternal death is highly correlated with adolescent pregnancy (R² of 0.926). In Zambia, however, the relationship is positive: the higher the risk, the higher the adolescent pregnancy rate. One plausible reading is that female health in Zambia is improving in general and that women there are more likely to use contraception than in the Philippines, although these regressions alone cannot confirm that.

ggplot(healthzamm, aes(x=n.x,y=n.y)) + 
  geom_point() +
  stat_smooth(method = "lm", col = "red") + 
  labs(x="Births per 1000 age 15-19",y="Risk Maternal Death") 

autoplot(lm(n.y ~ n.x, data = healthzamm), label.size = 3)
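
To make the single-variable health model concrete, the fitted coefficients printed above (intercept 13.0683, slope 12.1021) can be turned into a small prediction helper. This is only a sketch: the function name `predicted_births` is hypothetical, and the input is in whatever units the maternal death risk indicator is recorded in.

```r
# Coefficients copied from the summary output of
# lm(fertility_zam_health$n ~ risk_maternal_death_zam$n) above.
intercept <- 13.0683
slope     <- 12.1021

# Hypothetical helper: fitted births per 1000 for a given risk value.
predicted_births <- function(risk) intercept + slope * risk

predicted_births(1)   # 25.1704
predicted_births(2)   # 37.2725

# Each one-unit drop in risk corresponds to about 12.1 fewer
# fitted births per 1000 girls aged 15-19.
predicted_births(2) - predicted_births(1)
```

The helper only restates the fitted line. It describes an association across years and should not be read as a causal effect of maternal risk on births.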

Conclusions

It is often cited that the overall decrease in adolescent pregnancy is due to things like access to contraception and sex-education programs. In this study, various World Bank indicators were regressed against adolescent pregnancy rates in the Philippines, the United States, and Zambia. The Philippines is one of the few countries where rates are increasing, and they are rising despite a lower risk of maternal death and increasing survival rates, both of which suggest better healthcare overall. The country should allow for the use of contraceptives and sex-education programs. It is telling that the country combines good education rates with increasing teenage pregnancies. Its other metrics are acceptable; something is holding it back, and that is a flawed outlook on pregnancy.

Zambia, on the other hand, is attempting to improve, and is doing so more rapidly than its metrics would suggest. There is likely some form of outside influence at work. The United States of America, in contrast, is reducing adolescent pregnancy organically.