Data Preparation

# load data
adolecent_fertility_rates <- read.csv("https://raw.githubusercontent.com/Michelebradley/DATA-606/master/Adolecent_Fertility_Rates.csv", header=TRUE, check.names = FALSE)
gendered_financial_indicators <- read.csv("https://raw.githubusercontent.com/Michelebradley/DATA-606/master/Gendered_Financial_Indicators.csv", header=TRUE, check.names = FALSE)
gendered_world_indicators <- read.csv("https://raw.githubusercontent.com/Michelebradley/DATA-606/master/Gender_World%20_Indicators.csv", header=TRUE, check.names = FALSE)

Research question

I recently watched the documentary “Motherland” on PBS, described as a “vérité look at the busiest maternity hospital on the planet, in one of the world’s most populous countries: the Philippines”. As a first-generation American-born Filipina, the hyper-realistic film left me in awe, thinking about a life I could have led. It showcased the lives of girls younger than I am, having their first child and caught in what seems to be a never-ending cycle of adolescent fertility. One 26-year-old in particular already had six children. Having just graduated college, I’ve noticed that some high school friends now have children themselves, but it is nothing like what the girls in the Philippines experience. I’ve thought about this documentary a lot since I saw it, and about how we can help empower women or give them access to the right tools so they won’t be caught in a cycle of continuous pregnancy. I asked myself: what in particular makes America different from the Philippines? So I decided to look at the countries in which adolescent fertility rates are increasing or decreasing, and then determine potential reasons. In essence:

Which countries show the most significant increases or decreases in adolescent fertility rates, and why?

Cases

Each country forms its own case and reports the rate of adolescent fertility for women aged 15 to 19 years old.

Data collection

The World Bank publishes a data set (current through 2015) with adolescent fertility rates for 264 countries and regional aggregates, spanning the 56 years from 1960 to 2015. It also provides financial indicators for each country, broken down by gender.

Type of study

This is an observational study using data from 1960 to 2015 for countries around the world.

Data Source

Fertility data is found here: https://data.worldbank.org/indicator/SP.ADO.TFRT

World Development Indicators are found here: http://wdi.worldbank.org/table/WV.5 and https://data.worldbank.org/topic/gender

Response

The response variable is a numerical value: the weighted average of births per 1,000 women ages 15-19. It is used to measure fertility rates for adolescent girls.

Explanatory

The explanatory variables are World Development Indicators, which are also numerical (some are percentages, one is an age, and another is binary). Sample variables include: “Life Expectancy”, “% with Account at a Financial Institution”, “% Women in Parliaments”, and “Nondiscrimination clause mentions gender in the constitution”.

Relevant summary statistics

# tidyr provides gather(); dplyr provides select() and filter()
library(tidyr)
library(dplyr)

tidy_adolecent_fertility_rates <- gather(adolecent_fertility_rates, "year", "n", 5:60) 
colnames(tidy_adolecent_fertility_rates)[colnames(tidy_adolecent_fertility_rates) == "Country Name"] <- "Country"
tidy_adolecent_fertility_rates <- select(tidy_adolecent_fertility_rates, one_of("Country", "year", "n"))
head(tidy_adolecent_fertility_rates)
summary(tidy_adolecent_fertility_rates)
##            Country          year                 n           
##  Afghanistan   :   56   Length:14784       Min.   :  0.5222  
##  Albania       :   56   Class :character   1st Qu.: 34.0233  
##  Algeria       :   56   Mode  :character   Median : 66.7703  
##  American Samoa:   56                      Mean   : 77.2695  
##  Andorra       :   56                      3rd Qu.:114.1903  
##  Angola        :   56                      Max.   :235.3200  
##  (Other)       :14448                      NA's   :1344
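The summary shows that `year` is stored as character and that `n` contains missing values. Below is a minimal sketch of one way to handle both before plotting or modeling. It runs on a small made-up data frame (`toy_rates`, a hypothetical name) with the same three columns, and it assumes the year strings are plain four-digit years.

```r
library(dplyr)

# Made-up rows in the same shape as tidy_adolecent_fertility_rates
# (Country, year, n). The values are illustrative, not World Bank figures.
toy_rates <- data.frame(
  Country = c("A", "A", "A", "B", "B", "B"),
  year    = c("1960", "1961", "1962", "1960", "1961", "1962"),
  n       = c(100, 98, NA, 50, 52, 54),
  stringsAsFactors = FALSE
)

clean_rates <- toy_rates %>%
  mutate(year = as.integer(year)) %>%  # character -> integer, so year can be a continuous axis
  filter(!is.na(n))                    # drop country-years with no reported rate

str(clean_rates)
```

With `year` stored as an integer, ggplot2 treats it as a continuous axis, so the labels may no longer need to be rotated.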

North American Fertility Rates

library(ggplot2)

NAmerica_Fertility <- filter(tidy_adolecent_fertility_rates, Country=="North America")

Namerica <- ggplot(NAmerica_Fertility, aes(year, n))
Namerica + geom_jitter() + theme(axis.text.x = element_text(angle = 90, hjust = 1))

Philippines Fertility Rates

The Philippines is one of the few countries in the world where adolescent fertility rates have increased.

Philippines_Fertility <- filter(tidy_adolecent_fertility_rates, Country=="Philippines")

Philippines <- ggplot(Philippines_Fertility, aes(year, n))
Philippines + geom_jitter() + theme(axis.text.x = element_text(angle = 90, hjust = 1))

Zambia Fertility Rates

Note that although the trend is decreasing, the number of adolescent girls giving birth is still very high.

Zambia_Fertility <- filter(tidy_adolecent_fertility_rates, Country=="Zambia")

Zambia <- ggplot(Zambia_Fertility, aes(year, n))
Zambia + geom_jitter() + theme(axis.text.x = element_text(angle = 90, hjust = 1))
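The plots above show each trend by eye. One possible way to rank countries by how fast their rates rise or fall is to fit a straight line per country and compare slopes. This is only a sketch: the data frame `toy_rates`, the helper `trend_slope`, and all values are made up for illustration, and a linear slope is just one of several reasonable definitions of "increasing" or "decreasing".

```r
library(dplyr)

# Made-up data for two hypothetical countries (not World Bank figures)
toy_rates <- data.frame(
  Country = rep(c("Rising", "Falling"), each = 4),
  year    = rep(2000:2003, times = 2),
  n       = c(40, 42, 44, 46, 150, 140, 130, 120)
)

# Hypothetical helper: least-squares slope of n on year,
# in births per 1,000 women per year
trend_slope <- function(year, n) {
  unname(coef(lm(n ~ year))[2])
}

slopes <- toy_rates %>%
  group_by(Country) %>%
  summarise(slope = trend_slope(year, n)) %>%
  arrange(slope)  # most steeply decreasing first

print(slopes)
```

On this toy data the slopes are -10 for "Falling" and 2 for "Rising". On the real data, `year` would first need to be converted from character to a number, and rows with missing `n` would need to be handled.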

World Indicators Explanatory Variables

tidy_gendered_world_indicators <- gather(gendered_world_indicators, "year", "n", 5:60) 
colnames(tidy_gendered_world_indicators)[colnames(tidy_gendered_world_indicators) == "Country Name"] <- "Country"
colnames(tidy_gendered_world_indicators)[colnames(tidy_gendered_world_indicators) == "Indicator Name"] <- "Indicator"
tidy_gendered_world_indicators <- select(tidy_gendered_world_indicators, one_of("Country", "Indicator","year", "n"))
head(tidy_gendered_world_indicators)
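In this long format, every indicator shares the single value column `n`. For modeling it may be more convenient to have one column per indicator. The sketch below shows that reshaping on made-up rows. The indicator names and values are placeholders, and `pivot_wider()` assumes tidyr 1.0.0 or later.

```r
library(tidyr)

# Made-up rows in the same shape as tidy_gendered_world_indicators
toy_indicators <- data.frame(
  Country   = c("A", "A", "B", "B"),
  Indicator = c("Indicator X", "Indicator Y", "Indicator X", "Indicator Y"),
  year      = c("2010", "2010", "2010", "2010"),
  n         = c(1.5, 20, 2.5, 30),
  stringsAsFactors = FALSE
)

# One row per Country and year, one column per indicator
wide_indicators <- pivot_wider(
  toy_indicators,
  names_from  = Indicator,
  values_from = n
)

print(wide_indicators)
```

The result has one row per country-year pair, which makes it easier to line an indicator up against the fertility rate for the same country and year.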

North America’s Female Labor Force Participation Rate

A sample explanatory variable we can use for analysis.

NAmerica_Labor_Force <- filter(tidy_gendered_world_indicators, (Country=="North America") & (Indicator == "Labor force participation rate, female (% of female population ages 15+) (modeled ILO estimate)"))

NAmerica_Labor_Force <- ggplot(NAmerica_Labor_Force, aes(year, n))
NAmerica_Labor_Force + geom_jitter() + theme(axis.text.x = element_text(angle = 90, hjust = 1))
## Warning: Removed 30 rows containing missing values (geom_point).

The Philippines’ Female Labor Force Participation Rate

Philippines_Labor_Force <- filter(tidy_gendered_world_indicators, (Country=="Philippines") & (Indicator == "Labor force participation rate, female (% of female population ages 15+) (modeled ILO estimate)"))

Philippines_Labor_Force <- ggplot(Philippines_Labor_Force, aes(year, n))
Philippines_Labor_Force + geom_jitter() + theme(axis.text.x = element_text(angle = 90, hjust = 1))
## Warning: Removed 30 rows containing missing values (geom_point).

Zambia’s Female Labor Force Participation Rate

Zambia_Labor_Force <- filter(tidy_gendered_world_indicators, (Country=="Zambia") & (Indicator == "Labor force participation rate, female (% of female population ages 15+) (modeled ILO estimate)"))

Zambia_Labor_Force <- ggplot(Zambia_Labor_Force, aes(year, n))
Zambia_Labor_Force + geom_jitter() + theme(axis.text.x = element_text(angle = 90, hjust = 1))
## Warning: Removed 30 rows containing missing values (geom_point).
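The filter-then-plot steps above repeat for every country. One way to cut the repetition is a small helper function. The sketch below is an assumption about how that could look, not part of the original analysis: `plot_country` is a hypothetical name, the data are made up, and it uses `geom_point()` rather than `geom_jitter()` so that points are drawn at their exact values.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: keep one country's non-missing rows
# and plot its values over time.
plot_country <- function(data, country) {
  country_rows <- data %>%
    filter(Country == country, !is.na(n)) %>%
    mutate(year = as.integer(year))
  ggplot(country_rows, aes(year, n)) +
    geom_point() +
    labs(title = country, x = "Year", y = "Value")
}

# Made-up rows with the same columns as the tidy data frames
toy <- data.frame(
  Country = rep(c("A", "B"), each = 3),
  year    = rep(c("2000", "2001", "2002"), times = 2),
  n       = c(1, 2, NA, 5, 4, 3),
  stringsAsFactors = FALSE
)

p <- plot_country(toy, "A")
print(p)
```

Because missing values are dropped before plotting, this version should not trigger the "Removed rows" warning shown above.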

Conclusions

We can fit many linear regressions comparing various indicators against adolescent fertility, and use statistical inference techniques to determine which variables are best to use.
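As a rough sketch of what one such regression might look like, the example below joins a fertility table to a single indicator by country and year, then fits a simple linear model. Everything here is made up for illustration: the column names `fertility` and `labor_force`, the five countries, and all values. The direction of the relationship in this toy data says nothing about the real World Bank data.

```r
library(dplyr)

# Made-up values for illustration only
toy_fertility <- data.frame(
  Country   = c("A", "B", "C", "D", "E"),
  year      = 2010L,
  fertility = c(20, 35, 60, 80, 110)
)
toy_labor <- data.frame(
  Country     = c("A", "B", "C", "D", "E"),
  year        = 2010L,
  labor_force = c(60, 55, 50, 48, 40)
)

# Line up response and explanatory variable for the same country and year
combined <- inner_join(toy_fertility, toy_labor, by = c("Country", "year"))

# Simple linear regression of fertility on the indicator
fit <- lm(fertility ~ labor_force, data = combined)
summary(fit)$coefficients
```

The coefficient table reports the estimated slope with its standard error and p-value, which is the starting point for comparing candidate explanatory variables.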