# load data
adolecent_fertility_rates <- read.csv("https://raw.githubusercontent.com/Michelebradley/DATA-606/master/Adolecent_Fertility_Rates.csv", header=TRUE, check.names = FALSE)
gendered_financial_indicators <- read.csv("https://raw.githubusercontent.com/Michelebradley/DATA-606/master/Gendered_Financial_Indicators.csv", header=TRUE, check.names = FALSE)
gendered_world_indicators <- read.csv("https://raw.githubusercontent.com/Michelebradley/DATA-606/master/Gender_World%20_Indicators.csv", header=TRUE, check.names = FALSE)
I recently watched the documentary “Motherland” on PBS, described as a “vérité look at the busiest maternity hospital on the planet, in one of the world’s most populous countries: the Philippines”. As a first-generation American-born Filipina, I was left in awe by the hyper-realistic film, thinking about a life I could have led. It showcased the lives of girls younger than I am having their first child, caught in what seems to be a never-ending cycle of adolescent fertility. One 26-year-old in particular already had six children. Having just graduated from college, I’ve noticed that some high school friends now have children themselves, but their situation is nothing like that of the girls in the Philippines. I’ve thought about this documentary a lot since I saw it, and about how we can help empower women or give them access to the right tools so they won’t be caught in a cycle of continuous pregnancy. I asked myself: what in particular makes America different from the Philippines? So I decided to look at the countries in which adolescent fertility rates are increasing or decreasing, and then determine potential reasons. In essence:
Which countries have the most significant increasing/decreasing adolescent fertility rates and why?
Each country forms its own case, recording the adolescent fertility rate for women aged 15 to 19.
The World Bank has an up-to-date (as of 2015) data set with adolescent fertility rates for 264 countries and regional aggregates (such as “North America”) spanning 56 years, from 1960 to 2015. It also has financial indicators for each country broken down by gender.
This is an observational study looking at data from 1960 to 2015 for the most populous countries in the world.
Fertility data is found here: https://data.worldbank.org/indicator/SP.ADO.TFRT
World Development Indicators are found here: http://wdi.worldbank.org/table/WV.5 and https://data.worldbank.org/topic/gender
The response variable is numerical: a weighted average of births per 1,000 women ages 15-19. It is used to measure fertility rates for adolescent girls.
The explanatory variables are world development indicators, which are also numerical (some are percentages, one is an age, and another is binary). Sample variables include: “Life Expectancy”, “% with Account at a Financial Institution”, “% Women in Parliaments”, and “Nondiscrimination clause mentions gender in the constitution”.
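To make the unit of the response variable concrete, here is a minimal sketch of the arithmetic. The counts are hypothetical and are not World Bank figures, and weighting by the number of women aged 15-19 is an assumption about how an aggregate could be formed.

```r
# Hypothetical counts for illustration only (not World Bank figures)
births_to_15_19 <- c(a = 4200, b = 900)      # annual births to women aged 15-19
women_15_19     <- c(a = 100000, b = 60000)  # number of women aged 15-19

# Births per 1,000 women aged 15-19 in each hypothetical country
rate_per_1000 <- births_to_15_19 / women_15_19 * 1000
rate_per_1000

# Weighted average across both countries, using the number of women
# aged 15-19 as weights (an assumed weighting scheme)
weighted_rate <- weighted.mean(rate_per_1000, w = women_15_19)
weighted_rate
```

With these weights, the weighted average equals total births divided by total women, times 1,000.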
library(tidyr)
library(dplyr)
tidy_adolecent_fertility_rates <- gather(adolecent_fertility_rates, "year", "n", 5:60)
colnames(tidy_adolecent_fertility_rates)[colnames(tidy_adolecent_fertility_rates) == "Country Name"] <- "Country"
tidy_adolecent_fertility_rates <- select(tidy_adolecent_fertility_rates, one_of("Country", "year", "n"))
head(tidy_adolecent_fertility_rates)
summary(tidy_adolecent_fertility_rates)
##            Country          year                 n           
##  Afghanistan   :   56   Length:14784       Min.   :  0.5222  
##  Albania       :   56   Class :character   1st Qu.: 34.0233  
##  Algeria       :   56   Mode  :character   Median : 66.7703  
##  American Samoa:   56                      Mean   : 77.2695  
##  Andorra       :   56                      3rd Qu.:114.1903  
##  Angola        :   56                      Max.   :235.3200  
##  (Other)       :14448                      NA's   :1344      
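The summary reports `year` as a character column, because the year values come from the wide file's column names. A minimal sketch of converting it to a number before plotting or modelling, using a made-up toy table (`toy` and “Exampleland” are hypothetical):

```r
# Toy long-format table mimicking the tidy data; all values are made up
toy <- data.frame(
  Country = "Exampleland",
  year = c("1960", "1961", "1962"),
  n = c(80.1, 79.4, 78.9),
  stringsAsFactors = FALSE
)

# Years arrive as text, so convert them to integers
toy$year <- as.integer(toy$year)
str(toy)
```

With a numeric `year`, a plot can treat time as a continuous axis, and a regression on `year` estimates a single slope instead of one level per year.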
library(ggplot2)
NAmerica_Fertility <- filter(tidy_adolecent_fertility_rates, Country=="North America")
Namerica <- ggplot(NAmerica_Fertility, aes(year, n))
Namerica + geom_jitter() + theme(axis.text.x = element_text(angle = 90, hjust = 1))
The Philippines is one of the few countries in the world with an increase in adolescent fertility rates.
Philippines_Fertility <- filter(tidy_adolecent_fertility_rates, Country=="Philippines")
Philippines <- ggplot(Philippines_Fertility, aes(year, n))
Philippines + geom_jitter() + theme(axis.text.x = element_text(angle = 90, hjust = 1))
Note that although the trend in Zambia is decreasing, the number of adolescent girls giving birth is still very high.
Zambia_Fertility <- filter(tidy_adolecent_fertility_rates, Country=="Zambia")
Zambia <- ggplot(Zambia_Fertility, aes(year, n))
Zambia + geom_jitter() + theme(axis.text.x = element_text(angle = 90, hjust = 1))
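One way to rank countries by how quickly their rates rise or fall, instead of reading each plot by eye, is to fit a separate line per country and compare the slopes. The sketch below uses base R and made-up toy data. The countries “Falling” and “Rising” are hypothetical, and it assumes `year` has already been converted to a number.

```r
# Toy data: two hypothetical countries, one falling and one rising (made-up values)
toy <- data.frame(
  Country = rep(c("Falling", "Rising"), each = 5),
  year = rep(2011:2015, times = 2),
  n = c(60, 58, 56, 54, 52, 40, 41, 42, 43, 44)
)

# Fit n ~ year separately for each country and keep the slope,
# i.e. the estimated change in births per 1,000 women per year
trend_slope <- sapply(split(toy, toy$Country), function(d) {
  d <- d[!is.na(d$n), ]  # drop missing years before fitting
  unname(coef(lm(n ~ year, data = d))["year"])
})

# Most negative slopes (steepest declines) first, steepest increases last
sort(trend_slope)
```

A negative slope indicates a declining rate and a positive slope an increasing one. Whether a slope is statistically distinguishable from zero would still need to be checked, for example with the p-value from `summary(lm(...))`.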
tidy_gendered_world_indicators <- gather(gendered_world_indicators, "year", "n", 5:60)
colnames(tidy_gendered_world_indicators)[colnames(tidy_gendered_world_indicators) == "Country Name"] <- "Country"
colnames(tidy_gendered_world_indicators)[colnames(tidy_gendered_world_indicators) == "Indicator Name"] <- "Indicator"
tidy_gendered_world_indicators <- select(tidy_gendered_world_indicators, one_of("Country", "Indicator","year", "n"))
head(tidy_gendered_world_indicators)
Sample explanatory variable we can use for analysis:
NAmerica_Labor_Force <- filter(tidy_gendered_world_indicators, (Country=="North America") & (Indicator == "Labor force participation rate, female (% of female population ages 15+) (modeled ILO estimate)"))
NAmerica_Labor_Force <- ggplot(NAmerica_Labor_Force, aes(year, n))
NAmerica_Labor_Force + geom_jitter() + theme(axis.text.x = element_text(angle = 90, hjust = 1))
## Warning: Removed 30 rows containing missing values (geom_point).
Philippines_Labor_Force <- filter(tidy_gendered_world_indicators, (Country=="Philippines") & (Indicator == "Labor force participation rate, female (% of female population ages 15+) (modeled ILO estimate)"))
Philippines_Labor_Force <- ggplot(Philippines_Labor_Force, aes(year, n))
Philippines_Labor_Force + geom_jitter() + theme(axis.text.x = element_text(angle = 90, hjust = 1))
## Warning: Removed 30 rows containing missing values (geom_point).
Zambia_Labor_Force <- filter(tidy_gendered_world_indicators, (Country=="Zambia") & (Indicator == "Labor force participation rate, female (% of female population ages 15+) (modeled ILO estimate)"))
Zambia_Labor_Force <- ggplot(Zambia_Labor_Force, aes(year, n))
Zambia_Labor_Force + geom_jitter() + theme(axis.text.x = element_text(angle = 90, hjust = 1))
## Warning: Removed 30 rows containing missing values (geom_point).
We can fit many linear regressions comparing various indicators with adolescent fertility, and use statistical inference techniques to determine the best variables to use.
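As a rough sketch of what one such comparison could look like, the example below lines up a fertility series with a single indicator by country and year, then fits a simple linear regression. It uses base R (`merge` and `lm`) on made-up toy data. The column names `fertility_rate` and `labor_force_pct` are hypothetical stand-ins for the `n` columns of the two tidy tables.

```r
# Toy stand-ins for the two tidy tables (all values are made up)
fertility <- data.frame(
  Country = "Exampleland",
  year = 2010:2015,
  fertility_rate = c(50, 48, 47, 45, 44, 42)
)
labor <- data.frame(
  Country = "Exampleland",
  year = 2010:2015,
  labor_force_pct = c(40, 41, 41.5, 43, 43.5, 45)
)

# Line up both series so each row is one country-year observation
combined <- merge(fertility, labor, by = c("Country", "year"))

# Simple linear regression of fertility on the indicator
fit <- lm(fertility_rate ~ labor_force_pct, data = combined)
summary(fit)$coefficients  # estimate, standard error, t value, p-value
summary(fit)$r.squared     # share of variance explained
```

Because this is an observational study, a significant slope would indicate an association between the indicator and adolescent fertility, not a causal effect.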