Author: Chinedu Onyeka, Date: October 1st, 2021

Collaborators: Tyler Baker, Leticia Salazar

Objective: To understand the trend in global child mortality since the 1980s

Data Source: UN Inter-agency Group for Child Mortality Estimation

library(tidyverse)

Read the file

url <- "https://raw.githubusercontent.com/chinedu2301/DATA607-Data-Acquisition-and-Management/main/dataset1_Child_Mortality.csv"
child_mort <- read_csv(url, skip = 6)
head(child_mort, n = 10)

We don’t need the ISO code, and we need only the median estimate for each country.

# Keep the columns from CountryName onward and only the median-estimate rows
child_mort <- child_mort %>%
  select(CountryName:Neonatal.Deaths.2015) %>%
  filter(`Uncertainty bounds*` == "Median")
head(child_mort)
dim(child_mort)
## [1] 195 398

This data-set is now 195 x 398, which is a lot of columns: there are more columns than rows.

Transform from wide to long

# Gather all mortality columns into two columns: Type (the old column name) and Deaths (its value).
child_mort <- child_mort %>% gather(key = "Type", value = "Deaths",
                      U5MR.1950:Neonatal.Deaths.2015, na.rm = TRUE)

head(child_mort, n = 10)
#check the dimensions for this long table
dim(child_mort)
## [1] 57958     4

We now have only 4 columns, and each row holds one value for a country, a type of child mortality, and a year.
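
For readers on a recent tidyr, the same wide-to-long step can be sketched with `pivot_longer()`, which supersedes `gather()`. This is a minimal illustration rather than the code used in this analysis: the toy table `wide` and its values are made up, and only the column-name pattern (`U5MR.<year>`) is taken from the data above.

```r
library(tidyr)
library(dplyr)

# Toy wide table with made-up values, shaped like the real data
wide <- tibble(
  CountryName = c("A", "B"),
  U5MR.2014 = c(10.5, NA),
  U5MR.2015 = c(10.1, 20.3)
)

# Counterpart of gather(key = "Type", value = "Deaths", ..., na.rm = TRUE)
long <- wide %>%
  pivot_longer(
    cols = U5MR.2014:U5MR.2015,
    names_to = "Type",
    values_to = "Deaths",
    values_drop_na = TRUE  # plays the role of na.rm = TRUE
  )
long
```

The row order differs from `gather()`: `pivot_longer()` emits the values country by country, while `gather()` stacks them column by column. The content is the same.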

Separate the “Type” column:

child_mort <- child_mort %>% separate("Type", into = c("Type", "Year")) %>%
  select(CountryName, Type:Deaths) %>% arrange(CountryName)
## Warning: Expected 2 pieces. Additional pieces discarded in 28976 rows [28983,
## 28984, 28985, 28986, 28987, 28988, 28989, 28990, 28991, 28992, 28993, 28994,
## 28995, 28996, 28997, 28998, 28999, 29000, 29001, 29002, ...].
head(child_mort, n = 20)
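
A note on the warning above: by default `separate()` splits at every non-alphanumeric character, so a name such as `Neonatal.Deaths.2015` yields three pieces and the third one (the year) is discarded. Those rows end up with "Deaths" in the Year column. One possible way to keep the year, sketched here on made-up values and not the approach used in this analysis, is to split only at the dot that precedes the four-digit year:

```r
library(tidyr)
library(dplyr)

# Toy long table; the Deaths values are made up for illustration
toy <- tibble(
  Type = c("U5MR.1990", "Neonatal.Deaths.2015"),
  Deaths = c(93.4, 1200)
)

# Split only at the final dot, i.e. the one followed by a four-digit year
toy_split <- toy %>%
  separate(Type, into = c("Type", "Year"),
           sep = "\\.(?=\\d{4}$)", convert = TRUE)
toy_split
```

With this pattern the Type values stay whole ("Neonatal.Deaths") and `convert = TRUE` turns Year into an integer. Note that applying it to the full data would keep rows that a later filter on Year would otherwise drop, so the downstream totals would change.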

Subsetting to data from 1980 to 2015

child_m <- child_mort %>% filter(Year %in% 1980:2015) %>% 
  mutate(Year = as.numeric(Year))
head(child_m, n = 10)
tail(child_m, n = 10)

Adding all categories of child mortality for each country, and then for each year:

country_child_deaths <- child_m %>% group_by(CountryName) %>% 
  summarise(Total_Child_Deaths = sum(Deaths))
head(country_child_deaths, n = 20)
country_child_deaths_year <- child_m %>% group_by(Year) %>% 
  summarise(Total_Child_Deaths = sum(Deaths))
country_child_deaths_year
ggplot(data = country_child_deaths_year, aes(Year, Total_Child_Deaths)) + geom_point() +
  geom_smooth() + labs(title = "Total Child mortality vs Year") + ylab("Total Child Mortality") +
  theme_bw()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
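
The totals above add every category together. If a per-category view is of interest, one option is to group by Type as well as Year. The sketch below assumes the same column names as `child_m` (`CountryName`, `Type`, `Year`, `Deaths`), but the table `toy` and its numbers are made up.

```r
library(dplyr)

# Toy data with made-up values, using the same column names as child_m
toy <- tibble(
  CountryName = c("A", "A", "B", "B"),
  Type = c("U5MR", "IMR", "U5MR", "IMR"),
  Year = c(1990, 1990, 1990, 1990),
  Deaths = c(90, 60, 30, 20)
)

# One total per category and year instead of one total per year
by_type_year <- toy %>%
  group_by(Type, Year) %>%
  summarise(Total = sum(Deaths), .groups = "drop")
by_type_year
```

The result could then be plotted with `aes(Year, Total, colour = Type)` to draw one trend line per category.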

Conclusion: The chart shows that total child mortality per year around the world has declined since the 1980s.