Infant Mortality

Harold Nelson

Introduction: The Questions

In his talk, Hans Rosling noted that some population-based outcomes showed a general trend of improvement up to 2003. There was also a convergence, with countries becoming more alike.

We want to look past 2003 and see whether these trends have continued, using the infant mortality rate (IMR) as the outcome.

Setup

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6      ✔ purrr   0.3.4 
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.1      ✔ stringr 1.4.1 
## ✔ readr   2.1.2      ✔ forcats 0.5.2 
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(janitor)
## 
## Attaching package: 'janitor'
## 
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test
library(plotly)
## 
## Attaching package: 'plotly'
## 
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## 
## The following object is masked from 'package:stats':
## 
##     filter
## 
## The following object is masked from 'package:graphics':
## 
##     layout

Get Data

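# Four lines precede the header row in the World Bank file, hence skip = 4.
# clean_names() turns the year columns 1960, 1961, ... into x1960, x1961, ...
IMR <- read_csv("API_SP.DYN.IMRT.IN_DS2_en_csv_v2_4770442.csv", na = "empty", skip = 4) %>% 
  janitor::clean_names()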
## New names:
## • `` -> `...67`
## Warning: One or more parsing issues, see `problems()` for details
## Rows: 266 Columns: 67
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (4): Country Name, Country Code, Indicator Name, Indicator Code
## dbl (61): 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970, ...
## lgl  (2): 2021, ...67
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
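
The `...67` column in the messages above is an unnamed, empty 67th column. A likely cause, assumed here rather than checked against the file, is a trailing comma at the end of each line. The sketch below reproduces the effect on a made-up CSV string and drops columns that are entirely missing; `toy_csv` and its values are hypothetical.

```r
library(readr)
library(dplyr)

# Hypothetical CSV text: every line ends with a comma, which implies
# a third, unnamed column with no values.
toy_csv <- "country,x1960,\nA,10,\nB,20,\n"

# I() marks the string as literal data rather than a file path.
toy <- read_csv(I(toy_csv), show_col_types = FALSE)

# Keep only the columns that hold at least one non-missing value.
toy_clean <- toy %>%
  select(where(~ !all(is.na(.x))))

names(toy_clean)
```

Applied to the real data, this approach would also drop the 2021 column, which has no values. The analysis below does not need it, because the pivot selects only the year columns.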

Pivot

The data are in wide format, with one column per year. We need year and IMR to be variables, with one row per country and year. How do we do this?

Solution

IMR_long <- IMR %>% 
  pivot_longer(cols = x1960:x2021,
               names_to = "year",
               values_to = "IMR") %>% 
  select(country_name, country_code, year, IMR) %>% 
  mutate(year = parse_number(year)) %>%   # "x1960" becomes 1960
  filter(year < 2021)                     # the 2021 column has no data
glimpse(IMR_long)
## Rows: 16,226
## Columns: 4
## $ country_name <chr> "Aruba", "Aruba", "Aruba", "Aruba", "Aruba", "Aruba", "Ar…
## $ country_code <chr> "ABW", "ABW", "ABW", "ABW", "ABW", "ABW", "ABW", "ABW", "…
## $ year         <dbl> 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 196…
## $ IMR          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
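
To see what the pivot does, here is a minimal sketch on a made-up table. The countries and values are hypothetical; only the column naming (`x1960`, `x1961`) mirrors the cleaned World Bank data.

```r
library(tidyr)
library(dplyr)
library(readr)
library(tibble)

# Hypothetical wide table: one row per country, one column per year.
wide <- tibble(
  country_name = c("A", "B"),
  x1960 = c(100, 80),
  x1961 = c(95, NA)
)

# Stack the year columns into (year, IMR) pairs, then turn "x1960" into 1960.
long <- wide %>%
  pivot_longer(cols = x1960:x1961,
               names_to = "year",
               values_to = "IMR") %>%
  mutate(year = parse_number(year))

long
```

Each country contributes one row per year column. The same arithmetic explains the row count reported by `glimpse()`: 266 rows times 61 years (1960 to 2020) gives 16,226.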

Scatterplot

Do a jittered scatterplot of year and IMR. Keep only every tenth year, from 1960 to 2020.

Solution

IMR_long %>% 
  filter(year %in% c(1960,1970,1980,1990,2000,2010,2020)) %>% 
  ggplot(aes(year,IMR)) +
  geom_jitter(size = .2)
## Warning: Removed 465 rows containing missing values (geom_point).
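
The years listed in the filter are every tenth year from 1960 to 2020. As a sketch, the same selection can be written with the modulo operator, which avoids typing the list by hand; the `toy` table below is a hypothetical stand-in for `IMR_long`.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in with one row per year.
toy <- tibble(year = 1960:2020)

# A year divisible by 10 leaves remainder 0.
decades <- toy %>%
  filter(year %% 10 == 0)

decades$year
```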

Do we see improvement and convergence?

Post 2003

Repeat the graphic with post-2003 data. Use every year.

Solution

g <- IMR_long %>% 
  filter(year > 2003) %>% 
  # group = country_name makes the country appear in the plotly tooltip
  ggplot(aes(year, IMR, group = country_name)) +
  geom_jitter(size = .2)
ggplotly(g)

Summary Statistics

Create summary-level variables, such as the mean and standard deviation, for every year.

Solution

summary_level <- IMR_long %>% 
  group_by(year) %>% 
  summarize(mean = mean(IMR, na.rm = TRUE),
            sd = sd(IMR, na.rm = TRUE),
            median = median(IMR, na.rm = TRUE),
            max = max(IMR, na.rm = TRUE),
            min = min(IMR, na.rm = TRUE),
            range = max - min)
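
The last line of the `summarize()` call relies on a dplyr feature: a summary can refer to summaries computed earlier in the same call. A minimal sketch on made-up data (the `toy` table and its values are hypothetical):

```r
library(dplyr)
library(tibble)

# Hypothetical data: two years, three countries each, one missing value.
toy <- tibble(
  year = c(2000, 2000, 2000, 2001, 2001, 2001),
  IMR  = c(10, 30, NA, 8, 20, 14)
)

toy_summary <- toy %>%
  group_by(year) %>%
  summarize(mean = mean(IMR, na.rm = TRUE),
            max = max(IMR, na.rm = TRUE),
            min = min(IMR, na.rm = TRUE),
            # max and min here are the columns just computed above,
            # not the base R functions.
            range = max - min)

toy_summary
```

With `na.rm = TRUE` the missing value in 2000 is ignored, so that year is summarized from its two observed values.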

Do a scatterplot of year and standard deviation.

Solution

summary_level %>% 
  ggplot(aes(x = year,y = sd)) +
  geom_point() + geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Do a scatterplot of year and mean.

Solution

summary_level %>% 
  ggplot(aes(x = year,y = mean)) +
  geom_point() + geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Examine Countries

Look at individual countries and compare the range of values each one has taken.

Solution

IMR_range <- IMR_long %>% 
  group_by(country_name) %>% 
  summarize(max = max(IMR, na.rm = TRUE),
            min = min(IMR, na.rm = TRUE),
            range = max - min) %>% 
  # countries with no data get range = -Inf and are dropped here
  filter(range > 0) %>% 
  arrange(range)
## Warning in max(IMR, na.rm = T): no non-missing arguments to max; returning -Inf

## Warning in min(IMR, na.rm = T): no non-missing arguments to min; returning Inf
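
The warnings above arise because some `country_name` groups contain only missing values, so `max()` returns `-Inf` and `min()` returns `Inf`. One way to avoid them, sketched here on made-up data rather than the real file, is to drop the missing values before grouping. The `toy` table is hypothetical.

```r
library(dplyr)
library(tibble)

# Hypothetical data: country "C" has no observations at all,
# and country "B" never changes.
toy <- tibble(
  country_name = c("A", "A", "B", "B", "C", "C"),
  IMR          = c(50, 20, 12, 12, NA, NA)
)

toy_range <- toy %>%
  filter(!is.na(IMR)) %>%     # drop missing values up front
  group_by(country_name) %>%
  summarize(max = max(IMR),
            min = min(IMR),
            range = max - min) %>%
  filter(range > 0) %>%       # still drops countries that never changed
  arrange(range)

toy_range
```

Groups with no data disappear at the first `filter()`, so `max()` and `min()` never see an empty vector.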

Compare Max and Range

Solution

g <- IMR_range %>% 
  ggplot(aes(max, range,group = country_name)) +
  geom_point(size = .5)
ggplotly(g)

Filter and Look Again

Solution

g <- IMR_range %>% 
  filter(range < 20) %>% 
  ggplot(aes(max, range,group = country_name)) +
  geom_point(size = .5)
ggplotly(g)

Save the Data

save(IMR_long,file = "IMR_long.Rdata")
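
A later session can restore the saved object with `load()`. The sketch below shows the round trip on a small made-up data frame written to a temporary file; `IMR_toy` and `toy_file` are hypothetical names, not part of the analysis above.

```r
# Hypothetical stand-in for IMR_long, saved to a temporary file.
IMR_toy <- data.frame(country_name = c("A", "B"),
                      year = c(2019, 2020),
                      IMR = c(12.5, 11.9))
toy_file <- tempfile(fileext = ".Rdata")
save(IMR_toy, file = toy_file)

# Remove the object, then restore it from disk under its original name.
rm(IMR_toy)
restored <- load(toy_file)   # load() returns the names of the restored objects

restored
exists("IMR_toy")
```

Note that `load()` recreates the object under the name it was saved with, so there is no assignment on the left-hand side for the data itself.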