Harold Nelson
In his talk, Hans Rosling noted that some population-based outcomes showed a general trend of improvement up to 2003. The values for different countries were also converging.
We want to look past 2003 and see whether these trends have continued, using the infant mortality rate (IMR) as the outcome.
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.1 ✔ stringr 1.4.1
## ✔ readr 2.1.2 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
##
## Attaching package: 'janitor'
##
## The following objects are masked from 'package:stats':
##
## chisq.test, fisher.test
##
## Attaching package: 'plotly'
##
## The following object is masked from 'package:ggplot2':
##
## last_plot
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following object is masked from 'package:graphics':
##
## layout
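# skip = 4 skips the lines above the header row.
# Note: `na` lists the strings to treat as missing, so "empty" is matched literally.
IMR <- read_csv("API_SP.DYN.IMRT.IN_DS2_en_csv_v2_4770442.csv", na = "empty", skip = 4) %>%
  janitor::clean_names()   # e.g. `Country Name` -> country_name, `1960` -> x1960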
## New names:
## • `` -> `...67`
## Warning: One or more parsing issues, see `problems()` for details
## Rows: 266 Columns: 67
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): Country Name, Country Code, Indicator Name, Indicator Code
## dbl (61): 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970, ...
## lgl (2): 2021, ...67
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
The yearly values are spread across the columns x1960 to x2021. We need year and IMR to be variables. How do we do this?
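IMR_long <- IMR %>%
  # Stack the year columns into (year, IMR) pairs
  pivot_longer(cols = x1960:x2021,
               names_to = "year",
               values_to = "IMR") %>%
  select(country_name, country_code, year, IMR) %>%
  mutate(year = parse_number(year)) %>%   # "x1960" -> 1960
  filter(year < 2021)                     # 2021 was read as a logical column, so drop it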
glimpse(IMR_long)
## Rows: 16,226
## Columns: 4
## $ country_name <chr> "Aruba", "Aruba", "Aruba", "Aruba", "Aruba", "Aruba", "Ar…
## $ country_code <chr> "ABW", "ABW", "ABW", "ABW", "ABW", "ABW", "ABW", "ABW", "…
## $ year <dbl> 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 196…
## $ IMR <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
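To see what the reshaping step does, here is a minimal sketch on a made-up table with the same layout as IMR. The name `toy_wide` and its values are hypothetical and are not part of the World Bank data.

```r
library(dplyr)
library(tidyr)
library(readr)

# Hypothetical wide table: one row per country, one column per year
toy_wide <- tibble(
  country_name = c("A", "B"),
  x2000 = c(50, 10),
  x2001 = c(48, 9)
)

toy_long <- toy_wide %>%
  pivot_longer(cols = x2000:x2001,
               names_to = "year",
               values_to = "IMR") %>%
  mutate(year = parse_number(year))   # "x2000" -> 2000

toy_long
```

Each country now has one row per year, which is the shape that `ggplot()` and `group_by()` expect.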
Do a scatterplot of year and IMR using jitter. Keep only every tenth year.
IMR_long %>%
  filter(year %in% c(1960,1970,1980,1990,2000,2010,2020)) %>%
  ggplot(aes(year, IMR)) +
  geom_jitter(size = .2)
## Warning: Removed 465 rows containing missing values (geom_point).
Do we see improvement and convergence?
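One way to make this question concrete is to read improvement as a falling mean and convergence as a shrinking spread across countries. The sketch below uses made-up numbers. The column `cv` (standard deviation divided by mean) is an assumption added here as one possible relative measure of spread; it does not appear in the original analysis.

```r
library(dplyr)

# Made-up data: three countries observed in two years
toy <- tibble(
  country = rep(c("A", "B", "C"), times = 2),
  year = rep(c(1960, 2020), each = 3),
  IMR = c(150, 100, 50, 30, 20, 10)
)

toy_summary <- toy %>%
  group_by(year) %>%
  summarize(mean = mean(IMR, na.rm = TRUE),   # improvement: does the mean fall?
            sd = sd(IMR, na.rm = TRUE),       # convergence: does the spread shrink?
            cv = sd / mean)                   # spread relative to the mean

toy_summary
```

In this toy case the mean and the standard deviation both fall, but `cv` stays the same, so whether the countries have "converged" depends on whether absolute or relative spread is meant.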
Repeat the graphic with post-2003 data. Use every year.
g <- IMR_long %>%
  filter(year > 2003) %>%
  # group does not change the points; it lets ggplotly() show country_name in the tooltip
  ggplot(aes(year, IMR, group = country_name)) +
  geom_jitter(size = .2)
ggplotly(g)
Create summary-level variables such as the mean and standard deviation for every year.
summary_level <- IMR_long %>%
  group_by(year) %>%
  summarize(mean = mean(IMR, na.rm = T),
            sd = sd(IMR, na.rm = T),
            median = median(IMR, na.rm = T),
            max = max(IMR, na.rm = T),
            min = min(IMR, na.rm = T),
            range = max - min)   # uses the max and min columns computed just above
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
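The two `geom_smooth()` messages above come from plots whose code is not shown. As a rough sketch only, a plot of this kind could look like the following. The table `toy_summary` is a made-up stand-in for `summary_level`, and the choice of `sd` for the y axis is an assumption, not the author's plot.

```r
library(dplyr)
library(ggplot2)

# Made-up stand-in for summary_level: one row per year
toy_summary <- tibble(
  year = 2000:2020,
  sd = seq(45, 28, length.out = 21) + rep(c(0.5, -0.5, 0), times = 7)
)

p <- ggplot(toy_summary, aes(year, sd)) +
  geom_point() +
  geom_smooth()   # no method given, so ggplot2 picks loess for a small data set

p
```

A downward-sloping curve for `sd` would point toward convergence; the same plot with `mean` on the y axis would show improvement.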
Look at individual countries and compare the range of values.
IMR_range <- IMR_long %>%
  group_by(country_name) %>%
  summarize(max = max(IMR, na.rm = T),
            min = min(IMR, na.rm = T),
            range = max - min) %>%
  filter(range > 0) %>%
  arrange(range)
## Warning in max(IMR, na.rm = T): no non-missing arguments to max; returning -Inf
## Warning in min(IMR, na.rm = T): no non-missing arguments to min; returning Inf
## (Each warning is repeated 25 times, once for every country_name group with no non-missing IMR values.)
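These warnings come from groups in which every IMR value is missing. With `na.rm = T` nothing is left to summarize, so `max()` returns -Inf, `min()` returns Inf, and the range is -Inf; the `filter(range > 0)` step then drops those rows. The result is unaffected, but the warnings can be avoided by removing missing values before grouping. A minimal sketch on made-up data follows; `toy` and its values are hypothetical.

```r
library(dplyr)

# Made-up data: country "B" has no usable values at all
toy <- tibble(
  country_name = c("A", "A", "B", "B", "C", "C"),
  IMR = c(30, 12, NA, NA, 8, 5)
)

toy_range <- toy %>%
  filter(!is.na(IMR)) %>%          # all-missing groups drop out here
  group_by(country_name) %>%
  summarize(max = max(IMR),
            min = min(IMR),
            range = max - min) %>%
  arrange(range)

toy_range
```

Unlike `filter(range > 0)`, this version would keep a country whose IMR never changes, so the original filter may still be wanted afterwards.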
g <- IMR_range %>%
  filter(range < 20) %>%   # keep countries whose IMR moved by less than 20 overall
  ggplot(aes(max, range, group = country_name)) +
  geom_point(size = .5)
ggplotly(g)