Life Expectancy in Mexico, Canada and the US for years 1950-1980

Reading the data

url <- "https://raw.githubusercontent.com/Stat579-at-ISU/stat579-at-isu.github.io/master/homework/data/gapminder-5060.csv"

life5060 <- read.csv(url)
head(life5060)
##       country continent year lifeExp      pop gdpPercap
## 1 Afghanistan      Asia 1952  28.801  8425333  779.4453
## 2 Afghanistan      Asia 1957  30.332  9240934  820.8530
## 3 Afghanistan      Asia 1962  31.997 10267083  853.1007
## 4 Afghanistan      Asia 1967  34.020 11537966  836.1971
## 5     Albania    Europe 1952  55.230  1282697 1601.0561
## 6     Albania    Europe 1957  59.280  1476505 1942.2842
library(ggplot2)
library(tidyverse)
## -- Attaching packages ------------------------------------------------------- tidyverse 1.2.1 --
## v tibble  1.4.2     v purrr   0.2.5
## v tidyr   0.8.1     v dplyr   0.7.6
## v readr   1.1.1     v stringr 1.3.1
## v tibble  1.4.2     v forcats 0.3.0
## -- Conflicts ---------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

Focusing on the 5060 data and correcting the 1957 life expectancy for Canada

canada<- life5060 %>% filter(country=="Canada")
canada %>% filter(year==1957)
##   country continent year lifeExp      pop gdpPercap
## 1  Canada  Americas 1957  999999 17010154  12489.95
# Replace the 999999 placeholder recorded for 1957 with 69.96
canada_fixed <- canada %>% mutate(lifeExp = replace(lifeExp, year == 1957, 69.96))
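
The 1957 entry was located here by filtering on the year. A more general way to find such placeholders is to flag any life expectancy above a plausibility ceiling. The following is a minimal sketch in base R, so it runs without extra packages. The name `canada_demo` and the 120-year threshold `max_plausible` are assumptions made for illustration, and the values are copied from the output printed in this document rather than read from the file.

```r
# Canada rows for 1952-1967 as printed in this document;
# 1957 still holds the 999999 placeholder.
canada_demo <- data.frame(
  country = "Canada",
  year    = c(1952, 1957, 1962, 1967),
  lifeExp = c(68.75, 999999, 71.30, 72.13)
)

# Assumption: no country-level life expectancy exceeds 120 years.
max_plausible <- 120

# Rows whose value cannot be a real life expectancy.
suspect <- canada_demo[canada_demo$lifeExp > max_plausible, ]
print(suspect)

# Apply the same correction as in the analysis (69.96 for 1957).
canada_demo$lifeExp[canada_demo$year == 1957] <- 69.96
```

Running the same filter on the full `life5060` data frame would show whether other countries carry similar placeholder values.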

Reading the gap7080 data

url2 <- "https://raw.githubusercontent.com/Stat579-at-ISU/stat579-at-isu.github.io/master/homework/data/gap7080.csv"

life7080 <- read.csv(url2)
head(life7080)
##       country continent year lifeExp      pop gdpPercap
## 1 Afghanistan      Asia 1972  36.088 13079460  739.9811
## 2 Afghanistan      Asia 1977  38.438 14880372  786.1134
## 3 Afghanistan      Asia 1982  39.854 12881816  978.0114
## 4 Afghanistan      Asia 1987  40.822 13867957  852.3959
## 5     Albania    Europe 1972  67.690  2263554 3313.4222
## 6     Albania    Europe 1977  68.930  2509048 3533.0039

Combining the 5060 data for the three countries

usa.mexico= life5060 %>% filter(country %in% c("United States", "Mexico"))
data1=rbind(canada_fixed, usa.mexico)
data1 %>% filter(year==1957)
##         country continent year lifeExp       pop gdpPercap
## 1        Canada  Americas 1957   69.96  17010154 12489.950
## 2        Mexico  Americas 1957   55.19  35015548  4131.547
## 3 United States  Americas 1957   69.49 171984000 14847.127

Combining data of 5060 with 7080

data2= life7080 %>% filter(country %in% c("Canada","United States", "Mexico" ))
head(data2)
##   country continent year lifeExp      pop gdpPercap
## 1  Canada  Americas 1972  72.880 22284500 18970.571
## 2  Canada  Americas 1977  74.210 23796400 22090.883
## 3  Canada  Americas 1982  75.760 25201900 22898.792
## 4  Canada  Americas 1987  76.860 26549700 26626.515
## 5  Mexico  Americas 1972  62.361 55984294  6809.407
## 6  Mexico  Americas 1977  65.032 63759976  7674.929
data.full= rbind(data1, data2)
head(data.full)
##   country continent year lifeExp      pop gdpPercap
## 1  Canada  Americas 1952  68.750 14785584 11367.161
## 2  Canada  Americas 1957  69.960 17010154 12489.950
## 3  Canada  Americas 1962  71.300 18985849 13462.486
## 4  Canada  Americas 1967  72.130 20819767 16076.588
## 5  Mexico  Americas 1952  50.789 30144317  3478.126
## 6  Mexico  Americas 1957  55.190 35015548  4131.547
tail(data.full)
##          country continent year lifeExp       pop gdpPercap
## 19        Mexico  Americas 1982  67.405  71640904  9611.148
## 20        Mexico  Americas 1987  69.498  80122492  8688.156
## 21 United States  Americas 1972  71.340 209896000 21806.036
## 22 United States  Americas 1977  73.380 220239000 24072.632
## 23 United States  Americas 1982  74.650 232187835 25009.559
## 24 United States  Americas 1987  75.020 242803533 29884.350
data.full %>% ggplot(aes(x = year, y = lifeExp, colour=country))+geom_line()
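
`rbind()` matches data frame columns by name, so stacking the two periods works only because both files share the same columns. Before plotting, it can be worth confirming that the pieces line up and that no country-year pair appears twice. The sketch below illustrates the idea in base R with the Canada values printed in this document. The names `first_half`, `second_half`, `combined` and `n_dupes` are hypothetical, and the data frames keep only three of the six columns for brevity.

```r
# Canada rows from the two files, using values printed in this document.
first_half <- data.frame(
  country = "Canada",
  year    = c(1952, 1957, 1962, 1967),
  lifeExp = c(68.75, 69.96, 71.30, 72.13)
)
second_half <- data.frame(
  country = "Canada",
  year    = c(1972, 1977, 1982, 1987),
  lifeExp = c(72.88, 74.21, 75.76, 76.86)
)

# rbind() needs matching column names; stop early if they differ.
stopifnot(identical(names(first_half), names(second_half)))

combined <- rbind(first_half, second_half)

# Count repeated country-year pairs; 0 means the periods do not overlap.
n_dupes <- sum(duplicated(combined[, c("country", "year")]))

# Number of observations per country.
print(table(combined$country))
```

Applied to `data.full`, the same checks should report 24 rows with 8 observations for each of the three countries.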

Plotting both life5060 and gap7080

library(gridExtra)
## 
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
## 
##     combine
# Top panel: the 5060 data only (corrected Canada plus United States and Mexico)
p1 <- canada_fixed %>%
  ggplot(aes(x = year, y = lifeExp, colour = country)) +
  geom_line() +
  geom_line(data = life5060 %>%
              filter(country %in% c("United States", "Mexico"))) +
  labs(x = "1950-1960", y = "Life Expectancy") +
  xlim(1950, 1990)

p2<-data.full %>% ggplot(aes(x = year, y = lifeExp, colour=country))+
  geom_line()+ labs(x= "1950-1980", y="Life Expectancy")+xlim(1950,1990)

grid.arrange(p1,p2)

Comparing the two plots shows that life expectancy in Canada was the highest of the three countries over the whole period 1950-1980 (the observations run from 1952 to 1987). Over the same period, life expectancy in Mexico rose markedly, from about 50.8 in 1952 to about 69.5 in 1987, but it remained below that of the United States and Canada. Although life expectancy in Canada increased more rapidly than in the United States between 1950 and 1960, life expectancy in the United States then picked up and came close to the Canadian level between 1975 and 1980 (73.38 versus 74.21 in 1977). After 1982 it levelled off at around 75.
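
The comparison between Canada and the United States can be made concrete by computing the difference in life expectancy for each year. This is a rough sketch in base R that uses only the values printed in this document, so it covers 1957 and 1972-1987. The United States values for 1952, 1962 and 1967 are not shown above and are left out. The names `gap` and `us_change_82_87` are hypothetical.

```r
# Life expectancy values copied from the output printed in this document.
gap <- data.frame(
  year   = c(1957, 1972, 1977, 1982, 1987),
  canada = c(69.96, 72.88, 74.21, 75.76, 76.86),
  usa    = c(69.49, 71.34, 73.38, 74.65, 75.02)
)

# Canada minus United States, in years of life expectancy.
gap$diff <- round(gap$canada - gap$usa, 2)
print(gap)
# diff column: 0.47 1.54 0.83 1.11 1.84

# Change in the United States between the last two observations.
us_change_82_87 <- round(gap$usa[gap$year == 1987] - gap$usa[gap$year == 1982], 2)
print(us_change_82_87)
# 0.37
```

The difference is positive in every year shown, narrows to 0.83 in 1977, and widens again by 1987 as the United States series gains only 0.37 between 1982 and 1987.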