Data source = https://data.cdc.gov/api/views/w9j2-ggv5/rows.csv?accessType=DOWNLOAD

The dataset contains life expectancy at birth in the USA for males, females, and both sexes combined. Let’s start the cleaning by removing the death rate column, which is the fifth column.

data <- read.csv("C:/Users/sulov/Downloads/NCHS_-_Death_rates_and_life_expectancy_at_birth.csv")
# Drop the 5th column (the death rate); Year, Race, Sex and life expectancy remain
data = data[-c(5)]
head(data)
##   Year      Race        Sex Average.Life.Expectancy..Years.
## 1 2015 All Races Both Sexes                              NA
## 2 2014 All Races Both Sexes                            78.9
## 3 2013 All Races Both Sexes                            78.8
## 4 2012 All Races Both Sexes                            78.8
## 5 2011 All Races Both Sexes                            78.7
## 6 2010 All Races Both Sexes                            78.7
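Dropping a column by position only works as long as the column order of the file never changes. Here is a minimal sketch of the same step done by name, on a small stand-in data frame. The name `toy` is hypothetical, and the column name `Age.adjusted.Death.Rate` is an assumption about how `read.csv` renders the CSV header, so check it against `names(data)` before relying on it.

```r
# Stand-in for the downloaded file. The life expectancy values are copied from
# the head(data) output above. The death rate column name is an ASSUMPTION and
# its values are deliberately left as NA.
toy <- data.frame(
  Year = c(2014, 2013, 2012),
  Race = "All Races",
  Sex = "Both Sexes",
  Average.Life.Expectancy..Years. = c(78.9, 78.8, 78.8),
  Age.adjusted.Death.Rate = NA_real_
)

# Keep every column except the ones listed in drop_cols
drop_cols <- "Age.adjusted.Death.Rate"
toy_clean <- toy[, !(names(toy) %in% drop_cols), drop = FALSE]

names(toy_clean)
```

Selecting by name fails loudly (nothing is dropped) if the column is renamed, which is usually easier to notice than silently dropping the wrong column.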

Next, we remove the rows that contain missing values.

data = na.omit(data)
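Note that `na.omit()` drops every row that has an `NA` in any column, not only in the life expectancy column. Here is a small sketch of that behaviour, using three rows copied from the `head(data)` output above (the names `toy`, `cleaned` and `cleaned_one_col` are just for illustration):

```r
# Three rows taken from the head(data) output; 2015 has no life expectancy value
toy <- data.frame(
  Year = c(2015, 2014, 2013),
  Average.Life.Expectancy..Years. = c(NA, 78.9, 78.8)
)

# Rows with at least one NA in any column are removed
cleaned <- na.omit(toy)
cleaned$Year

# Gives the same rows here, but only checks the one column of interest
cleaned_one_col <- toy[!is.na(toy$Average.Life.Expectancy..Years.), ]
```

The second form can be the safer choice when other columns are allowed to contain missing values.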

The year 2015 is removed because its rows did not contain a life expectancy value. Let’s compare male and female life expectancy. To do that, we create a new data frame, named “mf”, that keeps only the “Male” and “Female” rows and drops the rows marked “Both Sexes”.

library(ggplot2)
library(dplyr)  # provides filter(); base R's stats::filter is a different function
mf = filter(data, Sex == "Male" | Sex == "Female")
ggplot(mf)+ 
  aes(Year, Average.Life.Expectancy..Years., group = Sex, colour = Sex)+ 
  geom_point()+
  theme_minimal()+
  labs(title = "Life Expectancy of Male and Female (1900-2014)", x = "Year", y = "Average Life Expectancy (in Years)", colour = "Sex") 
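The scatter plot shows the two groups side by side, and a number can make the comparison more concrete. Below is a rough sketch that computes the female-minus-male gap per year. The data frame `toy_mf` and its values are made up for illustration and are not taken from the CDC file. With the real data, `mf` would probably first need to be restricted to a single race group so that each year has one row per sex.

```r
# Made-up illustrative values, NOT from the dataset
toy_mf <- data.frame(
  Year = c(2000, 2000, 2001, 2001),
  Sex = c("Male", "Female", "Male", "Female"),
  Average.Life.Expectancy..Years. = c(74, 79, 75, 80)
)

# One small table per sex, with a short column name for the value
male <- toy_mf[toy_mf$Sex == "Male", c("Year", "Average.Life.Expectancy..Years.")]
female <- toy_mf[toy_mf$Sex == "Female", c("Year", "Average.Life.Expectancy..Years.")]
names(male) <- c("Year", "male")
names(female) <- c("Year", "female")

# Line the two tables up by year and subtract
gap <- merge(female, male, by = "Year")
gap$difference <- gap$female - gap$male  # positive: females are expected to live longer

gap[, c("Year", "difference")]
```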

As expected, females have a higher life expectancy than males. Next, we visualize the life expectancy of both sexes combined.

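# Assumption: the original does not show how `both` was built; this keeps the combined rows
both = filter(data, Sex == "Both Sexes", Race == "All Races")
ggplot(both)+ 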
  aes(Year, Average.Life.Expectancy..Years.)+ 
  geom_line(colour = "cadetblue4", size = 2)+
  theme_minimal()+
  labs(title = "Life Expectancy Rate", subtitle = "During 1900-2014 in USA", x = "Year", y = "Average Life Expectancy (in Years)")
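A quick numeric check can complement the line chart. The sketch below finds the year with the largest year-over-year fall in a series. The data frame `toy_both` and its values are made up for illustration and are not taken from the CDC file; the column names mirror the ones used above.

```r
# Made-up illustrative values, NOT from the dataset.
# The rows are listed newest first, like the head(data) output above.
toy_both <- data.frame(
  Year = c(2004, 2003, 2002, 2001),
  Average.Life.Expectancy..Years. = c(53, 45, 52, 50)
)

# diff() compares each value with the previous one, so sort chronologically first
toy_both <- toy_both[order(toy_both$Year), ]
change <- diff(toy_both$Average.Life.Expectancy..Years.)

# change[i] is the move from year i to year i + 1, hence the "+ 1" below
worst <- which.min(change)
largest_fall <- data.frame(Year = toy_both$Year[worst + 1], change = change[worst])
largest_fall
```

Sorting matters here because the file lists the most recent year first; without it, the sign of every change would be flipped.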

Life expectancy has followed a rising trend, except around the World War I period, when it fell sharply. Next, we compare the life expectancy of two races: Black and White. As before, we create a separate data frame that contains only the “Black” and “White” rows.

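# Assumption: the original does not show how `bw` was built; this keeps the two races named above
bw = filter(data, Race == "Black" | Race == "White")
ggplot(bw)+ 
  aes(Year, Average.Life.Expectancy..Years., colour = Race)+ 
  geom_point()+ 
  theme_minimal()+ 
  labs(y = "Average Life Expectancy (in Years)", title = "Life Expectancy of Races", subtitle = "Black and White")+
  scale_color_brewer(palette = "Dark2")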

The graph makes it clear that life expectancy is higher for the White population than for the Black population.
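A per-group summary is one way to back up a visual impression like this with numbers. Here is a minimal sketch using base R's `aggregate()`. The data frame `toy_bw` and its values are made up for illustration and are not taken from the CDC file.

```r
# Made-up illustrative values, NOT from the dataset
toy_bw <- data.frame(
  Year = c(2001, 2001, 2002, 2002),
  Race = c("Black", "White", "Black", "White"),
  Average.Life.Expectancy..Years. = c(70, 76, 72, 78)
)

# Mean life expectancy per race over all years in the table
race_means <- aggregate(Average.Life.Expectancy..Years. ~ Race, data = toy_bw, FUN = mean)
race_means
```

A single mean hides how the gap changes over time, so it works best alongside the plot rather than as a replacement for it.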