Gapminder Dataset

In this session we build dynamic graphs with the Gapminder dataset. These data are discussed in Hans Rosling's TED Talk (start at 2:30), available on YouTube. We use the dataset to compare fertility between countries.

The dataset can be downloaded as a CSV file here: Gapminder Dataset

Load the Data

library(tidyverse)
fert <- read_csv("gapminder.csv")

head(fert)
## # A tibble: 6 x 6
##   Country      year  life  fert   pop continent
##   <chr>       <dbl> <dbl> <dbl> <dbl> <chr>    
## 1 Afghanistan  1962  33.0  7.67   9.3 Asia     
## 2 Afghanistan  1963  33.5  7.67   9.5 Asia     
## 3 Afghanistan  1964  34.1  7.67   9.7 Asia     
## 4 Afghanistan  1965  34.6  7.67   9.9 Asia     
## 5 Afghanistan  1966  35.1  7.67  10.1 Asia     
## 6 Afghanistan  1967  35.7  7.67  10.4 Asia
str(fert)
## spec_tbl_df [7,506 x 6] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ Country  : chr [1:7506] "Afghanistan" "Afghanistan" "Afghanistan" "Afghanistan" ...
##  $ year     : num [1:7506] 1962 1963 1964 1965 1966 ...
##  $ life     : num [1:7506] 33 33.5 34.1 34.6 35.1 ...
##  $ fert     : num [1:7506] 7.67 7.67 7.67 7.67 7.67 7.67 7.67 7.67 7.67 7.67 ...
##  $ pop      : num [1:7506] 9.3 9.5 9.7 9.9 10.1 10.4 10.6 10.8 11.1 11.4 ...
##  $ continent: chr [1:7506] "Asia" "Asia" "Asia" "Asia" ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   Country = col_character(),
##   ..   year = col_double(),
##   ..   life = col_double(),
##   ..   fert = col_double(),
##   ..   pop = col_double(),
##   ..   continent = col_character()
##   .. )
##  - attr(*, "problems")=<externalptr>
summary(fert)
##    Country               year           life            fert      
##  Length:7506        Min.   :1962   Min.   :13.20   Min.   :0.000  
##  Class :character   1st Qu.:1975   1st Qu.:55.00   1st Qu.:2.240  
##  Mode  :character   Median :1988   Median :66.70   Median :4.080  
##                     Mean   :1988   Mean   :64.13   Mean   :4.238  
##                     3rd Qu.:2002   3rd Qu.:73.77   3rd Qu.:6.200  
##                     Max.   :2015   Max.   :83.73   Max.   :8.450  
##       pop           continent        
##  Min.   :   0.00   Length:7506       
##  1st Qu.:   3.20   Class :character  
##  Median :   8.15   Mode  :character  
##  Mean   :  34.58                     
##  3rd Qu.:  22.30                     
##  Max.   :1376.00
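
The summary reports a minimum of 0 for both fert and pop. Because pop is given in millions with one decimal, a zero there may simply be a rounded small population, while a fertility rate of exactly 0 looks more like a placeholder for missing data. That reading is an assumption, not something the source states. The sketch below shows one way such rows could be counted and set aside before plotting. It uses base R only and a small made-up data frame (`toy`, hypothetical values) with the same column names as the real file.

```r
# Hypothetical stand-in for `fert`; all values are invented for illustration.
toy <- data.frame(
  Country   = c("A", "A", "B", "B", "C", "C"),
  year      = c(2014, 2015, 2014, 2015, 2014, 2015),
  life      = c(60.1, 60.5, 75.2, 75.4, 81.0, 81.3),
  fert      = c(5.10, 5.00, 0.00, 2.10, 1.80, 1.80),
  pop       = c(12.3, 12.6, 0.0, 0.0, 64.0, 64.4),
  continent = c("Africa", "Africa", "Oceania", "Oceania", "Europe", "Europe")
)

# Count how many zeros each column of interest contains.
zero_counts <- colSums(toy[, c("fert", "pop")] == 0)
print(zero_counts)

# Assumption: a fertility rate of 0 means "missing", so recode it as NA
# and keep only the rows where fert is available.
toy_clean <- toy
toy_clean$fert[toy_clean$fert == 0] <- NA
toy_clean <- toy_clean[!is.na(toy_clean$fert), ]
```

On the real data the same lines would run with `fert` in place of `toy`. Whether to drop, keep, or impute those rows depends on how the zeros arose.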

The Fertility Rate

library(dplyr)
library(ggplot2)

# Bubble chart for 2015: fertility rate vs. life expectancy,
# with point size showing population and colour showing continent
p <- fert %>%
  filter(year == 2015) %>%
  ggplot(aes(fert, life, size = pop, color = continent)) +
  labs(x = "Fertility Rate", y = "Life expectancy at birth (years)",
       caption = "(Based on data from Hans Rosling - gapminder.com)",
       color = "Continent", size = "Population (millions)") +
  ylim(30, 100) +
  geom_point()

p
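
A note on ylim(30, 100): it sets scale limits, so any observation outside that range is turned into a missing value and removed from the plot with a warning. The summary shows a minimum life expectancy of 13.20, so this matters once all years are plotted. The sketch below, on a small invented data frame (`toy`, hypothetical values), counts the rows that would be affected and contrasts ylim() with coord_cartesian(), which only zooms the view and keeps every row.

```r
library(ggplot2)

# Hypothetical values chosen so that two rows fall below the lower limit of 30.
toy <- data.frame(
  fert = c(7.2, 6.8, 5.5, 2.1, 1.8),
  life = c(13.2, 28.0, 45.0, 72.0, 81.0)
)

# Rows outside the limits are the ones ylim(30, 100) would discard.
n_dropped <- sum(toy$life < 30 | toy$life > 100)
print(n_dropped)

# ylim() sets scale limits: out-of-range points become NA and are removed.
p_ylim <- ggplot(toy, aes(fert, life)) +
  geom_point() +
  ylim(30, 100)

# coord_cartesian() zooms the visible area without discarding any data.
p_zoom <- ggplot(toy, aes(fert, life)) +
  geom_point() +
  coord_cartesian(ylim = c(30, 100))
```

For a plain scatter plot the two versions look the same inside the window. The difference shows up in the warning and in any layer that computes statistics from the data, such as a smoother.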

Dynamic graph

The plotly package can turn a ggplot into an interactive, animated graph. The frame aesthetic assigns each year to one animation frame, and ggplotly() converts the finished plot.

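#install.packages("plotly")
library(plotly)
library(ggplot2)

# frame = year gives one animation frame per year once the plot is converted
p <- fert %>%
  ggplot(aes(x = fert, y = life, size = pop, color = continent, frame = year)) +
  labs(x = "Fertility Rate", y = "Life expectancy at birth (years)",
       caption = "(Based on data from Hans Rosling - gapminder.com)",
       color = "Continent", size = "Population (millions)") +
  ylim(30, 100) +
  # text is not a ggplot2 aesthetic (ggplot2 may warn about it);
  # ggplotly() uses it for the hover tooltip
  geom_point(aes(text = Country))

# Convert the ggplot into an interactive plotly widget
ggplotly(p)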
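
The playback of a ggplotly() animation can be tuned after conversion. The sketch below is a minimal example on invented data (`toy`, hypothetical values), not the tutorial's own settings. It restricts the tooltip to the text aesthetic, slows the frames with plotly's animation_opts(), and relabels the slider with animation_slider(). The timing values of 800 ms and 400 ms are arbitrary choices for illustration.

```r
library(ggplot2)
library(plotly)

# Hypothetical stand-in data: two countries over three years.
toy <- data.frame(
  Country = rep(c("A", "B"), times = 3),
  year    = rep(c(2013, 2014, 2015), each = 2),
  fert    = c(5.2, 2.0, 5.1, 1.9, 5.0, 1.9),
  life    = c(60, 78, 61, 78, 62, 79)
)

# Same pattern as above: frame drives the animation, text feeds the tooltip.
p_toy <- ggplot(toy, aes(fert, life, frame = year)) +
  geom_point(aes(text = Country))

# Show only the country name on hover.
fig <- ggplotly(p_toy, tooltip = "text")

# frame: milliseconds per frame; transition: milliseconds spent moving
# between frames.
fig <- animation_opts(fig, frame = 800, transition = 400, easing = "linear")

# Prefix the slider's current value with a readable label.
fig <- animation_slider(fig, currentvalue = list(prefix = "Year: "))

fig
```

The same three calls could be applied to the result of ggplotly(p) from the full dataset.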

Using gganimate

#install.packages("gganimate")
library(gganimate)
#install.packages("gifski")
library(gifski)

# transition_time() drives the animation, so no frame aesthetic is needed here
p1 <- ggplot(fert, aes(fert, life, size = pop, color = continent)) +
  labs(x = "Fertility Rate", y = "Life expectancy at birth (years)",
       caption = "(Based on data from Hans Rosling - gapminder.com)",
       color = "Continent", size = "Population (millions)") +
  ylim(30, 100) +
  geom_point() +
  #ggtitle("Year: {frame_time}") +
  transition_time(year) +   # animate over year, interpolating between years
  ease_aes("linear") +      # constant-speed interpolation
  enter_fade() +
  exit_fade()

# Render as a GIF at 4 frames per second
animate(p1, fps = 4, width = 600, height = 400, renderer = gifski_renderer())
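
To keep the rendered animation, the result of animate() can be written to disk with gganimate's anim_save(). The sketch below does this on invented data (`toy`, hypothetical values) with a deliberately small frame count so that it renders quickly. The group = Country mapping and the rounded year in the title are assumptions added for illustration and are not part of the plot above.

```r
library(ggplot2)
library(gganimate)
library(gifski)

# Hypothetical stand-in data: two countries over three years.
toy <- data.frame(
  Country = rep(c("A", "B"), times = 3),
  year    = rep(c(2000, 2001, 2002), each = 2),
  fert    = c(5.0, 2.0, 4.8, 1.9, 4.6, 1.9),
  life    = c(55, 75, 56, 76, 57, 76)
)

# group = Country keeps each country's point tracked across frames.
# frame_time is interpolated between years, so it is rounded for display.
p_toy <- ggplot(toy, aes(fert, life, group = Country)) +
  geom_point() +
  labs(title = "Year: {round(frame_time)}") +
  transition_time(year)

# Render a short GIF; nframes is kept small only to make the example fast.
anim <- animate(p_toy, nframes = 10, fps = 4, width = 300, height = 200,
                renderer = gifski_renderer())

# Write the GIF to a temporary location (the file name is arbitrary).
out_file <- file.path(tempdir(), "toy_animation.gif")
anim_save(out_file, animation = anim)
```

For the full plot, the same pattern would store the output of animate(p1, ...) in a variable and pass it to anim_save() with a file name of your choice.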