Animated graphics are an engaging and effective way to illustrate how things change, usually over time. Any data scientist or aspiring data scientist should be able to deploy animations when they believe these are the best way to communicate or illustrate a phenomenon to a client.
Since learning how to create animated graphics earlier this year, I have found numerous situations where I have been able to use them to more effectively support an argument.
It’s surprisingly easy to create animated graphics in R, particularly if you have some familiarity with working in ggplot2.
library(tidyverse) # includes ggplot2
library(viridis) # optional - for nice colours
library(gganimate) # core animation package in R
library(wbstats) # connects to world bank and pulls statistical indicators
I also have the Oswald font family installed. If you want to use that you can download it here.
What is an animation and how is one created?
Why did Hans Rosling create his famous bubble chart?
What data did he use?
wbstats package
We are going to need some macroeconomic data to create this chart. A good source is the World Bank Open Data Site.
The wbstats package allows you to pull the data you need directly from this site into an R dataframe, by utilizing the API. Two functions from this package are useful to us:
## pull specific indicator between specific dates based on indicator ID
wb(country = "all", indicator, startdate, enddate, mrv, return_wide = FALSE,
gapfill, freq, cache, lang = c("en", "es", "fr", "ar", "zh"),
removeNA = TRUE, POSIXct = FALSE, include_dec = FALSE,
include_unit = FALSE, include_obsStatus = FALSE,
include_lastUpdated = FALSE)
## get latest information about country properties.
wbcountries(lang = c("en", "es", "fr", "ar", "zh"))
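We need three indicator IDs: NY.GDP.PCAP.CD (GDP per capita, current US$), SP.DYN.LE00.IN (life expectancy at birth) and SP.POP.TOTL (total population).
Since this is a time series animation, we will need a start year and an end year. Let's start at 1960 and end at 2017. Some data will be missing, but that’s not a problem.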
We also need it for all countries. Note that the World Bank also has data for regional and economic country groupings - we are not interested in those - we just want the pure countries.
rosling_data <- wbstats::wb(indicator = c("SP.DYN.LE00.IN", "NY.GDP.PCAP.CD", "SP.POP.TOTL"),
                            country = "countries_only", startdate = 1960, enddate = 2017)
head(rosling_data %>%
       dplyr::arrange(date, country), n = 3)
## iso3c date value indicatorID
## 1 AFG 1960 3.244600e+01 SP.DYN.LE00.IN
## 2 AFG 1960 5.977319e+01 NY.GDP.PCAP.CD
## 3 AFG 1960 8.996973e+06 SP.POP.TOTL
## indicator iso2c country
## 1 Life expectancy at birth, total (years) AF Afghanistan
## 2 GDP per capita (current US$) AF Afghanistan
## 3 Population, total AF Afghanistan
Looks like what we need, but we may want to assign countries to regions.
The wbcountries() function returns a bunch of information about countries - we only need the region from this. So we can grab that and join it to our datafile based on the iso3c country code.
rosling_data <- rosling_data %>%
  dplyr::left_join(wbstats::wbcountries() %>%
                     dplyr::select(iso3c, region))
## Joining, by = "iso3c"
## iso3c date value indicatorID
## 1 AFG 1960 3.244600e+01 SP.DYN.LE00.IN
## 2 AFG 1960 5.977319e+01 NY.GDP.PCAP.CD
## 3 AFG 1960 8.996973e+06 SP.POP.TOTL
## indicator iso2c country region
## 1 Life expectancy at birth, total (years) AF Afghanistan South Asia
## 2 GDP per capita (current US$) AF Afghanistan South Asia
## 3 Population, total AF Afghanistan South Asia
For ggplot2 we will need all the indicator data on a single row for each date and country.
rosling_data <- rosling_data %>%
  tidyr::pivot_wider(id_cols = c("date", "country", "region"), names_from = indicator, values_from = value)
head(rosling_data %>%
       dplyr::arrange(date, country), n = 3)
## # A tibble: 3 x 6
## date country region `Life expectanc… `GDP per capita… `Population, to…
## <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 1960 Afghani… South … 32.4 59.8 8996973
## 2 1960 Albania Europe… 62.3 NA 1608800
## 3 1960 Algeria Middle… 46.1 246. 11057863
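To see the reshaping step in isolation, without calling the World Bank API, here is a minimal sketch on a hand-built long table. The name toy_long is hypothetical, and its values are copied (as displayed, so partly rounded) from the 1960 output above. Albania deliberately has no GDP row, which is how a missing observation looks in the long format.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy table in the same long format that wb() returns:
# one row per country, date and indicator.
toy_long <- tibble::tribble(
  ~date,  ~country,      ~indicator,                                ~value,
  "1960", "Afghanistan", "Life expectancy at birth, total (years)", 32.446,
  "1960", "Afghanistan", "GDP per capita (current US$)",            59.77319,
  "1960", "Afghanistan", "Population, total",                       8996973,
  "1960", "Albania",     "Life expectancy at birth, total (years)", 62.3,
  "1960", "Albania",     "Population, total",                       1608800
)

# One row per date and country, one column per indicator.
# Combinations with no row in the long table are filled with NA.
toy_wide <- toy_long %>%
  tidyr::pivot_wider(id_cols = c("date", "country"),
                     names_from = indicator,
                     values_from = value)

print(toy_wide)
```

The result has one row per country, and the missing Albanian GDP value shows up as NA, matching what we see in the real data.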
Before we animate, we can work on the data for a single year to make sure that we get the static chart design the way we want it.
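The chart code further down uses a data frame called rosling_data_2010, so we need to pull out a single year first. A minimal sketch of that step, assuming the year is 2010 (inferred from the object name) and using a hypothetical stand-in table so it runs without the API:

```r
library(dplyr)

# Hypothetical stand-in for the wide rosling_data: only the columns
# needed to show the filter. Note that date is stored as character.
toy_data <- tibble::tibble(
  date    = c("1960", "2010", "2010"),
  country = c("Afghanistan", "Afghanistan", "Albania")
)

# Keep only the rows for the single year we want to chart
toy_2010 <- toy_data %>%
  dplyr::filter(date == "2010")

# With the real data the equivalent call would be:
# rosling_data_2010 <- rosling_data %>% dplyr::filter(date == "2010")
```

Comparing against the string "2010" rather than the number keeps the comparison explicit, since the date column comes back as character.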
The first thing we should always do in ggplot2 is set our aesthetics - which is to declare what elements of the data correspond to what properties of the chart.
We need to tell ggplot2 what kind of chart we want, and we basically want a scatter chart with the regions colour coded, which is geom_point(). Then we can render the chart for the first time to see what it looks like:
rosling_chart <- ggplot2::ggplot(rosling_data_2010, aes(x = log(`GDP per capita (current US$)`),
                                                        y = `Life expectancy at birth, total (years)`,
                                                        size = `Population, total`)) +
  ggplot2::geom_point(alpha = 0.5, aes(color = region)) +
  # guide = "none" hides the size legend (guide = FALSE is deprecated in recent ggplot2)
  ggplot2::scale_size(range = c(.1, 16), guide = "none") +
  ggplot2::theme_classic()
Much nicer, just a few more tweaks…
rosling_chart <- ggplot2::ggplot(rosling_data_2010, aes(x = log(`GDP per capita (current US$)`),
                                                        y = `Life expectancy at birth, total (years)`,
                                                        size = `Population, total`)) +
  ggplot2::geom_point(alpha = 0.5, aes(color = region)) +
  ggplot2::scale_size(range = c(.1, 16), guide = "none") +
  ggplot2::theme_classic() +
  viridis::scale_color_viridis(discrete = TRUE, name = "Region", option = "viridis") +
  ggplot2::labs(x = "Log GDP per capita",
                y = "Life expectancy at birth")
Nice! Now we need to animate…
transition_states and ease_aes
When we use the gganimate package, it expects a transition state variable, which is the variable it uses to move between static states. In this case our transition state is clearly date. So gganimate renders your design for each value of date, and then moves between them. The function transition_states() is used for this.
There are various options for how gganimate moves between the states. It can move at a steady pace, or it can vary its speed, for example starting slowly and finishing fast. The function ease_aes() is used to determine how to move in and out of the states. There are various options available to you for this, but I will use cubic-in-out, which starts slowly, speeds up through the middle and slows down again at the end of each transition.
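To get a feel for what an easing option does, here is a small base R sketch of the conventional cubic-in-out curve. It assumes that the 'cubic-in-out' option follows this standard easing formula. That is an assumption about the underlying implementation, and the function name cubic_in_out is just for illustration.

```r
# Conventional cubic-in-out easing (assumed, not taken from gganimate itself).
# t is the fraction of the transition that has elapsed (0 to 1);
# the result is the fraction of the distance covered at that point.
cubic_in_out <- function(t) {
  ifelse(t < 0.5,
         4 * t^3,                   # first half: accelerate
         1 - (-2 * t + 2)^3 / 2)    # second half: decelerate
}

t <- seq(0, 1, by = 0.25)
progress <- cubic_in_out(t)
print(data.frame(t = t, progress = progress))
```

A quarter of the way through the transition only about 6% of the movement has happened, while the middle half of the time covers most of the distance. A linear easing would return t unchanged.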
All we need to do to have an animated graphic is to add these two functions to our existing ggplot2 code. Remember we have to return to our original multi-year data set.
rosling_chart_anim <- ggplot2::ggplot(rosling_data, aes(x = log(`GDP per capita (current US$)`),
                                                        y = `Life expectancy at birth, total (years)`,
                                                        size = `Population, total`)) +
  ggplot2::geom_point(alpha = 0.5, aes(color = region)) +
  ggplot2::scale_size(range = c(.1, 16), guide = "none") +
  ggplot2::theme_classic() +
  viridis::scale_color_viridis(discrete = TRUE, name = "Region", option = "viridis") +
  ggplot2::labs(x = "Log GDP per capita",
                y = "Life expectancy at birth") +
  gganimate::transition_states(date, transition_length = 1, state_length = 1) +
  gganimate::ease_aes('cubic-in-out')
Looks promising, but we need to make some tweaks.
We want to tweak some of the appearance. The version below adds the year as a large light-grey label behind the points and fixes the axis limits so that they stay constant across frames. We can then save the animation as a .gif or a .mp4 if we prefer.
rosling_chart_anim <- ggplot2::ggplot(rosling_data, aes(x = log(`GDP per capita (current US$)`),
                                                        y = `Life expectancy at birth, total (years)`,
                                                        size = `Population, total`)) +
  ggplot2::geom_point(alpha = 0.5, aes(color = region)) +
  ggplot2::scale_size(range = c(.1, 16), guide = "none") +
  ggplot2::theme_classic() +
  viridis::scale_color_viridis(discrete = TRUE, name = "Region", option = "viridis") +
  ggplot2::labs(x = "Log GDP per capita",
                y = "Life expectancy at birth") +
  # year label in the background (needs the Oswald font installed)
  ggplot2::geom_text(aes(x = 7.5, y = 60, label = date), size = 14, color = 'lightgrey', family = 'Oswald') +
  # fixed axis limits so the scales do not jump between years
  ggplot2::scale_x_continuous(limits = c(2.5, 12.5)) +
  ggplot2::scale_y_continuous(limits = c(30, 90)) +
  gganimate::transition_states(date, transition_length = 1, state_length = 1) +
  gganimate::ease_aes('cubic-in-out')
# animate and save a gif (default if no renderer is explicitly called)
rosling_chart_gif <- gganimate::animate(rosling_chart_anim, nframes = 200, width = 800, height = 600)
gganimate::save_animation(rosling_chart_gif, "rosling.gif")
# animate and save as an mp4 (the ffmpeg renderer needs ffmpeg installed on your system)
rosling_chart_mp4 <- gganimate::animate(rosling_chart_anim, nframes = 200, width = 800, height = 600,
                                        renderer = gganimate::ffmpeg_renderer())
gganimate::save_animation(rosling_chart_mp4, "rosling.mp4")