gapminder example

Let’s recreate Hans Rosling’s famous visualization. If you have never seen it, check out this long video: https://www.ted.com/talks/hans_rosling_the_best_stats_you_ve_ever_seen

However, for the sake of brevity, watch this video:

https://www.youtube.com/embed/Z8t4k0Q8e8Y

Install gapminder package for this example.

install.packages(‘gapminder’)

Get the data from gapminder

data(gapminder, package = "gapminder")

Next, let’s create a static ggplot2 visualization. Note that frame and ids are the aesthetics that we have never used before. This is because they don’t belong to ggplot2. They come from plotly. So, ggplot2 will simply ignore them.

Usually we map a time variable to frame and cross-sectional id variable to ids.

gg <- ggplot(gapminder, 
             aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point(aes(size = pop, frame = year, ids = country)) +
  scale_x_log10() +
  labs(x = 'GDP per Capital',
       y = 'Life Expectancy',
       color = 'Continent') +
  scale_color_manual(values = wesanderson::wes_palette("Moonrise3")) +
  theme_minimal() 
Warning: Ignoring unknown aesthetics: frame and ids

Now is the time for some plotly magic!

ggplotly(gg)

Using plotly’s own functionality

Above, we used ggplot2 to build the graphics and then just modified it using plotly. However, plotly is a powerful library with tons of functionality of its own. Check out this documentation - https://plotly.com/r/getting-started/

Let’s create a base plot:

base <- gapminder %>%
  plot_ly(x = ~ gdpPercap,
          y = ~ lifeExp,
          size = ~ pop,
          color = ~ continent,
          text = ~ country,
          hoverinfo = "text") %>%
  layout(xaxis = list(type = "log"))

Next, modify the base plot and display it:

base %>%
  add_markers(frame = ~ year,
              ids = ~ country) %>%
  animation_opts(1000, 
                 easing = "elastic-in-out", 
                 redraw = FALSE
                 ) %>%
  animation_button( x = 1,
                    xanchor = "right",
                    y = 0,
                    yanchor = "bottom"
                    ) %>%
  animation_slider(currentvalue = list(prefix = "Year: ",
                                       font = list(color = "red")
                                       )
                   )
NA

Cumulative animation

We will now create a progressing time series of Amazon Inc market value. You can call it a worm plot!

Read Amazon’s monthly stock returns

Please download this file from here: https://github.com/ashgreat/DA6233

And save it in a subdirectory “Data” in your project folder.

amzn <- read_csv(here::here('Data', 'amzn_2005_2020.csv')) %>% 
  mutate(mktval = PRC*SHROUT*1000,
         date2 = as.Date(as.character(date), format = '%Y%m%d')) %>% 
  filter(!is.na(mktval)) %>% 
  mutate(id = row_number())
Rows: 192 Columns: 64── Column specification ──────────────────────────────────────────────
Delimiter: ","
chr (10): NCUSIP, TICKER, COMNAM, TSYMBOL, PRIMEXCH, TRDSTAT, SECS...
dbl (39): PERMNO, date, NAMEENDT, SHRCD, EXCHCD, SICCD, NAICS, PER...
lgl (15): SHRCLS, DCLRDT, DLPDT, NEXTDT, PAYDT, RCRDDT, HSICMG, HS...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(amzn)
NA

In order to create a worm, plotly needs to create multiple data sets capturing the progress. First data set will effectively have only one observation for the first month. The second data set will have 2 observations, and so on. These are all stacked on top of each other to get a large data set.

accumulate_by <- function(dat, var) {
  var <- lazyeval::f_eval(var, dat)
  lvls <- plotly:::getLevels(var)
  dats <- lapply(seq_along(lvls), function(x) {
    cbind(dat[var %in% lvls[seq(1, x)], ], frame = lvls[[x]])
  })
  dplyr::bind_rows(dats)
}

amzn2 <- amzn %>% accumulate_by(~id)
head(amzn2)
NA

Finally, the worm plot!

p <- amzn2 %>%
  plot_ly(
    x = ~ id, 
    y = ~ mktval,
    frame = ~frame, 
    type = 'scatter',
    mode = 'lines'
  ) %>% 
  layout(
    xaxis = list(
      title = "Date",
      zeroline = F
    ),
    yaxis = list(
      title = "Market Value",
      zeroline = F
    )
  ) %>%
  animation_opts(
    frame = 10,
    transition = 0,
    redraw = FALSE
  ) %>%
  animation_slider(
    hide = T
  ) %>%
  animation_button(
    x = 1,
    xanchor = "right",
    y = 0,
    yanchor = "bottom"
  )

p
LS0tCnRpdGxlOiAiUiBOb3RlYm9vayIKb3V0cHV0OiBodG1sX25vdGVib29rCi0tLQoKIyMgZ2FwbWluZGVyIGV4YW1wbGUKCkxldCdzIHJlY3JlYXRlIEhhbnMgUm9zbGluZydzIGZhbW91cyB2aXN1YWxpemF0aW9uLiBJZiB5b3UgaGF2ZSBuZXZlciBzZWVuIGl0LCBjaGVjayBvdXQgdGhpcyBsb25nIHZpZGVvOiA8aHR0cHM6Ly93d3cudGVkLmNvbS90YWxrcy9oYW5zX3Jvc2xpbmdfdGhlX2Jlc3Rfc3RhdHNfeW91X3ZlX2V2ZXJfc2Vlbj4KCkhvd2V2ZXIsIGZvciB0aGUgc2FrZSBvZiBicmV2aXR5LCB3YXRjaCB0aGlzIHZpZGVvOgoKaHR0cHM6Ly93d3cueW91dHViZS5jb20vZW1iZWQvWjh0NGswUThlOFkKCkluc3RhbGwgYGdhcG1pbmRlcmAgcGFja2FnZSBmb3IgdGhpcyBleGFtcGxlLgoKaW5zdGFsbC5wYWNrYWdlcygnZ2FwbWluZGVyJykKCmBgYHtyIHNldHVwLCBpbmNsdWRlPUZBTFNFfQoKcGFjbWFuOjpwX2xvYWQocGxvdGx5LCB0aWR5dmVyc2UsIGdhcG1pbmRlciwgaGVyZSkKCmBgYAoKR2V0IHRoZSBkYXRhIGZyb20gYGdhcG1pbmRlcmAKCmBgYHtyfQpkYXRhKGdhcG1pbmRlciwgcGFja2FnZSA9ICJnYXBtaW5kZXIiKQpgYGAKCk5leHQsIGxldCdzIGNyZWF0ZSBhIHN0YXRpYyBgZ2dwbG90MmAgdmlzdWFsaXphdGlvbi4gTm90ZSB0aGF0IGBmcmFtZWAgYW5kIGBpZHNgIGFyZSB0aGUgYWVzdGhldGljcyB0aGF0IHdlIGhhdmUgbmV2ZXIgdXNlZCBiZWZvcmUuIFRoaXMgaXMgYmVjYXVzZSB0aGV5IGRvbid0IGJlbG9uZyB0byBgZ2dwbG90MmAuIFRoZXkgY29tZSBmcm9tIGBwbG90bHlgLiBTbywgYGdncGxvdDJgIHdpbGwgc2ltcGx5IGlnbm9yZSB0aGVtLgoKVXN1YWxseSB3ZSBtYXAgYSB0aW1lIHZhcmlhYmxlIHRvIGBmcmFtZWAgYW5kIGNyb3NzLXNlY3Rpb25hbCBpZCB2YXJpYWJsZSB0byBgaWRzYC4KCmBgYHtyfQpnZyA8LSBnZ3Bsb3QoZ2FwbWluZGVyLCAKICAgICAgICAgICAgIGFlcyh4ID0gZ2RwUGVyY2FwLCB5ID0gbGlmZUV4cCwgY29sb3IgPSBjb250aW5lbnQpKSArCiAgZ2VvbV9wb2ludChhZXMoc2l6ZSA9IHBvcCwgZnJhbWUgPSB5ZWFyLCBpZHMgPSBjb3VudHJ5KSkgKwogIHNjYWxlX3hfbG9nMTAoKSArCiAgbGFicyh4ID0gJ0dEUCBwZXIgQ2FwaXRhbCcsCiAgICAgICB5ID0gJ0xpZmUgRXhwZWN0YW5jeScsCiAgICAgICBjb2xvciA9ICdDb250aW5lbnQnKSArCiAgc2NhbGVfY29sb3JfbWFudWFsKHZhbHVlcyA9IHdlc2FuZGVyc29uOjp3ZXNfcGFsZXR0ZSgiTW9vbnJpc2UzIikpICsKICB0aGVtZV9taW5pbWFsKCkgCmBgYAoKTm93IGlzIHRoZSB0aW1lIGZvciBzb21lIGBwbG90bHlgIG1hZ2ljIQoKYGBge3J9CmdncGxvdGx5KGdnKQpgYGAKCiMjIFVzaW5nIGBwbG90bHlgJ3Mgb3duIGZ1bmN0aW9uYWxpdHkKCkFib3ZlLCB3ZSB1c2VkIGBnZ3Bsb3QyYCB0byBidWlsZCB0aGUgZ3JhcGhpY3MgYW5kIHRoZW4ganVzdCBtb2RpZmllZCBpdCB1c2luZyBgcGxvdGx5YC4gSG93ZXZlciwgYHBsb3RseWAgaXMgYSBwb3dlcmZ1bCBsaWJyYXJ5IHdpdGggdG9ucyBvZiBmdW5jdGlvbmFsaXR5IG9mIGl0cyBvd24uIENoZWNrIG91dCB0aGlzIGRvY3VtZW50YXRpb24gLSA8aHR0cHM6Ly9wbG90bHkuY29tL3IvZ2V0dGluZy1zdGFydGVkLz4KCkxldCdzIGNyZWF0ZSBhIGJhc2UgcGxvdDoKCmBgYHtyfQpiYXNlIDwtIGdhcG1pbmRlciAlPiUKICBwbG90X2x5KHggPSB+IGdkcFBlcmNhcCwKICAgICAgICAgIHkgPSB+IGxpZmVFeHAsCiAgICAgICAgICBzaXplID0gfiBwb3AsCiAgICAgICAgICBjb2xvciA9IH4gY29udGluZW50LAogICAgICAgICAgdGV4dCA9IH4gY291bnRyeSwKICAgICAgICAgIGhvdmVyaW5mbyA9ICJ0ZXh0IikgJT4lCiAgbGF5b3V0KHhheGlzID0gbGlzdCh0eXBlID0gImxvZyIpKQpgYGAKCk5leHQsIG1vZGlmeSB0aGUgYmFzZSBwbG90IGFuZCBkaXNwbGF5IGl0OgoKYGBge3IgbWVzc2FnZT1GQUxTRSwgd2FybmluZz1GQUxTRX0KYmFzZSAlPiUKICBhZGRfbWFya2VycyhmcmFtZSA9IH4geWVhciwKICAgICAgICAgICAgICBpZHMgPSB+IGNvdW50cnkpICU+JQogIGFuaW1hdGlvbl9vcHRzKDEwMDAsIAogICAgICAgICAgICAgICAgIGVhc2luZyA9ICJlbGFzdGljLWluLW91dCIsIAogICAgICAgICAgICAgICAgIHJlZHJhdyA9IEZBTFNFCiAgICAgICAgICAgICAgICAgKSAlPiUKICBhbmltYXRpb25fYnV0dG9uKCB4ID0gMSwKICAgICAgICAgICAgICAgICAgICB4YW5jaG9yID0gInJpZ2h0IiwKICAgICAgICAgICAgICAgICAgICB5ID0gMCwKICAgICAgICAgICAgICAgICAgICB5YW5jaG9yID0gImJvdHRvbSIKICAgICAgICAgICAgICAgICAgICApICU+JQogIGFuaW1hdGlvbl9zbGlkZXIoY3VycmVudHZhbHVlID0gbGlzdChwcmVmaXggPSAiWWVhcjogIiwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgZm9udCA9IGxpc3QoY29sb3IgPSAicmVkIikKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgKQogICAgICAgICAgICAgICAgICAgKQoKYGBgCgojIyBDdW11bGF0aXZlIGFuaW1hdGlvbgoKV2Ugd2lsbCBub3cgY3JlYXRlIGEgcHJvZ3Jlc3NpbmcgdGltZSBzZXJpZXMgb2YgQW1hem9uIEluYyBtYXJrZXQgdmFsdWUuIFlvdSBjYW4gY2FsbCBpdCBhIHdvcm0gcGxvdCEKClJlYWQgQW1hem9uJ3MgbW9udGhseSBzdG9jayByZXR1cm5zCgpQbGVhc2UgZG93bmxvYWQgdGhpcyBmaWxlIGZyb20gaGVyZTogPGh0dHBzOi8vZ2l0aHViLmNvbS9hc2hncmVhdC9EQTYyMzM+CgpBbmQgc2F2ZSBpdCBpbiBhIHN1YmRpcmVjdG9yeSAiRGF0YSIgaW4geW91ciBwcm9qZWN0IGZvbGRlci4KCmBgYHtyfQphbXpuIDwtIHJlYWRfY3N2KGhlcmU6OmhlcmUoJ0RhdGEnLCAnYW16bl8yMDA1XzIwMjAuY3N2JykpICU+JSAKICBtdXRhdGUobWt0dmFsID0gUFJDKlNIUk9VVCoxMDAwLAogICAgICAgICBkYXRlMiA9IGFzLkRhdGUoYXMuY2hhcmFjdGVyKGRhdGUpLCBmb3JtYXQgPSAnJVklbSVkJykpICU+JSAKICBmaWx0ZXIoIWlzLm5hKG1rdHZhbCkpICU+JSAKICBtdXRhdGUoaWQgPSByb3dfbnVtYmVyKCkpCgpoZWFkKGFtem4pCgpgYGAKCkluIG9yZGVyIHRvIGNyZWF0ZSBhIHdvcm0sIGBwbG90bHlgIG5lZWRzIHRvIGNyZWF0ZSBtdWx0aXBsZSBkYXRhIHNldHMgY2FwdHVyaW5nIHRoZSBwcm9ncmVzcy4gRmlyc3QgZGF0YSBzZXQgd2lsbCBlZmZlY3RpdmVseSBoYXZlIG9ubHkgb25lIG9ic2VydmF0aW9uIGZvciB0aGUgZmlyc3QgbW9udGguIFRoZSBzZWNvbmQgZGF0YSBzZXQgd2lsbCBoYXZlIDIgb2JzZXJ2YXRpb25zLCBhbmQgc28gb24uIFRoZXNlIGFyZSBhbGwgc3RhY2tlZCBvbiB0b3Agb2YgZWFjaCBvdGhlciB0byBnZXQgYSBsYXJnZSBkYXRhIHNldC4KCmBgYHtyfQphY2N1bXVsYXRlX2J5IDwtIGZ1bmN0aW9uKGRhdCwgdmFyKSB7CiAgdmFyIDwtIGxhenlldmFsOjpmX2V2YWwodmFyLCBkYXQpCiAgbHZscyA8LSBwbG90bHk6OjpnZXRMZXZlbHModmFyKQogIGRhdHMgPC0gbGFwcGx5KHNlcV9hbG9uZyhsdmxzKSwgZnVuY3Rpb24oeCkgewogICAgY2JpbmQoZGF0W3ZhciAlaW4lIGx2bHNbc2VxKDEsIHgpXSwgXSwgZnJhbWUgPSBsdmxzW1t4XV0pCiAgfSkKICBkcGx5cjo6YmluZF9yb3dzKGRhdHMpCn0KCmFtem4yIDwtIGFtem4gJT4lIGFjY3VtdWxhdGVfYnkofmlkKQpoZWFkKGFtem4yKQoKYGBgCgpGaW5hbGx5LCB0aGUgd29ybSBwbG90IQoKYGBge3J9CnAgPC0gYW16bjIgJT4lCiAgcGxvdF9seSgKICAgIHggPSB+IGlkLCAKICAgIHkgPSB+IG1rdHZhbCwKICAgIGZyYW1lID0gfmZyYW1lLCAKICAgIHR5cGUgPSAnc2NhdHRlcicsCiAgICBtb2RlID0gJ2xpbmVzJwogICkgJT4lIAogIGxheW91dCgKICAgIHhheGlzID0gbGlzdCgKICAgICAgdGl0bGUgPSAiRGF0ZSIsCiAgICAgIHplcm9saW5lID0gRgogICAgKSwKICAgIHlheGlzID0gbGlzdCgKICAgICAgdGl0bGUgPSAiTWFya2V0IFZhbHVlIiwKICAgICAgemVyb2xpbmUgPSBGCiAgICApCiAgKSAlPiUKICBhbmltYXRpb25fb3B0cygKICAgIGZyYW1lID0gMTAsCiAgICB0cmFuc2l0aW9uID0gMCwKICAgIHJlZHJhdyA9IEZBTFNFCiAgKSAlPiUKICBhbmltYXRpb25fc2xpZGVyKAogICAgaGlkZSA9IFQKICApICU+JQogIGFuaW1hdGlvbl9idXR0b24oCiAgICB4ID0gMSwKICAgIHhhbmNob3IgPSAicmlnaHQiLAogICAgeSA9IDAsCiAgICB5YW5jaG9yID0gImJvdHRvbSIKICApCgpwCmBgYAo=