Plotly (Interactive Visualizations)

Introduction and Overview

Plotly is an R package for creating interactive, web-based graphics for HTML output. Similar to DT for tables, Plotly relies on JavaScript on the back end but offers a relatively simple syntax for creating nice-looking, interactive graphics. Some of Plotly's features are:

  • Hover over points for more info
  • Click on legend to show/hide certain categories
  • Easily download as png or as a stand-alone html file
  • Zoom into specific areas of the chart
  • Overwhelming options for different visualizations

Before getting into the details, have a look at the Plotly documentation and you’ll see just how many types of visualizations, and options within those visualizations, Plotly offers.

Plotly arguments

Plotly follows a pretty intuitive syntax. You can create a simple chart by just specifying the dataset you want to plot, the data for the x and y axes, and the type of chart you want.
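As a minimal sketch of that pattern, the call below names a dataset, the x and y columns, and the chart type. The built-in mtcars data is only a stand-in chosen for illustration; it is not the data used later in this section.

```r
library(plotly)

# mtcars ships with R and stands in for your own dataset (an assumption for illustration)
p <- plot_ly(
  data = mtcars,
  x = ~wt,           # column for the x axis (note the ~ before the column name)
  y = ~mpg,          # column for the y axis
  type = "scatter",  # chart type
  mode = "markers"   # draw points rather than lines
)
p
```

The `~` tells Plotly to look the name up as a column in the dataset rather than as a separate object.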

Beyond these basics, Plotly has many more functions and options. A couple of them for our purposes are:

Add plot layers

  • Different types of plots overlaid
  • Works well when table is in “wide” format

%>% add_trace(y = ~trace_1, name = 'trace 1', mode = 'lines+markers')

  • variables for x and/or y axis
  • name for legend
  • type of plot/marker
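The sketch below shows add_trace() on a small wide table, where each series sits in its own column. The data frame `wide` and its column names are hypothetical and exist only to make the example self-contained.

```r
library(plotly)

# Hypothetical wide table: one row per year, one column per series
wide <- data.frame(
  year    = 2015:2019,
  trace_0 = c(10, 12, 15, 14, 18),
  trace_1 = c(5, 7, 6, 9, 11)
)

# The first series is defined in plot_ly(); each add_trace() overlays another column
p <- plot_ly(wide, x = ~year, y = ~trace_0, name = "trace 0",
             type = "scatter", mode = "lines") %>%
  add_trace(y = ~trace_1, name = "trace 1", mode = "lines+markers")
p
```

Each added trace inherits the data and x variable from plot_ly(), so only the arguments that differ need to be given.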

Combine multiple plots

  • Similar to facet within ggplot

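subplot(plot1, plot2, nrows = , shareX = , shareY = )

  • plots to combine, passed individually or as a list
  • nrows sets the number of rows in the plot grid; the number of columns follows from the number of plots
  • shareX/shareY to indicate axes that are shared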
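A short sketch of combining two plots, again using the built-in mtcars data as a stand-in; the object names `p1`, `p2`, and `stacked` are illustrative only.

```r
library(plotly)

# Two independent plots that share the same x variable
p1 <- plot_ly(mtcars, x = ~wt, y = ~mpg, name = "mpg",
              type = "scatter", mode = "markers")
p2 <- plot_ly(mtcars, x = ~wt, y = ~hp, name = "hp",
              type = "scatter", mode = "markers")

# Stack them in two rows and let both panels use one x axis
stacked <- subplot(p1, p2, nrows = 2, shareX = TRUE)
stacked
```

With shareX = TRUE, zooming along the x axis in one panel should move the other panel as well.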

Plotly Layout

There are also many layout options, covering everything from whether to include a legend and where to put it to chart titles, background colors, hover behaviors, and more. Most often you can stick to the options below or those presented in the examples, but you can consult the Plotly documentation if you want to branch out.

# plot_ly() %>%
#   layout( title="",
#           showlegend=TRUE/FALSE,
#           legend = list(title=list(text='Legend Title'), font = list(size = 12), orientation = 'h', x = 0.1, y = 0.9),
#           yaxis=list(title=""),
#           xaxis=list(title=""),
#           paper_bgcolor="",
#           plot_bgcolor="",
#           autosize = FALSE,
#           width = <number of pixels>,
#           height = <number of pixels>,
#           ...and so many more...
#         )
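To make the template concrete, here is a runnable sketch that fills in a handful of those options. The dataset (mtcars), titles, and colors are all illustrative choices, not values from this section's data.

```r
library(plotly)

p <- plot_ly(mtcars, x = ~wt, y = ~mpg, color = ~factor(cyl),
             colors = c("darkorange", "darkcyan", "darkslateblue"),
             type = "scatter", mode = "markers") %>%
  layout(
    title = "Fuel economy by weight",
    showlegend = TRUE,
    # horizontal legend placed below the plotting area
    legend = list(title = list(text = "Cylinders"),
                  orientation = "h", x = 0.1, y = -0.2),
    xaxis = list(title = "Weight (1000 lbs)"),
    yaxis = list(title = "Miles per gallon"),
    paper_bgcolor = "white",      # area around the plot
    plot_bgcolor = "whitesmoke"   # area inside the axes
  )
p
```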

Plotly Examples

So let’s give this a try.

slice_head(std_gg1, n = 10, by = "disease_f") %>% gt()

county      disease_f  cases  year  sex
California  Chlamydia  75941  2001  Female
California  Chlamydia  24885  2001  Male
California  Chlamydia  81584  2002  Female
California  Chlamydia  28521  2002  Male
California  Chlamydia  85153  2003  Female
California  Chlamydia  31007  2003  Male
California  Chlamydia  89438  2004  Female
California  Chlamydia  33652  2004  Male
California  Chlamydia  92362  2005  Female
California  Chlamydia  36227  2005  Male

#simple barchart
plot_ly(
  std_gg1,
  x= ~year,
  y= ~cases,
  color= ~sex,
  type="bar"
  # mode="bar"
) 
#bar chart - side by side vs stacked
plot_ly(
  std_gg1,
  x= ~year,
  y= ~cases,
  color= ~sex,
  type="bar"
) %>%
  layout(barmode="stack")
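The two bar charts above differ only in barmode. As a small sketch of that switch, the example below builds one base chart and applies both modes to it. The data frame `toy` is hypothetical and far smaller than the STD data; "group" is Plotly's default side-by-side arrangement.

```r
library(plotly)

# Hypothetical counts, one row per year and sex (not the STD data)
toy <- data.frame(
  year  = rep(2001:2003, each = 2),
  sex   = rep(c("Female", "Male"), times = 3),
  cases = c(30, 10, 35, 12, 40, 15)
)

base <- plot_ly(toy, x = ~year, y = ~cases, color = ~sex,
                colors = c("darkorange", "darkcyan"), type = "bar")

grouped <- base %>% layout(barmode = "group")  # bars side by side
stacked <- base %>% layout(barmode = "stack")  # bars stacked on top of each other
stacked
```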
std_gg2a <- std_gg %>%
  filter(county != "California" & year=="2018" & sex=="Total") %>%
  select(county, disease_f, rate)

slice_head(std_gg2a, n = 5, by = disease_f) |> 
  gt(auto_align = TRUE) |>
  tab_header(
    title = "California STD Case Rates",
    subtitle = "First 5 Counties for each disease"
  )

California STD Case Rates
First 5 Counties for each disease

county     disease_f       rate
Alameda    Chlamydia       580.6
Alpine     Chlamydia       993.4
Amador     Chlamydia       157.4
Butte      Chlamydia       562.2
Calaveras  Chlamydia       262.3
Alameda    Early Syphilis  26.3
Alpine     Early Syphilis  0.0
Amador     Early Syphilis  10.0
Butte      Early Syphilis  43.9
Calaveras  Early Syphilis  6.6
Alameda    Gonorrhea       225.2
Alpine     Gonorrhea       82.8
Amador     Gonorrhea       42.5
Butte      Gonorrhea       180.3
Calaveras  Gonorrhea       88.2

#boxplot
plot_ly(
  std_gg2a,
  y=~rate,
  color=~disease_f,
  type="box"
)
#trend over time in Alameda County (confidence interval columns dropped)
std_gg3 <- std_gg %>%
  filter(county=="Alameda" & sex=="Total") %>%
  select(-`lower_95%_ci`, -`upper_95%_ci`, -annotation_code)

slice_tail(std_gg3, n=5, by = disease_f) |>
  gt(auto_align = TRUE) |>
  tab_header(
    title = "Alameda STD Cases",
    subtitle = "Last 5 rows for each disease"
  )

Alameda STD Cases
Last 5 rows for each disease

disease         county   year  sex    cases  population  rate   disease_f
Chlamydia       Alameda  2017  Total  9169   1659750     552.4  Chlamydia
Chlamydia       Alameda  2018  Total  9694   1669659     580.6  Chlamydia
Chlamydia       Alameda  2019  Total  9676   1678926     576.3  Chlamydia
Chlamydia       Alameda  2020  Total  7222   1676458     430.8  Chlamydia
Chlamydia       Alameda  2021  Total  7455   1654938     450.5  Chlamydia
Early Syphilis  Alameda  2017  Total  435    1659750     26.2   Early Syphilis
Early Syphilis  Alameda  2018  Total  439    1669659     26.3   Early Syphilis
Early Syphilis  Alameda  2019  Total  502    1678926     29.9   Early Syphilis
Early Syphilis  Alameda  2020  Total  417    1676458     24.9   Early Syphilis
Early Syphilis  Alameda  2021  Total  374    1654938     22.6   Early Syphilis
Gonorrhea       Alameda  2017  Total  3581   1659750     215.8  Gonorrhea
Gonorrhea       Alameda  2018  Total  3760   1669659     225.2  Gonorrhea
Gonorrhea       Alameda  2019  Total  3687   1678926     219.6  Gonorrhea
Gonorrhea       Alameda  2020  Total  3466   1676458     206.7  Gonorrhea
Gonorrhea       Alameda  2021  Total  3810   1654938     230.2  Gonorrhea

#make a presentable line chart
plot_ly(
  std_gg3,
  x=~year,
  y=~rate,
  color=~disease_f,
  type="scatter",
  mode="lines",
  colors=c("darkorange","darkcyan","darkslateblue"),
  text = ~paste('Cases: ',cases,'<br>Population: ',population,'<br>Rate: ',rate)
) %>%
  layout(
    title="Alameda County STD Rates, 2001-2021",
    yaxis=list(title="Case Rate per 100,000"),
    xaxis=list(title="Year"),
    paper_bgcolor="azure",
    plot_bgcolor="white"
  )
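The text argument used above adds custom lines to the hover box. As a hedged sketch of one further option, setting hoverinfo = "text" asks Plotly to show only that custom text instead of adding it to the default x/y readout. The example uses the built-in mtcars data as a stand-in, and the label wording is illustrative.

```r
library(plotly)

p <- plot_ly(
  mtcars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  # <br> starts a new line inside the hover box
  text = ~paste0("Car: ", rownames(mtcars),
                 "<br>MPG: ", mpg,
                 "<br>Weight: ", wt),
  hoverinfo = "text"  # show only the custom text on hover
)
p
```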
#add trace example
std_gg3b <- std_gg %>%
  filter(county=="Alameda" & sex=="Total") %>%
  select(year,disease,rate) %>%
  pivot_wider(names_from=disease, values_from=rate)

bind_rows(slice_head(std_gg3b, n=5), 
          slice_tail(std_gg3b, n=5)) |> 
  gt(auto_align = TRUE) |>
  tab_header(
    title = "Alameda STD Case Rates",
    subtitle = "First and Last 5 years displayed"
  ) 

Alameda STD Case Rates
First and Last 5 years displayed

year  Chlamydia  Early Syphilis  Gonorrhea
2001  331.7      2.5             145.8
2002  331.6      4.9             138.7
2003  335.9      4.4             111.4
2004  358.2      5.0             123.6
2005  356.5      5.3             142.6
2017  552.4      26.2            215.8
2018  580.6      26.3            225.2
2019  576.3      29.9            219.6
2020  430.8      24.9            206.7
2021  450.5      22.6            230.2

plot_ly(
  std_gg3b,
  x=~year,
  y=~`Chlamydia`,
  name="Chlamydia",
  type="scatter",
  mode="markers"
) %>%
  add_trace(y=~`Gonorrhea`,name="Gonorrhea",mode="lines")%>%
  add_trace(y=~`Early Syphilis`,name="Early Syphilis",mode="lines+markers") %>%
  layout(yaxis=list(title="Rate per 100,000"))
#subplot example
plot1 <- plot_ly(std_gg3b,
                 x=~year,
                 y=~`Chlamydia`,
                 name="Chlamydia",
                 type="scatter",
                 mode="markers")

plot2 <- plot_ly(std_gg3b,
                 x=~year,
                 y=~`Gonorrhea`,
                 name="Gonorrhea",
                 type="scatter",
                 mode="markers")

#stacked in two rows with a shared x axis
subplot(plot1,plot2,nrows=2,shareX=TRUE)
#side by side with a shared y axis
subplot(plot1,plot2,shareY=TRUE)