Introduction to Interactive Graphics

Interactive graphics have changed the way we explore and present data. Unlike static plots, interactive visualizations let users engage with the data directly and uncover patterns that might otherwise stay hidden. This guide walks through creating interactive plots in R by building them with ggplot2 and converting them with ggplotly() from the plotly package.

Motivation for Using Interactive Plots

Interactive plots are particularly useful for large datasets and complex visualizations. Users can zoom in on areas of interest, hover over data points for detailed information, and pan across the plot, so they can drill down to patterns that a static plot would hide. This makes interactive plots valuable in fields such as finance, scientific research, and business analytics, where exploring data in real time can lead to quicker insights and decisions.

When dealing with multidimensional data, interactivity allows users to toggle between different variables or layers of information, providing a more comprehensive understanding of the relationships within the data. This capability is especially powerful when presenting complex data stories to both technical and non-technical audiences. The interactive nature of these plots can make data more accessible and engaging, allowing viewers to explore the aspects of the data that interest them most.

Examples

The CCES (Cooperative Congressional Election Study) is a large-scale academic survey project that collects data on voter behavior, attitudes, and demographics. Its website presents the data with several interactive Plotly plots: https://cces.gov.harvard.edu/explore

Output Format for Interactivity

To preserve the interactivity of plots created with ggplotly(), knit the R Markdown document to an HTML format. The plots are JavaScript-based HTML widgets, so zooming, hovering, and clicking only work where a browser can run them. Static formats such as PDF or Word cannot do that: at best the plot appears as a static screenshot, and knitting may stop with an error unless a screenshot package such as webshot is installed.
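
Outside of R Markdown, the same HTML requirement applies. As a minimal sketch (not part of the original workflow), an interactive plot can be written to a standalone HTML page with htmlwidgets::saveWidget(). The file name below is hypothetical, and a temporary directory is used so the example leaves nothing behind.

```r
library(ggplot2)
library(plotly)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Hypothetical output path inside the session's temporary directory
out_file <- file.path(tempdir(), "mtcars_interactive.html")

# selfcontained = FALSE writes the JavaScript dependencies to a sibling
# "<name>_files" folder and avoids the pandoc requirement of a single-file export
htmlwidgets::saveWidget(ggplotly(p), file = out_file, selfcontained = FALSE)

# Open the result in a browser to check that hover and zoom still work
# browseURL(out_file)
```

Opening the saved file in any modern browser should give the same hover, zoom, and pan behavior as the knitted HTML document.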

Creating a Basic Interactive Plot

Let’s start with a simple example to demonstrate how to convert a ggplot2 plot into an interactive one using ggplotly.

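# Load the packages used throughout this guide
library(ggplot2)  # static plotting
library(plotly)   # provides ggplotly(), layout(), and the %>% pipe

# Create a basic ggplot
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "Car Weight vs. Miles per Gallon",
       x = "Weight (1000 lbs)",
       y = "Miles per Gallon")

# Convert the ggplot to an interactive plot
ggplotly(p)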

In this example, we’ve created a scatter plot of car weight versus miles per gallon using the mtcars dataset. By wrapping our ggplot object p with ggplotly(), we’ve made it interactive. You can now hover over points to see details, zoom in/out, and pan across the plot.
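
To make the conversion less of a black box, the short sketch below compares the object before and after ggplotly(). The variable name p_interactive is only illustrative. The point is that the result is an HTML widget rather than a ggplot, which is why it needs HTML output to stay interactive.

```r
library(ggplot2)
library(plotly)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# ggplotly() translates the ggplot into a plotly htmlwidget
p_interactive <- ggplotly(p)

class(p)              # the static ggplot object
class(p_interactive)  # the interactive widget

# The widget stores its traces as a list; a single point layer with no
# grouping is expected to produce one trace
length(p_interactive$x$data)
```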

Enhancing Interactive Plots

Now that we’ve created a basic interactive plot, let’s explore ways to enhance it with custom tooltips, layout adjustments, and additional interactive features.

Customized Tooltips

Tooltips are the information boxes that appear when you hover over data points. We can customize these to display specific information.

# theme_ipsum() comes from the hrbrthemes package
library(hrbrthemes)

# Customize tooltips: build the hover text in an extra "text" aesthetic
p <- ggplot(mtcars, aes(x = wt, y = mpg, 
                        text = paste("Model:", rownames(mtcars),
                                     "<br>Weight:", wt,
                                     "<br>MPG:", mpg,
                                     "<br>Horsepower:", hp))) +
  geom_point(aes(color = factor(cyl))) +
  labs(title = "Car Weight vs. Miles per Gallon",
       x = "Weight (1000 lbs)",
       y = "Miles per Gallon",
       color = "Cylinders") +
  theme_ipsum()

# Print the static version for comparison
p

# Convert to interactive plot with customized tooltips
ggplotly(p, tooltip = "text")

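In this example, we’ve added a custom text aesthetic that includes the car model, weight, MPG, and horsepower. ggplot2 does not draw this aesthetic itself; ggplotly() reads it, and the tooltip argument specifies which aesthetics appear in the tooltip. The <br> tags insert line breaks because the tooltip text is rendered as HTML.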
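
When a hand-built text string is more than is needed, the tooltip argument can also take a vector of existing aesthetics. The sketch below is an illustrative variation on the example above, not a replacement for it; the names p_aes and p_aes_interactive are made up for this example.

```r
library(ggplot2)
library(plotly)

# Same scatter plot, but without a custom text aesthetic
p_aes <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(colour = "Cylinders")

# Show only the listed aesthetics, in this order: y first, then x, then colour
p_aes_interactive <- ggplotly(p_aes, tooltip = c("y", "x", "colour"))
p_aes_interactive
```

This keeps the plot code short, at the cost of less control over labels and formatting than the pasted text string offers.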

Customized Layout

We can further customize the layout of our plot using the layout() function from plotly. This allows us to make adjustments that go beyond what’s easily achievable with ggplot2 alone.

# Create a basic ggplot
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point()

# Customize layout using plotly's layout function
ggplotly(p) %>%
  layout(title = list(text = "Interactive Plot of Weight vs. MPG", font = list(size = 20)),
         xaxis = list(title = "Weight (1000 lbs)", range = c(1, 6)),
         yaxis = list(title = "Miles per Gallon", range = c(10, 35)),
         legend = list(title = list(text = "Cylinders"), x = 0.85, y = 0.9))

Here, we’ve used the layout() function to adjust the title font size, axis ranges, legend title, and legend position. These adjustments showcase how we can fine-tune the appearance of our interactive plot beyond what we initially set in ggplot2.
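
While layout() changes how the plot looks, plotly's config() function changes how it behaves. The following is a minimal sketch under the assumption that a trimmed toolbar and scroll-wheel zoom are wanted; the choice of buttons to remove is arbitrary and only for illustration.

```r
library(ggplot2)
library(plotly)

p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point()

# config() passes options through to plotly.js
p_configured <- ggplotly(p) %>%
  config(displaylogo = FALSE,                                 # hide the Plotly logo
         modeBarButtonsToRemove = c("lasso2d", "select2d"),   # drop selection tools
         scrollZoom = TRUE)                                   # zoom with the mouse wheel

p_configured
```

Both functions can be chained on the same plot, so layout() and config() calls can be combined in one pipeline.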

Advanced Interactive Plots

Let’s explore some more advanced examples of interactive plots using different chart types and datasets.

Interactive Scatter Plot with Multiple Variables

# filter() comes from dplyr; the gapminder package provides the dataset
library(dplyr)
library(gapminder)

# Create an interactive scatter plot with multiple variables
p_scatter <- gapminder %>%
  filter(year == 2007) %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, size = pop, color = continent,
             text = paste("Country:", country,
                          "<br>GDP per capita:", round(gdpPercap, 2),
                          "<br>Life expectancy:", round(lifeExp, 1),
                          "<br>Population:", scales::comma(pop)))) +  # thousands separators
  geom_point(alpha = 0.7) +
  scale_x_log10() +
  labs(title = "Life Expectancy vs. GDP per Capita (2007)",
       x = "GDP per Capita (log scale)",
       y = "Life Expectancy",
       size = "Population",
       color = "Continent")

ggplotly(p_scatter, tooltip = "text")

This scatter plot visualizes the relationship between GDP per capita and life expectancy for different countries in 2007. The size of each point represents the population, and the color represents the continent.

Interactive Time Series Chart

# economics ships with ggplot2; mutate() comes from dplyr
library(dplyr)

econ2 <- economics %>% 
  mutate(
    unemploy_m  = unemploy / 1000,   # unemploy is recorded in thousands
    hover_text = paste0(
      "Date: ", format(date, "%B %Y"), 
      "<br>Unemployment: ", round(unemploy_m, 2), " million"
    )
  )

# group = 1 keeps geom_line() drawing one connected line; without it, the
# unique hover_text values would put every row into its own group
p_timeseries <- ggplot(econ2,
                       aes(x = date, 
                           y = unemploy_m, 
                           text = hover_text,
                           group = 1)) +
  geom_line(color = "blue", alpha = 0.5) +
  geom_point(alpha = 0.5) +
  scale_x_date(date_breaks = "5 years", 
               date_labels = "%Y") +
  scale_y_continuous(labels = scales::comma) +
  labs(title = "US Unemployment Over Time",
       x     = "Year",
       y     = "Unemployment (millions)") +
  theme_minimal()

ggplotly(p_timeseries, tooltip = "text")

This time series chart shows the trend of unemployment in the United States over time. The interactive features allow users to zoom in on specific time periods and see exact unemployment figures.
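
For long series like this one, a range slider under the x-axis can make it easier to pick a time window. The sketch below uses plotly's rangeslider() helper on a simplified version of the chart; it is one possible approach rather than part of the original example, and the names p_ts and p_ts_slider are illustrative.

```r
library(ggplot2)
library(plotly)

# Simplified line chart of US unemployment (economics ships with ggplot2)
p_ts <- ggplot(economics, aes(x = date, y = unemploy / 1000)) +
  geom_line(color = "blue") +
  labs(x = "Year", y = "Unemployment (millions)")

# dynamicTicks = TRUE lets the axis ticks recompute as the visible range changes;
# rangeslider() adds a draggable overview strip below the plot
p_ts_slider <- ggplotly(p_ts, dynamicTicks = TRUE) %>%
  rangeslider()

p_ts_slider
```

Dragging the handles of the slider narrows the visible period, while the strip itself keeps the full series in view for context.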

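Interactive Line Chart with Hover Highlighting

# filter() and select() come from dplyr; the gapminder package provides the dataset
library(dplyr)
library(gapminder)

# Choose a small, diverse set of countries
focus_countries <- c("United States", "Japan", "India", "Brazil", "Nigeria")

life_long <- gapminder %>% 
  filter(country %in% focus_countries) %>% 
  select(country, year, lifeExp) %>% 
  mutate(year = as.Date(paste0(year, "-01-01")))   # convert to Date for nicer axis

# Attach a key for interactive highlighting
life_key <- highlight_key(life_long, ~country)

# Static ggplot that inherits the key
# (linewidth replaces the size argument for lines, which is deprecated since ggplot2 3.4.0)
p_lines <- ggplot(life_key, aes(year, lifeExp, color = country)) +
  geom_line(linewidth = 1) +
  labs(title   = "Life Expectancy, 1952–2007",
       subtitle = "Five illustrative countries",
       x       = "Year",
       y       = "Life Expectancy (years)",
       color   = "Country") +
  theme_minimal()

# Convert to plotly and set hover-highlight behaviour
ggplotly(p_lines, tooltip = c("country", "lifeExp")) %>%
  highlight(
    on          = "plotly_hover",
    dynamic     = TRUE,
    opacityDim  = 0.15,
    selected    = attrs_selected(line = list(width = 4))
  )

This line chart compares life expectancy across five countries. Hovering over a line highlights that country by thickening its line and dimming the others, and dynamic = TRUE adds a small widget for changing the highlight color.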

Interactive Bar Chart

# group_by() and summarize() come from dplyr; diamonds ships with ggplot2
library(dplyr)

# Create an interactive bar chart
diamonds_summary <- diamonds %>%
  group_by(cut, color) %>%
  summarize(avg_price = mean(price), .groups = "drop")  # drop grouping in one step

p_bar <- ggplot(diamonds_summary, aes(x = cut, y = avg_price, fill = color,
                                      text = paste("Cut:", cut,
                                                   "<br>Color:", color,
                                                   "<br>Avg Price:", scales::dollar(avg_price)))) +
  geom_col(position = "dodge") +  # same as geom_bar(stat = "identity")
  labs(title = "Average Diamond Price by Cut and Color",
       x = "Cut",
       y = "Average Price ($)",
       fill = "Color")

ggplotly(p_bar, tooltip = "text")

This bar chart shows the average diamond price by cut and color. Users can hover over bars to see detailed information and click legend items to hide or show individual color grades.