Interactive graphics have revolutionized the way we explore and present data. Unlike static plots, interactive visualizations allow users to engage with the data dynamically, uncovering insights that might otherwise remain hidden. This guide will walk you through the process of creating interactive plots in R using ggplot2 and the ggplotly() function from the plotly package.
Interactive plots are particularly useful when dealing with large datasets or complex visualizations. They enable users to drill down into the data and uncover patterns that might be missed in static plots. By allowing users to zoom in on specific areas of interest, hover over data points for detailed information, and pan across large datasets, interactive plots provide a more engaging and intuitive way to explore data. This makes them invaluable in various fields such as finance, scientific research, and business analytics, where the ability to interact with data in real-time can lead to quicker insights and decision-making.
When dealing with multidimensional data, interactivity allows users to toggle between different variables or layers of information, providing a more comprehensive understanding of the relationships within the data. This capability is especially powerful when presenting complex data stories to both technical and non-technical audiences. The interactive nature of these plots can make data more accessible and engaging, allowing viewers to explore the aspects of the data that interest them most.
The CCES (Cooperative Congressional Election Study) is a large-scale academic survey project that collects data on voter behavior, attitudes, and demographics. Its data exploration page uses several interactive Plotly plots: https://cces.gov.harvard.edu/explore
To preserve the interactivity of plots created with ggplotly(), it is crucial to knit the R Markdown document to an HTML format. This ensures that all interactive features such as zooming, hovering, and clicking are fully functional. Other output formats like PDF or Word will render the plots as static images, losing the interactive capabilities.
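Outside of R Markdown, the same kind of widget can also be written to a standalone HTML file. The following is a minimal sketch, assuming the htmlwidgets package is installed (plotly depends on it); the file name mtcars_scatter.html and the use of a temporary directory are arbitrary choices for illustration.

```r
library(ggplot2)
library(plotly)

# Build a small ggplot and convert it to an interactive widget
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()
widget <- ggplotly(p)

# Write the widget to an HTML file in a temporary directory.
# selfcontained = FALSE avoids the pandoc dependency; the supporting
# JavaScript and CSS files are written alongside the HTML file.
out_file <- file.path(tempdir(), "mtcars_scatter.html")
htmlwidgets::saveWidget(widget, file = out_file, selfcontained = FALSE)

# Check that the file was created
file.exists(out_file)
```

Opening the resulting file in a browser should give the same zoom, hover, and pan behaviour as a knitted HTML document.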
Let’s start with a simple example to demonstrate how to convert a ggplot2 plot into an interactive one using ggplotly().
# Load the packages used in this example
library(ggplot2)
library(plotly)
# Create a basic ggplot
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "Car Weight vs. Miles per Gallon",
       x = "Weight (1000 lbs)",
       y = "Miles per Gallon")
# Convert the ggplot to an interactive plot
ggplotly(p)
In this example, we’ve created a scatter plot of car weight versus miles per gallon using the mtcars dataset. By wrapping our ggplot object p with ggplotly(), we’ve made it interactive. You can now hover over points to see details, zoom in/out, and pan across the plot.
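To see what the wrapping step actually produces, it can help to inspect the classes of the two objects. This is a small sketch; the variable name w is chosen here for illustration only.

```r
library(ggplot2)
library(plotly)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()
w <- ggplotly(p)

# p is still a static ggplot object
class(p)

# w is a plotly object, which is also an htmlwidget;
# this is why it needs an HTML output format to stay interactive
class(w)
inherits(w, "htmlwidget")
```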
Now that we’ve created a basic interactive plot, let’s explore ways to enhance it with custom tooltips, layout adjustments, and additional interactive features.
Tooltips are the information boxes that appear when you hover over data points. We can customize these to display specific information.
# Customize tooltips
library(hrbrthemes)  # provides theme_ipsum()
p <- ggplot(mtcars, aes(x = wt, y = mpg,
                        text = paste("Model:", rownames(mtcars),
                                     "<br>Weight:", wt,
                                     "<br>MPG:", mpg,
                                     "<br>Horsepower:", hp))) +
  geom_point(aes(color = factor(cyl))) +
  labs(title = "Car Weight vs. Miles per Gallon",
       x = "Weight (1000 lbs)",
       y = "Miles per Gallon",
       color = "Cylinders") +
  theme_ipsum()
# Print the static version
p
# Convert to interactive plot with customized tooltips
ggplotly(p, tooltip = "text")
In this example, we’ve added a custom text aesthetic that includes the car model, weight, MPG, and horsepower. The tooltip argument in ggplotly() specifies which aesthetic to use for the tooltip information.
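The tooltip argument also accepts the names of regular aesthetics, and their order controls the order in the hover box. The sketch below, with the illustrative variable name w_xy, limits the tooltip to the x and y mappings instead of a custom text string.

```r
library(ggplot2)
library(plotly)

p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point()

# Show only the y and x mappings in the hover box, y first;
# the colour mapping is left out of the tooltip
w_xy <- ggplotly(p, tooltip = c("y", "x"))
w_xy
```

This can be a lighter-weight option when the default labels are good enough and only the selection of fields needs to change.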
We can further customize the layout of our plot using the layout() function from plotly. This allows us to make adjustments that go beyond what’s easily achievable with ggplot2 alone.
# Create a basic ggplot
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point()
# Customize layout using plotly's layout function
ggplotly(p) %>%
  layout(title = list(text = "Interactive Plot of Weight vs. MPG",
                      font = list(size = 20)),
         xaxis = list(title = "Weight (1000 lbs)", range = c(1, 6)),
         yaxis = list(title = "Miles per Gallon", range = c(10, 35)),
         legend = list(title = list(text = "Cylinders"), x = 0.85, y = 0.9))
Here, we’ve used the layout() function to adjust the title font size, axis ranges, legend title, and legend position. These adjustments showcase how we can fine-tune the appearance of our interactive plot beyond what we initially set in ggplot2.
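As a further illustration of layout(), the legend can be moved below the plot and laid out horizontally. This is a sketch only: the specific position and margin values are assumptions chosen for this example and will likely need tuning for other plots.

```r
library(ggplot2)
library(plotly)

p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point()

# Place a horizontal legend, centered, below the plotting area,
# and add bottom margin so it does not overlap the x-axis title
w_layout <- ggplotly(p) %>%
  layout(
    legend = list(orientation = "h", x = 0.5, xanchor = "center", y = -0.2),
    margin = list(b = 80)
  )
w_layout
```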
Let’s explore some more advanced examples of interactive plots using different chart types and datasets.
# Create an interactive scatter plot with multiple variables
library(dplyr)      # filter(), mutate(), and the %>% pipe
library(gapminder)  # provides the gapminder dataset
p_scatter <- gapminder %>%
  filter(year == 2007) %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, size = pop, color = continent,
             text = paste("Country:", country,
                          "<br>GDP per capita:", round(gdpPercap, 2),
                          "<br>Life expectancy:", round(lifeExp, 1),
                          "<br>Population:", pop))) +
  geom_point(alpha = 0.7) +
  scale_x_log10() +
  labs(title = "Life Expectancy vs. GDP per Capita (2007)",
       x = "GDP per Capita (log scale)",
       y = "Life Expectancy",
       size = "Population",
       color = "Continent")
ggplotly(p_scatter, tooltip = "text")
This scatter plot visualizes the relationship between GDP per capita and life expectancy for different countries in 2007. The size of each point represents the population, and the color represents the continent.
# Create an interactive time series chart
# economics$unemploy is recorded in thousands, so divide by 1000 for millions
econ2 <- economics %>%
  mutate(
    unemploy_m = unemploy / 1000,
    hover_text = paste0(
      "Date: ", format(date, "%B %Y"),
      "<br>Unemployment: ", round(unemploy_m, 2), " million"
    )
  )
# group = 1 keeps all rows in a single line; without it, the text
# aesthetic puts every row in its own group and geom_line() draws nothing
p_timeseries <- ggplot(econ2,
                       aes(x = date,
                           y = unemploy_m,
                           text = hover_text,
                           group = 1)) +
  geom_line(color = "blue", alpha = 0.5) +
  geom_point(alpha = 0.5) +
  scale_x_date(date_breaks = "5 years",
               date_labels = "%Y") +
  scale_y_continuous(labels = scales::comma) +
  labs(title = "US Unemployment Over Time",
       x = "Year",
       y = "Unemployment (millions)") +
  theme_minimal()
ggplotly(p_timeseries, tooltip = "text")
This time series chart shows the trend of unemployment in the United States over time. The interactive features allow users to zoom in on specific time periods and see exact unemployment figures.
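For zooming in on specific time periods, plotly’s rangeslider() function adds a draggable window beneath the x-axis. The sketch below uses a simplified version of the chart with no custom tooltip, and sets dynamicTicks = TRUE so that axis ticks are recomputed on zoom; the variable names p_slider and w_slider are illustrative.

```r
library(ggplot2)
library(plotly)

# Simplified unemployment line chart (values converted to millions)
p_slider <- ggplot(economics, aes(x = date, y = unemploy / 1000)) +
  geom_line() +
  labs(x = "Year", y = "Unemployment (millions)")

# dynamicTicks = TRUE lets plotly recompute axis ticks when zooming;
# rangeslider() adds a slider below the x-axis for selecting a time window
w_slider <- ggplotly(p_slider, dynamicTicks = TRUE) %>%
  rangeslider()
w_slider
```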
# Choose a small, diverse set of countries
focus_countries <- c("United States", "Japan", "India", "Brazil", "Nigeria")
life_long <- gapminder %>%
  filter(country %in% focus_countries) %>%
  select(country, year, lifeExp) %>%
  mutate(year = as.Date(paste0(year, "-01-01")))  # convert to Date for nicer axis
# Attach a key for interactive highlighting
life_key <- highlight_key(life_long, ~country)
# Static ggplot that inherits the key
# linewidth requires ggplot2 >= 3.4.0; use size = 1 on older versions
p_lines <- ggplot(life_key, aes(year, lifeExp, color = country)) +
  geom_line(linewidth = 1) +
  labs(title = "Life Expectancy, 1952–2007",
       subtitle = "Five illustrative countries",
       x = "Year",
       y = "Life Expectancy (years)",
       color = "Country") +
  theme_minimal()
# Convert to plotly and set hover-highlight behaviour
ggplotly(p_lines, tooltip = c("country", "lifeExp")) %>%
  highlight(
    on = "plotly_hover",
    dynamic = TRUE,
    opacityDim = 0.15,
    selected = attrs_selected(line = list(width = 4))
  )
# Create an interactive bar chart
# .groups = "drop" returns an ungrouped summary without a message
diamonds_summary <- diamonds %>%
  group_by(cut, color) %>%
  summarize(avg_price = mean(price), .groups = "drop")
p_bar <- ggplot(diamonds_summary, aes(x = cut, y = avg_price, fill = color,
                                      text = paste0("Cut: ", cut,
                                                    "<br>Color: ", color,
                                                    "<br>Avg Price: $", round(avg_price, 2)))) +
  geom_col(position = "dodge") +  # same as geom_bar(stat = "identity")
  labs(title = "Average Diamond Price by Cut and Color",
       x = "Cut",
       y = "Average Price ($)",
       fill = "Color")
ggplotly(p_bar, tooltip = "text")
This bar chart shows the average diamond price by cut and color. Users can hover over bars to see detailed information and click on legend items to show or hide individual color groups.