Please knit your file as an HTML document (so that the interactive plots display) and submit your assignment to bCourses.
The questions in this exercise are meant to
complement Lab Module: Interactive Plots & Tables in
Week 10.
This exercise uses the
ca_aqi.rds dataset that we cleaned in
Module: Scatterplots.
Please double-check that your code does not run off the page in your knitted HTML document. To fix this, go back to your code and press Enter to break long lines, so that part of the code continues on the next line.
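As a rough illustration of breaking a long line, the sketch below uses the built-in `mtcars` data and base R only. Neither is part of this assignment, and the name `long_summary` is hypothetical. Breaking after commas and opening parentheses is one common convention, not the only valid one.

```r
# On one line, this call may run past the right edge of the knitted page:
# long_summary <- aggregate(mpg ~ cyl, data = mtcars[mtcars$hp > 100, ], FUN = mean)

# The same call with one argument per line stays within the page width.
long_summary <- aggregate(
  mpg ~ cyl,                         # mean mpg for each cylinder count
  data = mtcars[mtcars$hp > 100, ],  # keep only cars with hp above 100
  FUN = mean
)

long_summary
```

R treats both versions identically, so the line breaks change only how the code looks in the knitted document.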
# Week parameter
week = params$week
# Read clean rds (R dataset) file with `read_rds`
ca_aqi = read_rds(here(week, "data", "top5_cities.rds"))
These steps are going to create dataframe top5_df for
the 5 cities with highest average AQI in California using
ca_aqi.
# Here we generate the names of cities we are interested in
top5cities <- ca_aqi %>%
# Group by city
group_by(city_ascii) %>%
# Population-weighted average of AQI
summarise(mean_aqi = weighted.mean(aqi, population, na.rm = TRUE)) %>%
# Sort dataframe by mean aqi, in descending order
dplyr::arrange(desc(mean_aqi)) %>%
# Take the top 5
head(5) %>%
# Select just the city names
select(city_ascii) %>%
# Convert the dataframe into a vector
as_vector()
# Create dataframe with all unaggregated AQI measurements
top5_df <- ca_aqi %>%
# Filter for cities within our list
filter(city_ascii %in% top5cities,
# Filter out extreme aqi values
aqi < 1000)
Before building the plots, update ggplot2 and plotly.
# Update ggplot2 & update plotly
install.packages("ggplot2")
install.packages("plotly")
Then check the installed versions of ggplot2 and plotly.
# Check package versions
#packageVersion("ggplot2") # should be ‘3.5.1’
#packageVersion("plotly") # should be ‘4.10.4’
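The version check can also be done programmatically. The sketch below assumes the versions in the comments above are minimums rather than exact requirements. The helper `check_version` and the names `required` and `status` are hypothetical and do not come from the lab module.

```r
# Versions quoted in the comments above, treated here as minimums (an assumption)
required <- c(ggplot2 = "3.5.1", plotly = "4.10.4")

# Hypothetical helper: report whether a package is installed and recent enough
check_version <- function(pkg, minimum) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return("not installed")
  }
  installed <- packageVersion(pkg)
  # package_version objects can be compared directly with a version string
  if (installed >= minimum) "OK" else "older than expected"
}

status <- vapply(
  names(required),
  function(pkg) check_version(pkg, required[[pkg]]),
  character(1)
)

print(status)
```

If a package reports "older than expected", rerunning `install.packages()` for it and restarting R is usually enough.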
Use the top5_df dataframe we’ve prepared above (the 5
cities with highest average AQI in California) from
Lab Module: Interactive Plots & Tables and create
either an interactive violin plot or an interactive
box plot to depict the spread of AQI measurements
across the five cities using ggplot with a
ggplotly conversion.
Pay attention to how the tooltip is
generated. Refer to the code from
Lab Module: Interactive Plots & Tables to customize the
tooltip if you deem it necessary.
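To see what a tooltip string looks like before it reaches the plot, the minimal sketch below builds one with `paste()`. The data frame `demo_df` and its values are made up for illustration and only stand in for top5_df.

```r
# Hypothetical rows standing in for top5_df (made-up values)
demo_df <- data.frame(
  city_ascii = c("City A", "City B"),
  aqi = c(152.4, 87)
)

# paste() is vectorised, so it returns one tooltip string per row.
# "<br>" is an HTML line break, which plotly renders as a new line on hover.
tooltip_text <- paste(
  "City:", demo_df$city_ascii,
  "<br>AQI:", round(demo_df$aqi, 1)
)

tooltip_text
# [1] "City: City A <br>AQI: 152.4" "City: City B <br>AQI: 87"
```

Printing the strings like this is a quick way to check the labels and rounding before passing them to the plot.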
library(ggplot2)
library(plotly)
box_df <- top5_df %>%
dplyr::ungroup() %>%
dplyr::select(city_ascii, aqi) %>%
dplyr::mutate(
city_ascii = as.factor(city_ascii),
aqi = as.numeric(aqi)
) %>%
as.data.frame()
p_box <- ggplot(
box_df,
aes(
x = city_ascii,
y = aqi,
fill = city_ascii,
text = paste("City:", city_ascii,
"<br>AQI:", aqi)
)
) +
geom_boxplot(alpha = 0.6, outlier.alpha = 0.3) +
labs(
x = "City",
y = "Daily AQI",
title = "Distribution of Daily AQI for Top 5 CA Cities (Boxplot)"
) +
theme_minimal(base_size = 12) +
theme(
legend.position = "none",
plot.title = element_text(face = "bold")
)
ggplotly(p_box, tooltip = "text")
Build the same plot as you did in Question 1 using native
plotly object construction.
library(dplyr)
p_box_plotly <- plot_ly(
data = box_df,
x = ~city_ascii,
y = ~aqi,
color = ~city_ascii,
type = "box",
boxpoints = "outliers",
hoverinfo = "text",
text = ~paste(
"City:", city_ascii,
"<br>AQI:", round(aqi, 1)
)
) %>%
layout(
title = "Distribution of Daily AQI for Top 5 CA Cities (Boxplot)",
xaxis = list(title = "City"),
yaxis = list(title = "Daily AQI"),
showlegend = FALSE
)
p_box_plotly