Interactive graphics allow you to manipulate plotted data to gain further insight. As an example, an interactive graphic would allow you to zoom in on a subset of your data without the need to create a new plot. In this course, you will learn how to create and customize interactive graphics in plotly using the R programming language. Along the way, you will review data visualization best practices and be introduced to new plot types such as scatterplot matrices and binned scatterplots.
In this chapter, you move past basic plotly charts to explore more-complex relationships and larger datasets. You will learn how to layer traces, create faceted charts and scatterplot matrices, and create binned scatterplots.
You've seen how to add LOESS smoothers to a scatterplot by using both the add_markers() and add_lines() traces. Adding a linear smoother uses the same approach, but you use the lm() command to fit the linear model.
In this exercise, your task is to add a linear smoother to a scatterplot of user score against critic score for video games in 2016.
When you add smoothers, missing values (NAs) can be problematic because many modeling functions automatically drop incomplete observations, so the fitted values no longer line up with the plotted rows. To avoid this mismatch, use select() and na.omit() to remove observations with missing values before plotting.
Note that plotly and the vgsales2016 data have already been loaded for you.
Fit a linear regression model using Critic_Score as the predictor variable and User_Score as the response variable. Store this model in the object m. Create a scatterplot showing Critic_Score on the x-axis and User_Score on the y-axis. Add a linear smoother to your scatterplot representing the fitted values.
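A minimal sketch of these steps, run on simulated data. The vg_demo data frame and its values are hypothetical stand-ins for vgsales2016, which is not reproduced here; only the column names follow the exercise.

```r
library(plotly)
library(dplyr)

# Hypothetical stand-in for vgsales2016, with two missing critic scores.
set.seed(1)
vg_demo <- data.frame(
  Critic_Score = c(sample(40:95, 58, replace = TRUE), NA, NA)
)
vg_demo$User_Score <- 2 + 0.06 * vg_demo$Critic_Score + rnorm(60, sd = 0.7)

# Drop incomplete rows so the fitted values line up with the plotted rows.
vg_complete <- vg_demo %>%
  select(Critic_Score, User_Score) %>%
  na.omit()

# User_Score is the response, Critic_Score the predictor.
m <- lm(User_Score ~ Critic_Score, data = vg_complete)

# Scatterplot of the data plus a line through the fitted values.
p <- vg_complete %>%
  plot_ly(x = ~Critic_Score, y = ~User_Score) %>%
  add_markers(showlegend = FALSE) %>%
  add_lines(y = ~fitted(m))
p
```

Because the model is fit to the same filtered data frame that is plotted, fitted(m) has exactly one value per plotted point.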
## Parsed with column specification:
## cols(
## Name = col_character(),
## Platform = col_character(),
## Year = col_integer(),
## Genre = col_character(),
## Publisher = col_character(),
## NA_Sales = col_double(),
## EU_Sales = col_double(),
## JP_Sales = col_double(),
## Other_Sales = col_double(),
## Global_Sales = col_double(),
## Critic_Score = col_integer(),
## Critic_Count = col_integer(),
## User_Score = col_character(),
## User_Count = col_integer(),
## Developer = col_character(),
## Rating = col_character()
## )
## Warning: Can't display both discrete & non-discrete data on same axis
In this exercise, you will learn how to create density plots and overlay them to compare the distribution of critic scores for three video game publishers: Activision, Electronic Arts, and Nintendo.
To create a density plot for Critic_Score, store the results of the density() command, and then pass the x and y coordinates to add_lines():
d <- density(vgsales2016$Critic_Score, na.rm = TRUE)
plot_ly() %>%
  add_lines(x = ~d$x, y = ~d$y, fill = "tozeroy") %>%
  layout(xaxis = list(title = "Critic score"), yaxis = list(title = "Density"))
Notice how you can create new plot types easily using familiar code! The fill = "tozeroy" argument fills the area under the curve.
Data frames activision, ea, and nintendo are loaded, as is plotly.
Compute density curves for Activision, EA, and Nintendo, storing them in the d.a, d.e, and d.n objects, respectively. Create overlaid density plots of Critic_Score for activision, ea, and nintendo (in that order).
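One way the overlay might look, sketched on simulated scores. The three data frames below are hypothetical stand-ins that reuse the exercise's object names; their values are invented for illustration.

```r
library(plotly)

# Hypothetical stand-ins for the activision, ea, and nintendo data frames.
set.seed(2)
activision <- data.frame(Critic_Score = round(rnorm(80, mean = 68, sd = 12)))
ea         <- data.frame(Critic_Score = round(rnorm(80, mean = 74, sd = 10)))
nintendo   <- data.frame(Critic_Score = round(rnorm(80, mean = 78, sd = 9)))

# One kernel density estimate per publisher.
d.a <- density(activision$Critic_Score, na.rm = TRUE)
d.e <- density(ea$Critic_Score, na.rm = TRUE)
d.n <- density(nintendo$Critic_Score, na.rm = TRUE)

# Each add_lines() call adds one filled density curve as its own trace.
p <- plot_ly() %>%
  add_lines(x = ~d.a$x, y = ~d.a$y, name = "Activision", fill = "tozeroy") %>%
  add_lines(x = ~d.e$x, y = ~d.e$y, name = "Electronic Arts", fill = "tozeroy") %>%
  add_lines(x = ~d.n$x, y = ~d.n$y, name = "Nintendo", fill = "tozeroy") %>%
  layout(
    xaxis = list(title = "Critic score"),
    yaxis = list(title = "Density")
  )
p
```

Naming each trace makes the legend usable for toggling individual publishers on and off.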
Recall that the subplot() command allows you to combine charts to create facets (i.e., subplots or small multiples). This is a great way to explore distributions and relationships across factors. In this exercise, you will explore how the relationship between critic score and user score changes (or stays the same) across platforms.
Note that plotly and dplyr have already been loaded for you.
Create a scatterplot showing Critic_Score on the x-axis and User_Score on the y-axis for PS4 games. Name the trace for the platform and store this plot as p1.
Create a scatterplot showing Critic_Score on the x-axis and User_Score on the y-axis for XOne video games. Name the trace for the platform and store this plot as p2.
Use subplot() to create a faceted scatterplot containing p1 and p2 with two rows.
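A sketch of the three steps on simulated data. The vg_demo data frame is a hypothetical stand-in for vgsales2016 that contains only the columns used here.

```r
library(plotly)
library(dplyr)

# Hypothetical stand-in for vgsales2016 with two platforms.
set.seed(3)
vg_demo <- data.frame(
  Platform = rep(c("PS4", "XOne"), each = 30),
  Critic_Score = sample(40:95, 60, replace = TRUE)
)
vg_demo$User_Score <- 2 + 0.06 * vg_demo$Critic_Score + rnorm(60, sd = 0.7)

# One scatterplot per platform, with the trace named for the platform.
p1 <- vg_demo %>%
  filter(Platform == "PS4") %>%
  plot_ly(x = ~Critic_Score, y = ~User_Score) %>%
  add_markers(name = "PS4")

p2 <- vg_demo %>%
  filter(Platform == "XOne") %>%
  plot_ly(x = ~Critic_Score, y = ~User_Score) %>%
  add_markers(name = "XOne")

# Stack the two plots in a single column with two rows.
sp <- subplot(p1, p2, nrows = 2)
sp
```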
In the previous exercise, you manually created a faceted scatterplot. This was not very tedious because you focused on only two groups. However, there are 9 platforms in the vgsales2016 dataset, and it would be very tedious to manually code 9 scatterplots.
In this exercise, you will practice using the group_by() and do() commands to automate the process of creating a faceted scatterplot with one facet per platform. Remember that the entire plotting command is embedded within do(), as shown in the template below:
data %>%
  group_by(factor) %>%
  do(
    plot = plot_ly(data = ., x = ~x, y = ~y) %>%
      add_markers(name = ~factor)
  ) %>%
  subplot(nrows = R, shareY = TRUE, shareX = TRUE)
Use group_by(), do(), and subplot() to create a faceted scatterplot showing Critic_Score on the x-axis and User_Score on the y-axis, where the facets are defined by Platform. Arrange the facets in a grid with 3 rows.
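The template can be filled in as follows. This is a sketch on simulated data: vg_demo is a hypothetical stand-in for vgsales2016, and the six platforms in it were chosen only to keep the example small.

```r
library(plotly)
library(dplyr)

# Hypothetical stand-in for vgsales2016 with six platforms.
set.seed(4)
platforms <- c("3DS", "PC", "PS4", "PSV", "WiiU", "XOne")
vg_demo <- data.frame(
  Platform = rep(platforms, each = 20),
  Critic_Score = sample(40:95, 120, replace = TRUE)
)
vg_demo$User_Score <- 2 + 0.06 * vg_demo$Critic_Score + rnorm(120, sd = 0.7)

# do() builds one plotly object per group and stores it in the list-column
# `plot`; subplot() then arranges the objects in that column in a grid.
p <- vg_demo %>%
  group_by(Platform) %>%
  do(
    plot = plot_ly(data = ., x = ~Critic_Score, y = ~User_Score) %>%
      add_markers(name = ~Platform)
  ) %>%
  subplot(nrows = 3, shareY = TRUE, shareX = TRUE)
p
```

Inside do(), the dot stands for the rows of the current group, so each facet sees only its own platform's data.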
In the previous two exercises, you saw a set of subplots that lack any axis labels and a set of subplots that used the column names as axis labels. Why are they different? By default, the subplot() command sets titleX = shareX and titleY = shareY; thus, axis labels are only displayed if shareX and/or shareY are TRUE. You can add titleX = TRUE and/or titleY = TRUE to override this behavior.
In this example, your task is to add titles to subplots. Note that plotly has already been loaded for you.
Adapt the subplot() code to allow the x- and y-axis titles to be shared. Add the title "User score vs. critic score by platform, 2016" to the plot.
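The sketch below contrasts the default behavior with the two ways of switching axis titles on. The demo data frame and the plots p1 and p2 are hypothetical and exist only to make the example runnable.

```r
library(plotly)

# Hypothetical data and two small scatterplots.
set.seed(5)
demo <- data.frame(Critic_Score = sample(40:95, 40, replace = TRUE))
demo$User_Score <- 2 + 0.06 * demo$Critic_Score + rnorm(40, sd = 0.7)

p1 <- plot_ly(demo[1:20, ], x = ~Critic_Score, y = ~User_Score) %>%
  add_markers(name = "PS4")
p2 <- plot_ly(demo[21:40, ], x = ~Critic_Score, y = ~User_Score) %>%
  add_markers(name = "XOne")

# Default: shareX and shareY are FALSE, so titleX and titleY are FALSE too
# and no axis titles are shown.
sp_plain <- subplot(p1, p2, nrows = 2)

# Sharing the axes also turns the titles on, because titleX and titleY
# default to the values of shareX and shareY.
sp_shared <- subplot(p1, p2, nrows = 2, shareX = TRUE, shareY = TRUE) %>%
  layout(title = "User score vs. critic score by platform, 2016")

# Alternatively, keep separate axes but request the titles explicitly.
sp_titled <- subplot(p1, p2, nrows = 2, titleX = TRUE, titleY = TRUE)

sp_shared
```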
The axes in a subplot can be renamed using the layout() command, just like in a single plot; however, there are multiple x-axes to rename. For example, a 2 x 2 grid of plots requires four x-axis labels:
p %>% # subplot
  layout(
    xaxis = list(title = "title 1"),
    xaxis2 = list(title = "title 2"),
    xaxis3 = list(title = "title 3"),
    xaxis4 = list(title = "title 4")
  )
A similar strategy holds for the y-axis labels.
In this example, your task is to polish the axis titles on a subplot. Note that plotly has already been loaded for you.
For the first plot in sp2, use "Global Sales (M units)" for the y-axis label, and leave the x-axis label blank. For the second plot in sp2, label the x-axis "Year" and the y-axis "Global Sales (M units)".
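A sketch of this relabeling. The sp2 object built here is a hypothetical two-row subplot made from simulated yearly sales; the course's actual sp2 is not shown in these notes and may differ.

```r
library(plotly)

# Hypothetical yearly sales data.
set.seed(6)
sales_demo <- data.frame(
  Year = 2000:2016,
  Global_Sales = round(runif(17, min = 100, max = 700), 1)
)

# Hypothetical sp2: markers in the first row, a line in the second row.
p_top <- plot_ly(sales_demo, x = ~Year, y = ~Global_Sales) %>%
  add_markers(name = "Markers")
p_bottom <- plot_ly(sales_demo, x = ~Year, y = ~Global_Sales) %>%
  add_lines(name = "Line")
sp2 <- subplot(p_top, p_bottom, nrows = 2)

# xaxis/yaxis refer to the first plot, xaxis2/yaxis2 to the second.
sp2_labeled <- sp2 %>%
  layout(
    xaxis  = list(title = ""),
    xaxis2 = list(title = "Year"),
    yaxis  = list(title = "Global Sales (M units)"),
    yaxis2 = list(title = "Global Sales (M units)")
  )
sp2_labeled
```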
Note: SPLOMs do not render in the Rmd file but do render in the browser; the cause is unclear.
How closely related are North American and European video game sales? How do sales in Japan compare to North America and Europe? In this exercise you will create a scatterplot matrix (abbreviated as SPLOM) to explore these questions based on the vgsales2016 dataset.
Note that plotly has already been loaded for you.
Create a scatterplot matrix including NA_Sales, EU_Sales, and JP_Sales (in that order). Label the panels N. America, Europe, and Japan, respectively.
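A minimal sketch of such a SPLOM, using the "splom" trace type with one list per panel in dimensions. The vg_demo data frame is a hypothetical stand-in for vgsales2016 with simulated sales figures.

```r
library(plotly)

# Hypothetical stand-in for the regional sales columns of vgsales2016.
set.seed(7)
vg_demo <- data.frame(NA_Sales = round(rexp(100, rate = 2), 2))
vg_demo$EU_Sales <- round(0.6 * vg_demo$NA_Sales + rexp(100, rate = 5), 2)
vg_demo$JP_Sales <- round(rexp(100, rate = 6), 2)

# Each element of `dimensions` defines one row/column of the matrix:
# `label` is the panel label and `values` is the variable to plot.
splom <- vg_demo %>%
  plot_ly() %>%
  add_trace(
    type = "splom",
    dimensions = list(
      list(label = "N. America", values = ~NA_Sales),
      list(label = "Europe", values = ~EU_Sales),
      list(label = "Japan", values = ~JP_Sales)
    )
  )
splom
```

The order of the lists in dimensions sets the order of the panels.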
Just like with a single scatterplot, it can be useful to add color to represent an additional variable in a scatterplot matrix. In this exercise, you will add color to represent whether the game was produced by Nintendo or not.
In the code provided, an indicator (i.e., dummy) variable (nintendo) has been created to indicate whether a game was published by Nintendo or some other publisher.
Note that plotly has already been loaded for you.
Recreate the SPLOM of NA_Sales, EU_Sales, and JP_Sales (in that order). Remember to label the panels N. America, Europe, and Japan, respectively. Use color to represent the values in nintendo.
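A sketch of the colored SPLOM on simulated data. The vg_demo data frame, its Publisher values, and the way the nintendo indicator is built here are all assumptions made for illustration; the course's provided code is not shown in these notes.

```r
library(plotly)

# Hypothetical stand-in for vgsales2016.
set.seed(8)
vg_demo <- data.frame(
  Publisher = sample(c("Nintendo", "Activision", "Electronic Arts"),
                     100, replace = TRUE),
  NA_Sales = round(rexp(100, rate = 2), 2)
)
vg_demo$EU_Sales <- round(0.6 * vg_demo$NA_Sales + rexp(100, rate = 5), 2)
vg_demo$JP_Sales <- round(rexp(100, rate = 6), 2)

# Indicator variable: Nintendo versus any other publisher.
vg_demo$nintendo <- ifelse(vg_demo$Publisher == "Nintendo", "Nintendo", "Other")

# Mapping color to the indicator colors the points in every panel.
splom <- vg_demo %>%
  plot_ly(color = ~nintendo) %>%
  add_trace(
    type = "splom",
    dimensions = list(
      list(label = "N. America", values = ~NA_Sales),
      list(label = "Europe", values = ~EU_Sales),
      list(label = "Japan", values = ~JP_Sales)
    )
  )
splom
```

With only two levels in the color variable, RColorBrewer warns that the minimum palette size is 3; the plot is still drawn.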
## Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels
So far, you have used the default settings for your SPLOMs. Now, we will introduce two common customizations to explore:
The first deletes the diagonal panels, and the second displays only the upper or lower triangle of plots. Both customizations are implemented by adding a style() layer.
Your task is to style your SPLOM from the previous exercise to explore how these customizations work.
Your plot from the previous exercise is stored in the splom object, and plotly has been loaded for you.
1 Delete the plots along the diagonal by setting the diagonal argument to a list that sets visible to FALSE.
2 Delete the plots in the upper half of the matrix by setting the showupperhalf argument to FALSE.
3 Delete the plots in the lower half of the matrix by setting the showlowerhalf argument to FALSE.
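The three steps might look like the sketch below. For brevity, the splom object here is a hypothetical uncolored SPLOM built from simulated data rather than the plot from the previous exercise.

```r
library(plotly)

# Hypothetical stand-in data and a plain SPLOM stored as `splom`.
set.seed(9)
vg_demo <- data.frame(
  NA_Sales = round(rexp(100, rate = 2), 2),
  EU_Sales = round(rexp(100, rate = 3), 2),
  JP_Sales = round(rexp(100, rate = 6), 2)
)
splom <- vg_demo %>%
  plot_ly() %>%
  add_trace(
    type = "splom",
    dimensions = list(
      list(label = "N. America", values = ~NA_Sales),
      list(label = "Europe", values = ~EU_Sales),
      list(label = "Japan", values = ~JP_Sales)
    )
  )

# 1. Hide the panels on the diagonal.
no_diag <- splom %>% style(diagonal = list(visible = FALSE))

# 2. Hide the upper half, leaving the lower triangle.
lower_only <- splom %>% style(showupperhalf = FALSE)

# 3. Hide the lower half, leaving the upper triangle.
upper_only <- splom %>% style(showlowerhalf = FALSE)

no_diag
```

Each call to style() modifies trace attributes of the existing plot, so the original splom object is left unchanged.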
The vgsales dataset contains 16450 cases (rows), which is large enough that binned scatterplots help avoid overplotting. In this exercise, your task is to create a binned scatterplot of User_Score against Critic_Score to display the entire dataset. (Recall that, up to now, you have displayed only subsets of this dataset as scatterplots.)
Once you have created the plot, be sure to explore the interactivity. Specifically, note that the "z" entry in the hover info corresponds to the number of observations in the chosen bin.
plotly has already been loaded for you.
Create a binned scatterplot with Critic_Score on the x-axis and User_Score on the y-axis. Set the number of bins on the x- and y-axes to 50.
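A minimal sketch of a binned scatterplot using add_histogram2d(). The vg_demo data frame is a hypothetical, simulated stand-in for vgsales and is much smaller than the real dataset.

```r
library(plotly)

# Hypothetical stand-in for vgsales.
set.seed(10)
vg_demo <- data.frame(Critic_Score = sample(20:98, 2000, replace = TRUE))
vg_demo$User_Score <- round(
  pmin(10, pmax(0, 2 + 0.06 * vg_demo$Critic_Score + rnorm(2000, sd = 1))),
  1
)

# Count observations in a grid of bins; nbinsx and nbinsy set the maximum
# number of bins along each axis.
p <- vg_demo %>%
  plot_ly(x = ~Critic_Score, y = ~User_Score) %>%
  add_histogram2d(nbinsx = 50, nbinsy = 50)
p
```

Hovering over a cell reports its count as z, which is how dense regions remain readable where individual markers would overlap.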