Ch3_Controlling Parameters

Ch3 Advanced charts

In this chapter, you move past basic plotly charts to explore more complex relationships and larger datasets. You will learn how to layer traces, create faceted charts and scatterplot matrices, and create binned scatterplots.

3.1 Layering traces

3.1.1 Adding a linear smoother

You’ve seen how to add LOESS smoothers to a scatterplot by using both the add_markers() and add_lines() traces. Adding a linear smoother uses the same approach, but you use the lm() function to fit the linear model.

In this exercise, your task is to add a linear smoother to a scatterplot of user score against critic score for video games in 2016.

When you add smoothers, missing values (NAs) can be problematic because many modeling functions silently drop incomplete observations, so the fitted values no longer line up with the rows of the data. To avoid this mismatch, use select() and na.omit() to remove rows with missing values before fitting the model and plotting.

Note that plotly and the vgsales2016 data have already been loaded for you.

3.1.1.1 Instructions

Fit a linear regression model using Critic_Score as the predictor variable and User_Score as the response variable. Store this model in the object m.

Create a scatterplot showing Critic_Score on the x-axis and User_Score on the y-axis.

Add a linear smoother to your scatterplot representing the fitted values.

3.1.1.2 Adding a linear smoother

## Parsed with column specification:
## cols(
##   Name = col_character(),
##   Platform = col_character(),
##   Year = col_integer(),
##   Genre = col_character(),
##   Publisher = col_character(),
##   NA_Sales = col_double(),
##   EU_Sales = col_double(),
##   JP_Sales = col_double(),
##   Other_Sales = col_double(),
##   Global_Sales = col_double(),
##   Critic_Score = col_integer(),
##   Critic_Count = col_integer(),
##   User_Score = col_character(),
##   User_Count = col_integer(),
##   Developer = col_character(),
##   Rating = col_character()
## )

## Warning: Can't display both discrete & non-discrete data on same axis
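
The column specification above shows User_Score parsed as character, which is a likely source of the discrete/non-discrete axis warning. The following is a minimal sketch of the exercise, assuming User_Score is converted to numeric first. The small vg data frame and the name vg_complete are made-up stand-ins for illustration, not the course data or the course solution:

```r
library(plotly)
library(dplyr)

# Hypothetical stand-in for vgsales2016 (made-up values, including NAs)
vg <- data.frame(
  Critic_Score = c(76, 82, NA, 91, 65, 70, 88, 59),
  User_Score   = c("7.2", "8.0", "6.5", NA, "6.1", "7.0", "8.4", "5.5"),
  stringsAsFactors = FALSE
)

# Assumption: convert the character scores to numeric, then keep only
# the two plotted columns and drop rows with missing values
vg_complete <- vg %>%
  mutate(User_Score = as.numeric(User_Score)) %>%
  select(Critic_Score, User_Score) %>%
  na.omit()

# Fit the linear model: User_Score is the response, Critic_Score the predictor
m <- lm(User_Score ~ Critic_Score, data = vg_complete)

# Scatterplot with the fitted values layered on top as a line trace
p <- vg_complete %>%
  plot_ly(x = ~Critic_Score, y = ~User_Score) %>%
  add_markers(showlegend = FALSE) %>%
  add_lines(y = ~fitted(m))

p
```

Because the model and the plot use the same NA-free data frame, fitted(m) has exactly one value per plotted row.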

3.1.2 Overlayed density plots

In this exercise, you will learn how to create density plots and overlay them to compare the distribution of critic scores for three video game publishers: Activision, Electronic Arts, and Nintendo.

To create a density plot for Critic_Score, store the results of the density() command, and then pass the x and y coordinates to add_lines():

    d <- density(vgsales2016$Critic_Score, na.rm = TRUE)
    plot_ly() %>%
      add_lines(x = ~d$x, y = ~d$y, fill = "tozeroy") %>%
      layout(xaxis = list(title = "Critic score"),
             yaxis = list(title = "Density"))

Notice how you can create new plot types easily using familiar code! The fill = "tozeroy" argument fills the area under the curve.

The data frames activision, ea, and nintendo are loaded, as is plotly.

3.1.2.1 Instructions

Compute density curves for Activision, EA, and Nintendo, storing them in the d.a, d.e, and d.n objects, respectively.

Create overlayed density plots of Critic_Score for activision, ea, and nintendo (in that order).

3.1.2.2 Overlayed density plots
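
One way to approach this exercise is sketched below. The three data frames are simulated here (made-up values) so that the block runs on its own; in the exercise they are already loaded. The trace names passed to name are an illustrative choice:

```r
library(plotly)

# Hypothetical stand-ins for the preloaded data frames (simulated scores)
set.seed(1)
activision <- data.frame(Critic_Score = c(round(rnorm(40, 70, 10)), NA))
ea         <- data.frame(Critic_Score = c(round(rnorm(40, 75, 8)), NA))
nintendo   <- data.frame(Critic_Score = c(round(rnorm(40, 78, 9)), NA))

# One kernel density estimate per publisher; na.rm drops missing scores
d.a <- density(activision$Critic_Score, na.rm = TRUE)
d.e <- density(ea$Critic_Score, na.rm = TRUE)
d.n <- density(nintendo$Critic_Score, na.rm = TRUE)

# Layer one filled line trace per publisher on the same axes
p <- plot_ly() %>%
  add_lines(x = ~d.a$x, y = ~d.a$y, name = "Activision", fill = "tozeroy") %>%
  add_lines(x = ~d.e$x, y = ~d.e$y, name = "EA", fill = "tozeroy") %>%
  add_lines(x = ~d.n$x, y = ~d.n$y, name = "Nintendo", fill = "tozeroy") %>%
  layout(xaxis = list(title = "Critic score"),
         yaxis = list(title = "Density"))

p
```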

3.2 Subplots

3.2.1 Manual faceting

Recall that the subplot() command allows you to combine charts to create facets (i.e., subplots or small multiples). This is a great way to explore distributions and relationships across factors. In this exercise, you will explore how the relationship between critic score and user score changes (or stays the same) across platforms.

Note that plotly and dplyr have already been loaded for you.

3.2.1.1 Instructions

Create a scatterplot showing Critic_Score on the x-axis and User_Score on the y-axis for PS4 games. Name the trace for the platform and store this plot as p1.

Create a scatterplot showing Critic_Score on the x-axis and User_Score on the y-axis for XOne video games. Name the trace for the platform and store this plot as p2.

Use subplot() to create a faceted scatterplot containing p1 and p2 with two rows.

3.2.1.2 Manual faceting
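
A minimal sketch of manual faceting follows. The vg data frame is a made-up stand-in for vgsales2016 with only the columns needed here:

```r
library(plotly)
library(dplyr)

# Hypothetical stand-in for vgsales2016 (made-up values)
vg <- data.frame(
  Platform     = rep(c("PS4", "XOne"), each = 4),
  Critic_Score = c(85, 72, 90, 66, 80, 75, 88, 61),
  User_Score   = c(8.1, 6.9, 8.7, 6.0, 7.5, 7.1, 8.2, 5.8)
)

# One scatterplot per platform, each trace named after its platform
p1 <- vg %>%
  filter(Platform == "PS4") %>%
  plot_ly(x = ~Critic_Score, y = ~User_Score) %>%
  add_markers(name = "PS4")

p2 <- vg %>%
  filter(Platform == "XOne") %>%
  plot_ly(x = ~Critic_Score, y = ~User_Score) %>%
  add_markers(name = "XOne")

# Stack the two plots vertically as facets
sp <- subplot(p1, p2, nrows = 2)

sp
```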

3.2.2 Automated faceting

In the previous exercise, you manually created a faceted scatterplot. This was not very tedious because you were only focused on two groups. However, there are 9 platforms in the vgsales2016 dataset, and it would be very tedious to manually code 9 scatterplots.

In this exercise, you will practice using the group_by() and do() commands to automate the process of creating a faceted scatterplot with one facet per platform. Remember that the entire plotting command is embedded within do(), as shown in the template below:

    data %>%
      group_by(factor) %>%
      do(
        plot = plot_ly(data = ., x = ~x, y = ~y) %>%
          add_markers(name = ~factor)
      ) %>%
      subplot(nrows = R, shareY = TRUE, shareX = TRUE)

3.2.2.1 Instructions

Use group_by(), do(), and subplot() to create a faceted scatterplot showing Critic_Score on the x-axis and User_Score on the y-axis, where the facets are defined by Platform. Arrange the facets in a grid with 3 rows.

3.2.2.2 Automated faceting
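
The template can be filled in roughly as follows. This is a sketch on simulated data: the six platform labels and all scores in vg are made up so that the block is self-contained, and they do not reflect the contents of vgsales2016:

```r
library(plotly)
library(dplyr)

# Hypothetical stand-in for vgsales2016: six platforms, ten games each
set.seed(42)
platforms <- c("3DS", "PC", "PS3", "PS4", "WiiU", "XOne")
vg <- data.frame(
  Platform     = rep(platforms, each = 10),
  Critic_Score = round(runif(60, 40, 95)),
  User_Score   = round(runif(60, 3, 9.5), 1)
)

# do() builds one plotly object per group and stores it in a list-column;
# subplot() then arranges those plots in a grid with shared axes
sp <- vg %>%
  group_by(Platform) %>%
  do(
    plot = plot_ly(data = ., x = ~Critic_Score, y = ~User_Score) %>%
      add_markers(name = ~Platform)
  ) %>%
  subplot(nrows = 3, shareY = TRUE, shareX = TRUE)

sp
```

Adding another platform to the data adds another facet without any change to the plotting code, which is the main advantage over the manual approach.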

3.2.3 Plot and axis titles

In the previous two exercises, you saw a set of subplots that lack any axis labels and a set of subplots that used the column names as axis labels. Why are they different? By default, the subplot() command sets titleX = shareX and titleY = shareY; thus, axis labels are only displayed if shareX and/or shareY are TRUE. You can add titleX = TRUE and/or titleY = TRUE to override this behavior.

In this example, your task is to add titles to subplots. Note that plotly has already been loaded for you.

3.2.3.1 Instructions

Adapt the subplot() code to allow the x- and y-axis titles to be shared. Add the title “User score vs. critic score by platform, 2016” to the plot.

3.2.3.2 Plot and axis titles
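
The sketch below illustrates both ways of getting axis titles onto a subplot, plus a plot title added with layout(). The data frame vg is made up, and the object names sp_shared and sp_titles are illustrative only:

```r
library(plotly)
library(dplyr)

# Hypothetical stand-in for vgsales2016 (made-up values)
vg <- data.frame(
  Platform     = rep(c("PS4", "XOne"), each = 4),
  Critic_Score = c(85, 72, 90, 66, 80, 75, 88, 61),
  User_Score   = c(8.1, 6.9, 8.7, 6.0, 7.5, 7.1, 8.2, 5.8)
)

p1 <- vg %>%
  filter(Platform == "PS4") %>%
  plot_ly(x = ~Critic_Score, y = ~User_Score) %>%
  add_markers(name = "PS4")

p2 <- vg %>%
  filter(Platform == "XOne") %>%
  plot_ly(x = ~Critic_Score, y = ~User_Score) %>%
  add_markers(name = "XOne")

# Option 1: share the axes; titleX and titleY default to shareX and shareY,
# so the axis titles are displayed
sp_shared <- subplot(p1, p2, nrows = 2, shareX = TRUE, shareY = TRUE) %>%
  layout(title = "User score vs. critic score by platform, 2016")

# Option 2: keep separate axes but switch the titles on explicitly
sp_titles <- subplot(p1, p2, nrows = 2, titleX = TRUE, titleY = TRUE) %>%
  layout(title = "User score vs. critic score by platform, 2016")

sp_shared
```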

3.2.4 Polishing axis titles

The axes in a subplot can be renamed using the layout() command, just like in a single plot; however, there are multiple x-axes to rename. For example, a 2 x 2 grid of plots requires four x-axis labels:

    p %>% # subplot
      layout(
        xaxis = list(title = "title 1"),
        xaxis2 = list(title = "title 2"),
        xaxis3 = list(title = "title 3"),
        xaxis4 = list(title = "title 4")
      )

A similar strategy holds for the y-axis labels.

In this example, your task is to polish the axis titles on a subplot. Note that plotly has already been loaded for you.

3.2.4.1 Instructions

For the first plot in sp2, use “Global Sales (M units)” for the y-axis label, and leave the x-axis label blank.

For the second plot in sp2, label the x-axis “Year” and the y-axis “Global Sales (M units)”.

3.2.4.2 Polishing axis titles
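
As a rough illustration, the sketch below builds a two-row subplot and then relabels its axes. The contents of sp2 in the exercise are not shown in these notes, so the sales data frame, its Group column, and the two component plots are hypothetical:

```r
library(plotly)
library(dplyr)

# Hypothetical yearly sales for two made-up groups (values are invented)
sales <- data.frame(
  Group        = rep(c("A", "B"), each = 5),
  Year         = rep(2012:2016, times = 2),
  Global_Sales = c(1.2, 2.5, 3.1, 2.8, 1.9, 0.8, 1.6, 2.2, 2.0, 1.4)
)

top <- sales %>%
  filter(Group == "A") %>%
  plot_ly(x = ~Year, y = ~Global_Sales) %>%
  add_markers(name = "A")

bottom <- sales %>%
  filter(Group == "B") %>%
  plot_ly(x = ~Year, y = ~Global_Sales) %>%
  add_markers(name = "B")

# Assumed structure for sp2: two plots stacked in two rows
sp2 <- subplot(top, bottom, nrows = 2)

# xaxis/yaxis refer to the first plot, xaxis2/yaxis2 to the second
polished <- sp2 %>%
  layout(
    xaxis  = list(title = ""),
    yaxis  = list(title = "Global Sales (M units)"),
    xaxis2 = list(title = "Year"),
    yaxis2 = list(title = "Global Sales (M units)")
  )

polished
```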

3.3 Scatterplot Matrices

Note: SPLOMs do not render in the Rmd file, but they do work in the browser; the cause is unclear.

3.3.1 Your first SPLOM

How closely related are North American and European video game sales? How do sales in Japan compare to North America and Europe? In this exercise you will create a scatterplot matrix (abbreviated as SPLOM) to explore these questions based on the vgsales2016 dataset.

Note that plotly has already been loaded for you.

3.3.1.1 Instructions

Create a scatterplot matrix including NA_Sales, EU_Sales, and JP_Sales (in that order). Label the panels N. America, Europe, and Japan, respectively.

3.3.1.2 Your first SPLOM
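
A SPLOM is created with a trace of type "splom", where each entry of dimensions supplies a panel label and the values to plot. The sketch below uses simulated sales figures (vg is a made-up stand-in for vgsales2016):

```r
library(plotly)

# Hypothetical stand-in for vgsales2016 (simulated sales, in millions of units)
set.seed(7)
vg <- data.frame(
  NA_Sales = round(rexp(50, rate = 2), 2),
  EU_Sales = round(rexp(50, rate = 3), 2),
  JP_Sales = round(rexp(50, rate = 5), 2)
)

# One dimension per variable; the order of the list sets the panel order
p <- vg %>%
  plot_ly() %>%
  add_trace(
    type = "splom",
    dimensions = list(
      list(label = "N. America", values = ~NA_Sales),
      list(label = "Europe",     values = ~EU_Sales),
      list(label = "Japan",      values = ~JP_Sales)
    )
  )

p
```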

3.3.2 Customizing color

Just like with a single scatterplot, it can be useful to add color to represent an additional variable in a scatterplot matrix. In this exercise, you will add color to represent whether the game was produced by Nintendo or not.

In the code provided, an indicator (i.e., dummy) variable (nintendo) has been created to indicate whether a game was published by Nintendo or some other publisher.

Note that plotly has already been loaded for you.

3.3.2.1 Instructions

Recreate the SPLOM of NA_Sales, EU_Sales, and JP_Sales (in that order). Remember to label the panels N. America, Europe, and Japan, respectively. Use color to represent the values in nintendo.

3.3.2.2 Customizing color

## Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels
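
The RColorBrewer warning appears to arise because the color variable has only two levels while the "Set2" palette has a minimum of three; the plot is still drawn. A sketch of the colored SPLOM follows, on simulated data. How the course encodes nintendo is not shown in these notes, so the ifelse() labels below are an assumption:

```r
library(plotly)
library(dplyr)

# Hypothetical stand-in for vgsales2016 (simulated values)
set.seed(7)
vg <- data.frame(
  Publisher = sample(c("Nintendo", "Activision", "Electronic Arts"),
                     50, replace = TRUE),
  NA_Sales  = round(rexp(50, rate = 2), 2),
  EU_Sales  = round(rexp(50, rate = 3), 2),
  JP_Sales  = round(rexp(50, rate = 5), 2),
  stringsAsFactors = FALSE
)

# Assumed encoding of the indicator variable: two labelled levels
vg <- vg %>%
  mutate(nintendo = ifelse(Publisher == "Nintendo", "Nintendo", "Other"))

# Mapping color in plot_ly() applies it to every panel of the matrix
p <- vg %>%
  plot_ly(color = ~nintendo) %>%
  add_trace(
    type = "splom",
    dimensions = list(
      list(label = "N. America", values = ~NA_Sales),
      list(label = "Europe",     values = ~EU_Sales),
      list(label = "Japan",      values = ~JP_Sales)
    )
  )

p
```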

3.3.3 Tweaking the appearance

So far, you have used the default settings for your SPLOMs. Now, we will introduce two common customizations to explore:

- Deleting the diagonal panels.
- Displaying only the upper or lower triangle of plots.

Both customizations are implemented by adding a style() layer.

Your task is to style your SPLOM from the previous exercise to explore how these customizations work.

Your plot from the previous exercise is stored in the splom object, and plotly has been loaded for you.

3.3.3.1 Instructions

1 Delete the plots along the diagonal by setting the diagonal argument to a list that sets visible to FALSE.

2 Delete the plots in the upper half of the matrix by setting the showupperhalf argument to FALSE.

3 Delete the plots in the lower half of the matrix by setting the showlowerhalf argument to FALSE.

3.3.3.2 Tweaking the appearance

## Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels
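
The three steps can be sketched as below. The splom object is rebuilt here from simulated data so that the block runs on its own; the explicit two-color vector passed to colors is an assumption added for this sketch, not part of the exercise:

```r
library(plotly)

# Hypothetical stand-in data (simulated values)
set.seed(7)
vg <- data.frame(
  nintendo = sample(c("Nintendo", "Other"), 50, replace = TRUE),
  NA_Sales = round(rexp(50, rate = 2), 2),
  EU_Sales = round(rexp(50, rate = 3), 2),
  JP_Sales = round(rexp(50, rate = 5), 2),
  stringsAsFactors = FALSE
)

# Rebuild a SPLOM like the one from the previous exercise
splom <- vg %>%
  plot_ly(color = ~nintendo, colors = c("#1b9e77", "#d95f02")) %>%
  add_trace(
    type = "splom",
    dimensions = list(
      list(label = "N. America", values = ~NA_Sales),
      list(label = "Europe",     values = ~EU_Sales),
      list(label = "Japan",      values = ~JP_Sales)
    )
  )

# Step 1: hide the diagonal panels
no_diag <- splom %>% style(diagonal = list(visible = FALSE))

# Step 2: hide the upper half, leaving the lower triangle
lower_only <- splom %>% style(showupperhalf = FALSE)

# Step 3: hide the lower half, leaving the upper triangle
upper_only <- splom %>% style(showlowerhalf = FALSE)

no_diag
```

Each call to style() returns a new plot object, so splom itself is unchanged and the three variants can be compared side by side.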

3.4 Binned scatterplots

3.4.1 Binning a scatterplot

The dataset vgsales contains 16450 cases (rows), which is large enough that a binned scatterplot helps avoid overplotting. In this exercise, your task is to create a binned scatterplot of User_Score against Critic_Score to display the entire dataset. (Recall that, up to now, you have only displayed subsets of this dataset as scatterplots.)

Once you have created the plot, be sure to explore the interactivity. Specifically, note that the “z” entry in the hover info corresponds to the number of observations in the chosen bin.

plotly has already been loaded for you.

3.4.1.1 Instructions

Create a binned scatterplot with Critic_Score on the x-axis and User_Score on the y-axis. Set the number of bins on the x- and y-axes to 50.
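
A binned scatterplot can be drawn with a two-dimensional histogram trace. The sketch below uses add_histogram2d() on simulated scores; the vg data frame is a made-up stand-in for vgsales and is much smaller than the real dataset:

```r
library(plotly)

# Hypothetical stand-in for vgsales (simulated, loosely correlated scores)
set.seed(123)
critic <- round(runif(2000, 20, 98))
vg <- data.frame(
  Critic_Score = critic,
  User_Score   = round(pmin(pmax(critic / 10 + rnorm(2000, 0, 1), 0), 10), 1)
)

# Each cell is colored by the number of observations that fall into it;
# nbinsx and nbinsy set the number of bins along each axis
p <- vg %>%
  plot_ly(x = ~Critic_Score, y = ~User_Score) %>%
  add_histogram2d(nbinsx = 50, nbinsy = 50)

p
```

Hovering over a cell shows its x and y position along with z, the count of observations in that bin.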