Loading Libraries

library(ggvis)
library(knitr)

Part one - The Grammar of Graphics

Section 1 - Introduction to ggvis

ggvis and its capabilities

ggvis is an R package, a collection of functions and data sets that enhance the R language. You need to install and load ggvis before you can use it.

ggvis helps you visualize data sets. For example, the command below visualizes the mtcars data frame that comes with R. It plots the wt variable (weight) of mtcars on the x axis and the mpg variable (miles per gallon) on the y axis, and it uses points to visualize the data.

mtcars %>% ggvis(~wt, ~mpg) %>% layer_points()

The ggvis interface is intuitive. For example, in the code below you change the fill color of the points in your graph by adding fill := “red”. You can also replace the layer_points() function with a different layer function to plot your data differently. You can even plot the same data with two layers to make your graph more informative.

# Make a graph with red points
mtcars %>% ggvis(~wt, ~mpg, fill := "red") %>% layer_points()

# Draw smooths instead of points
mtcars %>% ggvis(~wt, ~mpg) %>% layer_smooths()

# Make a graph containing both points and a smoothed summary line
mtcars %>% ggvis(~wt, ~mpg) %>% layer_points() %>% layer_smooths()

Section 2 - The grammar of ggvis

ggvis grammar ~ graphics grammar

ggvis recreates the grammar of graphics. You can combine a set of data, properties and marks with the following format.

# <data>  %>%    ggvis( ~ <x property> , ~ <y property>, fill = ~ <fill property>, ...) %>%   layer_<marks>()

Below we demonstrate how to use the grammar of graphics to build a graph step by step.

First, we make a graph that uses the points mark to plot variables from the pressure data frame. The graph plots the temperature variable on the x axis and the pressure variable on the y axis.

pressure %>% ggvis(~ temperature, ~ pressure ) %>% layer_points()

We can transform the plot created into a bar graph by changing the type of mark it uses.

pressure %>% ggvis(~ temperature, ~ pressure ) %>% layer_bars()

Similarly, we can change the plot into a line graph, again by changing the type of mark it uses.

pressure %>% ggvis(~ temperature, ~ pressure ) %>% layer_lines()

Below we map the fill property of the scatterplot to the temperature variable.

pressure %>% ggvis(~ temperature, ~ pressure, fill = ~ temperature ) %>% layer_points() 

We extend the solution and map the size property to the pressure variable.

pressure %>% 
        ggvis(~ temperature, ~ pressure, fill = ~ temperature ,  size = ~ pressure ) %>% 
        layer_points() 

Part two - Lines and Syntax

Section 3 - Three new types of syntax

Three operators: %>%, = and :=

You’ll notice that ggvis uses the pipe operator, %>%, from the magrittr package. The pipe operator passes the result from its left-hand side into the first argument of the function on its right-hand side. So f(x) %>% g(y) is a shortcut for g(f(x), y).
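As a minimal sketch of that equivalence, the snippet below defines two hypothetical helper functions, f and g (illustrative names that are not part of ggvis), and checks that the piped call and the nested call agree. It assumes the magrittr package is installed.

```r
library(magrittr)  # provides %>%

# Hypothetical helpers, for illustration only
f <- function(x) x + 1
g <- function(a, b) a * b

piped  <- f(2) %>% g(10)   # the left-hand side becomes g()'s first argument
nested <- g(f(2), 10)      # the same call written without the pipe

identical(piped, nested)
```

Reading a long chain left to right is usually easier than reading the nested form inside out, which is why the ggvis examples here are written with pipes.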

Next, ggvis uses both = and := to assign properties.

= maps a property to a data value or a set of data values. This is how you visualize variation in your data set. ggvis will scale the values to a pleasing range of colors (or sizes, widths, etc.) and add a legend that explains how values are mapped to particular instances of the property.

:= sets a property to a specific color (or size, width, etc.). This is how you customize the appearance of your plots. If you set a property to a number, ggvis will usually interpret the number as a number of pixels. If you set a location property to a number, ggvis will usually interpret the number as the number of pixels from the top left-hand corner of the plot. You can set the fill of points to any common color name. ggvis passes your color selection to Vega, a JavaScript library, so you can use any color name recognized by HTML/CSS.

You can refer to three different types of objects in your ggvis code: objects, variables, and raw values.

+ If you type a string of letters, ggvis will treat the string as an object name. It will look for an object with that name in your current environment.
+ If you place a tilde, ~, at the start of the string, ggvis will treat the string as a variable name. It will look for the column with that name in the data set that the graph visualizes.
+ If you surround a string of letters with quotation marks, ggvis will treat the string as a raw value, e.g., a piece of text.
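The three kinds of reference can be told apart in plain R, independent of ggvis. The sketch below is only an illustration of how a tilde defers evaluation until a data frame is available; the object names are hypothetical, and how ggvis handles these values internally may differ.

```r
point_size <- 100      # an object: looked up in the current environment
var_ref    <- ~wt      # a variable: a one-sided formula naming a column
raw_value  <- "red"    # a raw value: a plain character string

class(var_ref)         # the tilde creates a formula, not a value
all.vars(var_ref)      # the column name the formula refers to

# The name is only resolved once a data set is supplied
wt_values <- eval(var_ref[[2]], mtcars)
head(wt_values)
```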

A few examples are provided below. The first uses the pipe operator:

faithful %>% ggvis( ~waiting, ~eruptions) %>% layer_points()

The graph below maps the size property to the pressure variable.

pressure %>% ggvis(~temperature, ~pressure, size = ~pressure ) %>% layer_points()

The graph below sets the size property to 100. For points, size is an area in square pixels rather than a width, so each point should appear roughly 10 pixels across.

pressure %>% ggvis(~temperature, ~pressure, size := 100) %>% layer_points()

The graph below sets the fill property to make the points red:

pressure %>% ggvis(~temperature, ~pressure, fill := "red") %>% layer_points()

Properties for points

You can manipulate many different properties when using the points mark, including x, y, fill, fillOpacity, opacity, shape, size, stroke, strokeOpacity, and strokeWidth.

The shape property in turn recognizes several different values: circle (default), square, cross, diamond, triangle-up, and triangle-down.

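Below we examine a few plots. In the first graph we set the fill property to the color names stored in pressure$black. Because := bypasses the scale, each point is drawn in the color named in its row and no legend is added. Had we written fill = ~black instead, ggvis would have mapped the labels in pressure$black to a default collection of colors and added a legend to explain the mapping.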

pressure$black <- c("black", "grey80", "grey50", 
                    "navyblue", "blue", "springgreen4", 
                    "green4", "green", "yellowgreen", 
                    "yellow", "orange", "darkorange", 
                    "orangered", "red", "magenta", "violet", 
                    "purple", "purple4", "slateblue")
pressure %>% 
  ggvis(~temperature, ~pressure, fill := ~ black ) %>% 
  layer_points()

Below we plot the faithful data with the points mark, with the waiting variable on the x axis and the eruptions variable on the y axis. We map size to the eruptions variable and set the opacity to 0.5 (50%), the fill to blue, and the stroke to black.

faithful %>% 
        ggvis(~ waiting, ~eruptions, size= ~eruptions, opacity := 0.5, fill := "blue", stroke := "black") %>%
        layer_points()

Below we plot the faithful data with the points mark again, with the waiting variable on the x axis and the eruptions variable on the y axis. This time we map fill opacity to the eruptions variable and set size to 100, the fill to red, the stroke to red, and the shape to cross.

faithful %>% 
        ggvis(~ waiting, ~eruptions, fillOpacity= ~eruptions, size := 100, fill := "red" ,stroke := "red", shape := "cross" ) %>%
        layer_points()

Section 4 - the line, a special type of mark

Properties for lines

Similar to points, lines have specific properties; they respond to: x, y, fill, fillOpacity, opacity, stroke, strokeDash, strokeOpacity, and strokeWidth. As you can see, most of them are common to the properties for points, some are missing (e.g., no size property) and others are new (e.g., strokeDash).

Below is a simple line graph.

pressure %>% ggvis(~temperature, ~pressure) %>% 
        layer_lines()

Below we use the lines mark, set the line color to red, the line width to 2 pixels, and the line type to use dashes that are six pixels long.

pressure %>% 
        ggvis(~temperature, ~pressure, stroke := "red", strokeWidth := 2, strokeDash := 6 ) %>% 
        layer_lines()

Path marks and polygons

The lines mark will always connect the points in your plot from the leftmost point to the rightmost point. This can be undesirable if you are trying to plot a specific shape.

For example, the code below would plot a map of Texas if ggvis connected the points in the correct order. You can do this with the paths mark, layer_paths(). The paths mark is similar to the lines mark except that it connects points in the order that they appear in the data set. So the paths mark will connect the point that corresponds to the first row of the data to the point that corresponds to the second row of data, and so on, no matter where those points appear in the graph.

library("maps")
texas <- ggplot2::map_data("state", region = "texas")
texas %>% ggvis(~long, ~lat) %>% layer_paths()

The second plot sets the fill property of the texas map to dark orange (“darkorange”). This additional property will color your map!

texas %>% ggvis(~long, ~lat, fill := "darkorange") %>% layer_paths()
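To see why the drawing order matters, consider the corners of a square stored in the order you would trace them. The base R sketch below uses a small hypothetical data frame and assumes, as described above, that the lines mark visits points from left to right while the paths mark keeps row order.

```r
# Corners of a unit square, listed in tracing order (hypothetical data)
square <- data.frame(x = c(0, 1, 1, 0, 0),
                     y = c(0, 0, 1, 1, 0))

path_order <- seq_len(nrow(square))  # row order, as the paths mark uses
line_order <- order(square$x)        # left-to-right order, as the lines mark uses

path_order
line_order

# Rows in the order a left-to-right line would visit them
square[line_order, ]
```

The left-to-right ordering jumps between the bottom and top edges, so the outline of the square is lost; row order preserves it.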

Display model fits

compute_smooth() is a useful function to use with line graphs. It takes a data frame as input and returns a new data frame as output. The new data frame will contain the x and y values of a smooth line fitted to the data in the original data frame.

compute_smooth() takes a couple of arguments. First we use the pipe operator to pass it the data set mtcars. Then we provide an R formula, mpg ~ wt. An R formula contains two variables connected by a tilde, ~. compute_smooth() will use the variable on the left as the y variable for the smooth line, and it will use the variable on the right as the x variable for the smooth line.

Finally, compute_smooth() takes a model argument. This is the name of the R modelling function that compute_smooth() should use to calculate the smooth line. lm() is R’s function for building linear models. If you do not supply a model value, compute_smooth() will generate a set of smoothed coordinates with loess.

head( mtcars %>% compute_smooth(mpg ~ wt))
##      pred_    resp_
## 1 1.513000 32.08897
## 2 1.562506 31.68786
## 3 1.612013 31.28163
## 4 1.661519 30.87037
## 5 1.711025 30.45419
## 6 1.760532 30.03318

compute_smooth() always returns a data set with two columns, one named pred_ and one named resp_. As a result, it is very easy to use compute_smooth() to plot a smoothed line of your data. For example, we can extend the code above to plot the results of compute_smooth() as a line graph.

mtcars %>% compute_smooth(mpg ~ wt) %>% ggvis( ~pred_ ,~ resp_ ) %>% layer_lines()
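For intuition about what the smoothed coordinates are, here is a rough base R sketch that fits a loess curve and evaluates it on an evenly spaced grid. It is an approximation under stated assumptions, not the ggvis implementation: the grid of 80 points is inferred from the spacing of the pred_ values printed above, and loess() is run with R's default settings.

```r
# Fit a loess smoother with R's defaults
fit <- loess(mpg ~ wt, data = mtcars)

# Evaluate the fit on an evenly spaced grid over the range of wt
grid <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 80))

smoothed <- data.frame(pred_ = grid$wt,
                       resp_ = predict(fit, newdata = grid))
head(smoothed)
```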

Calling compute_smooth() can be a bit of a hassle, so ggvis includes a layer that automatically calls compute_smooth() in the background and plots the results as a smoothed line. That layer is layer_smooths().

Below we recreate the same graph by plotting mtcars with the smooths mark.

mtcars %>%  ggvis( ~wt , ~mpg )  %>% layer_smooths()

Often you’ll want to place the results of layer_smooths() on a points plot that contains the raw data. Below we extend the code for the previous plot to use both layer_points() and layer_smooths().

mtcars %>% ggvis( ~wt , ~mpg )  %>% layer_smooths() %>% layer_points()

Part three - Transformations

Section 5 - Compute functions

Histograms

A histogram - plotted using layer_histograms() - shows the distribution of a single continuous variable. To do this, a histogram divides the x axis into evenly spaced intervals, known as bins. Above each bin, the histogram plots a rectangle. The height of the rectangle displays how many values of the variable fell within the range of the bin. As a result, the rectangles show how the frequency of values varies over the range of the variable.

You can change the appearance of a histogram by changing the width of the bins in the histogram. In fact, you should explore different binwidths whenever you make a histogram because different binwidths can reveal different types of information. To change the binwidth of a histogram, map the width argument of layer_histograms() to a number.

width is an argument of layer_histograms(). For best results, you should write the width argument in the parentheses that follow layer_histograms(). Always map width to its value with =; this ensures that it uses the same units as the variable on the x axis.

The histogram below shows the distribution of the waiting variable of the faithful data set.

faithful %>% ggvis(~ waiting) %>% layer_histograms()
## Guessing width = 2 # range / 27

Below we map the binwidth of this histogram to 5 units.

faithful %>% ggvis(~ waiting) %>% layer_histograms(width = 5)

Have you noticed that the histogram plots data that does not appear directly in your data set? Specifically, it plots the counts of each bin on the y axis.

Behind the scenes, layer_histograms() calls compute_bin() to calculate these counts. You can calculate the same values by calling compute_bin() manually. compute_bin() requires at least two arguments: a data set (which you will provide with the %>% syntax), and a variable name to bin on. You can also pass compute_bin() a width argument, just as you pass layer_histograms() a width argument.

compute_bin() returns a data frame that provides everything you need to build a histogram from scratch. Notice the similarity with the previous case: combining compute_smooth() and layer_lines() had the exact same result as using layer_smooths() directly! Can you spot the analogy?

faithful %>% compute_bin (~waiting, width = 5) 
##    count_ x_ xmin_ xmax_ width_
## 1      13 45  42.5  47.5      5
## 2      24 50  47.5  52.5      5
## 3      29 55  52.5  57.5      5
## 4      21 60  57.5  62.5      5
## 5      13 65  62.5  67.5      5
## 6      13 70  67.5  72.5      5
## 7      42 75  72.5  77.5      5
## 8      58 80  77.5  82.5      5
## 9      38 85  82.5  87.5      5
## 10     17 90  87.5  92.5      5
## 11      4 95  92.5  97.5      5
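The same counts can be approximated in base R, which may help clarify what the columns mean. The sketch below takes the bin edges from the xmin_ and xmax_ columns printed above and counts the observations that fall between them; it illustrates the idea and is not how compute_bin() itself is implemented.

```r
# Bin edges taken from the xmin_/xmax_ columns of the output above
breaks <- seq(42.5, 97.5, by = 5)

# Assign each waiting time to a bin, then count the members of each bin
bins   <- cut(faithful$waiting, breaks = breaks)
counts <- as.vector(table(bins))

manual_bins <- data.frame(count_ = counts,
                          xmin_  = head(breaks, -1),
                          xmax_  = tail(breaks, -1))
manual_bins
```

Because the waiting times are whole minutes and the edges fall on half minutes, no observation sits exactly on a bin boundary.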

We combine the output of compute_bin() with ggvis() and layer_rects() to plot a histogram. layer_rects() plots simple rectangles. To use it, you need to pass ggvis() four properties: x, x2, y, and y2. These should correspond to the minimum and maximum x values for each rectangle and the minimum and maximum y values for each rectangle, respectively.

faithful %>% compute_bin(~waiting, width = 5) %>%
        ggvis(x = ~xmin_, x2 = ~xmax_, y = 0, y2 = ~count_) %>%
        layer_rects()

Density plots

Density plots provide another way to display the distribution of a single variable. A density plot uses a line to display the density of a variable at each point in its range. You can think of a density plot as a continuous version of a histogram with a different y scale (although this is not exactly accurate).

compute_density() takes two arguments, a data set and a variable name. It returns a data frame with two columns: pred_, the x values of the variable’s density line, and resp_, the y values of the variable’s density line.

You can use layer_densities() to create density plots. Like layer_histograms() it calls the compute function that it needs in the background, so you do not need to worry about calling compute_density().

Examples are provided below. The first computes the density with compute_density() and plots the result as a line.

faithful %>% compute_density(~waiting) %>% ggvis(~pred_, ~resp_) %>% layer_lines()

The second builds a density plot directly with layer_densities().

faithful %>% ggvis(~waiting, fill := "green") %>% layer_densities()
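As a rough base R counterpart, density() produces the same kind of x/y pairs that a density line is drawn from. The sketch below uses R's default kernel and bandwidth, which is an assumption: compute_density() may use different settings, so the numbers need not match exactly.

```r
# Kernel density estimate with R's default settings
d <- density(faithful$waiting)

dens <- data.frame(pred_ = d$x,   # x values of the density line
                   resp_ = d$y)   # y values of the density line
head(dens)

# The area under a density curve should be close to 1
area <- sum(dens$resp_) * diff(dens$pred_)[1]
area
```

This is the sense in which the y scale differs from a histogram: the curve reports density, so its area, not its height, accounts for all of the observations.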

Shortcuts

You do not need to use a compute function to transform the variables in your data set. You can directly plot transformations of the variables. To do this, use the ~ syntax to pass a transformed variable to ggvis.

For example, the code below will plot a version of cyl that has been transformed into a factor (R’s version of a categorical variable).

mtcars %>% ggvis(~factor(cyl))
## Guessing layer_bars()

layer_bars() will automatically plot count values on the y axis when you do not provide a y variable. To do this, it uses compute_count(). Below we use compute_count() to calculate the count values used in the previous graph.

mtcars %>%  compute_count(~factor(cyl))
##   count_ x_
## 1     11  4
## 2      7  6
## 3     14  8
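For comparison, base R's table() yields the same tally. This is only a cross-check of the counts above, not a description of how compute_count() works internally.

```r
# Count how many cars fall into each cylinder category
cyl_counts <- table(factor(mtcars$cyl))

data.frame(count_ = as.vector(cyl_counts),
           x_     = names(cyl_counts))
```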

Section 6 - ggvis and dplyr

ggvis and group_by

The first chunk of code below uses the group_by() function from the dplyr package to plot grouped smooth lines. ggvis plots a separate smooth line for each unique value of the cyl variable. Since group_by() does not come with the ggvis package, it does not use the ~ syntax (although this may change in the future). You should just pass group_by() a variable name without quotes.

mtcars %>% group_by(cyl) %>% ggvis(~mpg, ~wt, stroke = ~factor(cyl)) %>% layer_smooths()
## Warning in rbind_all(out[[1]]): Unequal factor levels: coercing to
## character

group_by() uses a grouping variable to organize a data set into several groups of observations. It places each observation into a group with other observations that have the same value of the grouping variable. In other words, group_by() will create a separate group for each unique value of the grouping variable. When ggvis encounters grouped data, it will draw a separate mark for each group of observations.

Now we change the code of the previous graph so that it contains a separate density for each value of cyl.

mtcars %>% group_by(cyl) %>% ggvis(~mpg, stroke = ~factor(cyl) ) %>% layer_densities()
## Warning in rbind_all(out[[1]]): Unequal factor levels: coercing to
## character

Next we map the fill property to a categorical version of cyl. This addition clarifies which density corresponds to which group of observations.

mtcars %>% group_by(cyl) %>% ggvis(~mpg, fill = ~factor(cyl) ) %>% layer_densities()
## Warning in rbind_all(out[[1]]): Unequal factor levels: coercing to
## character

group_by() versus interaction()

group_by() can also group data based on the interaction of two or more variables. To group based on the interaction of multiple variables, give group_by() multiple variable names, like this:

#<data> %>% group_by(<var1>, <var2>, <var3>, ...)

group_by() will create a separate group for each distinct combination of values within the grouping variables. group_by() does not change the raw values of the data set. The grouping information is saved as an attribute (i.e., metadata). You can remove the grouping information from a data set with ungroup() (e.g., mtcars %>% ungroup()).
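The short sketch below checks these claims on mtcars. It assumes the dplyr package is installed; the object names grouped and plain are illustrative only.

```r
library(dplyr)

# Group by every distinct combination of cyl and am
grouped <- mtcars %>% group_by(cyl, am)

n_groups(grouped)      # number of distinct cyl/am combinations
group_vars(grouped)    # the grouping is stored as metadata

# Removing the grouping leaves the raw values untouched
plain <- grouped %>% ungroup()
is_grouped_df(plain)
all.equal(plain$mpg, mtcars$mpg)
```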

You can also map properties to unique combinations of variables. To do this, use the interaction() function. For example,

#stroke = ~interaction(<var1>, <var2>, <var3>)

will map stroke to the unique combinations of <var1>, <var2>, and <var3>.
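interaction() is a base R function, so you can inspect what it returns before using it in a plot. The sketch below applies it to two columns of mtcars; each level of the result names one combination of the inputs.

```r
# One factor level per combination of cyl and am
combos <- interaction(mtcars$cyl, mtcars$am)

levels(combos)   # labels such as "4.0" join the two values with a dot
table(combos)    # how many cars fall into each combination
```

A property mapped to such a factor therefore gets one color (or other visual value) per combination rather than one per variable.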

The example below displays a separate density for each unique combination of cyl and am.

mtcars %>% group_by(cyl,am) %>% ggvis(~mpg, fill = ~factor(cyl)) %>% layer_densities()
## Warning in rbind_all(out[[1]]): Unequal factor levels: coercing to
## character

Next we map fill to the unique combinations of the grouping variables.

mtcars %>% group_by(cyl, am) %>% ggvis(~mpg, fill = ~interaction(cyl,am)) %>% layer_densities()
## Warning in rbind_all(out[[1]]): Unequal factor levels: coercing to
## character

Part four - Interactivity and Layers

Section 7 - Interactive plots

The basics of interactive plots

The first chunk of code below makes a basic interactive plot. The plot includes a select box that you can use to change the shape of the points in the plot. If you run this code inside the RStudio IDE, you get an interactive plot, with visualizations that change on the fly. However, knitr only supports static plots, with the interactions removed. Therefore we show only the code for the examples below.

You can make your plots interactive by setting a property to the output of an input widget. ggvis comes with seven input widgets: input_checkbox(), input_checkboxgroup(), input_numeric(), input_radiobuttons(), input_select(), input_slider(), and input_text(). By default, each returns its current value as a number or character string.

Some examples are provided below:

faithful %>% 
  ggvis(~waiting, ~eruptions, fillOpacity := 0.5, 
        shape := input_select(label = "Choose shape:", 
                              choices = c("circle", "square", "cross", "diamond", 
                                          "triangle-up", "triangle-down"))) %>% 
  layer_points()
## Warning: Can't output dynamic/interactive ggvis plots in a knitr document.
## Generating a static (non-dynamic, non-interactive) version of the plot.

Making the fill property interactive using a select box:

faithful %>% 
  ggvis(~waiting, ~eruptions, fillOpacity := 0.5, 
        shape := input_select(label = "Choose shape:", 
                              choices = c("circle", "square", "cross", 
                                          "diamond", "triangle-up", "triangle-down")), 
        fill := input_select(label = "Choose color:", 
                             choices = c("black", "red", "blue", "green"))) %>% 
  layer_points()
## Warning: Can't output dynamic/interactive ggvis plots in a knitr document.
## Generating a static (non-dynamic, non-interactive) version of the plot.

Adding radio buttons to control the fill of the plot:

mtcars %>% 
  ggvis(~mpg, ~wt, 
        fill := input_radiobuttons(label = "Choose color:", 
                                   choices = c("black", "red", "blue", "green"))) %>% 
    layer_points()
## Warning: Can't output dynamic/interactive ggvis plots in a knitr document.
## Generating a static (non-dynamic, non-interactive) version of the plot.

Input widgets in more detail

Some input widgets provide choices for the user to select from. Others allow the user to provide their own input. For example, input_text() provides a text field for users to type input into. Instead of assigning input_text() a choices argument, you assign it a value argument: a character string to display when the plot first loads.

mtcars %>% 
  ggvis(~mpg, ~wt, 
        fill := input_text(label = "Choose color:", 
                           value = "black")) %>% 
  layer_points()
## Warning: Can't output dynamic/interactive ggvis plots in a knitr document.
## Generating a static (non-dynamic, non-interactive) version of the plot.

By default, input widgets return their values as character strings and numbers. To have a widget return its value as a variable name, you need to add the extra argument map = as.name.

For example, the text widget above passes the character string “black” to the fill argument, which is useful for setting. If we added map = as.name to the arguments of input_text(), the widget would return ~black, which is useful for mapping (or would be if black were a real variable in mtcars). The example below uses map = as.name with input_select(), so that the column of mtcars chosen by the user is mapped to fill:

mtcars %>% 
  ggvis(~mpg, ~wt, 
        fill = input_select(label = "Choose fill variable:", 
                            choices = names(mtcars), map = as.name)) %>% 
  layer_points()
## Warning: Can't output dynamic/interactive ggvis plots in a knitr document.
## Generating a static (non-dynamic, non-interactive) version of the plot.
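as.name() is a base R function, so its effect can be checked outside of any widget. The sketch below converts a character string into a name and then looks that name up in mtcars; the string "wt" stands in for whatever the user would pick in the select box.

```r
choice <- "wt"            # what a widget returns by default: a string
sym    <- as.name(choice) # the same choice as a variable name

is.character(choice)
is.name(sym)

# A name can be resolved against a data set; a string cannot
picked <- eval(sym, mtcars)
head(picked)
```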

Control parameters and values

The previous examples all manipulated properties of the ggvis plots, such as the shape and fill of points in scatterplots. As you will recall from earlier sections, ggvis often needs additional parameters to build the correct graphs. You can also use widgets to control these parameters. Typically, you want to use the input_numeric() and input_slider() widgets to set numerical parameters.

Below we map the binwidth to a numeric field.

mtcars %>% 
  ggvis(~mpg) %>% 
  layer_histograms(width = input_numeric(label = "Choose a binwidth:", value = 1))
## Warning: Can't output dynamic/interactive ggvis plots in a knitr document.
## Generating a static (non-dynamic, non-interactive) version of the plot.

Below we map the binwidth to a slider bar.

mtcars %>% 
  ggvis(~mpg) %>% 
  layer_histograms(width = input_slider(label = "Choose a binwidth:", min = 1, max = 20))
## Warning: Can't output dynamic/interactive ggvis plots in a knitr document.
## Generating a static (non-dynamic, non-interactive) version of the plot.

Section 8 - Multi-layered plots

Multi-layered plots and their properties

You can create multi-layered plots by adding additional layers to a graph with the %>% syntax.

pressure %>% 
  ggvis(~temperature, ~pressure, stroke := "skyblue") %>% 
  layer_lines()%>%
  layer_points()

If you set or map a property inside ggvis(), it will be applied globally: every layer in the graph will use the property. If you set or map a property inside a layer_<marks>() function, it will be applied locally: only the layer created by the function will use the property. Where applicable, local properties will override global properties.

In the graph below, only the lines layer uses a skyblue stroke.

pressure %>% 
  ggvis(~temperature, ~pressure) %>% 
  layer_lines(stroke := "skyblue") %>% 
  layer_points()

Global properties can cause trouble when you use multiple layers, because a property that suits one mark may not apply to another. In the graph below, the shape property is set locally, so only the points layer uses it.

pressure %>% 
  ggvis(~temperature, ~pressure) %>% 
  layer_lines(stroke := "skyblue") %>% 
  layer_points(shape := "triangle-up")

If you like, you can define every property at the local level, including the x and y mappings. However, your code would not be very concise. The graph below uses a more concise version that sets the shared properties globally.

pressure %>% 
  ggvis(~temperature, ~pressure, stroke := "skyblue", 
        strokeOpacity := 0.5, strokeWidth := 5) %>% 
  layer_lines() %>% 
  layer_points(fill = ~temperature, shape := "triangle-up", size := 300)

There is no limit on the number of layers!

layer_model_predictions() plots the prediction line of a model fitted to the data. It is similar to layer_smooths(), but you can extend it to more models than just the “loess” or “gam” model.

layer_model_predictions() takes a parameter named model; it should be set to a character string that contains the name of an R model function. layer_model_predictions() will use this function to generate the model predictions. So for example, you could draw the model line of a linear model with

layer_model_predictions(model = "lm").

Notice that model is a parameter, not a property. This means that you do not need to worry about setting vs. mapping. You can always set parameters with the equal sign, =.

An example is shown below.

pressure %>% 
  ggvis(~temperature, ~pressure) %>%
  layer_lines(opacity := 0.5) %>%
  layer_points() %>%
  layer_model_predictions(model = "lm", stroke := "navy") %>%
  layer_smooths(stroke := "skyblue")
## Guessing formula = pressure ~ temperature
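To make the idea of a prediction line concrete, the base R sketch below fits the same linear model by hand and evaluates it on an evenly spaced grid. It uses the formula that ggvis reports guessing above; the grid of 80 points is an arbitrary choice for illustration, and the layer's own internals may differ.

```r
# Fit the linear model named by model = "lm"
fit <- lm(pressure ~ temperature, data = pressure)

# Evaluate the fitted line across the observed temperature range
grid <- data.frame(
  temperature = seq(min(pressure$temperature),
                    max(pressure$temperature),
                    length.out = 80)
)
grid$predicted <- predict(fit, newdata = grid)
head(grid)
```

Plotted against the raw points, a straight line like this makes it obvious that a linear model is a poor description of the pressure data, which is what the comparison with the smooth line in the graph shows.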

Part five - Customizing Axes, Legends, and Scales

Section 9 - Axes and Legends

Axes

Axes help you to customize the plots you create with ggvis. add_axis() allows you to change the titles, tick schemes and positions of your axes. The example code below clarifies:

add_axis("x", title = "x axis title", values = c(1, 2, 3), subdivide = 5, orient = "top")

The title argument is rather straightforward, as it simply sets the title of the axis you specified in the first argument.

You can use the values argument of add_axis() to determine where labelled tick marks will appear on each axis. You can use the subdivide argument to insert unlabelled tick marks between the labelled tick marks on an axis.

To control where an axis appears, use the orient argument. For example, the above code makes the x axis appear on the “top” (and not on the “bottom”) side of the graph. Similarly, you can have the y axis appear on the “left” or “right” side of the graph.

An example that sets the axis titles:

faithful %>% 
  ggvis(~waiting, ~eruptions) %>% 
  layer_points() %>% 
  add_axis("y", title = "Duration of eruption (m)") %>% 
  add_axis("x", title = "Time since previous eruption (m)")

Next we choose where the labelled tick marks appear on the x and y axes and add nine unlabelled ticks between each pair of labelled ones.

faithful %>% 
  ggvis(~waiting, ~eruptions) %>% 
  layer_points() %>% 
  add_axis("y", title = "Duration of eruption (m)", 
           values = c(2, 3, 4, 5), subdivide = 9) %>% 
  add_axis("x", title = "Time since previous eruption (m)", 
           values = c(50, 60, 70, 80, 90), subdivide = 9)

Changing the axes’ locations

faithful %>% 
  ggvis(~waiting, ~eruptions) %>% 
  layer_points() %>%
  add_axis("x", orient = "top") %>% 
  add_axis("y", orient = "right")

Legends

add_legend() works similarly to add_axis(), except that it alters the legend of a plot. Instead of specifying which axis to change, you have to specify the property you want to add to the legend. For example, the code

pressure %>% 
  ggvis(~temperature, ~pressure, fill = ~pressure) %>%
  layer_points() %>%
  add_legend("fill", title = "~ pressure")

adds a legend to the graph that gives more information about the color of the points in the scatterplot. The orient argument controls where the legend appears, for example:

faithful %>% 
  ggvis(~waiting, ~eruptions, opacity := 0.6, 
        fill = ~factor(round(eruptions))) %>% 
  layer_points() %>% 
  add_legend("fill", title = "~ duration (m)", orient = "left")

ggvis will create a separate legend for each property that you use. Often the results can be confusing.

You can use add_legend() to combine multiple legends into a single legend. To do this, give add_legend() a vector of property names as its first argument. For example, to combine a stroke legend with an opacity legend, call add_legend(c(“stroke”, “opacity”)). Similarly, you can specify the values argument inside add_legend() to explicitly set the visible legend values.

Below we specify that only the values 2, 3, 4, and 5 should receive a labelled symbol.

faithful %>% 
  ggvis(~waiting, ~eruptions, opacity := 0.6, 
        fill = ~factor(round(eruptions)), shape = ~factor(round(eruptions)), 
        size = ~round(eruptions)) %>% 
    layer_points() %>% 
    add_legend(c("fill", "shape", "size"), 
               title = "~ duration (m)", values = c(2, 3, 4, 5))

Section 10 - Customize property mappings

Scale types

You can change the color scheme of a ggvis plot by adding a new scale to map a data set variable to fill colors. The first chunk of code below creates a new scale that will map the numeric disp variable to the fill property. The scale will create color output that ranges from red to yellow.

mtcars %>% 
  ggvis(~wt, ~mpg, fill = ~disp, stroke = ~disp, strokeWidth := 2) %>%
  layer_points() %>%
  scale_numeric("fill", range = c("red", "yellow"))

Now we make the stroke color range from “darkred” to “orange”.

mtcars %>% 
  ggvis(~wt, ~mpg, fill = ~disp, stroke = ~disp, strokeWidth := 2) %>%
  layer_points() %>%
  scale_numeric("fill", range = c("red", "yellow")) %>%
  scale_numeric("stroke", range = c("darkred", "orange")) 

ggvis provides several different functions for creating scales: scale_datetime(), scale_logical(), scale_nominal(), scale_numeric(), and scale_singular(). Each maps a different type of data input to the visual properties that ggvis uses. For example, the first example below requires scale_numeric() because the code maps a numeric variable to a visual property, while the second requires scale_nominal() because factor(cyl) is categorical.

Below we make the fill colors range from green to beige.

mtcars %>% ggvis(~wt, ~mpg, fill = ~hp) %>%
  layer_points() %>%
  scale_numeric("fill", range = c("green", "beige"))

Below we create a scale that will map factor(cyl) to a new range of colors: purple, blue, and green.

mtcars %>% ggvis(~wt, ~mpg, fill = ~factor(cyl)) %>%
  layer_points() %>%
  scale_nominal("fill", range = c("purple", "blue", "green"))

Adjust any visual property

You can adjust any visual property in your graph with a scale (not just color). Let’s look at another property that you may frequently want to adjust.

Often when you map a variable to opacity, some data points will end up so transparent that they are hard to see.

Below we add a scale that limits the range of opacity from 0.2 to 1.

mtcars %>% ggvis(x = ~wt, y = ~mpg, fill = ~factor(cyl), opacity = ~hp) %>%
  layer_points() %>%
  scale_numeric("opacity", range = c(0.2, 1))
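Conceptually, a numeric scale is a function from data values to visual values. The base R sketch below shows one simple way such a mapping could work, a linear rescaling; rescale_linear is a hypothetical helper written for this illustration, and the scales that ggvis and Vega actually build may differ in detail.

```r
# Hypothetical helper: map the observed range of x linearly onto `to`
rescale_linear <- function(x, to) {
  from <- range(x)
  to[1] + (x - from[1]) / (from[2] - from[1]) * (to[2] - to[1])
}

# Horsepower in data space becomes opacity in visual space
opacity <- rescale_linear(mtcars$hp, to = c(0.2, 1))

range(mtcars$hp)   # data values
range(opacity)     # visual values, limited to 0.2 through 1
```

Raising the lower end of the range from 0 to 0.2 is what keeps the lowest-horsepower cars visible.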

Just as you can change the range of visual values that your scales produce, you can also change the domain of data values that they consider. For example, you can expand the domain of the x and y scales to zoom out on your plot. In the code below, the first scale expands the y axis to cover data values from 0 to the largest y value in the data set (the NA leaves the upper limit to be taken from the data).

Below we add a second scale that will expand the x axis to cover data values from 0 to 6.

mtcars %>% ggvis(~wt, ~mpg, fill = ~disp) %>%
  layer_points() %>%
  scale_numeric("y", domain = c(0, NA)) %>%
  scale_numeric("x", domain = c(0, 6))

“=” versus “:=”

Scales help explain the difference between = and :=. Variables tend to contain values in the data space, things such as numbers measured in various units, datetimes, and so on. But properties need visual values, things such as numbers measured in pixels, colors, opacity levels, and so on.

Whenever you use = to map a variable to a property, ggvis will use a scale to transform the variable values into visual values. Whenever you set a value (or variable) to a property with :=, ggvis will pass the value on as is, without transforming it. For example, the code below passes “red” straight through to the visual space to create a red fill:

mtcars %>% 
  ggvis(x = ~wt, y = ~mpg, 
        fill := "red") %>% 
  layer_points()

This can work nicely if the value passed through makes sense in the visual space, but it can have unfortunate consequences if the value does not.

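The example below adds a new column to mtcars that contains valid color names. If you map fill to the color column with =, ggvis transforms the color names into a new set of colors in the visual space. Setting fill with :=, as the code below does, passes the color names through unchanged, so each point is drawn in the color named in its row.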

mtcars$color <- c("red", "teal", "#cccccc", "tan")
mtcars %>% ggvis(x = ~wt, y = ~mpg, fill := ~color) %>% layer_points()