Lines and Syntax

S3 Three operators: %>%, = and :=

The pipe operator passes the result from its left-hand side into the first argument of the function on its right-hand side. `f(x) %>% g(y)` is a shortcut for `g(f(x), y)`.
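As a quick check of that equivalence, the sketch below uses two base R functions, `sqrt()` and `round()`, in place of `f` and `g`. The choice of functions and the vector `x` are illustrative assumptions, not part of the exercise.

```r
library(magrittr)  # provides %>%; loading ggvis also makes it available

x <- c(2, 10, 50)

# Nested call: g(f(x), y) with f = sqrt, g = round, y = 2
nested <- round(sqrt(x), 2)

# Piped call: f(x) %>% g(y)
piped <- sqrt(x) %>% round(2)

# Both forms give the same result
identical(nested, piped)
```

The piped form reads left to right in the order the steps happen, which is why the ggvis examples in these notes are written that way.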

= maps a property to a data value or a set of data values. This is how you visualize variation in your data set. ggvis will scale the values appropriately and add a legend that explains how values are mapped to particular instances of the property.

:= sets a property to a specific value, such as a color, size, or width. This is how you customize the appearance of your plots. Numbers are typically interpreted as pixels, for example a size or a displacement from the top-left corner of the plot. Color specifications are passed to Vega, a JavaScript library, so you can use any color name recognized by HTML/CSS.

library(ggvis)

faithful = read.csv("faithful.csv")
# Rewrite the code with the pipe operator     
#layer_points (ggvis(faithful, ~waiting, ~eruptions))
faithful %>% ggvis(~waiting, ~eruptions) %>% layer_points()

# Modify this graph to map the size property to the pressure variable
# size = ~pressure
pressure %>% ggvis(~temperature, ~pressure, size = ~pressure) %>% layer_points()

# Modify this graph by setting the size property
# Before: = maps the value through a scale instead of setting it
pressure %>% ggvis(~temperature, ~pressure, size = ~100) %>% layer_points()

# After: := sets the size of every point to 100
pressure %>% ggvis(~temperature, ~pressure, size := 100) %>% layer_points()

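# Fix this code to set the fill property to red
# Before: = treats "red" as a data value to map, not as a color to set
pressure %>% ggvis(~temperature, ~pressure, fill = "red") %>% layer_points()

# After: := sets the fill of every point to the color red
pressure %>% ggvis(~temperature, ~pressure, fill := "red") %>% layer_points()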

Referring to different objects

You can refer to three different types of objects in your `ggvis` code: objects, variables, and raw values.

If you type a string of letters, ggvis will treat the string as an object name. It will look for an object with that name in your current environment.
If you place a tilde, ~, at the start of the string, ggvis will treat the string as a variable name. It will look for the column with that name in the data set that the graph visualizes.
If you surround a string of letters with quotation marks, ggvis will treat the string as a raw value, e.g., a piece of text.
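To make the three cases concrete, the sketch below inspects how plain R sees each form before ggvis does anything with it. The object `red` and its value are illustrative, and how ggvis processes each form internally is not shown here.

```r
red <- "green"   # an object in the current environment

obj <- red       # bare name: R looks up the object, giving "green"
var <- ~red      # tilde: an unevaluated formula that names a column called red
raw <- "red"     # quotes: the literal text "red"

class(obj)       # "character"
class(var)       # "formula"
class(raw)       # "character"
all.vars(var)    # "red"
```

The bare name and the quoted string are both character values by the time ggvis receives them. Only the tilde form carries a name that still has to be looked up in the data set.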

Which of the commands below will create a graph that has green points? Try to predict the answer before running the code. (Answer: C)

Which of the commands below will create a graph that uses color to reveal the values of the temperature variable in the pressure data set? (Answer: A)

red <- "green"
pressure$red <- pressure$temperature

# GRAPH A
pressure %>% 
  ggvis(~temperature, ~pressure, 
        fill = ~red) %>% 
  layer_points()

# GRAPH B  
pressure %>% 
  ggvis(~temperature, ~pressure, 
        fill = "red") %>% 
  layer_points()

# GRAPH C
pressure %>% 
  ggvis(~temperature, ~pressure, 
        fill := red) %>% 
  layer_points()

Properties for points

You can manipulate many different properties when using `layer_points()`, including `x`, `y`, `fill`, `fillOpacity`, `opacity`, `shape`, `size`, `stroke`, `strokeOpacity`, and `strokeWidth`. The `shape` property recognizes several different values: `circle` (default), `square`, `cross`, `diamond`, `triangle-up`, and `triangle-down`. For a complete overview of `ggvis` properties, you can consult the properties and scales vignette.

# Add code, Map size to the eruptions variable
faithful %>% 
  ggvis(~waiting, ~eruptions, 
        size = ~eruptions, opacity := 0.5, 
        fill := "blue", stroke := "black") %>% 
  layer_points()

# Add code, Map fillOpacity to the eruptions variable
faithful %>% 
  ggvis(~waiting, ~eruptions, 
        fillOpacity = ~eruptions, size := 100,  
        fill := "red", stroke := "red", shape := "cross") %>% 
  layer_points()

S4 The line, a special type of mark

Properties for lines

In the previous section, you learned that you can manipulate many different properties when using the points mark. This mark type responds to, among others, x, y, fill, fillOpacity, opacity, shape, size, stroke, strokeOpacity, and strokeWidth.

Like points, lines have their own set of properties: x, y, fill, fillOpacity, opacity, stroke, strokeDash, strokeOpacity, and strokeWidth. Most of these are shared with points. Some point properties are missing (there is no size property), and others are new, such as strokeDash.

# Update the code
pressure %>%
  ggvis(~temperature, ~pressure) %>%
  layer_lines(stroke := "red", strokeWidth := 2, strokeDash := 6)

Path marks and polygons

`layer_lines()` will always connect the points in your plot from the leftmost point to the rightmost point. This can be undesirable if you are trying to plot a specific shape.

A data frame `texas` contains the coordinates of the state of Texas; it is arranged such that consecutive observations should be connected. The commented-out `layer_lines()` code below would plot a map of Texas if `ggvis` connected the points in the correct order.

You can do this with `layer_paths()`: this mark connects the points in the order that they appear in the data set. So the paths mark will connect the point that corresponds to the first row of the data to the point that corresponds to the second row of data, and so on - no matter where those points appear in the graph.
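The difference comes down to row order. The base R sketch below uses a hypothetical five-point square rather than the `texas` data. It shows what happens when rows are reordered by x, which is roughly the effect `layer_lines()` has before drawing.

```r
# Hypothetical closed shape: a unit square traced corner by corner
square <- data.frame(
  x = c(0, 1, 1, 0, 0),
  y = c(0, 0, 1, 1, 0)
)

# Row order as stored: the order a path mark would follow
square

# Rows reordered by x: neighbouring rows no longer trace the outline
by_x <- square[order(square$x), ]
by_x
```

With ggvis loaded, `square %>% ggvis(~x, ~y) %>% layer_paths()` would draw the outline in the stored order, while swapping in `layer_lines()` would connect the points from left to right.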

texas = read.csv("texas.csv")
# Update the plot
# texas %>% ggvis(~long, ~lat) %>% layer_lines()
texas %>% ggvis(~long, ~lat) %>% layer_paths(fill := "darkorange")

Display model fits

`compute_model_prediction()` is a useful function to use with line graphs. It takes a data frame as input and returns a new data frame as output. The new data frame will contain the x and y values of a line fitted to the data in the original data frame.

The code below computes a line that shows the relationship between the `eruptions` and `waiting` variables of the `faithful` data set.

faithful %>% 
  compute_model_prediction(eruptions ~ waiting, 
                           model = "lm")

`compute_model_prediction()` takes a couple of arguments:

- `faithful`, the data set,
- an R formula, `eruptions ~ waiting`, that specifies the relationship to model,
- a `model` argument, the name of the R modelling function that is used to calculate the line. `"lm"` calculates a linear fit, `"loess"` uses the LOESS method.
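For intuition, a similar data frame can be built by hand with base R. This is only a sketch of the idea, not ggvis's implementation. The 80-point grid is an assumption chosen to match the 80 rows in the `compute_smooth()` output shown later in these notes, and R's built-in `faithful` data set stands in for the CSV file.

```r
# Fit the linear model named by model = "lm"
fit <- lm(eruptions ~ waiting, data = faithful)

# Evenly spaced x values across the observed range (80 points is an assumption)
grid <- data.frame(
  waiting = seq(min(faithful$waiting), max(faithful$waiting), length.out = 80)
)

# Predicted y values on that grid, with columns named like ggvis's output
fitted_line <- data.frame(
  pred_ = grid$waiting,
  resp_ = unname(predict(fit, newdata = grid))
)

head(fitted_line, 3)
```

Replacing `lm()` with `loess()` in the first step would give a smooth curve instead of a straight line, which is the kind of fit `model = "loess"` requests.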

compute_smooth() is a wrapper around compute_model_prediction() that calculates a LOESS smooth line by default.

# Compute the x and y coordinates for a LOESS smooth line that predicts mpg from wt
mtcars %>% compute_smooth(mpg ~ wt)

##       pred_    resp_
## 1  1.513000 32.08897
## 2  1.562506 31.68786
## 3  1.612013 31.28163
## 4  1.661519 30.87037
## 5  1.711025 30.45419
## 6  1.760532 30.03318
## 7  1.810038 29.60745
## 8  1.859544 29.17711
## 9  1.909051 28.74224
## 10 1.958557 28.30017
## 11 2.008063 27.83462
## 12 2.057570 27.34766
## 13 2.107076 26.84498
## 14 2.156582 26.33229
## 15 2.206089 25.81529
## 16 2.255595 25.29968
## 17 2.305101 24.79115
## 18 2.354608 24.29542
## 19 2.404114 23.81818
## 20 2.453620 23.36514
## 21 2.503127 22.95525
## 22 2.552633 22.61385
## 23 2.602139 22.32759
## 24 2.651646 22.08176
## 25 2.701152 21.86167
## 26 2.750658 21.65260
## 27 2.800165 21.43987
## 28 2.849671 21.20875
## 29 2.899177 20.95334
## 30 2.948684 20.71584
## 31 2.998190 20.49571
## 32 3.047696 20.28293
## 33 3.097203 20.06753
## 34 3.146709 19.83950
## 35 3.196215 19.58885
## 36 3.245722 19.29716
## 37 3.295228 18.94441
## 38 3.344734 18.56700
## 39 3.394241 18.20570
## 40 3.443747 17.90090
## 41 3.493253 17.62060
## 42 3.542759 17.34002
## 43 3.592266 17.07908
## 44 3.641772 16.81759
## 45 3.691278 16.55757
## 46 3.740785 16.30833
## 47 3.790291 16.07916
## 48 3.839797 15.87937
## 49 3.889304 15.70181
## 50 3.938810 15.52594
## 51 3.988316 15.35173
## 52 4.037823 15.17933
## 53 4.087329 15.00894
## 54 4.136835 14.84072
## 55 4.186342 14.67484
## 56 4.235848 14.51148
## 57 4.285354 14.35082
## 58 4.334861 14.19302
## 59 4.384367 14.03826
## 60 4.433873 13.88672
## 61 4.483380 13.73856
## 62 4.532886 13.59396
## 63 4.582392 13.45310
## 64 4.631899 13.31614
## 65 4.681405 13.18326
## 66 4.730911 13.05464
## 67 4.780418 12.93045
## 68 4.829924 12.81086
## 69 4.879430 12.69604
## 70 4.928937 12.58617
## 71 4.978443 12.48143
## 72 5.027949 12.38198
## 73 5.077456 12.28799
## 74 5.126962 12.19966
## 75 5.176468 12.11713
## 76 5.225975 12.04060
## 77 5.275481 11.97023
## 78 5.324987 11.90620
## 79 5.374494 11.84868
## 80 5.424000 11.79784

compute_smooth() to simplify model fits

`compute_smooth()` always returns a data set with two columns, one named `pred_` and one named `resp_`. You can easily pass this data to a `ggvis()` call to plot a smoothed line of your data, as this example shows:

faithful %>% 
  compute_smooth(eruptions ~ waiting) %>%
  ggvis(~pred_, ~resp_) %>% 
  layer_lines()

Because first calling `compute_smooth()` and then `layer_lines()` can be a bit of a hassle, `ggvis` features the `layer_smooths()` function: this layer automatically calls `compute_smooth()` in the background and plots the results as a smoothed line.

# Extend the first command with a ggvis() and layer_lines() command. The plot should place pred_ on the x axis and resp_ on the y axis.
mtcars %>% compute_smooth(mpg ~ wt) %>% ggvis(~pred_, ~resp_) %>% layer_lines()

# Extend the second command with a layer_points() and a layer_smooths() function. The result will be a point plot of the raw data with a smoothed line on top.
mtcars %>% ggvis(~wt, ~mpg) %>% layer_points() %>% layer_smooths()