f(x) %>% g(y) is a shortcut for g(f(x), y).
= maps a property to a data value or a set of data values. This is how you visualize variation in your data set. ggvis will scale the values appropriately and add a legend that explains how values are mapped to particular instances of the property.
:= sets a property to a specific color (or size, width, etc.). This is how you customize the appearance of your plots. Numbers are typically interpreted as pixels, for example a size or a displacement from the top-left corner of the plot. Color specifications are passed to Vega, a JavaScript library, so you can use any color name recognized by HTML/CSS.
library(ggvis)
faithful = read.csv("faithful.csv")
# Rewrite the code with the pipe operator
# layer_points(ggvis(faithful, ~waiting, ~eruptions))
faithful %>% ggvis(~waiting, ~eruptions) %>% layer_points()
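To see the pipe equivalence outside of a plot, the following sketch compares a nested call with its piped form on a small numeric vector. It assumes the magrittr package is installed (ggvis re-exports its %>% operator); the vector x is made up for illustration.

```r
library(magrittr)  # provides %>%; ggvis re-exports the same operator

x <- c(1, 4, 9)  # illustrative values, not from the course data

# Nested form: read from the inside out
nested <- round(mean(sqrt(x)), 1)

# Piped form: read left to right; each result becomes the first argument
piped <- x %>% sqrt() %>% mean() %>% round(1)

identical(nested, piped)
```

Both forms run the same three calls in the same order; the piped form simply writes them in the order they execute.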
# Modify this graph to map the size property to the pressure variable
# size = ~pressure
pressure %>% ggvis(~temperature, ~pressure, size = ~pressure) %>% layer_points()
# Modify this graph by setting the size property
# Before: size = ~100 maps size to a constant through a scale
# pressure %>% ggvis(~temperature, ~pressure, size = ~100) %>% layer_points()
# After: size := 100 sets the size directly
pressure %>% ggvis(~temperature, ~pressure, size := 100) %>% layer_points()
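# Fix this code to set the fill property to red
# Before: fill = "red" maps fill to the raw value "red" through a scale
# pressure %>% ggvis(~temperature, ~pressure, fill = "red") %>% layer_points()
# After: fill := "red" sets the fill color directly
pressure %>% ggvis(~temperature, ~pressure, fill := "red") %>% layer_points()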
There are three ways to refer to things in ggvis code: objects, variables, and raw values.
- If you type a string of letters without quotes, ggvis will treat the string as an object name. It will look for an object with that name in your current environment.
- If you place a tilde, ~, at the start of the string, ggvis will treat the string as a variable name. It will look for the column with that name in the data set that the graph visualizes.
- If you place the string in quotation marks, ggvis will treat it as a raw value.
Which of the commands below will create a graph that has green points? Try to predict the answer before running the code. Answer: C.
Which of the commands below will create a graph that uses color to reveal the values of the temperature variable in the pressure data set? Answer: A.
red <- "green"
pressure$red <- pressure$temperature
# GRAPH A
pressure %>%
ggvis(~temperature, ~pressure,
fill = ~red) %>%
layer_points()
# GRAPH B
pressure %>%
ggvis(~temperature, ~pressure,
fill = "red") %>%
layer_points()
# GRAPH C
pressure %>%
ggvis(~temperature, ~pressure,
fill := red) %>%
layer_points()
You can manipulate many properties in layer_points(), including x, y, fill, fillOpacity, opacity, shape, size, stroke, strokeOpacity, and strokeWidth. The shape property recognizes several different values: circle (default), square, cross, diamond, triangle-up, and triangle-down. For a complete overview of ggvis properties, you can consult the properties and scales vignette.
# Add code: map size to the eruptions variable
faithful %>%
ggvis(~waiting, ~eruptions,
size = ~eruptions, opacity := 0.5,
fill := "blue", stroke := "black") %>%
layer_points()
# Add code: map fillOpacity to the eruptions variable
faithful %>%
ggvis(~waiting, ~eruptions,
fillOpacity = ~eruptions, size := 100,
fill := "red", stroke := "red", shape := "cross") %>%
layer_points()
# Update the code
pressure %>% ggvis(~temperature, ~pressure) %>% layer_lines(stroke := "red", strokeWidth := 2, strokeDash := 6)
layer_lines() will always connect the points in your plot from the leftmost point to the rightmost point. This can be undesirable if you are trying to plot a specific shape.
The texas data set contains the coordinates of the state of Texas; it is arranged so that consecutive observations should be connected. The commented-out layer_lines() call below would plot a map of Texas only if ggvis connected the points in row order.
layer_paths() does exactly that: this mark connects the points in the order that they appear in the data set. So the paths mark will connect the point that corresponds to the first row of the data to the point that corresponds to the second row of data, and so on, no matter where those points appear in the graph.
texas = read.csv("texas.csv")
# Update the plot
# texas %>% ggvis(~long, ~lat) %>% layer_lines()
texas %>% ggvis(~long, ~lat) %>% layer_paths(fill := "darkorange")
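The difference between the two marks comes down to row order. The base R sketch below uses a made-up five-row outline of a unit square (not the texas data) and approximates what each layer would connect: layer_paths() follows the rows as given, while layer_lines() behaves as if the rows were first sorted by the x variable.

```r
# Hypothetical outline of a unit square, listed in drawing order
square <- data.frame(
  long = c(0, 1, 1, 0, 0),
  lat  = c(0, 0, 1, 1, 0)
)

# Row order, which layer_paths() would follow: traces the outline
path_order <- square

# Left-to-right order, which layer_lines() would follow: rows sorted by x
line_order <- square[order(square$long), ]

line_order
```

Sorting by long moves the three long = 0 rows to the front, so a line drawn in that order zigzags instead of tracing the outline. How ggvis breaks ties between equal x values may differ from order().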
compute_model_prediction() is a useful function to use with line graphs. It takes a data frame as input and returns a new data frame as output. The new data frame will contain the x and y values of a line fitted to the data in the original data frame.
For example, the code below fits a line to the eruptions and waiting variables of the faithful data set.
faithful %>%
  compute_model_prediction(eruptions ~ waiting,
                           model = "lm")
compute_model_prediction() takes a couple of arguments:
- faithful, the data set,
- an R formula, eruptions ~ waiting, that specifies the relationship to model,
- a model argument, the name of the R modelling function that is used to calculate the line. "lm" calculates a linear fit, "loess" uses the LOESS method.
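To make the returned data frame less of a black box, here is a rough base R sketch of the same idea for model = "lm": fit the model, then predict on an evenly spaced grid of x values. The grid size of 80 is an assumption chosen to match the 80 rows in the compute_smooth() output in these notes, and the object names are illustrative; this is not a description of how ggvis implements the function internally.

```r
# faithful ships with base R (datasets package)
fit <- lm(eruptions ~ waiting, data = faithful)

# Evenly spaced x values across the observed range (80 is an assumed grid size)
grid <- data.frame(
  waiting = seq(min(faithful$waiting), max(faithful$waiting), length.out = 80)
)

# Same column names that ggvis uses for its predictions
pred <- data.frame(
  pred_ = grid$waiting,
  resp_ = as.numeric(predict(fit, newdata = grid))
)

head(pred)
```

The result has one row per grid point, with the x value in pred_ and the fitted value in resp_, which is the shape that a line layer needs.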
compute_smooth() is a wrapper around compute_model_prediction() that calculates a LOESS smooth line by default.
# Compute the x and y coordinates for a loess smooth line that predicts mpg from wt
mtcars %>% compute_smooth(mpg ~ wt)
## pred_ resp_
## 1 1.513000 32.08897
## 2 1.562506 31.68786
## 3 1.612013 31.28163
## 4 1.661519 30.87037
## 5 1.711025 30.45419
## 6 1.760532 30.03318
## 7 1.810038 29.60745
## 8 1.859544 29.17711
## 9 1.909051 28.74224
## 10 1.958557 28.30017
## 11 2.008063 27.83462
## 12 2.057570 27.34766
## 13 2.107076 26.84498
## 14 2.156582 26.33229
## 15 2.206089 25.81529
## 16 2.255595 25.29968
## 17 2.305101 24.79115
## 18 2.354608 24.29542
## 19 2.404114 23.81818
## 20 2.453620 23.36514
## 21 2.503127 22.95525
## 22 2.552633 22.61385
## 23 2.602139 22.32759
## 24 2.651646 22.08176
## 25 2.701152 21.86167
## 26 2.750658 21.65260
## 27 2.800165 21.43987
## 28 2.849671 21.20875
## 29 2.899177 20.95334
## 30 2.948684 20.71584
## 31 2.998190 20.49571
## 32 3.047696 20.28293
## 33 3.097203 20.06753
## 34 3.146709 19.83950
## 35 3.196215 19.58885
## 36 3.245722 19.29716
## 37 3.295228 18.94441
## 38 3.344734 18.56700
## 39 3.394241 18.20570
## 40 3.443747 17.90090
## 41 3.493253 17.62060
## 42 3.542759 17.34002
## 43 3.592266 17.07908
## 44 3.641772 16.81759
## 45 3.691278 16.55757
## 46 3.740785 16.30833
## 47 3.790291 16.07916
## 48 3.839797 15.87937
## 49 3.889304 15.70181
## 50 3.938810 15.52594
## 51 3.988316 15.35173
## 52 4.037823 15.17933
## 53 4.087329 15.00894
## 54 4.136835 14.84072
## 55 4.186342 14.67484
## 56 4.235848 14.51148
## 57 4.285354 14.35082
## 58 4.334861 14.19302
## 59 4.384367 14.03826
## 60 4.433873 13.88672
## 61 4.483380 13.73856
## 62 4.532886 13.59396
## 63 4.582392 13.45310
## 64 4.631899 13.31614
## 65 4.681405 13.18326
## 66 4.730911 13.05464
## 67 4.780418 12.93045
## 68 4.829924 12.81086
## 69 4.879430 12.69604
## 70 4.928937 12.58617
## 71 4.978443 12.48143
## 72 5.027949 12.38198
## 73 5.077456 12.28799
## 74 5.126962 12.19966
## 75 5.176468 12.11713
## 76 5.225975 12.04060
## 77 5.275481 11.97023
## 78 5.324987 11.90620
## 79 5.374494 11.84868
## 80 5.424000 11.79784
compute_smooth() always returns a data set with two columns, one named pred_ and one named resp_. You can easily pass this data to a ggvis() call to plot a smoothed line of your data, as this example shows:
faithful %>%
  compute_smooth(eruptions ~ waiting) %>%
  ggvis(~pred_, ~resp_) %>%
  layer_lines()
Because calling compute_smooth() and then layer_lines() can be a bit of a hassle, ggvis features the layer_smooths() function: this layer automatically calls compute_smooth() in the background and plots the results as a smoothed line.
# Extend the first command with a ggvis() and layer_lines() command. The plot should place pred_ on the x axis and resp_ on the y axis.
mtcars %>% compute_smooth(mpg ~ wt) %>% ggvis(~pred_, ~resp_) %>% layer_lines()
# Extend the second command with a layer_points() and a layer_smooths() function. The result will be a point plot of the raw data with a smoothed line on top.
mtcars %>% ggvis(~wt, ~mpg) %>% layer_points() %>% layer_smooths()
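A linear fit can be layered the same way. As a sketch, layer_model_predictions() plays the role for compute_model_prediction() that layer_smooths() plays for compute_smooth(); the red stroke is an arbitrary choice for illustration, and the plot is stored in p rather than printed.

```r
library(ggvis)

# Raw points plus a fitted straight line (model = "lm")
p <- mtcars %>%
  ggvis(~wt, ~mpg) %>%
  layer_points() %>%
  layer_model_predictions(model = "lm", stroke := "red")

# Print p (or type it at the console) to render the plot
```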