library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.7 ✔ dplyr 1.0.9
## ✔ tidyr 1.2.0 ✔ stringr 1.4.0
## ✔ readr 2.1.2 ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(ggplot2)
<You will write your summary in this part of the file, you should say what your objective is and clearly state any assumptions>
<If you have particular finding, not applicable here, you should present a summary of them in your introduction>
<Depending on the project you may need to include an Executive Summary, more on that later.>
The purpose of this notebook is to <explain how to format and turn
in an assignment. This example assumes you have been asked to do a set
of problems; we will use R4DS, [R for Data Science](https://r4ds.had.co.nz/index.html).
<As you review this notebook, knit it and notice how the different typographical effects relate to the notebook file.>
To turn in your work, you will create an RPubs file and send me the link. We’ll talk about that at the end of this presentation.>
<Text contained between <> is instructions from me and would not be required in your submission.>
## Doing an Exercise from a Book
<First restate the exercise, including a code block if one is part of the statement>
ggplot(data = mpg) +
geom_point(mapping=aes(x = displ,y = hwy, color = "blue"))
#### Answer
The color keyword is inside the aes() function, so ggplot2 treats "blue" as a value
to map to a colour scale rather than as the colour to draw; color is not a variable in mpg. To fix it we
must move the color assignment out of the aes() call
but keep it as an argument of geom_point().
ggplot(data = mpg) +
geom_point(mapping=aes(x = displ,y = hwy), color = "blue")
2. Which variables in mpg are categorical? Which variables are continuous? (Hint: type ?mpg to read the documentation for the dataset). How can you see this information when you run mpg?
| variable name | Cat or Con |
|---|---|
| manufacturer | categorical |
| model | categorical |
| displ | continuous |
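The classification in the table can be checked against the column types. This is a minimal sketch, assuming character columns are treated as categorical and double columns as continuous; integer columns such as cyl or year remain a judgment call. The names col_types and categorical are made up for illustration.

```r
library(ggplot2)  # provides the mpg dataset

# First class of each column: "character", "integer", or "numeric"
col_types <- vapply(mpg, function(col) class(col)[1], character(1))
print(col_types)

# Character columns are the clearest categorical candidates
categorical <- names(col_types)[col_types == "character"]
print(categorical)
```

Printing mpg itself shows the same information in the type row under the column names (for example chr, int, dbl).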
Run ggplot(data = mpg). What do you see?
2. How many rows are in mpg? How many columns?
dim(mpg)
## [1] 234  11
To get the dimensions of a data frame, we can use the dim() function. mpg has 234 rows and 11 columns.
3. What does the drv variable describe? Read the help for ?mpg to find out.
The drv variable is a categorical variable that classifies cars by drive train: f = front-wheel drive, r = rear-wheel drive, 4 = four-wheel drive.
This code creates a scatterplot of hwy and cyl.
ggplot(mpg, aes(x = hwy, y = cyl)) + geom_point()
When we make a scatterplot of class vs drv, the resulting scatterplot has only a few points.
ggplot(mpg, aes(x = class, y = drv)) + geom_point()
Additionally, since drv and class are categorical variables,
they typically take a small number of values, so only a limited
number of unique combinations of (x, y) values can be displayed. As a
result, a scatterplot is not a useful way to display these values.
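The point about unique combinations can be checked by counting them. The following is a small sketch using base R on the mpg data; the name combos is hypothetical.

```r
library(ggplot2)  # provides the mpg dataset

# Distinct (class, drv) pairs: each pair is one location in the scatterplot
combos <- unique(mpg[, c("class", "drv")])
nrow(combos)  # number of visible points
nrow(mpg)     # number of observations

# Observations per pair, showing how many cars each point hides
table(mpg$class, mpg$drv)
```

Because many observations share the same location, the plot cannot show how many cars fall in each combination.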
If we try to facet on a continuous variable, the continuous variable is converted to a categorical variable and the plot contains a facet for each distinct value. See the example below.
ggplot(mpg, aes(x = displ, y = hwy)) + geom_point() + facet_grid(~ cty)
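As a rough check, the number of facets in a plot like this one should match the number of distinct values of the faceting variable. A minimal sketch (cty_values is a name chosen for illustration):

```r
library(ggplot2)  # provides the mpg dataset

# Each distinct value of cty gets its own facet
cty_values <- sort(unique(mpg$cty))
print(cty_values)
length(cty_values)  # number of facets drawn
```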
ggplot(data = mpg) +
geom_point(mapping = aes(x = drv, y = cyl))
Empty cells in the facets mean that there are no observations with that
combination of values. For instance, the plot above shows
that there are no cars with four-wheel drive that have five
cylinders.
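The empty cells can also be confirmed with a cross-tabulation. This is a small sketch using base R; drv_by_cyl is a hypothetical name.

```r
library(ggplot2)  # provides the mpg dataset

# Count of cars for every combination of drive type and cylinder count
drv_by_cyl <- table(drv = mpg$drv, cyl = mpg$cyl)
print(drv_by_cyl)

# A zero count corresponds to an empty facet cell
drv_by_cyl["4", "5"]
```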
As seen below, the first plot has facets arranged in rows and the second plot has facets arranged in columns. The dot (.) stands in for one of the variables in facet_grid(), so that a single variable can be faceted in a specific orientation.
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_grid(drv ~ .)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_grid(. ~ cyl)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_wrap(~ class, nrow = 2)
One advantage of faceting over the colour aesthetic is that the data are separated, so the trend for each value of the faceted variable can be analyzed individually. Faceting also reduces overlapping data. The disadvantage is that direct comparisons between values of the faceted variable are harder to make. With a larger dataset, faceting would be much more useful than colour, because it prevents the graph from becoming too crowded.
facet_grid() has no nrow and ncol arguments because the number of unique values in the two faceting variables sets how many rows and columns there are. For instance:
ggplot(mpg) + geom_point(aes(displ, hwy)) + facet_wrap(~ cyl, dir = "v", as.table = FALSE)
When using facet_grid(), you should usually put the variable with more unique levels in the columns. See the plot below.
ggplot(mpg) + geom_point(aes(displ, hwy)) + facet_grid(drv ~ class)
As the plots show, R seems to have a standard plot size that is wider than it is tall. Therefore, there is better visibility when the variable with more unique levels is in the columns instead of the rows.
ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = drv)) +
geom_point() +
geom_smooth(se = FALSE)
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
#### Answer
Graph of engine displacement on the x axis and highway miles per gallon on the y axis, with everything colored by drive type, so three colors in total. Each car is shown as a point (colored by drive type), and there are three fitted trend lines, separated and colored by drive type, without shading for the standard error.
show.legend = FALSE hides the legend for a layer; when it is removed, the legend is displayed with the graph. The legend was likely removed earlier in the chapter because including it changes the scale of the graph, which makes it less comparable to other similar graphs. For instance, see the example:
ggplot(mpg) + geom_smooth(aes(displ, hwy, color = drv))
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
ggplot(mpg) + geom_smooth(aes(displ, hwy, color = drv), show.legend = FALSE)
The se argument to geom_smooth() controls whether the standard error is shaded around the trend line.
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_point() +
geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
ggplot() +
geom_point(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_smooth(data = mpg, mapping = aes(x = displ, y = hwy))
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
These two graphs will look the same because they contain the same specifications. In the first graph, the data and aesthetics are specified once for the entire graph, while in the second graph the same data and aesthetics are specified separately for each layer.
## Graph I
ggplot(mpg, aes(displ, hwy)) + geom_point() + geom_smooth(se = FALSE)
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Graph II
ggplot(mpg, aes(displ, hwy)) + geom_point() +
geom_smooth(aes(group = drv), se = FALSE)
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Graph III
ggplot(mpg, aes(displ, hwy, color = drv)) +
geom_point() + geom_smooth(se = FALSE)
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Graph IV
ggplot(mpg, aes(displ, hwy)) + geom_point(aes(color = drv)) + geom_smooth(se = FALSE)
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Graph V
ggplot(mpg, aes(displ, hwy)) +
geom_point(aes(color = drv)) +
geom_smooth(aes(linetype = drv), se = FALSE)
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Graph VI
ggplot(mpg, aes(displ, hwy)) + geom_point(color = "white", size = 4) + geom_point(aes(color = drv))
I. What is the default geom associated with stat_summary()? How could you rewrite the previous plot to use that geom function instead of the stat function?
The default geom for stat_summary() is geom_pointrange(). See the plots below.
ggplot(data = diamonds) +
geom_pointrange(
mapping = aes(x = cut, y = depth),
stat = "summary"
)
## No summary function supplied, defaulting to `mean_se()`
The resulting message says that stat_summary() uses mean_se(), that is, the
mean and the standard error, to calculate the middle point and
endpoints of the line. However, in the original plot the min and max
values were used for the endpoints. To recreate the original plot we
need to specify values for fun.min, fun.max,
and fun.
ggplot(data = diamonds) +
geom_pointrange(
mapping = aes(x = cut, y = depth),
stat = "summary",
fun.min = min,
fun.max = max,
fun = median
)
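To make the difference between the two summaries concrete, the sketch below computes both for a single cut. It assumes the "Fair" cut as an arbitrary example; depth_fair, default_summary, and range_summary are names chosen for illustration.

```r
library(ggplot2)  # provides diamonds and mean_se()

# Depth values for one cut, used as a small worked example
depth_fair <- diamonds$depth[diamonds$cut == "Fair"]

# Default summary: mean, with endpoints one standard error either side
default_summary <- mean_se(depth_fair)

# Summary used in the plot above: median, with min and max as endpoints
range_summary <- data.frame(
  y = median(depth_fair),
  ymin = min(depth_fair),
  ymax = max(depth_fair)
)

print(default_summary)
print(range_summary)
```

The min-to-max interval is much wider than the mean plus or minus one standard error, which is why the two plots look so different.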
The geom_col() function has a different default stat than
geom_bar(). The default stat of geom_col() is
stat_identity(), which leaves the data as is, whereas
geom_bar() defaults to stat_count(). The
geom_col() function expects the data to contain
x values and y values that represent the bar
heights.
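The relationship between the two geoms can be sketched by counting the data by hand and passing the counts to geom_col(). This is an illustrative sketch; cut_counts, p_bar, and p_col are hypothetical names, and layer_data() is used only to inspect what each layer draws.

```r
library(ggplot2)  # provides diamonds, geom_bar(), geom_col(), layer_data()

# Pre-compute the bar heights that geom_bar() would otherwise count for us
cut_counts <- as.data.frame(table(cut = diamonds$cut))  # columns: cut, Freq

p_bar <- ggplot(diamonds) + geom_bar(aes(x = cut))              # stat_count()
p_col <- ggplot(cut_counts) + geom_col(aes(x = cut, y = Freq))  # stat_identity()

# layer_data() returns the data each layer actually draws
heights_bar <- layer_data(p_bar)$y
heights_col <- layer_data(p_col)$y
print(heights_bar)
print(heights_col)
```

Both layers end up with the same bar heights; the only difference is who does the counting.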
The following table lists the pairs of geoms and stats that are almost always used in concert.

| geom | stat |
|---|---|
| geom_bar() | stat_count() |
| geom_bin2d() | stat_bin_2d() |
| geom_boxplot() | stat_boxplot() |
| geom_contour_filled() | stat_contour_filled() |
| geom_contour() | stat_contour() |
| geom_count() | stat_sum() |
| geom_density_2d() | stat_density_2d() |
| geom_density() | stat_density() |
| geom_dotplot() | stat_bindot() |
| geom_function() | stat_function() |
| geom_sf() | stat_sf() |
| geom_smooth() | stat_smooth() |
| geom_violin() | stat_ydensity() |
| geom_hex() | stat_bin_hex() |
| geom_qq_line() | stat_qq_line() |
| geom_qq() | stat_qq() |
| geom_quantile() | stat_quantile() |

These pairs of geoms and stats tend to have their names in common,
such as stat_smooth() and geom_smooth(), and to be
documented on the same help page. The pairs of geoms and stats that are
used in concert often have each other as the default stat (for a geom)
or geom (for a stat).
The following tables contain the geoms and stats in ggplot2 and their
defaults as of version 3.3.0. Many geoms have
stat_identity() as the default stat.
| geom | default stat | shared docs |
|---|---|---|
| geom_abline() | stat_identity() | |
| geom_area() | stat_identity() | |
| geom_bar() | stat_count() | x |
| geom_bin2d() | stat_bin_2d() | x |
| geom_blank() | None | |
| geom_boxplot() | stat_boxplot() | x |
| geom_col() | stat_identity() | |
| geom_count() | stat_sum() | x |
| geom_contour_filled() | stat_contour_filled() | x |
| geom_contour() | stat_contour() | x |
| geom_crossbar() | stat_identity() | |
| geom_curve() | stat_identity() | |
| geom_density_2d_filled() | stat_density_2d_filled() | x |
| geom_density_2d() | stat_density_2d() | x |
| geom_density() | stat_density() | x |
| geom_dotplot() | stat_bindot() | x |
| geom_errorbar() | stat_identity() | |
| geom_errorbarh() | stat_identity() | |
| geom_freqpoly() | stat_bin() | x |
| geom_function() | stat_function() | x |
| geom_hex() | stat_bin_hex() | x |
| geom_histogram() | stat_bin() | x |
| geom_hline() | stat_identity() | |
| geom_jitter() | stat_identity() | |
| geom_label() | stat_identity() | |
| geom_line() | stat_identity() | |
| geom_linerange() | stat_identity() | |
| geom_map() | stat_identity() | |
| geom_path() | stat_identity() | |
| geom_point() | stat_identity() | |
| geom_pointrange() | stat_identity() | |
| geom_polygon() | stat_identity() | |
| geom_qq_line() | stat_qq_line() | x |
| geom_qq() | stat_qq() | x |
| geom_quantile() | stat_quantile() | x |
| geom_raster() | stat_identity() | |
| geom_rect() | stat_identity() | |
| geom_ribbon() | stat_identity() | |
| geom_rug() | stat_identity() | |
| geom_segment() | stat_identity() | |
| geom_sf_label() | stat_sf_coordinates() | x |
| geom_sf_text() | stat_sf_coordinates() | x |
| geom_sf() | stat_sf() | x |
| geom_smooth() | stat_smooth() | x |
| geom_spoke() | stat_identity() | |
| geom_step() | stat_identity() | |
| geom_text() | stat_identity() | |
| geom_tile() | stat_identity() | |
| geom_violin() | stat_ydensity() | x |
| geom_vline() | stat_identity() | |

| stat | default geom | shared docs |
|---|---|---|
| stat_bin_2d() | geom_tile() | |
| stat_bin_hex() | geom_hex() | x |
| stat_bin() | geom_bar() | x |
| stat_boxplot() | geom_boxplot() | x |
| stat_count() | geom_bar() | x |
| stat_contour_filled() | geom_contour_filled() | x |
| stat_contour() | geom_contour() | x |
| stat_density_2d_filled() | geom_density_2d() | x |
| stat_density_2d() | geom_density_2d() | x |
| stat_density() | geom_area() | |
| stat_ecdf() | geom_step() | |
| stat_ellipse() | geom_path() | |
| stat_function() | geom_function() | x |
| stat_function() | geom_path() | |
| stat_identity() | geom_point() | |
| stat_qq_line() | geom_path() | |
| stat_qq() | geom_point() | |
| stat_quantile() | geom_quantile() | x |
| stat_sf_coordinates() | geom_point() | |
| stat_sf() | geom_rect() | |
| stat_smooth() | geom_smooth() | x |
| stat_sum() | geom_point() | |
| stat_summary_2d() | geom_tile() | |
| stat_summary_bin() | geom_pointrange() | |
| stat_summary_hex() | geom_hex() | |
| stat_summary() | geom_pointrange() | |
| stat_unique() | geom_point() | |

The function stat_smooth() calculates the following
variables:

- y: predicted value
- ymin: lower value of the confidence interval
- ymax: upper value of the confidence interval
- se: standard error

The parameters that control the behavior of
stat_smooth() include:

- method: This is the method used to compute the
smoothing line. If NULL, a default method is used based on
the sample size: stats::loess() when there are fewer than
1,000 observations in a group, and mgcv::gam() with
formula = y ~ s(x, bs = "cs") otherwise. Alternatively, the
user can provide a character vector with a function name,
e.g. "lm", "loess", or a function,
e.g. MASS::rlm.
- formula: When providing a custom method
argument, the formula to use. The default is y ~ x. For
example, to use the line implied by
lm(y ~ x + I(x ^ 2) + I(x ^ 3)), use
method = "lm" or method = lm and
formula = y ~ x + I(x ^ 2) + I(x ^ 3).
- method.args: Arguments other than the formula,
which is already specified in the formula
argument, to pass to the function in method.
- se: If TRUE, display standard error
bands; if FALSE, only display the line.
- na.rm: If FALSE, missing values are
removed with a warning; if TRUE, they are silently removed.
The default is FALSE in order to make debugging easier. If
missing values are known to be in the data, then the warning can be ignored, but if
missing values are not anticipated, this warning can help catch errors.

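The parameters described for stat_smooth() can be tried out through geom_smooth(). The sketch below fits the cubic polynomial mentioned under formula and switches off the error band; p_cubic and smooth_data are hypothetical names, and layer_data() is used only to look at the computed variables.

```r
library(ggplot2)  # provides mpg, geom_smooth(), layer_data()

p_cubic <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(
    method = "lm",                      # fit with stats::lm()
    formula = y ~ x + I(x^2) + I(x^3),  # cubic polynomial in displ
    se = FALSE                          # draw the line only, no band
  )

# Second layer is the smooth; x and y hold the fitted curve
smooth_data <- layer_data(p_cubic, 2)
head(smooth_data[, c("x", "y")])
```

Setting se = TRUE instead would add the interval bounds to the computed data and draw the shaded band.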
If group = 1 is not included, then all the bars in the
plot will have the same height, a height of 1. The function
geom_bar() assumes that the groups are equal to the
x values since the stat computes the counts within the
group.
See examples of charts below
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, y = ..prop..))
The problem with these plots is that the proportions are calculated within the groups.
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, y = ..prop..))
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = color, y = ..prop..))
The following code will produce the intended stacked bar charts for
the case with no fill aesthetic.
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, y = ..prop.., group = 1))
With the fill aesthetic, the heights of the bars need to
be normalized.
ggplot(data = diamonds) +
geom_bar(aes(x = cut, y = ..count.. / sum(..count..), fill = color))
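The effect of group = 1 can be verified numerically. This sketch assumes ggplot2 3.3.0 or later, where after_stat(prop) is the documented equivalent of ..prop..; the object names are chosen for illustration.

```r
library(ggplot2)  # provides diamonds, after_stat(), layer_data()

# Without group = 1, each level of cut is its own group, so prop is always 1
p_no_group <- ggplot(diamonds) +
  geom_bar(aes(x = cut, y = after_stat(prop)))

# With group = 1, proportions are computed across all levels of cut
p_grouped <- ggplot(diamonds) +
  geom_bar(aes(x = cut, y = after_stat(prop), group = 1))

props_no_group <- layer_data(p_no_group)$y
props_grouped <- layer_data(p_grouped)$y

# Proportions computed by hand for comparison
manual_props <- as.numeric(prop.table(table(diamonds$cut)))

print(props_no_group)
print(props_grouped)
print(manual_props)
```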
ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
geom_point()
There is overplotting because there are multiple observations for
each combination of cty and hwy values.
I would improve the plot by using a jitter position adjustment to decrease overplotting. For instance:
ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
geom_point(position = "jitter")
As you can see, the relationship between cty and
hwy is clear even without jittering the points, but
jittering shows the locations where there are more observations.
What parameters to geom_jitter() control the amount of jittering?
From the geom_jitter() documentation, there are two arguments to jitter:

- width controls the amount of horizontal displacement, and
- height controls the amount of vertical displacement.

The default values of width and height
will introduce noise in both directions. The plot with the default
values of height and width is shown in the summary at the end of this answer.
However, we can change these parameters. Here are a few examples to
understand how these parameters affect the amount of jittering.
When width = 0 there is no horizontal jitter.
ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
geom_jitter(width = 0)
When width = 20, there is too much horizontal
jitter.
ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
geom_jitter(width = 20)
When height = 0, there is no vertical jitter
ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
geom_jitter(height = 0)
When height = 15, there is too much vertical jitter.
ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
geom_jitter(height = 15)
When width = 0 and height = 0, there is
neither horizontal nor vertical jitter, and the plot produced is
identical to the one produced with geom_point().
ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
geom_jitter(height = 0, width = 0)
## In summary
Note that the height and width arguments
are in the units of the data. Thus height = 1
(width = 1) corresponds to different relative amounts of
jittering depending on the scale of the y (x)
variable. The default values of height and
width are defined to be 80% of the
resolution() of the data, which is the smallest non-zero
distance between adjacent values of a variable. When x and
y are discrete variables, their resolutions are both equal
to 1, and height = 0.4 and width = 0.4 since
the jitter moves points in both positive and negative directions.
Also, the default values of height and
width in geom_jitter() are non-zero, so unless
both height and width are explicitly set to
0, there will be some jitter.
ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
geom_jitter()
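The default jitter amounts described above can be worked out from resolution(). The sketch below follows the reasoning in the text (0.4 of the resolution on either side); the variable names are made up for illustration.

```r
library(ggplot2)  # provides mpg and resolution()

# cty and hwy are integer-valued, so their resolution is 1
res_cty <- resolution(mpg$cty)
res_hwy <- resolution(mpg$hwy)

# Jitter of up to 0.4 in either direction spans 80% of the resolution
default_width <- 0.4 * res_cty
default_height <- 0.4 * res_hwy

c(width = default_width, height = default_height)
```

This is why width = 20 or height = 15 in the earlier examples swamps the data: those values are many times larger than the spacing between adjacent values.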
The geom geom_jitter() adds random variation to the
locations of the points in the graph. In other words, it “jitters” the
locations of points slightly. This method reduces overplotting since two
points with the same location are unlikely to have the same random
variation.
Examples
ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
geom_jitter()
However, the reduction in overlapping comes at the cost of slightly
changing the x and y values of the points.
The geom geom_count() sizes the points relative to the
number of observations. Combinations of (x, y)
values with more observations will be larger than those with fewer
observations.
ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
geom_count()
The geom_count() geom does not change the x and
y coordinates of the points. However, if the points are
close together and counts are large, the size of some points can itself
create overplotting. In the following example, a third
variable mapped to color is added to the plot. In this case,
geom_count() is less readable than
geom_jitter().
ggplot(data = mpg, mapping = aes(x = cty, y = hwy, color = class)) +
geom_jitter()
ggplot(data = mpg, mapping = aes(x = cty, y = hwy, color = class)) +
geom_count()
ggplot(data = mpg, mapping = aes(x = cty, y = hwy, color = class)) +
geom_count(position = "jitter")
As the charts above show, combining geom_count() with
jitter, which is specified with the position argument to
geom_count() rather than with its own geom, helps with overplotting a
little.
But as this example shows, there is unfortunately no universal solution to overplotting. The costs and benefits of different approaches will depend on the structure of the data and the goal of the data scientist.
The default position for geom_boxplot() is
"dodge2", which is a shortcut for
position_dodge2. This position adjustment does not change
the vertical position of a geom but moves the geom horizontally to avoid
overlapping other geoms. See the documentation for position_dodge2()
for additional discussion on how it works.
For example, when we add colour = class to the box plot,
the boxplots for the different classes within each level of the drv variable are placed side by
side, i.e., dodged.
ggplot(data = mpg, aes(x = drv, y = hwy, colour = class)) +
geom_boxplot()
If position_identity() is used, the boxplots overlap, as in the
example below.
ggplot(data = mpg, aes(x = drv, y = hwy, colour = class)) +
geom_boxplot(position = "identity")
Turn a stacked bar chart into a pie chart using coord_polar().
A pie chart is a stacked bar chart with the addition of polar coordinates. Take this stacked bar chart with a single category.
ggplot(mpg, aes(x = factor(1), fill = drv)) +
geom_bar()
Now add coord_polar(theta = "y") to create a pie
chart.
ggplot(mpg, aes(x = factor(1), fill = drv)) +
geom_bar(width = 1) +
coord_polar(theta = "y")
The argument theta = "y" maps y to the
angle of each section. If coord_polar() is specified without
theta = "y", then the resulting plot is called a bulls-eye
chart.
ggplot(mpg, aes(x = factor(1), fill = drv)) +
geom_bar(width = 1) +
coord_polar()
The labs function adds axis titles, plot titles, and a
caption to the plot.
For instance
ggplot(data = mpg, mapping = aes(x = class, y = hwy)) +
geom_boxplot() +
coord_flip() +
labs(y = "Highway MPG",
x = "Class",
title = "Highway MPG by car class",
subtitle = "1999-2008",
caption = "Source: http://fueleconomy.gov")
The arguments to labs() are optional, so you can add as
many or as few of these as are needed.
ggplot(data = mpg, mapping = aes(x = class, y = hwy)) +
geom_boxplot() +
coord_flip() +
labs(y = "Highway MPG",
x = "Class",
title = "Highway MPG by car class")
The labs() function is not the only function that adds
titles to plots. The xlab(), ylab(), and x-
and y-scale functions can add axis titles. The ggtitle()
function adds plot titles.
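As a sketch of that equivalence, the two plots below are labelled the same way, once with labs() and once with the individual helper functions. The object names are hypothetical, and coord_flip() is left out to keep the example short.

```r
library(ggplot2)  # provides mpg, labs(), xlab(), ylab(), ggtitle()

base_plot <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot()

# All labels set in one call
with_labs <- base_plot +
  labs(x = "Class", y = "Highway MPG", title = "Highway MPG by car class")

# The same labels set with one helper per label
with_helpers <- base_plot +
  xlab("Class") +
  ylab("Highway MPG") +
  ggtitle("Highway MPG by car class")

print(with_labs)
print(with_helpers)
```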
The coord_map() function uses map projections to project
the three-dimensional Earth onto a two-dimensional plane. By default,
coord_map() uses the Mercator
projection. This projection is applied to all the geoms in the plot.
The coord_quickmap() function uses an approximate but
faster map projection. This approximation ignores the curvature of Earth
and adjusts the map for the latitude/longitude ratio. The
coord_quickmap() projection is faster than
coord_map() both because the projection is computationally
easier, and because, unlike with coord_map(), the coordinates of the
individual geoms do not need to be transformed.
See the coord_map() documentation for more information on these functions and some examples.
ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
geom_point() +
geom_abline() +
coord_fixed()
The function coord_fixed() ensures that the line
produced by geom_abline() is at a 45-degree angle. A
45-degree line makes it easy to compare the highway and city mileage to
the case in which city and highway MPG were equal.
For example:
p <- ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
geom_point() +
geom_abline()
p + coord_fixed()
If we didn’t include coord_fixed(), then the line would
no longer have an angle of 45 degrees.
1. Why does this code not work?
The variable being printed is my_varıable, not my_variable: the seventh character is “ı” (“LATIN SMALL LETTER DOTLESS I”), not “i”.
While it wouldn’t have helped much in this case, the importance of distinguishing characters in code is one reason why fonts that clearly distinguish similar characters are preferred in programming. It is especially important to distinguish between two sets of similar-looking characters:

- the numeral zero (0), the Latin small letter O (o), and the Latin capital letter O (O);
- the numeral one (1), the Latin small letter I (i), the Latin capital letter I (I), and the Latin small letter L (l).

In these fonts, zero and the Latin letter O are often distinguished by using a glyph for zero that uses either a dot in the interior or a slash through it. Some examples of fonts with dotted or slashed zero glyphs are Consolas, Deja Vu Sans Mono, Monaco, Menlo, Source Sans Pro, and FiraCode.
Error messages of the form “object ‘…’ not found” mean exactly what they say. R cannot find an object with that name. Unfortunately, the error does not tell you why that object cannot be found, because R does not know the reason that the object does not exist. The most common scenarios in which I encounter this error message are:

- I forgot to create the object, or an error prevented the object from being created.
- I made a typo in the object’s name, either when using it or when I created it (as in the example above), or I forgot what I had originally named it. If you find yourself often writing the wrong name for an object, it is a good indication that the original name was not a good one.
- I forgot to load the package that contains the object using library().

my_variable <- 10
my_varıable
#> Error in eval(expr, envir, enclos): object 'my_varıable' not found
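When an "object not found" error appears, one way to narrow down the cause is to ask R directly whether the name exists. The sketch below uses base R only; typo_name and lookup_result are hypothetical names, and the dotless i is written as the escape \u0131 so the difference is visible in the source.

```r
my_variable <- 10

# The misspelled name, with U+0131 (dotless i) in place of "i"
typo_name <- "my_var\u0131able"

exists("my_variable")  # the name that was actually assigned
exists(typo_name)      # the misspelled name was never created

# Capture the error message instead of stopping the script
lookup_result <- tryCatch(
  get(typo_name),
  error = function(e) conditionMessage(e)
)
print(lookup_result)
```

Checking exists() for both spellings makes it clear that the problem is the name itself, not a missing package or a failed assignment.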
2. Tweak each of the following R commands so that they run correctly:
The error message is argument “data” is missing, with no default. This error is a result of a typo, dota instead of data.
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy))
R could not find the function fliter() because we made a typo: fliter instead of filter.
We aren’t done yet. But the error message gives a suggestion. Let’s follow it.
filter(mpg, cyl == 8)
## # A tibble: 70 × 11
## manufacturer model displ year cyl trans drv cty hwy fl class
## <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
## 1 audi a6 quattro 4.2 2008 8 auto… 4 16 23 p mids…
## 2 chevrolet c1500 sub… 5.3 2008 8 auto… r 14 20 r suv
## 3 chevrolet c1500 sub… 5.3 2008 8 auto… r 11 15 e suv
## 4 chevrolet c1500 sub… 5.3 2008 8 auto… r 14 20 r suv
## 5 chevrolet c1500 sub… 5.7 1999 8 auto… r 13 17 r suv
## 6 chevrolet c1500 sub… 6 2008 8 auto… r 12 17 r suv
## 7 chevrolet corvette 5.7 1999 8 manu… r 16 26 p 2sea…
## 8 chevrolet corvette 5.7 1999 8 auto… r 15 23 p 2sea…
## 9 chevrolet corvette 6.2 2008 8 manu… r 16 26 p 2sea…
## 10 chevrolet corvette 6.2 2008 8 auto… r 15 25 p 2sea…
## # … with 60 more rows
filter(diamonds, carat > 3)
## # A tibble: 32 × 10
## carat cut color clarity depth table price x y z
## <dbl> <ord> <ord> <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl>
## 1 3.01 Premium I I1 62.7 58 8040 9.1 8.97 5.67
## 2 3.11 Fair J I1 65.9 57 9823 9.15 9.02 5.98
## 3 3.01 Premium F I1 62.2 56 9925 9.24 9.13 5.73
## 4 3.05 Premium E I1 60.9 58 10453 9.26 9.25 5.66
## 5 3.02 Fair I I1 65.2 56 10577 9.11 9.02 5.91
## 6 3.01 Fair H I1 56.1 62 10761 9.54 9.38 5.31
## 7 3.65 Fair H I1 67.1 53 11668 9.53 9.48 6.38
## 8 3.24 Premium H I1 62.1 58 12300 9.44 9.4 5.85
## 9 3.22 Ideal I I1 62.6 55 12545 9.49 9.42 5.92
## 10 3.5 Ideal H I1 62.8 57 12587 9.65 9.59 6.03
## # … with 22 more rows
The original command, filter(diamond, carat > 3), fails with the error object 'diamond' not found because the data frame is named diamonds, not diamond.
This brings up a menu of keyboard shortcuts (it also knit the file). The same menu can be found under Tools -> Keyboard Shortcuts Help.