Load necessary packages:
library(ggplot2)
library(dplyr)
library(nycflights13)
library(knitr)
# Load readr to read in CSV files (install it first with install.packages("readr") if needed):
library(readr)

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button, a document will be generated that includes both the content and the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
mtcars %>%
  head() %>%
  kable()

| | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 |
| Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
| Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 |
| Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
| Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 |
| Valiant | 18.1 | 6 | 225 | 105 | 2.76 | 3.460 | 20.22 | 1 | 0 | 3 | 1 |
You can also embed plots, for example:
ggplot(mtcars, aes(x = disp, y = mpg)) +
  geom_point()

Here is a scatterplot of the number of Starbucks/Dunkin Donuts locations per 1000 individuals against median income for 1024 census tracts in Western Massachusetts:
DD_vs_SB <- read_csv("https://rudeboybert.github.io/STAT135/content/PS07/DD_vs_SB.csv")
# Add your code to create plot below:
ggplot(data = DD_vs_SB, aes(x = median_income, y = shops_per_1000, color = Type)) +
  geom_point() +
  facet_wrap("Type") +
  geom_smooth(method = "lm")

Here is a plot comparing beer vs spirit vs wine consumption worldwide:
drinks <- read_csv("https://rudeboybert.github.io/STAT135/content/PS07/drinks.csv")
# Add your code to create plot below:
ggplot(data = drinks, aes(x = type, y = servings)) +
  geom_boxplot()
# Or faceted histogram:
ggplot(data = drinks, mapping = aes(x = servings)) +
  geom_histogram() +
  facet_wrap("type")

Here is a table that shows the median departure delay for each airline leaving Newark:
# Add your code to create table below:
delay <- flights %>%
  filter(origin == "EWR") %>%
  group_by(carrier) %>%
  summarise(median = quantile(dep_delay, 0.5, na.rm = TRUE))
kable(delay)

| carrier | median |
|---|---|
| 9E | -5 |
| AA | -3 |
| AS | -3 |
| B6 | -3 |
| DL | -2 |
| EV | -1 |
| MQ | -2 |
| OO | -1 |
| UA | 0 |
| US | -4 |
| VX | -1 |
| WN | 1 |
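
The table above computes the median as the 0.5 quantile. As a small sketch of why that works, the base R snippet below compares `quantile(x, 0.5)` with `median(x)` on a toy vector. The vector `toy_delay` and its values are made up for illustration and are not taken from `flights`.

```r
# Toy vector of departure delays in minutes (made-up values, not from flights)
toy_delay <- c(-5, -3, NA, 0, 12)

# quantile() at probability 0.5 returns a named value ("50%"),
# so drop the name before comparing
via_quantile <- unname(quantile(toy_delay, 0.5, na.rm = TRUE))

# median() computes the same statistic directly
via_median <- median(toy_delay, na.rm = TRUE)

# Compare the two results, allowing for floating-point tolerance
same_result <- isTRUE(all.equal(via_quantile, via_median))
print(same_result)
```

With the default quantile type the two calls should agree, so `summarise(median = median(dep_delay, na.rm = TRUE))` would likely be an equivalent and slightly more readable way to build the same table.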