---
title: "Quiz6"
author: "Bryan McGrath"
output:
  html_document:
    toc: yes
editor_options:
  chunk_output_type: console
---
For this quiz, you are going to use the mpg (miles per gallon) dataset. This dataset contains a subset of the fuel economy data that the EPA makes available on http://fueleconomy.gov. It contains only models which had a new release every year between 1999 and 2008; this was used as a proxy for the popularity of the car.
The dataset has the following variables:
- manufacturer: manufacturer name
- model: model name
- displ: engine displacement, in litres
- year: year of manufacture
- cyl: number of cylinders
- trans: type of transmission
- drv: the type of drive train, where f = front-wheel drive, r = rear wheel drive, 4 = 4wd
- cty: city miles per gallon
- hwy: highway miles per gallon
- fl: fuel type
- class: "type" of car

# Load the package
library(tidyverse)
## -- Attaching packages ------------------------------------ tidyverse 1.3.0 --
## v ggplot2 3.3.2 v purrr 0.3.4
## v tibble 3.0.3 v dplyr 1.0.2
## v tidyr 1.1.2 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.5.0
## -- Conflicts --------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
# Import data
data(mpg, package="ggplot2")
# Print the first 6 rows
head(mpg)
## # A tibble: 6 x 11
## manufacturer model displ year cyl trans drv cty hwy fl class
## <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
## 1 audi a4 1.8 1999 4 auto(l5) f 18 29 p compa~
## 2 audi a4 1.8 1999 4 manual(m5) f 21 29 p compa~
## 3 audi a4 2 2008 4 manual(m6) f 20 31 p compa~
## 4 audi a4 2 2008 4 auto(av) f 21 30 p compa~
## 5 audi a4 2.8 1999 6 auto(l5) f 16 26 p compa~
## 6 audi a4 2.8 1999 6 manual(m5) f 18 26 p compa~
# Get a sense of the dataset
glimpse(mpg)
## Rows: 234
## Columns: 11
## $ manufacturer <chr> "audi", "audi", "audi", "audi", "audi", "audi", "audi"...
## $ model <chr> "a4", "a4", "a4", "a4", "a4", "a4", "a4", "a4 quattro"...
## $ displ <dbl> 1.8, 1.8, 2.0, 2.0, 2.8, 2.8, 3.1, 1.8, 1.8, 2.0, 2.0,...
## $ year <int> 1999, 1999, 2008, 2008, 1999, 1999, 2008, 1999, 1999, ...
## $ cyl <int> 4, 4, 4, 4, 6, 6, 6, 4, 4, 4, 4, 6, 6, 6, 6, 6, 6, 8, ...
## $ trans <chr> "auto(l5)", "manual(m5)", "manual(m6)", "auto(av)", "a...
## $ drv <chr> "f", "f", "f", "f", "f", "f", "f", "4", "4", "4", "4",...
## $ cty <int> 18, 21, 20, 21, 16, 18, 18, 18, 16, 20, 19, 15, 17, 17...
## $ hwy <int> 29, 29, 31, 30, 26, 26, 27, 26, 25, 28, 27, 25, 25, 25...
## $ fl <chr> "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p",...
## $ class <chr> "compact", "compact", "compact", "compact", "compact",...
summary(mpg)
## manufacturer model displ year
## Length:234 Length:234 Min. :1.600 Min. :1999
## Class :character Class :character 1st Qu.:2.400 1st Qu.:1999
## Mode :character Mode :character Median :3.300 Median :2004
## Mean :3.472 Mean :2004
## 3rd Qu.:4.600 3rd Qu.:2008
## Max. :7.000 Max. :2008
## cyl trans drv cty
## Min. :4.000 Length:234 Length:234 Min. : 9.00
## 1st Qu.:4.000 Class :character Class :character 1st Qu.:14.00
## Median :6.000 Mode :character Mode :character Median :17.00
## Mean :5.889 Mean :16.86
## 3rd Qu.:8.000 3rd Qu.:19.00
## Max. :8.000 Max. :35.00
## hwy fl class
## Min. :12.00 Length:234 Length:234
## 1st Qu.:18.00 Class :character Class :character
## Median :24.00 Mode :character Mode :character
## Mean :23.44
## 3rd Qu.:27.00
## Max. :44.00
SEM <- sd(mpg$hwy) / sqrt(234)
SEM
## [1] 0.3892672
Sample_Mean <- mean(mpg$hwy)
Sample_Mean
## [1] 23.44017
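# Upper bound of the 95% confidence interval: mean plus 1.96 standard errors
UpperCI <- Sample_Mean + 1.96 * SEM
UpperCI
## [1] 24.20313
# Lower bound of the 95% confidence interval: mean minus 1.96 standard errors
LowerCI <- Sample_Mean - 1.96 * SEM
LowerCI
## [1] 22.67721
c(LowerCI, UpperCI)
## [1] 22.67721 24.20313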
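At the start we predicted that a typical car gets about 21 miles per gallon on the highway (hwy). The sample mean (23.44) and the lower bound of the 95% confidence interval (22.68) are both above 21, so the whole interval lies above our prediction. The data therefore suggest that typical highway mileage is higher than 21 mpg, which supports the hypothesis only if it is read as "at least 21".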
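The interval calculation above can be wrapped in a small function so that the sample size and the critical value are not typed by hand. This is a minimal sketch: `mean_ci` is a hypothetical helper name, and it assumes the normal approximation used above, with `qnorm()` giving the exact critical value that 1.96 rounds.

```r
# Hypothetical helper (not part of the quiz): normal-approximation
# confidence interval for the mean of a numeric vector.
mean_ci <- function(x, conf_level = 0.95) {
  x <- x[!is.na(x)]                      # drop missing values
  n <- length(x)                         # sample size taken from the data
  sem <- sd(x) / sqrt(n)                 # standard error of the mean
  z <- qnorm(1 - (1 - conf_level) / 2)   # about 1.96 for a 95% interval
  m <- mean(x)
  c(lower = m - z * sem, mean = m, upper = m + z * sem)
}

data(mpg, package = "ggplot2")
hwy_ci <- mean_ci(mpg$hwy)
hwy_ci
```

Taking `n` from `length(x)` keeps the calculation correct if the data are filtered first, which a hard-coded 234 would not.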
Hint: Insert the code below.
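# Assumes this question concerns city mileage (cty), as the conclusion below indicates.
# The standard error divides by the square root of the sample size (234), not of the hypothesized value.
SEM <- sd(mpg$cty) / sqrt(234)
SEM
## [1] 0.2782199
Sample_Mean <- mean(mpg$cty)
Sample_Mean
## [1] 16.85897
LowerCI <- Sample_Mean - 1.96 * SEM
LowerCI
## [1] 16.31366
UpperCI <- Sample_Mean + 1.96 * SEM
UpperCI
## [1] 17.40429
c(LowerCI, UpperCI)
## [1] 16.31366 17.40429
The 95% confidence interval for city mileage, roughly 16.3 to 17.4 mpg, contains 16.5, so the data are consistent with our hypothesis that a typical car gets about 16.5 miles per gallon in the city. A confidence interval cannot prove the hypothesis; it only shows that the data do not contradict it.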
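The hypothesized city value of 16.5 mpg can also be checked with a one-sample t-test, which reports a confidence interval and a p-value together. This is a sketch of an alternative to the hand calculation, not the method the quiz asks for. `t.test()` uses the t distribution rather than the fixed 1.96 multiplier, so its interval is marginally wider.

```r
data(mpg, package = "ggplot2")

# One-sample t-test of H0: mean city mileage equals 16.5 mpg
cty_test <- t.test(mpg$cty, mu = 16.5, conf.level = 0.95)

cty_test$conf.int   # t-based 95% confidence interval for the mean
cty_test$p.value    # a small p-value would count against 16.5

# TRUE when the hypothesized value lies inside the interval
contains_mu <- cty_test$conf.int[1] <= 16.5 && 16.5 <= cty_test$conf.int[2]
contains_mu
```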
Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
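As a rough illustration of the three options named in the hint, they can be set for a whole document from a setup chunk with `knitr::opts_chunk$set()`. The values below are assumptions chosen for illustration, not the ones the quiz requires. The same names can also be written in a single chunk header, for example `{r, message=FALSE}`.

```r
# Illustrative defaults only; adjust to what the question asks for.
knitr::opts_chunk$set(
  echo = TRUE,         # show the R source code in the knitted output
  message = FALSE,     # hide messages such as the tidyverse startup banner
  results = "markup"   # show printed results; "hide" would suppress them
)

# Inspect the defaults that are now in effect
knitr::opts_chunk$get(c("echo", "message", "results"))
```

With `message = FALSE`, the "Attaching packages" and "Conflicts" lines printed by `library(tidyverse)` no longer appear in the HTML output.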