For this quiz, you are going to use the mpg (miles per gallon) dataset. This dataset contains a subset of the fuel economy data that the EPA makes available on http://fueleconomy.gov. It contains only models which had a new release every year between 1999 and 2008; this was used as a proxy for the popularity of the car.
The dataset has the following variables:
manufacturer: manufacturer name
model: model name
displ: engine displacement, in litres
year: year of manufacture
cyl: number of cylinders
trans: type of transmission
drv: the type of drive train, where f = front-wheel drive, r = rear wheel drive, 4 = 4wd
cty: city miles per gallon
hwy: highway miles per gallon
fl: fuel type
class: "type" of car
# Load the package
library(tidyverse)
# Import data
data(mpg, package="ggplot2")
# Print the first 6 rows
head(mpg)
## # A tibble: 6 x 11
## manufacturer model displ year cyl trans drv cty hwy fl class
## <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
## 1 audi a4 1.8 1999 4 auto(l5) f 18 29 p compa~
## 2 audi a4 1.8 1999 4 manual(m5) f 21 29 p compa~
## 3 audi a4 2 2008 4 manual(m6) f 20 31 p compa~
## 4 audi a4 2 2008 4 auto(av) f 21 30 p compa~
## 5 audi a4 2.8 1999 6 auto(l5) f 16 26 p compa~
## 6 audi a4 2.8 1999 6 manual(m5) f 18 26 p compa~
# Get a sense of the dataset
glimpse(mpg)
## Rows: 234
## Columns: 11
## $ manufacturer <chr> "audi", "audi", "audi", "audi", "audi", "audi", "audi"...
## $ model <chr> "a4", "a4", "a4", "a4", "a4", "a4", "a4", "a4 quattro"...
## $ displ <dbl> 1.8, 1.8, 2.0, 2.0, 2.8, 2.8, 3.1, 1.8, 1.8, 2.0, 2.0,...
## $ year <int> 1999, 1999, 2008, 2008, 1999, 1999, 2008, 1999, 1999, ...
## $ cyl <int> 4, 4, 4, 4, 6, 6, 6, 4, 4, 4, 4, 6, 6, 6, 6, 6, 6, 8, ...
## $ trans <chr> "auto(l5)", "manual(m5)", "manual(m6)", "auto(av)", "a...
## $ drv <chr> "f", "f", "f", "f", "f", "f", "f", "4", "4", "4", "4",...
## $ cty <int> 18, 21, 20, 21, 16, 18, 18, 18, 16, 20, 19, 15, 17, 17...
## $ hwy <int> 29, 29, 31, 30, 26, 26, 27, 26, 25, 28, 27, 25, 25, 25...
## $ fl <chr> "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p",...
## $ class <chr> "compact", "compact", "compact", "compact", "compact",...
summary(mpg)
## manufacturer model displ year
## Length:234 Length:234 Min. :1.600 Min. :1999
## Class :character Class :character 1st Qu.:2.400 1st Qu.:1999
## Mode :character Mode :character Median :3.300 Median :2004
## Mean :3.472 Mean :2004
## 3rd Qu.:4.600 3rd Qu.:2008
## Max. :7.000 Max. :2008
## cyl trans drv cty
## Min. :4.000 Length:234 Length:234 Min. : 9.00
## 1st Qu.:4.000 Class :character Class :character 1st Qu.:14.00
## Median :6.000 Mode :character Mode :character Median :17.00
## Mean :5.889 Mean :16.86
## 3rd Qu.:8.000 3rd Qu.:19.00
## Max. :8.000 Max. :35.00
## hwy fl class
## Min. :12.00 Length:234 Length:234
## 1st Qu.:18.00 Class :character Class :character
## Median :24.00 Mode :character Mode :character
## Mean :23.44
## 3rd Qu.:27.00
## Max. :44.00
# Standard error of the mean for highway mpg (n = 234 rows)
SEM <- sd(mpg$hwy) / sqrt(nrow(mpg))
SEM
## [1] 0.3892672
sample_mean <- mean(mpg$hwy)
sample_mean
## [1] 23.44017
upperCI <- sample_mean + 1.96 * SEM
upperCI
## [1] 24.20313
lowerCI <- sample_mean - 1.96 * SEM
lowerCI
## [1] 22.67721
c(lowerCI, upperCI)
## [1] 22.67721 24.20313
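The interval above can also be wrapped in a small reusable function. The following is a minimal sketch, not part of the original quiz: `mean_ci` and its `level` argument are hypothetical names introduced here, and the critical value comes from `qnorm()` rather than the rounded 1.96, so the bounds should match the ones above only to about four decimal places. It assumes the ggplot2 package is installed.

```r
# Hypothetical helper (not part of the quiz): normal-approximation
# confidence interval for the mean of a numeric vector.
mean_ci <- function(x, level = 0.95) {
  x <- x[!is.na(x)]                      # drop missing values, if any
  n <- length(x)
  sem <- sd(x) / sqrt(n)                 # standard error of the mean
  z <- qnorm(1 - (1 - level) / 2)        # about 1.96 when level = 0.95
  c(lower = mean(x) - z * sem, upper = mean(x) + z * sem)
}

# Apply it to highway mileage from the same dataset used above
hwy_ci <- mean_ci(ggplot2::mpg$hwy)
hwy_ci

# Cross-check with the t-based interval from base R's t.test()
t.test(ggplot2::mpg$hwy)$conf.int
```

With 234 observations the t-based interval is only slightly wider than the normal-approximation one, so either is a reasonable choice here.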
Both bounds of the 95% confidence interval (22.68 and 24.20) are above 21, so the hypothesis is supported by the results shown above.
Hint: Insert the code below.
The hypothesis states that a car gets 16.3 mpg in the city. This is roughly true: the summary above shows a mean cty of 16.86 and a median of 17.00, both slightly above the hypothesized value, so an average car does fall within the given mpg range.
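One way to check the city-mileage claim more formally is a one-sample t-test. This is a sketch under an assumption not stated in the quiz: that the hypothesis is a two-sided claim about the population mean of `cty`. The name `cty_test` is introduced here for illustration, and the ggplot2 package is assumed to be installed.

```r
# One-sample t-test of the city-mileage claim.
# Assumption: the hypothesis is a two-sided statement that the
# population mean of cty equals 16.3 mpg.
cty_test <- t.test(ggplot2::mpg$cty, mu = 16.3)

cty_test$estimate   # sample mean of cty
cty_test$conf.int   # 95% confidence interval for the mean
cty_test$p.value    # small values count as evidence against mu = 16.3
```

If 16.3 falls inside the reported confidence interval, the data are consistent with the hypothesis at the 5% level; if it falls outside, they are not.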
Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
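As an illustration of the three options named in the hint, the sketch below sets them as document-wide defaults through `knitr::opts_chunk$set()`. The specific values are illustrative assumptions, not the quiz's required answer, and the knitr package is assumed to be installed.

```r
# Set default chunk options once, e.g. in a setup chunk at the top
# of the .Rmd file. The values below are illustrative choices.
knitr::opts_chunk$set(
  message = FALSE,     # hide package messages, such as those from library(tidyverse)
  echo    = TRUE,      # show the R source code in the knitted document
  results = "markup"   # show printed results ("hide" would suppress them)
)

# The same options can be set for a single chunk in its header, e.g.
# {r, message=FALSE, echo=TRUE, results='hide'}
```

Setting the options per chunk is usually the better fit when only one chunk, such as the one that loads packages, needs different behavior from the rest of the document.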