For this quiz, you will use the mpg (miles per gallon) dataset. This dataset contains a subset of the fuel economy data that the EPA makes available on http://fueleconomy.gov. It contains only models which had a new release every year between 1999 and 2008; this was used as a proxy for the popularity of the car.

The dataset has the following variables:

# Load the package
library(tidyverse)

# Import data
data(mpg, package="ggplot2")

# Print the first 6 rows
head(mpg)
## # A tibble: 6 x 11
##   manufacturer model displ  year   cyl trans      drv     cty   hwy fl    class 
##   <chr>        <chr> <dbl> <int> <int> <chr>      <chr> <int> <int> <chr> <chr> 
## 1 audi         a4      1.8  1999     4 auto(l5)   f        18    29 p     compa~
## 2 audi         a4      1.8  1999     4 manual(m5) f        21    29 p     compa~
## 3 audi         a4      2    2008     4 manual(m6) f        20    31 p     compa~
## 4 audi         a4      2    2008     4 auto(av)   f        21    30 p     compa~
## 5 audi         a4      2.8  1999     6 auto(l5)   f        16    26 p     compa~
## 6 audi         a4      2.8  1999     6 manual(m5) f        18    26 p     compa~
# Get a sense of the dataset
glimpse(mpg)
## Rows: 234
## Columns: 11
## $ manufacturer <chr> "audi", "audi", "audi", "audi", "audi", "audi", "audi"...
## $ model        <chr> "a4", "a4", "a4", "a4", "a4", "a4", "a4", "a4 quattro"...
## $ displ        <dbl> 1.8, 1.8, 2.0, 2.0, 2.8, 2.8, 3.1, 1.8, 1.8, 2.0, 2.0,...
## $ year         <int> 1999, 1999, 2008, 2008, 1999, 1999, 2008, 1999, 1999, ...
## $ cyl          <int> 4, 4, 4, 4, 6, 6, 6, 4, 4, 4, 4, 6, 6, 6, 6, 6, 6, 8, ...
## $ trans        <chr> "auto(l5)", "manual(m5)", "manual(m6)", "auto(av)", "a...
## $ drv          <chr> "f", "f", "f", "f", "f", "f", "f", "4", "4", "4", "4",...
## $ cty          <int> 18, 21, 20, 21, 16, 18, 18, 18, 16, 20, 19, 15, 17, 17...
## $ hwy          <int> 29, 29, 31, 30, 26, 26, 27, 26, 25, 28, 27, 25, 25, 25...
## $ fl           <chr> "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p",...
## $ class        <chr> "compact", "compact", "compact", "compact", "compact",...
summary(mpg)
##  manufacturer          model               displ            year     
##  Length:234         Length:234         Min.   :1.600   Min.   :1999  
##  Class :character   Class :character   1st Qu.:2.400   1st Qu.:1999  
##  Mode  :character   Mode  :character   Median :3.300   Median :2004  
##                                        Mean   :3.472   Mean   :2004  
##                                        3rd Qu.:4.600   3rd Qu.:2008  
##                                        Max.   :7.000   Max.   :2008  
##       cyl           trans               drv                 cty       
##  Min.   :4.000   Length:234         Length:234         Min.   : 9.00  
##  1st Qu.:4.000   Class :character   Class :character   1st Qu.:14.00  
##  Median :6.000   Mode  :character   Mode  :character   Median :17.00  
##  Mean   :5.889                                         Mean   :16.86  
##  3rd Qu.:8.000                                         3rd Qu.:19.00  
##  Max.   :8.000                                         Max.   :35.00  
##       hwy             fl               class          
##  Min.   :12.00   Length:234         Length:234        
##  1st Qu.:18.00   Class :character   Class :character  
##  Median :24.00   Mode  :character   Mode  :character  
##  Mean   :23.44                                        
##  3rd Qu.:27.00                                        
##  Max.   :44.00

Q1-Q6 You believe that the typical car today drives at least 21 miles per gallon (mpg) on the highway. You have a sample of 234 cars. Test this hypothesis by answering Q1-Q6.

Q1 Calculate the standard error of the mean mpg on the highway.

SEM <- sd(mpg$hwy) / sqrt(234)
SEM 
## [1] 0.3892672

The standard error of the mean highway mpg is 0.389 mpg.
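The calculation above hard-codes the sample size as 234. A minimal sketch of the same formula as a reusable function follows; `sem` is a hypothetical helper name (not part of base R or the tidyverse), and dropping missing values before counting is an assumption added here.

```r
# Standard error of the mean: sample standard deviation / sqrt(sample size).
# `sem` is a hypothetical helper name used only for illustration.
sem <- function(x) {
  x <- x[!is.na(x)]            # assumption: ignore missing values
  sd(x) / sqrt(length(x))      # length(x) replaces the hard-coded 234
}

# Toy check: sd(c(2, 4, 6, 8)) is about 2.582 and n = 4
sem(c(2, 4, 6, 8))
```

Because `mpg$hwy` has 234 values and no missing entries, `sem(mpg$hwy)` should reproduce the 0.3892672 printed above.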

Q2 Calculate the sample mean.

sample_mean <- mean(mpg$hwy)
sample_mean
## [1] 23.44017

The mean mpg on the highway is 23.44 mpg.

Q3 Calculate the upper bound of the 95% confidence interval.

upperCI <- sample_mean + 1.96 * SEM
upperCI
## [1] 24.20313

The upper bound is 24.20 mpg.

Q4 Calculate the lower bound.

lowerCI <- sample_mean - 1.96 * SEM
lowerCI
## [1] 22.67721

The lower bound is 22.68 mpg.

Q5 Calculate the 95% confidence interval.

c(lowerCI,upperCI)
## [1] 22.67721 24.20313

The 95% confidence interval is 22.68 mpg to 24.20 mpg.
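The steps in Q1-Q5 can be collected into one function. The sketch below is an illustration, not the quiz's required method: `ci_mean` and its `use_t` argument are hypothetical names, and the t-based option is an addition that matters mostly for small samples. With `use_t = FALSE` it uses `qnorm(0.975)`, which is about 1.96, so results can differ from the hand calculation in the later decimal places.

```r
# Hypothetical helper: confidence interval for a mean.
ci_mean <- function(x, level = 0.95, use_t = FALSE) {
  x <- x[!is.na(x)]
  n <- length(x)
  se <- sd(x) / sqrt(n)                      # standard error of the mean
  alpha <- 1 - level
  crit <- if (use_t) {
    qt(1 - alpha / 2, df = n - 1)            # t critical value
  } else {
    qnorm(1 - alpha / 2)                     # normal critical value (about 1.96)
  }
  c(lower = mean(x) - crit * se, upper = mean(x) + crit * se)
}

x <- c(18, 21, 20, 21, 16, 18)   # the six cty values printed by head(mpg)
ci_mean(x)                        # normal-based interval
ci_mean(x, use_t = TRUE)          # t-based interval, wider for small n
t.test(x)$conf.int                # base R cross-check for the t-based interval
```

With 234 observations the two critical values are close, so `ci_mean(mpg$hwy)` should land very near the 22.68 to 24.20 interval computed by hand.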

Q6 What is your conclusion regarding the hypothesis?

The data support the hypothesis. We are 95% confident that the mean highway mileage of the typical car lies between 22.68 and 24.20 miles per gallon. Because the entire interval lies above 21, the claim that the typical car drives at least 21 miles per gallon on the highway is consistent with the sample.

Q7 Now that you understand the mpg on the highway, you want to test another hypothesis. You believe that the typical car today drives at least 16.5 miles per gallon in the city. To test this hypothesis, repeat Q1-Q6. Discuss your conclusion using the 95% confidence interval.

Hint: Insert the code below.

SEM <- sd(mpg$cty) / sqrt(234)
SEM 
## [1] 0.2782199
sample_mean <- mean(mpg$cty)
sample_mean
## [1] 16.85897
upperCI <- sample_mean + 1.96 * SEM
upperCI
## [1] 17.40429
lowerCI <- sample_mean - 1.96 * SEM
lowerCI
## [1] 16.31366
c(lowerCI,upperCI)
## [1] 16.31366 17.40429

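The result is inconclusive at the 95% level. We are 95% confident that the mean city mileage lies between 16.31 and 17.40 miles per gallon. The sample mean (16.86) is above 16.5, but the interval also includes 16.5 and values below it, so the sample neither confirms nor rules out the hypothesis that the typical car drives at least 16.5 miles per gallon in the city. Note that the interval describes the mean, not the mileage of individual cars.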
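The comparison behind Q6 and Q7 comes down to where the hypothesized value falls relative to the interval for the mean. A small sketch of that check follows; `compare_to_ci` is a hypothetical helper written for illustration, and the bounds are the ones printed above.

```r
# Hypothetical helper: locate a hypothesized value relative to an interval.
compare_to_ci <- function(value, lower, upper) {
  if (value < lower) {
    "below the interval"    # the mean is credibly larger than the value
  } else if (value > upper) {
    "above the interval"    # the mean is credibly smaller than the value
  } else {
    "inside the interval"   # the data cannot separate the mean from the value
  }
}

compare_to_ci(21,   22.67721, 24.20313)  # highway: "below the interval"
compare_to_ci(16.5, 16.31366, 17.40429)  # city:    "inside the interval"
```

For an "at least" hypothesis, a value below the whole interval supports the claim, while a value inside the interval leaves it undecided at the chosen confidence level.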

Q8 Hide the messages and warnings, but display the code and its results on the webpage.

Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
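One possible way to apply these options to every chunk at once is a call to `knitr::opts_chunk$set()` in the first chunk of the document; the same option names can also be written in an individual chunk header. This is a sketch assuming the page is rendered with knitr, and the `warning` option is included here because the question also asks to hide warnings.

```r
# Typically placed in the first (setup) chunk of the .Rmd file.
knitr::opts_chunk$set(
  echo    = TRUE,      # display the code
  results = "markup",  # display the printed results
  message = FALSE,     # hide messages, e.g. package startup notes
  warning = FALSE      # hide warnings
)
```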

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.