Question: Is there a relationship between horsepower and fuel efficiency?

  1. Data Exploration: This should include summary statistics, means, medians, quartiles, or any other relevant information about the data set. Please include some conclusions in the R Markdown text.
gtcars <- read.csv("https://raw.githubusercontent.com/LeJQC/Bridge-Course/main/gtcars.csv")

summary(gtcars)
##        X            mfr               model                year     
##  Min.   : 1.0   Length:47          Length:47          Min.   :2014  
##  1st Qu.:12.5   Class :character   Class :character   1st Qu.:2016  
##  Median :24.0   Mode  :character   Mode  :character   Median :2016  
##  Mean   :24.0                                         Mean   :2016  
##  3rd Qu.:35.5                                         3rd Qu.:2016  
##  Max.   :47.0                                         Max.   :2017  
##                                                                     
##      trim            bdy_style               hp            hp_rpm    
##  Length:47          Length:47          Min.   :259.0   Min.   :5000  
##  Class :character   Class :character   1st Qu.:414.5   1st Qu.:5800  
##  Mode  :character   Mode  :character   Median :552.0   Median :6500  
##                                        Mean   :515.0   Mean   :6780  
##                                        3rd Qu.:602.5   3rd Qu.:7750  
##                                        Max.   :949.0   Max.   :9000  
##                                                                      
##       trq           trq_rpm         mpg_c           mpg_h      
##  Min.   :243.0   Min.   :1400   Min.   :11.00   Min.   :16.00  
##  1st Qu.:376.5   1st Qu.:1712   1st Qu.:13.00   1st Qu.:19.25  
##  Median :436.0   Median :3550   Median :15.00   Median :22.00  
##  Mean   :441.0   Mean   :3651   Mean   :15.33   Mean   :22.20  
##  3rd Qu.:508.0   3rd Qu.:5500   3rd Qu.:16.75   3rd Qu.:24.75  
##  Max.   :664.0   Max.   :6750   Max.   :28.00   Max.   :30.00  
##                  NA's   :1      NA's   :1       NA's   :1      
##   drivetrain           trsmn           ctry_origin             msrp        
##  Length:47          Length:47          Length:47          Min.   :  53900  
##  Class :character   Class :character   Class :character   1st Qu.:  86698  
##  Mode  :character   Mode  :character   Mode  :character   Median : 129900  
##                                                           Mean   : 193929  
##                                                           3rd Qu.: 241325  
##                                                           Max.   :1416362  
## 
mean(gtcars$mpg_h, na.rm = TRUE)
## [1] 22.19565
mean(gtcars$mpg_c, na.rm = TRUE)
## [1] 15.32609
median(gtcars$mpg_h, na.rm = TRUE)
## [1] 22
median(gtcars$mpg_c, na.rm = TRUE)
## [1] 15
mean(gtcars$hp)
## [1] 514.9574

This data set contains 47 cars from model years 2014 to 2017. The mean highway fuel economy is 22.20 miles per gallon, with a median of 22 miles per gallon. City fuel economy is lower, with a mean of 15.33 miles per gallon and a median of 15 miles per gallon. One car is missing its mpg values, which is why these statistics are computed with na.rm = TRUE. The mean horsepower is 514.96.
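The prompt also asks for quartiles, which `summary()` reports but which can be pulled out directly. The following is a minimal sketch, assuming a small data frame named `cars_sample` (a hypothetical name) built from the six rows shown in the `head(gtcars)` output later in this report, so the numbers describe only those six cars, not the full data set.

```r
# Six rows copied from the head(gtcars) output in this report; NOT the full data set.
cars_sample <- data.frame(
  model = c("GT", "458 Speciale", "458 Spider", "458 Italia", "488 GTB", "California"),
  hp    = c(647, 597, 562, 562, 661, 553),
  mpg_c = c(11, 13, 13, 13, 15, 16),
  mpg_h = c(18, 17, 17, 17, 22, 23)
)

# Quartiles (0%, 25%, 50%, 75%, 100%) for each numeric column
quartiles <- sapply(cars_sample[c("hp", "mpg_c", "mpg_h")], quantile, na.rm = TRUE)
print(quartiles)

# Interquartile range of horsepower
hp_iqr <- IQR(cars_sample$hp, na.rm = TRUE)
print(hp_iqr)
```

Running the same two calls on the full `gtcars` data frame should reproduce the quartiles in the `summary()` output above.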

  1. Data wrangling: Please perform some basic transformations. They will need to make sense but could include column renaming, creating a subset of the data, replacing values, or creating new columns with derived data (for example, if it makes sense, you could sum two columns together).
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
gtcars <- gtcars %>%
  mutate(avg_mpg = (mpg_c + mpg_h) /2,
         msrp = round(msrp/1000)) %>% 
  rename(price_k = msrp)

head(gtcars)
##   X     mfr        model year             trim   bdy_style  hp hp_rpm trq
## 1 1    Ford           GT 2017       Base Coupe       coupe 647   6250 550
## 2 2 Ferrari 458 Speciale 2015       Base Coupe       coupe 597   9000 398
## 3 3 Ferrari   458 Spider 2015             Base convertible 562   9000 398
## 4 4 Ferrari   458 Italia 2014       Base Coupe       coupe 562   9000 398
## 5 5 Ferrari      488 GTB 2016       Base Coupe       coupe 661   8000 561
## 6 6 Ferrari   California 2015 Base Convertible convertible 553   7500 557
##   trq_rpm mpg_c mpg_h drivetrain trsmn   ctry_origin price_k avg_mpg
## 1    5900    11    18        rwd    7a United States     447    14.5
## 2    6000    13    17        rwd    7a         Italy     292    15.0
## 3    6000    13    17        rwd    7a         Italy     264    15.0
## 4    6000    13    17        rwd    7a         Italy     234    15.0
## 5    3000    15    22        rwd    7a         Italy     245    18.5
## 6    4750    16    23        rwd    7a         Italy     199    19.5
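Because one car has missing `mpg_c` and `mpg_h` values, its `avg_mpg` is also `NA`. The sketch below shows how that propagation works and one way to subset to complete rows. It assumes a toy data frame, `cars_toy` (a hypothetical name): the first two rows come from the output above, and the third row is made up to stand in for the car with missing values.

```r
# Toy rows: the first two are from head(gtcars); the third is HYPOTHETICAL and
# only mimics the one car whose mpg values are missing.
cars_toy <- data.frame(
  model = c("GT", "488 GTB", "no-mpg car"),
  mpg_c = c(11, 15, NA),
  mpg_h = c(18, 22, NA)
)

# Same derived column as in the report
cars_toy$avg_mpg <- (cars_toy$mpg_c + cars_toy$mpg_h) / 2

# NA propagates through arithmetic, so the third row has avg_mpg = NA
print(cars_toy)

# Keep only the rows where avg_mpg could be computed
cars_complete <- cars_toy[!is.na(cars_toy$avg_mpg), ]
print(nrow(cars_complete))
```

With dplyr loaded, `filter(gtcars, !is.na(avg_mpg))` would be the equivalent subset on the full data. The plots below take a different route and pass `na.rm = TRUE` to each geom instead.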
  1. Graphics: Please make sure to display at least one scatter plot, box plot and histogram. Don’t be limited to this. Please explore the many other options in R packages such as ggplot2.
#Scatter plot
library(ggplot2)
gtcars %>% 
  ggplot(aes(x= hp, y = avg_mpg))+
  geom_point(color = "green", na.rm = TRUE, size = 3)+
  geom_smooth(method = "lm", na.rm = TRUE, se = FALSE)+
  labs(title = "Horsepower vs. Miles per Gallon", x = "Horsepower", y = "Average Miles per Gallon" )+
  theme_classic()
## `geom_smooth()` using formula = 'y ~ x'

#Box plot
gtcars %>% 
  ggplot(aes(x=avg_mpg, y= ctry_origin))+
  geom_boxplot(na.rm = TRUE)+
  labs(title = "Fuel Efficiency by Country ", x= "Average Miles per Gallon", y = "Country")+
  theme_classic()

#Histogram
gtcars %>% 
  ggplot(aes(hp))+
  geom_histogram(na.rm = TRUE, bins= 10, fill = "red", color = "red", alpha = 0.3 )+
  labs(title = "Horsepower by Country", x= "Horsepower", y = "Frequency")+
  theme_classic()+
  facet_wrap(~ctry_origin)

summary(gtcars)
##        X            mfr               model                year     
##  Min.   : 1.0   Length:47          Length:47          Min.   :2014  
##  1st Qu.:12.5   Class :character   Class :character   1st Qu.:2016  
##  Median :24.0   Mode  :character   Mode  :character   Median :2016  
##  Mean   :24.0                                         Mean   :2016  
##  3rd Qu.:35.5                                         3rd Qu.:2016  
##  Max.   :47.0                                         Max.   :2017  
##                                                                     
##      trim            bdy_style               hp            hp_rpm    
##  Length:47          Length:47          Min.   :259.0   Min.   :5000  
##  Class :character   Class :character   1st Qu.:414.5   1st Qu.:5800  
##  Mode  :character   Mode  :character   Median :552.0   Median :6500  
##                                        Mean   :515.0   Mean   :6780  
##                                        3rd Qu.:602.5   3rd Qu.:7750  
##                                        Max.   :949.0   Max.   :9000  
##                                                                      
##       trq           trq_rpm         mpg_c           mpg_h      
##  Min.   :243.0   Min.   :1400   Min.   :11.00   Min.   :16.00  
##  1st Qu.:376.5   1st Qu.:1712   1st Qu.:13.00   1st Qu.:19.25  
##  Median :436.0   Median :3550   Median :15.00   Median :22.00  
##  Mean   :441.0   Mean   :3651   Mean   :15.33   Mean   :22.20  
##  3rd Qu.:508.0   3rd Qu.:5500   3rd Qu.:16.75   3rd Qu.:24.75  
##  Max.   :664.0   Max.   :6750   Max.   :28.00   Max.   :30.00  
##                  NA's   :1      NA's   :1       NA's   :1      
##   drivetrain           trsmn           ctry_origin           price_k      
##  Length:47          Length:47          Length:47          Min.   :  54.0  
##  Class :character   Class :character   Class :character   1st Qu.:  86.5  
##  Mode  :character   Mode  :character   Mode  :character   Median : 130.0  
##                                                           Mean   : 193.9  
##                                                           3rd Qu.: 241.0  
##                                                           Max.   :1416.0  
##                                                                           
##     avg_mpg     
##  Min.   :13.50  
##  1st Qu.:15.62  
##  Median :18.50  
##  Mean   :18.76  
##  3rd Qu.:20.38  
##  Max.   :28.50  
##  NA's   :1
  1. Meaningful question for analysis: Please state at the beginning a meaningful question for analysis. Use the first three steps and anything else that would be helpful to answer the question you are posing from the data set you chose. Please write a brief conclusion paragraph in R markdown at the end.

Question: Is there a relationship between horsepower and fuel efficiency?

This data set contains information on 47 deluxe automobiles from 2014 to 2017. To measure fuel efficiency, I averaged the city (mpg_c) and highway (mpg_h) miles per gallon for each car. Cars with higher miles per gallon are more fuel efficient than cars with lower miles per gallon. One factor that may affect fuel efficiency is horsepower. In the scatter plot of each car's average miles per gallon against its horsepower, average miles per gallon tends to decrease as horsepower increases, which suggests a negative relationship between horsepower and fuel efficiency.
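The scatter plot shows the direction of the relationship, but a correlation coefficient and a regression slope would put a number on it. The following is a minimal sketch, not part of the original analysis. It assumes a hypothetical helper, `hp_mpg_relationship()`, and demonstrates it on the six rows from the `head(gtcars)` output. Those six cars (five of them Ferraris) are far too few to be representative, so the result on this sample should not be read as the answer to the question; the helper is meant to be called on the full `gtcars` data frame.

```r
# Hypothetical helper: Pearson correlation and least-squares slope of avg_mpg on hp
hp_mpg_relationship <- function(df) {
  # use = "complete.obs" drops rows where either value is missing
  r <- cor(df$hp, df$avg_mpg, use = "complete.obs")
  # lm() drops rows with missing values by default (na.action = na.omit)
  fit <- lm(avg_mpg ~ hp, data = df)
  list(correlation = r, slope = coef(fit)[["hp"]])
}

# Six rows copied from head(gtcars); only here to make the sketch runnable.
cars_sample <- data.frame(
  hp      = c(647, 597, 562, 562, 661, 553),
  avg_mpg = c(14.5, 15.0, 15.0, 15.0, 18.5, 19.5)
)

result <- hp_mpg_relationship(cars_sample)
print(result)
```

A negative `slope` from `hp_mpg_relationship(gtcars)` would mean each additional unit of horsepower is associated with lower average miles per gallon, and `correlation` would indicate how tightly the points follow that line.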

I also created a box plot to compare fuel efficiency across countries of origin. In the box plot, cars from Germany appear more fuel efficient than cars produced in the United States, United Kingdom, Japan, and Italy. Based on the histogram, Germany also produced more cars with below-average horsepower (the mean is 515.0) than the other countries, which is consistent with the negative relationship between horsepower and fuel efficiency.
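The country comparison can also be checked numerically rather than read off the plots. The sketch below uses base R and again assumes the six-row `cars_sample` data frame (a hypothetical name) taken from the `head(gtcars)` output, which covers only the United States and Italy. On the full data, the same calls would give one row per country.

```r
# Six rows copied from head(gtcars); NOT the full data set.
cars_sample <- data.frame(
  ctry_origin = c("United States", rep("Italy", 5)),
  hp          = c(647, 597, 562, 562, 661, 553),
  avg_mpg     = c(14.5, 15.0, 15.0, 15.0, 18.5, 19.5)
)

# Median average mpg per country; the formula interface drops rows with NA
median_mpg <- aggregate(avg_mpg ~ ctry_origin, data = cars_sample, FUN = median)
print(median_mpg)

# Number of cars per country strictly below the overall mean horsepower
below_mean <- cars_sample$hp < mean(cars_sample$hp)
n_below <- tapply(below_mean, cars_sample$ctry_origin, sum)
print(n_below)
```

Comparing medians matches what the box plot draws as the center line of each box, and the per-country counts correspond to the histogram bars to the left of the mean horsepower.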