https://github.com/hadley/fueleconomy/blob/master/data-raw/vehicles.csv

library(fueleconomy)
library(DT)

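# Read the raw CSV; the github.com/.../blob/... URL serves an HTML page, not the file itself
fuel <- read.csv("https://raw.githubusercontent.com/hadley/fueleconomy/master/data-raw/vehicles.csv")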
head(fueleconomy::vehicles)
## # A tibble: 6 × 12
##      id       make               model  year                       class
##   <int>      <chr>               <chr> <int>                       <chr>
## 1 27550 AM General   DJ Po Vehicle 2WD  1984 Special Purpose Vehicle 2WD
## 2 28426 AM General   DJ Po Vehicle 2WD  1984 Special Purpose Vehicle 2WD
## 3 27549 AM General    FJ8c Post Office  1984 Special Purpose Vehicle 2WD
## 4 28425 AM General    FJ8c Post Office  1984 Special Purpose Vehicle 2WD
## 5  1032 AM General Post Office DJ5 2WD  1985 Special Purpose Vehicle 2WD
## 6  1033 AM General Post Office DJ8 2WD  1985 Special Purpose Vehicle 2WD
## # ... with 7 more variables: trans <chr>, drive <chr>, cyl <int>,
## #   displ <dbl>, fuel <chr>, hwy <int>, cty <int>

Research question

How does fuel economy relate to vehicle manufacturer, and how do manufacturers compare with one another, over the model years 1984 to 2015?
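
One way to begin on this question is to average highway and city fuel economy for each manufacturer and year. The sketch below is only an illustration under stated assumptions: the helper name `mean_mpg_by_make_year` and the three-row `toy` data frame are hypothetical and merely mimic the `make`, `year`, `hwy` and `cty` columns of `fueleconomy::vehicles`.

```r
# Hypothetical helper: mean highway and city mpg for each make/year pair.
# Assumes `df` has columns make, year, hwy and cty, as fueleconomy::vehicles does.
mean_mpg_by_make_year <- function(df) {
  aggregate(cbind(hwy, cty) ~ make + year, data = df, FUN = mean)
}

# Made-up rows for illustration only (not real EPA figures).
toy <- data.frame(
  make = c("A", "A", "B"),
  year = c(1984L, 1984L, 1985L),
  hwy  = c(20, 30, 25),
  cty  = c(15, 17, 18),
  stringsAsFactors = FALSE
)

by_make_year <- mean_mpg_by_make_year(toy)
print(by_make_year)
```

In the actual analysis the same call would presumably be `mean_mpg_by_make_year(fueleconomy::vehicles)`, which needs the fueleconomy package to be installed.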

What are the cases, and how many are there?

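Each case is one vehicle record in the EPA data. There are 33,442 cases from 1984 to 2015, each recording manufacturer, model name, year, EPA vehicle size class, transmission, drive train, number of cylinders, engine displacement, fuel type, highway fuel economy and city fuel economy.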

Describe the method of data collection.

The fuel economy data come from the EPA. They are observational records rather than the result of an experiment.

What type of study is this (observational/experiment)?

This is an observational study based on EPA data.

Data Source: If you collected the data, state self-collected. If not, provide a citation/link.

The data come from the EPA: http://www.fueleconomy.gov/feg/download.shtml

Response: What is the response variable, and what type is it (numerical/categorical)?

The data set has 12 columns and 33,442 rows, with “id” as the key. The response variables are highway fuel economy (hwy) and city fuel economy (cty). Both are numerical.
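
The statements above (table size, a key column, numerical responses) can be checked directly in R. This is a small sketch under assumptions: `describe_table` is a hypothetical helper, and the `toy` rows are invented stand-ins that only share column names with `fueleconomy::vehicles`.

```r
# Hypothetical helper: report table size, key uniqueness,
# and whether each response column is numeric.
describe_table <- function(df, key = "id", responses = c("hwy", "cty")) {
  list(
    n_rows            = nrow(df),
    n_cols            = ncol(df),
    key_is_unique     = anyDuplicated(df[[key]]) == 0,
    responses_numeric = vapply(df[responses], is.numeric, logical(1))
  )
}

# Made-up rows for illustration only.
toy <- data.frame(
  id  = c(1L, 2L, 3L),
  hwy = c(20L, 30L, 25L),
  cty = c(15L, 17L, 18L)
)

info <- describe_table(toy)
str(info)
```

Running `describe_table(fueleconomy::vehicles)` would be the corresponding check on the real data, assuming the package is installed.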

Explanatory: What is the explanatory variable(s), and what type is it (numerical/categorical)?

The explanatory variables are manufacturer, year and number of cylinders. Manufacturer and year are treated as categorical, and number of cylinders is numerical.
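
Since manufacturer and year are treated as categorical, they would typically be stored as factors before any modelling, while the cylinder count stays numeric. The summary of the data also reports missing values for `cyl`, so the sketch counts them as well. Everything here is illustrative: `prepare_explanatory` is a hypothetical helper and `toy` is invented data with the same column names as `fueleconomy::vehicles`.

```r
# Hypothetical helper: recode the explanatory variables to match their stated types.
prepare_explanatory <- function(df) {
  df$make <- factor(df$make)      # categorical
  df$year <- factor(df$year)      # treated as categorical in this proposal
  df$cyl  <- as.numeric(df$cyl)   # numerical
  df
}

# Made-up rows for illustration only; one cyl value is deliberately missing.
toy <- data.frame(
  make = c("A", "B", "B"),
  year = c(1984L, 1984L, 2015L),
  cyl  = c(4L, NA, 6L),
  stringsAsFactors = FALSE
)

prepared <- prepare_explanatory(toy)
n_missing_cyl <- sum(is.na(prepared$cyl))
str(prepared)
```

Treating year as a factor suits year-by-year comparisons. If a trend over time were of interest instead, keeping year numeric would be the more natural choice.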

Relevant summary statistics

Summary of vehicles

summary(fueleconomy::vehicles)
##        id            make              model                year     
##  Min.   :    1   Length:33442       Length:33442       Min.   :1984  
##  1st Qu.: 8361   Class :character   Class :character   1st Qu.:1991  
##  Median :16724   Mode  :character   Mode  :character   Median :1999  
##  Mean   :17038                                         Mean   :1999  
##  3rd Qu.:25265                                         3rd Qu.:2008  
##  Max.   :34932                                         Max.   :2015  
##                                                                      
##     class              trans              drive                cyl        
##  Length:33442       Length:33442       Length:33442       Min.   : 2.000  
##  Class :character   Class :character   Class :character   1st Qu.: 4.000  
##  Mode  :character   Mode  :character   Mode  :character   Median : 6.000  
##                                                           Mean   : 5.772  
##                                                           3rd Qu.: 6.000  
##                                                           Max.   :16.000  
##                                                           NA's   :58      
##      displ           fuel                hwy              cty        
##  Min.   :0.000   Length:33442       Min.   :  9.00   Min.   :  6.00  
##  1st Qu.:2.300   Class :character   1st Qu.: 19.00   1st Qu.: 15.00  
##  Median :3.000   Mode  :character   Median : 23.00   Median : 17.00  
##  Mean   :3.353                      Mean   : 23.55   Mean   : 17.49  
##  3rd Qu.:4.300                      3rd Qu.: 27.00   3rd Qu.: 20.00  
##  Max.   :8.400                      Max.   :109.00   Max.   :138.00  
##  NA's   :57
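# Distribution of city fuel economy
hist(fueleconomy::vehicles$cty, main = "City fuel economy", xlab = "cty (mpg)")

# Distribution of highway fuel economy
hist(fueleconomy::vehicles$hwy, main = "Highway fuel economy", xlab = "hwy (mpg)")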