Introduction

Objective

To fulfill an assignment for Algoritma Data Science School, we would like to create a linear regression model that predicts the Manufacturer’s Suggested Retail Price (MSRP) of car models from different brands, using the specifications of each model as predictors. We will then make predictions with the model, validate it to see whether it is acceptable or needs adjustment, and interpret the result.

As beginners in machine learning who have just learned the linear regression method, we were taught in class using data with mostly numerical variables. However, real-world data can also contain many categorical variables. We were challenged by the question of how accurately a model that is best suited to numerical variables can predict from data that is mostly categorical.

About The Dataset

This car dataset was scraped from the Edmunds and Twitter websites. It covers the year each car was made, its market price, and its specifications, with a total of 16 columns.

Data Preparation

Before we create our model, we have to look at our data, observe it, and clean or change several names and values where necessary.

Define the libraries needed for this project:
library(MLmetrics)
## 
## Attaching package: 'MLmetrics'
## The following object is masked from 'package:base':
## 
##     Recall
library(stats)
library(GGally)
## Loading required package: ggplot2
## Registered S3 method overwritten by 'GGally':
##   method from   
##   +.gg   ggplot2
library(nortest)
library(caret)
## Loading required package: lattice
## 
## Attaching package: 'caret'
## The following objects are masked from 'package:MLmetrics':
## 
##     MAE, RMSE
library(alookr)
## Loading required package: randomForest
## randomForest 4.6-14
## Type rfNews() to see new features/changes/bug fixes.
## 
## Attaching package: 'randomForest'
## The following object is masked from 'package:ggplot2':
## 
##     margin
library(car)
## Loading required package: carData
library(lmtest)
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following object is masked from 'package:car':
## 
##     recode
## The following object is masked from 'package:randomForest':
## 
##     combine
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union


Read and preview the dataset:
car_dataset <- read.csv("Car_Dataset/data.csv")
car_dataset


Column description

Make : The company that produces the car, which is also its brand name
Model : Car model released by the company
Year : Year in which the car model was released
Engine.Fuel.Type : Type of fuel required to run the engine
Engine.HP : Engine power in horsepower
Engine.Cylinders : Number of cylinders in the engine
Transmission.Type : Type of transmission fitted to the model
Driven_Wheels : Which wheels are driven by the engine
Number.of.Doors : Number of doors on the car
Market.Category : Market categories the car falls under
Vehicle.Size : Size of the vehicle
Vehicle.Style : Body style of the vehicle
highway.MPG : Miles traveled per gallon of fuel on the highway
city.mpg : Miles traveled per gallon of fuel on city roads
Popularity : Popularity of the car brand
MSRP : Manufacturer’s Suggested Retail Price


Check if there are any NA
colSums(is.na(car_dataset))
##              Make             Model              Year  Engine.Fuel.Type 
##                 0                 0                 0                 0 
##         Engine.HP  Engine.Cylinders Transmission.Type     Driven_Wheels 
##                69                30                 0                 0 
##   Number.of.Doors   Market.Category      Vehicle.Size     Vehicle.Style 
##                 6                 0                 0                 0 
##       highway.MPG          city.mpg        Popularity              MSRP 
##                 0                 0                 0                 0


Remove NA
car_dataset <- car_dataset %>% 
  filter(!is.na(Engine.HP),
         !is.na(Engine.Cylinders),
         !is.na(Number.of.Doors))

colSums(is.na(car_dataset))
##              Make             Model              Year  Engine.Fuel.Type 
##                 0                 0                 0                 0 
##         Engine.HP  Engine.Cylinders Transmission.Type     Driven_Wheels 
##                 0                 0                 0                 0 
##   Number.of.Doors   Market.Category      Vehicle.Size     Vehicle.Style 
##                 0                 0                 0                 0 
##       highway.MPG          city.mpg        Popularity              MSRP 
##                 0                 0                 0                 0
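
The effect of this filter can be sketched on a small made-up data frame. The data frame `toy` and its values are hypothetical, and base R's `complete.cases()` is used here in place of the dplyr `filter()` call so the example has no dependencies. Counting rows before and after shows how many observations are dropped.

```r
# Hypothetical toy data standing in for car_dataset
toy <- data.frame(
  Engine.HP        = c(300, NA, 230, 150),
  Engine.Cylinders = c(6, 4, NA, 4),
  Number.of.Doors  = c(2, 4, 4, NA)
)

rows_before <- nrow(toy)

# Keep only the rows that have no NA in any column
toy_clean <- toy[complete.cases(toy), , drop = FALSE]

# Number of rows lost to the NA filter
rows_removed <- rows_before - nrow(toy_clean)
rows_removed
```

Because a single row can have an NA in more than one column, the number of rows removed can be smaller than the sum of the per-column NA counts.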


Check if there are any duplicated data
car_dataset %>% 
  filter(duplicated(car_dataset))


Remove duplicated data
car_dataset <- car_dataset %>% 
  distinct()

car_dataset %>% 
  filter(duplicated(car_dataset))
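
As a minimal sketch of what these two steps do, the example below uses a made-up data frame (`toy` is hypothetical) and base R only. `duplicated()` flags every repeat of a row after its first occurrence, and `unique()` on a data frame keeps one copy of each distinct row, which is comparable to `distinct()` above.

```r
# Hypothetical toy data with repeated rows
toy <- data.frame(
  Model = c("A", "A", "B", "A"),
  Year  = c(2011, 2011, 2012, 2011)
)

# Count the rows that repeat an earlier row
n_dupes <- sum(duplicated(toy))

# Keep one copy of each distinct row
toy_unique <- unique(toy)

n_dupes
nrow(toy_unique)
```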


Preview the details of the data that we have
glimpse(car_dataset)
## Rows: 11,100
## Columns: 16
## $ Make              <chr> "BMW", "BMW", "BMW", "BMW", "BMW", "BMW", "BMW", "BM~
## $ Model             <chr> "1 Series M", "1 Series", "1 Series", "1 Series", "1~
## $ Year              <int> 2011, 2011, 2011, 2011, 2011, 2012, 2012, 2012, 2012~
## $ Engine.Fuel.Type  <chr> "premium unleaded (required)", "premium unleaded (re~
## $ Engine.HP         <int> 335, 300, 300, 230, 230, 230, 300, 300, 230, 230, 30~
## $ Engine.Cylinders  <int> 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6~
## $ Transmission.Type <chr> "MANUAL", "MANUAL", "MANUAL", "MANUAL", "MANUAL", "M~
## $ Driven_Wheels     <chr> "rear wheel drive", "rear wheel drive", "rear wheel ~
## $ Number.of.Doors   <int> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 4, 4~
## $ Market.Category   <chr> "Factory Tuner,Luxury,High-Performance", "Luxury,Per~
## $ Vehicle.Size      <chr> "Compact", "Compact", "Compact", "Compact", "Compact~
## $ Vehicle.Style     <chr> "Coupe", "Convertible", "Coupe", "Coupe", "Convertib~
## $ highway.MPG       <int> 26, 28, 28, 28, 28, 28, 26, 28, 28, 27, 28, 28, 28, ~
## $ city.mpg          <int> 19, 19, 20, 18, 18, 18, 17, 20, 18, 18, 20, 19, 19, ~
## $ Popularity        <int> 3916, 3916, 3916, 3916, 3916, 3916, 3916, 3916, 3916~
## $ MSRP              <int> 46135, 40650, 36350, 29450, 34500, 31200, 44100, 393~


To make the column names easier to understand, it is best to rename a couple of columns and to change the data types for better memory efficiency.

      To be changed:
      1. Make => Brands
      2. Driven_Wheels => Driven.Wheels
      3. Vehicle.Style => Vehicle.Type
      4. highway.MPG => Highway.MpG
      5. city.mpg => City.MpG

      Data types:
      All character data type to factor data type


Clean the dataset
car_dataset <- car_dataset %>%
  mutate_if(is.character, as.factor) %>%
  rename(Brands = Make,
         Driven.Wheels = Driven_Wheels,
         Vehicle.Type = Vehicle.Style,
         Highway.MpG = highway.MPG,
         City.MpG = city.mpg)

unique(car_dataset$Transmission.Type)
## [1] MANUAL           AUTOMATIC        AUTOMATED_MANUAL UNKNOWN         
## [5] DIRECT_DRIVE    
## Levels: AUTOMATED_MANUAL AUTOMATIC DIRECT_DRIVE MANUAL UNKNOWN
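
The memory argument for converting character columns to factors can be checked with `object.size()`. This is a rough sketch on a made-up vector (`x_chr` is hypothetical), and the exact byte counts depend on the platform and R version. A factor stores one integer code per value plus a short table of levels, which is typically smaller than a character vector when the same labels repeat many times.

```r
# Hypothetical column: 10,000 values drawn from two repeated labels
x_chr <- rep(c("MANUAL", "AUTOMATIC"), times = 5000)
x_fct <- factor(x_chr)

# Size in bytes of each representation
size_chr <- as.numeric(object.size(x_chr))
size_fct <- as.numeric(object.size(x_fct))

c(character = size_chr, factor = size_fct)
```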


In the Transmission.Type variable, some car models have the value “UNKNOWN”. When we want to buy a car, we would like to know every detail possible, especially main details such as the type of transmission the car has. In this case, it is best to replace “UNKNOWN”, where possible, by looking up on the internet the detailed specification of each car that falls under the “UNKNOWN” transmission category.

Preview car data which has unknown transmission type value
car_dataset %>% 
  filter(Transmission.Type == "UNKNOWN")


Changing unknown transmission value based on the model specification
car_dataset_unknown_auto <- car_dataset %>% 
  filter(Model %in% c("Achieva", "Firebird", "Le Baron"),
         Transmission.Type == "UNKNOWN") %>% 
  mutate(Transmission.Type = case_when(Transmission.Type == "UNKNOWN" ~ "AUTOMATIC"))
           
car_dataset_unknown_man <- car_dataset %>% 
  filter(Model %in% c("Jimmy", "RAM 150"),
         Transmission.Type == "UNKNOWN") %>% 
  mutate(Transmission.Type = case_when(Transmission.Type == "UNKNOWN" ~ "MANUAL"))

car_dataset <- rbind(car_dataset, car_dataset_unknown_auto, car_dataset_unknown_man)

car_dataset <- car_dataset %>%
  filter(Transmission.Type != "UNKNOWN")

unique(car_dataset$Transmission.Type)
## [1] MANUAL           AUTOMATIC        AUTOMATED_MANUAL DIRECT_DRIVE    
## Levels: AUTOMATED_MANUAL AUTOMATIC DIRECT_DRIVE MANUAL UNKNOWN
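
The recoding can also be done in place, without binding copies of the rows back onto the data. The sketch below is one possible alternative in base R on a made-up data frame; `toy` and the model name "Mystery" are hypothetical, while the two model lists come from the code above. Calling `droplevels()` at the end also removes the empty UNKNOWN level that still appears in the Levels line of the output above.

```r
# Hypothetical toy data standing in for car_dataset
toy <- data.frame(
  Model = c("Achieva", "Jimmy", "Civic", "Mystery"),
  Transmission.Type = factor(c("UNKNOWN", "UNKNOWN", "MANUAL", "UNKNOWN"))
)

auto_models   <- c("Achieva", "Firebird", "Le Baron")
manual_models <- c("Jimmy", "RAM 150")

# Work on a character copy so new values need not be existing factor levels
tt <- as.character(toy$Transmission.Type)
unknown <- tt == "UNKNOWN"
tt[unknown & toy$Model %in% auto_models]   <- "AUTOMATIC"
tt[unknown & toy$Model %in% manual_models] <- "MANUAL"
toy$Transmission.Type <- factor(tt)

# Drop rows that are still unknown, then drop the now-empty factor level
toy <- toy[toy$Transmission.Type != "UNKNOWN", ]
toy <- droplevels(toy)

levels(toy$Transmission.Type)
```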


Data Analysis

Now that we have finished cleaning the data, it is best to check for outliers or noise and decide whether to remove them, in order to make the regression model as accurate as possible.


Distribution of MSRP
hist(car_dataset$MSRP)

The distribution chart above shows that there are outliers, which keep the chart from being interpreted properly. To observe the outliers, we can draw a box plot as follows:


Observe the outliers
boxplot(car_dataset$MSRP)

The major outliers are above $100,000, so let us remove them.


Removing outliers
car_dataset <- car_dataset %>% 
  filter(MSRP <= 100000)

hist(car_dataset$MSRP)

At this point, we can properly interpret the distribution chart: most car prices fall between 20,000 and 40,000 US Dollars.
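
The $100,000 cutoff above was chosen by reading the box plot. As an alternative sketch, a cutoff can be computed with the common 1.5 × IQR rule, which is close to how `boxplot()` places its whiskers. The price vector `msrp` below is made up for illustration, and the 1.5 multiplier is the conventional default, not a value taken from this analysis.

```r
# Hypothetical prices with one extreme value
msrp <- c(18000, 22000, 25000, 27000, 31000, 35000, 39000, 250000)

# Upper fence: third quartile plus 1.5 times the interquartile range
q3 <- unname(quantile(msrp, 0.75))
upper_fence <- q3 + 1.5 * IQR(msrp)

# Keep only the prices at or below the fence
kept <- msrp[msrp <= upper_fence]

upper_fence
kept
```

A computed fence has the advantage of being reproducible, while a hand-picked threshold such as 100,000 is easier to explain. Either way, the choice should be stated because it changes which cars the model is trained on.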

Modeling

Modeling With Linear Regression

Now that we have cleaned and inspected our data thoroughly, we can explore it further, from observing the correlation between columns to creating the best linear regression model based on that exploration and analysis. Our target will be MSRP, as we want a model that can predict the price of the cars listed in our data. The other columns, which are the specifications of the cars, will be our predictors.


Correlation between columns
ggcorr(car_dataset, label = T, hjust = 1)
## Warning in ggcorr(car_dataset, label = T, hjust = 1): data in column(s)
## 'Brands', 'Model', 'Engine.Fuel.Type', 'Transmission.Type', 'Driven.Wheels',
## 'Market.Category', 'Vehicle.Size', 'Vehicle.Type' are not numeric and were
## ignored


According to the correlation chart, the car specification with the highest correlation to our prediction target is Engine.HP. Engine.HP is also quite strongly correlated with Engine.Cylinders, which is less correlated with MSRP than Engine.HP is. We also have two other possible predictors that are strongly correlated with each other: Highway.MpG and City.MpG. In each pair we have to remove one of the two, as keeping both will cause problems for our model later on.

Between Engine.HP and Engine.Cylinders, we want to remove Engine.Cylinders, as it has a lower correlation to MSRP than Engine.HP. In the case of Highway.MpG and City.MpG, both have the same correlation value to MSRP, so we can find out which one is more significant by creating a multiple linear regression model and observing the significance level of both predictors.
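
Strongly correlated pairs can also be listed programmatically instead of being read off the chart. The sketch below uses the built-in `mtcars` data as a stand-in for `car_dataset`, and the 0.8 threshold is an assumption chosen for illustration, not a value used in this analysis.

```r
# mtcars ships with R and stands in for the numeric columns of car_dataset
num_data <- mtcars[, c("mpg", "hp", "cyl", "disp")]
corr <- cor(num_data)

# Keep only the upper triangle so each pair is reported once
idx <- which(upper.tri(corr) & abs(corr) > 0.8, arr.ind = TRUE)

high_pairs <- data.frame(
  var1 = rownames(corr)[idx[, "row"]],
  var2 = colnames(corr)[idx[, "col"]],
  r    = corr[idx]
)
high_pairs
```

Each row of `high_pairs` is a candidate pair in which one variable could be dropped, in the same way Engine.Cylinders is dropped in favor of Engine.HP.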

Remove Engine.Cylinders
car_dataset <- car_dataset %>% 
  select(-Engine.Cylinders)


Linear Regression Model
model_HP <- lm(MSRP ~ Engine.HP, car_dataset)
summary(model_HP)
## 
## Call:
## lm(formula = MSRP ~ Engine.HP, data = car_dataset)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -54581  -5961   1079   5978  59589 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -6778.215    333.878   -20.3   <2e-16 ***
## Engine.HP     159.998      1.318   121.4   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11930 on 10471 degrees of freedom
## Multiple R-squared:  0.5846, Adjusted R-squared:  0.5845 
## F-statistic: 1.473e+04 on 1 and 10471 DF,  p-value: < 2.2e-16


Correlation plot
plot(car_dataset$Engine.HP, car_dataset$MSRP)
abline(model_HP$coefficients[1], model_HP$coefficients[2], col = "red", lwd = 2)


With a Multiple R-squared of 0.5846, this model explains about 58% of the variance in MSRP. The remaining 42% is not explained by Engine.HP and may be explained by other predictors that were not included in the model. For every increase of 1 engine horsepower, the predicted price increases by about 159.998 US Dollars.
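
To make the interpretation concrete, the fitted line can be evaluated by hand with the intercept and slope reported by `summary(model_HP)` above. The helper `predict_msrp()` is a hypothetical name introduced only for this sketch, and 300 horsepower is an arbitrary example value. With the fitted model object available, `predict(model_HP, data.frame(Engine.HP = 300))` would do the same job.

```r
# Coefficients copied from the summary(model_HP) output
intercept <- -6778.215
slope     <- 159.998

# Hypothetical helper: predicted MSRP for a given engine horsepower
predict_msrp <- function(hp) intercept + slope * hp

# Predicted price of a 300 HP car: -6778.215 + 159.998 * 300
predict_msrp(300)

# One extra horsepower changes the prediction by the slope
predict_msrp(301) - predict_msrp(300)
```

The negative intercept means the line predicts a negative price at 0 horsepower, which is a reminder that the model is only meaningful within the range of horsepower values seen in the data.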

Modeling With Multiple Linear Regression

Since we have a large number of potential predictors, we can build our model with the stepwise method, where the predictors are chosen automatically. Stepwise selection can run in different directions: forward, backward, and both. We will model our data with every direction and compare the results with one another. In addition, we will create a model that uses all of the potential predictors in the data, and a stepwise model that uses only numerical variables as predictors, to compare against the models built automatically by the stepwise methods.
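
The three directions can be sketched with `step()` from the stats package. The example below uses the built-in `mtcars` data with `mpg` as the target, as a stand-in for `model_data` and MSRP. The object names are hypothetical, and comparing models by adjusted R-squared is one possible choice, not necessarily the comparison used later in this analysis.

```r
# mtcars stands in for model_data; mpg plays the role of MSRP
model_none <- lm(mpg ~ 1, data = mtcars)  # intercept only
model_full <- lm(mpg ~ ., data = mtcars)  # every column as a predictor

# Backward: start from the full model and drop predictors
model_backward <- step(model_full, direction = "backward", trace = 0)

# Forward: start from the empty model and add predictors up to the full formula
model_forward <- step(model_none, scope = formula(model_full),
                      direction = "forward", trace = 0)

# Both: start empty, then add or drop a predictor at each step
model_both <- step(model_none, scope = formula(model_full),
                   direction = "both", trace = 0)

# Compare the selected models by adjusted R-squared
adj_r2 <- sapply(
  list(backward = model_backward, forward = model_forward, both = model_both),
  function(m) summary(m)$adj.r.squared
)
adj_r2
```

`step()` selects by AIC, so the three directions can end on different sets of predictors. Setting `trace = 0` only hides the step-by-step log.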

Remove insignificant columns
model_data <- car_dataset %>%
  select(-c(Brands, Model))


Modeling using all variables as the predictors


Create a model where all variables are the predictors
model_all <- lm(MSRP ~ ., model_data)


View summary of the model_all
options("scipen"=100, "digits"=4)
summary(model_all)
## 
## Call:
## lm(formula = MSRP ~ ., data = model_data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -32816  -4169   -218   3823  44282 
## 
## Coefficients: (1 not defined because of singularities)
##                                                                     Estimate
## (Intercept)                                                    -1843098.6161
## Year                                                                923.0860
## Engine.Fuel.Typediesel                                              457.5212
## Engine.Fuel.Typeelectric                                          18212.9480
## Engine.Fuel.Typeflex-fuel (premium unleaded recommended/E85)     -16417.7832
## Engine.Fuel.Typeflex-fuel (premium unleaded required/E85)          3924.4277
## Engine.Fuel.Typeflex-fuel (unleaded/E85)                          -7389.8629
## Engine.Fuel.Typenatural gas                                        2326.0020
## Engine.Fuel.Typepremium unleaded (recommended)                    -3266.9013
## Engine.Fuel.Typepremium unleaded (required)                        2294.4914
## Engine.Fuel.Typeregular unleaded                                  -6780.5568
## Engine.HP                                                            89.3439
## Transmission.TypeAUTOMATIC                                         -290.7810
## Transmission.TypeDIRECT_DRIVE                                      -693.0108
## Transmission.TypeMANUAL                                           -1785.0454
## Driven.Wheelsfour wheel drive                                     -1300.3672
## Driven.Wheelsfront wheel drive                                    -1195.7539
## Driven.Wheelsrear wheel drive                                     -4031.8958
## Number.of.Doors                                                      89.6660
## Market.CategoryCrossover,Diesel                                   15428.7435
## Market.CategoryCrossover,Exotic,Luxury,High-Performance           18382.1676
## Market.CategoryCrossover,Exotic,Luxury,Performance                14393.4447
## Market.CategoryCrossover,Factory Tuner,Luxury,High-Performance    14011.4222
## Market.CategoryCrossover,Factory Tuner,Luxury,Performance          5584.3031
## Market.CategoryCrossover,Factory Tuner,Performance                 -837.8015
## Market.CategoryCrossover,Flex Fuel                                 3238.4533
## Market.CategoryCrossover,Flex Fuel,Luxury                         18023.5839
## Market.CategoryCrossover,Flex Fuel,Luxury,Performance             35029.2018
## Market.CategoryCrossover,Flex Fuel,Performance                     1446.8278
## Market.CategoryCrossover,Hatchback                                 4405.7130
## Market.CategoryCrossover,Hatchback,Factory Tuner,Performance      -4238.7659
## Market.CategoryCrossover,Hatchback,Luxury                         14287.8047
## Market.CategoryCrossover,Hatchback,Performance                      764.8238
## Market.CategoryCrossover,Hybrid                                    9012.0055
## Market.CategoryCrossover,Luxury                                    5566.6732
## Market.CategoryCrossover,Luxury,Diesel                            14561.5039
## Market.CategoryCrossover,Luxury,High-Performance                  16111.6363
## Market.CategoryCrossover,Luxury,Hybrid                            12299.7654
## Market.CategoryCrossover,Luxury,Performance                        6557.0979
## Market.CategoryCrossover,Luxury,Performance,Hybrid                29513.3794
## Market.CategoryCrossover,Performance                               1515.3365
## Market.CategoryDiesel                                              3639.9467
## Market.CategoryDiesel,Luxury                                      19885.6262
## Market.CategoryExotic,Factory Tuner,Luxury,High-Performance       57836.7444
## Market.CategoryExotic,High-Performance                            35143.1689
## Market.CategoryExotic,Luxury,High-Performance                     45354.6119
## Market.CategoryFactory Tuner,High-Performance                       786.9383
## Market.CategoryFactory Tuner,Luxury,High-Performance              14638.4289
## Market.CategoryFactory Tuner,Luxury,Performance                    1315.9048
## Market.CategoryFactory Tuner,Performance                          -3324.9355
## Market.CategoryFlex Fuel                                           5172.3270
## Market.CategoryFlex Fuel,Diesel                                    8129.7023
## Market.CategoryFlex Fuel,Hybrid                                   12400.4290
## Market.CategoryFlex Fuel,Luxury                                   26113.5591
## Market.CategoryFlex Fuel,Luxury,High-Performance                  17570.6763
## Market.CategoryFlex Fuel,Luxury,Performance                       29312.6830
## Market.CategoryFlex Fuel,Performance                               6212.2071
## Market.CategoryFlex Fuel,Performance,Hybrid                         475.9915
## Market.CategoryHatchback                                           4553.3053
## Market.CategoryHatchback,Diesel                                    1951.5189
## Market.CategoryHatchback,Factory Tuner,High-Performance             148.9705
## Market.CategoryHatchback,Factory Tuner,Luxury,Performance           793.1411
## Market.CategoryHatchback,Factory Tuner,Performance                -4568.0904
## Market.CategoryHatchback,Flex Fuel                                 2120.2187
## Market.CategoryHatchback,Hybrid                                   11519.2295
## Market.CategoryHatchback,Luxury                                    6222.5975
## Market.CategoryHatchback,Luxury,Hybrid                            17204.3637
## Market.CategoryHatchback,Luxury,Performance                        5462.3512
## Market.CategoryHatchback,Performance                               1455.9407
## Market.CategoryHigh-Performance                                    3965.1372
## Market.CategoryHybrid                                             12265.2570
## Market.CategoryLuxury                                             10055.8749
## Market.CategoryLuxury,High-Performance                            21512.6499
## Market.CategoryLuxury,High-Performance,Hybrid                     13167.8424
## Market.CategoryLuxury,Hybrid                                      25175.4771
## Market.CategoryLuxury,Performance                                 13753.1919
## Market.CategoryLuxury,Performance,Hybrid                          26708.4500
## Market.CategoryN/A                                                 5272.2468
## Market.CategoryPerformance                                         3673.5071
## Market.CategoryPerformance,Hybrid                                  1959.5169
## Vehicle.SizeLarge                                                  2116.9936
## Vehicle.SizeMidsize                                               -1977.4405
## Vehicle.Type2dr SUV                                                2379.5828
## Vehicle.Type4dr Hatchback                                         -1724.0878
## Vehicle.Type4dr SUV                                                5095.5883
## Vehicle.TypeCargo Minivan                                          -680.0619
## Vehicle.TypeCargo Van                                             -2792.8188
## Vehicle.TypeConvertible                                            4587.4118
## Vehicle.TypeConvertible SUV                                        7906.8713
## Vehicle.TypeCoupe                                                 -2163.6980
## Vehicle.TypeCrew Cab Pickup                                       -1153.2916
## Vehicle.TypeExtended Cab Pickup                                   -4143.7812
## Vehicle.TypePassenger Minivan                                      1282.2370
## Vehicle.TypePassenger Van                                           596.9037
## Vehicle.TypeRegular Cab Pickup                                    -4417.6117
## Vehicle.TypeSedan                                                 -2311.2938
## Vehicle.TypeWagon                                                         NA
## Highway.MpG                                                          46.8939
## City.MpG                                                           -180.1212
## Popularity                                                            0.1089
##                                                                   Std. Error
## (Intercept)                                                       32731.7969
## Year                                                                 16.5194
## Engine.Fuel.Typediesel                                             4416.2261
## Engine.Fuel.Typeelectric                                           7376.1620
## Engine.Fuel.Typeflex-fuel (premium unleaded recommended/E85)       4637.5394
## Engine.Fuel.Typeflex-fuel (premium unleaded required/E85)          4625.9365
## Engine.Fuel.Typeflex-fuel (unleaded/E85)                           4135.8028
## Engine.Fuel.Typenatural gas                                        6475.0382
## Engine.Fuel.Typepremium unleaded (recommended)                     4109.6380
## Engine.Fuel.Typepremium unleaded (required)                        4110.4739
## Engine.Fuel.Typeregular unleaded                                   4099.3705
## Engine.HP                                                             1.9919
## Transmission.TypeAUTOMATIC                                          431.1711
## Transmission.TypeDIRECT_DRIVE                                      5075.9545
## Transmission.TypeMANUAL                                             440.5604
## Driven.Wheelsfour wheel drive                                       363.1740
## Driven.Wheelsfront wheel drive                                      237.2331
## Driven.Wheelsrear wheel drive                                       278.1471
## Number.of.Doors                                                     345.8576
## Market.CategoryCrossover,Diesel                                    3148.5736
## Market.CategoryCrossover,Exotic,Luxury,High-Performance            7101.4367
## Market.CategoryCrossover,Exotic,Luxury,Performance                 7098.4894
## Market.CategoryCrossover,Factory Tuner,Luxury,High-Performance     1967.5414
## Market.CategoryCrossover,Factory Tuner,Luxury,Performance          3195.1981
## Market.CategoryCrossover,Factory Tuner,Performance                 3557.0063
## Market.CategoryCrossover,Flex Fuel                                 1033.4243
## Market.CategoryCrossover,Flex Fuel,Luxury                          2535.8241
## Market.CategoryCrossover,Flex Fuel,Luxury,Performance              3649.1990
## Market.CategoryCrossover,Flex Fuel,Performance                     2933.3678
## Market.CategoryCrossover,Hatchback                                 1254.6671
## Market.CategoryCrossover,Hatchback,Factory Tuner,Performance       3055.6370
## Market.CategoryCrossover,Hatchback,Luxury                          2860.5131
## Market.CategoryCrossover,Hatchback,Performance                     3049.6843
## Market.CategoryCrossover,Hybrid                                    1167.5744
## Market.CategoryCrossover,Luxury                                     441.2828
## Market.CategoryCrossover,Luxury,Diesel                             2062.5716
## Market.CategoryCrossover,Luxury,High-Performance                   2718.3630
## Market.CategoryCrossover,Luxury,Hybrid                             1514.1993
## Market.CategoryCrossover,Luxury,Performance                         748.0633
## Market.CategoryCrossover,Luxury,Performance,Hybrid                 5048.4985
## Market.CategoryCrossover,Performance                                902.0672
## Market.CategoryDiesel                                              1288.1802
## Market.CategoryDiesel,Luxury                                       1978.1042
## Market.CategoryExotic,Factory Tuner,Luxury,High-Performance        4161.4939
## Market.CategoryExotic,High-Performance                             1357.0157
## Market.CategoryExotic,Luxury,High-Performance                      1567.5611
## Market.CategoryFactory Tuner,High-Performance                      1005.4762
## Market.CategoryFactory Tuner,Luxury,High-Performance                870.0279
## Market.CategoryFactory Tuner,Luxury,Performance                    1372.5306
## Market.CategoryFactory Tuner,Performance                            906.7883
## Market.CategoryFlex Fuel                                            623.7253
## Market.CategoryFlex Fuel,Diesel                                    2008.8447
## Market.CategoryFlex Fuel,Hybrid                                    5031.4555
## Market.CategoryFlex Fuel,Luxury                                    1603.1342
## Market.CategoryFlex Fuel,Luxury,High-Performance                   1843.7863
## Market.CategoryFlex Fuel,Luxury,Performance                        1538.0407
## Market.CategoryFlex Fuel,Performance                               1021.1929
## Market.CategoryFlex Fuel,Performance,Hybrid                        5061.2492
## Market.CategoryHatchback                                            900.6487
## Market.CategoryHatchback,Diesel                                    2609.0619
## Market.CategoryHatchback,Factory Tuner,High-Performance            2217.4317
## Market.CategoryHatchback,Factory Tuner,Luxury,Performance          2505.9447
## Market.CategoryHatchback,Factory Tuner,Performance                 1774.6460
## Market.CategoryHatchback,Flex Fuel                                 2903.1701
## Market.CategoryHatchback,Hybrid                                    1380.0765
## Market.CategoryHatchback,Luxury                                    1391.4300
## Market.CategoryHatchback,Luxury,Hybrid                             4232.9808
## Market.CategoryHatchback,Luxury,Performance                        1457.3393
## Market.CategoryHatchback,Performance                                996.6584
## Market.CategoryHigh-Performance                                     817.1806
## Market.CategoryHybrid                                               885.3231
## Market.CategoryLuxury                                               484.3925
## Market.CategoryLuxury,High-Performance                              743.1068
## Market.CategoryLuxury,High-Performance,Hybrid                      2458.4514
## Market.CategoryLuxury,Hybrid                                       1128.0121
## Market.CategoryLuxury,Performance                                   557.3432
## Market.CategoryLuxury,Performance,Hybrid                           2236.0458
## Market.CategoryN/A                                                  384.4053
## Market.CategoryPerformance                                          551.3122
## Market.CategoryPerformance,Hybrid                                  7104.2276
## Vehicle.SizeLarge                                                   281.9431
## Vehicle.SizeMidsize                                                 211.6869
## Vehicle.Type2dr SUV                                                1075.0267
## Vehicle.Type4dr Hatchback                                           818.8356
## Vehicle.Type4dr SUV                                                 417.1570
## Vehicle.TypeCargo Minivan                                          1006.2427
## Vehicle.TypeCargo Van                                               959.0931
## Vehicle.TypeConvertible                                             830.1547
## Vehicle.TypeConvertible SUV                                        1546.1282
## Vehicle.TypeCoupe                                                   804.5947
## Vehicle.TypeCrew Cab Pickup                                         514.5871
## Vehicle.TypeExtended Cab Pickup                                     536.9430
## Vehicle.TypePassenger Minivan                                       506.7778
## Vehicle.TypePassenger Van                                           953.6105
## Vehicle.TypeRegular Cab Pickup                                      891.6942
## Vehicle.TypeSedan                                                   359.6699
## Vehicle.TypeWagon                                                         NA
## Highway.MpG                                                          20.0694
## City.MpG                                                             35.3822
## Popularity                                                            0.0526
##                                                                t value
## (Intercept)                                                     -56.31
## Year                                                             55.88
## Engine.Fuel.Typediesel                                            0.10
## Engine.Fuel.Typeelectric                                          2.47
## Engine.Fuel.Typeflex-fuel (premium unleaded recommended/E85)     -3.54
## Engine.Fuel.Typeflex-fuel (premium unleaded required/E85)         0.85
## Engine.Fuel.Typeflex-fuel (unleaded/E85)                         -1.79
## Engine.Fuel.Typenatural gas                                       0.36
## Engine.Fuel.Typepremium unleaded (recommended)                   -0.79
## Engine.Fuel.Typepremium unleaded (required)                       0.56
## Engine.Fuel.Typeregular unleaded                                 -1.65
## Engine.HP                                                        44.85
## Transmission.TypeAUTOMATIC                                       -0.67
## Transmission.TypeDIRECT_DRIVE                                    -0.14
## Transmission.TypeMANUAL                                          -4.05
## Driven.Wheelsfour wheel drive                                    -3.58
## Driven.Wheelsfront wheel drive                                   -5.04
## Driven.Wheelsrear wheel drive                                   -14.50
## Number.of.Doors                                                   0.26
## Market.CategoryCrossover,Diesel                                   4.90
## Market.CategoryCrossover,Exotic,Luxury,High-Performance           2.59
## Market.CategoryCrossover,Exotic,Luxury,Performance                2.03
## Market.CategoryCrossover,Factory Tuner,Luxury,High-Performance    7.12
## Market.CategoryCrossover,Factory Tuner,Luxury,Performance         1.75
## Market.CategoryCrossover,Factory Tuner,Performance               -0.24
## Market.CategoryCrossover,Flex Fuel                                3.13
## Market.CategoryCrossover,Flex Fuel,Luxury                         7.11
## Market.CategoryCrossover,Flex Fuel,Luxury,Performance             9.60
## Market.CategoryCrossover,Flex Fuel,Performance                    0.49
## Market.CategoryCrossover,Hatchback                                3.51
## Market.CategoryCrossover,Hatchback,Factory Tuner,Performance     -1.39
## Market.CategoryCrossover,Hatchback,Luxury                         4.99
## Market.CategoryCrossover,Hatchback,Performance                    0.25
## Market.CategoryCrossover,Hybrid                                   7.72
## Market.CategoryCrossover,Luxury                                  12.61
## Market.CategoryCrossover,Luxury,Diesel                            7.06
## Market.CategoryCrossover,Luxury,High-Performance                  5.93
## Market.CategoryCrossover,Luxury,Hybrid                            8.12
## Market.CategoryCrossover,Luxury,Performance                       8.77
## Market.CategoryCrossover,Luxury,Performance,Hybrid                5.85
## Market.CategoryCrossover,Performance                              1.68
## Market.CategoryDiesel                                             2.83
## Market.CategoryDiesel,Luxury                                     10.05
## Market.CategoryExotic,Factory Tuner,Luxury,High-Performance      13.90
## Market.CategoryExotic,High-Performance                           25.90
## Market.CategoryExotic,Luxury,High-Performance                    28.93
## Market.CategoryFactory Tuner,High-Performance                     0.78
## Market.CategoryFactory Tuner,Luxury,High-Performance             16.83
## Market.CategoryFactory Tuner,Luxury,Performance                   0.96
## Market.CategoryFactory Tuner,Performance                         -3.67
## Market.CategoryFlex Fuel                                          8.29
## Market.CategoryFlex Fuel,Diesel                                   4.05
## Market.CategoryFlex Fuel,Hybrid                                   2.46
## Market.CategoryFlex Fuel,Luxury                                  16.29
## Market.CategoryFlex Fuel,Luxury,High-Performance                  9.53
## Market.CategoryFlex Fuel,Luxury,Performance                      19.06
## Market.CategoryFlex Fuel,Performance                              6.08
## Market.CategoryFlex Fuel,Performance,Hybrid                       0.09
## Market.CategoryHatchback                                          5.06
## Market.CategoryHatchback,Diesel                                   0.75
## Market.CategoryHatchback,Factory Tuner,High-Performance           0.07
## Market.CategoryHatchback,Factory Tuner,Luxury,Performance         0.32
## Market.CategoryHatchback,Factory Tuner,Performance               -2.57
## Market.CategoryHatchback,Flex Fuel                                0.73
## Market.CategoryHatchback,Hybrid                                   8.35
## Market.CategoryHatchback,Luxury                                   4.47
## Market.CategoryHatchback,Luxury,Hybrid                            4.06
## Market.CategoryHatchback,Luxury,Performance                       3.75
## Market.CategoryHatchback,Performance                              1.46
## Market.CategoryHigh-Performance                                   4.85
## Market.CategoryHybrid                                            13.85
## Market.CategoryLuxury                                            20.76
## Market.CategoryLuxury,High-Performance                           28.95
## Market.CategoryLuxury,High-Performance,Hybrid                     5.36
## Market.CategoryLuxury,Hybrid                                     22.32
## Market.CategoryLuxury,Performance                                24.68
## Market.CategoryLuxury,Performance,Hybrid                         11.94
## Market.CategoryN/A                                               13.72
## Market.CategoryPerformance                                        6.66
## Market.CategoryPerformance,Hybrid                                 0.28
## Vehicle.SizeLarge                                                 7.51
## Vehicle.SizeMidsize                                              -9.34
## Vehicle.Type2dr SUV                                               2.21
## Vehicle.Type4dr Hatchback                                        -2.11
## Vehicle.Type4dr SUV                                              12.22
## Vehicle.TypeCargo Minivan                                        -0.68
## Vehicle.TypeCargo Van                                            -2.91
## Vehicle.TypeConvertible                                           5.53
## Vehicle.TypeConvertible SUV                                       5.11
## Vehicle.TypeCoupe                                                -2.69
## Vehicle.TypeCrew Cab Pickup                                      -2.24
## Vehicle.TypeExtended Cab Pickup                                  -7.72
## Vehicle.TypePassenger Minivan                                     2.53
## Vehicle.TypePassenger Van                                         0.63
## Vehicle.TypeRegular Cab Pickup                                   -4.95
## Vehicle.TypeSedan                                                -6.43
## Vehicle.TypeWagon                                                   NA
## Highway.MpG                                                       2.34
## City.MpG                                                         -5.09
## Popularity                                                        2.07
##                                                                            Pr(>|t|)
## (Intercept)                                                    < 0.0000000000000002
## Year                                                           < 0.0000000000000002
## Engine.Fuel.Typediesel                                                      0.91749
## Engine.Fuel.Typeelectric                                                    0.01356
## Engine.Fuel.Typeflex-fuel (premium unleaded recommended/E85)                0.00040
## Engine.Fuel.Typeflex-fuel (premium unleaded required/E85)                   0.39626
## Engine.Fuel.Typeflex-fuel (unleaded/E85)                                    0.07400
## Engine.Fuel.Typenatural gas                                                 0.71943
## Engine.Fuel.Typepremium unleaded (recommended)                              0.42667
## Engine.Fuel.Typepremium unleaded (required)                                 0.57672
## Engine.Fuel.Typeregular unleaded                                            0.09815
## Engine.HP                                                      < 0.0000000000000002
## Transmission.TypeAUTOMATIC                                                  0.50007
## Transmission.TypeDIRECT_DRIVE                                               0.89141
## Transmission.TypeMANUAL                                         0.00005120405006348
## Driven.Wheelsfour wheel drive                                               0.00034
## Driven.Wheelsfront wheel drive                                  0.00000047235345592
## Driven.Wheelsrear wheel drive                                  < 0.0000000000000002
## Number.of.Doors                                                             0.79544
## Market.CategoryCrossover,Diesel                                 0.00000097170960470
## Market.CategoryCrossover,Exotic,Luxury,High-Performance                     0.00965
## Market.CategoryCrossover,Exotic,Luxury,Performance                          0.04262
## Market.CategoryCrossover,Factory Tuner,Luxury,High-Performance  0.00000000000114015
## Market.CategoryCrossover,Factory Tuner,Luxury,Performance                   0.08054
## Market.CategoryCrossover,Factory Tuner,Performance                          0.81380
## Market.CategoryCrossover,Flex Fuel                                          0.00173
## Market.CategoryCrossover,Flex Fuel,Luxury                       0.00000000000125860
## Market.CategoryCrossover,Flex Fuel,Luxury,Performance          < 0.0000000000000002
## Market.CategoryCrossover,Flex Fuel,Performance                              0.62186
## Market.CategoryCrossover,Hatchback                                          0.00045
## Market.CategoryCrossover,Hatchback,Factory Tuner,Performance                0.16541
## Market.CategoryCrossover,Hatchback,Luxury                       0.00000059843833690
## Market.CategoryCrossover,Hatchback,Performance                              0.80198
## Market.CategoryCrossover,Hybrid                                 0.00000000000001285
## Market.CategoryCrossover,Luxury                                < 0.0000000000000002
## Market.CategoryCrossover,Luxury,Diesel                          0.00000000000177317
## Market.CategoryCrossover,Luxury,High-Performance                0.00000000318414166
## Market.CategoryCrossover,Luxury,Hybrid                          0.00000000000000051
## Market.CategoryCrossover,Luxury,Performance                    < 0.0000000000000002
## Market.CategoryCrossover,Luxury,Performance,Hybrid              0.00000000518806344
## Market.CategoryCrossover,Performance                                        0.09302
## Market.CategoryDiesel                                                       0.00473
## Market.CategoryDiesel,Luxury                                   < 0.0000000000000002
## Market.CategoryExotic,Factory Tuner,Luxury,High-Performance    < 0.0000000000000002
## Market.CategoryExotic,High-Performance                         < 0.0000000000000002
## Market.CategoryExotic,Luxury,High-Performance                  < 0.0000000000000002
## Market.CategoryFactory Tuner,High-Performance                               0.43385
## Market.CategoryFactory Tuner,Luxury,High-Performance           < 0.0000000000000002
## Market.CategoryFactory Tuner,Luxury,Performance                             0.33771
## Market.CategoryFactory Tuner,Performance                                    0.00025
## Market.CategoryFlex Fuel                                       < 0.0000000000000002
## Market.CategoryFlex Fuel,Diesel                                 0.00005226492162296
## Market.CategoryFlex Fuel,Hybrid                                             0.01373
## Market.CategoryFlex Fuel,Luxury                                < 0.0000000000000002
## Market.CategoryFlex Fuel,Luxury,High-Performance               < 0.0000000000000002
## Market.CategoryFlex Fuel,Luxury,Performance                    < 0.0000000000000002
## Market.CategoryFlex Fuel,Performance                            0.00000000121900704
## Market.CategoryFlex Fuel,Performance,Hybrid                                 0.92507
## Market.CategoryHatchback                                        0.00000043640140776
## Market.CategoryHatchback,Diesel                                             0.45449
## Market.CategoryHatchback,Factory Tuner,High-Performance                     0.94644
## Market.CategoryHatchback,Factory Tuner,Luxury,Performance                   0.75163
## Market.CategoryHatchback,Factory Tuner,Performance                          0.01006
## Market.CategoryHatchback,Flex Fuel                                          0.46522
## Market.CategoryHatchback,Hybrid                                < 0.0000000000000002
## Market.CategoryHatchback,Luxury                                 0.00000782827710830
## Market.CategoryHatchback,Luxury,Hybrid                          0.00004851938201439
## Market.CategoryHatchback,Luxury,Performance                                 0.00018
## Market.CategoryHatchback,Performance                                        0.14409
## Market.CategoryHigh-Performance                                 0.00000123866815708
## Market.CategoryHybrid                                          < 0.0000000000000002
## Market.CategoryLuxury                                          < 0.0000000000000002
## Market.CategoryLuxury,High-Performance                         < 0.0000000000000002
## Market.CategoryLuxury,High-Performance,Hybrid                   0.00000008683034509
## Market.CategoryLuxury,Hybrid                                   < 0.0000000000000002
## Market.CategoryLuxury,Performance                              < 0.0000000000000002
## Market.CategoryLuxury,Performance,Hybrid                       < 0.0000000000000002
## Market.CategoryN/A                                             < 0.0000000000000002
## Market.CategoryPerformance                                      0.00000000002815078
## Market.CategoryPerformance,Hybrid                                           0.78269
## Vehicle.SizeLarge                                               0.00000000000006468
## Vehicle.SizeMidsize                                            < 0.0000000000000002
## Vehicle.Type2dr SUV                                                         0.02688
## Vehicle.Type4dr Hatchback                                                   0.03527
## Vehicle.Type4dr SUV                                            < 0.0000000000000002
## Vehicle.TypeCargo Minivan                                                   0.49916
## Vehicle.TypeCargo Van                                                       0.00360
## Vehicle.TypeConvertible                                         0.00000003355839501
## Vehicle.TypeConvertible SUV                                     0.00000032106948621
## Vehicle.TypeCoupe                                                           0.00717
## Vehicle.TypeCrew Cab Pickup                                                 0.02503
## Vehicle.TypeExtended Cab Pickup                                 0.00000000000001297
## Vehicle.TypePassenger Minivan                                               0.01142
## Vehicle.TypePassenger Van                                                   0.53137
## Vehicle.TypeRegular Cab Pickup                                  0.00000073783026518
## Vehicle.TypeSedan                                               0.00000000013661848
## Vehicle.TypeWagon                                                                NA
## Highway.MpG                                                                 0.01948
## City.MpG                                                        0.00000036293758923
## Popularity                                                                  0.03851
##                                                                   
## (Intercept)                                                    ***
## Year                                                           ***
## Engine.Fuel.Typediesel                                            
## Engine.Fuel.Typeelectric                                       *  
## Engine.Fuel.Typeflex-fuel (premium unleaded recommended/E85)   ***
## Engine.Fuel.Typeflex-fuel (premium unleaded required/E85)         
## Engine.Fuel.Typeflex-fuel (unleaded/E85)                       .  
## Engine.Fuel.Typenatural gas                                       
## Engine.Fuel.Typepremium unleaded (recommended)                    
## Engine.Fuel.Typepremium unleaded (required)                       
## Engine.Fuel.Typeregular unleaded                               .  
## Engine.HP                                                      ***
## Transmission.TypeAUTOMATIC                                        
## Transmission.TypeDIRECT_DRIVE                                     
## Transmission.TypeMANUAL                                        ***
## Driven.Wheelsfour wheel drive                                  ***
## Driven.Wheelsfront wheel drive                                 ***
## Driven.Wheelsrear wheel drive                                  ***
## Number.of.Doors                                                   
## Market.CategoryCrossover,Diesel                                ***
## Market.CategoryCrossover,Exotic,Luxury,High-Performance        ** 
## Market.CategoryCrossover,Exotic,Luxury,Performance             *  
## Market.CategoryCrossover,Factory Tuner,Luxury,High-Performance ***
## Market.CategoryCrossover,Factory Tuner,Luxury,Performance      .  
## Market.CategoryCrossover,Factory Tuner,Performance                
## Market.CategoryCrossover,Flex Fuel                             ** 
## Market.CategoryCrossover,Flex Fuel,Luxury                      ***
## Market.CategoryCrossover,Flex Fuel,Luxury,Performance          ***
## Market.CategoryCrossover,Flex Fuel,Performance                    
## Market.CategoryCrossover,Hatchback                             ***
## Market.CategoryCrossover,Hatchback,Factory Tuner,Performance      
## Market.CategoryCrossover,Hatchback,Luxury                      ***
## Market.CategoryCrossover,Hatchback,Performance                    
## Market.CategoryCrossover,Hybrid                                ***
## Market.CategoryCrossover,Luxury                                ***
## Market.CategoryCrossover,Luxury,Diesel                         ***
## Market.CategoryCrossover,Luxury,High-Performance               ***
## Market.CategoryCrossover,Luxury,Hybrid                         ***
## Market.CategoryCrossover,Luxury,Performance                    ***
## Market.CategoryCrossover,Luxury,Performance,Hybrid             ***
## Market.CategoryCrossover,Performance                           .  
## Market.CategoryDiesel                                          ** 
## Market.CategoryDiesel,Luxury                                   ***
## Market.CategoryExotic,Factory Tuner,Luxury,High-Performance    ***
## Market.CategoryExotic,High-Performance                         ***
## Market.CategoryExotic,Luxury,High-Performance                  ***
## Market.CategoryFactory Tuner,High-Performance                     
## Market.CategoryFactory Tuner,Luxury,High-Performance           ***
## Market.CategoryFactory Tuner,Luxury,Performance                   
## Market.CategoryFactory Tuner,Performance                       ***
## Market.CategoryFlex Fuel                                       ***
## Market.CategoryFlex Fuel,Diesel                                ***
## Market.CategoryFlex Fuel,Hybrid                                *  
## Market.CategoryFlex Fuel,Luxury                                ***
## Market.CategoryFlex Fuel,Luxury,High-Performance               ***
## Market.CategoryFlex Fuel,Luxury,Performance                    ***
## Market.CategoryFlex Fuel,Performance                           ***
## Market.CategoryFlex Fuel,Performance,Hybrid                       
## Market.CategoryHatchback                                       ***
## Market.CategoryHatchback,Diesel                                   
## Market.CategoryHatchback,Factory Tuner,High-Performance           
## Market.CategoryHatchback,Factory Tuner,Luxury,Performance         
## Market.CategoryHatchback,Factory Tuner,Performance             *  
## Market.CategoryHatchback,Flex Fuel                                
## Market.CategoryHatchback,Hybrid                                ***
## Market.CategoryHatchback,Luxury                                ***
## Market.CategoryHatchback,Luxury,Hybrid                         ***
## Market.CategoryHatchback,Luxury,Performance                    ***
## Market.CategoryHatchback,Performance                              
## Market.CategoryHigh-Performance                                ***
## Market.CategoryHybrid                                          ***
## Market.CategoryLuxury                                          ***
## Market.CategoryLuxury,High-Performance                         ***
## Market.CategoryLuxury,High-Performance,Hybrid                  ***
## Market.CategoryLuxury,Hybrid                                   ***
## Market.CategoryLuxury,Performance                              ***
## Market.CategoryLuxury,Performance,Hybrid                       ***
## Market.CategoryN/A                                             ***
## Market.CategoryPerformance                                     ***
## Market.CategoryPerformance,Hybrid                                 
## Vehicle.SizeLarge                                              ***
## Vehicle.SizeMidsize                                            ***
## Vehicle.Type2dr SUV                                            *  
## Vehicle.Type4dr Hatchback                                      *  
## Vehicle.Type4dr SUV                                            ***
## Vehicle.TypeCargo Minivan                                         
## Vehicle.TypeCargo Van                                          ** 
## Vehicle.TypeConvertible                                        ***
## Vehicle.TypeConvertible SUV                                    ***
## Vehicle.TypeCoupe                                              ** 
## Vehicle.TypeCrew Cab Pickup                                    *  
## Vehicle.TypeExtended Cab Pickup                                ***
## Vehicle.TypePassenger Minivan                                  *  
## Vehicle.TypePassenger Van                                         
## Vehicle.TypeRegular Cab Pickup                                 ***
## Vehicle.TypeSedan                                              ***
## Vehicle.TypeWagon                                                 
## Highway.MpG                                                    *  
## City.MpG                                                       ***
## Popularity                                                     *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7090 on 10374 degrees of freedom
## Multiple R-squared:  0.855,  Adjusted R-squared:  0.853 
## F-statistic:  623 on 98 and 10374 DF,  p-value: <0.0000000000000002


Evaluation of the model

It appears we have run into a problem with our model. As seen in the summary above, the last level of the Vehicle.Type predictor, Vehicle.TypeWagon, returns NA values, which indicates a singularity problem: that term is an exact linear combination of other terms in the model, so its coefficient cannot be estimated. To be more certain, we can check our model with the alias() function.

Observe the singularity in our model
alias(model_all)
## Model :
## MSRP ~ Year + Engine.Fuel.Type + Engine.HP + Transmission.Type + 
##     Driven.Wheels + Number.of.Doors + Market.Category + Vehicle.Size + 
##     Vehicle.Type + Highway.MpG + City.MpG + Popularity
## 
## Complete :
##                   (Intercept) Year Engine.Fuel.Typediesel
## Vehicle.TypeWagon  1           0    0                    
##                   Engine.Fuel.Typeelectric
## Vehicle.TypeWagon  0                      
##                   Engine.Fuel.Typeflex-fuel (premium unleaded recommended/E85)
## Vehicle.TypeWagon  0                                                          
##                   Engine.Fuel.Typeflex-fuel (premium unleaded required/E85)
## Vehicle.TypeWagon  0                                                       
##                   Engine.Fuel.Typeflex-fuel (unleaded/E85)
## Vehicle.TypeWagon  0                                      
##                   Engine.Fuel.Typenatural gas
## Vehicle.TypeWagon  0                         
##                   Engine.Fuel.Typepremium unleaded (recommended)
## Vehicle.TypeWagon  0                                            
##                   Engine.Fuel.Typepremium unleaded (required)
## Vehicle.TypeWagon  0                                         
##                   Engine.Fuel.Typeregular unleaded Engine.HP
## Vehicle.TypeWagon  0                                0       
##                   Transmission.TypeAUTOMATIC Transmission.TypeDIRECT_DRIVE
## Vehicle.TypeWagon  0                          0                           
##                   Transmission.TypeMANUAL Driven.Wheelsfour wheel drive
## Vehicle.TypeWagon  0                       0                           
##                   Driven.Wheelsfront wheel drive Driven.Wheelsrear wheel drive
## Vehicle.TypeWagon  0                              0                           
##                   Number.of.Doors Market.CategoryCrossover,Diesel
## Vehicle.TypeWagon  0               0                             
##                   Market.CategoryCrossover,Exotic,Luxury,High-Performance
## Vehicle.TypeWagon  0                                                     
##                   Market.CategoryCrossover,Exotic,Luxury,Performance
## Vehicle.TypeWagon  0                                                
##                   Market.CategoryCrossover,Factory Tuner,Luxury,High-Performance
## Vehicle.TypeWagon  0                                                            
##                   Market.CategoryCrossover,Factory Tuner,Luxury,Performance
## Vehicle.TypeWagon  0                                                       
##                   Market.CategoryCrossover,Factory Tuner,Performance
## Vehicle.TypeWagon  0                                                
##                   Market.CategoryCrossover,Flex Fuel
## Vehicle.TypeWagon  0                                
##                   Market.CategoryCrossover,Flex Fuel,Luxury
## Vehicle.TypeWagon  0                                       
##                   Market.CategoryCrossover,Flex Fuel,Luxury,Performance
## Vehicle.TypeWagon  0                                                   
##                   Market.CategoryCrossover,Flex Fuel,Performance
## Vehicle.TypeWagon  0                                            
##                   Market.CategoryCrossover,Hatchback
## Vehicle.TypeWagon -1                                
##                   Market.CategoryCrossover,Hatchback,Factory Tuner,Performance
## Vehicle.TypeWagon -1                                                          
##                   Market.CategoryCrossover,Hatchback,Luxury
## Vehicle.TypeWagon -1                                       
##                   Market.CategoryCrossover,Hatchback,Performance
## Vehicle.TypeWagon -1                                            
##                   Market.CategoryCrossover,Hybrid
## Vehicle.TypeWagon  0                             
##                   Market.CategoryCrossover,Luxury
## Vehicle.TypeWagon  0                             
##                   Market.CategoryCrossover,Luxury,Diesel
## Vehicle.TypeWagon  0                                    
##                   Market.CategoryCrossover,Luxury,High-Performance
## Vehicle.TypeWagon  0                                              
##                   Market.CategoryCrossover,Luxury,Hybrid
## Vehicle.TypeWagon  0                                    
##                   Market.CategoryCrossover,Luxury,Performance
## Vehicle.TypeWagon  0                                         
##                   Market.CategoryCrossover,Luxury,Performance,Hybrid
## Vehicle.TypeWagon  0                                                
##                   Market.CategoryCrossover,Performance Market.CategoryDiesel
## Vehicle.TypeWagon  0                                    0                   
##                   Market.CategoryDiesel,Luxury
## Vehicle.TypeWagon  0                          
##                   Market.CategoryExotic,Factory Tuner,Luxury,High-Performance
## Vehicle.TypeWagon  0                                                         
##                   Market.CategoryExotic,High-Performance
## Vehicle.TypeWagon  0                                    
##                   Market.CategoryExotic,Luxury,High-Performance
## Vehicle.TypeWagon  0                                           
##                   Market.CategoryFactory Tuner,High-Performance
## Vehicle.TypeWagon  0                                           
##                   Market.CategoryFactory Tuner,Luxury,High-Performance
## Vehicle.TypeWagon  0                                                  
##                   Market.CategoryFactory Tuner,Luxury,Performance
## Vehicle.TypeWagon  0                                             
##                   Market.CategoryFactory Tuner,Performance
## Vehicle.TypeWagon  0                                      
##                   Market.CategoryFlex Fuel Market.CategoryFlex Fuel,Diesel
## Vehicle.TypeWagon  0                        0                             
##                   Market.CategoryFlex Fuel,Hybrid
## Vehicle.TypeWagon  0                             
##                   Market.CategoryFlex Fuel,Luxury
## Vehicle.TypeWagon  0                             
##                   Market.CategoryFlex Fuel,Luxury,High-Performance
## Vehicle.TypeWagon  0                                              
##                   Market.CategoryFlex Fuel,Luxury,Performance
## Vehicle.TypeWagon  0                                         
##                   Market.CategoryFlex Fuel,Performance
## Vehicle.TypeWagon  0                                  
##                   Market.CategoryFlex Fuel,Performance,Hybrid
## Vehicle.TypeWagon  0                                         
##                   Market.CategoryHatchback Market.CategoryHatchback,Diesel
## Vehicle.TypeWagon -1                       -1                             
##                   Market.CategoryHatchback,Factory Tuner,High-Performance
## Vehicle.TypeWagon -1                                                     
##                   Market.CategoryHatchback,Factory Tuner,Luxury,Performance
## Vehicle.TypeWagon -1                                                       
##                   Market.CategoryHatchback,Factory Tuner,Performance
## Vehicle.TypeWagon -1                                                
##                   Market.CategoryHatchback,Flex Fuel
## Vehicle.TypeWagon -1                                
##                   Market.CategoryHatchback,Hybrid
## Vehicle.TypeWagon -1                             
##                   Market.CategoryHatchback,Luxury
## Vehicle.TypeWagon -1                             
##                   Market.CategoryHatchback,Luxury,Hybrid
## Vehicle.TypeWagon -1                                    
##                   Market.CategoryHatchback,Luxury,Performance
## Vehicle.TypeWagon -1                                         
##                   Market.CategoryHatchback,Performance
## Vehicle.TypeWagon -1                                  
##                   Market.CategoryHigh-Performance Market.CategoryHybrid
## Vehicle.TypeWagon  0                               0                   
##                   Market.CategoryLuxury Market.CategoryLuxury,High-Performance
## Vehicle.TypeWagon  0                     0                                    
##                   Market.CategoryLuxury,High-Performance,Hybrid
## Vehicle.TypeWagon  0                                           
##                   Market.CategoryLuxury,Hybrid
## Vehicle.TypeWagon  0                          
##                   Market.CategoryLuxury,Performance
## Vehicle.TypeWagon  0                               
##                   Market.CategoryLuxury,Performance,Hybrid Market.CategoryN/A
## Vehicle.TypeWagon  0                                        0                
##                   Market.CategoryPerformance Market.CategoryPerformance,Hybrid
## Vehicle.TypeWagon  0                          0                               
##                   Vehicle.SizeLarge Vehicle.SizeMidsize Vehicle.Type2dr SUV
## Vehicle.TypeWagon  0                 0                  -1                 
##                   Vehicle.Type4dr Hatchback Vehicle.Type4dr SUV
## Vehicle.TypeWagon  0                        -1                 
##                   Vehicle.TypeCargo Minivan Vehicle.TypeCargo Van
## Vehicle.TypeWagon -1                        -1                   
##                   Vehicle.TypeConvertible Vehicle.TypeConvertible SUV
## Vehicle.TypeWagon -1                      -1                         
##                   Vehicle.TypeCoupe Vehicle.TypeCrew Cab Pickup
## Vehicle.TypeWagon -1                -1                         
##                   Vehicle.TypeExtended Cab Pickup Vehicle.TypePassenger Minivan
## Vehicle.TypeWagon -1                              -1                           
##                   Vehicle.TypePassenger Van Vehicle.TypeRegular Cab Pickup
## Vehicle.TypeWagon -1                        -1                            
##                   Vehicle.TypeSedan Highway.MpG City.MpG Popularity
## Vehicle.TypeWagon -1                 0           0        0


The output above shows that the Vehicle.Type predictor is tied to many other predictors: the Vehicle.TypeWagon level can be written in terms of the other Vehicle.Type levels and several Market.Category levels. As a result, we have to remove the Vehicle.Type predictor from our model. In addition, the correlation plot discussed earlier showed a strong correlation between the Highway.MpG and City.MpG predictors. Although they are different variables, both measure how economical a car is by counting how many miles it can travel per gallon of fuel. For this reason we keep only one of them as a predictor. We choose City.MpG because it has the lower p-value of the two, which means that we remove Highway.MpG from our model.
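
The kind of dependency reported above can be reproduced on a small made-up data set. The sketch below is only an illustration and does not use the project's data: the data frame `toy` and its columns `price`, `body`, and `category` are hypothetical. It builds a factor that is fully determined by another factor, shows that `lm()` cannot estimate the redundant coefficients, and then drops the redundant predictor, which is the same remedy applied to Vehicle.Type.

```r
# Hypothetical toy data: `category` is fully determined by `body`,
# so its dummy columns duplicate the dummy columns of `body`.
toy <- data.frame(
  price    = c(20, 22, 35, 37, 50, 52, 21, 36),
  body     = factor(c("sedan", "sedan", "suv", "suv",
                      "wagon", "wagon", "sedan", "suv")),
  category = factor(c("A", "A", "B", "B", "C", "C", "A", "B"))
)

# Fit with both factors; the redundant coefficients come back as NA.
fit_full <- lm(price ~ body + category, data = toy)
print(coef(fit_full))

# alias() lists each aliased coefficient as a combination of the others.
aliased <- alias(fit_full)$Complete
print(aliased)

# Remedy: drop the redundant predictor and refit.
fit_reduced <- lm(price ~ body, data = toy)
print(coef(fit_reduced))
```

A coefficient that appears in `alias(...)$Complete` carries no information beyond the predictors it is expressed in, which is why removing it does not change the fitted values.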

Remove Vehicle.Type and Highway.MpG
model_data <- model_data %>%
  select(-c(Vehicle.Type, Highway.MpG))


Refit model_all without the removed predictors
model_all <- lm(MSRP ~ ., model_data)


Modeling with stepwise method


Automatic predictor selection using the stepwise method in the forward direction
model_fwd <- step(lm(MSRP ~ 1, model_data),
                  scope = list(lower = lm(MSRP ~ 1, model_data),
                               upper = lm(MSRP ~ ., model_data)),
                  direction = "forward")
## Start:  AIC=205812
## MSRP ~ 1
## 
##                     Df     Sum of Sq           RSS    AIC
## + Engine.HP          1 2096262177374 1489778586322 196615
## + Market.Category   61 1695116344793 1890924418903 199232
## + Year               1 1331909844208 2254130919488 200952
## + Engine.Fuel.Type   9 1166846233190 2419194530507 201708
## + Driven.Wheels      3  520118087971 3065922675725 204177
## + Vehicle.Size       2  400762781697 3185277981999 204575
## + Transmission.Type  3  377722049149 3208318714547 204653
## + City.MpG           1   70245818475 3515794945221 205607
## + Number.of.Doors    1   57313005881 3528727757815 205646
## + Popularity         1    8221255711 3577819507985 205790
## <none>                               3586040763696 205812
## 
## Step:  AIC=196615
## MSRP ~ Engine.HP
## 
##                     Df    Sum of Sq           RSS    AIC
## + Year               1 437616201226 1052162385096 192974
## + Market.Category   61 435489305929 1054289280394 193115
## + Engine.Fuel.Type   9 260443334211 1229335252111 194620
## + City.MpG           1 167815649539 1321962936783 195365
## + Driven.Wheels      3 119687829611 1370090756711 195743
## + Transmission.Type  3 104395132505 1385383453817 195860
## + Number.of.Doors    1  43618048742 1446160537581 196305
## + Vehicle.Size       2  19397585270 1470381001052 196481
## + Popularity         1   2908327896 1486870258426 196596
## <none>                              1489778586322 196615
## 
## Step:  AIC=192974
## MSRP ~ Engine.HP + Year
## 
##                     Df    Sum of Sq           RSS    AIC
## + Market.Category   61 380593004688  671569380409 188394
## + Engine.Fuel.Type   9 200664823655  851497561441 190776
## + Driven.Wheels      3  37671652384 1014490732712 192598
## + Transmission.Type  3  20301722330 1031860662766 192776
## + City.MpG           1  10437591983 1041724793113 192872
## + Popularity         1   9539829673 1042622555423 192881
## + Vehicle.Size       2   2787920653 1049374464443 192951
## + Number.of.Doors    1    257074703 1051905310393 192974
## <none>                              1052162385096 192974
## 
## Step:  AIC=188394
## MSRP ~ Engine.HP + Year + Market.Category
## 
##                     Df   Sum of Sq          RSS    AIC
## + Engine.Fuel.Type   9 58859141778 612710238630 187451
## + Driven.Wheels      3 13933444241 657635936167 188181
## + Vehicle.Size       2  7381835880 664187544528 188282
## + Transmission.Type  3  6996345497 664573034911 188290
## + City.MpG           1   154157347 671415223062 188394
## <none>                             671569380409 188394
## + Number.of.Doors    1    94360971 671475019438 188395
## + Popularity         1    60261230 671509119179 188395
## 
## Step:  AIC=187451
## MSRP ~ Engine.HP + Year + Market.Category + Engine.Fuel.Type
## 
##                     Df   Sum of Sq          RSS    AIC
## + Driven.Wheels      3 13082281579 599627957052 187231
## + Vehicle.Size       2  8984622961 603725615669 187301
## + Transmission.Type  3  5490069274 607220169356 187363
## + City.MpG           1  3683227880 609027010751 187390
## <none>                             612710238630 187451
## + Number.of.Doors    1    60043150 612650195480 187452
## + Popularity         1    23153316 612687085314 187453
## 
## Step:  AIC=187231
## MSRP ~ Engine.HP + Year + Market.Category + Engine.Fuel.Type + 
##     Driven.Wheels
## 
##                     Df  Sum of Sq          RSS    AIC
## + Vehicle.Size       2 9627980605 589999976447 187066
## + Transmission.Type  3 5171791941 594456165110 187147
## + City.MpG           1 3519927092 596108029959 187172
## <none>                            599627957052 187231
## + Number.of.Doors    1   50699277 599577257775 187233
## + Popularity         1     435876 599627521175 187233
## 
## Step:  AIC=187066
## MSRP ~ Engine.HP + Year + Market.Category + Engine.Fuel.Type + 
##     Driven.Wheels + Vehicle.Size
## 
##                     Df  Sum of Sq          RSS    AIC
## + Transmission.Type  3 6498070581 583501905866 186956
## + City.MpG           1 3757269800 586242706647 187001
## <none>                            589999976447 187066
## + Popularity         1   25969435 589974007012 187067
## + Number.of.Doors    1   18570045 589981406402 187068
## 
## Step:  AIC=186956
## MSRP ~ Engine.HP + Year + Market.Category + Engine.Fuel.Type + 
##     Driven.Wheels + Vehicle.Size + Transmission.Type
## 
##                   Df  Sum of Sq          RSS    AIC
## + City.MpG         1 3135704215 580366201651 186901
## + Number.of.Doors  1  375222610 583126683256 186951
## <none>                          583501905866 186956
## + Popularity       1    1981231 583499924635 186958
## 
## Step:  AIC=186901
## MSRP ~ Engine.HP + Year + Market.Category + Engine.Fuel.Type + 
##     Driven.Wheels + Vehicle.Size + Transmission.Type + City.MpG
## 
##                   Df Sum of Sq          RSS    AIC
## + Number.of.Doors  1 354992827 580011208824 186897
## <none>                         580366201651 186901
## + Popularity       1   6347645 580359854006 186903
## 
## Step:  AIC=186897
## MSRP ~ Engine.HP + Year + Market.Category + Engine.Fuel.Type + 
##     Driven.Wheels + Vehicle.Size + Transmission.Type + City.MpG + 
##     Number.of.Doors
## 
##              Df Sum of Sq          RSS    AIC
## <none>                    580011208824 186897
## + Popularity  1    377595 580010831229 186899
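
The AIC column in the trace comes from `extractAIC()`, which for a linear model computes n * log(RSS / n) + 2 * edf, where edf is the number of estimated coefficients. At each step the candidate with the lowest AIC is added, and the search stops when `<none>` has the lowest value. The sketch below checks the formula and reproduces a single forward step on the built-in `mtcars` data, which is used here only as a stand-in for the car data set; the candidate predictors `hp`, `wt`, and `qsec` are an arbitrary choice for illustration.

```r
# Stand-in data: the built-in mtcars data set, not the project's data.
fit <- lm(mpg ~ hp + wt, data = mtcars)

n   <- nrow(mtcars)            # number of observations
rss <- sum(residuals(fit)^2)   # residual sum of squares
edf <- length(coef(fit))       # estimated coefficients, intercept included

# AIC as used by step(): n * log(RSS / n) + 2 * edf
manual_aic <- n * log(rss / n) + 2 * edf
step_aic   <- extractAIC(fit)[2]
print(c(manual = manual_aic, extractAIC = step_aic))

# One forward step by hand: add1() produces the same kind of table
# that step() prints (Df, Sum of Sq, RSS, AIC) for each candidate.
null_fit   <- lm(mpg ~ 1, data = mtcars)
candidates <- add1(null_fit, scope = ~ hp + wt + qsec)
print(candidates)
```

This value differs from the one returned by `AIC()` by an additive constant, so the two rank models fitted to the same data in the same order.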


Automatic predictor selection using the stepwise method in the backward direction
model_bwd <- step(lm(MSRP ~ ., model_data), direction = "backward")
## Start:  AIC=186899
## MSRP ~ Year + Engine.Fuel.Type + Engine.HP + Transmission.Type + 
##     Driven.Wheels + Number.of.Doors + Market.Category + Vehicle.Size + 
##     City.MpG + Popularity
## 
##                     Df    Sum of Sq          RSS    AIC
## - Popularity         1       377595 580011208824 186897
## <none>                              580010831229 186899
## - Number.of.Doors    1    349022777 580359854006 186903
## - City.MpG           1   3115546335 583126377564 186953
## - Transmission.Type  3   6194427509 586205258738 187004
## - Vehicle.Size       2  11006581643 591017412872 187092
## - Driven.Wheels      3  13774200193 593785031422 187139
## - Engine.Fuel.Type   9  60408721262 640419552491 187919
## - Engine.HP          1 124187198228 704198029457 188929
## - Year               1 177111737302 757122568530 189688
## - Market.Category   61 221549513475 801560344704 190165
## 
## Step:  AIC=186897
## MSRP ~ Year + Engine.Fuel.Type + Engine.HP + Transmission.Type + 
##     Driven.Wheels + Number.of.Doors + Market.Category + Vehicle.Size + 
##     City.MpG
## 
##                     Df    Sum of Sq          RSS    AIC
## <none>                              580011208824 186897
## - Number.of.Doors    1    354992827 580366201651 186901
## - City.MpG           1   3115474432 583126683256 186951
## - Transmission.Type  3   6209045562 586220254386 187003
## - Vehicle.Size       2  11013587504 591024796328 187090
## - Driven.Wheels      3  13789215848 593800424672 187137
## - Engine.Fuel.Type   9  60462957929 640474166753 187918
## - Engine.HP          1 124318243091 704329451915 188929
## - Year               1 178406873287 758418082111 189704
## - Market.Category   61 223478508833 803489717658 190188


Automatic predictor selection using the stepwise method in both directions
model_both <- step(lm(MSRP ~ 1, model_data),
                  scope = list(lower = lm(MSRP ~ 1, model_data),
                               upper = lm(MSRP ~ ., model_data)),
                  direction = "both")
## Start:  AIC=205812
## MSRP ~ 1
## 
##                     Df     Sum of Sq           RSS    AIC
## + Engine.HP          1 2096262177374 1489778586322 196615
## + Market.Category   61 1695116344793 1890924418903 199232
## + Year               1 1331909844208 2254130919488 200952
## + Engine.Fuel.Type   9 1166846233190 2419194530507 201708
## + Driven.Wheels      3  520118087971 3065922675725 204177
## + Vehicle.Size       2  400762781697 3185277981999 204575
## + Transmission.Type  3  377722049149 3208318714547 204653
## + City.MpG           1   70245818475 3515794945221 205607
## + Number.of.Doors    1   57313005881 3528727757815 205646
## + Popularity         1    8221255711 3577819507985 205790
## <none>                               3586040763696 205812
## 
## Step:  AIC=196615
## MSRP ~ Engine.HP
## 
##                     Df     Sum of Sq           RSS    AIC
## + Year               1  437616201226 1052162385096 192974
## + Market.Category   61  435489305929 1054289280394 193115
## + Engine.Fuel.Type   9  260443334211 1229335252111 194620
## + City.MpG           1  167815649539 1321962936783 195365
## + Driven.Wheels      3  119687829611 1370090756711 195743
## + Transmission.Type  3  104395132505 1385383453817 195860
## + Number.of.Doors    1   43618048742 1446160537581 196305
## + Vehicle.Size       2   19397585270 1470381001052 196481
## + Popularity         1    2908327896 1486870258426 196596
## <none>                               1489778586322 196615
## - Engine.HP          1 2096262177374 3586040763696 205812
## 
## Step:  AIC=192974
## MSRP ~ Engine.HP + Year
## 
##                     Df     Sum of Sq           RSS    AIC
## + Market.Category   61  380593004688  671569380409 188394
## + Engine.Fuel.Type   9  200664823655  851497561441 190776
## + Driven.Wheels      3   37671652384 1014490732712 192598
## + Transmission.Type  3   20301722330 1031860662766 192776
## + City.MpG           1   10437591983 1041724793113 192872
## + Popularity         1    9539829673 1042622555423 192881
## + Vehicle.Size       2    2787920653 1049374464443 192951
## + Number.of.Doors    1     257074703 1051905310393 192974
## <none>                               1052162385096 192974
## - Year               1  437616201226 1489778586322 196615
## - Engine.HP          1 1201968534392 2254130919488 200952
## 
## Step:  AIC=188394
## MSRP ~ Engine.HP + Year + Market.Category
## 
##                     Df    Sum of Sq           RSS    AIC
## + Engine.Fuel.Type   9  58859141778  612710238630 187451
## + Driven.Wheels      3  13933444241  657635936167 188181
## + Vehicle.Size       2   7381835880  664187544528 188282
## + Transmission.Type  3   6996345497  664573034911 188290
## + City.MpG           1    154157347  671415223062 188394
## <none>                               671569380409 188394
## + Number.of.Doors    1     94360971  671475019438 188395
## + Popularity         1     60261230  671509119179 188395
## - Market.Category   61 380593004688 1052162385096 192974
## - Year               1 382719899985 1054289280394 193115
## - Engine.HP          1 421665509303 1093234889712 193495
## 
## Step:  AIC=187451
## MSRP ~ Engine.HP + Year + Market.Category + Engine.Fuel.Type
## 
##                     Df    Sum of Sq           RSS    AIC
## + Driven.Wheels      3  13082281579  599627957052 187231
## + Vehicle.Size       2   8984622961  603725615669 187301
## + Transmission.Type  3   5490069274  607220169356 187363
## + City.MpG           1   3683227880  609027010751 187390
## <none>                               612710238630 187451
## + Number.of.Doors    1     60043150  612650195480 187452
## + Popularity         1     23153316  612687085314 187453
## - Engine.Fuel.Type   9  58859141778  671569380409 188394
## - Market.Category   61 238787322811  851497561441 190776
## - Year               1 315679721285  928389959915 191802
## - Engine.HP          1 410337420061 1023047658692 192818
## 
## Step:  AIC=187231
## MSRP ~ Engine.HP + Year + Market.Category + Engine.Fuel.Type + 
##     Driven.Wheels
## 
##                     Df    Sum of Sq          RSS    AIC
## + Vehicle.Size       2   9627980605 589999976447 187066
## + Transmission.Type  3   5171791941 594456165110 187147
## + City.MpG           1   3519927092 596108029959 187172
## <none>                              599627957052 187231
## + Number.of.Doors    1     50699277 599577257775 187233
## + Popularity         1       435876 599627521175 187233
## - Driven.Wheels      3  13082281579 612710238630 187451
## - Engine.Fuel.Type   9  58007979116 657635936167 188181
## - Market.Category   61 232581066958 832209024009 190542
## - Year               1 281574121952 881202079003 191261
## - Engine.HP          1 331167015958 930794973009 191835
## 
## Step:  AIC=187066
## MSRP ~ Engine.HP + Year + Market.Category + Engine.Fuel.Type + 
##     Driven.Wheels + Vehicle.Size
## 
##                     Df    Sum of Sq          RSS    AIC
## + Transmission.Type  3   6498070581 583501905866 186956
## + City.MpG           1   3757269800 586242706647 187001
## <none>                              589999976447 187066
## + Popularity         1     25969435 589974007012 187067
## + Number.of.Doors    1     18570045 589981406402 187068
## - Vehicle.Size       2   9627980605 599627957052 187231
## - Driven.Wheels      3  13725639222 603725615669 187301
## - Engine.Fuel.Type   9  60051209892 650051186339 188063
## - Engine.HP          1 220207884208 810207860655 190386
## - Market.Category   61 235646500912 825646477359 190463
## - Year               1 278090271491 868090247938 191108
## 
## Step:  AIC=186956
## MSRP ~ Engine.HP + Year + Market.Category + Engine.Fuel.Type + 
##     Driven.Wheels + Vehicle.Size + Transmission.Type
## 
##                     Df    Sum of Sq          RSS    AIC
## + City.MpG           1   3135704215 580366201651 186901
## + Number.of.Doors    1    375222610 583126683256 186951
## <none>                              583501905866 186956
## + Popularity         1      1981231 583499924635 186958
## - Transmission.Type  3   6498070581 589999976447 187066
## - Vehicle.Size       2  10954259245 594456165110 187147
## - Driven.Wheels      3  13218358475 596720264341 187184
## - Engine.Fuel.Type   9  59229769736 642731675602 187950
## - Engine.HP          1 208373569078 791875474944 190152
## - Market.Category   61 222751569089 806253474955 190220
## - Year               1 245580885790 829082791656 190633
## 
## Step:  AIC=186901
## MSRP ~ Engine.HP + Year + Market.Category + Engine.Fuel.Type + 
##     Driven.Wheels + Vehicle.Size + Transmission.Type + City.MpG
## 
##                     Df    Sum of Sq          RSS    AIC
## + Number.of.Doors    1    354992827 580011208824 186897
## <none>                              580366201651 186901
## + Popularity         1      6347645 580359854006 186903
## - City.MpG           1   3135704215 583501905866 186956
## - Transmission.Type  3   5876504996 586242706647 187001
## - Vehicle.Size       2  11129778252 591495979903 187096
## - Driven.Wheels      3  13434232219 593800433870 187135
## - Engine.Fuel.Type   9  61035390612 641401592263 187931
## - Engine.HP          1 124275608729 704641810380 188932
## - Year               1 180784617518 761150819169 189739
## - Market.Category   61 223188385761 803554587412 190187
## 
## Step:  AIC=186897
## MSRP ~ Engine.HP + Year + Market.Category + Engine.Fuel.Type + 
##     Driven.Wheels + Vehicle.Size + Transmission.Type + City.MpG + 
##     Number.of.Doors
## 
##                     Df    Sum of Sq          RSS    AIC
## <none>                              580011208824 186897
## + Popularity         1       377595 580010831229 186899
## - Number.of.Doors    1    354992827 580366201651 186901
## - City.MpG           1   3115474432 583126683256 186951
## - Transmission.Type  3   6209045562 586220254386 187003
## - Vehicle.Size       2  11013587504 591024796328 187090
## - Driven.Wheels      3  13789215848 593800424672 187137
## - Engine.Fuel.Type   9  60462957929 640474166753 187918
## - Engine.HP          1 124318243091 704329451915 188929
## - Year               1 178406873287 758418082111 189704
## - Market.Category   61 223478508833 803489717658 190188


Compare the adjusted R-squared of all three methods
summary(model_fwd)$adj.r.squared
## [1] 0.837
summary(model_bwd)$adj.r.squared
## [1] 0.837
summary(model_both)$adj.r.squared
## [1] 0.837
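
Adjusted R-squared is used for this comparison because it penalizes the number of estimated coefficients, unlike multiple R-squared, which never decreases when a predictor is added. For a model with an intercept it equals 1 - (1 - R²) * (n - 1) / (residual degrees of freedom). The sketch below checks this against `summary()` on the built-in `mtcars` data, used only as a stand-in for the car data set.

```r
# Stand-in data: the built-in mtcars data set, not the project's data.
fit <- lm(mpg ~ hp + wt, data = mtcars)
s   <- summary(fit)

n          <- nrow(mtcars)
r2         <- s$r.squared
manual_adj <- 1 - (1 - r2) * (n - 1) / df.residual(fit)

print(c(multiple   = r2,
        adjusted   = s$adj.r.squared,
        manual_adj = manual_adj))
```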


Evaluation of the stepwise model

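The results show that all three methods give the same adjusted R-squared, about 0.84 (84%). This is expected, because the traces above show that all three searches stop at the same set of predictors with the same AIC (186897). We therefore continue with the model from the backward method. In this run it was also the cheapest search: it stopped after removing only Popularity, whereas the forward search needed nine additions to reach the same model.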
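
Identical adjusted R-squared values suggest, but do not prove, that the three searches selected the same predictors. One way to check this directly is to compare the selected terms. The sketch below does so on the built-in `mtcars` data, used only as a stand-in; the object names (`fits`, `selected_terms`, `comparison`, `same_selection`) are hypothetical, and on other data the three directions may well disagree.

```r
# Stand-in data: the built-in mtcars data set, not the project's data.
null_fit <- lm(mpg ~ 1, data = mtcars)
full_fit <- lm(mpg ~ ., data = mtcars)

# formula() on a fitted lm expands the dot, so the scope is explicit.
search_scope <- list(lower = formula(null_fit), upper = formula(full_fit))

# trace = 0 suppresses the step-by-step printout.
fits <- list(
  forward  = step(null_fit, scope = search_scope,
                  direction = "forward", trace = 0),
  backward = step(full_fit, direction = "backward", trace = 0),
  both     = step(null_fit, scope = search_scope,
                  direction = "both", trace = 0)
)

# Predictors kept by each search, sorted so that order does not matter.
selected_terms <- lapply(fits, function(m) sort(attr(terms(m), "term.labels")))

comparison <- data.frame(
  n_terms = sapply(selected_terms, length),
  aic     = sapply(fits, function(m) extractAIC(m)[2]),
  adj_r2  = sapply(fits, function(m) summary(m)$adj.r.squared)
)
print(comparison)

# TRUE only if every search ended with the same set of predictors.
same_selection <- all(sapply(selected_terms,
                             function(x) setequal(x, selected_terms$forward)))
print(same_selection)
```

When `same_selection` is `TRUE`, the choice between the three models is a matter of convenience, since they are the same fit.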

View the summary of model_bwd
options(scipen = 100, digits = 4)
summary(model_bwd)
## 
## Call:
## lm(formula = MSRP ~ Year + Engine.Fuel.Type + Engine.HP + Transmission.Type + 
##     Driven.Wheels + Number.of.Doors + Market.Category + Vehicle.Size + 
##     City.MpG, data = model_data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -29299  -4315   -282   3995  47431 
## 
## Coefficients:
##                                                                  Estimate
## (Intercept)                                                    -1871373.9
## Year                                                                939.9
## Engine.Fuel.Typediesel                                             5610.5
## Engine.Fuel.Typeelectric                                          26456.9
## Engine.Fuel.Typeflex-fuel (premium unleaded recommended/E85)     -14265.8
## Engine.Fuel.Typeflex-fuel (premium unleaded required/E85)          8178.9
## Engine.Fuel.Typeflex-fuel (unleaded/E85)                          -5662.0
## Engine.Fuel.Typenatural gas                                        3400.5
## Engine.Fuel.Typepremium unleaded (recommended)                     -976.1
## Engine.Fuel.Typepremium unleaded (required)                        4385.2
## Engine.Fuel.Typeregular unleaded                                  -5156.3
## Engine.HP                                                            94.6
## Transmission.TypeAUTOMATIC                                         -348.5
## Transmission.TypeDIRECT_DRIVE                                      -905.1
## Transmission.TypeMANUAL                                           -2593.0
## Driven.Wheelsfour wheel drive                                       -73.7
## Driven.Wheelsfront wheel drive                                     -670.0
## Driven.Wheelsrear wheel drive                                     -3337.6
## Number.of.Doors                                                    -272.0
## Market.CategoryCrossover,Diesel                                   12542.8
## Market.CategoryCrossover,Exotic,Luxury,High-Performance           18218.3
## Market.CategoryCrossover,Exotic,Luxury,Performance                14690.3
## Market.CategoryCrossover,Factory Tuner,Luxury,High-Performance    12697.3
## Market.CategoryCrossover,Factory Tuner,Luxury,Performance          4953.2
## Market.CategoryCrossover,Factory Tuner,Performance                -1123.9
## Market.CategoryCrossover,Flex Fuel                                 1428.4
## Market.CategoryCrossover,Flex Fuel,Luxury                         12376.1
## Market.CategoryCrossover,Flex Fuel,Luxury,Performance             28811.3
## Market.CategoryCrossover,Flex Fuel,Performance                     1704.9
## Market.CategoryCrossover,Hatchback                                -1826.9
## Market.CategoryCrossover,Hatchback,Factory Tuner,Performance     -10444.8
## Market.CategoryCrossover,Hatchback,Luxury                          6840.8
## Market.CategoryCrossover,Hatchback,Performance                    -5339.0
## Market.CategoryCrossover,Hybrid                                   10276.0
## Market.CategoryCrossover,Luxury                                    4877.8
## Market.CategoryCrossover,Luxury,Diesel                            12007.1
## Market.CategoryCrossover,Luxury,High-Performance                  14899.4
## Market.CategoryCrossover,Luxury,Hybrid                            12862.0
## Market.CategoryCrossover,Luxury,Performance                        5707.5
## Market.CategoryCrossover,Luxury,Performance,Hybrid                28250.4
## Market.CategoryCrossover,Performance                               -120.7
## Market.CategoryDiesel                                             -2665.3
## Market.CategoryDiesel,Luxury                                      11529.8
## Market.CategoryExotic,Factory Tuner,Luxury,High-Performance       49712.5
## Market.CategoryExotic,High-Performance                            27750.3
## Market.CategoryExotic,Luxury,High-Performance                     37660.8
## Market.CategoryFactory Tuner,High-Performance                     -6712.2
## Market.CategoryFactory Tuner,Luxury,High-Performance               7198.7
## Market.CategoryFactory Tuner,Luxury,Performance                   -4814.0
## Market.CategoryFactory Tuner,Performance                          -8636.6
## Market.CategoryFlex Fuel                                          -1865.4
## Market.CategoryFlex Fuel,Diesel                                    3449.3
## Market.CategoryFlex Fuel,Hybrid                                    7319.6
## Market.CategoryFlex Fuel,Luxury                                   22573.4
## Market.CategoryFlex Fuel,Luxury,High-Performance                  10216.2
## Market.CategoryFlex Fuel,Luxury,Performance                       27612.1
## Market.CategoryFlex Fuel,Performance                               2939.3
## Market.CategoryFlex Fuel,Performance,Hybrid                       -6031.9
## Market.CategoryHatchback                                           -363.1
## Market.CategoryHatchback,Diesel                                   -6002.5
## Market.CategoryHatchback,Factory Tuner,High-Performance           -6702.8
## Market.CategoryHatchback,Factory Tuner,Luxury,Performance         -4293.8
## Market.CategoryHatchback,Factory Tuner,Performance               -10262.9
## Market.CategoryHatchback,Flex Fuel                                -3035.3
## Market.CategoryHatchback,Hybrid                                    7611.3
## Market.CategoryHatchback,Luxury                                     770.4
## Market.CategoryHatchback,Luxury,Hybrid                            12285.7
## Market.CategoryHatchback,Luxury,Performance                         -34.3
## Market.CategoryHatchback,Performance                              -4430.7
## Market.CategoryHigh-Performance                                   -2427.1
## Market.CategoryHybrid                                              7366.7
## Market.CategoryLuxury                                              4774.3
## Market.CategoryLuxury,High-Performance                            14697.0
## Market.CategoryLuxury,High-Performance,Hybrid                      5206.9
## Market.CategoryLuxury,Hybrid                                      22523.5
## Market.CategoryLuxury,Performance                                  7383.6
## Market.CategoryLuxury,Performance,Hybrid                          19658.5
## Market.CategoryN/A                                                 -113.6
## Market.CategoryPerformance                                        -2171.6
## Market.CategoryPerformance,Hybrid                                 -6097.2
## Vehicle.SizeLarge                                                   484.8
## Vehicle.SizeMidsize                                               -2168.9
## City.MpG                                                           -221.9
##                                                                Std. Error
## (Intercept)                                                       33143.0
## Year                                                                 16.6
## Engine.Fuel.Typediesel                                             4639.0
## Engine.Fuel.Typeelectric                                           7673.9
## Engine.Fuel.Typeflex-fuel (premium unleaded recommended/E85)       4883.8
## Engine.Fuel.Typeflex-fuel (premium unleaded required/E85)          4867.9
## Engine.Fuel.Typeflex-fuel (unleaded/E85)                           4356.3
## Engine.Fuel.Typenatural gas                                        6826.1
## Engine.Fuel.Typepremium unleaded (recommended)                     4330.4
## Engine.Fuel.Typepremium unleaded (required)                        4332.0
## Engine.Fuel.Typeregular unleaded                                   4319.8
## Engine.HP                                                             2.0
## Transmission.TypeAUTOMATIC                                          450.4
## Transmission.TypeDIRECT_DRIVE                                      5349.6
## Transmission.TypeMANUAL                                             461.1
## Driven.Wheelsfour wheel drive                                       324.8
## Driven.Wheelsfront wheel drive                                      243.0
## Driven.Wheelsrear wheel drive                                       265.4
## Number.of.Doors                                                     107.9
## Market.CategoryCrossover,Diesel                                    3300.7
## Market.CategoryCrossover,Exotic,Luxury,High-Performance            7486.8
## Market.CategoryCrossover,Exotic,Luxury,Performance                 7484.1
## Market.CategoryCrossover,Factory Tuner,Luxury,High-Performance     2071.6
## Market.CategoryCrossover,Factory Tuner,Luxury,Performance          3368.5
## Market.CategoryCrossover,Factory Tuner,Performance                 3749.2
## Market.CategoryCrossover,Flex Fuel                                 1079.5
## Market.CategoryCrossover,Flex Fuel,Luxury                          2654.9
## Market.CategoryCrossover,Flex Fuel,Luxury,Performance              3831.4
## Market.CategoryCrossover,Flex Fuel,Performance                     3084.2
## Market.CategoryCrossover,Hatchback                                  913.5
## Market.CategoryCrossover,Hatchback,Factory Tuner,Performance       3077.7
## Market.CategoryCrossover,Hatchback,Luxury                          2850.5
## Market.CategoryCrossover,Hatchback,Performance                     3074.8
## Market.CategoryCrossover,Hybrid                                    1216.7
## Market.CategoryCrossover,Luxury                                     462.0
## Market.CategoryCrossover,Luxury,Diesel                             2143.3
## Market.CategoryCrossover,Luxury,High-Performance                   2864.8
## Market.CategoryCrossover,Luxury,Hybrid                             1582.7
## Market.CategoryCrossover,Luxury,Performance                         785.5
## Market.CategoryCrossover,Luxury,Performance,Hybrid                 5320.6
## Market.CategoryCrossover,Performance                                934.8
## Market.CategoryDiesel                                              1280.4
## Market.CategoryDiesel,Luxury                                       2013.8
## Market.CategoryExotic,Factory Tuner,Luxury,High-Performance        4353.8
## Market.CategoryExotic,High-Performance                             1336.2
## Market.CategoryExotic,Luxury,High-Performance                      1568.2
## Market.CategoryFactory Tuner,High-Performance                       932.0
## Market.CategoryFactory Tuner,Luxury,High-Performance                790.3
## Market.CategoryFactory Tuner,Luxury,Performance                    1395.4
## Market.CategoryFactory Tuner,Performance                            902.4
## Market.CategoryFlex Fuel                                            578.8
## Market.CategoryFlex Fuel,Diesel                                    1977.8
## Market.CategoryFlex Fuel,Hybrid                                    5295.6
## Market.CategoryFlex Fuel,Luxury                                    1655.4
## Market.CategoryFlex Fuel,Luxury,High-Performance                   1890.5
## Market.CategoryFlex Fuel,Luxury,Performance                        1604.3
## Market.CategoryFlex Fuel,Performance                               1043.3
## Market.CategoryFlex Fuel,Performance,Hybrid                        5323.2
## Market.CategoryHatchback                                            439.1
## Market.CategoryHatchback,Diesel                                    2608.6
## Market.CategoryHatchback,Factory Tuner,High-Performance            2121.7
## Market.CategoryHatchback,Factory Tuner,Luxury,Performance          2520.4
## Market.CategoryHatchback,Factory Tuner,Performance                 1669.0
## Market.CategoryHatchback,Flex Fuel                                 2911.6
## Market.CategoryHatchback,Hybrid                                    1152.8
## Market.CategoryHatchback,Luxury                                    1199.8
## Market.CategoryHatchback,Luxury,Hybrid                             4357.7
## Market.CategoryHatchback,Luxury,Performance                        1291.3
## Market.CategoryHatchback,Performance                                634.4
## Market.CategoryHigh-Performance                                     703.8
## Market.CategoryHybrid                                               873.3
## Market.CategoryLuxury                                               388.9
## Market.CategoryLuxury,High-Performance                              639.2
## Market.CategoryLuxury,High-Performance,Hybrid                      2551.4
## Market.CategoryLuxury,Hybrid                                       1160.9
## Market.CategoryLuxury,Performance                                   433.0
## Market.CategoryLuxury,Performance,Hybrid                           2313.1
## Market.CategoryN/A                                                  292.7
## Market.CategoryPerformance                                          444.8
## Market.CategoryPerformance,Hybrid                                  7481.2
## Vehicle.SizeLarge                                                   278.2
## Vehicle.SizeMidsize                                                 208.8
## City.MpG                                                             29.7
##                                                                t value
## (Intercept)                                                     -56.46
## Year                                                             56.53
## Engine.Fuel.Typediesel                                            1.21
## Engine.Fuel.Typeelectric                                          3.45
## Engine.Fuel.Typeflex-fuel (premium unleaded recommended/E85)     -2.92
## Engine.Fuel.Typeflex-fuel (premium unleaded required/E85)         1.68
## Engine.Fuel.Typeflex-fuel (unleaded/E85)                         -1.30
## Engine.Fuel.Typenatural gas                                       0.50
## Engine.Fuel.Typepremium unleaded (recommended)                   -0.23
## Engine.Fuel.Typepremium unleaded (required)                       1.01
## Engine.Fuel.Typeregular unleaded                                 -1.19
## Engine.HP                                                        47.19
## Transmission.TypeAUTOMATIC                                       -0.77
## Transmission.TypeDIRECT_DRIVE                                    -0.17
## Transmission.TypeMANUAL                                          -5.62
## Driven.Wheelsfour wheel drive                                    -0.23
## Driven.Wheelsfront wheel drive                                   -2.76
## Driven.Wheelsrear wheel drive                                   -12.57
## Number.of.Doors                                                  -2.52
## Market.CategoryCrossover,Diesel                                   3.80
## Market.CategoryCrossover,Exotic,Luxury,High-Performance           2.43
## Market.CategoryCrossover,Exotic,Luxury,Performance                1.96
## Market.CategoryCrossover,Factory Tuner,Luxury,High-Performance    6.13
## Market.CategoryCrossover,Factory Tuner,Luxury,Performance         1.47
## Market.CategoryCrossover,Factory Tuner,Performance               -0.30
## Market.CategoryCrossover,Flex Fuel                                1.32
## Market.CategoryCrossover,Flex Fuel,Luxury                         4.66
## Market.CategoryCrossover,Flex Fuel,Luxury,Performance             7.52
## Market.CategoryCrossover,Flex Fuel,Performance                    0.55
## Market.CategoryCrossover,Hatchback                               -2.00
## Market.CategoryCrossover,Hatchback,Factory Tuner,Performance     -3.39
## Market.CategoryCrossover,Hatchback,Luxury                         2.40
## Market.CategoryCrossover,Hatchback,Performance                   -1.74
## Market.CategoryCrossover,Hybrid                                   8.45
## Market.CategoryCrossover,Luxury                                  10.56
## Market.CategoryCrossover,Luxury,Diesel                            5.60
## Market.CategoryCrossover,Luxury,High-Performance                  5.20
## Market.CategoryCrossover,Luxury,Hybrid                            8.13
## Market.CategoryCrossover,Luxury,Performance                       7.27
## Market.CategoryCrossover,Luxury,Performance,Hybrid                5.31
## Market.CategoryCrossover,Performance                             -0.13
## Market.CategoryDiesel                                            -2.08
## Market.CategoryDiesel,Luxury                                      5.73
## Market.CategoryExotic,Factory Tuner,Luxury,High-Performance      11.42
## Market.CategoryExotic,High-Performance                           20.77
## Market.CategoryExotic,Luxury,High-Performance                    24.02
## Market.CategoryFactory Tuner,High-Performance                    -7.20
## Market.CategoryFactory Tuner,Luxury,High-Performance              9.11
## Market.CategoryFactory Tuner,Luxury,Performance                  -3.45
## Market.CategoryFactory Tuner,Performance                         -9.57
## Market.CategoryFlex Fuel                                         -3.22
## Market.CategoryFlex Fuel,Diesel                                   1.74
## Market.CategoryFlex Fuel,Hybrid                                   1.38
## Market.CategoryFlex Fuel,Luxury                                  13.64
## Market.CategoryFlex Fuel,Luxury,High-Performance                  5.40
## Market.CategoryFlex Fuel,Luxury,Performance                      17.21
## Market.CategoryFlex Fuel,Performance                              2.82
## Market.CategoryFlex Fuel,Performance,Hybrid                      -1.13
## Market.CategoryHatchback                                         -0.83
## Market.CategoryHatchback,Diesel                                  -2.30
## Market.CategoryHatchback,Factory Tuner,High-Performance          -3.16
## Market.CategoryHatchback,Factory Tuner,Luxury,Performance        -1.70
## Market.CategoryHatchback,Factory Tuner,Performance               -6.15
## Market.CategoryHatchback,Flex Fuel                               -1.04
## Market.CategoryHatchback,Hybrid                                   6.60
## Market.CategoryHatchback,Luxury                                   0.64
## Market.CategoryHatchback,Luxury,Hybrid                            2.82
## Market.CategoryHatchback,Luxury,Performance                      -0.03
## Market.CategoryHatchback,Performance                             -6.98
## Market.CategoryHigh-Performance                                  -3.45
## Market.CategoryHybrid                                             8.44
## Market.CategoryLuxury                                            12.28
## Market.CategoryLuxury,High-Performance                           22.99
## Market.CategoryLuxury,High-Performance,Hybrid                     2.04
## Market.CategoryLuxury,Hybrid                                     19.40
## Market.CategoryLuxury,Performance                                17.05
## Market.CategoryLuxury,Performance,Hybrid                          8.50
## Market.CategoryN/A                                               -0.39
## Market.CategoryPerformance                                       -4.88
## Market.CategoryPerformance,Hybrid                                -0.82
## Vehicle.SizeLarge                                                 1.74
## Vehicle.SizeMidsize                                             -10.39
## City.MpG                                                         -7.47
##                                                                            Pr(>|t|)
## (Intercept)                                                    < 0.0000000000000002
## Year                                                           < 0.0000000000000002
## Engine.Fuel.Typediesel                                                      0.22652
## Engine.Fuel.Typeelectric                                                    0.00057
## Engine.Fuel.Typeflex-fuel (premium unleaded recommended/E85)                0.00350
## Engine.Fuel.Typeflex-fuel (premium unleaded required/E85)                   0.09296
## Engine.Fuel.Typeflex-fuel (unleaded/E85)                                    0.19372
## Engine.Fuel.Typenatural gas                                                 0.61838
## Engine.Fuel.Typepremium unleaded (recommended)                              0.82166
## Engine.Fuel.Typepremium unleaded (required)                                 0.31142
## Engine.Fuel.Typeregular unleaded                                            0.23264
## Engine.HP                                                      < 0.0000000000000002
## Transmission.TypeAUTOMATIC                                                  0.43909
## Transmission.TypeDIRECT_DRIVE                                               0.86565
## Transmission.TypeMANUAL                                         0.00000001925471916
## Driven.Wheelsfour wheel drive                                               0.82063
## Driven.Wheelsfront wheel drive                                              0.00584
## Driven.Wheelsrear wheel drive                                  < 0.0000000000000002
## Number.of.Doors                                                             0.01169
## Market.CategoryCrossover,Diesel                                             0.00015
## Market.CategoryCrossover,Exotic,Luxury,High-Performance                     0.01497
## Market.CategoryCrossover,Exotic,Luxury,Performance                          0.04969
## Market.CategoryCrossover,Factory Tuner,Luxury,High-Performance  0.00000000091456396
## Market.CategoryCrossover,Factory Tuner,Luxury,Performance                   0.14147
## Market.CategoryCrossover,Factory Tuner,Performance                          0.76435
## Market.CategoryCrossover,Flex Fuel                                          0.18582
## Market.CategoryCrossover,Flex Fuel,Luxury                       0.00000317717953205
## Market.CategoryCrossover,Flex Fuel,Luxury,Performance           0.00000000000005941
## Market.CategoryCrossover,Flex Fuel,Performance                              0.58042
## Market.CategoryCrossover,Hatchback                                          0.04555
## Market.CategoryCrossover,Hatchback,Factory Tuner,Performance                0.00069
## Market.CategoryCrossover,Hatchback,Luxury                                   0.01642
## Market.CategoryCrossover,Hatchback,Performance                              0.08252
## Market.CategoryCrossover,Hybrid                                < 0.0000000000000002
## Market.CategoryCrossover,Luxury                                < 0.0000000000000002
## Market.CategoryCrossover,Luxury,Diesel                          0.00000002169450558
## Market.CategoryCrossover,Luxury,High-Performance                0.00000020211811529
## Market.CategoryCrossover,Luxury,Hybrid                          0.00000000000000049
## Market.CategoryCrossover,Luxury,Performance                     0.00000000000039815
## Market.CategoryCrossover,Luxury,Performance,Hybrid              0.00000011210373084
## Market.CategoryCrossover,Performance                                        0.89726
## Market.CategoryDiesel                                                       0.03740
## Market.CategoryDiesel,Luxury                                    0.00000001059972076
## Market.CategoryExotic,Factory Tuner,Luxury,High-Performance    < 0.0000000000000002
## Market.CategoryExotic,High-Performance                         < 0.0000000000000002
## Market.CategoryExotic,Luxury,High-Performance                  < 0.0000000000000002
## Market.CategoryFactory Tuner,High-Performance                   0.00000000000063352
## Market.CategoryFactory Tuner,Luxury,High-Performance           < 0.0000000000000002
## Market.CategoryFactory Tuner,Luxury,Performance                             0.00056
## Market.CategoryFactory Tuner,Performance                       < 0.0000000000000002
## Market.CategoryFlex Fuel                                                    0.00127
## Market.CategoryFlex Fuel,Diesel                                             0.08118
## Market.CategoryFlex Fuel,Hybrid                                             0.16694
## Market.CategoryFlex Fuel,Luxury                                < 0.0000000000000002
## Market.CategoryFlex Fuel,Luxury,High-Performance                0.00000006662959107
## Market.CategoryFlex Fuel,Luxury,Performance                    < 0.0000000000000002
## Market.CategoryFlex Fuel,Performance                                        0.00485
## Market.CategoryFlex Fuel,Performance,Hybrid                                 0.25718
## Market.CategoryHatchback                                                    0.40835
## Market.CategoryHatchback,Diesel                                             0.02141
## Market.CategoryHatchback,Factory Tuner,High-Performance                     0.00159
## Market.CategoryHatchback,Factory Tuner,Luxury,Performance                   0.08848
## Market.CategoryHatchback,Factory Tuner,Performance              0.00000000080748625
## Market.CategoryHatchback,Flex Fuel                                          0.29720
## Market.CategoryHatchback,Hybrid                                 0.00000000004243056
## Market.CategoryHatchback,Luxury                                             0.52081
## Market.CategoryHatchback,Luxury,Hybrid                                      0.00482
## Market.CategoryHatchback,Luxury,Performance                                 0.97879
## Market.CategoryHatchback,Performance                            0.00000000000303147
## Market.CategoryHigh-Performance                                             0.00057
## Market.CategoryHybrid                                          < 0.0000000000000002
## Market.CategoryLuxury                                          < 0.0000000000000002
## Market.CategoryLuxury,High-Performance                         < 0.0000000000000002
## Market.CategoryLuxury,High-Performance,Hybrid                               0.04130
## Market.CategoryLuxury,Hybrid                                   < 0.0000000000000002
## Market.CategoryLuxury,Performance                              < 0.0000000000000002
## Market.CategoryLuxury,Performance,Hybrid                       < 0.0000000000000002
## Market.CategoryN/A                                                          0.69796
## Market.CategoryPerformance                                      0.00000106693015318
## Market.CategoryPerformance,Hybrid                                           0.41509
## Vehicle.SizeLarge                                                           0.08147
## Vehicle.SizeMidsize                                            < 0.0000000000000002
## City.MpG                                                        0.00000000000008629
##                                                                   
## (Intercept)                                                    ***
## Year                                                           ***
## Engine.Fuel.Typediesel                                            
## Engine.Fuel.Typeelectric                                       ***
## Engine.Fuel.Typeflex-fuel (premium unleaded recommended/E85)   ** 
## Engine.Fuel.Typeflex-fuel (premium unleaded required/E85)      .  
## Engine.Fuel.Typeflex-fuel (unleaded/E85)                          
## Engine.Fuel.Typenatural gas                                       
## Engine.Fuel.Typepremium unleaded (recommended)                    
## Engine.Fuel.Typepremium unleaded (required)                       
## Engine.Fuel.Typeregular unleaded                                  
## Engine.HP                                                      ***
## Transmission.TypeAUTOMATIC                                        
## Transmission.TypeDIRECT_DRIVE                                     
## Transmission.TypeMANUAL                                        ***
## Driven.Wheelsfour wheel drive                                     
## Driven.Wheelsfront wheel drive                                 ** 
## Driven.Wheelsrear wheel drive                                  ***
## Number.of.Doors                                                *  
## Market.CategoryCrossover,Diesel                                ***
## Market.CategoryCrossover,Exotic,Luxury,High-Performance        *  
## Market.CategoryCrossover,Exotic,Luxury,Performance             *  
## Market.CategoryCrossover,Factory Tuner,Luxury,High-Performance ***
## Market.CategoryCrossover,Factory Tuner,Luxury,Performance         
## Market.CategoryCrossover,Factory Tuner,Performance                
## Market.CategoryCrossover,Flex Fuel                                
## Market.CategoryCrossover,Flex Fuel,Luxury                      ***
## Market.CategoryCrossover,Flex Fuel,Luxury,Performance          ***
## Market.CategoryCrossover,Flex Fuel,Performance                    
## Market.CategoryCrossover,Hatchback                             *  
## Market.CategoryCrossover,Hatchback,Factory Tuner,Performance   ***
## Market.CategoryCrossover,Hatchback,Luxury                      *  
## Market.CategoryCrossover,Hatchback,Performance                 .  
## Market.CategoryCrossover,Hybrid                                ***
## Market.CategoryCrossover,Luxury                                ***
## Market.CategoryCrossover,Luxury,Diesel                         ***
## Market.CategoryCrossover,Luxury,High-Performance               ***
## Market.CategoryCrossover,Luxury,Hybrid                         ***
## Market.CategoryCrossover,Luxury,Performance                    ***
## Market.CategoryCrossover,Luxury,Performance,Hybrid             ***
## Market.CategoryCrossover,Performance                              
## Market.CategoryDiesel                                          *  
## Market.CategoryDiesel,Luxury                                   ***
## Market.CategoryExotic,Factory Tuner,Luxury,High-Performance    ***
## Market.CategoryExotic,High-Performance                         ***
## Market.CategoryExotic,Luxury,High-Performance                  ***
## Market.CategoryFactory Tuner,High-Performance                  ***
## Market.CategoryFactory Tuner,Luxury,High-Performance           ***
## Market.CategoryFactory Tuner,Luxury,Performance                ***
## Market.CategoryFactory Tuner,Performance                       ***
## Market.CategoryFlex Fuel                                       ** 
## Market.CategoryFlex Fuel,Diesel                                .  
## Market.CategoryFlex Fuel,Hybrid                                   
## Market.CategoryFlex Fuel,Luxury                                ***
## Market.CategoryFlex Fuel,Luxury,High-Performance               ***
## Market.CategoryFlex Fuel,Luxury,Performance                    ***
## Market.CategoryFlex Fuel,Performance                           ** 
## Market.CategoryFlex Fuel,Performance,Hybrid                       
## Market.CategoryHatchback                                          
## Market.CategoryHatchback,Diesel                                *  
## Market.CategoryHatchback,Factory Tuner,High-Performance        ** 
## Market.CategoryHatchback,Factory Tuner,Luxury,Performance      .  
## Market.CategoryHatchback,Factory Tuner,Performance             ***
## Market.CategoryHatchback,Flex Fuel                                
## Market.CategoryHatchback,Hybrid                                ***
## Market.CategoryHatchback,Luxury                                   
## Market.CategoryHatchback,Luxury,Hybrid                         ** 
## Market.CategoryHatchback,Luxury,Performance                       
## Market.CategoryHatchback,Performance                           ***
## Market.CategoryHigh-Performance                                ***
## Market.CategoryHybrid                                          ***
## Market.CategoryLuxury                                          ***
## Market.CategoryLuxury,High-Performance                         ***
## Market.CategoryLuxury,High-Performance,Hybrid                  *  
## Market.CategoryLuxury,Hybrid                                   ***
## Market.CategoryLuxury,Performance                              ***
## Market.CategoryLuxury,Performance,Hybrid                       ***
## Market.CategoryN/A                                                
## Market.CategoryPerformance                                     ***
## Market.CategoryPerformance,Hybrid                                 
## Vehicle.SizeLarge                                              .  
## Vehicle.SizeMidsize                                            ***
## City.MpG                                                       ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7470 on 10390 degrees of freedom
## Multiple R-squared:  0.838,  Adjusted R-squared:  0.837 
## F-statistic:  657 on 82 and 10390 DF,  p-value: <0.0000000000000002


Because the model includes categorical variables, each level of a categorical predictor (other than its reference level) appears as a separate coefficient, which is why the summary lists so many terms. The Pr(>|t|) column shows how significant each coefficient is, and the stars beside it give a quicker reading of the same information. Each p-value comes from a t-test with the following hypotheses:

- H0: the coefficient equals 0 (the term has no linear effect on MSRP once the other predictors are accounted for)
- H1: the coefficient is not equal to 0

With alpha = 0.05, we reject H0 when the p-value < alpha.

Terms with a p-value above alpha show no statistically significant effect on the price of the car in this model. However, this does not mean that every such term is useless. For the best possible result, we have to look at the business side or consult experts who understand the content of our data to decide whether a predictor is truly unimportant or should still be included in the model.
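The decision rule above can be applied programmatically instead of reading the stars by eye. The following is a minimal sketch on R's built-in mtcars data, not on the car dataset of this article; the names fit, coef_table, and alpha are illustrative assumptions. It also shows that each t value is just the estimate divided by its standard error.

```r
# Toy model on the built-in mtcars data (an assumption for illustration only)
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# summary()$coefficients is a matrix with the columns
# "Estimate", "Std. Error", "t value" and "Pr(>|t|)"
coef_table <- as.data.frame(summary(fit)$coefficients)

# Flag the terms for which H0 (coefficient equals 0) is rejected
alpha <- 0.05
coef_table$significant <- coef_table[["Pr(>|t|)"]] < alpha

# Recompute the t value by hand: Estimate / Std. Error
coef_table$t_check <- coef_table$Estimate / coef_table[["Std. Error"]]

print(coef_table)
```

The same relation can be read off the summary above: the City.MpG row has a standard error of 29.7 and a t value of -7.47, which corresponds to an estimate of roughly -221.9.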

As for the result itself, this model explains the MSRP much better than the model that uses only the single predictor most strongly correlated with MSRP: the R-squared is about 84% (0.838) compared to 59%. In other words, the model accounts for a much larger share of the variation in MSRP than a single predictor can. For this reason, it is worth including as many significant predictors as possible in our model.

To understand the numerical predictors, we can take the multiple linear regression formula and substitute the coefficients obtained from our linear regression model. The formula is as follows:

\[\hat{y} = \beta_0 + \beta_1*x_1 + \beta_2*x_2 + ... + \beta_n*x_n\]

numerical predictors:
- Year
- Engine.HP
- City.MpG

formula based on the numerical predictors:

\[\hat{y} = -1861117.92 + 931.79*Year + 94.60*Engine.HP - 221.90*City.MpG\]

Take Engine.HP as an example: holding the other predictors constant, every increase of 1 horsepower raises the predicted MSRP by 94.60 US Dollars, and every decrease of 1 horsepower lowers it by the same amount. The same reading applies to the other numerical predictors. The coefficient of City.MpG is negative, so every increase of 1 in City.MpG lowers the predicted MSRP by 221.90 US Dollars, and every decrease of 1 raises it by 221.90 US Dollars. In short, the more powerful a car is, the higher its price will be, and the more economical it is, the cheaper it will be.
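This "per one unit" reading can be verified directly: in a model without interactions or transformed terms, changing one numerical predictor by 1 while keeping the others fixed shifts the prediction by exactly that predictor's coefficient. Below is a small sketch on the built-in mtcars data; the data frames base_car and plus_one are hypothetical inputs made up for the illustration.

```r
# Toy model: fuel economy explained by weight and horsepower
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Two hypothetical cars that differ by exactly one horsepower
base_car <- data.frame(wt = 3, hp = 150)
plus_one <- data.frame(wt = 3, hp = 151)

# Difference between the two predictions
diff_pred <- unname(predict(fit, plus_one) - predict(fit, base_car))

# The difference equals the hp coefficient
hp_coef <- unname(coef(fit)["hp"])
print(c(difference = diff_pred, coefficient = hp_coef))
```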

As for the categorical predictors, each coefficient describes one level of the predictor compared with that predictor's reference level (the level that does not appear in the summary). For example, for Engine.Fuel.Type, choosing the “electric” level raises the predicted MSRP by 26456.90 US Dollars relative to the reference fuel type. Remember that a car can have only one level of each categorical predictor at a time.
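To see which level a categorical coefficient is compared against, we can inspect the factor levels: by default R uses the first level as the baseline and leaves it out of the coefficient list. The sketch below uses the built-in mtcars data with cyl converted to a factor; the column name cyl8ref is an assumption introduced only for this example.

```r
cars <- mtcars
cars$cyl <- factor(cars$cyl)   # levels "4", "6", "8"

fit <- lm(mpg ~ hp + cyl, data = cars)

# The first level is the baseline and gets no coefficient of its own
reference_level <- levels(cars$cyl)[1]
print(reference_level)
print(names(coef(fit)))        # "(Intercept)" "hp" "cyl6" "cyl8"

# relevel() picks a different baseline; the fitted values stay the same,
# only the meaning of the coefficients changes
cars$cyl8ref <- relevel(cars$cyl, ref = "8")
fit2 <- lm(mpg ~ hp + cyl8ref, data = cars)
print(names(coef(fit2)))       # "(Intercept)" "hp" "cyl8ref4" "cyl8ref6"
```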

Because the categorical predictors in this data are so specific, we will also build a multiple linear regression model that uses only the numerical variables from our dataset, which lets us observe how much the categorical predictors contribute.

Numerical model for comparison using stepwise method
data_num <- model_data %>% 
  select(where(is.numeric))

model_num <- step(lm(MSRP ~ ., data_num), direction = "backward")
## Start:  AIC=192781
## MSRP ~ Year + Engine.HP + Number.of.Doors + City.MpG + Popularity
## 
##                   Df    Sum of Sq           RSS    AIC
## - Number.of.Doors  1    113533250 1032399753653 192780
## <none>                            1032286220403 192781
## - Popularity       1   9028686609 1041314907012 192870
## - City.MpG         1  10295843306 1042582063709 192883
## - Year             1 259639857771 1291926078174 195128
## - Engine.HP        1 900331830001 1932618050404 199346
## 
## Step:  AIC=192780
## MSRP ~ Year + Engine.HP + City.MpG + Popularity
## 
##              Df    Sum of Sq           RSS    AIC
## <none>                       1032399753653 192780
## - Popularity  1   9325039460 1041724793113 192872
## - City.MpG    1  10222801770 1042622555423 192881
## - Year        1 285581088484 1317980842136 195335
## - Engine.HP   1 906604943545 1939004697198 199379
summary(model_num)
## 
## Call:
## lm(formula = MSRP ~ Year + Engine.HP + City.MpG + Popularity, 
##     data = data_num)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -41124  -5850   -926   4839  61292 
## 
## Coefficients:
##                  Estimate    Std. Error t value            Pr(>|t|)    
## (Intercept) -1763652.3355    32316.8754  -54.57 <0.0000000000000002 ***
## Year             874.8201       16.2572   53.81 <0.0000000000000002 ***
## Engine.HP        140.0783        1.4610   95.88 <0.0000000000000002 ***
## City.MpG         190.8609       18.7467   10.18 <0.0000000000000002 ***
## Popularity        -0.6525        0.0671   -9.72 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9930 on 10468 degrees of freedom
## Multiple R-squared:  0.712,  Adjusted R-squared:  0.712 
## F-statistic: 6.47e+03 on 4 and 10468 DF,  p-value: <0.0000000000000002


The R-squared of about 71% is roughly 13 percentage points lower than that of the stepwise model with categorical predictors (about 84%). In other words, the categorical predictors account for only about 13 additional percentage points of the explained variation in MSRP.
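A comparison like this can also be computed rather than read off two summaries. The sketch below uses the built-in mtcars data and assumed object names (fit_num, fit_cat); it is not the article's own models. It takes the difference in adjusted R-squared and, because the smaller model is nested inside the larger one, adds a partial F-test with anova(). Note that anova() only applies to nested models; in this article the numerical model keeps Popularity, which the stepwise model does not include, so there the comparison stays informal.

```r
cars <- mtcars
cars$cyl <- factor(cars$cyl)

# Numerical predictors only, then the same model plus one categorical predictor
fit_num <- lm(mpg ~ hp + wt, data = cars)
fit_cat <- lm(mpg ~ hp + wt + cyl, data = cars)

# Gain in adjusted R-squared from adding the categorical predictor
r2_num <- summary(fit_num)$adj.r.squared
r2_cat <- summary(fit_cat)$adj.r.squared
gain <- r2_cat - r2_num
print(round(c(numeric_only = r2_num, with_factor = r2_cat, gain = gain), 3))

# Partial F-test: does the categorical predictor improve the fit significantly?
cmp <- anova(fit_num, fit_cat)
print(cmp)
```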

Model performance and evaluation

Let us try to predict the MSRP value with our models. Since we built 4 models, we can generate 4 sets of predictions and compare the performance of each model by checking its error.

Create predictions from model_HP, model_all, model_bwd, and model_num
pred_HP <- predict(model_HP, car_dataset, interval = "confidence", level = 0.95)
pred_all <- predict(model_all, car_dataset, interval = "confidence", level = 0.95)
pred_bwd <- predict(model_bwd, car_dataset, interval = "confidence", level = 0.95)
pred_num <- predict(model_num, car_dataset, interval = "confidence", level = 0.95)
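One detail worth knowing about predict() with interval = "confidence": it returns a matrix with three columns (fit, lwr, upr) rather than a plain vector. If such a matrix is passed whole to an elementwise error metric, the interval bounds are averaged in together with the point predictions, so it is usually safer to extract the fit column first. A minimal sketch on the built-in mtcars data, with assumed object names:

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

# With an interval requested, the result is a matrix, not a vector
pred <- predict(fit, mtcars, interval = "confidence", level = 0.95)
print(colnames(pred))   # "fit" "lwr" "upr"
print(dim(pred))

# Keep only the point predictions for error metrics
point_pred <- pred[, "fit"]
print(head(point_pred))
```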

Percentage of error in the prediction

Mean Absolute Percentage of Error
print(paste0("MAPE with single predictor: ", 
             round(MAPE(pred_HP, car_dataset$MSRP) * 100, digit = 3), "%"))
## [1] "MAPE with single predictor: 113.577%"
print(paste0("MAPE with all predictors: ", 
             round(MAPE(pred_all, car_dataset$MSRP) * 100, digit = 3), "%"))
## [1] "MAPE with all predictors: 51.806%"
print(paste0("MAPE with multiple predictors: ", 
             round(MAPE(pred_bwd, car_dataset$MSRP) * 100, digit = 3), "%"))
## [1] "MAPE with multiple predictors: 51.799%"
print(paste0("MAPE with numerical predictors: ",
             round(MAPE(pred_num, car_dataset$MSRP) * 100, digit = 3), "%"))
## [1] "MAPE with numerical predictors: 57.753%"

According to the MAPE (Mean Absolute Percentage Error), the stepwise model with multiple predictors gives the best predictions of the MSRP: on average its predictions are off by about 51.799% of the actual price, compared with about 113.577% for the single-predictor model and 57.753% for the numerical model. Its error is also lower, by 0.007 percentage points, than that of the model which uses all of the variables as predictors, although a gap this small is negligible. Overall, the models that include the categorical predictors perform better than the others, and the stepwise model reaches that performance with one fewer predictor.
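To make the metric concrete, MAPE is the mean of |actual - predicted| / |actual|. The helper below, mape_manual, is a hypothetical hand-written version for illustration and not the MLmetrics function itself; the numbers are made up. The second example shows why MAPE can exceed 100%: the same absolute error weighs far more on a cheap car than on an expensive one.

```r
# Hand-written MAPE: mean absolute error relative to the actual value
mape_manual <- function(y_pred, y_true) {
  mean(abs((y_true - y_pred) / y_true))
}

# Errors of 10%, 10% and 0% average to about 6.67%
actual    <- c(100, 200, 400)
predicted <- c(110, 180, 400)
print(mape_manual(predicted, actual))

# The same 4000 dollar miss is 200% of a 2000 dollar car
# but only 10% of a 40000 dollar car, giving a MAPE of 105%
actual2    <- c(2000, 40000)
predicted2 <- c(6000, 44000)
print(mape_manual(predicted2, actual2))
```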

Linearity Test

Since we model our data with linear regression, it is important to check that the relationship between the predictors and the target is actually linear. We can check this by drawing a scatter plot with the fitted values on the x axis and the residuals on the y axis, with a smooth line through the points to help us see the pattern. If the relationship is linear, the smooth line should stay close to the horizontal line at 0.

Linearity check model_HP
linear_plot_HP <- data.frame(residual = model_HP$residuals, fitted = model_HP$fitted.values)
linear_plot_HP %>%
  ggplot(aes(fitted, residual))+
  geom_point()+
  geom_smooth()+
  geom_hline(yintercept = 0, col = "red",  size = 1)+
  labs(x = "Fitted Values", y = "Residuals")+
  theme_minimal()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'


Linearity check model_all
linear_plot_all <- data.frame(residual = model_all$residuals, fitted = model_all$fitted.values)
linear_plot_all %>%
  ggplot(aes(fitted, residual))+
  geom_point()+
  geom_smooth()+
  geom_hline(yintercept = 0, col = "red",  size = 1)+
  labs(x = "Fitted Values", y = "Residuals")+
  theme_minimal()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'


Linearity check model_bwd
linear_plot_bwd <- data.frame(residual = model_bwd$residuals, fitted = model_bwd$fitted.values)
linear_plot_bwd %>%
  ggplot(aes(fitted, residual))+
  geom_point()+
  geom_smooth()+
  geom_hline(yintercept = 0, col = "red",  size = 1)+
  labs(x = "Fitted Values", y = "Residuals")+
  theme_minimal()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'


Linearity check model_num
linear_plot_num <- data.frame(residual = model_num$residuals, fitted = model_num$fitted.values)
linear_plot_num %>%
  ggplot(aes(fitted, residual))+
  geom_point()+
  geom_smooth()+
  geom_hline(yintercept = 0, col = "red",  size = 1)+
  labs(x = "Fitted Values", y = "Residuals")+
  theme_minimal()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'


When we look at the 4 charts of our models, the smooth lines are relatively stable and stay close to the zero line. Although the lines are not perfectly straight, the models can still be considered linear. Even though the single-predictor model arguably has the straightest line of the four, all of the models behave more or less the same.

Normality Test

To determine whether the residuals of our models are normally distributed, we test them with a normality test. We compare the p-value of the test against the following hypotheses:

H0: the residuals are normally distributed
H1: the residuals are not normally distributed

When the p-value < 0.05, we reject H0, which means that the residuals of our model are not normally distributed.

ad.test(model_HP$residuals)
## 
##  Anderson-Darling normality test
## 
## data:  model_HP$residuals
## A = 116, p-value <0.0000000000000002
ad.test(model_all$residuals)
## 
##  Anderson-Darling normality test
## 
## data:  model_all$residuals
## A = 48, p-value <0.0000000000000002
ad.test(model_bwd$residuals)
## 
##  Anderson-Darling normality test
## 
## data:  model_bwd$residuals
## A = 48, p-value <0.0000000000000002
ad.test(model_num$residuals)
## 
##  Anderson-Darling normality test
## 
## data:  model_num$residuals
## A = 117, p-value <0.0000000000000002


In Algoritma Data Science School, we were taught to use shapiro.test() as the basic normality test for the residuals of a model. However, shapiro.test() accepts at most 5000 observations, so for a larger sample we have to choose another function such as ad.test(). When our models were tested with the Anderson-Darling normality test, every p-value was far below 0.05, so we reject H0: the residuals of all four models are not normally distributed.
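A visual check can complement the formal test, and shapiro.test() can still be used on a random subsample when the data exceed its 5000-observation limit. The sketch below uses simulated, deliberately skewed values as a stand-in for model residuals (an assumption; these are not the article's residuals) and only base R functions.

```r
set.seed(123)

# Simulated stand-in for residuals: centred but clearly right-skewed
res <- rexp(10000) - 1

# shapiro.test() accepts 3 to 5000 values, so test a random subsample
sub <- sample(res, 5000)
sw <- shapiro.test(sub)
print(sw$p.value)

# Q-Q plot: points that bend away from the red line indicate non-normality
qqnorm(res)
qqline(res, col = "red")
```

With samples this large, formal tests tend to flag even small departures from normality, so the Q-Q plot helps judge how severe the departure actually is.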

Homoscedasticity

For the models which we have made, we also want to make sure that the errors do not have a pattern, that is, that their spread stays roughly constant across the fitted values. If there is a pattern, there might be outliers which we have to deal with. The condition in which the errors do have a pattern is called heteroscedasticity. We can test for it with the Breusch-Pagan test, using the following hypotheses:
H0: the model is homoscedastic
H1: the model is heteroscedastic
When the p-value < 0.05, we reject H0, which means that our model is heteroscedastic.

bptest(model_HP)
## 
##  studentized Breusch-Pagan test
## 
## data:  model_HP
## BP = 643, df = 1, p-value <0.0000000000000002
bptest(model_all)
## 
##  studentized Breusch-Pagan test
## 
## data:  model_all
## BP = 2075, df = 83, p-value <0.0000000000000002
bptest(model_bwd)
## 
##  studentized Breusch-Pagan test
## 
## data:  model_bwd
## BP = 2073, df = 82, p-value <0.0000000000000002
bptest(model_num)
## 
##  studentized Breusch-Pagan test
## 
## data:  model_num
## BP = 1485, df = 4, p-value <0.0000000000000002


The results show that all of our models have a p-value below 0.05, so we reject H0: every model is heteroscedastic. If we were to plot the residuals against the fitted MSRP, we would expect to see a pattern, such as a spread that changes as the fitted value grows.
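To show what the test measures, the studentized Breusch-Pagan statistic can be reproduced by hand. To our understanding, it equals n times the R-squared of an auxiliary regression of the squared residuals on the predictors, compared against a chi-squared distribution with as many degrees of freedom as there are predictors. This is a base R sketch on the built-in mtcars data, with assumed object names, not a replacement for bptest().

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Auxiliary regression: can the predictors explain the squared residuals?
aux <- lm(resid(fit)^2 ~ wt + hp, data = mtcars)

# Test statistic: n * R-squared of the auxiliary regression
n <- nrow(mtcars)
bp_stat <- n * summary(aux)$r.squared

# Degrees of freedom = number of predictors in the auxiliary regression
bp_df <- length(coef(aux)) - 1
p_value <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

print(c(BP = bp_stat, df = bp_df, p_value = p_value))
```

A small p-value means the squared residuals depend on the predictors, which is exactly what heteroscedasticity describes.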

Multicollinearity

It is important to make sure that the predictors in our model are not strongly correlated with each other. If some predictors are strongly correlated, the standard errors of their coefficients are inflated, which makes it harder to tell how significant each individual predictor is for our target, the MSRP. For a model with a single predictor such as model_HP, this test is not necessary.

vif(model_all)
##                       GVIF Df GVIF^(1/(2*Df))
## Year                 2.835  1           1.684
## Engine.Fuel.Type  2463.299  9           1.543
## Engine.HP            5.898  1           2.429
## Transmission.Type   15.946  3           1.587
## Driven.Wheels        3.373  3           1.225
## Number.of.Doors      1.574  1           1.254
## Market.Category   3067.786 61           1.068
## Vehicle.Size         2.744  2           1.287
## City.MpG             7.169  1           2.678
## Popularity           1.168  1           1.081
vif(model_bwd)
##                       GVIF Df GVIF^(1/(2*Df))
## Year                 2.815  1           1.678
## Engine.Fuel.Type  2436.564  9           1.542
## Engine.HP            5.893  1           2.427
## Transmission.Type   15.800  3           1.584
## Driven.Wheels        3.356  3           1.224
## Number.of.Doors      1.558  1           1.248
## Market.Category   2865.424 61           1.067
## Vehicle.Size         2.720  2           1.284
## City.MpG             7.166  1           2.677
vif(model_num)
##       Year  Engine.HP   City.MpG Popularity 
##      1.524      1.772      1.615      1.015


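The multicollinearity test shows that 3 predictors in the model which uses all variables, and the same 3 predictors in the stepwise model (Engine.Fuel.Type, Transmission.Type and Market.Category), have a GVIF value above 10. Furthermore, Market.Category and Engine.Fuel.Type have GVIF values above 2000. This might be caused by the number of levels inside those categorical predictors: the more levels a categorical predictor has, and the more of those levels correlate with other predictors, the higher the GVIF value will be. Note that the GVIF^(1/(2*Df)) column, which adjusts for the degrees of freedom (Df) of each predictor, stays below 3 for every predictor, so the raw GVIF of a factor with many levels should be read with caution. In the model with only numerical predictors, every VIF value is below 2.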
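The relation between the first and third columns of the vif() output can be checked with a few lines of arithmetic. The sketch below copies four GVIF and Df values from the vif(model_all) output above; the vector names gvif, df and adjusted are only illustrative.

```r
# GVIF and Df values copied from the vif(model_all) output
gvif <- c(Engine.Fuel.Type = 2463.299, Transmission.Type = 15.946,
          Market.Category = 3067.786, Engine.HP = 5.898)
df   <- c(Engine.Fuel.Type = 9, Transmission.Type = 3,
          Market.Category = 61, Engine.HP = 1)

# Df-adjusted value, matching the GVIF^(1/(2*Df)) column printed by vif()
adjusted <- gvif^(1 / (2 * df))
round(adjusted, 3)

# Squaring the adjusted value is a common way to compare it
# with the usual VIF rules of thumb
round(adjusted^2, 3)
```

Because Market.Category has 61 degrees of freedom, its very large raw GVIF shrinks to a value close to 1 after the adjustment, while Engine.HP, with a single degree of freedom, keeps the largest adjusted value of the four.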

Conclusion

As we progressed through working on the data, we realized that choosing the right data is very important. Data with many categorical variables, especially very specific ones such as Market.Category, potentially produce a higher number of correlating predictor levels, which we want to avoid when building a model on those predictors. Unfortunately, using only numerical predictors in our model reduces the R-squared value, because the categorical variables also describe part of a car model's specification. In our case, the R-squared was reduced to around 13% compared to our models which have categorical predictors. For a linear regression model, we recommend using data in which numerical variables dominate. If there are any categorical variables present, we would like to make sure that they have general rather than very specific categories, unlike the Market.Category values seen in the dataset that we are working on in this markdown.

However, this does not mean that our model is completely useless. If we do not have any other option besides using this data, we can apply feature engineering to the problematic variables if possible. The results show that our model needs improvement, as 3 predictors were correlated with the other predictors based on the multicollinearity test. Furthermore, 2 of those predictors have GVIF values that are unreasonably far above 10. Further data processing or feature engineering might be the solution to this problem.
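One possible form of such feature engineering, sketched here under assumptions, is to merge rarely occurring levels of a very specific categorical variable into a single "Other" level. The factor market and the helper lump_rare() below are hypothetical; the level names only imitate the style of Market.Category, and the threshold min_count would need to be chosen for the real data.

```r
# Hypothetical stand-in for a very specific column such as Market.Category
market <- factor(c("Luxury", "Luxury", "Crossover", "Crossover", "Crossover",
                   "Exotic,Luxury,High-Performance", "Hatchback,Hybrid",
                   "Luxury"))

# Replace every level that occurs fewer than min_count times with "Other"
lump_rare <- function(f, min_count) {
  counts <- table(f)
  rare <- names(counts)[counts < min_count]
  out <- as.character(f)
  out[out %in% rare] <- "Other"
  factor(out)
}

market_lumped <- lump_rare(market, min_count = 2)

nlevels(market)         # 4 levels before lumping
nlevels(market_lumped)  # 3 levels after lumping
table(market_lumped)
```

Fewer levels mean fewer dummy variables in the regression, which may lower the GVIF of the predictor. Whether this helps or costs too much R-squared would have to be checked by refitting the model.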