Connecting Data, Information and Knowledge

Typical questions to answer:

  • How has fuel economy changed over time?
  • Are there any other interesting insights or trends?

EDA

Dataset

The Fuel Economy dataset

  • 42002 Rows
  • x Variables
##    barrels08       barrelsA08        charge120   charge240       
##  Min.   : 0.06   Min.   : 0.0000   Min.   :0   Min.   : 0.00000  
##  1st Qu.:14.33   1st Qu.: 0.0000   1st Qu.:0   1st Qu.: 0.00000  
##  Median :16.48   Median : 0.0000   Median :0   Median : 0.00000  
##  Mean   :17.36   Mean   : 0.2201   Mean   :0   Mean   : 0.03609  
##  3rd Qu.:19.39   3rd Qu.: 0.0000   3rd Qu.:0   3rd Qu.: 0.00000  
##  Max.   :47.09   Max.   :18.3117   Max.   :0   Max.   :12.00000  
##                                                                  
##      city08          city08U           cityA08            cityA08U       
##  Min.   :  6.00   Min.   :  0.000   Min.   :  0.0000   Min.   :  0.0000  
##  1st Qu.: 15.00   1st Qu.:  0.000   1st Qu.:  0.0000   1st Qu.:  0.0000  
##  Median : 17.00   Median :  0.000   Median :  0.0000   Median :  0.0000  
##  Mean   : 18.21   Mean   :  5.495   Mean   :  0.6161   Mean   :  0.4662  
##  3rd Qu.: 20.00   3rd Qu.: 12.274   3rd Qu.:  0.0000   3rd Qu.:  0.0000  
##  Max.   :150.00   Max.   :150.000   Max.   :145.0000   Max.   :145.0835  
##                                                                          
##      cityCD             cityE              cityUF              co2        
##  Min.   :0.000000   Min.   :  0.0000   Min.   :0.000000   Min.   : -1.00  
##  1st Qu.:0.000000   1st Qu.:  0.0000   1st Qu.:0.000000   1st Qu.: -1.00  
##  Median :0.000000   Median :  0.0000   Median :0.000000   Median : -1.00  
##  Mean   :0.000471   Mean   :  0.2741   Mean   :0.001279   Mean   : 80.11  
##  3rd Qu.:0.000000   3rd Qu.:  0.0000   3rd Qu.:0.000000   3rd Qu.: -1.00  
##  Max.   :5.350000   Max.   :122.0000   Max.   :0.896000   Max.   :847.00  
##                                                                           
##       co2A         co2TailpipeAGpm  co2TailpipeGpm       comb08      
##  Min.   : -1.000   Min.   :  0.00   Min.   :   0.0   Min.   :  7.00  
##  1st Qu.: -1.000   1st Qu.:  0.00   1st Qu.: 386.4   1st Qu.: 17.00  
##  Median : -1.000   Median :  0.00   Median : 447.0   Median : 20.00  
##  Mean   :  5.713   Mean   : 17.72   Mean   : 468.5   Mean   : 20.46  
##  3rd Qu.: -1.000   3rd Qu.:  0.00   3rd Qu.: 523.0   3rd Qu.: 23.00  
##  Max.   :713.000   Max.   :713.00   Max.   :1269.6   Max.   :136.00  
##                                                                      
##     comb08U           combA08            combA08U            combE         
##  Min.   :  0.000   Min.   :  0.0000   Min.   :  0.0000   Min.   :  0.0000  
##  1st Qu.:  0.000   1st Qu.:  0.0000   1st Qu.:  0.0000   1st Qu.:  0.0000  
##  Median :  0.000   Median :  0.0000   Median :  0.0000   Median :  0.0000  
##  Mean   :  6.149   Mean   :  0.6771   Mean   :  0.5042   Mean   :  0.2804  
##  3rd Qu.: 14.273   3rd Qu.:  0.0000   3rd Qu.:  0.0000   3rd Qu.:  0.0000  
##  Max.   :136.000   Max.   :133.0000   Max.   :133.2662   Max.   :121.0000  
##                                                                            
##    combinedCD         combinedUF         cylinders          displ      
##  Min.   :0.000000   Min.   :0.000000   Min.   : 2.000   Min.   :0.000  
##  1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.: 4.000   1st Qu.:2.200  
##  Median :0.000000   Median :0.000000   Median : 6.000   Median :3.000  
##  Mean   :0.000363   Mean   :0.001261   Mean   : 5.722   Mean   :3.302  
##  3rd Qu.:0.000000   3rd Qu.:0.000000   3rd Qu.: 6.000   3rd Qu.:4.300  
##  Max.   :4.800000   Max.   :0.888000   Max.   :16.000   Max.   :8.400  
##                                        NA's   :171      NA's   :169    
##                         drive           engId                    eng_dscr    
##  Front-Wheel Drive         :13939   Min.   :    0                    :15899  
##  Rear-Wheel Drive          :13539   1st Qu.:    0   (FFS)            : 8827  
##  4-Wheel or All-Wheel Drive: 6648   Median :  186   SIDI             : 4902  
##  All-Wheel Drive           : 2713   Mean   : 8377   (FFS) CA model   :  926  
##  4-Wheel Drive             : 1328   3rd Qu.: 4301   (FFS)      (MPFI):  734  
##                            : 1189   Max.   :69102   FFV              :  683  
##  (Other)                   :  725                   (Other)          : 8110  
##     feScore          fuelCost08    fuelCostA08                 fuelType    
##  Min.   :-1.0000   Min.   : 500   Min.   :   0.00   Regular        :25997  
##  1st Qu.:-1.0000   1st Qu.:1950   1st Qu.:   0.00   Premium        :11067  
##  Median :-1.0000   Median :2350   Median :   0.00   Gasoline or E85: 1287  
##  Mean   : 0.2389   Mean   :2378   Mean   :  90.06   Diesel         : 1142  
##  3rd Qu.:-1.0000   3rd Qu.:2700   3rd Qu.:   0.00   Electricity    :  168  
##  Max.   :10.0000   Max.   :7350   Max.   :3800.00   Premium or E85 :  125  
##                                                     (Other)        :  295  
##              fuelType1        ghgScore        ghgScoreA         highway08     
##  Diesel           : 1142   Min.   :-1.000   Min.   :-1.0000   Min.   :  9.00  
##  Electricity      :  168   1st Qu.:-1.000   1st Qu.:-1.0000   1st Qu.: 20.00  
##  Midgrade Gasoline:  100   Median :-1.000   Median :-1.0000   Median : 24.00  
##  Natural Gas      :   60   Mean   : 0.237   Mean   :-0.9224   Mean   : 24.35  
##  Premium Gasoline :11267   3rd Qu.:-1.000   3rd Qu.:-1.0000   3rd Qu.: 28.00  
##  Regular Gasoline :27344   Max.   :10.000   Max.   : 8.0000   Max.   :123.00  
##                                                                               
##    highway08U        highwayA08        highwayA08U         highwayCD       
##  Min.   :  0.000   Min.   :  0.0000   Min.   :  0.0000   Min.   :0.000000  
##  1st Qu.:  0.000   1st Qu.:  0.0000   1st Qu.:  0.0000   1st Qu.:0.000000  
##  Median :  0.000   Median :  0.0000   Median :  0.0000   Median :0.000000  
##  Mean   :  7.283   Mean   :  0.7826   Mean   :  0.5751   Mean   :0.000242  
##  3rd Qu.: 17.546   3rd Qu.:  0.0000   3rd Qu.:  0.0000   3rd Qu.:0.000000  
##  Max.   :123.340   Max.   :121.0000   Max.   :121.2005   Max.   :4.060000  
##                                                                            
##     highwayE          highwayUF             hlv             hpv        
##  Min.   :  0.0000   Min.   :0.000000   Min.   : 0.00   Min.   :  0.00  
##  1st Qu.:  0.0000   1st Qu.:0.000000   1st Qu.: 0.00   1st Qu.:  0.00  
##  Median :  0.0000   Median :0.000000   Median : 0.00   Median :  0.00  
##  Mean   :  0.2885   Mean   :0.001237   Mean   : 2.02   Mean   : 10.36  
##  3rd Qu.:  0.0000   3rd Qu.:0.000000   3rd Qu.: 0.00   3rd Qu.:  0.00  
##  Max.   :120.0000   Max.   :0.877000   Max.   :49.00   Max.   :195.00  
##                                                                        
##        id             lv2              lv4                make      
##  Min.   :    1   Min.   : 0.000   Min.   : 0.000   Chevrolet: 3944  
##  1st Qu.:10021   1st Qu.: 0.000   1st Qu.: 0.000   Ford     : 3284  
##  Median :20042   Median : 0.000   Median : 0.000   Dodge    : 2559  
##  Mean   :20154   Mean   : 1.815   Mean   : 6.139   GMC      : 2471  
##  3rd Qu.:30331   3rd Qu.: 0.000   3rd Qu.:13.000   Toyota   : 2010  
##  Max.   :40434   Max.   :41.000   Max.   :55.000   BMW      : 1856  
##                                                    (Other)  :23957  
##              model       mpgData   phevBlended        pv2        
##  F150 Pickup 2WD:  215   N:27367   false:40005   Min.   :  0.00  
##  F150 Pickup 4WD:  193   Y:12714   true :   76   1st Qu.:  0.00  
##  Mustang        :  192                           Median :  0.00  
##  Jetta          :  190                           Mean   : 13.56  
##  Truck 2WD      :  187                           3rd Qu.:  0.00  
##  Camaro         :  174                           Max.   :194.00  
##  (Other)        :38930                                           
##       pv4             range            rangeCity          rangeCityA       
##  Min.   :  0.00   Min.   :  0.0000   Min.   :  0.0000   Min.   :  0.00000  
##  1st Qu.:  0.00   1st Qu.:  0.0000   1st Qu.:  0.0000   1st Qu.:  0.00000  
##  Median :  0.00   Median :  0.0000   Median :  0.0000   Median :  0.00000  
##  Mean   : 33.82   Mean   :  0.6164   Mean   :  0.5786   Mean   :  0.06753  
##  3rd Qu.: 91.00   3rd Qu.:  0.0000   3rd Qu.:  0.0000   3rd Qu.:  0.00000  
##  Max.   :192.00   Max.   :335.0000   Max.   :333.1115   Max.   :103.03000  
##                                                                            
##     rangeHwy          rangeHwyA                    trany           UCity       
##  Min.   :  0.0000   Min.   : 0.00000   Automatic 4-spd:11045   Min.   :  0.00  
##  1st Qu.:  0.0000   1st Qu.: 0.00000   Manual 5-spd   : 8351   1st Qu.: 18.11  
##  Median :  0.0000   Median : 0.00000   Automatic 3-spd: 3151   Median : 21.30  
##  Mean   :  0.5639   Mean   : 0.06257   Automatic (S6) : 2984   Mean   : 22.98  
##  3rd Qu.:  0.0000   3rd Qu.: 0.00000   Manual 6-spd   : 2671   3rd Qu.: 25.70  
##  Max.   :346.9000   Max.   :90.55000   Automatic 5-spd: 2198   Max.   :224.80  
##                                        (Other)        : 9681                   
##      UCityA            UHighway        UHighwayA      
##  Min.   :  0.0000   Min.   :  0.00   Min.   :  0.000  
##  1st Qu.:  0.0000   1st Qu.: 27.66   1st Qu.:  0.000  
##  Median :  0.0000   Median : 33.02   Median :  0.000  
##  Mean   :  0.7894   Mean   : 34.11   Mean   :  1.077  
##  3rd Qu.:  0.0000   3rd Qu.: 38.84   3rd Qu.:  0.000  
##  Max.   :207.2622   Max.   :182.70   Max.   :173.144  
##                                                       
##                          VClass           year       youSaveSpend    guzzler  
##  Compact Cars               : 5751   Min.   :1984   Min.   :-29000    :37704  
##  Subcompact Cars            : 5036   1st Qu.:1991   1st Qu.: -5750   G: 1398  
##  Midsize Cars               : 4697   Median :2002   Median : -4000   S:   15  
##  Standard Pickup Trucks     : 2354   Mean   :2001   Mean   : -4135   T:  964  
##  Sport Utility Vehicle - 4WD: 2090   3rd Qu.:2011   3rd Qu.: -2000            
##  Large Cars                 : 2072   Max.   :2019   Max.   :  5250            
##  (Other)                    :18081                                            
##            trans_dscr    tCharger       sCharger            atvType     
##                 :25034   Mode:logical    :39285                 :36707  
##  CLKUP          : 7809   TRUE:6302      S:  796   FFV           : 1412  
##  SIL            : 2189   NA's:33779               Diesel        : 1070  
##  2MODE CLKUP    : 1235                            Hybrid        :  539  
##  Creeper        :  525                            EV            :  168  
##  EMS 2MODE CLKUP:  520                            Plug-in Hybrid:  107  
##  (Other)        : 2769                            (Other)       :   78  
##        fuelType2         rangeA             evMotor         mfrCode     
##             :38534          :38539              :39345          :30818  
##  E85        : 1412   290    :   74   288V Ni-MH :  122   GMX    : 1366  
##  Electricity:  107   270    :   57   245V Ni-MH :   48   BMX    : 1072  
##  Natural Gas:   20   280    :   54   144V Li-Ion:   28   FMX    :  792  
##  Propane    :    8   310    :   41   270V Li-Ion:   26   CRX    :  703  
##                      277    :   38   330V Ni-MH :   26   TYX    :  662  
##                      (Other): 1278   (Other)    :  486   (Other): 4668  
##              c240Dscr       charge240b                     c240bDscr    
##                  :40016   Min.   :0.000000                      :40018  
##  3.6 kW charger  :    4   1st Qu.:0.000000   3.6 kW charger     :    2  
##  6.6 kW charger  :    2   Median :0.000000   6.6 kW charger     :    4  
##  7.2 kW charger  :    2   Mean   :0.007497   80 amp dual charger:   54  
##  single charger  :    3   3rd Qu.:0.000000   dual charger       :    3  
##  standard charger:   54   Max.   :8.500000                              
##                                                                         
##                         createdOn                            modifiedOn   
##  Tue Jan 01 00:00:00 EST 2013:34217   Tue Jan 01 00:00:00 EST 2013:29437  
##  Fri Jun 06 00:00:00 EDT 2014:  100   Mon Sep 26 00:00:00 EDT 2016: 6346  
##  Thu Jul 07 00:00:00 EDT 2016:  100   Wed Apr 05 00:00:00 EDT 2017:  911  
##  Thu Jul 31 00:00:00 EDT 2014:   95   Wed Dec 20 00:00:00 EST 2017:  280  
##  Wed Jun 13 00:00:00 EDT 2018:   93   Tue Nov 22 00:00:00 EST 2016:  191  
##  Thu Aug 03 00:00:00 EDT 2017:   86   Fri Dec 02 00:00:00 EST 2016:  183  
##  (Other)                     : 5390   (Other)                     : 2733  
##  startStop    phevCity          phevHwy           phevComb      
##   :31704   Min.   : 0.0000   Min.   : 0.0000   Min.   : 0.0000  
##  N: 5695   1st Qu.: 0.0000   1st Qu.: 0.0000   1st Qu.: 0.0000  
##  Y: 2682   Median : 0.0000   Median : 0.0000   Median : 0.0000  
##            Mean   : 0.1229   Mean   : 0.1234   Mean   : 0.1225  
##            3rd Qu.: 0.0000   3rd Qu.: 0.0000   3rd Qu.: 0.0000  
##            Max.   :97.0000   Max.   :81.0000   Max.   :88.0000  
## 
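
A summary like the one above can be produced with base R. The following is a minimal sketch on a hypothetical toy data frame: only the name vhcls_data and the column names come from this report, the values are made up, and the real data would be loaded from a file first (for example with read.csv, using whatever file name the download has).

```r
# Hypothetical toy stand-in for the vehicles data (values are illustrative only)
vhcls_data <- data.frame(
  year  = c(1984, 1995, 2010, 2019),
  UCity = c(19.0, 21.3, 25.7, 150.0),
  drive = c("Front-Wheel Drive", "Rear-Wheel Drive",
            "Front-Wheel Drive", "All-Wheel Drive"),
  stringsAsFactors = TRUE
)

# Number of rows and number of variables
data_dim <- dim(vhcls_data)
print(data_dim)

# Quantiles for numeric columns, level counts for factor columns
print(summary(vhcls_data))
```

Character columns need to be factors (as forced here with stringsAsFactors = TRUE) for summary() to print level counts rather than just the column class.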

The Dependent Variable

UCity

  • Definition
    • UCity - unadjusted city MPG for fuelType1
    • UCityA - unadjusted city MPG for fuelType2

  • Remarks
    • The median UCity (21.30) seems to be unchanged over the years
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00   18.11   21.30   22.98   25.70  224.80

  • The right-skewed distribution of UCity points to a different class of vehicles appearing in the last ten years.

  • There are 169 vehicles with UCity > 75. As expected, the dominant majority (150+) are electric vehicles.

  • 25 vehicles have a UCity of zero:
## [1] 25
##            make                             model                fuelType
## 8128       Ford          F150 Dual-fuel 2WD (CNG) Gasoline or natural gas
## 8129       Ford          F150 Dual-fuel 4WD (CNG) Gasoline or natural gas
## 8130       Ford          F150 Dual-fuel 2WD (LPG)     Gasoline or propane
## 8131       Ford          F150 Dual-fuel 4WD (LPG)     Gasoline or propane
## 9175      Dodge              Ram Van 2500 2WD CNG                     CNG
## 9176      Dodge            Ram Wagon 2500 2WD CNG                     CNG
## 9184       Ford          F150 Dual-fuel 2WD (CNG) Gasoline or natural gas
## 9185       Ford          F150 Dual-fuel 4WD (CNG) Gasoline or natural gas
## 9186       Ford          F150 Dual-fuel 2WD (LPG)     Gasoline or propane
## 9187       Ford          F150 Dual-fuel 4WD (LPG)     Gasoline or propane
## 10283      Ford          F150 Dual-fuel 2WD (LPG)     Gasoline or propane
## 10284      Ford          F150 Dual-fuel 4WD (LPG)     Gasoline or propane
## 11585      Ford          F150 Dual-fuel 2WD (LPG)     Gasoline or propane
## 11586      Ford          F150 Dual-fuel 4WD (LPG)     Gasoline or propane
## 11587 Chevrolet           Express Cargo (Bi-fuel) Gasoline or natural gas
## 11588 Chevrolet       Express Passenger (Bi-fuel) Gasoline or natural gas
## 11589       GMC          Savana (cargo) (Bi-fuel) Gasoline or natural gas
## 11590       GMC        Savana Passenger (Bi-fuel) Gasoline or natural gas
## 11592 Chevrolet     Express Cargo (dedicated CNG)                     CNG
## 11593 Chevrolet Express Passenger (dedicated CNG)                     CNG
## 11594       GMC      Savana Cargo (dedicated CNG)                     CNG
## 11595       GMC  Savana Passenger (dedicated CNG)                     CNG
## 12815     Dodge         Caravan/Grand Caravan 2WD         Gasoline or E85
## 12816  Chrysler      Voyager/Town and Country 2WD         Gasoline or E85
## 21506   Porsche                             924 S                 Regular
##            atvType
## 8128  Bifuel (CNG)
## 8129  Bifuel (CNG)
## 8130  Bifuel (LPG)
## 8131  Bifuel (LPG)
## 9175           CNG
## 9176           CNG
## 9184  Bifuel (CNG)
## 9185  Bifuel (CNG)
## 9186  Bifuel (LPG)
## 9187  Bifuel (LPG)
## 10283 Bifuel (LPG)
## 10284 Bifuel (LPG)
## 11585 Bifuel (LPG)
## 11586 Bifuel (LPG)
## 11587 Bifuel (CNG)
## 11588 Bifuel (CNG)
## 11589 Bifuel (CNG)
## 11590 Bifuel (CNG)
## 11592          CNG
## 11593          CNG
## 11594          CNG
## 11595          CNG
## 12815          FFV
## 12816          FFV
## 21506
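
The UCity remarks above (median per year, the count above 75, and the zero-valued rows) can be reproduced along these lines. This is a sketch on hypothetical toy data, not the code behind the report; only the column names UCity, year and atvType are taken from the dataset.

```r
# Hypothetical toy data; only the column names mirror the dataset
vhcls_data <- data.frame(
  year    = c(1990, 1990, 2015, 2015, 2015),
  UCity   = c(20.0, 0.0, 22.0, 150.0, 24.0),
  atvType = c("", "CNG", "", "EV", "Hybrid")
)

# Median UCity per model year
median_by_year <- aggregate(UCity ~ year, data = vhcls_data, FUN = median)

# Vehicles above the UCity > 75 cut-off, and how many of them are EVs
high_ucity <- vhcls_data[vhcls_data$UCity > 75, ]
n_high     <- nrow(high_ucity)
n_high_ev  <- sum(high_ucity$atvType == "EV")

# Vehicles with a zero UCity value
zero_ucity <- vhcls_data[vhcls_data$UCity == 0, ]

print(median_by_year)
print(c(n_high = n_high, n_high_ev = n_high_ev))
print(zero_ucity)
```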

Independent Variables

Numeric

atvType

  • Definition

    • type of alternative fuel or advanced technology vehicle
  • Remarks

    • Missing values: atvType is not available for the majority of vehicles, 38248 (91%). The NA’s are replaced with Not Available for possible use in modeling.
##                summary.vhcls_data.atvType.
## Bifuel (CNG)                            20
## Bifuel (LPG)                             8
## CNG                                     50
## Diesel                                1070
## EV                                     168
## FFV                                   1412
## Hybrid                                 539
## Not Available                        36707
## Plug-in Hybrid                         107
  • Plotted over time, the remaining 3708 vehicles exhibit three trends:
  • A sharp drop in diesel vehicles to a low in the late nineties, with a slight pick-up later
  • A strong rise in FFV (flexible-fuel vehicles) during the 2000-2010 period, followed by a sharp drop to the lowest level
## [1] Diesel         CNG            EV             Bifuel (CNG)   FFV           
## [6] Hybrid         Bifuel (LPG)   Plug-in Hybrid
## 9 Levels: Bifuel (CNG) Bifuel (LPG) CNG Diesel EV FFV Hybrid ... Plug-in Hybrid

  • After 2010, a consistent rise in Hybrid and EV vehicles.

  • Looking at individual atvType values against UCity, Hybrid and EV are strong predictors.

  • Given the limited number of vehicles, it is not safe to generalize the above trends to all vehicles. I am therefore inclined to drop this attribute as it stands; with the NA’s treated, however, it has significant predictive power.
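
The NA treatment described for atvType can be sketched as follows. The toy values are hypothetical, and the sketch assumes that missing entries arrive as empty strings or NA, as the blank level in the summary output suggests.

```r
# Hypothetical toy data: blank strings stand for missing atvType values
vhcls_data <- data.frame(
  atvType = factor(c("", "Diesel", "", "EV", "Hybrid", ""))
)

# Work on the character values so that a new label can be introduced
atv <- as.character(vhcls_data$atvType)
atv[is.na(atv) | atv == ""] <- "Not Available"

# Rebuild the factor; the blank level disappears, "Not Available" is added
vhcls_data$atvType <- factor(atv)

atv_counts <- table(vhcls_data$atvType)
print(atv_counts)
```

Keeping the missing group as an explicit level lets a model use "no alternative technology" as information instead of dropping those rows.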

barrelsA08

  • Definitions

    • barrels08 - annual petroleum consumption in barrels for fuelType1
    • barrelsA08 - annual petroleum consumption in barrels for fuelType2
  • Remarks

    • The median annual petroleum consumption (barrels08) starts to decrease after 2010.

chargeXXXX

  • Definition
    • charge120 : time to charge an electric vehicle in hours at 120 V
    • charge240 : time to charge an electric vehicle in hours at 240 V
    • charge240b : time to charge an electric vehicle in hours at 240 V using the alternate charger
  • Remarks
    • There are no vehicles with 120 V charging. Drop this variable.
    • charge240: 252 vehicles have a non-zero value. All non-EVs (39829) have a zero value for this variable.
    • charge240b: 62 vehicles have a non-zero value. All non-EVs (39829) have a zero value for this variable.
    • Because the charge columns have nearly zero variance, I am inclined to drop them from the model.
    • Out of interest: what is the correlation between UCity and charge240?
    ##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    ##       0       0       0       0       0       0

    ## [1] 252
    ## [1] 62
    ##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
    ## 0.000000 0.000000 0.000000 0.007497 0.000000 8.500000
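
One way to check the near-zero variance argument is to look at the share of non-zero values per charge column and to flag columns that are constant. The sketch below uses hypothetical toy values; the column names are the only part taken from the dataset.

```r
# Hypothetical toy data with one constant and one nearly constant column
vhcls_data <- data.frame(
  charge120 = c(0, 0, 0, 0, 0, 0),
  charge240 = c(0, 0, 0, 0, 8, 0),
  UCity     = c(20, 22, 19, 25, 140, 21)
)

charge_cols <- c("charge120", "charge240")

# Share of vehicles with a non-zero value in each charge column
share_nonzero <- sapply(vhcls_data[charge_cols], function(x) mean(x != 0))

# Columns holding a single distinct value carry no information at all
is_constant   <- sapply(vhcls_data[charge_cols],
                        function(x) length(unique(x)) == 1)
constant_cols <- names(which(is_constant))

# Correlation with the target for the column that does vary
charge_cor <- cor(vhcls_data$UCity, vhcls_data$charge240)

print(share_nonzero)
print(constant_cols)
print(charge_cor)
```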

cityXXXX

  • Definitions
    • city08 : city MPG for fuelType1
    • city08U : unrounded city MPG for fuelType1
    • cityA08 : city MPG for fuelType2
    • cityA08U : unrounded city MPG for fuelType2
    • cityCD - city gasoline consumption (gallons/100 miles) in charge depleting mode (4)
    • cityE - city electricity consumption in kw-hrs/100 miles
    • cityUF - EPA city utility factor (share of electricity) for PHEV
    • phevCity - EPA composite gasoline-electricity city MPGe for plug-in hybrid vehicles
  • Remarks
    • city08 is strongly correlated (0.9971) with UCity. Since UCity is the unadjusted city MPG for fuelType1, including city08 in the model amounts to including UCity among the independent variables.

  • Such variables leak the target and produce too-good-to-be-true results: the predictions are not really predictions but a one-to-one mapping table. Drop city08 from the model.

  • The other city MPG variables are mostly populated only after 2010 (for EV and Hybrid vehicles).

## 
## Call:
## lm(formula = UCity ~ phevCity, data = vhcls_data)
## 
## Coefficients:
## (Intercept)     phevCity  
##     22.9274       0.4425
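
The leakage check on city08 can be sketched like this. The toy values and the 0.99 cut-off are assumptions made for illustration; the report itself only states the observed correlation of 0.9971.

```r
# Hypothetical toy data in which city08 is almost a rescaled copy of UCity
vhcls_data <- data.frame(
  UCity  = c(18.1, 21.3, 25.7, 30.2, 150.0),
  city08 = c(15, 17, 20, 23, 110)
)

# Pearson correlation between the target and the candidate predictor
leak_cor <- cor(vhcls_data$UCity, vhcls_data$city08)

# Assumed cut-off above which a predictor is treated as a copy of the target
leak_threshold <- 0.99

# Drop the predictor when it is effectively the target itself
if (leak_cor > leak_threshold) {
  vhcls_data$city08 <- NULL
}

print(leak_cor)
print(names(vhcls_data))
```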

co2XXXX

  • Definition
    • co2 - tailpipe CO2 in grams/mile for fuelType1
    • co2A - tailpipe CO2 in grams/mile for fuelType2
    • co2TailpipeAGpm - tailpipe CO2 in grams/mile for fuelType2
    • co2TailpipeGpm - tailpipe CO2 in grams/mile for fuelType1
  • Remarks
    • 31954 vehicles have no co2 information; this is represented as -1.
    • Few pre-2013 vehicles have co2 information
    • Premium and Regular fuelType vehicles are the top co2 emitters
## [1] 31954

  • Interestingly, the 168 EVs also report co2.
  • However, the 168 EVs report zero co2, as expected for a tailpipe measure.
  • The correlation between UCity and co2 is strongly negative (-0.75). At least for the roughly 10k vehicles in the dataset with co2 information, higher emission levels go with lower MPG: more energy, fewer miles.

## NULL
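
Because -1 is a sentinel rather than a measurement, it should be turned into NA before any correlation is computed. A minimal sketch on hypothetical toy values:

```r
# Hypothetical toy data: -1 marks missing co2, 0 is a genuine EV reading
vhcls_data <- data.frame(
  co2   = c(-1, -1, 450, 300, 0, 520),
  UCity = c(20, 18, 19, 30, 140, 16)
)

# Count the sentinel values
n_missing <- sum(vhcls_data$co2 == -1)

# Recode the sentinel to NA so it does not distort statistics
co2_clean <- ifelse(vhcls_data$co2 == -1, NA, vhcls_data$co2)

# Correlation on the rows that actually have a co2 value
co2_cor <- cor(vhcls_data$UCity, co2_clean, use = "complete.obs")

print(n_missing)
print(co2_cor)
```

Leaving the -1 values in place would pull the correlation toward zero, since most rows would share the same artificial co2 value.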

cylinders

  • Definition
    • engine cylinders
  • Remarks
    • 171 NA’s
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   2.000   4.000   6.000   5.722   6.000  16.000     171
  • Treated as a continuous variable, which makes no sense. The type should be cast to categorical.
  • The values present are 2, 3, 4, 5, 6, 8, 10, 12 and 16 cylinders.
  • Most vehicles (30k+) have 4, 6 or 8 cylinders

##             UCity cylinders
## UCity      1.0000   -0.8157
## cylinders -0.8157    1.0000
  • cylinders is negatively correlated with UCity (-0.8157), which indicates that the bigger the vehicle, the fewer the miles per gallon. This makes sense.
  • Vehicles with 4, 6 and 8 cylinders have been getting slightly better on UCity in recent years
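
Casting cylinders to a categorical type and tabulating its values can be sketched as below. The toy values are hypothetical, and cylinders_f is a helper column name introduced only for this example.

```r
# Hypothetical toy data; the NA row stands for a vehicle without cylinders
vhcls_data <- data.frame(
  cylinders = c(4, 4, 6, 8, NA, 6, 4),
  UCity     = c(30, 28, 22, 16, 140, 21, 29)
)

# Frequency of each cylinder count, keeping the NA group visible
cyl_counts <- table(vhcls_data$cylinders, useNA = "ifany")

# Categorical version for modeling (cylinders_f is a hypothetical name)
vhcls_data$cylinders_f <- factor(vhcls_data$cylinders)

# Correlation on the numeric version, ignoring rows with NA
cyl_cor <- cor(vhcls_data$UCity, vhcls_data$cylinders, use = "complete.obs")

print(cyl_counts)
print(levels(vhcls_data$cylinders_f))
print(cyl_cor)
```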

displ

  • Definition

    • Engine displacement: the combined swept volume of the pistons inside the cylinders
  • Remarks

    • Negatively correlated with UCity (Spearman, -0.846), which makes sense: the larger the engine (more cylinders), the lower the UCity/MPG.

  • displ has missing values for the EVs, which is expected.

  • Replacing missing values with ‘NULL’ and making displ a factor would help make better sense of this variable.

  • One odd case is an EV with a displacement value of zero. It should be set to ‘NULL’ as well.

  • Alternatively, replace all NA’s with zero. That is consistent with reality, since an EV has zero displacement. In addition, DT/LM will be happier with displ as a numeric variable.

  • Over the years, the distribution of displ does not change noticeably.
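
The zero-fill alternative for displ, followed by a Spearman correlation, can be sketched as follows. The toy values are hypothetical, and displ_filled is a helper name introduced for this example.

```r
# Hypothetical toy data: EVs have a missing or zero displacement
vhcls_data <- data.frame(
  atvType = c("", "", "EV", "", "EV"),
  displ   = c(2.0, 3.5, NA, 5.7, 0),
  UCity   = c(30, 22, 150, 15, 140)
)

# Confirm that every missing displacement belongs to an EV before filling
na_all_ev <- all(vhcls_data$atvType[is.na(vhcls_data$displ)] == "EV")

# Replace NA with zero so displ stays numeric
displ_filled <- vhcls_data$displ
displ_filled[is.na(displ_filled)] <- 0

# Rank-based correlation, less sensitive to the extreme EV values
rho <- cor(vhcls_data$UCity, displ_filled, method = "spearman")

print(na_all_ev)
print(rho)
```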

evMotor

  • Definition
    • electric motor (kw-hrs)
  • Remarks
    • 141 types of motors
    • 39345 vehicles don’t have one
    • 736 vehicles of atvType EV, Hybrid or Plug-in Hybrid have an evMotor
    • A strong predictor of high UCity, with a risk of overfitting
      • Individual evMotor levels such as ‘88 kW AC PMSM’ map to very high UCity values, so including this variable in the model can lead to overfitting.
## [1] EV             Hybrid         Plug-in Hybrid
## 9 Levels: Bifuel (CNG) Bifuel (LPG) CNG Diesel EV FFV Hybrid ... Plug-in Hybrid
## [1] 736
## 
## Call:
## lm(formula = UCity ~ evMotor_Bin, data = vhcls_data)
## 
## Coefficients:
##        (Intercept)  evMotor_BinnoMotor  
##              64.48              -42.27
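
The binned model above collapses the many evMotor levels into a two-level flag. A minimal sketch on hypothetical toy values follows; the label hasMotor for the baseline level is an assumption, since the output only reveals the noMotor level.

```r
# Hypothetical toy data: a blank evMotor means no electric motor
vhcls_data <- data.frame(
  evMotor = c("", "288V Ni-MH", "", "100 kW AC PMSM", ""),
  UCity   = c(20, 60, 24, 170, 22)
)

# Two-level flag; "hasMotor" is an assumed label for the baseline level
vhcls_data$evMotor_Bin <- factor(
  ifelse(vhcls_data$evMotor == "", "noMotor", "hasMotor"),
  levels = c("hasMotor", "noMotor")
)

# Intercept = mean UCity with a motor; slope = shift for vehicles without one
fit <- lm(UCity ~ evMotor_Bin, data = vhcls_data)
fit_coef <- coef(fit)

print(fit_coef)
```

With a single two-level predictor, the fitted coefficients are simply the group mean of the baseline level and the difference between the two group means, which avoids estimating one coefficient per motor type.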
## 
## Call:
## lm(formula = UCity ~ evMotor, data = vhcls_data)
## 
## Coefficients:
##                                                (Intercept)  
##                                                    22.2056  
##                                      evMotor100 kW AC PMSM  
##                                                   154.7944  
##                                         evMotor100 kW DCPM  
##                                                    88.0944  
##                                          evMotor101V Ni-MH  
##                                                    27.1932  
##                                  evMotor102kW AC Induction  
##                                                    57.3638  
##                                 evMotor104 kW AC Induction  
##                                                   160.9944  
##                                         evMotor104 kW ACPM  
##                                                   160.9944  
##                                         evMotor105 kW ACPM  
##                                                   160.9944  
##                                 evMotor107 kW AC Induction  
##                                                   134.9944  
##                                      evMotor107 kW AC PMSM  
##                                                   138.9982  
##                                         evMotor110 kW DCPM  
##                                                   156.3944  
##                                              evMotor111 kW  
##                                                    24.8224  
##                                   evMotor111 kW 3-Phase AC  
##                                                    25.3944  
##                                 evMotor115 kW AC Induction  
##                                                    89.8348  
##                                         evMotor115V Li-Ion  
##                                                     9.3947  
##                                      evMotor120 kW AC PMSM  
##                                                   157.3944  
##                                              evMotor124 kW  
##                                                    42.6845  
##                                 evMotor125 kW AC Induction  
##                                                   136.2160  
##                                               evMotor125kW  
##                                                    30.1611  
##                                  evMotor125kW AC Induction  
##                                                    26.9944  
##                                              evMotor126 kW  
##                                                    25.3944  
##                                   evMotor126 kW 3-Phase AC  
##                                                    25.3944  
##                                         evMotor126V Li-Ion  
##                                                     2.5620  
##                                 evMotor132 kW AC Induction  
##                                                    99.4944  
##                               evMotor132 kW AC Synchronous  
##                                                    -2.8466  
##                                               evMotor132kW  
##                                                    -2.8466  
##                                       evMotor135kW AC PMSM  
##                                                    39.9944  
##      evMotor140 (front) 140 (rear) (70 kW-hr battery pack)  
##                                                   111.4944  
##      evMotor140 (front) 140 (rear) (85 kW-hr battery pack)  
##                                                   104.1944  
##      evMotor140 (front) 140 (rear) (90 kW-hr battery pack)  
##                                                   104.1944  
##                                         evMotor144V Li-Ion  
##                                                    20.1248  
##                                          evMotor144V Ni-MH  
##                                                    48.7900  
##                           evMotor147 and 188 kW AC 3-Phase  
##                                                   149.0944  
##                           evMotor147 and 211 kW AC 3-Phase  
##                                                   149.0944  
##                                 evMotor150 and 150 kw DCPM  
##                                                     2.2944  
##                                              evMotor150 kW  
##                                                   124.6443  
##                                         evMotor150 kW ACPM  
##                                                   159.9944  
##                                          evMotor158V Ni-MH  
##                                                    32.3944  
##                           evMotor16 and 37 kW AC Induction  
##                                                    56.6141  
##      evMotor164 (front) 350 (rear) (85 kW-hr battery pack)  
##                                                    98.8944  
##      evMotor164 (front) 350 (rear) (90 kW-hr battery pack)  
##                                                    98.8944  
##                                               evMotor18 kW  
##                                                    48.6944  
##                                  evMotor18 kW AC Induction  
##                                                    48.6944  
## evMotor180 and 350 kW AC Induction (85 kW-hr battery pack)  
##                                                    95.5944  
##                                   evMotor192 kW AC 3-Phase  
##                                                   164.4944  
##     evMotor193 (front) 193 (rear) (100 kW-hr battery pack)  
##                                                   103.3944  
##      evMotor193 (front) 193 (rear) (60 kW-hr battery pack)  
##                                                   108.6944  
##      evMotor193 (front) 193 (rear) (70 kW-hr battery pack)  
##                                                   111.4944  
##      evMotor193 (front) 193 (rear) (75 kW-hr battery pack)  
##                                                   108.8444  
##      evMotor193 (front) 193 (rear) (85 kW-hr battery pack)  
##                                                   104.1944  
##      evMotor193 (front) 193 (rear) (90 kW-hr battery pack)  
##                                                   106.3694  
##     evMotor193 (front) 375 (rear) (100 kW-hr battery pack)  
##                                                    96.1944  
##      evMotor193 (front) 375 (rear) (85 kW-hr battery pack)  
##                                                    98.8944  
##      evMotor193 (front) 375 (rear) (90 kW-hr battery pack)  
##                                                   100.1194  
##                                 evMotor2 @ 150 kw (300 kw)  
##                                                     2.6904  
##                                         evMotor202V Li-Ion  
##                                                    21.1809  
##                                          evMotor202V Ni-MH  
##                                                    43.9937  
##                                         evMotor207V Li-Ion  
##                                                    59.3625  
##                                   evMotor211 kW AC 3-Phase  
##                                                   172.7944  
##                                         evMotor220V Li-Ion  
##                                                    35.2944  
##                                      evMotor222 kW AC PMSM  
##                                                    54.3729  
##                                 evMotor225 kW AC Induction  
##                                                    95.7444  
##         evMotor225 kW AC Induction (60 kW-hr battery pack)  
##                                                    96.3944  
##                                evMotor24 KW AC Synchronous  
##                                                    98.1515  
##                                         evMotor240V Li-Ion  
##                                                    46.1989  
##                                          evMotor245V Ni-MH  
##                                                    29.7115  
##                                         evMotor259V Li-Ion  
##                                                    32.9202  
##                                 evMotor260 kW AC Induction  
##                                                    88.1944  
##                                         evMotor260V Li-Ion  
##                                                    11.7453  
##                                         evMotor266V Li-Ion  
##                                                     8.1944  
##                                  evMotor27 KW AC Induction  
##                                                    83.1069  
##                                 evMotor270 kW AC Induction  
##                                                    88.1944  
##         evMotor270 kW AC Induction (75 kW-hr battery pack)  
##                                                   109.8944  
##         evMotor270 kW AC Induction (85 kW-hr battery pack)  
##                                                    88.1944  
##                                         evMotor270V Li-Ion  
##                                                    27.4024  
##                                          evMotor275V Ni-MH  
##                                                    34.7921  
##                                         evMotor280V Li-Ion  
##                                                    37.6450  
##         evMotor285 kW AC Induction (60 kW-hr battery pack)  
##                                                   109.9944  
##         evMotor285 kW AC Induction (70 kW-hr battery pack)  
##                                                    88.1944  
##         evMotor285 kW AC Induction (75 kW-hr battery pack)  
##                                                   109.8944  
##         evMotor285 kW AC Induction (85 kW-hr battery pack)  
##                                                    88.1944  
##         evMotor285 kW AC Induction (90 kW-hr battery pack)  
##                                                    88.1944  
##                                          evMotor288V Ni-MH  
##                                                     8.2222  
##                                          evMotor30 kW DCPM  
##                                                   112.4214  
##                                         evMotor300V Li-Ion  
##                                                    43.6944  
##                                          evMotor300V Ni-MH  
##                                                     2.5944  
##                                         evMotor311V Li-Ion  
##                                                    11.5062  
##                                          evMotor312V Ni-MH  
##                                                    -1.4433  
##                                          evMotor32kW IPMSM  
##                                                    45.5444  
##                                          evMotor330V Ni-MH  
##                                                    19.8865  
##                           evMotor34 and 65kW 3-phase Sync.  
##                                                    10.7444  
##                                    evMotor34 and 65kW DCPM  
##                                                     8.1176  
##                                         evMotor346V Li-Ion  
##                                                    13.8309  
##                                           evMotor36V Ni-MH  
##                                                     9.8194  
##                                         evMotor374V Li-Ion  
##                                                     7.2686  
##                             evMotor48 and 87 kW 3-Phase AC  
##                                                    36.4944  
##                                          evMotor48V Li-Ion  
##                                                     2.5415  
##                                  evMotor49 kW DC Brushless  
##                                                    46.5699  
##                                          evMotor49 kW DCPM  
##                                                   156.7170  
##                                   evMotor49kW DC Brushless  
##                                                    46.5699  
##                                            evMotor50 KW DC  
##                                                    98.3052  
##                                         evMotor50 kW IPMSM  
##                                                    29.0277  
##                                          evMotor50kW IPMSM  
##                                                    31.9944  
##                                  evMotor52 kW AC Induction  
##                                                    66.1944  
##                             evMotor55 and 174kW 3-Phase AC  
##                                                    20.5944  
##                                          evMotor55 kW DCPM  
##                                                   151.5944  
##                                   evMotor56kW AC Induction  
##                                                    28.0929  
##                             evMotor59 and 61 kW 3-Phase AC  
##                                                     9.4944  
##                              evMotor59 and 61kW 3-Phase AC  
##                                                     9.4944  
##                                evMotor60 KW AC Synchronous  
##                                                   151.6444  
##                                          evMotor60 kW DCPM  
##                                                    15.4034  
##                                  evMotor62 KW AC Induction  
##                                                    94.0013  
##                                  evMotor65 kW AC Induction  
##                                                    13.4944  
##                                                evMotor65kW  
##                                                    14.2802  
##                                            evMotor67 KW AC  
##                                                    59.3398  
##                                 evMotor67 KW AC  Induction  
##                                                    66.4786  
##                                  evMotor67 KW AC Induction  
##                                                    40.2018  
##                                               evMotor68 kW  
##                                                    32.7944  
##                                          evMotor68 kW DCPM  
##                                                    34.3801  
##                                               evMotor70 kW  
##                                                     6.6944  
##                                  evMotor70 kW DC Brushless  
##                                                     3.4527  
##                                       evMotor75 kW AC PM/B  
##                                                    82.2944  
##                                       evMotor75 kW AC PMSM  
##                                                    64.8509  
##                                          evMotor80 kW DCPM  
##                                                   108.3424  
##                                       evMotor81 kW AC PMSM  
##                                                   150.1194  
##                                  evMotor82 kW AC Induction  
##                                                   151.1833  
##                                         evMotor82 kW ACIPM  
##                                                   152.6499  
##                                  evMotor83 kW AC Induction  
##                                                     7.1944  
##                                           evMotor83kW DCPM  
##                                                    12.3598  
##                                  evMotor85 kW AC Induction  
##                                                    33.1644  
##                                       evMotor85 kW AC PMSM  
##                                                   157.1444  
##                                          evMotor85 kW DCPM  
##                                                     6.8230  
##                                          evMotor86V Li-Ion  
##                                                     0.6128  
##                                       evMotor88 kW AC PMSM  
##                                                   202.5944  
##                                  evMotor89 kW AC Induction  
##                                                    25.0733  
##                                          evMotor92 kW DCPM  
##                                                   166.2031  
##                       evMotor95 kW and 116 kW DC Brushless  
##                                                     6.3944  
##                                                evMotor96kW  
##                                                    16.1944  
##                                   evMotor96kW AC Induction  
##                                                    12.6611  
##                                     evMotorTwo 444V Li-Ion  
##                                                    -8.1134  
##                                     evMotorTwo 480V Li-Ion  
##                                                    -7.1886

feScore

  • Definition
    • feScore: EPA Fuel Economy Score (-1 = not available)
  • Remarks
    • Available only for 8054 vehicles, all post-2012
    • For the 32027 vehicles from 2012 or earlier, feScore is -1
    • Although numeric, it is better treated as a categorical variable
##    -1     1     2     3     4     5     6     7     8     9    10 
## 32027   164   371   851  1541  2095  1195   938   552   118   229
## [1] 8054

## [1] 0.3978345
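
A minimal sketch of how the -1 sentinel could be recoded before treating feScore as categorical. The `scores` vector is a hypothetical sample, not the real column; in the report the values would come from `vhcls_data$feScore`.

```r
# Hypothetical sample of feScore values; -1 marks "not available".
scores <- c(-1, -1, 5, 7, -1, 3, 10, 5)

# Replace the -1 sentinel with NA so it is not read as a real score.
scores_clean <- ifelse(scores == -1, NA, scores)

# Ordered factor: levels 1..10 keep their ranking without implying equal spacing.
feScore_cat <- factor(scores_clean, levels = 1:10, ordered = TRUE)

# Frequency table including the NA bucket, then the count of scored vehicles.
print(table(feScore_cat, useNA = "ifany"))
n_scored <- sum(!is.na(feScore_cat))
print(n_scored)
```

Keeping the missing rows as NA rather than as a "-1" level stops them from acting as an eleventh score category in later models.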

combXXXX

  • Definitions
    • comb08 : combined MPG for fuelType1
    • comb08U : unrounded combined MPG for fuelType1
    • combA08 : combined MPG for fuelType2
    • combA08U : unrounded combined MPG for fuelType2
    • combE : combined electricity consumption in kw-hrs/100 miles
    • combinedCD : combined gasoline consumption (gallons/100 miles) in charge depleting mode (4)
    • combinedUF : EPA combined utility factor (share of electricity) for PHEV
  • Remarks
    • comb08 is strongly correlated (0.9839) with UCity
    • The other comb variables are populated only for the last few years

fuelCostXX

  • Definition

    • fuelCost08 - annual fuel cost for fuelType1
    • fuelCostA08 - annual fuel cost for fuelType2
  • Remarks

    • As expected, fuelCost is strongly correlated with barrels08 (0.9256)
    • After 2012 the median fuelCost08 starts to decrease slowly but noticeably
    • fuelCostA08 is not available before 2000
    • EV, Hybrid and Plug-in Hybrid vehicles are the least expensive in terms of fuelCost08
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     500    1950    2350    2378    2700    7350

fuelTypeX

  • Definition

    • fuelType : fuel type with fuelType1 and fuelType2 (if applicable)
    • fuelType1 : fuel type 1. For single fuel vehicles, this will be the only fuel. For dual fuel vehicles, this will be the conventional fuel.
    • fuelType2 : fuel type 2. For dual fuel vehicles, this will be the alternative fuel (e.g. E85, Electricity, CNG, LPG). For single fuel vehicles, this field is not used
## Warning: The shape palette can deal with a maximum of 6 discrete values because
## more than 6 becomes difficult to discriminate; you have 14. Consider
## specifying shapes manually if you must have them.
## Warning: Removed 118 rows containing missing values (geom_point).

ghgScoreX

  • Definition

    • ghgScore - EPA GHG score (-1 = Not available)
    • ghgScoreA - EPA GHG score for dual fuel vehicle running on the alternative fuel (-1 = Not available)
  • Remarks

    • Just like feScore, 32027 vehicles have no ghgScore (represented as -1)
    • Available for only 8054 vehicles
    • feScore and ghgScore are almost copies of each other (cor 0.99)

  • 32027 vehicles without ghgScore
  • 8054 vehicles with ghgScore available
##    -1     1     2     3     4     5     6     7     8     9    10 
## 32027   163   371   855  1570  2081  1196   962   491   116   249
## [1] 8054
  • ghgScore progression over the years

  • ghgScore ~ UCity
## [1] 0.3973776

  • drop candidate.
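
One way to check the "almost copies" claim is to compare the two scores only on rows where both are available, since shared -1 sentinels can inflate a correlation computed on the raw columns. The `toy` data frame below is hypothetical and only stands in for the two columns of `vhcls_data`.

```r
# Hypothetical toy data; -1 marks "not available" in both columns.
toy <- data.frame(
  feScore  = c(-1, 5, 6, 7, 3, -1, 9, 4),
  ghgScore = c(-1, 5, 6, 8, 3, -1, 9, 4)
)

# Turn the sentinel into NA in every column.
toy[toy == -1] <- NA

# Pearson correlation on rows where both scores are present.
r <- cor(toy$feScore, toy$ghgScore, use = "complete.obs")

# Share of scored vehicles where the two scores are exactly equal.
agreement <- mean(toy$feScore == toy$ghgScore, na.rm = TRUE)

print(c(correlation = r, agreement = agreement))
```

If both the correlation and the exact-agreement share stay high after masking, dropping one of the two columns loses little information.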

highwayXXXX

  • Definitions

    • highway08 - highway MPG for fuelType1
    • highway08U - unrounded highway MPG for fuelType1
    • highwayA08 - highway MPG for fuelType2
    • highwayA08U - unrounded highway MPG for fuelType2
    • highwayCD - highway gasoline consumption (gallons/100miles) in charge depleting mode
    • highwayE - highway electricity consumption in kw-hrs/100 miles
    • highwayUF - EPA highway utility factor (share of electricity) for PHEV
    • UHighway - unadjusted highway MPG for fuelType1; see the description of the EPA test procedures
    • UHighwayA - unadjusted highway MPG for fuelType2; see the description of the EPA test procedures
  • Remarks
    • highway08 and UHighway are strongly correlated with UCity (0.9257 and 0.9245, respectively)
    • highway08 and UHighway are 0.99 correlated with each other
    • The other highway variables are limited to the last few years

volume hlv, hpv, lv2, lv4, pv2, pv4

  • Definition
    • hlv - hatchback luggage volume (cubic feet)
    • hpv - hatchback passenger volume (cubic feet)
    • lv2 - 2-door luggage volume (cubic feet)
    • lv4 - 4-door luggage volume (cubic feet)
    • pv2 - 2-door passenger volume (cubic feet)
    • pv4 - 4-door passenger volume (cubic feet)
  • Remarks

    • 35240 vehicles don’t have hlv, hpv, lv2, lv4 information (zero)
    • 35531 vehicles don’t have pv2, pv4 information (zero)
    • hlv and hpv are strongly correlated (0.933)
    • There is a mild correlation between UCity and hlv or hpv
    • There is very weak correlation between UCity and lv2, lv4, pv2, pv4

  • Drop candidate

range

Definition

  • range is NOT in the data dictionary but is present in the data

Remarks

  • 39913 vehicles don’t have range information (represented as zero)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##   0.0000   0.0000   0.0000   0.6164   0.0000 335.0000
  • For the remaining 168 vehicles, range is strongly correlated with UCity (0.60)
    • The correlation is not reliable because the spread of the data is heteroskedastic
## [1] 168
## [1] 0.6013586

  • The 168 vehicles with a range value are all EVs
## [1] EV
## 9 Levels: Bifuel (CNG) Bifuel (LPG) CNG Diesel EV FFV Hybrid ... Plug-in Hybrid

rangeXXX

  • Definition

    • rangeA - EPA range for fuelType2
    • rangeCityA - EPA city range for fuelType2
    • rangeHwyA - EPA highway range for fuelType2
  • Remarks

    • 38539 vehicles don’t have rangeA values (empty string "")
    • The remaining 1542 vehicles have atvType in (Bifuel (CNG), FFV, Bifuel (LPG), Plug-in Hybrid)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##   0.0000   0.0000   0.0000   0.6164   0.0000 335.0000
## [1] 0
##           range     UCity
## range 1.0000000 0.6013586
## UCity 0.6013586 1.0000000

youSaveSpend

  • Definition

    • you save/spend over 5 years compared to an average car ($)
  • Remarks

    • Strongly correlated with UCity (0.6583)
    • youSaveSpend also shows a trend reversal at year 2010
    • EV, Hybrid, Plug-in Hybrid and CNG vehicles return positive savings
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  -29000   -5750   -4000   -4135   -2000    5250
##              youSaveSpend    UCity
## youSaveSpend     1.000000 0.658371
## UCity            0.658371 1.000000

Categorical + Logical

cylinders_cat

  • Definition
    • factorized cylinders variable

drive

  • Definition

    • drive - drive axle type
  • Remarks

    • 1189 vehicles have missing values for drive, replaced with NA
    • Different axle types have a mixed relation with UCity, with no clear indication of mutual information (cor -0.117)
##                                         2-Wheel Drive 
##                       1189                        507 
##              4-Wheel Drive 4-Wheel or All-Wheel Drive 
##                       1328                       6648 
##            All-Wheel Drive             Automatic (A1) 
##                       2713                          1 
##          Front-Wheel Drive    Part-time 4-Wheel Drive 
##                      13939                        217 
##           Rear-Wheel Drive 
##                      13539

## [1] 40081
## [1] "UCity~drive correlation -0.1171"
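
A Pearson correlation between UCity and a factor depends on the arbitrary integer codes assigned to the levels, so a one-way fit can be a more informative summary. The sketch below recodes blank drive values to NA and reports the share of UCity variance explained by drive (the R-squared of a one-way `lm`, often called eta-squared). The `toy` data frame and its values are hypothetical.

```r
# Hypothetical toy data; "" marks a missing drive value.
toy <- data.frame(
  drive = c("", "Front-Wheel Drive", "Rear-Wheel Drive", "Front-Wheel Drive",
            "Rear-Wheel Drive", "All-Wheel Drive", "", "All-Wheel Drive"),
  UCity = c(20, 30, 18, 28, 17, 22, 21, 23),
  stringsAsFactors = FALSE
)

# Recode blanks to NA, then convert to a factor (NA is not a level).
toy$drive[toy$drive == ""] <- NA
toy$drive <- factor(toy$drive)

# One-way model; rows with NA drive are dropped by lm's default na.action.
fit <- lm(UCity ~ drive, data = toy)

# Share of UCity variance explained by drive.
eta_sq <- summary(fit)$r.squared
print(eta_sq)
```

Unlike the correlation on integer codes, this value does not change if the factor levels are reordered.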

feScore_cat

  • Definition
    • factorized feScore variable

guzzler

  • Definition
    • guzzler - if G or T, this vehicle is subject to the gas guzzler tax
  • Remarks
    • Does the S value stand for guzzler-tax exempt?
    • 37704 vehicles have missing values for guzzler (does that mean they are exempt from the guzzler tax?)
    • The remaining 2377 vehicles have atvType Not Available, FFV or Hybrid
    • As expected, vehicles paying the guzzler tax sit at the lower end of UCity
##           G     S     T 
## 37704  1398    15   964

## [1] Not Available FFV           Hybrid       
## 9 Levels: Bifuel (CNG) Bifuel (LPG) CNG Diesel EV FFV Hybrid ... Plug-in Hybrid

make, mfrCode

  • Definition
    • make - manufacturer (division)
    • mfrCode - 3-character manufacturer code
  • Remarks
    • Both variables contain the same information; either of them can be dropped
##  [1] Alfa Romeo                    Ferrari                      
##  [3] Dodge                         Ford                         
##  [5] GMC                           Toyota                       
##  [7] Volkswagen                    AM General                   
##  [9] Chevrolet                     E. P. Dutton, Inc.           
## [11] Grumman Olson                 Jeep                         
## [13] Plymouth                      American Motors Corporation  
## [15] Isuzu                         Mitsubishi                   
## [17] Subaru                        Suzuki                       
## [19] Nissan                        Honda                        
## [21] Mercedes-Benz                 Mazda                        
## [23] Maserati                      Aston Martin                 
## [25] Audi                          Chrysler                     
## [27] Mercury                       Pontiac                      
## [29] Buick                         Cadillac                     
## [31] Oldsmobile                    Volvo                        
## [33] Jaguar                        Bertone                      
## [35] Lotus                         Renault                      
## [37] Rolls-Royce                   BMW                          
## [39] Pininfarina                   Bill Dovell Motor Car Company
## [41] Porsche                       TVR Engineering Ltd          
## [43] Merkur                        Peugeot                      
## [45] Saab                          Lincoln                      
## 135 Levels: Acura Alfa Romeo AM General ... Yugo

## 
## Call:
## lm(formula = UCity ~ mfrCode, data = vhcls_data)
## 
## Coefficients:
## (Intercept)   mfrCodeADX   mfrCodeASX   mfrCodeAZD   mfrCodeBEX   mfrCodeBGT  
##     21.5954       1.4108      -5.0428      66.8046      -7.0454     -11.5954  
##  mfrCodeBMX   mfrCodeBYD   mfrCodeCDA   mfrCodeCRX   mfrCodeDSX   mfrCodeFEX  
##      6.5156      74.1828      88.7046       3.8127       2.3934      -5.6543  
##  mfrCodeFJX   mfrCodeFMX   mfrCodeFSK   mfrCodeFTG   mfrCodeGMX   mfrCodeHNX  
##      7.3045       4.6702       3.3006       7.1046       2.1571      12.2258  
##  mfrCodeHYX   mfrCodeJCX   mfrCodeJLX   mfrCodeKAL   mfrCodeKGG   mfrCodeKMX  
##      9.7952      -2.0571       1.3929       2.9046      -8.2954      10.5924  
##  mfrCodeLRX   mfrCodeLTX   mfrCodeMAX   mfrCodeMBV   mfrCodeMBX   mfrCodeMLN  
##     -4.4330       0.6201      -3.7190      -5.4454       5.2703      -2.0218  
##  mfrCodeMTX   mfrCodeNLX   mfrCodeNSX   mfrCodePGN   mfrCodePRX   mfrCodeQTM  
##     15.2185      -7.9738       6.3905      -8.6954       1.9600      -0.8954  
##  mfrCodeRII   mfrCodeRRG   mfrCodeSAX   mfrCodeSKR   mfrCodeSKX   mfrCodeTKX  
##     -5.7110      -7.1661       2.3880      -4.4944       6.2741      10.8277  
##  mfrCodeTSL   mfrCodeTVP   mfrCodeTYX   mfrCodeVGA   mfrCodeVVX   mfrCodeVWX  
##    105.9098      -6.2187       9.2301       5.7141       4.9026       7.5194
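
Before dropping either column, it may be worth confirming that make and mfrCode map one-to-one. The sketch below shows one way to test that; the `toy` rows and the make-to-code pairings in them are hypothetical and only illustrate the check.

```r
# Hypothetical make / mfrCode pairs (the pairing is assumed for illustration).
toy <- data.frame(
  make    = c("Ford", "Ford", "Lincoln", "Toyota", "Lexus", "Honda"),
  mfrCode = c("FMX",  "FMX",  "FMX",     "TYX",    "TYX",   "HNX"),
  stringsAsFactors = FALSE
)

# Distinct (make, mfrCode) combinations.
pairs <- unique(toy[, c("make", "mfrCode")])

# How many codes each make uses, and how many makes share each code.
codes_per_make <- tapply(pairs$mfrCode, pairs$make, length)
makes_per_code <- tapply(pairs$make, pairs$mfrCode, length)

# TRUE only if the mapping is one-to-one in both directions.
one_to_one <- all(codes_per_make == 1) && all(makes_per_code == 1)
print(one_to_one)
print(makes_per_code)
```

If several makes share one code, mfrCode is a coarser grouping than make, and the choice of which column to keep then depends on the level of detail wanted.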

mpgData

  • Definition
    • has My MPG data
  • Remarks
    • 12714 vehicles have mpgData Y
    • Not a strong predictor for UCity
## [1] 12714

phevBlended

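  • Definition
    • phevBlended - if true, this vehicle operates on a blend of gasoline and electricity in charge depleting mode
  • Remarks
    • 40005 N, 76 Y
    • For the 76 Y vehicles, phevBlended has a large, significant coefficient (+20.28 UCity), but the model explains under 1% of the variance (R-squared 0.0071)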

## false  true 
## 40005    76
## 
## Call:
## lm(formula = UCity ~ phevBlended, data = vhcls_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -23.867  -4.843  -1.710   2.678 201.857 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     22.94334    0.05218  439.71   <2e-16 ***
## phevBlendedtrue 20.28253    1.19827   16.93   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.44 on 40079 degrees of freedom
## Multiple R-squared:  0.007098,   Adjusted R-squared:  0.007073 
## F-statistic: 286.5 on 1 and 40079 DF,  p-value: < 2.2e-16
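
The large coefficient and the tiny R-squared are consistent with each other: with a binary predictor, the explained sum of squares scales with p(1 - p), and only 76 of 40081 vehicles are in the true group. The sketch below rebuilds R-squared from the numbers printed in the summary above; it is an arithmetic check, not a refit of the model.

```r
# Values copied from the lm summary above.
n_true   <- 76
n_false  <- 40005
beta     <- 20.28253   # coefficient of phevBlendedtrue
resid_se <- 10.44      # residual standard error (rounded in the output)

n <- n_true + n_false
p <- n_true / n

# Between-group sum of squares for a two-group model: n * p * (1 - p) * beta^2.
ss_between <- n * p * (1 - p) * beta^2

# Residual sum of squares: resid_se^2 times the residual degrees of freedom.
ss_resid <- resid_se^2 * (n - 2)

# R-squared is the explained share of the total sum of squares.
r2 <- ss_between / (ss_between + ss_resid)
print(r2)
```

The result lands close to the reported 0.007098, with the small gap coming from the rounded residual standard error.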

trans_dscr

  • Definition

    • trans_dscr - transmission descriptor;
  • Remarks

    • 25034 vehicles have missing values for this variable
    • SIL EMS is the most influential transmission descriptor (lm coef 33.3), although it covers only 7 vehicles
      • These are Honda Civic and Insight cars (2004, 2005, 2006)
##                           2LKUP           2MODE     2MODE 2LKUP     2MODE 3LKUP 
##           25034              63             448             383              22 
##     2MODE CLKUP  2MODE CLKUP FW     2MODE DC/FW     2MODE VLKUP           3LKUP 
##            1235               2              19               9              15 
##           3MODE     3MODE 2LKUP     3MODE CLKUP  3MODE CLKUP FW     3MODE VLKUP 
##             166               2             517               1               5 
##           4MODE     4MODE CLKUP           6MODE     6MODE CLKUP           CLKUP 
##              35               6               6              41            7809 
##           CMODE     CMODE CLKUP     CMODE VLKUP         Creeper           DC/FW 
##             150             130              17             525              53 
##  Elec Overdrive             EMS       EMS 2MODE  EMS 2MODE CLKU EMS 2MODE CLKUP 
##               3             252              46               5             520 
##       EMS 3MODE EMS 3MODE CLKUP       EMS 5MODE       EMS CLKUP EMS CMODE CLKUP 
##              11               3               1              50               2 
##  fuel injection          Lockup       Lockup A3      LONG RATIO  Mech Overdrive 
##               1               9               1               2               3 
##       Overdrive             SIL SIL 2MODE CLKUP       SIL 3MODE SIL 3MODE CLKUP 
##              10            2189               5               6               2 
##       SIL CLKUP       SIL CMODE     SIL Creeper         SIL EMS           VLKUP 
##               1               2              72               7              52 
##           VMODE     VMODE CLKUP     VMODE VLKUP 
##               2             105              26
## 
## Call:
## lm(formula = UCity ~ trans_dscr, data = vhcls_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -24.427  -4.427  -1.123   2.573 200.373 
## 
## Coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                24.42665    0.06457 378.320  < 2e-16 ***
## trans_dscr2LKUP            -2.16581    1.28868  -1.681 0.092841 .  
## trans_dscr2MODE            -4.98120    0.48695 -10.229  < 2e-16 ***
## trans_dscr2MODE 2LKUP      -3.30358    0.52598  -6.281 3.40e-10 ***
## trans_dscr2MODE 3LKUP      -1.49737    2.17896  -0.687 0.491964    
## trans_dscr2MODE CLKUP      -3.28028    0.29778 -11.016  < 2e-16 ***
## trans_dscr2MODE CLKUP FW   -3.42665    7.22390  -0.474 0.635253    
## trans_dscr2MODE DC/FW      -8.90347    2.34454  -3.798 0.000146 ***
## trans_dscr2MODE VLKUP      -6.48838    3.40585  -1.905 0.056779 .  
## trans_dscr3LKUP            -0.93035    2.63848  -0.353 0.724383    
## trans_dscr3MODE            -4.60658    0.79552  -5.791 7.06e-09 ***
## trans_dscr3MODE 2LKUP      -8.59330    7.22390  -1.190 0.234224    
## trans_dscr3MODE CLKUP      -3.69621    0.45390  -8.143 3.96e-16 ***
## trans_dscr3MODE CLKUP FW   -8.77805   10.21593  -0.859 0.390207    
## trans_dscr3MODE VLKUP      -0.10665    4.56907  -0.023 0.981377    
## trans_dscr4MODE            -4.61372    1.72798  -2.670 0.007588 ** 
## trans_dscr4MODE CLKUP      -6.77939    4.17105  -1.625 0.104098    
## trans_dscr6MODE            -6.40625    4.17105  -1.536 0.124575    
## trans_dscr6MODE CLKUP      -9.42931    1.59673  -5.905 3.55e-09 ***
## trans_dscrCLKUP            -4.58441    0.13241 -34.622  < 2e-16 ***
## trans_dscrCMODE            -4.67248    0.83661  -5.585 2.35e-08 ***
## trans_dscrCMODE CLKUP      -2.60704    0.89830  -2.902 0.003708 ** 
## trans_dscrCMODE VLKUP       1.89361    2.47852   0.764 0.444866    
## trans_dscrCreeper          -8.43588    0.45050 -18.726  < 2e-16 ***
## trans_dscrDC/FW            -1.88634    1.40472  -1.343 0.179325    
## trans_dscrElec Overdrive   -3.42665    5.89841  -0.581 0.561280    
## trans_dscrEMS              -0.50839    0.64676  -0.786 0.431844    
## trans_dscrEMS 2MODE        -6.88840    1.50761  -4.569 4.91e-06 ***
## trans_dscrEMS 2MODE CLKU   -7.29999    4.56907  -1.598 0.110118    
## trans_dscrEMS 2MODE CLKUP  -6.40862    0.45262 -14.159  < 2e-16 ***
## trans_dscrEMS 3MODE       -12.33574    3.08083  -4.004 6.24e-05 ***
## trans_dscrEMS 3MODE CLKUP -11.18679    5.89841  -1.897 0.057891 .  
## trans_dscrEMS 5MODE         2.67335   10.21593   0.262 0.793566    
## trans_dscrEMS CLKUP        -9.17839    1.44616  -6.347 2.22e-10 ***
## trans_dscrEMS CMODE CLKUP  -4.73075    7.22390  -0.655 0.512552    
## trans_dscrfuel injection   -5.53775   10.21593  -0.542 0.587773    
## trans_dscrLockup           -7.05285    3.40585  -2.071 0.038384 *  
## trans_dscrLockup A3        -9.42665   10.21593  -0.923 0.356148    
## trans_dscrLONG RATIO      -11.47665    7.22390  -1.589 0.112135    
## trans_dscrMech Overdrive   -4.75999    5.89841  -0.807 0.419674    
## trans_dscrOverdrive        -7.95998    3.23114  -2.464 0.013762 *  
## trans_dscrSIL              -0.23864    0.22769  -1.048 0.294615    
## trans_dscrSIL 2MODE CLKUP  13.89321    4.56907   3.041 0.002362 ** 
## trans_dscrSIL 3MODE       -10.68207    4.17105  -2.561 0.010441 *  
## trans_dscrSIL 3MODE CLKUP -10.02735    7.22390  -1.388 0.165120    
## trans_dscrSIL CLKUP        -0.41005   10.21593  -0.040 0.967983    
## trans_dscrSIL CMODE        -8.42665    7.22390  -1.166 0.243421    
## trans_dscrSIL Creeper      -8.90660    1.20566  -7.387 1.53e-13 ***
## trans_dscrSIL EMS          33.31620    3.86172   8.627  < 2e-16 ***
## trans_dscrVLKUP             5.69771    1.41814   4.018 5.89e-05 ***
## trans_dscrVMODE            -6.38575    7.22390  -0.884 0.376714    
## trans_dscrVMODE CLKUP      -1.09962    0.99904  -1.101 0.271046    
## trans_dscrVMODE VLKUP       6.48897    2.00451   3.237 0.001208 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.22 on 40028 degrees of freedom
## Multiple R-squared:  0.04984,    Adjusted R-squared:  0.04861 
## F-statistic: 40.38 on 52 and 40028 DF,  p-value: < 2.2e-16

trany

  • Definition

    • transmission type
  • Remarks

    • The most common transmission types are Automatic 4-spd (11045 vehicles) and Manual 5-spd (8351 vehicles)
    • Automatic (A1) is the only positive coef (48); the intercept (88.9) is the mean UCity of the baseline level (the 11 vehicles with a blank trany), not the global mean
      • All Automatic (A1) vehicles are EV (including the best-UCity vehicle, the Hyundai Ioniq) except one Plug-in Hybrid BMW
    • Over the years, Automatic transmission cars are becoming more and more common
    • On a broader categorization (Automatic, Manual), Manual transmission improves UCity by about 1.44
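
The broader Automatic/Manual grouping mentioned in the last remark could be derived from the trany labels with a prefix match. This is a sketch on a hypothetical sample of labels; `trany_group` is an illustrative name, not a column from the original analysis.

```r
# Hypothetical sample of trany labels; "" marks a missing value.
trany <- c("Automatic 4-spd", "Manual 5-spd", "Automatic (S6)",
           "", "Manual 6-spd", "Automatic (A1)")

# Collapse to two groups by prefix; anything else becomes NA.
trany_group <- ifelse(grepl("^Automatic", trany), "Automatic",
               ifelse(grepl("^Manual", trany), "Manual", NA_character_))

print(table(trany_group, useNA = "ifany"))
```

Blank labels stay NA rather than being folded into either group, so they do not act as the baseline level in a later `lm(UCity ~ trany_group)`.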
##                                                    Automatic (A1) 
##                               11                              163 
##                Automatic (AM-S6)                Automatic (AM-S7) 
##                              113                              374 
##                Automatic (AM-S8)                Automatic (AM-S9) 
##                               40                                2 
##                  Automatic (AM5)                  Automatic (AM6) 
##                               14                              136 
##                  Automatic (AM7)                  Automatic (AM8) 
##                              216                                6 
##               Automatic (AV-S10)                Automatic (AV-S6) 
##                                5                              189 
##                Automatic (AV-S7)                Automatic (AV-S8) 
##                              126                               43 
##                   Automatic (L3)                   Automatic (L4) 
##                                2                                2 
##                  Automatic (S10)                   Automatic (S4) 
##                               69                              233 
##                   Automatic (S5)                   Automatic (S6) 
##                              830                             2984 
##                   Automatic (S7)                   Automatic (S8) 
##                              307                             1421 
##                   Automatic (S9) Automatic (variable gear ratios) 
##                               65                              766 
##                 Automatic 10-spd                  Automatic 3-spd 
##                               16                             3151 
##                  Automatic 4-spd                  Automatic 5-spd 
##                            11045                             2198 
##                  Automatic 6-spd                  Automatic 7-spd 
##                             1579                              708 
##                  Automatic 8-spd                  Automatic 9-spd 
##                              347                              208 
##                     Manual 3-spd                     Manual 4-spd 
##                               77                             1483 
##             Manual 4-spd Doubled                     Manual 5-spd 
##                               17                             8351 
##                     Manual 6-spd                     Manual 7-spd 
##                             2671                              113

## 
## Call:
## lm(formula = UCity ~ trany, data = vhcls_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -112.656   -3.662   -0.579    2.886  159.877 
## 
## Coefficients:
##                                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                             88.885      2.017   44.07   <2e-16 ***
## tranyAutomatic (A1)                     48.271      2.084   23.16   <2e-16 ***
## tranyAutomatic (AM-S6)                 -58.007      2.113  -27.46   <2e-16 ***
## tranyAutomatic (AM-S7)                 -65.990      2.046  -32.25   <2e-16 ***
## tranyAutomatic (AM-S8)                 -62.173      2.277  -27.30   <2e-16 ***
## tranyAutomatic (AM-S9)                 -60.185      5.142  -11.71   <2e-16 ***
## tranyAutomatic (AM5)                   -44.123      2.695  -16.37   <2e-16 ***
## tranyAutomatic (AM6)                   -53.657      2.097  -25.59   <2e-16 ***
## tranyAutomatic (AM7)                   -67.183      2.068  -32.49   <2e-16 ***
## tranyAutomatic (AM8)                   -60.602      3.395  -17.85   <2e-16 ***
## tranyAutomatic (AV-S10)                -52.199      3.608  -14.47   <2e-16 ***
## tranyAutomatic (AV-S6)                 -54.206      2.075  -26.13   <2e-16 ***
## tranyAutomatic (AV-S7)                 -53.421      2.103  -25.40   <2e-16 ***
## tranyAutomatic (AV-S8)                 -58.879      2.260  -26.05   <2e-16 ***
## tranyAutomatic (L3)                    -71.212      5.142  -13.85   <2e-16 ***
## tranyAutomatic (L4)                    -71.945      5.142  -13.99   <2e-16 ***
## tranyAutomatic (S10)                   -66.174      2.172  -30.47   <2e-16 ***
## tranyAutomatic (S4)                    -66.279      2.064  -32.11   <2e-16 ***
## tranyAutomatic (S5)                    -67.540      2.030  -33.27   <2e-16 ***
## tranyAutomatic (S6)                    -65.364      2.021  -32.35   <2e-16 ***
## tranyAutomatic (S7)                    -67.521      2.053  -32.89   <2e-16 ***
## tranyAutomatic (S8)                    -65.048      2.025  -32.13   <2e-16 ***
## tranyAutomatic (S9)                    -63.524      2.181  -29.13   <2e-16 ***
## tranyAutomatic (variable gear ratios)  -51.185      2.031  -25.20   <2e-16 ***
## tranyAutomatic 10-spd                  -71.121      2.620  -27.15   <2e-16 ***
## tranyAutomatic 3-spd                   -66.926      2.020  -33.12   <2e-16 ***
## tranyAutomatic 4-spd                   -69.027      2.018  -34.21   <2e-16 ***
## tranyAutomatic 5-spd                   -69.671      2.022  -34.46   <2e-16 ***
## tranyAutomatic 6-spd                   -67.688      2.024  -33.45   <2e-16 ***
## tranyAutomatic 7-spd                   -67.454      2.032  -33.19   <2e-16 ***
## tranyAutomatic 8-spd                   -68.289      2.049  -33.34   <2e-16 ***
## tranyAutomatic 9-spd                   -63.249      2.070  -30.56   <2e-16 ***
## tranyManual 3-spd                      -71.337      2.156  -33.09   <2e-16 ***
## tranyManual 4-spd                      -66.996      2.024  -33.09   <2e-16 ***
## tranyManual 4-spd Doubled              -57.885      2.588  -22.36   <2e-16 ***
## tranyManual 5-spd                      -64.423      2.018  -31.92   <2e-16 ***
## tranyManual 6-spd                      -65.401      2.021  -32.36   <2e-16 ***
## tranyManual 7-spd                      -66.132      2.113  -31.30   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.689 on 40043 degrees of freedom
## Multiple R-squared:  0.5925, Adjusted R-squared:  0.5921 
## F-statistic:  1573 on 37 and 40043 DF,  p-value: < 2.2e-16

## 
## Call:
## lm(formula = UCity ~ trany_new, data = vhcls_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -22.525  -4.725  -1.525   2.575 202.275 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     22.52494    0.06319  356.45   <2e-16 ***
## trany_newManual  1.43922    0.11216   12.83   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.45 on 40079 degrees of freedom
## Multiple R-squared:  0.004092,   Adjusted R-squared:  0.004067 
## F-statistic: 164.7 on 1 and 40079 DF,  p-value: < 2.2e-16
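
The two-level `trany_new` factor in the model above could be derived from `trany` roughly as follows. This is only a sketch: it assumes the broad class is given by the leading word of the `trany` label, and the toy data frame `demo` and its values are made up for illustration rather than taken from `vhcls_data`.

```r
# Toy data with hypothetical values; the report itself uses vhcls_data.
demo <- data.frame(
  trany = c("Automatic 4-spd", "Manual 5-spd", "Automatic (S6)",
            "Manual 6-spd", "Automatic (variable gear ratios)"),
  UCity = c(20, 24, 22, 26, 30)
)

# Assumption: labels starting with "Manual" are manual, all others automatic.
demo$trany_new <- factor(
  ifelse(startsWith(as.character(demo$trany), "Manual"), "Manual", "Automatic"),
  levels = c("Automatic", "Manual")
)

# With R's default treatment contrasts, the intercept is the mean UCity of
# the Automatic group and trany_newManual is the Manual-minus-Automatic
# difference in group means.
fit <- lm(UCity ~ trany_new, data = demo)
coef(fit)
```

Read this way, the coefficient 1.43922 reported above is a difference between two group means, not a per-vehicle gain from switching transmission.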

VClass

  • Definition

    • EPA vehicle size class
  • Remarks

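    • Compact, Subcompact and Midsize cars are the top 3 classes in the data
    • Special Purpose Vehicle and Vans Passenger are the least common classes

  • The linear model shows a negative effect for most vehicle classes relative to the baseline class. The intercept (26.5827) is the mean UCity of the baseline class, Compact Cars, not the global mean
  • The larger the vehicle (Trucks and Vans), the lower the UCity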
## 
## Call:
## lm(formula = UCity ~ VClass, data = vhcls_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -24.209  -3.905  -1.054   1.939 199.746 
## 
## Coefficients:
##                                          Estimate Std. Error t value Pr(>|t|)
## (Intercept)                               26.5827     0.1300 204.541  < 2e-16
## VClassLarge Cars                          -3.8831     0.2525 -15.377  < 2e-16
## VClassMidsize Cars                        -1.5285     0.1938  -7.886 3.20e-15
## VClassMidsize Station Wagons              -3.1773     0.4451  -7.138 9.60e-13
## VClassMidsize-Large Station Wagons        -4.7331     0.4062 -11.653  < 2e-16
## VClassMinicompact Cars                    -2.3737     0.2966  -8.004 1.24e-15
## VClassMinivan - 2WD                       -5.5012     0.5383 -10.220  < 2e-16
## VClassMinivan - 4WD                       -7.2576     1.4285  -5.081 3.78e-07
## VClassSmall Pickup Trucks                 -3.4385     0.4443  -7.738 1.03e-14
## VClassSmall Pickup Trucks 2WD             -3.7426     0.4790  -7.813 5.71e-15
## VClassSmall Pickup Trucks 4WD             -5.9404     0.6533  -9.093  < 2e-16
## VClassSmall Sport Utility Vehicle 2WD      1.8148     0.4474   4.056 5.00e-05
## VClassSmall Sport Utility Vehicle 4WD     -0.6477     0.3863  -1.677   0.0936
## VClassSmall Station Wagons                 1.6988     0.2800   6.067 1.32e-09
## VClassSpecial Purpose Vehicle              0.2173     9.8566   0.022   0.9824
## VClassSpecial Purpose Vehicle 2WD         -6.1868     0.4098 -15.096  < 2e-16
## VClassSpecial Purpose Vehicle 4WD         -7.7128     0.5747 -13.421  < 2e-16
## VClassSpecial Purpose Vehicles            -7.5729     0.2892 -26.184  < 2e-16
## VClassSpecial Purpose Vehicles/2wd        -3.6395     6.9703  -0.522   0.6016
## VClassSpecial Purpose Vehicles/4wd        -6.8959     6.9703  -0.989   0.3225
## VClassSport Utility Vehicle - 2WD         -5.1484     0.2768 -18.603  < 2e-16
## VClassSport Utility Vehicle - 4WD         -7.2713     0.2517 -28.885  < 2e-16
## VClassStandard Pickup Trucks              -9.5215     0.2412 -39.483  < 2e-16
## VClassStandard Pickup Trucks 2WD          -7.9801     0.3106 -25.696  < 2e-16
## VClassStandard Pickup Trucks 4WD          -9.4305     0.3316 -28.443  < 2e-16
## VClassStandard Pickup Trucks/2wd         -11.9551     4.9296  -2.425   0.0153
## VClassStandard Sport Utility Vehicle 2WD  -6.1120     0.6533  -9.356  < 2e-16
## VClassStandard Sport Utility Vehicle 4WD  -3.7400     0.4304  -8.690  < 2e-16
## VClassSubcompact Cars                     -0.5774     0.1902  -3.036   0.0024
## VClassTwo Seaters                         -3.9304     0.2547 -15.433  < 2e-16
## VClassVans                                -9.6943     0.3194 -30.351  < 2e-16
## VClassVans Passenger                     -10.4344     6.9703  -1.497   0.1344
## VClassVans, Cargo Type                   -10.8136     0.4885 -22.135  < 2e-16
## VClassVans, Passenger Type               -11.5313     0.5712 -20.189  < 2e-16
##                                             
## (Intercept)                              ***
## VClassLarge Cars                         ***
## VClassMidsize Cars                       ***
## VClassMidsize Station Wagons             ***
## VClassMidsize-Large Station Wagons       ***
## VClassMinicompact Cars                   ***
## VClassMinivan - 2WD                      ***
## VClassMinivan - 4WD                      ***
## VClassSmall Pickup Trucks                ***
## VClassSmall Pickup Trucks 2WD            ***
## VClassSmall Pickup Trucks 4WD            ***
## VClassSmall Sport Utility Vehicle 2WD    ***
## VClassSmall Sport Utility Vehicle 4WD    .  
## VClassSmall Station Wagons               ***
## VClassSpecial Purpose Vehicle               
## VClassSpecial Purpose Vehicle 2WD        ***
## VClassSpecial Purpose Vehicle 4WD        ***
## VClassSpecial Purpose Vehicles           ***
## VClassSpecial Purpose Vehicles/2wd          
## VClassSpecial Purpose Vehicles/4wd          
## VClassSport Utility Vehicle - 2WD        ***
## VClassSport Utility Vehicle - 4WD        ***
## VClassStandard Pickup Trucks             ***
## VClassStandard Pickup Trucks 2WD         ***
## VClassStandard Pickup Trucks 4WD         ***
## VClassStandard Pickup Trucks/2wd         *  
## VClassStandard Sport Utility Vehicle 2WD ***
## VClassStandard Sport Utility Vehicle 4WD ***
## VClassSubcompact Cars                    ** 
## VClassTwo Seaters                        ***
## VClassVans                               ***
## VClassVans Passenger                        
## VClassVans, Cargo Type                   ***
## VClassVans, Passenger Type               ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.856 on 40047 degrees of freedom
## Multiple R-squared:  0.1152, Adjusted R-squared:  0.1145 
## F-statistic:   158 on 33 and 40047 DF,  p-value: < 2.2e-16
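
To make the reading of the intercept concrete, here is a small sketch on made-up data (the data frame `demo` and its values are hypothetical). It shows that with R's default treatment contrasts the intercept of `lm(UCity ~ VClass)` equals the mean of the first factor level, not the global mean, and that `relevel()` can choose a different baseline.

```r
# Hypothetical toy data; class names mimic VClass, values are invented.
demo <- data.frame(
  VClass = c("Compact Cars", "Compact Cars", "Vans", "Vans",
             "Large Cars", "Large Cars"),
  UCity  = c(26, 28, 16, 18, 22, 24),
  stringsAsFactors = TRUE
)

fit <- lm(UCity ~ VClass, data = demo)

# The intercept is the mean of the first (alphabetical) level.
intercept     <- coef(fit)[["(Intercept)"]]
baseline_mean <- mean(demo$UCity[demo$VClass == "Compact Cars"])
global_mean   <- mean(demo$UCity)

# Every other coefficient is that class's mean minus the baseline mean.
coef(fit)

# Pick another baseline, for example Vans, and refit.
demo$VClass <- relevel(demo$VClass, ref = "Vans")
fit_vans <- lm(UCity ~ VClass, data = demo)
coef(fit_vans)
```

Changing the baseline changes every coefficient but not the fitted values, so the choice only affects how the table is read.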

sCharger, tCharger

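  • Definition
    • sCharger - if S, this vehicle is supercharged
    • tCharger - if T (stored here as TRUE), this vehicle is turbocharged
  • Remarks

    • 6302 tCharger vehicles
    • 796 sCharger vehicles
    • Both variables are blank or NA for most rows (39285 for sCharger, 33779 for tCharger), which makes them of little use as they stand. The linear model on tCharger cannot estimate a coefficient (NA) because TRUE is the only non-missing value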
##           S 
## 39285   796
## [1] 796
## [1] Not Available  Hybrid         FFV            Plug-in Hybrid
## 9 Levels: Bifuel (CNG) Bifuel (LPG) CNG Diesel EV FFV Hybrid ... Plug-in Hybrid
##    Mode    TRUE    NA's 
## logical    6302   33779
## 
## Call:
## lm(formula = UCity ~ tCharger, data = vhcls_data)
## 
## Coefficients:
##  (Intercept)  tChargerTRUE  
##        24.52            NA
## [1] Not Available  Diesel         Hybrid         FFV            Plug-in Hybrid
## 9 Levels: Bifuel (CNG) Bifuel (LPG) CNG Diesel EV FFV Hybrid ... Plug-in Hybrid
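
The `NA` coefficient for `tChargerTRUE` can be reproduced on toy data, and one possible workaround is to recode the flag first. The sketch below uses a made-up data frame `demo` and a hypothetical helper column `turbo`. It rests on the assumption, which should be checked against the data dictionary, that a missing `tCharger` means "not turbocharged".

```r
# Hypothetical toy data: TRUE where turbocharged, NA otherwise.
demo <- data.frame(
  UCity    = c(20, 22, 25, 27, 21, 26),
  tCharger = c(NA, NA, TRUE, TRUE, NA, TRUE)
)

# lm() drops the NA rows, so only TRUE remains and the tChargerTRUE column
# is identical to the intercept column; its coefficient comes back as NA.
fit_na <- lm(UCity ~ tCharger, data = demo)
coef(fit_na)

# Assumption: NA means "not turbocharged". Recode to a complete logical.
demo$turbo <- !is.na(demo$tCharger)

# Now both groups are present and the difference in means is estimable.
fit <- lm(UCity ~ turbo, data = demo)
coef(fit)
```

If the assumption does not hold for some rows, the recoded flag mixes unknown and non-turbo vehicles and the estimate should be treated with caution.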

year

  • Definition

    • model year
  • Remarks

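    • 1984 to 2019
    • 1984 is the most common model year (1964 vehicles)
    • Year 2010 is a turning point in UCity: relative to the 1984 baseline, the year coefficients are negative through 2009 and positive from 2010 onward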

##     year_cat     cylinders_cat  
##  1984   : 1964   4      :15475  
##  1985   : 1701   6      :13912  
##  2018   : 1330   8      : 8645  
##  2017   : 1293   5      :  769  
##  2015   : 1278   12     :  608  
##  2016   : 1257   (Other):  501  
##  (Other):31258   NA's   :  171
## 
## Call:
## lm(formula = UCity ~ year_cat, data = vhcls_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -21.696  -4.468  -1.176   2.725 197.901 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   22.6191     0.2284  99.053  < 2e-16 ***
## year_cat1985  -0.1511     0.3352  -0.451 0.652228    
## year_cat1986  -0.4441     0.3698  -1.201 0.229825    
## year_cat1987  -0.9235     0.3664  -2.520 0.011727 *  
## year_cat1988  -0.8832     0.3779  -2.337 0.019430 *  
## year_cat1989  -1.1094     0.3755  -2.955 0.003130 ** 
## year_cat1990  -1.2884     0.3836  -3.359 0.000784 ***
## year_cat1991  -1.5512     0.3776  -4.107 4.01e-05 ***
## year_cat1992  -1.6025     0.3788  -4.230 2.34e-05 ***
## year_cat1993  -1.3514     0.3819  -3.539 0.000403 ***
## year_cat1994  -1.4689     0.3955  -3.714 0.000204 ***
## year_cat1995  -1.9031     0.3976  -4.787 1.70e-06 ***
## year_cat1996  -0.9782     0.4297  -2.276 0.022824 *  
## year_cat1997  -1.2026     0.4319  -2.784 0.005367 ** 
## year_cat1998  -1.1134     0.4222  -2.637 0.008366 ** 
## year_cat1999  -0.8729     0.4152  -2.103 0.035506 *  
## year_cat2000  -0.9748     0.4172  -2.336 0.019479 *  
## year_cat2001  -0.9791     0.4057  -2.414 0.015804 *  
## year_cat2002  -1.5095     0.3965  -3.807 0.000141 ***
## year_cat2003  -1.6279     0.3876  -4.200 2.68e-05 ***
## year_cat2004  -1.7846     0.3787  -4.712 2.46e-06 ***
## year_cat2005  -1.5570     0.3741  -4.162 3.17e-05 ***
## year_cat2006  -1.7979     0.3807  -4.723 2.33e-06 ***
## year_cat2007  -1.8395     0.3783  -4.863 1.16e-06 ***
## year_cat2008  -1.4256     0.3721  -3.832 0.000127 ***
## year_cat2009  -0.8700     0.3723  -2.337 0.019465 *  
## year_cat2010   0.1277     0.3801   0.336 0.736853    
## year_cat2011   0.9275     0.3779   2.455 0.014105 *  
## year_cat2012   1.9746     0.3756   5.258 1.47e-07 ***
## year_cat2013   3.8167     0.3723  10.250  < 2e-16 ***
## year_cat2014   4.3227     0.3688  11.721  < 2e-16 ***
## year_cat2015   4.8909     0.3637  13.447  < 2e-16 ***
## year_cat2016   6.5099     0.3655  17.809  < 2e-16 ***
## year_cat2017   6.8237     0.3624  18.828  < 2e-16 ***
## year_cat2018   6.4955     0.3594  18.075  < 2e-16 ***
## year_cat2019   4.2796     0.5106   8.381  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.12 on 40045 degrees of freedom
## Multiple R-squared:  0.06717,    Adjusted R-squared:  0.06636 
## F-statistic: 82.39 on 35 and 40045 DF,  p-value: < 2.2e-16
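
The turning point can also be read off programmatically. The following sketch uses invented numbers in a toy data frame `demo` (the variable `first_up` is a hypothetical name): it fits the same kind of model with the year as a factor and reports the first year whose coefficient, meaning its mean UCity minus the baseline year's mean, is positive.

```r
# Hypothetical toy data; two vehicles per model year.
demo <- data.frame(
  year  = c(1984, 1984, 2009, 2009, 2010, 2010, 2011, 2011),
  UCity = c(22, 23, 21, 22, 22.5, 23.5, 24, 25)
)

# Treat the year as a category, as year_cat does in the report.
demo$year_cat <- factor(demo$year)
fit <- lm(UCity ~ year_cat, data = demo)

# Drop the intercept; what remains are differences from the first year.
diffs <- coef(fit)[-1]

# Years whose mean UCity exceeds the baseline year's mean.
years_above <- as.integer(sub("year_cat", "", names(diffs)[diffs > 0]))
first_up <- min(years_above)
first_up
```

A positive coefficient alone does not imply significance: in the output above, the 2010 estimate (0.1277) has a p-value of 0.74, and the first clearly significant increase appears in 2012.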

Others

engId

  • Definition
    • EPA model type index
  • Remarks
    • Just an index variable; of no use for prediction
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0       0     186    8377    4301   69102

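eng_dscr

  • Definition: engine descriptor
  • Remarks: the summary below is empty (Length 0, Class NULL), which is what R prints for a column that does not exist, so the column name used in the call may have been misspelled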

## Length  Class   Mode 
##      0   NULL   NULL

id

  • Definition: vehicle record id

model

  • Definition: model name

Summary Insights

  • The dataset contains two contrasting groups of vehicles: 39829 non-EV vehicles and EVs
  • The median annual petroleum consumption (barrels08) starts to decrease after 2010
  • FFVs (flexible-fuel, i.e. dual-fuel, vehicles) rise strongly during the 2000-2010 period and then drop sharply to their lowest level
  • Premium and Regular fuelType vehicles are the top co2 emitters
  • cylinders is negatively correlated with UCity (-0.8157), which indicates that the bigger the vehicle, the fewer the miles per gallon. This makes sense
    • Vehicles with 4, 6 and 8 cylinders are getting slightly better on UCity in recent years
  • Over the years, the displ distribution has not changed noticeably
  • youSaveSpend also shows a trend reversal at year 2010: EV, Hybrid, Plug-in Hybrid and CNG vehicles return positive values
  • The most common transmission types are Automatic 4-spd and Manual 5-spd
  • EV, Hybrid and Plug-in Hybrid vehicles are the least expensive in terms of fuelCost08
  • Over the years, Automatic transmission cars have become more and more common
  • On the broader categorization (Automatic vs. Manual), Manual transmission is associated with a UCity about 1.44 higher (coefficient 1.43922), though that model explains less than 1% of the variance (R-squared 0.004092)
  • Compact, Subcompact and Midsize cars are the top 3 classes in the data
  • Special Purpose Vehicle and Vans Passenger are the least common classes
  • 1984 is the most common model year in the dataset
  • Year 2010 is a turning point in UCity
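
A correlation such as the one quoted for cylinders and UCity has to deal with missing values, since the summary earlier in the report shows 171 NA entries for cylinders_cat. The sketch below, on a made-up data frame `demo`, shows one way to do this with `cor()`; the values are invented and the resulting number is not the report's -0.8157.

```r
# Hypothetical toy data; one row has no cylinder count.
demo <- data.frame(
  cylinders = c(4, 4, 6, 6, 8, 8, NA, 12),
  UCity     = c(30, 28, 24, 22, 18, 17, 120, 12)
)

# With the default use = "everything", any NA makes the result NA.
r_default <- cor(demo$cylinders, demo$UCity)

# use = "complete.obs" drops rows where either value is missing.
r <- cor(demo$cylinders, demo$UCity, use = "complete.obs")
r
```

Dropping incomplete rows means the coefficient describes only vehicles with a recorded cylinder count, which is worth stating alongside the number.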