This project is to explore the transaction datasets of 2009 US Gorvenment C.A.R.S (also as known as “Cash for Clunkder”) program. During this program, consumers turned in gas guzzlers and bought nearly 700,000 more fuel-efficient vehicles in fewer than 30 days. NHTSA website provides all the data collected from the program. In this analysis, we first try to create two metrics to identify 10 most successful states and 10 least successful states during the C.A.R.S program. Next, we would like to see whether the fuel-efficient car peference differs among different regions consumers by using the established metrics. Furthermore, by exploring the dataset, we find some behavioral patterns that can help us understand how consumers buy new cars. Lastly, we try to answer whether the data from the NHTSA sufficiently support that the program was “wildly successful” by the government. Although we only analyze Final Paid Transaction Data, other data and materials are also used to answer the last question.
FinPaid_Trans <- read.delim("CARS_PAID_Final.txt", sep=",", header=T, na.strings="NA", dec=".", strip.white=TRUE)
This metric is to count the transaction on the state level. The performance of each state depends on transaction amount, and more transactions occured means more successful of the C.A.R.S program in the state.
## transaction_State_Count$transaction_Count
## n missing unique Info Mean .05 .10 .25 .50
## 55 0 55 1 12313 112.2 811.8 2703.0 8524.0
## .75 .90 .95
## 16524.0 32573.4 34997.5
##
## lowest : 7 13 17 153 505
## highest: 33950 34246 36751 42949 76672
transaction_State_Count_top10
## State Count
## 1 California 76672
## 2 Texas 42949
## 3 New York 36751
## 4 Florida 34246
## 5 Illinois 33950
## 6 Pennsylvania 32879
## 7 Ohio 32115
## 8 Michigan 31180
## 9 New Jersey 24538
## 10 Virginia 23555
transaction_State_Count_least10
## State Count
## 1 Northern Mariana Islands 7
## 2 Virgin Islands 13
## 3 District of Columbia 17
## 4 Guam 153
## 5 Puerto Rico 505
## 6 Wyoming 591
## 7 Alaska 1143
## 8 Montana 1469
## 9 Hawaii 1717
## 10 North Dakota 2102
This metric is to sum the MPG change (MPG difference between new vehicle and trade-in vehicle) of each transaction on the state level. First 10 states with greatest MPG increase are the top 10 most successful states, and last 10 successful states are with least MPG increase.
## mpg_change_State_Sum$mpg_change_Sum
## n missing unique Info Mean .05 .10 .25
## 55 0 55 1 112944 983.3 6795.8 23840.5
## .50 .75 .90 .95
## 76629.0 146145.5 290191.2 318967.9
##
## lowest : 69 90 158 1337 4554
## highest: 299140 314905 328448 380140 800064
mpg_change_State_Sum_top10
## State MPG Increase
## 1 California 800064
## 2 Texas 380140
## 3 New York 328448
## 4 Florida 314905
## 5 Illinois 299140
## 6 Pennsylvania 291306
## 7 Ohio 288519
## 8 Michigan 256893
## 9 Virginia 224187
## 10 New Jersey 219168
mpg_change_State_Sum_least10
## State MPG Increase
## 1 Northern Mariana Islands 69
## 2 Virgin Islands 90
## 3 District of Columbia 158
## 4 Guam 1337
## 5 Puerto Rico 4554
## 6 Wyoming 5123
## 7 Alaska 9305
## 8 Montana 14045
## 9 Hawaii 15408
## 10 North Dakota 17993
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Row Total |
## | N / Col Total |
## |-------------------------|
##
##
## Total Observations in Table: 677238
##
##
## | Paid_Cars$new_vehicle_category
## Paid_Cars$trade_in_vehicle_category | 1 | 2 | 3 | P | Row Total |
## ------------------------------------|-----------|-----------|-----------|-----------|-----------|
## 1 | 182473 | 235 | 6 | 265098 | 447812 |
## | 0.407 | 0.001 | 0.000 | 0.592 | 0.661 |
## | 0.792 | 0.005 | 0.003 | 0.668 | |
## ------------------------------------|-----------|-----------|-----------|-----------|-----------|
## 2 | 31372 | 42289 | 19 | 45909 | 119589 |
## | 0.262 | 0.354 | 0.000 | 0.384 | 0.177 |
## | 0.136 | 0.890 | 0.008 | 0.116 | |
## ------------------------------------|-----------|-----------|-----------|-----------|-----------|
## 3 | 184 | 4990 | 2253 | 149 | 7576 |
## | 0.024 | 0.659 | 0.297 | 0.020 | 0.011 |
## | 0.001 | 0.105 | 0.989 | 0.000 | |
## ------------------------------------|-----------|-----------|-----------|-----------|-----------|
## P | 16436 | 22 | 0 | 85803 | 102261 |
## | 0.161 | 0.000 | 0.000 | 0.839 | 0.151 |
## | 0.071 | 0.000 | 0.000 | 0.216 | |
## ------------------------------------|-----------|-----------|-----------|-----------|-----------|
## Column Total | 230465 | 47536 | 2278 | 396959 | 677238 |
## | 0.340 | 0.070 | 0.003 | 0.586 | |
## ------------------------------------|-----------|-----------|-----------|-----------|-----------|
##
##
## Statistics for All Table Factors
##
##
## Pearson's Chi-squared test
## ------------------------------------------------------------
## Chi^2 = 452923.1 d.f. = 9 p = 0
##
##
##
##
## Europe Japan Korea Other USA
## Europe 1894 10489 1770 0 2541
## Japan 3112 58633 8498 0 11267
## Korea 77 1739 1170 0 766
## Other 1 18 3 0 10
## USA 14481 248462 66357 0 245950
##
## Europe Japan Korea USA
## Europe 1894 10489 1770 2541
## Japan 3112 58633 8498 11267
## Korea 77 1739 1170 766
## USA 14481 248462 66357 245950
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Row Total |
## | N / Col Total |
## |-------------------------|
##
##
## Total Observations in Table: 677206
##
##
## | Paid_Cars_new$new_vehicle_country
## Paid_Cars_new$trade_in_country | Europe | Japan | Korea | USA | Row Total |
## -------------------------------|-----------|-----------|-----------|-----------|-----------|
## Europe | 1894 | 10489 | 1770 | 2541 | 16694 |
## | 0.113 | 0.628 | 0.106 | 0.152 | 0.025 |
## | 0.097 | 0.033 | 0.023 | 0.010 | |
## -------------------------------|-----------|-----------|-----------|-----------|-----------|
## Japan | 3112 | 58633 | 8498 | 11267 | 81510 |
## | 0.038 | 0.719 | 0.104 | 0.138 | 0.120 |
## | 0.159 | 0.184 | 0.109 | 0.043 | |
## -------------------------------|-----------|-----------|-----------|-----------|-----------|
## Korea | 77 | 1739 | 1170 | 766 | 3752 |
## | 0.021 | 0.463 | 0.312 | 0.204 | 0.006 |
## | 0.004 | 0.005 | 0.015 | 0.003 | |
## -------------------------------|-----------|-----------|-----------|-----------|-----------|
## USA | 14481 | 248462 | 66357 | 245950 | 575250 |
## | 0.025 | 0.432 | 0.115 | 0.428 | 0.849 |
## | 0.740 | 0.778 | 0.853 | 0.944 | |
## -------------------------------|-----------|-----------|-----------|-----------|-----------|
## Column Total | 19564 | 319323 | 77795 | 260524 | 677206 |
## | 0.029 | 0.472 | 0.115 | 0.385 | |
## -------------------------------|-----------|-----------|-----------|-----------|-----------|
##
##
## Statistics for All Table Factors
##
##
## Pearson's Chi-squared test
## ------------------------------------------------------------
## Chi^2 = 37699.88 d.f. = 9 p = 0
##
##
##
## Origin of Maker Average MPG
## 1 Europe 26.98
## 2 Japan 26.59
## 3 Korea 25.48
## 4 USA 22.43
## Origin of Maker Average MSRP
## 1 Korea 19007
## 2 Japan 22163
## 3 USA 23630
## 4 Europe 25206
## 'data.frame': 677238 obs. of 43 variables:
## $ vendor_id : int 1 1 1 1 1 1 1 1 1 1 ...
## $ dealer_name : Factor w/ 17020 levels "#N/A","0","10-2002 LLC",..: 7366 7366 7366 7366 7366 7366 7366 7366 7366 7366 ...
## $ address_line1 : Factor w/ 16676 levels "# 1 Swope AuroCenter Dr",..: 11538 11538 11538 11538 11538 11538 11538 11538 11538 11538 ...
## $ address_line2 : Factor w/ 2038 levels "","(M-99)","`",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ address_line3 : Factor w/ 108 levels "","`","100 Blackheath Drive",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ address_line4 : Factor w/ 27 levels "","1012 Fox Creek Circle",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ city : Factor w/ 4103 levels "Abbeville","Abbotsford",..: 2270 2270 2270 2270 2270 2270 2270 2270 2270 2270 ...
## $ state : Factor w/ 55 levels "AK","AL","AR",..: 39 39 39 39 39 39 39 39 39 39 ...
## $ ZIP : int 73110 73110 73110 73110 73110 73110 73110 73110 73110 73110 ...
## $ area_code : num NA NA NA NA NA NA NA NA NA NA ...
## $ phone : Factor w/ 13598 levels "","1-800-528-8993",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ invoice_id : int 844935 380253 129314 726314 817520 297526 1027993 378066 952741 388873 ...
## $ invoice_num : Factor w/ 480465 levels "_35439","-106086",..: 454011 453260 453298 453305 453908 453914 454936 453981 453924 453932 ...
## $ invoice_date : Factor w/ 518 levels "1-Apr-09","1-Aug-06",..: 203 375 396 119 170 4 251 375 236 375 ...
## $ sale_date : Factor w/ 99 levels "1-Aug-09","1-Jul-09",..: 22 80 29 5 18 19 48 71 5 1 ...
## $ disposal_status : Factor w/ 5 levels "CRUSH","No Entity",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ disposal_facility_nmvtis_id : int 3101 3640 3640 3640 3640 3640 3101 3640 3640 3640 ...
## $ disposal_facility_contact_info: Factor w/ 3694 levels "","#1 AUTO","#1 AUTO SALVAGE",..: 1462 1462 1462 1462 1462 1462 1462 1462 1462 1462 ...
## $ sales_type : Factor w/ 2 levels "LEASED","PURCHASED": 2 2 2 2 2 2 2 2 2 2 ...
## $ invoice_amount : int 4500 4500 4500 4500 4500 4500 4500 4500 4500 4500 ...
## $ trade_in_VIN : Factor w/ 677222 levels "/G6KD52YXTU229685",..: 3340 8204 16100 16125 21585 21749 24009 27842 33194 34882 ...
## $ trade_in_vehicle_category : Factor w/ 4 levels "1","2","3","P": 1 1 1 1 1 1 1 1 1 1 ...
## $ trade_in_make : Factor w/ 60 levels "Acura","Alfa Romeo",..: 16 16 16 16 16 16 16 16 16 16 ...
## $ trade_in_model : Factor w/ 952 levels "1-Ton Truck 2WD",..: 272 272 350 350 351 351 163 331 331 331 ...
## $ trade_in_year : int 1994 1997 1999 1999 2000 2000 1984 1995 1999 1997 ...
## $ trade_in_vehicle_drive_train : Factor w/ 1943 levels " 0.0L, 8 cyl, Automatic 4-spd, RWD, (MPFI) (NO-CAT), Diesel ",..: 624 499 1273 1273 1271 1271 1092 650 651 644 ...
## $ trade_in_mileage : int 17 18 13 13 12 12 12 17 17 16 ...
## $ trade_in_title_state : Factor w/ 60 levels "AK","AL","AR",..: 42 42 42 42 42 42 42 42 42 42 ...
## $ trade_in_registration_state : Factor w/ 60 levels "AK","AL","AR",..: 42 42 42 42 42 42 42 42 42 42 ...
## $ trade_in_registration_start : Factor w/ 8971 levels "1-Apr-00","1-Apr-01",..: 4449 4198 4294 7319 670 5479 3065 6265 2304 1049 ...
## $ trade_in_registration_end : Factor w/ 2543 levels "1-Apr-06","1-Apr-07",..: 1985 1963 1863 2008 1986 2075 2008 55 1608 2025 ...
## $ trade_in_insurance_start : Factor w/ 8285 levels "","1-Apr-00",..: 1 1 7109 1 1 1 1 1 1 1 ...
## $ trade_in_NMVTIS_flag : Factor w/ 1626 levels "","1-Apr-03",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ trade_in_odometer_reading : int 166000 231519 130648 178733 93105 169662 68102 175000 162545 106190 ...
## $ new_vehicle_VIN_trunc : Factor w/ 25436 levels "!FMCU49329KXXXXXX",..: 8258 8195 21160 15641 15687 22222 15378 21031 8229 20895 ...
## $ new_vehicle_category : Factor w/ 4 levels "1","2","3","P": 4 4 1 1 4 4 4 4 4 4 ...
## $ new_vehicle_make : Factor w/ 35 levels "Acura","Audi",..: 33 33 33 33 33 33 33 33 33 33 ...
## $ new_vehicle_model : Factor w/ 437 levels "128i","3","300/SRT-8",..: 88 88 200 411 54 54 54 276 88 88 ...
## $ new_vehicle_year : int 2010 2009 2009 2009 2010 2010 2010 2010 2010 2010 ...
## $ new_vehicle_drive_train : Factor w/ 384 levels "1.3L, 4 cyl, Auto(AV-S7), FWD, Regular",..: 19 19 213 219 91 91 91 18 19 19 ...
## $ new_vehicle_car_mileage : int 29 30 20 22 26 26 26 50 29 29 ...
## $ new_vehicle_MSRP : int 17250 16750 27600 27800 21900 24565 23165 22000 16150 16750 ...
## $ mpg_change : int 12 12 7 9 14 14 14 33 12 13 ...
## vendor_id dealer_name address_line1 address_line2
## 1 1 HUDIBURG MOTORS, INC. 5800 Tinker Diagonal
## address_line3 address_line4 city state ZIP area_code phone
## 1 Midwest City OK 73110 NA
## invoice_id invoice_num invoice_date sale_date disposal_status
## 1 844935 T40262A 20-Aug-09 15-Aug-09 CRUSH
## disposal_facility_nmvtis_id disposal_facility_contact_info sales_type
## 1 3101 G&R AUTO PARTS PURCHASED
## invoice_amount trade_in_VIN trade_in_vehicle_category trade_in_make
## 1 4500 1B4GH54R7RX188224 1 Dodge
## trade_in_model trade_in_year
## 1 Caravan/Grand Caravan 2WD 1994
## trade_in_vehicle_drive_train trade_in_mileage
## 1 3.8L, 6 cyl, Automatic 4-spd, FWD, (FFS), Regular 17
## trade_in_title_state trade_in_registration_state
## 1 OK OK
## trade_in_registration_start trade_in_registration_end
## 1 23-Apr-07 31-Dec-09
## trade_in_insurance_start trade_in_NMVTIS_flag trade_in_odometer_reading
## 1 166000
## new_vehicle_VIN_trunc new_vehicle_category new_vehicle_make
## 1 1NXBU4EE7AZXXXXXX P Toyota
## new_vehicle_model new_vehicle_year
## 1 Corolla 2010
## new_vehicle_drive_train new_vehicle_car_mileage
## 1 1.8L, 4 cyl, Automatic 4-spd, FWD, Regular 29
## new_vehicle_MSRP mpg_change
## 1 17250 12