Earlier this week I worked on a Kaggle task, Car Price Prediction. You can check my submission here: https://www.kaggle.com/lekanali/car-price-predictions . It is long, and I had a hard time deciding which techniques to leave out of this report. You can read the problem statement below.
A Chinese automobile company, Geely Auto, aspires to enter the US market by setting up a manufacturing unit there and producing cars locally to compete with its US and European counterparts.
They have contracted an automobile consulting company to understand the factors on which the pricing of cars depends. Specifically, they want to understand the factors affecting the pricing of cars in the American market, since those may be very different from the Chinese market. The company wants to know:
- Which variables are significant in predicting the price of a car
- How well those variables describe the price of a car

Based on various market surveys, the consulting firm has gathered a large data set of different types of cars across the American market.
We are required to model the price of cars with the available independent variables. It will be used by the management to understand how exactly the prices vary with the independent variables. They can accordingly manipulate the design of the cars, the business strategy etc. to meet certain price levels. Further, the model will be a good way for management to understand the pricing dynamics of a new market.
There is no need to repeat the whole process here. Instead, this report is a compilation of the graphics and some output snippets from my Kaggle submission. I hope you enjoy it.
## car_ID symboling CarName fueltype aspiration doornumber
## 1 1 3 alfa-romero giulia gas std two
## 2 2 3 alfa-romero stelvio gas std two
## 3 3 1 alfa-romero Quadrifoglio gas std two
## 4 4 2 audi 100 ls gas std four
## 5 5 2 audi 100ls gas std four
## 6 6 2 audi fox gas std two
## carbody drivewheel enginelocation wheelbase carlength carwidth carheight
## 1 convertible rwd front 88.6 168.8 64.1 48.8
## 2 convertible rwd front 88.6 168.8 64.1 48.8
## 3 hatchback rwd front 94.5 171.2 65.5 52.4
## 4 sedan fwd front 99.8 176.6 66.2 54.3
## 5 sedan 4wd front 99.4 176.6 66.4 54.3
## 6 sedan fwd front 99.8 177.3 66.3 53.1
## curbweight enginetype cylindernumber enginesize fuelsystem boreratio stroke
## 1 2548 dohc four 130 mpfi 3.47 2.68
## 2 2548 dohc four 130 mpfi 3.47 2.68
## 3 2823 ohcv six 152 mpfi 2.68 3.47
## 4 2337 ohc four 109 mpfi 3.19 3.40
## 5 2824 ohc five 136 mpfi 3.19 3.40
## 6 2507 ohc five 136 mpfi 3.19 3.40
## compressionratio horsepower peakrpm citympg highwaympg price
## 1 9.0 111 5000 21 27 13495
## 2 9.0 111 5000 21 27 16500
## 3 9.0 154 5000 19 26 16500
## 4 10.0 102 5500 24 30 13950
## 5 8.0 115 5500 18 22 17450
## 6 8.5 110 5500 19 25 15250
## symboling CarName fueltype aspiration
## alfa-romero giulia_1 3 alfa-romero giulia gas std
## alfa-romero stelvio_2 3 alfa-romero stelvio gas std
## doornumber carbody drivewheel enginelocation
## alfa-romero giulia_1 two convertible rwd front
## alfa-romero stelvio_2 two convertible rwd front
## wheelbase carlength carwidth carheight curbweight
## alfa-romero giulia_1 88.6 168.8 64.1 48.8 2548
## alfa-romero stelvio_2 88.6 168.8 64.1 48.8 2548
## enginetype cylindernumber enginesize fuelsystem boreratio
## alfa-romero giulia_1 dohc four 130 mpfi 3.47
## alfa-romero stelvio_2 dohc four 130 mpfi 3.47
## stroke compressionratio horsepower peakrpm citympg
## alfa-romero giulia_1 2.68 9 111 5000 21
## alfa-romero stelvio_2 2.68 9 111 5000 21
## highwaympg price CarBrand
## alfa-romero giulia_1 27 13495 alfa-romero
## alfa-romero stelvio_2 27 16500 alfa-romero
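
The second preview above differs from the first in three ways: the `car_ID` column is gone, the row names have the form `CarName_car_ID`, and there is a new `CarBrand` column. A minimal base R sketch of one way to get there, assuming the brand is simply the first word of `CarName` (the data frame name `car_df` and the four toy rows, copied from the first preview, are only for illustration):

```r
# Four rows copied from the head() output above; the real data has 205 rows.
car_df <- data.frame(
  car_ID  = 1:4,
  CarName = c("alfa-romero giulia", "alfa-romero stelvio",
              "alfa-romero Quadrifoglio", "audi 100 ls"),
  price   = c(13495, 16500, 16500, 13950),
  stringsAsFactors = FALSE
)

# Brand = everything before the first space in CarName (assumption).
car_df$CarBrand <- sub(" .*$", "", car_df$CarName)

# Unique row labels of the form "<CarName>_<car_ID>", then drop the ID column.
rownames(car_df) <- paste(car_df$CarName, car_df$car_ID, sep = "_")
car_df$car_ID <- NULL

print(car_df)
```

Appending the ID keeps the row names unique even when two cars share the same `CarName`.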
## [1] "alfa-romero" "audi" "bmw" "chevrolet" "dodge"
## [6] "honda" "isuzu" "jaguar" "maxda" "mazda"
## [11] "buick" "mercury" "mitsubishi" "Nissan" "nissan"
## [16] "peugeot" "plymouth" "porsche" "porcshce" "renault"
## [21] "saab" "subaru" "toyota" "toyouta" "vokswagen"
## [26] "volkswagen" "vw" "volvo"
## [1] "alfa-romero" "audi" "bmw" "chevrolet" "dodge"
## [6] "honda" "isuzu" "jaguar" "mazda" "buick"
## [11] "mercury" "mitsubishi" "nissan" "peugeot" "plymouth"
## [16] "porsche" "renault" "saab" "subaru" "toyota"
## [21] "volkswagen" "volvo"
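
The first brand list contains misspellings and case variants (`maxda`, `Nissan`, `porcshce`, `toyouta`, `vokswagen`, `vw`), and the second list shows them merged into 22 clean brands. The following is a sketch of one way to do that with a lookup table. The `fixes` vector is an assumption built by reading the two lists, not necessarily the code used in the submission:

```r
# A sample of raw brand strings, including the variants seen above.
brands <- c("maxda", "mazda", "Nissan", "nissan", "porcshce", "porsche",
            "toyouta", "toyota", "vokswagen", "vw", "volkswagen", "audi")

# Misspelling -> canonical spelling (hypothetical lookup table).
fixes <- c(maxda = "mazda", porcshce = "porsche", toyouta = "toyota",
           vokswagen = "volkswagen", vw = "volkswagen")

clean <- tolower(brands)                  # folds "Nissan" into "nissan"
hit <- clean %in% names(fixes)            # which entries need a fix
clean[hit] <- unname(fixes[clean[hit]])   # replace them via the lookup

print(unique(clean))
```

A named vector keeps the corrections in one place, so adding another misspelling is a one-line change.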
## 'data.frame': 205 obs. of 26 variables:
## $ symboling : int 3 3 1 2 2 2 1 1 1 0 ...
## $ CarName : Factor w/ 147 levels "alfa-romero giulia",..: 1 3 2 4 5 9 5 7 6 8 ...
## $ fueltype : Factor w/ 2 levels "diesel","gas": 2 2 2 2 2 2 2 2 2 2 ...
## $ aspiration : Factor w/ 2 levels "std","turbo": 1 1 1 1 1 1 1 1 2 2 ...
## $ doornumber : Factor w/ 2 levels "four","two": 2 2 2 1 1 2 1 1 1 2 ...
## $ carbody : Factor w/ 5 levels "convertible",..: 1 1 3 4 4 4 4 5 4 3 ...
## $ drivewheel : Factor w/ 3 levels "4wd","fwd","rwd": 3 3 3 2 1 2 2 2 2 1 ...
## $ enginelocation : Factor w/ 2 levels "front","rear": 1 1 1 1 1 1 1 1 1 1 ...
## $ wheelbase : num 88.6 88.6 94.5 99.8 99.4 ...
## $ carlength : num 169 169 171 177 177 ...
## $ carwidth : num 64.1 64.1 65.5 66.2 66.4 66.3 71.4 71.4 71.4 67.9 ...
## $ carheight : num 48.8 48.8 52.4 54.3 54.3 53.1 55.7 55.7 55.9 52 ...
## $ curbweight : int 2548 2548 2823 2337 2824 2507 2844 2954 3086 3053 ...
## $ enginetype : Factor w/ 7 levels "dohc","dohcv",..: 1 1 6 4 4 4 4 4 4 4 ...
## $ cylindernumber : Factor w/ 7 levels "eight","five",..: 3 3 4 3 2 2 2 2 2 2 ...
## $ enginesize : int 130 130 152 109 136 136 136 136 131 131 ...
## $ fuelsystem : Factor w/ 8 levels "1bbl","2bbl",..: 6 6 6 6 6 6 6 6 6 6 ...
## $ boreratio : num 3.47 3.47 2.68 3.19 3.19 3.19 3.19 3.19 3.13 3.13 ...
## $ stroke : num 2.68 2.68 3.47 3.4 3.4 3.4 3.4 3.4 3.4 3.4 ...
## $ compressionratio: num 9 9 9 10 8 8.5 8.5 8.5 8.3 7 ...
## $ horsepower : int 111 111 154 102 115 110 110 110 140 160 ...
## $ peakrpm : int 5000 5000 5000 5500 5500 5500 5500 5500 5500 5500 ...
## $ citympg : int 21 21 19 24 18 19 19 19 17 16 ...
## $ highwaympg : int 27 27 26 30 22 25 25 25 20 22 ...
## $ price : num 13495 16500 16500 13950 17450 ...
## $ CarBrand : chr "alfa-romero" "alfa-romero" "alfa-romero" "audi" ...
## vars n mean sd median trimmed mad min
## symboling 1 205 0.83 1.25 1.00 0.81 1.48 -2.00
## CarName* 2 205 78.10 41.09 81.00 78.99 51.89 1.00
## fueltype* 3 205 1.90 0.30 2.00 2.00 0.00 1.00
## aspiration* 4 205 1.18 0.39 1.00 1.10 0.00 1.00
## doornumber* 5 205 1.44 0.50 1.00 1.42 0.00 1.00
## carbody* 6 205 3.61 0.86 4.00 3.64 1.48 1.00
## drivewheel* 7 205 2.33 0.56 2.00 2.34 0.00 1.00
## enginelocation* 8 205 1.01 0.12 1.00 1.00 0.00 1.00
## wheelbase 9 205 98.76 6.02 97.00 98.08 4.00 86.60
## carlength 10 205 174.05 12.34 173.20 173.79 10.23 141.10
## carwidth 11 205 65.91 2.15 65.50 65.66 2.08 60.30
## carheight 12 205 53.72 2.44 54.10 53.70 2.37 47.80
## curbweight 13 205 2555.57 520.68 2414.00 2513.05 572.28 1488.00
## enginetype* 14 205 4.01 1.05 4.00 4.04 0.00 1.00
## cylindernumber* 15 205 3.12 0.80 3.00 3.06 0.00 1.00
## enginesize 16 205 126.91 41.64 120.00 120.58 34.10 61.00
## fuelsystem* 17 205 4.25 2.01 6.00 4.32 1.48 1.00
## boreratio 18 205 3.33 0.27 3.31 3.33 0.39 2.54
## stroke 19 205 3.26 0.31 3.29 3.28 0.21 2.07
## compressionratio 20 205 10.14 3.97 9.00 9.04 0.59 7.00
## horsepower 21 205 104.12 39.54 95.00 99.11 37.06 48.00
## peakrpm 22 205 5125.12 476.99 5200.00 5126.36 444.78 4150.00
## citympg 23 205 25.22 6.54 24.00 24.76 7.41 13.00
## highwaympg 24 205 30.75 6.89 30.00 30.40 7.41 16.00
## price 25 205 13276.71 7988.85 10295.00 11747.30 4901.48 5118.00
## max range skew kurtosis se
## symboling 3.00 5.0 0.21 -0.71 0.09
## CarName* 147.00 146.0 -0.15 -1.13 2.87
## fueltype* 2.00 1.0 -2.69 5.28 0.02
## aspiration* 2.00 1.0 1.65 0.72 0.03
## doornumber* 2.00 1.0 0.24 -1.95 0.03
## carbody* 5.00 4.0 -0.66 0.93 0.06
## drivewheel* 3.00 2.0 -0.06 -0.71 0.04
## enginelocation* 2.00 1.0 8.02 62.70 0.01
## wheelbase 120.90 34.3 1.03 0.92 0.42
## carlength 208.10 67.0 0.15 -0.14 0.86
## carwidth 72.30 12.0 0.89 0.62 0.15
## carheight 59.80 12.0 0.06 -0.49 0.17
## curbweight 4066.00 2578.0 0.67 -0.10 36.37
## enginetype* 7.00 6.0 -0.53 3.13 0.07
## cylindernumber* 7.00 6.0 2.11 10.66 0.06
## enginesize 326.00 265.0 1.92 5.07 2.91
## fuelsystem* 8.00 7.0 -0.24 -1.66 0.14
## boreratio 3.94 1.4 0.02 -0.82 0.02
## stroke 4.17 2.1 -0.68 2.04 0.02
## compressionratio 23.00 16.0 2.57 5.00 0.28
## horsepower 288.00 240.0 1.38 2.54 2.76
## peakrpm 6600.00 2450.0 0.07 0.03 33.31
## citympg 49.00 36.0 0.65 0.50 0.46
## highwaympg 54.00 38.0 0.53 0.37 0.48
## price 45400.00 40282.0 1.75 2.89 557.97
## , , doornumber = four
##
## drivewheel
## CarBrand 4wd fwd rwd Sum
## alfa-romero 0 0 0 0
## audi 1 4 0 5
## bmw 0 0 5 5
## buick 0 0 5 5
## chevrolet 0 1 0 1
## dodge 0 4 0 4
## honda 0 5 0 5
## isuzu 0 1 1 2
## jaguar 0 0 2 2
## mazda 0 6 2 8
## mercury 0 0 0 0
## mitsubishi 0 4 0 4
## nissan 0 9 0 9
## peugeot 0 0 11 11
## plymouth 0 4 0 4
## porsche 0 0 0 0
## renault 0 1 0 1
## saab 0 3 0 3
## subaru 4 5 0 9
## toyota 2 14 2 18
## volkswagen 0 8 0 8
## volvo 0 0 11 11
##
## , , doornumber = two
##
## drivewheel
## CarBrand 4wd fwd rwd Sum
## alfa-romero 0 0 3 3
## audi 1 1 0 2
## bmw 0 0 3 3
## buick 0 0 3 3
## chevrolet 0 2 0 2
## dodge 0 5 0 5
## honda 0 8 0 8
## isuzu 0 1 1 2
## jaguar 0 0 1 1
## mazda 0 5 4 9
## mercury 0 0 1 1
## mitsubishi 0 9 0 9
## nissan 0 6 3 9
## peugeot 0 0 0 0
## plymouth 0 2 1 3
## porsche 0 0 5 5
## renault 0 1 0 1
## saab 0 3 0 3
## subaru 1 2 0 3
## toyota 0 2 12 14
## volkswagen 0 4 0 4
## volvo 0 0 0 0
## Warning in chisq.test(myt): Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: myt
## X-squared = 189.68, df = 42, p-value < 2.2e-16
## Warning in chisq.test(myt): Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: myt
## X-squared = 2.5191, df = 2, p-value = 0.2838
## Warning in chisq.test(myt): Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: myt
## X-squared = 39.249, df = 21, p-value = 0.009165
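
The degrees of freedom of the three tests (42, 2 and 21) are consistent with brand by drive wheel, door number by drive wheel, and brand by door number, since a test on an r x c table has (r - 1)(c - 1) degrees of freedom. As a check, the sketch below rebuilds the door number by drive wheel table by summing the brand rows of the three-way table printed earlier, and reruns the test. The Monte Carlo variant at the end is an optional alternative for tables with small expected counts, not something the original analysis is known to have used.

```r
# Door number by drive wheel, summed over brands from the table above.
door_by_drive <- matrix(
  c(7, 69, 39,
    2, 51, 37),
  nrow = 2, byrow = TRUE,
  dimnames = list(doornumber = c("four", "two"),
                  drivewheel = c("4wd", "fwd", "rwd"))
)

# suppressWarnings() hides the small-expected-count warning in this demo.
res <- suppressWarnings(chisq.test(door_by_drive))
print(res$statistic)   # X-squared
print(res$parameter)   # degrees of freedom
print(res$expected)    # one expected count is below 5, hence the warning

# Optional: a simulated p-value avoids relying on the chi-squared approximation.
set.seed(1)
sim <- chisq.test(door_by_drive, simulate.p.value = TRUE, B = 10000)
print(sim$p.value)
```

The warning in the output appears because at least one expected cell count is small (here the two-door 4wd cell), which makes the chi-squared approximation less reliable.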
## # A tibble: 22 x 3
## CarBrand mean sd
## <chr> <dbl> <dbl>
## 1 jaguar 34600 2048
## 2 buick 33647 6790
## 3 porsche 31400 5654
## 4 bmw 26119 9264
## 5 volvo 18063 3315
## 6 audi 17859 3152
## 7 mercury 16503 NA
## 8 alfa-romero 15498 1735
## 9 peugeot 15489 2247
## 10 saab 15223 2861
## # ... with 12 more rows
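
The tibble above is a per-brand mean and standard deviation of `price`. A base R sketch of the same summary, using only the three alfa-romero prices from the first preview and the single mercury price implied by its mean (the original output is a tibble, so the submission presumably used dplyr instead):

```r
prices <- data.frame(
  CarBrand = c("alfa-romero", "alfa-romero", "alfa-romero", "mercury"),
  price    = c(13495, 16500, 16500, 16503)
)

# Mean and sample standard deviation of price within each brand.
brand_mean <- tapply(prices$price, prices$CarBrand, mean)
brand_sd   <- tapply(prices$price, prices$CarBrand, sd)

summary_tbl <- data.frame(CarBrand = names(brand_mean),
                          mean = as.vector(brand_mean),
                          sd = as.vector(brand_sd))

# Most expensive brand first, as in the table above.
summary_tbl <- summary_tbl[order(-summary_tbl$mean), ]
print(summary_tbl)
```

This also explains the `NA` for mercury: the sample standard deviation is undefined for a brand with a single car.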
## `geom_smooth()` using formula 'y ~ x'
## mean_price sd mean_wheelbase mean_carlength mean_carwidth
## jaguar 34600.00 2048 109.3333 196.9667 69.93333
## buick 33647.00 6790 110.9250 195.2625 71.06250
## porsche 31400.50 5654 92.2800 170.2600 67.12000
## bmw 26118.75 9264 103.1625 184.5000 66.47500
## volvo 18063.18 3315 106.4818 188.8000 67.96364
## mean_carheight mean_curbweight mean_enginesize mean_boreratio
## jaguar 51.13333 4027.333 280.6667 3.600000
## buick 55.72500 3696.250 226.5000 3.605000
## porsche 51.10000 2891.200 187.2000 3.820000
## bmw 54.82500 2929.375 166.8750 3.473750
## volvo 56.23636 3037.909 142.2727 3.662727
## mean_stroke mean_compressionratio mean_horsepower mean_peakrpm
## jaguar 3.700000 9.233333 204.6667 4833.333
## buick 3.432500 14.825000 146.2500 4487.500
## porsche 2.984000 9.600000 210.4000 5790.000
## bmw 3.167500 8.575000 138.8750 5068.750
## volvo 3.147273 10.227273 128.0000 5290.909
## mean_citympg mean_highwaympg
## jaguar 14.33333 18.33333
## buick 18.50000 21.00000
## porsche 17.40000 26.00000
## bmw 19.37500 25.37500
## volvo 21.18182 25.81818
## mean_price sd mean_wheelbase mean_carlength mean_carwidth
## buick 33647.00 6790 110.925 195.2625 71.06250
## peugeot 15489.09 2247 110.200 191.1364 68.39091
## mean_carheight mean_curbweight mean_enginesize mean_boreratio
## buick 55.72500 3696.25 226.5000 3.605000
## peugeot 57.18182 3221.00 135.8182 3.582727
## mean_stroke mean_compressionratio mean_horsepower mean_peakrpm
## buick 3.4325 14.825 146.25000 4487.500
## peugeot 3.1600 14.000 99.81818 4668.182
## mean_citympg mean_highwaympg
## buick 18.50000 21.00000
## peugeot 22.45455 26.63636
## mean_price sd mean_wheelbase mean_carlength mean_carwidth
## buick 2.12467531 1.7952821 2.022708 1.689170 2.284325
## peugeot 0.04682718 -0.4257553 1.901069 1.345181 1.036981
## mean_carheight mean_curbweight mean_enginesize mean_boreratio
## buick 1.154188 1.987529 1.99535737 0.9756887
## peugeot 1.897727 1.094851 0.01408637 0.8857085
## mean_stroke mean_compressionratio mean_horsepower mean_peakrpm
## buick 0.6303932 2.643252 0.7954832 -2.084920
## peugeot -0.3557064 2.206910 -0.3550508 -1.518879
## mean_citympg mean_highwaympg
## buick -0.9671597 -1.4485276
## peugeot -0.3011646 -0.5272565
## mean_price sd mean_wheelbase mean_carlength mean_carwidth
## buick 33647.00 6790 110.925 195.2625 71.06250
## peugeot 15489.09 2247 110.200 191.1364 68.39091
## subaru 8541.25 1940 96.175 168.8583 64.95000
## mean_carheight mean_curbweight mean_enginesize mean_boreratio
## buick 55.72500 3696.25 226.5000 3.605000
## peugeot 57.18182 3221.00 135.8182 3.582727
## subaru 53.75000 2316.25 107.0833 3.620000
## mean_stroke mean_compressionratio mean_horsepower mean_peakrpm
## buick 3.432500 14.825000 146.25000 4487.500
## peugeot 3.160000 14.000000 99.81818 4668.182
## subaru 2.616667 8.816667 86.25000 4775.000
## mean_citympg mean_highwaympg
## buick 18.50000 21.00000
## peugeot 22.45455 26.63636
## subaru 26.33333 30.75000
## mean_price sd mean_wheelbase mean_carlength mean_carwidth
## buick 2.12467531 1.7952821 2.022708 1.6891703 2.2843250
## peugeot 0.04682718 -0.4257553 1.901069 1.3451805 1.0369813
## subaru -0.74822903 -0.5758452 -0.452009 -0.5121054 -0.5695507
## mean_carheight mean_curbweight mean_enginesize mean_boreratio
## buick 1.1541883 1.987529 1.99535737 0.9756887
## peugeot 1.8977272 1.094851 0.01408637 0.8857085
## subaru 0.1461769 -0.604572 -0.61372991 1.0362875
## mean_stroke mean_compressionratio mean_horsepower mean_peakrpm
## buick 0.6303932 2.6432515 0.7954832 -2.084920
## peugeot -0.3557064 2.2069099 -0.3550508 -1.518879
## subaru -2.3218744 -0.5345495 -0.6912568 -1.184238
## mean_citympg mean_highwaympg
## buick -0.9671597 -1.4485276
## peugeot -0.3011646 -0.5272565
## subaru 0.3520720 0.1451228
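
The second table in each pair above looks like a column-wise z-score of the brand-level means, which puts price, size and engine figures on one scale so brands can be compared directly. A small sketch with `scale()`, using two columns for the five brands printed earlier. Because the z-scores in the post were presumably computed over all 22 brands, the numbers produced here will differ from the ones shown.

```r
# Brand-level means copied from the table of the five most expensive brands.
brand_means <- data.frame(
  mean_price      = c(34600.00, 33647.00, 31400.50, 26118.75, 18063.18),
  mean_enginesize = c(280.6667, 226.5000, 187.2000, 166.8750, 142.2727),
  row.names = c("jaguar", "buick", "porsche", "bmw", "volvo")
)

# scale() centres each column on its mean and divides by its sample sd.
z <- scale(brand_means)
print(round(z, 2))

# The same computation by hand for one column.
manual <- (brand_means$mean_price - mean(brand_means$mean_price)) /
  sd(brand_means$mean_price)
print(round(manual, 2))
```

Read this way, buick and peugeot are similar in size (wheelbase, length, height) but far apart in price and engine size, which is what the standardized rows make visible.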
##
## Call:
## lm(formula = price ~ . - cylindernumber_two - fuelsystem_idi,
## data = df.model)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5416.2 -1152.0 -35.8 830.8 9835.6
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.226e+04 1.652e+04 -1.347 0.179705
## symboling 7.388e+01 2.386e+02 0.310 0.757238
## wheelbase 4.882e+01 9.675e+01 0.505 0.614563
## carlength -6.130e+01 4.875e+01 -1.257 0.210410
## carwidth 6.936e+02 2.394e+02 2.897 0.004283 **
## carheight 8.943e+01 1.278e+02 0.700 0.485209
## curbweight 3.942e+00 1.715e+00 2.299 0.022781 *
## enginesize 1.174e+02 2.600e+01 4.515 1.21e-05 ***
## boreratio -1.882e+03 1.598e+03 -1.178 0.240443
## stroke -4.454e+03 9.009e+02 -4.944 1.89e-06 ***
## compressionratio -8.003e+02 5.259e+02 -1.522 0.129981
## horsepower 9.791e+00 2.227e+01 0.440 0.660789
## peakrpm 2.202e+00 6.194e-01 3.555 0.000495 ***
## citympg -1.477e+02 1.474e+02 -1.003 0.317569
## highwaympg 1.916e+02 1.347e+02 1.422 0.156916
## fueltype_gas -1.178e+04 7.017e+03 -1.678 0.095232 .
## aspiration_turbo 1.626e+03 8.856e+02 1.836 0.068172 .
## doornumber_two 1.876e+02 5.854e+02 0.320 0.749028
## carbody_hardtop -3.207e+03 1.376e+03 -2.331 0.020992 *
## carbody_hatchback -3.281e+03 1.223e+03 -2.683 0.008055 **
## carbody_sedan -2.152e+03 1.332e+03 -1.615 0.108182
## carbody_wagon -3.266e+03 1.455e+03 -2.244 0.026191 *
## drivewheel_fwd 7.405e+01 1.040e+03 0.071 0.943351
## drivewheel_rwd 1.033e+03 1.205e+03 0.857 0.392688
## enginelocation_rear 7.695e+03 2.536e+03 3.035 0.002802 **
## enginetype_dohcv -7.189e+03 4.674e+03 -1.538 0.125912
## enginetype_l -1.051e+03 1.608e+03 -0.654 0.514246
## enginetype_ohc 3.126e+03 9.088e+02 3.439 0.000741 ***
## enginetype_ohcf 1.234e+03 1.572e+03 0.785 0.433661
## enginetype_ohcv -5.605e+03 1.247e+03 -4.495 1.31e-05 ***
## enginetype_rotor -6.925e+01 4.505e+03 -0.015 0.987754
## cylindernumber_five -9.280e+03 2.716e+03 -3.417 0.000800 ***
## cylindernumber_four -9.879e+03 3.054e+03 -3.234 0.001476 **
## cylindernumber_six -6.570e+03 2.192e+03 -2.997 0.003154 **
## cylindernumber_three -4.629e+02 4.499e+03 -0.103 0.918173
## cylindernumber_twelve -1.024e+04 4.384e+03 -2.336 0.020707 *
## fuelsystem_2bbl -3.907e+01 8.920e+02 -0.044 0.965118
## fuelsystem_4bbl -1.624e+03 2.775e+03 -0.585 0.559295
## fuelsystem_mfi -3.480e+03 2.590e+03 -1.344 0.180967
## fuelsystem_mpfi -2.444e+02 1.001e+03 -0.244 0.807415
## fuelsystem_spdi -3.027e+03 1.382e+03 -2.191 0.029883 *
## fuelsystem_spfi -6.187e+02 2.508e+03 -0.247 0.805484
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2197 on 163 degrees of freedom
## Multiple R-squared: 0.9395, Adjusted R-squared: 0.9243
## F-statistic: 61.79 on 41 and 163 DF, p-value: < 2.2e-16
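
The model formula removes `cylindernumber_two` and `fuelsystem_idi`, presumably because each is perfectly collinear with other dummy variables. The sketch below shows the mechanism on made-up toy data in which every diesel car uses the `idi` fuel system, so the `idi` dummy equals one minus the `gas` dummy. All values are invented for illustration, and `model.matrix()` names its dummies without the underscore used in the post.

```r
# Toy data (invented): idi occurs exactly on the diesel rows.
toy <- data.frame(
  price      = c(13495, 16500, 13950, 17450, 7898, 9258, 18344, 11845),
  enginesize = c(130, 152, 109, 136, 90, 98, 70, 80),
  fueltype   = factor(c("gas", "gas", "gas", "gas",
                        "diesel", "diesel", "gas", "gas")),
  fuelsystem = factor(c("mpfi", "mpfi", "mpfi", "mpfi",
                        "idi", "idi", "4bbl", "4bbl"))
)

# Expand factors into 0/1 dummies; drop the intercept column.
dummies <- as.data.frame(model.matrix(price ~ ., data = toy))[, -1]
df_toy <- cbind(price = toy$price, dummies)

# With every dummy included, lm() cannot estimate the redundant one (NA).
fit_all <- lm(price ~ ., data = df_toy)
print(coef(fit_all))

# Removing the redundant dummy in the formula gives a full-rank model.
fit <- lm(price ~ . - fuelsystemidi, data = df_toy)
print(coef(fit))
```

Dropping the redundant column explicitly, as the post does, keeps the coefficient table free of `NA` rows.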
## Subset selection object
## Call: regsubsets.formula(price ~ . - cylindernumber_two - fuelsystem_idi,
## data = df.model, nvmax = 20)
## 41 Variables (and intercept)
## Forced in Forced out
## symboling FALSE FALSE
## wheelbase FALSE FALSE
## carlength FALSE FALSE
## carwidth FALSE FALSE
## carheight FALSE FALSE
## curbweight FALSE FALSE
## enginesize FALSE FALSE
## boreratio FALSE FALSE
## stroke FALSE FALSE
## compressionratio FALSE FALSE
## horsepower FALSE FALSE
## peakrpm FALSE FALSE
## citympg FALSE FALSE
## highwaympg FALSE FALSE
## fueltype_gas FALSE FALSE
## aspiration_turbo FALSE FALSE
## doornumber_two FALSE FALSE
## carbody_hardtop FALSE FALSE
## carbody_hatchback FALSE FALSE
## carbody_sedan FALSE FALSE
## carbody_wagon FALSE FALSE
## drivewheel_fwd FALSE FALSE
## drivewheel_rwd FALSE FALSE
## enginelocation_rear FALSE FALSE
## enginetype_dohcv FALSE FALSE
## enginetype_l FALSE FALSE
## enginetype_ohc FALSE FALSE
## enginetype_ohcf FALSE FALSE
## enginetype_ohcv FALSE FALSE
## enginetype_rotor FALSE FALSE
## cylindernumber_five FALSE FALSE
## cylindernumber_four FALSE FALSE
## cylindernumber_six FALSE FALSE
## cylindernumber_three FALSE FALSE
## cylindernumber_twelve FALSE FALSE
## fuelsystem_2bbl FALSE FALSE
## fuelsystem_4bbl FALSE FALSE
## fuelsystem_mfi FALSE FALSE
## fuelsystem_mpfi FALSE FALSE
## fuelsystem_spdi FALSE FALSE
## fuelsystem_spfi FALSE FALSE
## 1 subsets of each size up to 20
## Selection Algorithm: exhaustive
## symboling wheelbase carlength carwidth carheight curbweight
## 1 ( 1 ) " " " " " " " " " " " "
## 2 ( 1 ) " " " " " " " " " " " "
## 3 ( 1 ) " " " " " " "*" " " " "
## 4 ( 1 ) " " " " " " "*" " " " "
## 5 ( 1 ) " " " " " " "*" " " " "
## 6 ( 1 ) " " " " " " "*" " " " "
## 7 ( 1 ) " " " " " " "*" " " " "
## 8 ( 1 ) " " " " " " "*" " " " "
## 9 ( 1 ) " " " " " " " " " " "*"
## 10 ( 1 ) " " " " " " " " " " "*"
## 11 ( 1 ) " " " " " " " " " " "*"
## 12 ( 1 ) " " " " " " " " " " "*"
## 13 ( 1 ) " " " " " " " " " " "*"
## 14 ( 1 ) " " " " " " " " " " "*"
## 15 ( 1 ) " " " " " " "*" " " "*"
## 16 ( 1 ) " " " " " " "*" " " "*"
## 17 ( 1 ) " " " " "*" "*" " " "*"
## 18 ( 1 ) " " " " "*" "*" " " "*"
## 19 ( 1 ) " " " " "*" "*" " " "*"
## 20 ( 1 ) " " " " "*" "*" " " "*"
## enginesize boreratio stroke compressionratio horsepower peakrpm
## 1 ( 1 ) "*" " " " " " " " " " "
## 2 ( 1 ) "*" " " " " " " " " " "
## 3 ( 1 ) "*" " " " " " " " " " "
## 4 ( 1 ) "*" " " " " " " " " " "
## 5 ( 1 ) "*" " " " " " " " " " "
## 6 ( 1 ) "*" " " "*" " " " " " "
## 7 ( 1 ) "*" " " "*" " " " " " "
## 8 ( 1 ) "*" " " "*" " " " " " "
## 9 ( 1 ) "*" " " "*" " " " " " "
## 10 ( 1 ) "*" " " "*" " " " " " "
## 11 ( 1 ) "*" " " "*" " " " " "*"
## 12 ( 1 ) "*" " " "*" " " " " "*"
## 13 ( 1 ) "*" " " "*" " " " " "*"
## 14 ( 1 ) "*" " " "*" " " " " "*"
## 15 ( 1 ) "*" " " "*" " " " " "*"
## 16 ( 1 ) "*" " " "*" " " " " "*"
## 17 ( 1 ) "*" " " "*" " " " " "*"
## 18 ( 1 ) "*" " " "*" " " " " "*"
## 19 ( 1 ) "*" " " "*" " " " " "*"
## 20 ( 1 ) "*" " " "*" " " " " "*"
## citympg highwaympg fueltype_gas aspiration_turbo doornumber_two
## 1 ( 1 ) " " " " " " " " " "
## 2 ( 1 ) " " " " " " " " " "
## 3 ( 1 ) " " " " " " " " " "
## 4 ( 1 ) " " " " " " " " " "
## 5 ( 1 ) " " " " " " " " " "
## 6 ( 1 ) " " " " " " " " " "
## 7 ( 1 ) " " " " " " " " " "
## 8 ( 1 ) " " " " " " "*" " "
## 9 ( 1 ) " " " " " " " " " "
## 10 ( 1 ) " " " " " " " " " "
## 11 ( 1 ) " " " " " " " " " "
## 12 ( 1 ) " " " " "*" " " " "
## 13 ( 1 ) " " " " " " "*" " "
## 14 ( 1 ) " " " " " " "*" " "
## 15 ( 1 ) " " " " " " "*" " "
## 16 ( 1 ) " " " " " " "*" " "
## 17 ( 1 ) " " " " " " "*" " "
## 18 ( 1 ) " " " " "*" "*" " "
## 19 ( 1 ) " " " " " " "*" " "
## 20 ( 1 ) " " " " " " "*" " "
## carbody_hardtop carbody_hatchback carbody_sedan carbody_wagon
## 1 ( 1 ) " " " " " " " "
## 2 ( 1 ) " " " " " " " "
## 3 ( 1 ) " " " " " " " "
## 4 ( 1 ) " " " " " " " "
## 5 ( 1 ) " " " " " " " "
## 6 ( 1 ) " " " " " " " "
## 7 ( 1 ) " " " " " " " "
## 8 ( 1 ) " " " " " " " "
## 9 ( 1 ) " " " " " " " "
## 10 ( 1 ) " " " " " " " "
## 11 ( 1 ) " " " " " " " "
## 12 ( 1 ) " " " " " " " "
## 13 ( 1 ) " " " " "*" " "
## 14 ( 1 ) " " " " "*" " "
## 15 ( 1 ) " " " " "*" " "
## 16 ( 1 ) " " "*" " " "*"
## 17 ( 1 ) " " " " "*" " "
## 18 ( 1 ) " " " " "*" " "
## 19 ( 1 ) "*" "*" " " "*"
## 20 ( 1 ) "*" "*" " " "*"
## drivewheel_fwd drivewheel_rwd enginelocation_rear enginetype_dohcv
## 1 ( 1 ) " " " " " " " "
## 2 ( 1 ) " " " " " " " "
## 3 ( 1 ) " " " " "*" " "
## 4 ( 1 ) " " " " "*" " "
## 5 ( 1 ) " " "*" "*" " "
## 6 ( 1 ) " " " " "*" " "
## 7 ( 1 ) " " " " "*" " "
## 8 ( 1 ) " " " " "*" " "
## 9 ( 1 ) " " " " "*" " "
## 10 ( 1 ) " " " " "*" " "
## 11 ( 1 ) " " " " "*" " "
## 12 ( 1 ) " " " " "*" " "
## 13 ( 1 ) " " " " "*" " "
## 14 ( 1 ) " " " " "*" " "
## 15 ( 1 ) " " " " "*" " "
## 16 ( 1 ) " " " " "*" " "
## 17 ( 1 ) " " " " "*" "*"
## 18 ( 1 ) " " " " "*" "*"
## 19 ( 1 ) " " " " "*" "*"
## 20 ( 1 ) " " " " "*" "*"
## enginetype_l enginetype_ohc enginetype_ohcf enginetype_ohcv
## 1 ( 1 ) " " " " " " " "
## 2 ( 1 ) " " " " " " " "
## 3 ( 1 ) " " " " " " " "
## 4 ( 1 ) " " " " " " " "
## 5 ( 1 ) " " " " " " " "
## 6 ( 1 ) " " " " " " "*"
## 7 ( 1 ) " " " " "*" "*"
## 8 ( 1 ) " " " " "*" "*"
## 9 ( 1 ) " " "*" " " "*"
## 10 ( 1 ) " " "*" " " "*"
## 11 ( 1 ) " " "*" " " "*"
## 12 ( 1 ) " " "*" " " "*"
## 13 ( 1 ) " " "*" " " "*"
## 14 ( 1 ) " " "*" " " "*"
## 15 ( 1 ) " " "*" " " "*"
## 16 ( 1 ) " " "*" " " "*"
## 17 ( 1 ) " " "*" " " "*"
## 18 ( 1 ) " " "*" " " "*"
## 19 ( 1 ) " " "*" " " "*"
## 20 ( 1 ) " " "*" " " "*"
## enginetype_rotor cylindernumber_five cylindernumber_four
## 1 ( 1 ) " " " " " "
## 2 ( 1 ) " " " " "*"
## 3 ( 1 ) " " " " " "
## 4 ( 1 ) " " " " "*"
## 5 ( 1 ) " " " " "*"
## 6 ( 1 ) " " " " "*"
## 7 ( 1 ) " " " " "*"
## 8 ( 1 ) " " " " "*"
## 9 ( 1 ) " " "*" "*"
## 10 ( 1 ) " " "*" "*"
## 11 ( 1 ) " " "*" "*"
## 12 ( 1 ) " " "*" "*"
## 13 ( 1 ) " " "*" "*"
## 14 ( 1 ) " " "*" "*"
## 15 ( 1 ) " " "*" "*"
## 16 ( 1 ) " " "*" "*"
## 17 ( 1 ) " " "*" "*"
## 18 ( 1 ) " " "*" "*"
## 19 ( 1 ) " " "*" "*"
## 20 ( 1 ) " " "*" "*"
## cylindernumber_six cylindernumber_three cylindernumber_twelve
## 1 ( 1 ) " " " " " "
## 2 ( 1 ) " " " " " "
## 3 ( 1 ) " " " " " "
## 4 ( 1 ) " " " " " "
## 5 ( 1 ) " " " " " "
## 6 ( 1 ) " " " " " "
## 7 ( 1 ) " " " " " "
## 8 ( 1 ) " " " " " "
## 9 ( 1 ) "*" " " " "
## 10 ( 1 ) "*" " " "*"
## 11 ( 1 ) "*" " " "*"
## 12 ( 1 ) "*" " " "*"
## 13 ( 1 ) "*" " " "*"
## 14 ( 1 ) "*" " " "*"
## 15 ( 1 ) "*" " " "*"
## 16 ( 1 ) "*" " " "*"
## 17 ( 1 ) "*" " " "*"
## 18 ( 1 ) "*" " " "*"
## 19 ( 1 ) "*" " " "*"
## 20 ( 1 ) "*" " " "*"
## fuelsystem_2bbl fuelsystem_4bbl fuelsystem_mfi fuelsystem_mpfi
## 1 ( 1 ) " " " " " " " "
## 2 ( 1 ) " " " " " " " "
## 3 ( 1 ) " " " " " " " "
## 4 ( 1 ) " " " " " " " "
## 5 ( 1 ) " " " " " " " "
## 6 ( 1 ) " " " " " " " "
## 7 ( 1 ) " " " " " " " "
## 8 ( 1 ) " " " " " " " "
## 9 ( 1 ) " " " " " " " "
## 10 ( 1 ) " " " " " " " "
## 11 ( 1 ) " " " " " " " "
## 12 ( 1 ) " " " " " " " "
## 13 ( 1 ) " " " " " " " "
## 14 ( 1 ) " " " " " " " "
## 15 ( 1 ) " " " " " " " "
## 16 ( 1 ) " " " " " " " "
## 17 ( 1 ) " " " " " " " "
## 18 ( 1 ) " " " " " " " "
## 19 ( 1 ) " " " " " " " "
## 20 ( 1 ) " " " " "*" " "
## fuelsystem_spdi fuelsystem_spfi
## 1 ( 1 ) " " " "
## 2 ( 1 ) " " " "
## 3 ( 1 ) " " " "
## 4 ( 1 ) " " " "
## 5 ( 1 ) " " " "
## 6 ( 1 ) " " " "
## 7 ( 1 ) " " " "
## 8 ( 1 ) " " " "
## 9 ( 1 ) " " " "
## 10 ( 1 ) " " " "
## 11 ( 1 ) " " " "
## 12 ( 1 ) " " " "
## 13 ( 1 ) " " " "
## 14 ( 1 ) "*" " "
## 15 ( 1 ) "*" " "
## 16 ( 1 ) "*" " "
## 17 ( 1 ) "*" " "
## 18 ( 1 ) "*" " "
## 19 ( 1 ) "*" " "
## 20 ( 1 ) "*" " "
## [1] 17
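
The output above comes from `regsubsets()` with an exhaustive search up to 20 predictors, and the final `[1] 17` is presumably the subset size chosen by some criterion (the post does not say which). To show the idea without extra packages, here is a small exhaustive search written with `combn()` on the built-in `mtcars` data. The data set, the five candidate predictors and the adjusted R-squared criterion are all stand-ins chosen for illustration.

```r
predictors <- c("wt", "hp", "disp", "qsec", "drat")

# For each subset size k, fit every k-variable model and keep the best one.
best_by_size <- lapply(seq_along(predictors), function(k) {
  subsets <- combn(predictors, k, simplify = FALSE)
  adjr2 <- vapply(subsets, function(vars) {
    fit <- lm(reformulate(vars, response = "mpg"), data = mtcars)
    summary(fit)$adj.r.squared
  }, numeric(1))
  list(vars = subsets[[which.max(adjr2)]], adjr2 = max(adjr2))
})

# Compare the per-size winners and pick the size with the highest adjusted R^2.
scores <- vapply(best_by_size, function(b) b$adjr2, numeric(1))
best_size <- which.max(scores)

print(round(scores, 4))
print(best_size)
print(best_by_size[[best_size]]$vars)
```

With 41 candidate variables the number of subsets is far too large for this brute-force loop, which is why a dedicated routine with a size cap such as `nvmax = 20` is the practical choice.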
## bmw x5_17 jaguar xk_50
## 17 50
## rstudent unadjusted p-value Bonferroni p
## bmw x5_17 5.667555 5.3412e-08 1.0949e-05
## jaguar xk_50 -3.974038 1.0047e-04 2.0595e-02
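
In the outlier table above, the Bonferroni p-value is the unadjusted p-value multiplied by the number of observations: 5.3412e-08 x 205 is about 1.0949e-05. The sketch below reproduces that calculation in base R on a stand-in model fitted to `mtcars`. The model and data are assumptions for illustration, and capping at 1 is a simplification.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)   # stand-in model, not the post's

rs <- rstudent(fit)                # externally studentized residuals
n  <- length(rs)
df <- df.residual(fit) - 1         # one df is lost by deleting the case

# Two-sided p-value for each residual, then the Bonferroni adjustment.
p_unadj <- 2 * pt(abs(rs), df = df, lower.tail = FALSE)
p_bonf  <- pmin(1, n * p_unadj)

# Show the three most extreme observations.
worst <- order(p_unadj)[1:3]
print(data.frame(rstudent     = rs[worst],
                 unadjusted_p = p_unadj[worst],
                 bonferroni_p = p_bonf[worst]))
```

The adjustment accounts for the fact that the largest of n residuals is being tested, so only observations whose Bonferroni p-value stays below 0.05 are flagged.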
## Actual Price Predicted Price Residuals
## bmw x5_17 41315 28587.52 12727.479
## buick regal sport coupe (turbo)_75 45400 38125.21 7274.786
## mazda rx-7 gs_67 18344 11975.85 6368.150
## bmw x3_18 36880 30638.47 6241.530
## bmw 320i_12 16925 11776.43 5148.573
## Actual Price Predicted Price Residuals
## peugeot 304_109 13200 18471.68 -5271.677
## toyota corolla liftback_179 16558 21912.13 -5354.134
## toyota corona_180 15998 22035.70 -6037.702
## toyota starlet_181 15690 22630.68 -6940.678
## jaguar xk_50 36000 44165.94 -8165.943
##
## Call:
## lm(formula = price ~ carlength + carwidth + curbweight + enginesize +
## stroke + peakrpm + aspiration_turbo + carbody_sedan + enginelocation_rear +
## fuelsystem_spdi + enginetype_ohcv + cylindernumber_four +
## cylindernumber_five + cylindernumber_six + cylindernumber_twelve +
## enginetype_ohc + enginetype_dohcv, data = df.model)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5399.1 -1152.7 -21.7 1084.3 10383.2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.867e+04 1.033e+04 -2.776 0.006064 **
## carlength -9.535e+01 3.612e+01 -2.640 0.008992 **
## carwidth 6.357e+02 1.905e+02 3.337 0.001020 **
## curbweight 5.079e+00 1.167e+00 4.350 2.24e-05 ***
## enginesize 1.250e+02 1.100e+01 11.367 < 2e-16 ***
## stroke -4.854e+03 6.360e+02 -7.632 1.15e-12 ***
## peakrpm 1.810e+00 3.754e-01 4.822 2.94e-06 ***
## aspiration_turbo 2.051e+03 5.578e+02 3.678 0.000308 ***
## carbody_sedan 1.227e+03 3.462e+02 3.543 0.000499 ***
## enginelocation_rear 9.320e+03 1.632e+03 5.712 4.34e-08 ***
## fuelsystem_spdi -2.569e+03 8.867e+02 -2.897 0.004213 **
## enginetype_ohcv -5.725e+03 9.536e+02 -6.004 9.82e-09 ***
## cylindernumber_four -9.543e+03 9.370e+02 -10.184 < 2e-16 ***
## cylindernumber_five -8.436e+03 1.197e+03 -7.046 3.42e-11 ***
## cylindernumber_six -6.262e+03 9.406e+02 -6.657 3.01e-10 ***
## cylindernumber_twelve -1.390e+04 2.660e+03 -5.225 4.63e-07 ***
## enginetype_ohc 3.342e+03 5.124e+02 6.522 6.30e-10 ***
## enginetype_dohcv -6.919e+03 2.596e+03 -2.665 0.008361 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2197 on 187 degrees of freedom
## Multiple R-squared: 0.9307, Adjusted R-squared: 0.9244
## F-statistic: 147.7 on 17 and 187 DF, p-value: < 2.2e-16
Variable | VIF |
---|---|
curbweight | 15.620966 |
enginesize | 8.865851 |
carlength | 8.393444 |
carwidth | 7.057938 |
cylindernumber_four | 6.490903 |
cylindernumber_six | 3.884329 |
cylindernumber_five | 3.091772 |
enginetype_ohcv | 2.294144 |
enginetype_ohc | 2.238741 |
aspiration_turbo | 1.954530 |
stroke | 1.681694 |
enginelocation_rear | 1.631093 |
cylindernumber_twelve | 1.459111 |
fuelsystem_spdi | 1.402013 |
enginetype_dohcv | 1.389245 |
peakrpm | 1.355141 |
carbody_sedan | 1.267907 |
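
The values in the table above look like variance inflation factors. For a predictor j, the VIF is 1 / (1 - R_j^2), where R_j^2 comes from regressing that predictor on all the others. A base R sketch on a stand-in `mtcars` model (the helper name `vif_manual` is hypothetical):

```r
# VIF for each predictor of a fitted lm, computed from its definition.
vif_manual <- function(fit) {
  X <- model.matrix(fit)[, -1, drop = FALSE]   # drop the intercept column
  vapply(colnames(X), function(v) {
    others <- X[, colnames(X) != v, drop = FALSE]
    r2 <- summary(lm(X[, v] ~ others))$r.squared
    1 / (1 - r2)
  }, numeric(1))
}

fit  <- lm(mpg ~ wt + hp + disp, data = mtcars)   # stand-in model
vifs <- sort(vif_manual(fit), decreasing = TRUE)
print(round(vifs, 2))
```

A common rule of thumb treats values above 5 or 10 as a sign of problematic collinearity. By that yardstick `curbweight` stands out in the table, which is unsurprising given how strongly weight tracks engine size and body dimensions.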
##
##
## ASSESSMENT OF THE LINEAR MODEL ASSUMPTIONS
## USING THE GLOBAL TEST ON 4 DEGREES-OF-FREEDOM:
## Level of Significance = 0.05
##
## Call:
## gvlma(x = car.lin.final)
##
## Value p-value Decision
## Global Stat 118.193 0.000e+00 Assumptions NOT satisfied!
## Skewness 17.732 2.543e-05 Assumptions NOT satisfied!
## Kurtosis 70.151 0.000e+00 Assumptions NOT satisfied!
## Link Function 28.910 7.580e-08 Assumptions NOT satisfied!
## Heteroscedasticity 1.399 2.369e-01 Assumptions acceptable.
## StudRes Hat CookD
## bmw x5_17 5.667555 0.09770143 0.19923890
## jaguar xk_50 -3.974038 0.30033467 0.41930148
## buick regal sport coupe (turbo)_75 3.344256 0.23352573 0.21561004
## porcshce panamera_127 -0.938417 0.33338970 0.02938012
## porsche boxter_129 1.158257 0.33355880 0.04468373
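
The table above lists studentized residuals, hat values (leverage) and Cook's distances for the most influential cars. All three are available in base R. The sketch below computes them for a stand-in `mtcars` model and flags rows using common rules of thumb. The cut-offs are conventions chosen for illustration, not the ones used in the post.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)   # stand-in model

infl <- data.frame(
  StudRes = rstudent(fit),         # outlyingness in the response
  Hat     = hatvalues(fit),        # leverage from unusual predictor values
  CookD   = cooks.distance(fit)    # overall influence on the fit
)

p <- length(coef(fit))   # number of estimated coefficients
n <- nrow(infl)

# Rule-of-thumb cut-offs (assumptions): |StudRes| > 2, Hat > 2p/n, CookD > 4/n.
flagged <- infl[abs(infl$StudRes) > 2 |
                infl$Hat > 2 * p / n |
                infl$CookD > 4 / n, ]
print(round(flagged, 3))
```

In the post's table, the two Porsche rows have high leverage but small residuals, while `bmw x5_17` has modest leverage and a very large residual. The two kinds of unusual point call for different follow-up.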
##
## Regression tree:
## tree(formula = price ~ ., data = df.model)
## Variables actually used in tree construction:
## [1] "enginesize" "curbweight" "carwidth" "compressionratio"
## Number of terminal nodes: 6
## Residual mean deviance: 5997000 = 1.193e+09 / 199
## Distribution of residuals:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -8303.0 -1441.0 -129.9 0.0 1114.0 8685.0
## enginesize compressionratio price
## buick regal sport coupe (turbo)_75 304 8.0 45400
## bmw x5_17 209 8.0 41315
## buick century special_74 308 8.0 40960
## porsche boxter_129 194 9.5 37028
## bmw x3_18 209 8.0 36880
## jaguar xk_50 326 11.5 36000
## jaguar xf_49 258 8.1 35550
## buick skylark_73 234 8.3 35056
## buick opel isuzu deluxe_72 234 8.3 34184
## porsche cayenne_128 194 9.5 34028
## porcshce panamera_127 194 9.5 32528
## %IncMSE IncNodePurity
## symboling 5.5506738 27652007.8
## wheelbase 11.6218645 363085238.2
## carlength 9.0886828 388731472.1
## carwidth 10.4412789 684804066.7
## carheight 6.2723744 64822333.0
## curbweight 17.2600207 2447527614.3
## enginesize 20.6506663 3757571354.6
## boreratio 5.8788457 191809604.4
## stroke 6.8302816 64912108.3
## compressionratio 6.4827881 110740876.8
## horsepower 11.9331161 1774957144.4
## peakrpm 7.0620797 107914806.2
## citympg 10.3803305 846994623.4
## highwaympg 9.7742496 968835011.0
## fueltype_gas 2.6800230 12023422.7
## aspiration_turbo 5.5099684 26391562.0
## doornumber_two 0.9832765 9292022.8
## carbody_hardtop 0.9117290 13510364.1
## carbody_hatchback 3.3675654 16799455.8
## carbody_sedan 2.4896032 14297389.6
## carbody_wagon 1.4457744 6597847.6
## drivewheel_fwd 5.0063847 87940091.9
## drivewheel_rwd 6.2407733 201386574.1
## enginelocation_rear 2.9416288 30506090.5
## enginetype_dohcv 0.0000000 2585366.6
## enginetype_l 2.9871659 6456397.5
## enginetype_ohc 5.5181063 12548571.8
## enginetype_ohcf 3.1169770 26152671.1
## enginetype_ohcv 0.6275535 17228263.2
## enginetype_rotor 1.7440113 1991354.7
## cylindernumber_five 4.1596691 11717978.0
## cylindernumber_four 7.2504680 560273960.4
## cylindernumber_six 3.4393328 19015761.7
## cylindernumber_three 0.0000000 920529.5
## cylindernumber_twelve 0.0000000 134063.9
## fuelsystem_2bbl 3.4222256 33404732.2
## fuelsystem_4bbl 2.9847853 1521606.7
## fuelsystem_mfi 0.0000000 166602.0
## fuelsystem_mpfi 6.2596768 75532779.3
## fuelsystem_spdi 2.3757003 1494368.7
## fuelsystem_spfi 0.0000000 204471.9
## var rel.inf
## enginesize enginesize 37.382890448
## curbweight curbweight 17.423063632
## horsepower horsepower 8.429876541
## highwaympg highwaympg 7.582575066
## wheelbase wheelbase 3.799326528
## carwidth carwidth 3.381947405
## boreratio boreratio 2.888442259
## carlength carlength 2.769236076
## carheight carheight 2.689816562
## citympg citympg 2.273619341
## stroke stroke 2.172274605
## cylindernumber_four cylindernumber_four 1.700792919
## peakrpm peakrpm 1.627248022
## compressionratio compressionratio 1.530524317
## symboling symboling 0.850128845
## enginetype_ohc enginetype_ohc 0.766424611
## drivewheel_rwd drivewheel_rwd 0.559863409
## carbody_sedan carbody_sedan 0.498021217
## cylindernumber_six cylindernumber_six 0.355087137
## carbody_hatchback carbody_hatchback 0.350631260
## fuelsystem_mpfi fuelsystem_mpfi 0.239446166
## doornumber_two doornumber_two 0.238945558
## drivewheel_fwd drivewheel_fwd 0.206307660
## aspiration_turbo aspiration_turbo 0.099386153
## fuelsystem_2bbl fuelsystem_2bbl 0.076652948
## carbody_wagon carbody_wagon 0.073180729
## enginetype_l enginetype_l 0.022532153
## enginetype_ohcv enginetype_ohcv 0.010375580
## enginetype_ohcf enginetype_ohcf 0.001382852
## fueltype_gas fueltype_gas 0.000000000
## carbody_hardtop carbody_hardtop 0.000000000
## enginelocation_rear enginelocation_rear 0.000000000
## enginetype_dohcv enginetype_dohcv 0.000000000
## enginetype_rotor enginetype_rotor 0.000000000
## cylindernumber_five cylindernumber_five 0.000000000
## cylindernumber_three cylindernumber_three 0.000000000
## cylindernumber_twelve cylindernumber_twelve 0.000000000
## fuelsystem_4bbl fuelsystem_4bbl 0.000000000
## fuelsystem_mfi fuelsystem_mfi 0.000000000
## fuelsystem_spdi fuelsystem_spdi 0.000000000
## fuelsystem_spfi fuelsystem_spfi 0.000000000