Mtcars

For this exercise I use the R dataset mtcars.

Motor Trend Car Road Tests

Description

The data was extracted from the 1974 American magazine Motor Trend and includes fuel consumption and 10 aspects of automobile design and performance for 32 cars (1973–74 models).

Usage

mtcars

Format

A data frame with 32 observations on 11 (numeric) variables.

[, 1]  mpg   Miles/(US) gallon
[, 2]  cyl   Number of cylinders
[, 3]  disp  Displacement (cubic inches)
[, 4]  hp    Gross horsepower
[, 5]  drat  Rear axle ratio
[, 6]  wt    Weight (1000 lbs)
[, 7]  qsec  1/4 mile time
[, 8]  vs    Engine (0 = V-shaped, 1 = straight)
[, 9]  am    Transmission (0 = automatic, 1 = manual)
[,10]  gear  Number of forward gears
[,11]  carb  Number of carburetors

Note

Henderson and Velleman (1981) comment in a footnote to Table 1: ‘Hocking [original transcriber]’s noncrucial coding of the Mazda’s rotary engine as a straight six-cylinder engine and the Porsche’s flat engine as a V engine, as well as the inclusion of the diesel Mercedes 240D, have been retained to enable direct comparisons to be made with previous analyses.’

Source

Henderson and Velleman (1981), Building multiple regression models interactively. Biometrics, 37, 391–411.

I load the data

library(bpca)
library(scatterplot3d)
library(rgl)
mtcars
##                      mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
## Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
## Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
## Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
## Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
## Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
## Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
## Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
## Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
## Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
## Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
## Toyota Corona       21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
## Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
## AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
## Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
## Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
## Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
## Porsche 914-2       26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
## Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
## Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
## Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
## Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
## Volvo 142E          21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2
prcomp(mtcars)
## Standard deviations (1, .., p=11):
##  [1] 136.5330479  38.1480776   3.0710166   1.3066508   0.9064862   0.6635411
##  [7]   0.3085791   0.2859604   0.2506973   0.2106519   0.1984238
## 
## Rotation (n x k) = (11 x 11):
##               PC1          PC2          PC3          PC4         PC5
## mpg  -0.038118199  0.009184847  0.982070847  0.047634784 -0.08832843
## cyl   0.012035150 -0.003372487 -0.063483942 -0.227991962  0.23872590
## disp  0.899568146  0.435372320  0.031442656 -0.005086826 -0.01073597
## hp    0.434784387 -0.899307303  0.025093049  0.035715638  0.01655194
## drat -0.002660077 -0.003900205  0.039724928 -0.057129357 -0.13332765
## wt    0.006239405  0.004861023 -0.084910258  0.127962867 -0.24354296
## qsec -0.006671270  0.025011743 -0.071670457  0.886472188 -0.21416101
## vs   -0.002729474  0.002198425  0.004203328  0.177123945 -0.01688851
## am   -0.001962644 -0.005793760  0.054806391 -0.135658793 -0.06270200
## gear -0.002604768 -0.011272462  0.048524372 -0.129913811 -0.27616440
## carb  0.005766010 -0.027779208 -0.102897231 -0.268931427 -0.85520810
##               PC6          PC7           PC8          PC9         PC10
## mpg  -0.143790084 -0.039239174  2.271040e-02 -0.002790139  0.030630361
## cyl  -0.793818050  0.425011021 -1.890403e-01  0.042677206  0.131718534
## disp  0.007424138  0.000582398 -5.841464e-04  0.003532713 -0.005399132
## hp    0.001653685 -0.002212538  4.748087e-06 -0.003734085  0.001862554
## drat  0.227229260  0.034847411 -9.385817e-01 -0.014131110  0.184102094
## wt   -0.127142296 -0.186558915  1.561907e-01 -0.390600261  0.829886844
## qsec -0.189564973  0.254844548 -1.028515e-01 -0.095914479 -0.204240658
## vs    0.102619063 -0.080788938 -2.132903e-03  0.684043835  0.303060724
## am    0.205217266  0.200858874 -2.273255e-02 -0.572372433 -0.162808201
## gear  0.334971103  0.801625551  2.174878e-01  0.156118559  0.203540645
## carb -0.283788381 -0.165474186  3.972219e-03  0.127583043 -0.239954748
##               PC11
## mpg  -0.0158569365
## cyl   0.1454453628
## disp  0.0009420262
## hp   -0.0021526102
## drat -0.0973818815
## wt   -0.0198581635
## qsec  0.0110677880
## vs    0.6256900918
## am    0.7331658036
## gear -0.1909325849
## carb  0.0557957968
plot(prcomp(mtcars))

summary(prcomp(mtcars))
## Importance of components:
##                            PC1      PC2     PC3     PC4     PC5     PC6    PC7
## Standard deviation     136.533 38.14808 3.07102 1.30665 0.90649 0.66354 0.3086
## Proportion of Variance   0.927  0.07237 0.00047 0.00008 0.00004 0.00002 0.0000
## Cumulative Proportion    0.927  0.99937 0.99984 0.99992 0.99996 0.99998 1.0000
##                          PC8    PC9   PC10   PC11
## Standard deviation     0.286 0.2507 0.2107 0.1984
## Proportion of Variance 0.000 0.0000 0.0000 0.0000
## Cumulative Proportion  1.000 1.0000 1.0000 1.0000
# Scree plot for the PCA on standardized variables
plot(prcomp(mtcars, scale. = TRUE))

#Analysis Biplot

summary(prcomp(mtcars, scale. = TRUE))
## Importance of components:
##                           PC1    PC2     PC3     PC4     PC5     PC6    PC7
## Standard deviation     2.5707 1.6280 0.79196 0.51923 0.47271 0.46000 0.3678
## Proportion of Variance 0.6008 0.2409 0.05702 0.02451 0.02031 0.01924 0.0123
## Cumulative Proportion  0.6008 0.8417 0.89873 0.92324 0.94356 0.96279 0.9751
##                            PC8    PC9    PC10   PC11
## Standard deviation     0.35057 0.2776 0.22811 0.1485
## Proportion of Variance 0.01117 0.0070 0.00473 0.0020
## Cumulative Proportion  0.98626 0.9933 0.99800 1.0000
prcomp(mtcars, scale. = TRUE)
## Standard deviations (1, .., p=11):
##  [1] 2.5706809 1.6280258 0.7919579 0.5192277 0.4727061 0.4599958 0.3677798
##  [8] 0.3505730 0.2775728 0.2281128 0.1484736
## 
## Rotation (n x k) = (11 x 11):
##             PC1         PC2         PC3          PC4         PC5         PC6
## mpg  -0.3625305  0.01612440 -0.22574419 -0.022540255  0.10284468 -0.10879743
## cyl   0.3739160  0.04374371 -0.17531118 -0.002591838  0.05848381  0.16855369
## disp  0.3681852 -0.04932413 -0.06148414  0.256607885  0.39399530 -0.33616451
## hp    0.3300569  0.24878402  0.14001476 -0.067676157  0.54004744  0.07143563
## drat -0.2941514  0.27469408  0.16118879  0.854828743  0.07732727  0.24449705
## wt    0.3461033 -0.14303825  0.34181851  0.245899314 -0.07502912 -0.46493964
## qsec -0.2004563 -0.46337482  0.40316904  0.068076532 -0.16466591 -0.33048032
## vs   -0.3065113 -0.23164699  0.42881517 -0.214848616  0.59953955  0.19401702
## am   -0.2349429  0.42941765 -0.20576657 -0.030462908  0.08978128 -0.57081745
## gear -0.2069162  0.46234863  0.28977993 -0.264690521  0.04832960 -0.24356284
## carb  0.2140177  0.41357106  0.52854459 -0.126789179 -0.36131875  0.18352168
##               PC7          PC8          PC9        PC10         PC11
## mpg   0.367723810 -0.754091423  0.235701617  0.13928524 -0.124895628
## cyl   0.057277736 -0.230824925  0.054035270 -0.84641949 -0.140695441
## disp  0.214303077  0.001142134  0.198427848  0.04937979  0.660606481
## hp   -0.001495989 -0.222358441 -0.575830072  0.24782351 -0.256492062
## drat  0.021119857  0.032193501 -0.046901228 -0.10149369 -0.039530246
## wt   -0.020668302 -0.008571929  0.359498251  0.09439426 -0.567448697
## qsec  0.050010522 -0.231840021 -0.528377185 -0.27067295  0.181361780
## vs   -0.265780836  0.025935128  0.358582624 -0.15903909  0.008414634
## am   -0.587305101 -0.059746952 -0.047403982 -0.17778541  0.029823537
## gear  0.605097617  0.336150240 -0.001735039 -0.21382515 -0.053507085
## carb -0.174603192 -0.395629107  0.170640677  0.07225950  0.319594676
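# Scores of the 32 cars on the first two components
plot(prcomp(mtcars, scale. = TRUE)$x[, 1:2])

# Same plot, labelling each point with the car name
plot(prcomp(mtcars, scale. = TRUE)$x[, 1:2], type = "n")
text(prcomp(mtcars, scale. = TRUE)$x[, 1:2], rownames(mtcars))

# Biplot: cars as points, variables as arrows
biplot(prcomp(mtcars, scale. = TRUE))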

Comments

The data cover fuel consumption and 10 aspects of car design and performance for 32 cars (1973–74 models). In the first analysis the variables are not standardized, so the standard deviations reported by prcomp are the square roots of the eigenvalues of the covariance matrix, and each one measures the variability captured by its component. PC1 has a standard deviation of 136.53 and is dominated by disp (displacement in cubic inches), whose loading is 0.8996. The scree plot confirms that this first component has by far the greatest relative importance.
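The link between the standard deviations from prcomp and the eigenvalues can be checked directly. This is a minimal sketch; the object names (pca_raw, pca_std, eig_cov, eig_cor) are my own and do not appear in the analysis above.

```r
# Unscaled PCA: squared standard deviations should match the
# eigenvalues of the covariance matrix.
pca_raw <- prcomp(mtcars)
eig_cov <- eigen(cov(mtcars))$values
all.equal(pca_raw$sdev^2, eig_cov)

# Scaled PCA: squared standard deviations should match the
# eigenvalues of the correlation matrix.
pca_std <- prcomp(mtcars, scale. = TRUE)
eig_cor <- eigen(cor(mtcars))$values
all.equal(pca_std$sdev^2, eig_cor)
```

Both comparisons use all.equal rather than ==, because the two routes (singular value decomposition in prcomp, eigendecomposition in eigen) agree only up to floating-point tolerance.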

The first table of component importance shows that PC1 explains 92.7% of the variance, so it is practically the only relevant component. The second most influential component is PC2 (7.2% of the variance), which is dominated by hp (gross horsepower) with a loading of -0.899. This dominance reflects measurement scale: disp and hp take much larger numerical values than the other variables, so their variances swamp the unstandardized analysis.
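To see why two variables drive the unstandardized result, it may help to look at the variance of each column on its original scale and to recompute the proportion of variance by hand. This is a sketch, and the names var_by_column and prop_var are assumptions of mine.

```r
# Variance of each variable on its original scale, largest first.
var_by_column <- sort(sapply(mtcars, var), decreasing = TRUE)
round(var_by_column, 2)

# Proportion of variance per component, recomputed from the
# standard deviations of the unscaled PCA.
pca_raw <- prcomp(mtcars)
prop_var <- pca_raw$sdev^2 / sum(pca_raw$sdev^2)
round(prop_var[1:3], 5)
```

The first vector shows which variables contribute most to the total variance, and the second should reproduce the "Proportion of Variance" row printed by summary(prcomp(mtcars)).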

I repeat the analysis with standardized variables, which is equivalent to working with the correlation matrix. The first two components now account for 84.2% of the variability (60.1% for PC1 and 24.1% for PC2), so a plot on these two components is reasonably representative. In the biplot, PC1 is a weighted contrast of the original variables, with loadings ranging from cyl (0.374) to mpg (-0.363). PC2 contrasts gear (0.462) with qsec (-0.463).
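The cumulative variance and the ordering of the loadings can be read straight from the fitted object. This is a sketch with illustrative names (pca_std, cum_var, load_pc1, load_pc2). The sign of each component is arbitrary, so the whole ordering may come out reversed on another platform or R version.

```r
pca_std <- prcomp(mtcars, scale. = TRUE)

# Cumulative proportion of variance for the first two components.
cum_var <- cumsum(pca_std$sdev^2) / sum(pca_std$sdev^2)
round(cum_var[1:2], 4)

# Loadings on PC1 and PC2, ordered from most negative to most positive.
load_pc1 <- sort(pca_std$rotation[, "PC1"])
load_pc2 <- sort(pca_std$rotation[, "PC2"])
round(load_pc1, 3)
round(load_pc2, 3)
```

Variables at opposite ends of each sorted vector are the ones the component contrasts most strongly.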

I keep only the first two principal components, since together they explain 84.2% of the variance, which is enough for a two-dimensional summary of the data.
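One way to make the choice of components explicit is to pick the smallest number whose cumulative variance passes a cutoff. This is a sketch only: the 0.80 threshold and the names threshold, n_keep and scores are hypothetical and are not part of the original analysis.

```r
pca_std <- prcomp(mtcars, scale. = TRUE)
cum_var <- cumsum(pca_std$sdev^2) / sum(pca_std$sdev^2)

# Hypothetical cutoff chosen for illustration; the text does not specify one.
threshold <- 0.80
n_keep <- which(cum_var >= threshold)[1]
n_keep

# Scores of the retained components, one row per car.
scores <- pca_std$x[, seq_len(n_keep), drop = FALSE]
head(scores)
```

A different cutoff would change n_keep, so the threshold should be treated as a judgment call rather than a fixed rule.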