(#4) [40 Points]

20 engineer apprentices and 20 pilots were given 6 tests (Travers, 1939):

  • \(y_{1}\): intelligence
  • \(y_{2}\): form relations
  • \(y_{3}\): dynamometer
  • \(y_{4}\): dotting
  • \(y_{5}\): sensory motor coordination
  • \(y_{6}\): perseveration
group y1 y2 y3 y4 y5 y6
1 121 22 74 223 54 254
1 108 30 80 175 40 300
1 122 49 87 266 41 223
1 77 37 66 178 80 209
1 140 35 71 175 38 261
1 108 37 57 241 59 245

a. Perform PCA using the pooled covariance matrix.

eigen() decomposition
$values
[1] 1050.59628  858.31581  398.90346  259.14837  108.08917   43.35349

$vectors
            [,1]          [,2]        [,3]         [,4]         [,5]
[1,] -0.44092830  0.1897832542  0.86398819  0.079577443  0.105698146
[2,] -0.04088997  0.0377138213  0.08242069 -0.227027690 -0.057349251
[3,]  0.03933709 -0.0313490863  0.14336529 -0.023912831 -0.985372990
[4,] -0.44974010 -0.8923295867 -0.03264727  0.003570544  0.004135096
[5,]  0.01936520  0.0009446983 -0.05392188  0.970251939 -0.047798944
[6,] -0.77441699  0.4066008303 -0.47138542 -0.012347547 -0.110802522
            [,6]
[1,]  0.07472808
[2,] -0.96709979
[3,]  0.07337894
[4,] -0.01964830
[5,] -0.23031058
[6,]  0.01789487
[1] 0.38647504 0.31574225 0.14674165 0.09533098 0.03976196 0.01594813
[1] 0.3864750 0.7022173 0.8489589 0.9442899 0.9840519 1.0000000

b. Ignore groups and use a covariance matrix based on 40 observations to perform PCA.

Call:
princomp(cor = FALSE, scores = TRUE, covmat = cov.wt(df.sub))

Standard deviations:
   Comp.1    Comp.2    Comp.3    Comp.4    Comp.5    Comp.6 
41.497499 29.637102 20.035932 16.157875 11.353640  7.097781 

 6  variables and  40 observations.
Importance of components:
                           Comp.1     Comp.2     Comp.3      Comp.4
Standard deviation     41.4974985 29.6371023 20.0359318 16.15787527
Proportion of Variance  0.5002739  0.2551734  0.1166227  0.07584597
Cumulative Proportion   0.5002739  0.7554473  0.8720700  0.94791596
                            Comp.5     Comp.6
Standard deviation     11.35364032 7.09778133
Proportion of Variance  0.03744848 0.01463556
Cumulative Proportion   0.98536444 1.00000000

c. Which of the approaches in (a) & (b) seems to be more successful?

The approach from part b seems to be more successful as it explains a higher cumulative proportion of variance with each additional principal component than the pooled variance method for pilots vs engineers in part a. The biggest indicator of the second approach being more appropriate is the difference in proportion of variance explained by the first principal component (0.3865 for method a and 0.5003 for method b).

d. For (e)-(f), combine the two groups into a single sample. Using a scree plot with percentages, is there a clear choice of m?

Standard deviations (1, .., p=6):
[1] 41.497499 29.637102 20.035932 16.157875 11.353640  7.097781

Rotation (n x k) = (6 x 6):
           PC1         PC2         PC3         PC4         PC5         PC6
y1  0.21165160 -0.38949336  0.88819049 -0.03082062 -0.04760343 -0.10677164
y2 -0.03883125 -0.06379320  0.09571590  0.19128493 -0.14793191  0.96269790
y3  0.08012946  0.06602004  0.08145863  0.12854488  0.97505667  0.12379748
y4  0.77552673  0.60795970  0.08071120 -0.08125631 -0.10891968  0.06295166
y5 -0.09593926 -0.01046493  0.01494473 -0.96813856  0.10919120  0.20309559
y6  0.58019734 -0.68566916 -0.43426141 -0.04518327  0.03644629  0.03572141

The scree plot shows significant decreases in marginal increase in variance explained after the 2nd principal component. Thus, m should be set to either 2 or 3, depending on theory behind the decision. Either seem appropriate, but as a statistician I prefer explaining more variance over the simplistinc interpretation, so I would choose m = 3.

e. Extract 3 factors by the principle component method and carry out a varimax rotation.

       y1        y2        y3        y4        y5        y6 
 8.783012 -1.611400  3.325172 32.182419 -3.981239 24.076738 
         y1          y2          y3          y4          y5          y6 
-11.5434545  -1.8906456   1.9566427  18.0181637  -0.3101502 -20.3212470 
        y1         y2         y3         y4         y5         y6 
17.7957240  1.9177572  1.6320995  1.6171240  0.2994317 -8.7008320 
$loadings

Loadings:
   l1      l2      l3     
y1   5.246  -5.720  21.606
y2  -1.773   0.590   2.521
y3   4.143  -0.166   0.601
y4  35.522  -8.086  -5.979
y5  -3.156   2.456   0.210
y6   4.445 -32.148   3.887

                     l1       l2      l3
SS loadings    1339.397 1137.992 524.449
Proportion Var  223.233  189.665  87.408
Cumulative Var  223.233  412.898 500.306

$rotmat
          [,1]       [,2]        [,3]
[1,] 0.7697299 -0.6363192  0.05112505
[2,] 0.5710494  0.6505477 -0.50068983
[3,] 0.2853393  0.4145909  0.86411569

f. Extract 3 factors by the maximum likelihood method and carry out a varimax rotation.


Call:
factanal(x = df.sub, factors = 3, rotation = "none")

Uniquenesses:
   y1    y2    y3    y4    y5    y6 
0.005 0.703 0.866 0.378 0.005 0.726 

Loadings:
   Factor1 Factor2 Factor3
y1  0.725           0.685 
y2  0.298  -0.455         
y3  0.179   0.316         
y4  0.217   0.757         
y5 -0.723           0.687 
y6  0.372   0.325   0.172 

               Factor1 Factor2 Factor3
SS loadings      1.355   0.985   0.977
Proportion Var   0.226   0.164   0.163
Cumulative Var   0.226   0.390   0.553

The degrees of freedom for the model is 0 and the fit was 0.0165 
       y1        y2        y3        y4        y5        y6 
0.0050000 0.7027878 0.8663403 0.3775656 0.0050000 0.7260919 

Call:
factanal(x = df.sub, factors = 3, rotation = "varimax")

Uniquenesses:
   y1    y2    y3    y4    y5    y6 
0.005 0.703 0.866 0.378 0.005 0.726 

Loadings:
   Factor1 Factor2 Factor3
y1  0.996                 
y2  0.199  -0.328  -0.387 
y3                  0.345 
y4                  0.784 
y5          0.977  -0.201 
y6  0.367           0.366 

               Factor1 Factor2 Factor3
SS loadings      1.178   1.078   1.061
Proportion Var   0.196   0.180   0.177
Cumulative Var   0.196   0.376   0.553

The degrees of freedom for the model is 0 and the fit was 0.0165 

g. Compare the results of (e) and (f).

Method (e) seems to outperform method f with higher cumulative variance explained.

i. Find the classification confusion matrix using quadratic discriminant function.

[1] 0.9
   pred.class
     1  2
  1 18  2
  2  2 18

j. Compare the results of (i) and (g).

The results of LDA and QDA suggest very similar performance; they each classify correctly 90% of the time in this particular example.