1. The correlation matrix for all four variables is given below. The largest correlation coefficient in absolute value is between ozone and temperature (0.6985414), and the second largest is between ozone and wind (-0.6129508, absolute value 0.6129508).
| | ozone | radiation | temperature | wind |
|---|---|---|---|---|
| ozone | 1.0000000 | 0.3483417 | 0.6985414 | -0.6129508 |
| radiation | 0.3483417 | 1.0000000 | 0.2940876 | -0.1273656 |
| temperature | 0.6985414 | 0.2940876 | 1.0000000 | -0.4971459 |
| wind | -0.6129508 | -0.1273656 | -0.4971459 | 1.0000000 |
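The ranking in answer 1 can be checked mechanically. The following is a minimal sketch, assuming NumPy is available; the variable names (`corr`, `pairs`, `ranked`) are illustrative and not part of the original analysis. It lists each pair of variables once and sorts the pairs by absolute correlation.

```python
import numpy as np

names = ["ozone", "radiation", "temperature", "wind"]

# Correlation matrix copied from the table above.
corr = np.array([
    [ 1.0000000,  0.3483417,  0.6985414, -0.6129508],
    [ 0.3483417,  1.0000000,  0.2940876, -0.1273656],
    [ 0.6985414,  0.2940876,  1.0000000, -0.4971459],
    [-0.6129508, -0.1273656, -0.4971459,  1.0000000],
])

# Take each unordered pair once: upper triangle, excluding the diagonal.
rows, cols = np.triu_indices(len(names), k=1)
pairs = [(names[i], names[j], float(corr[i, j])) for i, j in zip(rows, cols)]

# Sort by absolute correlation, strongest first.
ranked = sorted(pairs, key=lambda p: abs(p[2]), reverse=True)

for a, b, r in ranked[:2]:
    print(f"{a} ~ {b}: r = {r:+.7f}")
```

Sorting on the absolute value matters here: the ozone and wind correlation is negative, so sorting on the signed value would place it last instead of second.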
2. The standard deviations of the principal component scores (PC1 to PC4) are 1.5363, 0.9458, 0.6898, and 0.5191. The loadings of the principal components are given in the following table:
| | PC1 | PC2 | PC3 | PC4 |
|---|---|---|---|---|
| ozone | 0.5890271 | -0.0630412 | 0.1137638 | 0.7975780 |
| radiation | 0.3168987 | 0.8985548 | -0.2773707 | -0.1234503 |
| temperature | 0.5527125 | -0.0612848 | 0.6585842 | -0.5069713 |
| wind | -0.4971228 | 0.4299643 | 0.6902102 | 0.3026705 |
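The standard deviations and loadings in answer 2 appear to come from a PCA on standardized variables, since the component variances sum to 4, the number of variables. Under that assumption, a rough sketch of how they relate to the correlation matrix from answer 1 is an eigendecomposition with NumPy: the square roots of the eigenvalues give the standard deviations, and the eigenvectors give the loadings up to sign. This is an illustration, not necessarily the routine used to produce the table.

```python
import numpy as np

# Correlation matrix from answer 1 (ozone, radiation, temperature, wind).
corr = np.array([
    [ 1.0000000,  0.3483417,  0.6985414, -0.6129508],
    [ 0.3483417,  1.0000000,  0.2940876, -0.1273656],
    [ 0.6985414,  0.2940876,  1.0000000, -0.4971459],
    [-0.6129508, -0.1273656, -0.4971459,  1.0000000],
])

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(corr)

# Reorder so the component with the largest variance comes first.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# Standard deviation of each component's scores.
sdev = np.sqrt(eigvals)

print("standard deviations:", np.round(sdev, 4))
print("loadings (columns are PC1..PC4, sign is arbitrary):")
print(np.round(eigvecs, 4))
```

The sign of each eigenvector is not determined, so a column may come out negated relative to the table; only the direction of each component is meaningful.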
3. For each principal component, the sum of the squared loadings over the four variables equals 1, as expected for unit-length loading vectors:
- PC1: \(\sum (PC1)^2\) = 1
- PC2: \(\sum (PC2)^2\) = 1
- PC3: \(\sum (PC3)^2\) = 1
- PC4: \(\sum (PC4)^2\) = 1
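The four sums above can be reproduced from the loadings table in answer 2. The snippet below is a small sketch assuming NumPy; the array name `loadings` and the variable `col_ss` are chosen here for illustration. Because the table values are rounded to seven decimals, the sums equal 1 only up to rounding error.

```python
import numpy as np

# Loadings from answer 2: rows are ozone, radiation, temperature, wind;
# columns are PC1..PC4.
loadings = np.array([
    [ 0.5890271, -0.0630412,  0.1137638,  0.7975780],
    [ 0.3168987,  0.8985548, -0.2773707, -0.1234503],
    [ 0.5527125, -0.0612848,  0.6585842, -0.5069713],
    [-0.4971228,  0.4299643,  0.6902102,  0.3026705],
])

# Sum of squared loadings down each column (one value per component).
col_ss = (loadings ** 2).sum(axis=0)

for k, ss in enumerate(col_ss, start=1):
    print(f"PC{k}: sum of squared loadings = {ss:.6f}")
```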
4. The two largest loadings in absolute value of the first principal component belong to the variables ozone and temperature, with the values 0.5890271 and 0.5527125 respectively. This agrees with the correlation analysis in the first question, where ozone and temperature had the highest correlation coefficient.
5. The largest loading in absolute value of the second principal component belongs to the variable radiation, with the value 0.8985548.
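The dominant variables named in answers 4 and 5 can be read off the loadings table programmatically. The sketch below assumes NumPy, and `top_variables` is a hypothetical helper written for this illustration: it ranks the variables of one component by the absolute value of their loadings.

```python
import numpy as np

names = ["ozone", "radiation", "temperature", "wind"]

# Loadings from answer 2: rows follow `names`, columns are PC1..PC4.
loadings = np.array([
    [ 0.5890271, -0.0630412,  0.1137638,  0.7975780],
    [ 0.3168987,  0.8985548, -0.2773707, -0.1234503],
    [ 0.5527125, -0.0612848,  0.6585842, -0.5069713],
    [-0.4971228,  0.4299643,  0.6902102,  0.3026705],
])

def top_variables(pc_index, k):
    """Return the k variables with the largest absolute loading on one component.

    pc_index is zero-based, so 0 means PC1 and 1 means PC2.
    """
    order = np.argsort(-np.abs(loadings[:, pc_index]))[:k]
    return [(names[i], float(loadings[i, pc_index])) for i in order]

print("PC1:", top_variables(0, 2))
print("PC2:", top_variables(1, 1))
```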
6. As the importance table below shows, the first principal component explains about 59% of the variance, and the fourth principal component explains about 6.7% of the variance.
importance:
| | PC1 | PC2 | PC3 | PC4 |
|---|---|---|---|---|
| Standard deviation | 1.536 | 0.9458 | 0.6898 | 0.5191 |
| Proportion of Variance | 0.59 | 0.2236 | 0.119 | 0.06736 |
| Cumulative Proportion | 0.59 | 0.8137 | 0.9326 | 1 |
7. No. The first two principal components are not enough to explain at least 90% of the total variance: the cumulative proportion in the summary table from problem 6 shows that together they explain only about 81.4%. The first three components are needed, which together explain about 93.3%.
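The proportions in answers 6 and 7 follow directly from the standard deviations reported in answer 2: each component's variance is its squared standard deviation, and its share is that variance divided by the total. The sketch below assumes NumPy and uses illustrative names (`sdev`, `prop`, `cum`, `n_needed`); the 0.90 threshold is the one stated in question 7.

```python
import numpy as np

# Standard deviations of the component scores, from answer 2 (PC1..PC4).
sdev = np.array([1.5363, 0.9458, 0.6898, 0.5191])

variance = sdev ** 2                # variance of each component
prop = variance / variance.sum()    # proportion of total variance
cum = np.cumsum(prop)               # cumulative proportion

# Smallest number of leading components whose cumulative share reaches 90%.
n_needed = int(np.searchsorted(cum, 0.90) + 1)

for k, (p, c) in enumerate(zip(prop, cum), start=1):
    print(f"PC{k}: proportion = {p:.4f}, cumulative = {c:.4f}")
print("components needed for 90%:", n_needed)
```

Since the inputs are rounded to four decimals, the computed proportions agree with the importance table only to about three decimal places.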