8.9

Use the fat data, fitting the model described in Section 4.2.

#8.9  Fitting as per 4.2 section
data(fat, package = "faraway")

lmod <- lm(brozek ~ age + weight + height + neck + chest + abdom + hip + thigh + knee + ankle + biceps + forearm + wrist, data=fat)

summary(lmod)
## 
## Call:
## lm(formula = brozek ~ age + weight + height + neck + chest + 
##     abdom + hip + thigh + knee + ankle + biceps + forearm + wrist, 
##     data = fat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -10.264  -2.572  -0.097   2.898   9.327 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -15.29255   16.06992  -0.952  0.34225    
## age           0.05679    0.02996   1.895  0.05929 .  
## weight       -0.08031    0.04958  -1.620  0.10660    
## height       -0.06460    0.08893  -0.726  0.46830    
## neck         -0.43754    0.21533  -2.032  0.04327 *  
## chest        -0.02360    0.09184  -0.257  0.79740    
## abdom         0.88543    0.08008  11.057  < 2e-16 ***
## hip          -0.19842    0.13516  -1.468  0.14341    
## thigh         0.23190    0.13372   1.734  0.08418 .  
## knee         -0.01168    0.22414  -0.052  0.95850    
## ankle         0.16354    0.20514   0.797  0.42614    
## biceps        0.15280    0.15851   0.964  0.33605    
## forearm       0.43049    0.18445   2.334  0.02044 *  
## wrist        -1.47654    0.49552  -2.980  0.00318 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.988 on 238 degrees of freedom
## Multiple R-squared:  0.749,  Adjusted R-squared:  0.7353 
## F-statistic: 54.63 on 13 and 238 DF,  p-value: < 2.2e-16

8.9a

Fit the same model but now using Huber’s robust method. Comment on any substantial differences between this model and the least squares fit.

library(MASS)
rlmMod <- rlm(brozek ~ age + weight + height + neck + chest + abdom + hip + thigh + knee + ankle + biceps + forearm + wrist, data=fat)
summary(rlmMod)
## 
## Call: rlm(formula = brozek ~ age + weight + height + neck + chest + 
##     abdom + hip + thigh + knee + ankle + biceps + forearm + wrist, 
##     data = fat)
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -10.3964  -2.7352  -0.1171   2.8008   9.4446 
## 
## Coefficients:
##             Value    Std. Error t value 
## (Intercept) -11.3460  17.1216    -0.6627
## age           0.0650   0.0319     2.0368
## weight       -0.0643   0.0528    -1.2163
## height       -0.0625   0.0948    -0.6595
## neck         -0.4553   0.2294    -1.9846
## chest        -0.0256   0.0978    -0.2614
## abdom         0.8778   0.0853    10.2891
## hip          -0.2142   0.1440    -1.4872
## thigh         0.2632   0.1425     1.8473
## knee         -0.1076   0.2388    -0.4505
## ankle         0.1815   0.2186     0.8306
## biceps        0.1367   0.1689     0.8091
## forearm       0.4152   0.1965     2.1126
## wrist        -1.5739   0.5279    -2.9812
## 
## Residual standard error: 4.073 on 238 degrees of freedom

Comments…the intercept is lower and each of the predictors (generally) have marginally greater effect on the result. The most predictive predictors in the original lm model are adom, wrist and forearm, and they are as well in the huber model, but what varies are some of the lesser predictors. In the huber model, age, appears to be significant, while it doesn’t in the basic lm model. The std error is actually better in the original, but only modestly so.

8.9.b

Identify which two cases have the lowest weights in the Huber fit. What is unusual about these two points?

wts <- rlmMod$w
names(wts) <- row.names(fat)
head(sort(wts))
##       224       207        39       231       225        81 
## 0.5269652 0.5800712 0.5987751 0.6260611 0.6304030 0.6367342

Cases 224 and 207 have lowest weights.

fat[224,]
##     brozek siri density age weight height adipos  free neck chest abdom
## 224    6.1  5.2  1.0874  55 142.25  67.25   22.2 133.6 35.2  92.7  82.8
##      hip thigh knee ankle biceps forearm wrist
## 224 91.9  54.4 35.2  22.5   29.4    26.8    17
fat[207,]
##     brozek siri density age weight height adipos  free neck chest abdom
## 207   31.7 32.9   1.025  44    166   65.5   27.2 113.5 39.1 100.6  93.9
##       hip thigh knee ankle biceps forearm wrist
## 207 100.1  58.9 37.6  21.4   33.1    29.5  17.3
head(fat[order(fat$brozek),])
##     brozek siri density age weight height adipos  free neck chest abdom
## 182    0.0  0.0  1.1089  40 118.50  68.00   18.1 118.5 33.8  79.3  69.4
## 172    1.9  0.7  1.0983  35 125.75  65.50   20.6 123.4 34.0  90.8  75.0
## 171    4.1  3.0  1.0926  35 152.25  67.75   23.4 146.1 37.0  92.2  81.9
## 26     4.6  3.7  1.0911  27 159.25  71.50   21.9 151.9 35.7  89.6  79.7
## 29     4.7  3.7  1.0910  27 133.25  64.75   22.4 127.0 36.4  93.5  73.9
## 55     4.9  3.9  1.0906  42 136.25  67.50   21.1 129.6 37.8  87.6  77.6
##      hip thigh knee ankle biceps forearm wrist
## 182 85.0  47.2 33.5  20.2   27.7    24.6  16.5
## 172 89.2  50.0 34.8  22.0   24.8    25.9  16.9
## 171 92.8  54.7 36.2  22.1   30.4    27.4  17.7
## 26  96.5  55.0 36.7  22.5   29.9    28.2  17.7
## 29  88.5  50.1 34.5  21.3   30.5    27.9  17.2
## 55  88.6  51.9 34.9  22.5   27.7    27.5  18.5
head(fat[order(fat$brozek, decreasing = TRUE),])
##     brozek siri density age weight height adipos  free neck chest abdom
## 216   45.1 47.5  0.9950  51 219.00  64.00   37.6 120.2 41.2 119.8 122.1
## 36    38.2 40.1  1.0101  49 191.75  65.00   32.0 118.4 38.4 118.5 113.1
## 192   36.5 38.1  1.0140  42 244.25  76.00   29.8 155.2 41.8 115.2 113.7
## 169   34.7 34.3  1.0180  35 228.25  69.50   33.3 149.3 40.4 114.9 115.9
## 39    33.8 35.2  1.0202  46 363.15  72.25   48.9 240.5 51.2 136.2 148.1
## 242   33.6 35.0  1.0207  65 224.50  68.25   33.9 149.2 38.8 119.6 118.0
##       hip thigh knee ankle biceps forearm wrist
## 216 112.8  62.5 36.9  23.6   34.7    29.1  18.4
## 36  113.8  61.9 38.3  21.9   32.0    29.8  17.0
## 192 112.4  68.5 45.0  25.5   37.1    31.2  19.9
## 169 111.9  74.4 40.6  24.0   36.1    31.8  18.8
## 39  147.7  87.3 49.1  29.6   45.0    29.0  21.4
## 242 114.3  61.3 42.1  23.4   34.9    30.1  19.4
colMeans(fat)
##     brozek       siri    density        age     weight     height 
##  18.938492  19.150794   1.055574  44.884921 178.924405  70.148810 
##     adipos       free       neck      chest      abdom        hip 
##  25.436905 143.713889  37.992063 100.824206  92.555952  99.904762 
##      thigh       knee      ankle     biceps    forearm      wrist 
##  59.405952  38.590476  23.102381  32.273413  28.663889  18.229762

What is interesting is they are near the high and low end of the values for the brozek weight but note, neither are all that close to being the lowest for either measure. What is also unusual, is that if one looks at colMeans for the fat dataset, the values for rows 224 and 207 do not have values that seem to vary all that much from the mean, they seem “relatively” typical.

8.9.c

Plot weight (of the man) against height. Identify the two outlying cases. Are these the same as those identified in the previous question? Discuss.

plot(fat$weight ~ fat$height)
identify(fat$weight ~ fat$height)

## integer(0)

Points 39, 42 were identified as outliers. One has the lowest height and the other the highest weight and are substantially separate from the rest of the points.

But, these are not the two lowest weighted points identified in the prior question (b), though “42” with lowest height is the 3rd lowest physical weight. But 39, with highest physical weight, is actually given a model weight of 1, meaning it is giving full weight in the model.

fat[39,]
##    brozek siri density age weight height adipos  free neck chest abdom
## 39   33.8 35.2  1.0202  46 363.15  72.25   48.9 240.5 51.2 136.2 148.1
##      hip thigh knee ankle biceps forearm wrist
## 39 147.7  87.3 49.1  29.6     45      29  21.4
fat[42,]
##    brozek siri density age weight height adipos  free neck chest abdom
## 42   31.7 32.9   1.025  44    205   29.5   29.9 140.1 36.6   106 104.3
##      hip thigh knee ankle biceps forearm wrist
## 42 115.5  70.6 42.5  23.7   33.6    28.7  17.4
wts[39]
##        39 
## 0.5987751
wts[42]
## 42 
##  1

One might expect these to have the lowest weights in the model. But being a perceived outlier doesn’t necessarily mean that a case isn’t important to the model (at least from a Huber perspective). The huber algorithm only weights values as less important, if they vary substantially from the mean square error.

So if we take the lowest and highest errors (prediction less actual y), and then sort the errors, we see that predictions row 207 and 224 are the farthest away from the true value.

min(predict(lmod) - fat$brozek)
## [1] -9.32674
max(predict(lmod) - fat$brozek)
## [1] 10.26353
errors <- predict(lmod) - fat$brozek
errors[order(errors)]
##          207           81           82          128          135 
## -9.326740246 -8.480250444 -8.310690775 -7.937236953 -7.758955591 
##          192          249          119          121           33 
## -7.323569740 -7.222483537 -7.057953104 -6.922060347 -6.374215955 
##            3           24          115          148           38 
## -6.252231350 -6.099412184 -6.068221177 -6.048565123 -5.972179308 
##           76           62          216          156          202 
## -5.905070445 -5.781976643 -5.760728538 -5.753483484 -5.665321123 
##          153           23           67          195          208 
## -5.607709967 -5.607066085 -5.600336102 -5.567206164 -5.496702951 
##           25           86          127           84           17 
## -5.408701711 -5.272933099 -5.261188961 -5.239335593 -5.138519868 
##           44          215          200          137          104 
## -5.128290958 -4.892854878 -4.748428662 -4.747473150 -4.740088194 
##          235          134           28          109          197 
## -4.700175029 -4.590060672 -4.547318115 -4.520872194 -4.513017439 
##           71          120          252          139          143 
## -4.466810428 -4.260100079 -4.176970668 -4.051587183 -4.013073267 
##          213            6          175          167           46 
## -3.856960302 -3.709195183 -3.689518678 -3.672680204 -3.661451588 
##          144          240          173          141           66 
## -3.658081445 -3.584511978 -3.541652183 -3.538373408 -3.426388034 
##          160          228           18          138          234 
## -3.292389956 -3.191149954 -3.148293222 -3.055496422 -3.002179264 
##          117           94          237           63          100 
## -2.949819930 -2.928179976 -2.910291908 -2.894011636 -2.773960791 
##           13           47           78           59          122 
## -2.766336569 -2.755563313 -2.620682562 -2.620063863 -2.594752168 
##          103          196          101          132            7 
## -2.531944167 -2.485774218 -2.451910868 -2.313650732 -2.310623343 
##          157           36          243          217          241 
## -2.308508365 -2.239497617 -2.220884542 -2.016906272 -1.956784895 
##           65          129           74          212          178 
## -1.943050272 -1.891207796 -1.825364485 -1.793654500 -1.624201324 
##          146          179          206          181           10 
## -1.621389342 -1.618837460 -1.517487855 -1.412340955 -1.394875743 
##          251            5          105          114          246 
## -1.371373735 -1.355656321 -1.312001824 -1.273753572 -1.215500947 
##           88          168          166           96          113 
## -1.198519465 -1.165096790 -1.129126835 -1.012550774 -0.994287810 
##          191          145          203          151          199 
## -0.979269809 -0.936836446 -0.912585051 -0.783202256 -0.768405351 
##           61          219          159          123           92 
## -0.699878050 -0.629847610 -0.615146285 -0.614063380 -0.577461426 
##          111          102           42          110           58 
## -0.461265211 -0.438073261 -0.397557661 -0.348638044 -0.315270117 
##          165          116          174          136          106 
## -0.283844742 -0.266894612 -0.242014491 -0.226214261 -0.122204790 
##           37           99           40           35          130 
## -0.039775918 -0.004164290  0.003380545  0.032541671  0.085673834 
##           90          154          164          131          244 
##  0.090389476  0.103646852  0.161977787  0.188430945  0.198381292 
##           77           85          245          226          185 
##  0.272175438  0.278121162  0.336389040  0.352395229  0.376412091 
##          118          247           19          155          149 
##  0.425856244  0.441460975  0.611702470  0.688095922  0.733336556 
##           70           56          205          233          124 
##  0.771006154  0.802323028  0.886750166  0.929834140  1.067504958 
##          133          176           91           27           64 
##  1.071808823  1.089736027  1.110899192  1.143719656  1.296101092 
##           50            8          169          218          150 
##  1.321479840  1.346472137  1.354520090  1.371188704  1.379469960 
##            4          188           79          125           60 
##  1.397600909  1.463068549  1.470454965  1.535017900  1.543936943 
##          242           68          239           21          193 
##  1.557764745  1.740667507  1.759099800  1.818916398  1.838677020 
##          194           15          190          189           16 
##  1.850908182  1.905346337  1.943330571  1.971224424  1.972940322 
##          198           11           69          229          161 
##  1.992984700  2.063488268  2.131570179  2.162792720  2.228433106 
##           93           41          186           34           73 
##  2.231387704  2.250878879  2.289392886  2.291908594  2.296554594 
##          163           29          152           31           87 
##  2.365234487  2.369271885  2.417510053  2.422538099  2.431043322 
##           43          214          177            2           52 
##  2.487680848  2.530132369  2.535439065  2.559765439  2.609298207 
##           30          220          162          142           45 
##  2.628319455  2.652202377  2.826430491  2.833008837  2.855322308 
##          147          201          126          230          248 
##  3.039754680  3.130436174  3.230399417  3.367332565  3.370326238 
##          170          210           55            1          108 
##  3.383059932  3.426305657  3.474885433  3.564508677  3.564998529 
##          236           14           48           72          187 
##  3.566693218  3.596004602  3.605176496  3.660259456  3.729414039 
##          209           54          227          112           51 
##  3.863018374  3.936430547  3.957901882  4.120180330  4.124763307 
##           89           49          222           83          183 
##  4.220435703  4.233347035  4.235489955  4.260587556  4.361475626 
##           22          184           26           12           75 
##  4.406119506  4.406373970  4.497675174  4.675528543  4.754878386 
##           32          232           98           57            9 
##  4.875461736  4.894576083  5.022766447  5.034121476  5.113356443 
##          158           95          211          182          223 
##  5.175276754  5.231336135  5.258555963  5.309336175  5.335983091 
##           80           53          107           20          180 
##  5.548460835  5.643654028  6.093045743  6.140421251  6.281388794 
##          238           97          172          250          140 
##  6.667960176  6.723220533  6.925153108  7.116068918  7.726238505 
##          221          171           39          204          231 
##  7.831497228  7.911169314  8.170497848  8.371973433  8.609910081 
##          225          224 
##  8.736079635 10.263532226