1a

library(alr4)
## Loading required package: car
## Loading required package: carData
## Loading required package: effects
## lattice theme set by effectsTheme()
## See ?effectsTheme for details.
attach(BGSall)
library(ggplot2)
ggplot(BGSall, aes(x=HT2, y=HT18, color=factor(Sex)))+geom_point(shape=1)+ geom_smooth(method=lm, se=FALSE)

A linear relationship appears appropriate for both groups, and the slopes look similar between males and females.
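
As a quick numerical check on the visual impression, the within-group slopes can be compared directly (a sketch):

# fit HT18 on HT2 separately for each sex and compare the slopes
by(BGSall, BGSall$Sex, function(d) coef(lm(HT18 ~ HT2, data = d))["HT2"])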

1b

BGS.lm = lm(HT18 ~ HT2*Sex, data = BGSall)
summary(BGS.lm)
## 
## Call:
## lm(formula = HT18 ~ HT2 * Sex, data = BGSall)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.2921  -3.0481  -0.0345   3.1263  12.9258 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  79.95704   16.48559   4.850 3.42e-06 ***
## HT2           1.12052    0.18642   6.011 1.69e-08 ***
## Sex         -18.98222   22.80001  -0.833    0.407    
## HT2:Sex       0.08941    0.25940   0.345    0.731    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.99 on 132 degrees of freedom
## Multiple R-squared:  0.6887, Adjusted R-squared:  0.6816 
## F-statistic: 97.34 on 3 and 132 DF,  p-value: < 2.2e-16

Since the p-value for the interaction HT2:Sex is 0.731, there is no evidence that the slope differs between the two groups. Hence, the parallel regression model should suffice.
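
Equivalently, the interaction can be tested with an F-test comparing the parallel model to the separate-slopes model; with a single extra parameter this reproduces the t-test above (a sketch):

# F-test for the interaction term: parallel vs separate-slopes model
anova(lm(HT18 ~ HT2 + Sex, data = BGSall), BGS.lm)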

1c

BGSp.lm = lm(HT18 ~ HT2+Sex, data = BGSall)
confint(BGSp.lm)
##                   2.5 %    97.5 %
## (Intercept)  53.2604815 98.492691
## HT2           0.9111399  1.422249
## Sex         -12.8416680 -9.417779

We are 95% confident that, at any fixed height at age 2, females are between 9.42 and 12.84 centimeters shorter than males at age 18.
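
Because the lines are parallel, the estimated sex difference is the same at every HT2. A sketch, assuming the usual BGSall coding Sex = 0 for boys and 1 for girls (the value 87 cm is an arbitrary illustrative height):

# predicted HT18 for a boy and a girl with the same height at age 2;
# the difference equals the Sex coefficient whatever HT2 is chosen
nd <- data.frame(HT2 = c(87, 87), Sex = c(0, 1))
diff(predict(BGSp.lm, newdata = nd))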

2a

library(alr4)
# Forward selection
mantel.0 <-  lm(Y ~ 1, data = mantel)
mantel.fw <-  step(mantel.0, scope = c(upper = ~ X1 + X2 + X3), direction = 'forward')
## Start:  AIC=9.59
## Y ~ 1
## 
##        Df Sum of Sq     RSS     AIC
## + X3    1   20.6879  2.1121 -0.3087
## + X1    1    8.6112 14.1888  9.2151
## + X2    1    8.5064 14.2936  9.2519
## <none>              22.8000  9.5866
## 
## Step:  AIC=-0.31
## Y ~ X3
## 
##        Df Sum of Sq    RSS      AIC
## <none>              2.1121 -0.30875
## + X2    1  0.066328 2.0458  1.53172
## + X1    1  0.064522 2.0476  1.53613
# Backward selection
mantel1 <- lm(Y ~ X1 + X2 + X3, data = mantel) # full model
# note: this pass starts from the forward-selection result Y ~ X3, not from mantel1
mantel1.bw <- step(mantel.fw, scope = c(lower = ~ 1), direction = 'backward')
## Start:  AIC=-0.31
## Y ~ X3
## 
##        Df Sum of Sq     RSS     AIC
## <none>               2.1121 -0.3087
## - X3    1    20.688 22.8000  9.5866

The final model after forward selection is Y ~ X3. The backward pass shown here also ends with Y ~ X3, but note that it was started from the forward-selection result rather than from the full model mantel1, so it could only ever consider removing X3.
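
A sketch of running backward elimination from the full model instead (output not shown): given the exact fit of {X1, X2} found in part 2b, this pass would be expected to drop X3 and stop at Y ~ X1 + X2.

# backward elimination starting from the full three-predictor model
mantel1.bw2 <- step(mantel1, scope = c(lower = ~ 1), direction = 'backward')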

2b

library(leaps)
library(alr4)
attach(mantel)
# unnamed cbind columns appear as a, b, c (= X1, X2, X3) in the output below
mod.reg <- regsubsets(cbind(mantel$X1, mantel$X2, mantel$X3), mantel$Y, data = mantel)
summary.reg <- summary(mod.reg)
names(summary.reg)
## [1] "which"  "rsq"    "rss"    "adjr2"  "cp"     "bic"    "outmat" "obj"
summary.reg$which
##   (Intercept)     a     b     c
## 1        TRUE FALSE FALSE  TRUE
## 2        TRUE  TRUE  TRUE FALSE
## 3        TRUE  TRUE  TRUE  TRUE
summary.reg$rsq
## [1] 0.9073635 1.0000000 1.0000000
summary.reg$adjr2
## [1] 0.8764847 1.0000000 1.0000000
summary.reg$cp
## [1] 1.838726e+27 6.802184e+00 4.000000e+00
summary.reg$bic
## [1]   -8.676486 -312.170229 -319.351963

In the best-subsets output, the unnamed matrix columns a, b, and c correspond to X1, X2, and X3. The best one-predictor subset is {X3}, the model the stepwise searches settled on, but the best two-predictor subset {X1, X2} achieves R^2 = adjusted R^2 = 1, an exact fit. Mallows' Cp and BIC are both smallest for the full three-predictor model, with {X1, X2} a close second on BIC. So best subsets reveals what stepwise selection missed: X1 and X2 together reproduce Y exactly, while X3 alone explains only about 91% of the variation.
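
The exact fit can be verified directly (a sketch):

# X1 and X2 together reproduce Y to machine precision
fit12 <- lm(Y ~ X1 + X2, data = mantel)
summary(fit12)$r.squared
max(abs(resid(fit12)))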

3a

attach(lathe1)
lathe1.lm = lm(Life~poly(Feed, 2) + poly(Speed, 2) + Feed:Speed, data = lathe1)
boxCox(lathe1.lm)
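
boxCox() plots the profile log-likelihood for the transformation parameter λ (plot not reproduced here); the peak evidently lies near λ = 0, which is why log(Life) is used in the parts that follow. A sketch for reading off the optimum numerically, assuming the MASS package is available:

# locate the lambda that maximizes the Box-Cox profile log-likelihood
bc <- MASS::boxcox(lathe1.lm, plotit = FALSE)
bc$x[which.max(bc$y)]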

3b

lathe1F.lm = lm(log(Life)~Speed+Feed+I(Speed^2)+I(Feed^2)+Speed*Feed, data = lathe1)
lathe1R.lm = lm(log(Life)~Feed+I(Feed^2), data = lathe1)
anova(lathe1R.lm,lathe1F.lm)
## Analysis of Variance Table
## 
## Model 1: log(Life) ~ Feed + I(Feed^2)
## Model 2: log(Life) ~ Speed + Feed + I(Speed^2) + I(Feed^2) + Speed * Feed
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1     17 32.300                                  
## 2     14  1.237  3    31.063 117.22 3.726e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

H0: β1 = β11 = β12 = 0 (the terms involving Speed are not needed once Feed and Feed² are in the model); H1: at least one of them is nonzero. Note that the comparison above drops three terms (Speed, Speed², and Speed:Feed), so this is a 3-df test rather than the global F-test of all five coefficients. The p-value is 3.726e-10, far below α = 0.05, so we reject the null hypothesis; there is significant evidence that the Speed terms are useful.

3c We are testing whether the terms involving Speed (Speed, Speed², and Speed:Feed) are useful, given that Feed and Feed² are already in the model. From the above test, we found strong evidence that they are.

3d

lathe1F.lm = lm(log(Life)~Speed+I(Speed^2)+I(Feed^2), data = lathe1)
lathe1R.lm = lm(log(Life)~Feed+I(Feed^2), data = lathe1)
anova(lathe1R.lm,lathe1F.lm)
## Analysis of Variance Table
## 
## Model 1: log(Life) ~ Feed + I(Feed^2)
## Model 2: log(Life) ~ Speed + I(Speed^2) + I(Feed^2)
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1     17 32.300                                  
## 2     16  8.772  1    23.528 42.915 6.675e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The p-value for the F-test is 6.675e-06, far below α = 0.05, so we reject the null hypothesis; there is significant evidence that the second model fits better. (One caution: these two models are not nested, since Feed appears only in the first, so this F statistic should be read with care.)

3e

lathe2.lm = lm(log(Life)~Speed+Feed+I(Speed^2)+I(Feed^2)+Speed*Feed, data = lathe1)
influenceIndexPlot(lathe2.lm)

The two most influential cases are 9 and 10, due to their large Cook's distances. They also have the largest leverages; however, two other points share the same leverage values, so their influence is likely driven more by their large standardized residuals.
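
The quantities behind the index plots can also be inspected numerically (a sketch):

# largest Cook's distances and hat values, confirming cases 9 and 10
sort(cooks.distance(lathe2.lm), decreasing = TRUE)[1:4]
sort(hatvalues(lathe2.lm), decreasing = TRUE)[1:4]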

lathe.rmv = lathe1[-c(9,10),]
lathe3.lm = lm(log(Life)~Speed+Feed+I(Speed^2)+I(Feed^2)+Speed*Feed, data = lathe.rmv)
summary(lathe3.lm)
## 
## Call:
## lm(formula = log(Life) ~ Speed + Feed + I(Speed^2) + I(Feed^2) + 
##     Speed * Feed, data = lathe.rmv)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.39963 -0.14660  0.00387  0.14917  0.32783 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.18809    0.08241  14.417 6.11e-09 ***
## Speed       -1.43300    0.08241 -17.388 7.10e-10 ***
## Feed        -0.79023    0.06729 -11.743 6.15e-08 ***
## I(Speed^2)   0.28022    0.12363   2.267 0.042700 *  
## I(Feed^2)    0.42244    0.09217   4.583 0.000629 ***
## Speed:Feed  -0.07286    0.08241  -0.884 0.394025    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2331 on 12 degrees of freedom
## Multiple R-squared:  0.9759, Adjusted R-squared:  0.9658 
## F-statistic: 97.07 on 5 and 12 DF,  p-value: 2.804e-09

The coefficient most affected is the main effect for Speed, while the others stay mostly the same. Also, the standard errors are uniformly smaller using the reduced data set.
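
car's compareCoefs() (loaded with alr4) puts the two fits side by side, making the shift in the Speed coefficient easy to see (a sketch):

# coefficients and standard errors with and without cases 9 and 10
compareCoefs(lathe2.lm, lathe3.lm)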

4abc

library(leaps)
library(faraway)
## 
## Attaching package: 'faraway'
## The following objects are masked from 'package:alr4':
## 
##     cathedral, pipeline, twins
## The following objects are masked from 'package:car':
## 
##     logit, vif
attach(divusa)
div.reg <- regsubsets(cbind(year,unemployed,femlab,marriage,birth,military), divorce, data=divusa)
summary.reg <- summary(div.reg)
names(summary.reg)
## [1] "which"  "rsq"    "rss"    "adjr2"  "cp"     "bic"    "outmat" "obj"
summary.reg$which
##   (Intercept)  year unemployed femlab marriage birth military
## 1        TRUE FALSE      FALSE   TRUE    FALSE FALSE    FALSE
## 2        TRUE FALSE      FALSE   TRUE    FALSE  TRUE    FALSE
## 3        TRUE FALSE      FALSE   TRUE     TRUE  TRUE    FALSE
## 4        TRUE  TRUE      FALSE   TRUE     TRUE  TRUE    FALSE
## 5        TRUE  TRUE      FALSE   TRUE     TRUE  TRUE     TRUE
## 6        TRUE  TRUE       TRUE   TRUE     TRUE  TRUE     TRUE
summary.reg$adjr2
## [1] 0.8265403 0.8720158 0.9105579 0.9208807 0.9289506 0.9287914
summary.reg$cp
## [1] 109.695444  62.001274  22.692257  12.998703   5.841314   7.000000
summary.reg$bic
## [1] -127.2216 -147.3224 -171.6165 -177.7776 -182.7945 -179.3707

From the best-subsets regression, adjusted R^2, Mallows' Cp, and BIC all point to the same five-predictor model: adjusted R^2 is largest (0.9290), Cp is smallest (5.84), and BIC is smallest (-182.79) for the subset {year, femlab, marriage, birth, military}. The best model is therefore divorce ~ year + femlab + marriage + birth + military, which drops unemployed.
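
A sketch of fitting the selected model:

# fit the five-predictor subset chosen by adjusted R^2, Cp, and BIC
div.lm <- lm(divorce ~ year + femlab + marriage + birth + military, data = divusa)
summary(div.lm)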