3.4 Using the sat data:

(a) Fit a model with total sat score as the response and expend, ratio and salary as predictors. Test the hypothesis that βsalary = 0. Test the hypothesis that βsalary = βratio = βexpend = 0. Do any of these predictors have an effect on the response?

library(faraway)
## Warning: package 'faraway' was built under R version 4.2.3
data(sat, package="faraway")
require(ellipse)
## Loading required package: ellipse
## Warning: package 'ellipse' was built under R version 4.2.3
## 
## Attaching package: 'ellipse'
## The following object is masked from 'package:graphics':
## 
##     pairs
require(ggplot2)
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 4.2.2
names(sat) 
## [1] "expend" "ratio"  "salary" "takers" "verbal" "math"   "total"
totalmod<-lm(total~expend+ratio+salary,data=sat) 
summary(totalmod) 
## 
## Call:
## lm(formula = total ~ expend + ratio + salary, data = sat)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -140.911  -46.740   -7.535   47.966  123.329 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1069.234    110.925   9.639 1.29e-12 ***
## expend        16.469     22.050   0.747   0.4589    
## ratio          6.330      6.542   0.968   0.3383    
## salary        -8.823      4.697  -1.878   0.0667 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 68.65 on 46 degrees of freedom
## Multiple R-squared:  0.2096, Adjusted R-squared:  0.1581 
## F-statistic: 4.066 on 3 and 46 DF,  p-value: 0.01209

For the t-test of H0: βsalary = 0 vs. Ha: βsalary ≠ 0, we fail to reject the null hypothesis at the 5% level of significance because the p-value = 0.0667 > 0.05. The coefficient would be judged significant only at the less stringent 10% level.
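As a quick check on the t-test above, the statistic and its two-sided p-value can be recomputed by hand. This is only a sketch: the three numbers are copied (rounded) from the `summary(totalmod)` output, and the variable names are illustrative rather than part of the original solution.

```r
# Values copied from the summary(totalmod) output (rounded as printed)
est_salary <- -8.823   # estimated coefficient for salary
se_salary  <- 4.697    # standard error of that estimate
df_resid   <- 46       # residual degrees of freedom

# t statistic = estimate / standard error
t_salary <- est_salary / se_salary

# Two-sided p-value from the t distribution with 46 df
p_salary <- 2 * pt(abs(t_salary), df = df_resid, lower.tail = FALSE)

round(t_salary, 3)
round(p_salary, 4)
```

Because the inputs are rounded, the results should agree with the printed t value and p-value only up to rounding.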

The F-statistic = 4.066 (on 3 and 46 degrees of freedom) tests the null hypothesis H0: βsalary = βratio = βexpend = 0. Since the p-value = 0.01209 is lower than 0.05, we reject the null hypothesis and conclude that these predictors collectively have an effect on total sat score.
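The overall F-statistic can also be recovered from the reported R-squared, which may help show where the 4.066 comes from. The sketch below assumes only the rounded values printed in the `summary(totalmod)` output; the variable names are illustrative.

```r
# Values copied from the summary(totalmod) output (rounded as printed)
r2       <- 0.2096   # multiple R-squared
p        <- 3        # number of predictors being tested
df_resid <- 46       # residual degrees of freedom

# Overall F statistic: explained variation per predictor
# over unexplained variation per residual degree of freedom
f_overall <- (r2 / p) / ((1 - r2) / df_resid)

# Upper-tail probability under the F(3, 46) distribution
p_overall <- pf(f_overall, df1 = p, df2 = df_resid, lower.tail = FALSE)

round(f_overall, 3)
round(p_overall, 5)
```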

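According to the t-tests, none of the predictors is individually significant at the 5% level once the other two are in the model. This does not contradict the significant F-test, since the two results can disagree when the predictors are correlated with one another.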

(b) Now add takers to the model. Test the hypothesis that βtakers = 0. Compare this model to the previous one using an F-test. Demonstrate that the F-test and t-test here are equivalent.

totalmod1<-lm(total~expend+ratio+salary+takers,data=sat) 
summary(totalmod1)
## 
## Call:
## lm(formula = total ~ expend + ratio + salary + takers, data = sat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -90.531 -20.855  -1.746  15.979  66.571 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1045.9715    52.8698  19.784  < 2e-16 ***
## expend         4.4626    10.5465   0.423    0.674    
## ratio         -3.6242     3.2154  -1.127    0.266    
## salary         1.6379     2.3872   0.686    0.496    
## takers        -2.9045     0.2313 -12.559 2.61e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 32.7 on 45 degrees of freedom
## Multiple R-squared:  0.8246, Adjusted R-squared:  0.809 
## F-statistic: 52.88 on 4 and 45 DF,  p-value: < 2.2e-16
anova(totalmod,totalmod1)
## Analysis of Variance Table
## 
## Model 1: total ~ expend + ratio + salary
## Model 2: total ~ expend + ratio + salary + takers
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1     46 216812                                  
## 2     45  48124  1    168688 157.74 2.607e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

For the t-test of H0: βtakers = 0 vs. Ha: βtakers ≠ 0, we reject the null hypothesis at the 5% level of significance because the p-value = 2.61e-16 is essentially zero. The t-statistic (−12.559) has a large absolute value, indicating that takers is a highly significant predictor of the response.

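The partial F-test comparing models 1 and 2 uses the statistic F = [(RSS1 − RSS2)/1] / (RSS2/45) = 157.74, which under the null hypothesis has an F distribution with 1 degree of freedom in the numerator and 45 degrees of freedom in the denominator. This equals the square of the t-statistic for the variable takers, which has 45 degrees of freedom: t² = (−12.559)² = 157.73, where the small discrepancy is due to rounding. The two p-values agree as well (2.607e-16 from the F-test and 2.61e-16 from the t-test), so the tests are equivalent.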
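The equivalence can be verified numerically. The following is a minimal sketch that uses only the rounded values printed in the `anova()` and `summary(totalmod1)` outputs; the variable names are illustrative and not part of the original solution.

```r
# Values copied from the anova(totalmod, totalmod1) output
rss1 <- 216812   # residual sum of squares, model without takers
rss2 <- 48124    # residual sum of squares, model with takers
df2  <- 45       # residual degrees of freedom of the larger model
q    <- 1        # number of parameters being tested

# Partial F statistic
f_partial <- ((rss1 - rss2) / q) / (rss2 / df2)

# t statistic for takers, copied from summary(totalmod1)
t_takers <- -12.559
t_sq     <- t_takers^2

# p-values from the F(1, 45) and t(45) distributions
p_f <- pf(f_partial, df1 = q, df2 = df2, lower.tail = FALSE)
p_t <- 2 * pt(abs(t_takers), df = df2, lower.tail = FALSE)

c(F = f_partial, t_squared = t_sq)
c(p_F = p_f, p_t = p_t)
```

If the fitted objects are still in the session, the same comparison can be made without rounding error by comparing `anova(totalmod, totalmod1)$F[2]` with `coef(summary(totalmod1))["takers", "t value"]^2`.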