3.4 Using the sat data:
### (a) Fit a model with total sat score as the response and expend, ratio and salary as predictors. Test the hypothesis that $\beta_{salary} = 0$. Test the hypothesis that $\beta_{salary} = \beta_{ratio} = \beta_{expend} = 0$. Do any of these predictors have an effect on the response?
library(faraway)
## Warning: package 'faraway' was built under R version 4.2.3
data(sat, package="faraway")
require(ellipse)
## Loading required package: ellipse
## Warning: package 'ellipse' was built under R version 4.2.3
##
## Attaching package: 'ellipse'
## The following object is masked from 'package:graphics':
##
## pairs
require(ggplot2)
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 4.2.2
names(sat)
## [1] "expend" "ratio" "salary" "takers" "verbal" "math" "total"
totalmod<-lm(total~expend+ratio+salary,data=sat)
summary(totalmod)
##
## Call:
## lm(formula = total ~ expend + ratio + salary, data = sat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -140.911 -46.740 -7.535 47.966 123.329
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1069.234 110.925 9.639 1.29e-12 ***
## expend 16.469 22.050 0.747 0.4589
## ratio 6.330 6.542 0.968 0.3383
## salary -8.823 4.697 -1.878 0.0667 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 68.65 on 46 degrees of freedom
## Multiple R-squared: 0.2096, Adjusted R-squared: 0.1581
## F-statistic: 4.066 on 3 and 46 DF, p-value: 0.01209
For the t-test of $H_0: \beta_{salary} = 0$ vs. $H_a: \beta_{salary} \neq 0$, we fail to reject the null hypothesis at the 5% significance level because the p-value is 0.0667 > 0.05. The coefficient would be judged significant only at the less stringent 10% level.
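As a rough check on the t-test above, the statistic and its two-sided p-value can be recomputed by hand. This is a minimal sketch that assumes only the rounded estimate, standard error, and residual degrees of freedom printed by `summary(totalmod)`; the variable names are illustrative, and the results agree with the printed output only up to rounding.

```r
# Values copied from the summary(totalmod) coefficient table (rounded as printed).
est_salary <- -8.823   # estimated coefficient for salary
se_salary  <- 4.697    # standard error of that estimate
df_resid   <- 46       # residual degrees of freedom (n = 50, 4 coefficients)

# t-statistic: estimate divided by its standard error.
t_salary <- est_salary / se_salary

# Two-sided p-value from the t distribution with 46 degrees of freedom.
p_salary <- 2 * pt(abs(t_salary), df = df_resid, lower.tail = FALSE)

round(c(t = t_salary, p = p_salary), 4)
```

The p-value lies between 0.05 and 0.10, which is why the decision depends on the significance level chosen.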
The F-statistic of 4.066 on 3 and 46 degrees of freedom tests the null hypothesis $H_0: \beta_{salary} = \beta_{ratio} = \beta_{expend} = 0$. Since the p-value of 0.01209 is below 0.05, we reject the null hypothesis and conclude that these predictors collectively have an effect on the total SAT score.
According to the t-tests, none of the predictors is individually significant at the 5% level.
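The overall F-statistic can also be recovered from the reported $R^2$, since for a model with an intercept $F = \frac{R^2/p}{(1 - R^2)/(n - p - 1)}$. The sketch below assumes only the rounded values printed by `summary(totalmod)`; the variable names are illustrative.

```r
# Values copied from the summary(totalmod) output (rounded as printed).
r2         <- 0.2096   # multiple R-squared
n_pred     <- 3        # number of predictors: expend, ratio, salary
df_resid   <- 46       # residual degrees of freedom

# Overall F-statistic for H0: all three slope coefficients are zero.
f_overall <- (r2 / n_pred) / ((1 - r2) / df_resid)

# Upper-tail p-value from the F distribution on 3 and 46 degrees of freedom.
p_overall <- pf(f_overall, df1 = n_pred, df2 = df_resid, lower.tail = FALSE)

round(c(F = f_overall, p = p_overall), 4)
```

With the data loaded, the same test should be obtainable by comparing against the intercept-only model, for example `anova(lm(total ~ 1, data = sat), totalmod)`.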
totalmod1<-lm(total~expend+ratio+salary+takers,data=sat)
summary(totalmod1)
##
## Call:
## lm(formula = total ~ expend + ratio + salary + takers, data = sat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -90.531 -20.855 -1.746 15.979 66.571
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1045.9715 52.8698 19.784 < 2e-16 ***
## expend 4.4626 10.5465 0.423 0.674
## ratio -3.6242 3.2154 -1.127 0.266
## salary 1.6379 2.3872 0.686 0.496
## takers -2.9045 0.2313 -12.559 2.61e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 32.7 on 45 degrees of freedom
## Multiple R-squared: 0.8246, Adjusted R-squared: 0.809
## F-statistic: 52.88 on 4 and 45 DF, p-value: < 2.2e-16
anova(totalmod,totalmod1)
## Analysis of Variance Table
##
## Model 1: total ~ expend + ratio + salary
## Model 2: total ~ expend + ratio + salary + takers
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 46 216812
## 2 45 48124 1 168688 157.74 2.607e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
For the t-test of $H_0: \beta_{takers} = 0$ vs. $H_a: \beta_{takers} \neq 0$, we reject the null hypothesis at the 5% significance level because the p-value ($2.61 \times 10^{-16}$) is essentially zero. The t-statistic, $-12.559$, is large in absolute value, indicating that takers is highly significant in explaining the response.
The partial F-test comparing models 1 and 2 is computed as $F = \frac{(RSS_1 - RSS_2)/1}{RSS_2/45} = 157.74$, which under the null hypothesis has an F-distribution with 1 degree of freedom in the numerator and 45 degrees of freedom in the denominator. This is equivalent to the square of the t-statistic for the variable takers, which has 45 degrees of freedom: $t^2 = (-12.559)^2 = 157.73$. The small difference from 157.74 comes from rounding in the printed output.
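The equivalence of the partial F-test and the t-test can be checked numerically. This is a minimal sketch that assumes only the rounded residual sums of squares from the `anova()` table and the rounded t-statistic from `summary(totalmod1)`; the variable names are illustrative, and agreement holds only up to rounding.

```r
# Values copied from the anova(totalmod, totalmod1) table.
rss1 <- 216812   # residual sum of squares, model without takers
rss2 <- 48124    # residual sum of squares, model with takers
df2  <- 45       # residual degrees of freedom of the larger model

# Partial F-statistic for adding one predictor (takers).
f_partial <- ((rss1 - rss2) / 1) / (rss2 / df2)

# t-statistic for takers, copied from summary(totalmod1).
t_takers <- -12.559

# p-values: upper tail of F(1, 45) and two-sided tail of t(45).
p_f <- pf(f_partial, df1 = 1, df2 = df2, lower.tail = FALSE)
p_t <- 2 * pt(abs(t_takers), df = df2, lower.tail = FALSE)

# F and t^2 should agree, as should the two p-values.
c(F = f_partial, t_squared = t_takers^2)
c(p_from_F = p_f, p_from_t = p_t)
```

Because only one coefficient is being tested, the two procedures always lead to the same conclusion; the F-test generalizes to several coefficients at once, while the t-test does not.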