The dataset teengamb concerns a study of teenage gambling in Britain. Fit a regression model with the expenditure on gambling as the response and the sex, status, income and verbal score as predictors. Present the output.
library(faraway)
## Warning: package 'faraway' was built under R version 4.3.1
data(teengamb)
r<-lm(gamble~sex+status+income+verbal, data=teengamb)
summary(r)
##
## Call:
## lm(formula = gamble ~ sex + status + income + verbal, data = teengamb)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -51.082 -11.320  -1.451   9.452  94.252
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  22.55565   17.19680   1.312   0.1968
## sex         -22.11833    8.21111  -2.694   0.0101 *
## status        0.05223    0.28111   0.186   0.8535
## income        4.96198    1.02539   4.839 1.79e-05 ***
## verbal       -2.95949    2.17215  -1.362   0.1803
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 22.69 on 42 degrees of freedom
## Multiple R-squared: 0.5267, Adjusted R-squared: 0.4816
## F-statistic: 11.69 on 4 and 42 DF, p-value: 1.815e-06
(a) What percentage of variation in the response is explained by these predictors?
\(R^2 = 0.5267\), so about 52.7% of the variation in the response is explained by these predictors.
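As a cross-check, \(R^2\) can be recomputed by hand as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below refits the same model; the names `fit`, `rss`, `tss` and `r2` are introduced here for illustration and do not appear in the original code.

```r
library(faraway)   # provides the teengamb data
data(teengamb)

# Same model as in the fit above
fit <- lm(gamble ~ sex + status + income + verbal, data = teengamb)

rss <- sum(residuals(fit)^2)                             # residual sum of squares
tss <- sum((teengamb$gamble - mean(teengamb$gamble))^2)  # total sum of squares
r2  <- 1 - rss / tss

r2                       # computed by hand
summary(fit)$r.squared   # value stored by summary()
```

Both lines should agree with the Multiple R-squared shown in the summary output.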
(b) Which observation has the largest (positive) residual? Give the case number.
r.res<-r$residuals
max(r.res)
## [1] 94.25222
which.max(r.res)
## 24
## 24
The observation with case number 24 has the largest positive residual (94.25).
(c) Compute the mean and median of the residuals.
mean(r.res)
## [1] -1.556914e-16
median(r.res)
## [1] -1.451392
The mean of the residuals is 0 up to floating-point rounding (-1.56e-16). The median of the residuals is -1.451.
(d) Compute the correlation of the residuals with the fitted values.
cor(r$fitted.values, r.res)
## [1] -6.215823e-17
The correlation of the residuals with the fitted values is 0 up to floating-point rounding (-6.22e-17).
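Both zeros follow from the least-squares normal equations rather than from anything specific to this dataset. A brief sketch, using notation introduced here and not in the original: \(X\) is the design matrix including the intercept column of ones \(\mathbf{1}\), \(\hat\beta\) is the vector of estimates, \(\hat y = X\hat\beta\) are the fitted values, and \(e = y - \hat y\) are the residuals.

\[
X^\top e = X^\top (y - X\hat\beta) = 0
\quad\Longrightarrow\quad
\mathbf{1}^\top e = \sum_i e_i = 0,
\qquad
\hat y^\top e = \hat\beta^\top X^\top e = 0 .
\]

Since the residuals sum to zero, their sample covariance with the fitted values is proportional to \(\hat y^\top e\), which is also zero. The values of order \(10^{-16}\) and \(10^{-17}\) printed by R are rounding error.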
(e) Compute the correlation of the residuals with the income.
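cor(teengamb$income, r.res)
Because income is one of the predictors in a least-squares fit with an intercept, the residuals are orthogonal to it, so the correlation of the residuals with the income is 0 up to floating-point rounding. By contrast, cor(teengamb$gamble, r.res) returns 0.687951, but that is the correlation of the residuals with the response gamble (it equals \(\sqrt{1 - R^2}\)), not with the income.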
(f) For all other predictors held constant, what would be the difference in predicted expenditure on gambling for a male compared to a female?
male: sex = 0
female: sex = 1
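Therefore, with all other predictors held constant, the difference in predicted expenditure on gambling for a male compared to a female is gamble(sex=0) - gamble(sex=1) = 22.118. That is, a male is predicted to spend 22.118 more than a female, which is the negative of the sex coefficient (-22.118).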
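The same difference can be checked with `predict()` on two hypothetical teenagers who differ only in sex. This is a minimal sketch: the data frame `newdat` and the choice of sample medians for the other predictors are assumptions made for illustration, and any fixed values would give the same gap because the model has no interaction terms.

```r
library(faraway)   # provides the teengamb data
data(teengamb)
fit <- lm(gamble ~ sex + status + income + verbal, data = teengamb)

# Two hypothetical teenagers, identical except for sex (0 = male, 1 = female).
# The remaining predictors are set to sample medians purely for illustration.
newdat <- data.frame(
  sex    = c(0, 1),
  status = median(teengamb$status),
  income = median(teengamb$income),
  verbal = median(teengamb$verbal)
)

pred <- predict(fit, newdata = newdat)
gap  <- unname(pred[1] - pred[2])   # male minus female

gap                           # difference in predicted expenditure
-unname(coef(fit)["sex"])     # same quantity read off the coefficient
```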