Exercise 5.1

Use the teengamb data with gamble as the response. We focus on the effect of sex on the response and so we include this predictor in all models. There are eight possible models that include all, some, or none of the other three predictors. Fit all these models and report on the coefficient and significance of sex in each case. Comment on the stability of the effect.

Coefficient and p-value of sex in each model:

Model  Predictors                       Coefficient  p-value
1      sex                              -25.909      0.00444
2      sex + status + income + verbal   -22.118      0.0101
3      sex + status                     -35.709      0.00049
4      sex + status + income            -24.339      0.00454
5      sex + status + verbal            -33.752      0.00114
6      sex + income                     -21.634      0.00272
7      sex + income + verbal            -22.960      0.0015
8      sex + verbal                     -27.722      0.00196

The coefficient of sex ranges from -21.634 to -35.709 and is significant at the 5% level in all eight models, so sex has predictive value for gambling expenditure (pounds per year). Model 2 includes all four predictors; in it only sex and income have p-values below 0.05, while socioeconomic status and verbal score are not significant. In the teengamb data sex is coded 0 for male and 1 for female, so the negative coefficient means that, with the other predictors in the model held fixed, females are predicted to spend less than males: about 22.12 pounds per year less in Model 2 (a difference in pounds, not a percentage). Across the eight models the sign and significance of the sex effect are stable, but its magnitude is not: the estimates are smaller in absolute value in the models that include income (Models 2, 4, 6 and 7) than in those that omit it.
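To put a number on how stable the effect is, the reported estimates can be grouped by whether income is in the model. This is only a rough sketch built from the eight coefficients listed above; the variable names `sex_coef`, `has_income` and `spread` are introduced here for illustration.

```r
# Sex coefficients for Models 1 to 8, copied from the results above
sex_coef <- c(-25.909, -22.118, -35.709, -24.339,
              -33.752, -21.634, -22.960, -27.722)

# TRUE where the model includes income (Models 2, 4, 6 and 7)
has_income <- c(FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE)

# Smallest and largest estimate within each group
groups <- ifelse(has_income, "with_income", "without_income")
spread <- sapply(split(sex_coef, groups), range)
rownames(spread) <- c("min", "max")
print(spread)
```

With income in the model the estimates stay between -24.339 and -21.634; without it they run from -35.709 to -25.909, so most of the variation in the sex coefficient comes from whether income is adjusted for.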

require(faraway)
## Loading required package: faraway
lmod1 <- lm(gamble ~ sex, teengamb)
lmod2 <- lm(gamble ~ sex +  status + income + verbal, teengamb)
lmod3 <- lm(gamble ~ sex +  status, teengamb)
lmod4 <- lm(gamble ~ sex +  status + income, teengamb)
lmod5 <- lm(gamble ~ sex +  status + verbal, teengamb)
lmod6 <- lm(gamble ~ sex +  income, teengamb)
lmod7 <- lm(gamble ~ sex +  income + verbal, teengamb)
lmod8 <- lm(gamble ~ sex +  verbal, teengamb)
summary(lmod1)
## 
## Call:
## lm(formula = gamble ~ sex, data = teengamb)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -29.775 -18.325  -3.766   6.334 126.225 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   29.775      5.498   5.415 2.28e-06 ***
## sex          -25.909      8.648  -2.996  0.00444 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 29.09 on 45 degrees of freedom
## Multiple R-squared:  0.1663, Adjusted R-squared:  0.1478 
## F-statistic: 8.977 on 1 and 45 DF,  p-value: 0.004437
summary(lmod2)
## 
## Call:
## lm(formula = gamble ~ sex + status + income + verbal, data = teengamb)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -51.082 -11.320  -1.451   9.452  94.252 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  22.55565   17.19680   1.312   0.1968    
## sex         -22.11833    8.21111  -2.694   0.0101 *  
## status        0.05223    0.28111   0.186   0.8535    
## income        4.96198    1.02539   4.839 1.79e-05 ***
## verbal       -2.95949    2.17215  -1.362   0.1803    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 22.69 on 42 degrees of freedom
## Multiple R-squared:  0.5267, Adjusted R-squared:  0.4816 
## F-statistic: 11.69 on 4 and 42 DF,  p-value: 1.815e-06
summary(lmod3)
## 
## Call:
## lm(formula = gamble ~ sex + status, data = teengamb)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -35.873 -15.755  -3.007  10.924 111.586 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  60.2233    15.1347   3.979 0.000255 ***
## sex         -35.7094     9.4899  -3.763 0.000493 ***
## status       -0.5855     0.2727  -2.147 0.037321 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 27.99 on 44 degrees of freedom
## Multiple R-squared:  0.2454, Adjusted R-squared:  0.2111 
## F-statistic: 7.154 on 2 and 44 DF,  p-value: 0.002042
summary(lmod4)
## 
## Call:
## lm(formula = gamble ~ sex + status + income, data = teengamb)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -48.682 -12.169  -0.268   9.161  97.728 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  13.0315    15.8676   0.821  0.41603    
## sex         -24.3393     8.1274  -2.995  0.00454 ** 
## status       -0.1496     0.2413  -0.620  0.53856    
## income        4.9280     1.0352   4.760 2.21e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 22.92 on 43 degrees of freedom
## Multiple R-squared:  0.5058, Adjusted R-squared:  0.4713 
## F-statistic: 14.67 on 3 and 43 DF,  p-value: 1.014e-06
summary(lmod5)
## 
## Call:
## lm(formula = gamble ~ sex + status + verbal, data = teengamb)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -41.108 -15.290  -3.998   8.401 108.499 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  69.2216    17.5619   3.942 0.000293 ***
## sex         -33.7520     9.6839  -3.485 0.001144 ** 
## status       -0.4039     0.3267  -1.237 0.222971    
## verbal       -2.7037     2.6784  -1.009 0.318410    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 27.99 on 43 degrees of freedom
## Multiple R-squared:  0.2629, Adjusted R-squared:  0.2114 
## F-statistic: 5.111 on 3 and 43 DF,  p-value: 0.004106
summary(lmod6)
## 
## Call:
## lm(formula = gamble ~ sex + income, data = teengamb)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -49.757 -11.649   0.844   8.659 100.243 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    4.041      6.394   0.632  0.53070    
## sex          -21.634      6.809  -3.177  0.00272 ** 
## income         5.172      0.951   5.438 2.24e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 22.75 on 44 degrees of freedom
## Multiple R-squared:  0.5014, Adjusted R-squared:  0.4787 
## F-statistic: 22.12 on 2 and 44 DF,  p-value: 2.243e-07
summary(lmod7)
## 
## Call:
## lm(formula = gamble ~ sex + income + verbal, data = teengamb)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -50.639 -11.765  -1.594   9.305  93.867 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  24.1390    14.7686   1.634   0.1095    
## sex         -22.9602     6.7706  -3.391   0.0015 ** 
## income        4.8981     0.9551   5.128 6.64e-06 ***
## verbal       -2.7468     1.8253  -1.505   0.1397    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 22.43 on 43 degrees of freedom
## Multiple R-squared:  0.5263, Adjusted R-squared:  0.4933 
## F-statistic: 15.93 on 3 and 43 DF,  p-value: 4.148e-07
summary(lmod8)
## 
## Call:
## lm(formula = gamble ~ sex + verbal, data = teengamb)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -42.034 -14.702  -3.700   5.595 113.450 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   60.662     16.237   3.736 0.000535 ***
## sex          -27.722      8.417  -3.294 0.001957 ** 
## verbal        -4.528      2.249  -2.013 0.050209 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 28.15 on 44 degrees of freedom
## Multiple R-squared:  0.2366, Adjusted R-squared:  0.2019 
## F-statistic:  6.82 on 2 and 44 DF,  p-value: 0.002631
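
The eight fits above can also be generated in a loop instead of being typed out one by one. The following is a minimal sketch, not the approach used above: the helper name `sex_effect_table` and its `others` argument are hypothetical, and it assumes sex is stored as a numeric 0/1 column (as in teengamb) so that the coefficient row is named `sex`. The models come out in a different order from the numbering used above.

```r
# Hypothetical helper: fit gamble ~ sex plus every subset of the other
# predictors and collect the sex row of each coefficient table.
sex_effect_table <- function(data, others = c("status", "income", "verbal")) {
  # One row of TRUE/FALSE inclusion flags per subset (2^3 = 8 rows)
  flags <- expand.grid(rep(list(c(FALSE, TRUE)), length(others)))
  rows <- lapply(seq_len(nrow(flags)), function(i) {
    terms <- c("sex", others[unlist(flags[i, ])])
    fit <- lm(reformulate(terms, response = "gamble"), data = data)
    sex_row <- coef(summary(fit))["sex", ]  # assumes numeric sex column
    data.frame(
      model = paste("gamble ~", paste(terms, collapse = " + ")),
      estimate = unname(sex_row["Estimate"]),
      std_error = unname(sex_row["Std. Error"]),
      p_value = unname(sex_row["Pr(>|t|)"]),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Run on teengamb only if the faraway package is installed
if (requireNamespace("faraway", quietly = TRUE)) {
  data(teengamb, package = "faraway")
  print(sex_effect_table(teengamb), digits = 4)
}
```

Collecting the estimates in one data frame makes it easier to compare them side by side than reading eight separate summaries.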
