MTH245: Introduction to the Practice of Statistics

Ansha Zaman – Section L02 – Lab #3 – Fall 2012 – Prof. Baumer

require(mosaic, quietly = TRUE)
data(Bodyimage)

The first few rows of the dataset are shown below.

head(Bodyimage)
##   Gender Height  GPA HS_GPA Seat  WtFeel Cheat
## 1 Female     64 2.60   2.63    M AboutRt    No
## 2   Male     69 2.70   3.72    M AboutRt    No
## 3 Female     66 3.00   3.44    F AboutRt    No
## 4 Female     63 3.11   2.73    F AboutRt    No
## 5   Male     72 3.40   2.35    B  OverWt    No
## 6 Female     67 3.43   3.84    M AboutRt    No

The summary of the first model, which is based on gender, high school GPA, and whether the student cheats, is given below.

fm1 = lm(GPA ~ HS_GPA + Gender + Cheat, data = Bodyimage)
summary(fm1)
## 
## Call:
## lm(formula = GPA ~ HS_GPA + Gender + Cheat, data = Bodyimage)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.8341 -0.2252  0.0088  0.2659  0.9011 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   1.0755     0.1373    7.83  1.9e-13 ***
## HS_GPA        0.6147     0.0398   15.43  < 2e-16 ***
## GenderMale   -0.0210     0.0514   -0.41   0.6832    
## CheatYes      0.2563     0.0918    2.79   0.0057 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## Residual standard error: 0.369 on 224 degrees of freedom
##   (8 observations deleted due to missingness)
## Multiple R-squared: 0.53,    Adjusted R-squared: 0.524 
## F-statistic: 84.2 on 3 and 224 DF,  p-value: <2e-16

The graph below shows the relationship between GPA and high school GPA.


plotPoints(GPA ~ HS_GPA, data = Bodyimage, pch = 19, alpha = 0.3)

[Figure: scatterplot of GPA against HS_GPA]

The summary of the second model, which is based on high school GPA, whether the student cheats, and the Seat variable, is given below.


fm2 = lm(GPA ~ HS_GPA + Cheat + Seat, data = Bodyimage)
summary(fm2)
## 
## Call:
## lm(formula = GPA ~ HS_GPA + Cheat + Seat, data = Bodyimage)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.8605 -0.2148  0.0099  0.2511  0.9004 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.06014    0.13649    7.77  2.9e-13 ***
## HS_GPA       0.61254    0.04047   15.14  < 2e-16 ***
## CheatYes     0.25458    0.09210    2.76   0.0062 ** 
## SeatF        0.04234    0.07678    0.55   0.5819    
## SeatM        0.00898    0.06422    0.14   0.8890    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## Residual standard error: 0.37 on 223 degrees of freedom
##   (8 observations deleted due to missingness)
## Multiple R-squared: 0.53,    Adjusted R-squared: 0.522 
## F-statistic:   63 on 4 and 223 DF,  p-value: <2e-16

Q1: Interpret the coefficients in the model described above. What is the meaning of each coefficient? What does it tell you about college GPA?

The coefficients tell us that the model formula for GPA is:
GPA = 1.07545 + 0.61466 * HS_GPA - 0.02101 * GenderMale + 0.25628 * CheatYes

This gives us the average change in GPA (the response variable) associated with a change in high school GPA, the gender of the student, and whether the student cheated (the explanatory variables).

The intercept predicts the GPA of a female student who doesn't cheat and whose high school GPA is zero. The HS_GPA coefficient gives the average change in GPA associated with a one-unit change in high school GPA, holding gender and cheating constant. The GenderMale coefficient gives the average difference in GPA between males and females, holding cheating and high school GPA constant; it is negative, so the predicted GPA for males is about 0.02 lower, although this difference is not statistically significant (p = 0.68). The CheatYes coefficient gives the average difference in GPA between students who cheat and students who don't, holding gender and high school GPA constant; it is positive, so the predicted GPA for students who cheat is about 0.26 higher.
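One way to read the sign and size of each coefficient is to plug the reported estimates into the model equation and change one explanatory variable at a time. The following is a minimal sketch under that assumption; `predict_gpa()` is a hypothetical helper written for illustration, and the reference student (high school GPA of 3.0) is an arbitrary choice.

```r
# Coefficients copied from the fitted equation for the first model
b <- c(intercept = 1.07545, hs_gpa = 0.61466, male = -0.02101, cheat = 0.25628)

# predict_gpa() is a hypothetical helper, not part of mosaic or base R
predict_gpa <- function(hs_gpa, male = 0, cheat = 0) {
  unname(b["intercept"] + b["hs_gpa"] * hs_gpa + b["male"] * male + b["cheat"] * cheat)
}

# Reference student: female, does not cheat, high school GPA of 3.0
base <- predict_gpa(3.0)

# Change one explanatory variable at a time, holding the others fixed
diff_male  <- predict_gpa(3.0, male = 1) - base   # recovers the GenderMale coefficient
diff_cheat <- predict_gpa(3.0, cheat = 1) - base  # recovers the CheatYes coefficient
diff_hs    <- predict_gpa(4.0) - base             # recovers the HS_GPA coefficient

print(c(base = base, male = diff_male, cheat = diff_cheat, hs_gpa = diff_hs))
```

Each difference equals the corresponding coefficient, so a negative value means a lower predicted GPA for that group and a positive value means a higher one.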

Q2: Experiment with different models for GPA. Select one that you think is informative, and interpret the coefficients similar to the way you did in the previous question. Be sure to reflect on what the coefficients tell you about college GPA.

The model that I selected was:
GPA = 1.09725 + 0.61595 * HS_GPA - 0.07138 * GenderMale + 0.18154 * WtFeelUnderWt - 0.01065 * WtFeelOverWt

After trying out different models, this one seemed fairly efficient to me. The summary shows that its residuals are also smaller than those of the first model.
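Adjusted R-squared, which penalizes extra terms, is one common way to make this kind of comparison concrete. The sketch below uses only the figures printed in the two summaries shown earlier (the selected model's own summary is not shown, so it is not included); `adj_r2()` is a hypothetical helper implementing the standard formula.

```r
# adj_r2() is a hypothetical helper implementing the standard formula:
# adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (residual degrees of freedom)
adj_r2 <- function(r2, n, resid_df) 1 - (1 - r2) * (n - 1) / resid_df

# 224 residual df + 4 estimated coefficients = 228 observations used by fm1
n <- 224 + 4

adj_fm1 <- adj_r2(0.53, n, 224)  # GPA ~ HS_GPA + Gender + Cheat
adj_fm2 <- adj_r2(0.53, n, 223)  # GPA ~ HS_GPA + Cheat + Seat

print(round(c(fm1 = adj_fm1, fm2 = adj_fm2), 3))
```

Because the multiple R-squared is entered as the rounded value 0.53, the results agree with the printed 0.524 and 0.522 only up to rounding. The extra Seat term costs a degree of freedom without raising R-squared, so the adjusted value drops slightly.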

This model predicts the relationship of GPA with high school GPA, the gender of the student, and how the student feels about his or her weight. The intercept tells us the average GPA when the student's high school GPA is zero, when the student is female, and when the student feels about right about her weight. The HS_GPA coefficient tells us the average change in GPA when the high school GPA changes by one unit, holding feeling about weight and gender constant. The GenderMale coefficient gives us the average difference in GPA between males and females, holding the other factors in the model constant; males are predicted to be about 0.07 lower. The WtFeelUnderWt coefficient tells us the average difference in GPA when a student feels underweight compared to when a student feels about right, holding the other factors in the model constant (about 0.18 higher). The WtFeelOverWt coefficient tells us the average difference in GPA when a student feels overweight compared to when a student feels about right, holding the other factors in the model constant (about 0.01 lower).
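The role of the reference level (female, feels about right) can be sketched by evaluating the selected model's equation for different students. This is an illustration only: `predict_gpa2()` is a hypothetical helper, the coefficients are copied from the equation given for the selected model, and the level name UnderWt is inferred from the coefficient name WtFeelUnderWt.

```r
# Coefficients copied from the selected model's equation
b2 <- c(intercept = 1.09725, hs_gpa = 0.61595, male = -0.07138,
        under_wt = 0.18154, over_wt = -0.01065)

# predict_gpa2() is a hypothetical helper; the defaults are the reference levels
predict_gpa2 <- function(hs_gpa, gender = "Female", wt_feel = "AboutRt") {
  unname(b2["intercept"] +
    b2["hs_gpa"] * hs_gpa +
    b2["male"] * (gender == "Male") +          # indicator: 1 if male, else 0
    b2["under_wt"] * (wt_feel == "UnderWt") +  # indicator: 1 if feels underweight
    b2["over_wt"] * (wt_feel == "OverWt"))     # indicator: 1 if feels overweight
}

# Reference student: female, feels about right, high school GPA of 3.0
ref <- predict_gpa2(3.0)

# Each WtFeel coefficient is a difference from the AboutRt reference level
under <- predict_gpa2(3.0, wt_feel = "UnderWt") - ref
over  <- predict_gpa2(3.0, wt_feel = "OverWt") - ref

print(c(reference = ref, under_vs_about_right = under, over_vs_about_right = over))
```

A student at the reference level switches off every indicator, which is why the intercept and the HS_GPA term alone determine that prediction.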