Introduction

This analysis evaluates the impact of PC ownership and parental college education on college GPA using the GPA1 dataset. The base model is:

\[ \widehat{\text{colGPA}} = \beta_0 + \beta_1 \cdot \text{pc} + \beta_2 \cdot \text{hsgpa} + \beta_3 \cdot \text{act} \]

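We extend the model in two steps: first by adding the parental college indicators mothcoll and fathcoll, then by adding a quadratic term in hsgpa.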

Load and Prepare Data

library(readxl)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
# Load the data
gpa1 <- read_excel("gpa1.xlsx")

data <- gpa1 %>%
  select(colgpa, pc, hsgpa, act, mothcoll, fathcoll) %>%
  mutate(hsgpa2 = hsgpa^2) %>%
  na.omit()

Model Estimation

library(stats)

# Fit the linear model
model <- lm(colgpa ~ pc + hsgpa + act + mothcoll + fathcoll, data = data)
summary(model)
## 
## Call:
## lm(formula = colgpa ~ pc + hsgpa + act + mothcoll + fathcoll, 
##     data = data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.78149 -0.25726 -0.02121  0.24691  0.74432 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.255554   0.335392   3.744 0.000268 ***
## pc           0.151854   0.058716   2.586 0.010762 *  
## hsgpa        0.450220   0.094280   4.775 4.61e-06 ***
## act          0.007724   0.010678   0.723 0.470687    
## mothcoll    -0.003758   0.060270  -0.062 0.950376    
## fathcoll     0.041800   0.061270   0.682 0.496265    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3344 on 135 degrees of freedom
## Multiple R-squared:  0.2222, Adjusted R-squared:  0.1934 
## F-statistic: 7.713 on 5 and 135 DF,  p-value: 2.083e-06
# Fit the linear model with hsgpa^2 (note: this overwrites `model`)
model <- lm(colgpa ~ pc + hsgpa + I(hsgpa^2) + act + mothcoll + fathcoll, data = data)
summary(model)
## 
## Call:
## lm(formula = colgpa ~ pc + hsgpa + I(hsgpa^2) + act + mothcoll + 
##     fathcoll, data = data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.78998 -0.24327 -0.00648  0.26179  0.72231 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  5.040328   2.443038   2.063   0.0410 *
## pc           0.140446   0.058858   2.386   0.0184 *
## hsgpa       -1.802520   1.443551  -1.249   0.2140  
## I(hsgpa^2)   0.337340   0.215710   1.564   0.1202  
## act          0.004786   0.010786   0.444   0.6580  
## mothcoll     0.003091   0.060110   0.051   0.9591  
## fathcoll     0.062761   0.062401   1.006   0.3163  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3326 on 134 degrees of freedom
## Multiple R-squared:  0.2361, Adjusted R-squared:  0.2019 
## F-statistic: 6.904 on 6 and 134 DF,  p-value: 2.088e-06

Results and Interpretation: Linear Model

\[ \widehat{\text{colGPA}} = 1.256 + 0.152 \cdot \text{pc} + 0.450 \cdot \text{hsgpa} + 0.0077 \cdot \text{act} - 0.0038 \cdot \text{mothcoll} + 0.0418 \cdot \text{fathcoll} \]

Statistical Significance

  • pc: Statistically significant at the 5% level (t = 2.586, p = 0.011)
  • mothcoll and fathcoll: Not individually significant (p = 0.950 and p = 0.496)
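
The t statistic and p-value reported for pc can be reproduced from the estimate and standard error printed in the summary table. The following is a minimal sketch; the helper name t_test_coef is hypothetical and not part of the original analysis, and its inputs are copied from the printed output.

```r
# Hypothetical helper (not from the original analysis): two-sided t-test
# of a single regression coefficient against the null value of zero.
t_test_coef <- function(estimate, std_error, df_resid) {
  t_stat <- estimate / std_error
  # Two-sided p-value from the t distribution with df_resid degrees of freedom
  p_value <- 2 * pt(-abs(t_stat), df = df_resid)
  list(t = t_stat, p = p_value)
}

# Values for pc copied from the summary output (135 residual degrees of freedom)
pc_test <- t_test_coef(estimate = 0.151854, std_error = 0.058716, df_resid = 135)
print(pc_test)
```

The same call with the mothcoll or fathcoll rows of the table should reproduce their reported t values and p-values up to rounding.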

Joint Significance Test

# F-test for mothcoll and fathcoll (`model` is the most recent fit, i.e. the one with hsgpa^2)
anova_model <- anova(update(model, . ~ . - mothcoll - fathcoll), model)
anova_model
## Analysis of Variance Table
## 
## Model 1: colgpa ~ pc + hsgpa + I(hsgpa^2) + act
## Model 2: colgpa ~ pc + hsgpa + I(hsgpa^2) + act + mothcoll + fathcoll
##   Res.Df    RSS Df Sum of Sq     F Pr(>F)
## 1    136 14.949                          
## 2    134 14.823  2   0.12544 0.567 0.5686
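  • Joint p-value for mothcoll and fathcoll: 0.569 (F = 0.567 on 2 and 134 degrees of freedom). Both models in this comparison include the hsgpa^2 term.
  • We fail to reject the null hypothesis that the coefficients on mothcoll and fathcoll are both zero.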
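
The F statistic in the ANOVA table follows from the restricted and unrestricted residual sums of squares. The sketch below recomputes it by hand; the helper name joint_f_test is hypothetical, and because the inputs are the rounded RSS values from the printed table, the result matches the reported F and p-value only approximately.

```r
# Hypothetical helper (not from the original analysis): F-test for q
# exclusion restrictions, computed from residual sums of squares.
joint_f_test <- function(rss_r, rss_ur, q, df_ur) {
  # Numerator: increase in RSS per restriction
  # Denominator: unrestricted RSS per residual degree of freedom
  f_stat <- ((rss_r - rss_ur) / q) / (rss_ur / df_ur)
  p_value <- pf(f_stat, df1 = q, df2 = df_ur, lower.tail = FALSE)
  list(f = f_stat, p = p_value)
}

# Rounded RSS values copied from the ANOVA table
res <- joint_f_test(rss_r = 14.949, rss_ur = 14.823, q = 2, df_ur = 134)
print(res)
```

Computing the statistic this way makes explicit which two models are being compared, which matters when the same object name is reused for different fits.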

Interpretation

  • The effect of owning a PC remains positive and significant even after adding parental education variables.
  • Neither mothcoll nor fathcoll is individually significant, and the two are not jointly significant in this model.

Conclusion

Adding parental education does not substantially change the estimated effect of PC ownership on college GPA. With mothcoll and fathcoll included, owning a PC is associated with a college GPA about 0.15 points higher (p = 0.011), holding hsgpa and act fixed.

Results and Interpretation: Linear Model with hsgpa^2

\[ \widehat{\text{colGPA}} = 5.040 + 0.140 \cdot \text{pc} - 1.803 \cdot \text{hsgpa} + 0.337 \cdot \text{hsgpa}^2 + 0.0048 \cdot \text{act} + 0.0031 \cdot \text{mothcoll} + 0.0628 \cdot \text{fathcoll} \]

Interpretation

  • Inclusion of hsgpa^2 allows for a nonlinear effect of high school GPA on college GPA.
  • The coefficient on hsgpa^2 is 0.337 (t = 1.564, p = 0.120), which is not significant at the 10% level. Adjusted R-squared rises only slightly, from 0.1934 to 0.2019.
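
With a quadratic term, the effect of hsgpa on predicted colGPA is no longer a single number: it equals the hsgpa coefficient plus twice the hsgpa^2 coefficient times hsgpa. The sketch below evaluates this slope using the point estimates from the summary output. The helper name marginal_effect_hsgpa and the evaluation points 3.0, 3.5, and 4.0 are illustrative assumptions, not values taken from the original analysis.

```r
# Hypothetical helper (not from the original analysis): slope of predicted
# colgpa with respect to hsgpa in a model with hsgpa and hsgpa^2.
marginal_effect_hsgpa <- function(hsgpa, b_hsgpa, b_hsgpa2) {
  b_hsgpa + 2 * b_hsgpa2 * hsgpa
}

# Point estimates copied from the summary output of the quadratic model
b_hsgpa  <- -1.802520
b_hsgpa2 <-  0.337340

# hsgpa value at which the estimated slope changes sign
turning_point <- -b_hsgpa / (2 * b_hsgpa2)

# Illustrative hsgpa values (assumed, not sample statistics)
hs_values <- c(3.0, 3.5, 4.0)
effects <- marginal_effect_hsgpa(hs_values, b_hsgpa, b_hsgpa2)

print(round(turning_point, 2))
print(round(effects, 3))
```

Since neither hsgpa nor hsgpa^2 is individually significant in this specification, these slopes are imprecisely estimated and should be read as descriptive only.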

Conclusion

Adding the quadratic term in hsgpa leaves the estimated effect of PC ownership largely unchanged: the coefficient moves from 0.152 to 0.140 and remains significant at the 5% level (p = 0.018). The hsgpa^2 coefficient is not significant at the 10% level (p = 0.120), so the data provide only weak evidence that the nonlinear generalization is needed.