This analysis evaluates the impact of PC ownership and parental
college education on college GPA using the GPA1
dataset.
The base model is:
\[ \widehat{\text{colGPA}} = \beta_0 + \beta_1 \cdot \text{pc} + \beta_2 \cdot \text{hsgpa} + \beta_3 \cdot \text{act} \]
We extend the model by including:

- `mothcoll`: Mother attended college
- `fathcoll`: Father attended college
- `hsgpa^2`: Square of high school GPA

library(readxl)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
# Load the data
gpa1 <- read_excel("gpa1.xlsx")
data <- gpa1 %>%
  select(colgpa, pc, hsgpa, act, mothcoll, fathcoll) %>%
  mutate(hsgpa2 = hsgpa^2) %>%
  na.omit()
library(stats)
# Fit the linear model
model <- lm(colgpa ~ pc + hsgpa + act + mothcoll + fathcoll, data = data)
summary(model)
##
## Call:
## lm(formula = colgpa ~ pc + hsgpa + act + mothcoll + fathcoll,
## data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.78149 -0.25726 -0.02121 0.24691 0.74432
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.255554 0.335392 3.744 0.000268 ***
## pc 0.151854 0.058716 2.586 0.010762 *
## hsgpa 0.450220 0.094280 4.775 4.61e-06 ***
## act 0.007724 0.010678 0.723 0.470687
## mothcoll -0.003758 0.060270 -0.062 0.950376
## fathcoll 0.041800 0.061270 0.682 0.496265
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3344 on 135 degrees of freedom
## Multiple R-squared: 0.2222, Adjusted R-squared: 0.1934
## F-statistic: 7.713 on 5 and 135 DF, p-value: 2.083e-06
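The `pc` row of the coefficient table above can be checked by hand. The following is a minimal sketch that recomputes the t statistic, the two-sided p-value, and a 95% confidence interval from the printed estimate, standard error, and residual degrees of freedom. The variable names are illustrative and are not part of the original analysis.

```r
# Values copied from the pc row of the summary output above
est      <- 0.151854   # coefficient estimate
se       <- 0.058716   # standard error
df_resid <- 135        # residual degrees of freedom

# t statistic and two-sided p-value under the t distribution
t_stat  <- est / se
p_value <- 2 * pt(-abs(t_stat), df = df_resid)

# 95% confidence interval: estimate +/- critical value * standard error
ci_95 <- est + c(-1, 1) * qt(0.975, df = df_resid) * se

round(c(t = t_stat, p = p_value, lower = ci_95[1], upper = ci_95[2]), 4)
```

An interval that excludes zero corresponds to rejecting the null hypothesis of no PC effect at the 5% level.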
# Fit the linear model with hsgpa^2
model <- lm(colgpa ~ pc + hsgpa + I(hsgpa^2) + act + mothcoll + fathcoll, data = data)
summary(model)
##
## Call:
## lm(formula = colgpa ~ pc + hsgpa + I(hsgpa^2) + act + mothcoll +
## fathcoll, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.78998 -0.24327 -0.00648 0.26179 0.72231
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.040328 2.443038 2.063 0.0410 *
## pc 0.140446 0.058858 2.386 0.0184 *
## hsgpa -1.802520 1.443551 -1.249 0.2140
## I(hsgpa^2) 0.337340 0.215710 1.564 0.1202
## act 0.004786 0.010786 0.444 0.6580
## mothcoll 0.003091 0.060110 0.051 0.9591
## fathcoll 0.062761 0.062401 1.006 0.3163
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3326 on 134 degrees of freedom
## Multiple R-squared: 0.2361, Adjusted R-squared: 0.2019
## F-statistic: 6.904 on 6 and 134 DF, p-value: 2.088e-06
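The data preparation step creates a column `hsgpa2`, but the model formula uses `I(hsgpa^2)` instead. Both routes should give the same fit. The sketch below illustrates this on simulated stand-in data, not on the GPA1 data; the data-generating numbers and the names `sim`, `fit_inline`, and `fit_column` are assumptions made only for the illustration.

```r
# Simulated stand-in data (NOT the GPA1 data); all values are arbitrary
set.seed(1)
n <- 141
sim <- data.frame(
  pc    = rbinom(n, 1, 0.4),
  hsgpa = runif(n, 2.4, 4.0),
  act   = round(rnorm(n, 24, 3))
)
sim$colgpa <- 1.3 + 0.15 * sim$pc + 0.45 * sim$hsgpa + rnorm(n, sd = 0.33)
sim$hsgpa2 <- sim$hsgpa^2   # precomputed square, as in the mutate() step

# Quadratic term written inline in the formula
fit_inline <- lm(colgpa ~ pc + hsgpa + I(hsgpa^2) + act, data = sim)

# Quadratic term taken from the precomputed column
fit_column <- lm(colgpa ~ pc + hsgpa + hsgpa2 + act, data = sim)

# Compare the coefficient vectors, ignoring the differing term names
same_coefs <- all.equal(unname(coef(fit_inline)), unname(coef(fit_column)))
same_coefs
```

Writing the term inline keeps the formula self-documenting, while the precomputed column is convenient when the squared variable is reused elsewhere.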
The estimated equation for the specification without `hsgpa^2` (the first summary above) is:
\[ \widehat{\text{colGPA}} = 1.256 + 0.152 \cdot \text{pc} + 0.450 \cdot \text{hsgpa} + 0.0077 \cdot \text{act} - 0.0038 \cdot \text{mothcoll} + 0.0418 \cdot \text{fathcoll} \]
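- `pc`: Statistically significant at the 5% level (p = 0.011)
- `mothcoll` and `fathcoll`: Not individually statistically significant (p = 0.950 and p = 0.496)

# F-test for mothcoll and fathcoll
# (at this point `model` holds the specification that includes hsgpa^2)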
anova_model <- anova(update(model, . ~ . - mothcoll - fathcoll), model)
anova_model
## Analysis of Variance Table
##
## Model 1: colgpa ~ pc + hsgpa + I(hsgpa^2) + act
## Model 2: colgpa ~ pc + hsgpa + I(hsgpa^2) + act + mothcoll + fathcoll
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 136 14.949
## 2 134 14.823 2 0.12544 0.567 0.5686
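The F statistic in the table can be reproduced from its own entries as F = (Sum of Sq / number of restrictions) / (unrestricted RSS / unrestricted residual df). The sketch below does this arithmetic with the printed values; the variable names are illustrative, and small differences from the table are expected because the inputs are rounded.

```r
# Values copied from the ANOVA table above
sum_sq           <- 0.12544  # reduction in RSS from adding mothcoll and fathcoll
rss_unrestricted <- 14.823   # RSS of Model 2
n_restrictions   <- 2        # coefficients set to zero under the null
df_unrestricted  <- 134      # residual df of Model 2

# F statistic for the joint null that both coefficients are zero
f_stat <- (sum_sq / n_restrictions) / (rss_unrestricted / df_unrestricted)

# Upper-tail probability under the F(2, 134) distribution
p_value <- pf(f_stat, n_restrictions, df_unrestricted, lower.tail = FALSE)

round(c(F = f_stat, p = p_value), 4)
```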
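- Joint F-test p-value for `mothcoll` and `fathcoll`: 0.569 (F = 0.567, from the ANOVA table, which compares specifications that include `hsgpa^2`)
- Neither `mothcoll` nor `fathcoll` shows a statistically significant individual or joint effect on college GPA in this model.

Adding parental education does not substantially change the estimated effect of PC ownership on college GPA. The `pc` coefficient stays positive and significant at the 5% level: 0.152 (p = 0.011) without the quadratic term and 0.140 (p = 0.018) with it.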
\[ \widehat{\text{colGPA}} = \beta_0 + \beta_1 \cdot \text{pc} + \beta_2 \cdot \text{hsgpa} + \beta_3 \cdot \text{hsgpa}^2 + \beta_4 \cdot \text{act} + \beta_5 \cdot \text{mothcoll} + \beta_6 \cdot \text{fathcoll} \]
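- Including `hsgpa^2` allows for a nonlinear effect of high school GPA on college GPA.
- The `hsgpa^2` coefficient is tested to determine whether the generalization improves the model.

Adding parental education does not substantially change the estimated effect of PC ownership on college GPA. Including a quadratic term for `hsgpa` tests whether nonlinear relationships matter. In the fitted model the `hsgpa^2` coefficient is 0.337 with p = 0.120, which is not significant even at the 10% level, and adjusted R-squared rises only from 0.1934 to 0.2019. The data therefore give no strong support for the generalization.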
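To see what the quadratic specification implies, the marginal effect of high school GPA is the derivative of the fitted equation, `b_hsgpa + 2 * b_hsgpa2 * hsgpa`, which changes sign at `-b_hsgpa / (2 * b_hsgpa2)`. The sketch below evaluates both using the point estimates from the second summary; the names `marginal_effect` and `turning_point` are illustrative. Because neither `hsgpa` term is individually significant in that model, these figures are imprecise and should be read as a rough guide only.

```r
# Point estimates copied from the summary of the quadratic specification
b_hsgpa  <- -1.802520   # coefficient on hsgpa
b_hsgpa2 <-  0.337340   # coefficient on I(hsgpa^2)

# Marginal effect of hsgpa on predicted colgpa at a given hsgpa value
marginal_effect <- function(hsgpa) b_hsgpa + 2 * b_hsgpa2 * hsgpa

# hsgpa value at which the marginal effect equals zero
turning_point <- -b_hsgpa / (2 * b_hsgpa2)

turning_point
marginal_effect(c(2.5, 3.0, 3.5, 4.0))
```

With a positive coefficient on the squared term, the estimated marginal effect increases with `hsgpa` and is positive above the turning point.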