GPA and GRE

dta <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")
dta <- dta[, c("gre", "gpa")]
head(dta)
##   gre  gpa
## 1 380 3.61
## 2 660 3.67
## 3 800 4.00
## 4 640 3.19
## 5 520 2.93
## 6 760 3.00

Basic statistical plot

Scatter plot of GRE (x-axis) against GPA (y-axis) for the 400 students

plot(dta, type = "p", xlab = "GRE score", ylab = "GPA")
grid()

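Linear model analysis

\[y_i = \beta_0 + \beta_1 x_i + \epsilon_i ,~~ \epsilon_i \sim N(0, \sigma^2) \]

GPA = intercept + slope × GRE + error, where \(y_i\) is the GPA and \(x_i\) the GRE score of student \(i\).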
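
As a rough illustration of what this model says, the sketch below simulates data from it and recovers the two parameters by least squares. The parameter values, the GRE range, and the variable names (`beta0`, `beta1`, `sigma`, `fit_sim`) are made up for the example and are not estimates from the data.

```r
# Simulate from y_i = beta0 + beta1 * x_i + eps_i, eps_i ~ N(0, sigma^2).
# All parameter values below are hypothetical, chosen only for illustration.
set.seed(1)
n     <- 400
beta0 <- 2.6     # hypothetical intercept
beta1 <- 0.0013  # hypothetical slope
sigma <- 0.35    # hypothetical error standard deviation
x <- runif(n, min = 220, max = 800)
y <- beta0 + beta1 * x + rnorm(n, mean = 0, sd = sigma)

# Closed-form least-squares estimates for simple regression
b1 <- cov(x, y) / var(x)
b0 <- mean(y) - b1 * mean(x)

# The same estimates obtained from lm()
fit_sim    <- lm(y ~ x)
ols_closed <- c(b0, b1)
ols_lm     <- unname(coef(fit_sim))
print(rbind(closed_form = ols_closed, lm = ols_lm))
```

The two rows should agree up to floating-point error, which is one way to see that `lm()` is computing the ordinary least-squares fit for this model.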

Model summary

Show 4 significant digits and drop the significance stars.

options(digits = 4, show.signif.stars = FALSE)
summary(m0 <- lm(gpa ~ gre, data = dta))
## 
## Call:
## lm(formula = gpa ~ gre, data = dta)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.0867 -0.2244 -0.0002  0.2481  0.7618 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.645898   0.091310    29.0  < 2e-16
## gre         0.001266   0.000152     8.3  1.6e-15
## 
## Residual standard error: 0.352 on 398 degrees of freedom
## Multiple R-squared:  0.148,  Adjusted R-squared:  0.146 
## F-statistic: 68.9 on 1 and 398 DF,  p-value: 1.6e-15

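In these data, each additional GRE point is associated with a GPA about 0.0013 points higher (standard error 0.0002). The estimated residual standard error (\(\hat{\sigma}\)) is 0.352.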
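
A per-point slope is hard to read on the GRE scale, so it can help to express it per 100 points and attach a confidence interval. The sketch below does this from the estimate, standard error, and degrees of freedom printed in the summary above; the variable names are illustrative, and the interval assumes the usual normal-error model holds.

```r
# Values copied from the summary(m0) output above
est <- 0.001266   # slope estimate for gre
se  <- 0.000152   # its standard error
df  <- 398        # residual degrees of freedom

# 95% confidence interval for the slope, based on the t distribution
t_crit <- qt(0.975, df)
ci     <- est + c(-1, 1) * t_crit * se

# Rescale to the expected GPA difference per 100 GRE points
per100    <- 100 * est
per100_ci <- 100 * ci

print(round(c(estimate = per100, lower = per100_ci[1], upper = per100_ci[2]), 3))
```

With the fitted model at hand, `confint(m0)` should give the same interval for the per-point slope without copying numbers by hand.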

Analysis of variance table

anova(m0)
## Analysis of Variance Table
## 
## Response: gpa
##            Df Sum Sq Mean Sq F value  Pr(>F)
## gre         1    8.5    8.53      69 1.6e-15
## Residuals 398   49.3    0.12
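
The quantities in this table tie back to the summary output. The sketch below recomputes the F statistic, R-squared, and residual standard error from the printed sums of squares; because the inputs are rounded table values, the results should match the summary only approximately. Variable names are illustrative.

```r
# Values copied from the anova(m0) table above (rounded as printed)
ss_reg <- 8.53   # sum of squares for gre (equals its mean square, since Df = 1)
ss_res <- 49.3   # residual sum of squares
df_reg <- 1
df_res <- 398

ms_reg <- ss_reg / df_reg
ms_res <- ss_res / df_res

f_stat <- ms_reg / ms_res              # F statistic
r2     <- ss_reg / (ss_reg + ss_res)   # multiple R-squared
rse    <- sqrt(ms_res)                 # residual standard error

print(round(c(F = f_stat, R2 = r2, RSE = rse), 3))
```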

Fitted model plot

plot(dta, type = "p", xlab = "GRE score", ylab = "GPA")
abline(m0, lty = 2)
grid()

Residual plot

Check whether the residuals show any systematic pattern.

plot(resid(m0) ~ fitted(m0),
     xlab = "Fitted values", 
     ylab = "Residuals", 
     ylim = c(-1.5, 1.5))
abline(h = 0, lty = 2)
grid()

Check the normality of the residuals

qqnorm(resid(m0))
qqline(resid(m0))
grid()
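
A Q-Q plot is a visual check; a formal test such as Shapiro-Wilk can complement it. The sketch below runs the test on residuals from a model fitted to simulated data, so that it runs on its own. The simulated values and the names `fit_sim` and `sw` are assumptions for the example and say nothing about the GPA data.

```r
# Simulated stand-in data with normal errors (hypothetical parameter values)
set.seed(42)
x <- runif(400, min = 220, max = 800)
y <- 2.6 + 0.0013 * x + rnorm(400, mean = 0, sd = 0.35)
fit_sim <- lm(y ~ x)

# Shapiro-Wilk test of normality on the residuals
# (shapiro.test() accepts between 3 and 5000 observations)
sw <- shapiro.test(resid(fit_sim))
print(sw)
```

Applied to the model above, the call would be `shapiro.test(resid(m0))`. A small p-value suggests a departure from normality, though with 400 observations even minor departures can be flagged, so the Q-Q plot remains the more informative check.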

End