GPA and GRE

dta <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")
dta <- dta[, c("gpa", "gre")] 
head(dta)
##    gpa gre
## 1 3.61 380
## 2 3.67 660
## 3 4.00 800
## 4 3.19 640
## 5 2.93 520
## 6 3.00 760

Basic statistical plots

The R code below draws a scatter plot of the GPA and GRE data.

plot(dta, type = 'p', xlab = "GPA", ylab = "GRE")
grid()

Linear model analysis

\[y_i = \beta_0 + \beta_1 x_i + \epsilon_i ,~~ \epsilon_i \sim N(0, \sigma^2) \]

In words: GRE = intercept + slope × GPA + residual (normally distributed).
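As a rough illustration of what fitting this model involves, the sketch below computes the least-squares intercept and slope by hand and compares them with lm(). It uses R's built-in cars data as a stand-in (an assumption made so the example runs without a download), not the GPA and GRE data.

```r
# Stand-in data: R's built-in 'cars' (speed, dist), NOT the GPA/GRE data
x <- cars$speed
y <- cars$dist

# Closed-form least-squares estimates for simple linear regression
b1 <- cov(x, y) / var(x)        # slope
b0 <- mean(y) - b1 * mean(x)    # intercept

# The same estimates from lm()
fit <- lm(y ~ x)
c(intercept = b0, slope = b1)
coef(fit)
```

The two sets of estimates should agree up to floating-point error, since lm() minimises the same sum of squared residuals.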

Model summary

Print 4 significant digits and suppress the significance stars.

options(digits = 4, show.signif.stars = FALSE)
summary(m0 <- lm(gre ~ gpa, data = dta))
## 
## Call:
## lm(formula = gre ~ gpa, data = dta)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -302.39  -62.79   -2.21   68.51  283.44 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)    192.3       47.9    4.01  7.2e-05
## gpa            116.6       14.0    8.30  1.6e-15
## 
## Residual standard error: 107 on 398 degrees of freedom
## Multiple R-squared:  0.148,  Adjusted R-squared:  0.146 
## F-statistic: 68.9 on 1 and 398 DF,  p-value: 1.6e-15

According to these data, each 1-point increase in GPA is associated with an average increase of about 117 points in GRE score.
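The slope interpretation can be checked with a little arithmetic. The sketch below plugs the rounded estimates from the summary output into the fitted line; predict_gre is a hypothetical helper introduced here for illustration, and with the fitted model object available predict(m0, ...) would do the same job at full precision.

```r
# Rounded estimates copied from the summary output above
b0 <- 192.3   # intercept
b1 <- 116.6   # slope for gpa

# Hypothetical helper: fitted GRE for a given GPA
predict_gre <- function(gpa) b0 + b1 * gpa

fitted_gre <- predict_gre(c(3.0, 4.0))
fitted_gre          # fitted GRE at GPA 3.0 and 4.0
diff(fitted_gre)    # difference for a 1-point GPA increase, i.e. the slope
```

With these rounded values the fitted GRE is about 542 at a GPA of 3.0 and about 659 at 4.0, a difference equal to the slope.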

Analysis of variance table

anova(m0)
## Analysis of Variance Table
## 
## Response: gre
##            Df  Sum Sq Mean Sq F value  Pr(>F)
## gpa         1  786185  786185      69 1.6e-15
## Residuals 398 4538099   11402
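The quantities in this table tie back to the model summary. As a small worked check, using only the sums of squares and degrees of freedom printed above (so the results are subject to their rounding), the F statistic, R-squared, and residual standard error can be recomputed:

```r
# Values copied from the analysis of variance table above
ss_model <- 786185    # sum of squares for gpa (1 df)
ss_resid <- 4538099   # residual sum of squares
df_resid <- 398       # residual degrees of freedom

ms_resid  <- ss_resid / df_resid               # residual mean square
f_stat    <- ss_model / ms_resid               # model has 1 df, so MS = SS
r_squared <- ss_model / (ss_model + ss_resid)  # share of variance explained
rse       <- sqrt(ms_resid)                    # residual standard error

round(c(F = f_stat, R2 = r_squared, RSE = rse), 3)
```

These come out close to the reported F-statistic of 68.9, multiple R-squared of 0.148, and residual standard error of 107.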

Fitted model plot

plot(dta, type = "p", xlab = "GPA", ylab = "GRE")
abline(m0, lty = 2)
grid()

Residual plot

Check whether the residuals show any systematic pattern.

# Standardized residuals, so that the +/-3.5 axis limits match their scale
plot(rstandard(m0) ~ fitted(m0), xlab = "Fitted values", 
     ylab = "Standardized residuals", ylim = c(-3.5, 3.5))
grid()
abline(h = 0, lty = 2)
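Axis limits of ±3.5 suit residuals on a standardized scale, whereas the raw residuals of this model range from about -302 to 283 according to the summary output. As a minimal sketch of how the two relate, again using R's built-in cars data as a stand-in (an assumption, not the GPA and GRE data), each raw residual is divided by its estimated standard deviation:

```r
# Stand-in model on R's built-in 'cars' data, NOT the GPA/GRE model
fit_cars <- lm(dist ~ speed, data = cars)

s <- summary(fit_cars)$sigma   # residual standard error
h <- hatvalues(fit_cars)       # leverage of each observation

# Standardized residuals computed by hand
r_manual <- resid(fit_cars) / (s * sqrt(1 - h))

# Compare with the built-in rstandard()
all.equal(r_manual, rstandard(fit_cars))
range(r_manual)
```

On this scale most points are expected to fall within roughly ±2, which makes unusually large residuals easier to spot.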

Check the normality of the residuals

qqnorm(resid(m0))
qqline(resid(m0))
grid()

End