Reading the data

# Read the admissions data and keep only the GRE and GPA columns
dta <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")
dta <- dta[, c("gre", "gpa")]

A look at the data

The first six records of GRE and GPA scores for the 400 students.

head(dta)
  gre  gpa
1 380 3.61
2 660 3.67
3 800 4.00
4 640 3.19
5 520 2.93
6 760 3.00

Basic statistical graphics

Scatter plot of GPA (x-axis) against GRE (y-axis) for the 400 students.

plot(gre ~ gpa, data = dta, type = 'p', xlab = "GPA score", ylab = "GRE score")
grid()

Linear model analysis

\[y_i = \beta_0 + \beta_1 x_i + \epsilon_i ,~~ \epsilon_i \sim N(0, \sigma^2) \]

GRE = intercept + slope × GPA + error (normally distributed)
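
To make the model concrete, the following minimal sketch simulates data from it and refits the line. The sample size, the GPA range, and the parameter values are assumptions chosen only for illustration; `sim`, `fit`, and the other names are hypothetical.

```r
# Simulate from y_i = beta0 + beta1 * x_i + eps_i, eps_i ~ N(0, sd_err^2).
# All numeric values below are illustrative assumptions, not estimates.
set.seed(1)
n      <- 400
beta0  <- 200
beta1  <- 120
sd_err <- 100

sim <- data.frame(gpa = runif(n, min = 2, max = 4))
sim$gre <- beta0 + beta1 * sim$gpa + rnorm(n, mean = 0, sd = sd_err)

# Refit the straight line; the estimates should land near beta0 and beta1.
fit <- lm(gre ~ gpa, data = sim)
coef(fit)
summary(fit)$sigma  # estimate of the error standard deviation
```

Rerunning with a different seed gives slightly different estimates, which is the sampling variability that the standard errors in a regression summary describe.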

Model summary

Show 4 significant digits and suppress the significance stars.

options(digits = 4, show.signif.stars = FALSE)
summary(m0 <- lm(gre ~ gpa, data = dta))

Call:
lm(formula = gre ~ gpa, data = dta)

Residuals:
    Min      1Q  Median      3Q     Max 
-302.39  -62.79   -2.21   68.51  283.44 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    192.3       47.9    4.01  7.2e-05
gpa            116.6       14.0    8.30  1.6e-15

Residual standard error: 107 on 398 degrees of freedom
Multiple R-squared:  0.148, Adjusted R-squared:  0.146 
F-statistic: 68.9 on 1 and 398 DF,  p-value: 1.6e-15

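Based on these data, a one-point increase in GPA is associated with an average increase of about 116.6 points in GRE (standard error 14.0). The estimated residual standard deviation is \(\hat{\sigma} = 107\).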
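
As a worked example of reading the coefficient table, the sketch below plugs the rounded estimates printed above into the fitted line and forms an approximate 95% confidence interval for the slope. The GPA value of 3.0 is an arbitrary choice for illustration, and because the inputs are rounded the results are approximate.

```r
# Rounded values copied from the summary output above.
b0     <- 192.3   # intercept
b1     <- 116.6   # slope for gpa
se_b1  <- 14.0    # standard error of the slope
df_res <- 398     # residual degrees of freedom

# Predicted mean GRE at an illustrative GPA of 3.0.
gpa_new  <- 3.0
gre_pred <- b0 + b1 * gpa_new
gre_pred  # 192.3 + 116.6 * 3 = 542.1

# Approximate 95% confidence interval for the slope.
t_crit <- qt(0.975, df = df_res)
ci_b1  <- b1 + c(-1, 1) * t_crit * se_b1
ci_b1
```

With the fitted object available, `predict(m0, newdata = data.frame(gpa = 3))` and `confint(m0)` give the same quantities from the unrounded estimates.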

Analysis of variance table

anova(m0)
Analysis of Variance Table

Response: gre
           Df  Sum Sq Mean Sq F value  Pr(>F)
gpa         1  786185  786185      69 1.6e-15
Residuals 398 4538099   11402                
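
The ANOVA table ties directly back to the summary output. As a quick check, the sketch below recomputes R-squared, the F statistic, and the residual standard error from the sums of squares printed above. The variable names are only for illustration.

```r
# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_model <- 786185
ss_resid <- 4538099
df_model <- 1
df_resid <- 398

ms_resid  <- ss_resid / df_resid               # residual mean square, about 11402
r2        <- ss_model / (ss_model + ss_resid)  # about 0.148
f_stat    <- (ss_model / df_model) / ms_resid  # about 69
sigma_hat <- sqrt(ms_resid)                    # about 107

# Upper-tail p-value of the F statistic.
p_value <- pf(f_stat, df_model, df_resid, lower.tail = FALSE)

c(R2 = r2, F = f_stat, sigma = sigma_hat, p = p_value)
```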

Fitted model plot

plot(gre ~ gpa, data = dta, xlab = "GPA score", ylab = "GRE score")
abline(m0, lty = 2)
grid()

Residual plot

Check whether the residuals show any systematic pattern.

plot(resid(m0) ~ fitted(m0), xlab = "Fitted values", 
     ylab = "Residuals")
grid()
abline(h = 0, lty = 2)

Checking that the residuals are normally distributed

qqnorm(resid(m0))
qqline(resid(m0))
grid()
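
A Q-Q plot is a visual check, and a formal test can complement it. The sketch below applies the Shapiro-Wilk test (`shapiro.test` from base R's stats package) to simulated residuals so that it runs on its own. The helper name `normality_p` and the simulated values are assumptions for illustration, not part of the original analysis.

```r
# Hypothetical helper: Shapiro-Wilk p-value for a vector of residuals.
normality_p <- function(res) {
  shapiro.test(res)$p.value
}

# Simulated stand-in for resid(m0): 400 normal draws with sd 107.
set.seed(1)
res_sim <- rnorm(400, mean = 0, sd = 107)
p_norm  <- normality_p(res_sim)
p_norm  # a small value would suggest a departure from normality
```

On the fitted model this would be called as `normality_p(resid(m0))`.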

End

Session information for this exercise

sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 16299)

Matrix products: default

locale:
[1] LC_COLLATE=Chinese (Traditional)_Taiwan.950 
[2] LC_CTYPE=Chinese (Traditional)_Taiwan.950   
[3] LC_MONETARY=Chinese (Traditional)_Taiwan.950
[4] LC_NUMERIC=C                                
[5] LC_TIME=Chinese (Traditional)_Taiwan.950    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] compiler_3.4.2  backports_1.1.2 magrittr_1.5    rprojroot_1.3-2
 [5] tools_3.4.2     htmltools_0.3.6 yaml_2.1.16     Rcpp_0.12.13   
 [9] stringi_1.1.5   rmarkdown_1.8   knitr_1.20      stringr_1.2.0  
[13] digest_0.6.12   evaluate_0.10.1