2-1 When N is quite large, what is the critical value for the F statistic (\(\alpha = .05\))?

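\(F=(RSS_1-RSS_2)/\hat{\sigma}^2\) with \(\hat{\sigma}^2=RSS_2/(N-(P_1+1)-1)\). The two models in the problem differ by one parameter, so \(K=1\), and \(F \sim F_{K,N-(P_1+K)-1}\). When \(N\) is large the denominator degrees of freedom tend to infinity, so the critical value can be read from the table: \(F_{1,\infty,0.05}=3.84\).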

Source: F Distribution Tables
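The table lookup can be cross-checked numerically. The sketch below relies on the fact that \(F_{1,\infty}\) coincides with a chi-square distribution with 1 degree of freedom, which is the square of a standard normal, so the upper 5% point is \(z_{0.975}^2\). The function name `f_critical_large_n` is made up for this illustration.

```python
from statistics import NormalDist


def f_critical_large_n(alpha: float = 0.05) -> float:
    """Limiting critical value of F(1, d) as d -> infinity.

    F(1, infinity) equals chi-square(1), the square of a standard normal,
    so the upper-alpha point is the squared two-sided normal quantile.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * z


print(round(f_critical_large_n(0.05), 2))  # 3.84
```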


2-2 When AIC is adopted for model selection, what are the expressions of \(AIC_1\) and \(AIC_2\)?

\(AIC_j=\frac{1}{N\hat{\sigma}^2}(RSS_j+2P_j\hat{\sigma}^2)\). Substituting \(\hat{\sigma}^2=RSS_2/(N-P_1-2)\) and \(P_2=P_1+1\) gives \(AIC_1\) and \(AIC_2\):

\(AIC_1=\frac{1}{N}(\frac{RSS_1(N-P_1-2)}{RSS_2}+2P_1)\)

\(AIC_2=\frac{1}{N}(N+P_1)\)
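As a quick sanity check, the two closed forms can be compared with the general definition of \(AIC_j\). This is a minimal sketch: the helper `aic_pair` and the numbers for `n`, `p1`, `rss1`, and `rss2` are invented for illustration and do not come from the problem.

```python
import math


def aic_pair(rss1: float, rss2: float, n: int, p1: int) -> tuple[float, float]:
    """Return (AIC_1, AIC_2) from the general definition in the text."""
    sigma2_hat = rss2 / (n - p1 - 2)            # error variance estimated from M2
    aic1 = (rss1 / sigma2_hat + 2 * p1) / n     # M1 has P_1 predictors
    aic2 = (rss2 / sigma2_hat + 2 * (p1 + 1)) / n  # M2 has P_1 + 1 predictors
    return aic1, aic2


# Hypothetical values, chosen only to exercise the formulas.
n, p1 = 200, 3
rss1, rss2 = 410.0, 390.0
aic1, aic2 = aic_pair(rss1, rss2, n, p1)

# Closed forms derived above.
assert math.isclose(aic1, (rss1 * (n - p1 - 2) / rss2 + 2 * p1) / n)
assert math.isclose(aic2, (n + p1) / n)
```

Note that \(AIC_2\) does not depend on the data at all once \(\hat{\sigma}^2\) is estimated from M2; only \(AIC_1\) carries the comparison.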


2-3 Please show that the decision rules of both F-test and AIC can be written as, What are the corresponding c’s?

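Given \(AIC_1=\frac{1}{N\hat{\sigma}^2}(RSS_1+2P_1\hat{\sigma}^2)\) and \(AIC_2=\frac{1}{N\hat{\sigma}^2}(RSS_2+2(P_1+1)\hat{\sigma}^2)\), we first put the AIC rule into the form given in the question. To obtain \(RSS_1-RSS_2\), the natural step is to take \(AIC_1-AIC_2\). We do not know in advance which of \(AIC_1\) and \(AIC_2\) is larger (that is, which model is worse), so either \(AIC_1-AIC_2<0\) or \(AIC_1-AIC_2\geq0\) can occur. Take the first case: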

\(AIC_1-AIC_2<0\)

\(\frac{RSS_1-RSS_2-2\hat{\sigma}^2}{N\hat{\sigma}^2}<0\). Multiplying both sides by \(N\) and rearranging gives

\(\frac{RSS_1-RSS_2}{\hat{\sigma}^2}<2\). By the same argument, the other case \(AIC_1-AIC_2\geq0\) gives

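\(\frac{RSS_1-RSS_2}{\hat{\sigma}^2}\geq2\), so \(c=2\) for AIC. For the F-test, the statistic in 2-1 is this same ratio and the null hypothesis is rejected when it reaches the critical value, so \(c=3.84\) when \(N\) is large.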
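The equivalence between comparing the two AIC values and comparing the ratio with 2 can be checked numerically. The sketch below uses hypothetical values (`n`, `p1`, `rss2`, and the grid of `rss1`), and the helper names are made up for this example; it only confirms that the two ways of deciding agree.

```python
def aic_difference(rss1: float, rss2: float, n: int, p1: int) -> float:
    """AIC_1 - AIC_2 computed from the definitions in the text."""
    sigma2_hat = rss2 / (n - p1 - 2)  # error variance estimated from M2
    aic1 = (rss1 + 2 * p1 * sigma2_hat) / (n * sigma2_hat)
    aic2 = (rss2 + 2 * (p1 + 1) * sigma2_hat) / (n * sigma2_hat)
    return aic1 - aic2


def ratio(rss1: float, rss2: float, n: int, p1: int) -> float:
    """(RSS_1 - RSS_2) / sigma_hat^2, the statistic shared by all the rules."""
    sigma2_hat = rss2 / (n - p1 - 2)
    return (rss1 - rss2) / sigma2_hat


# Hypothetical setup: sigma_hat^2 = 96 / 96 = 1, so the ratio is rss1 - rss2.
n, p1, rss2 = 100, 2, 96.0
for rss1 in (96.5, 97.5, 98.5, 99.5, 101.0):
    by_aic = aic_difference(rss1, rss2, n, p1) >= 0   # AIC_1 >= AIC_2
    by_threshold = ratio(rss1, rss2, n, p1) >= 2      # ratio >= c with c = 2
    assert by_aic == by_threshold
```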


2-4 Actually, decision rule of BIC can be also written in that form of 3. What is the corresponding c?

Given \(BIC_1=\frac{1}{N\hat{\sigma}^2}(RSS_1+\log(N)P_1\hat{\sigma}^2)\) and \(BIC_2=\frac{1}{N\hat{\sigma}^2}(RSS_2+\log(N)(P_1+1)\hat{\sigma}^2)\), the argument is the same as in the previous question. Start from

\(BIC_1-BIC_2<0\)

\(\frac{RSS_1-RSS_2-\log(N)\hat{\sigma}^2}{N\hat{\sigma}^2}<0\). Multiplying both sides by \(N\) and rearranging gives

\(\frac{RSS_1-RSS_2}{\hat{\sigma}^2}<\log(N)\). By the same argument, the other case \(BIC_1-BIC_2\geq0\) gives

\(\frac{RSS_1-RSS_2}{\hat{\sigma}^2}\geq \log(N)\), so \(c=\log(N)\).
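Because the BIC threshold depends on \(N\), it helps to see where \(\log(N)\) sits relative to the constant thresholds 2 (AIC) and 3.84 (F-test at \(\alpha=.05\)). The sketch below assumes the natural logarithm, as is standard for BIC; the helper name `smallest_n_exceeding` is made up for this illustration.

```python
import math


def smallest_n_exceeding(c: float) -> int:
    """Smallest sample size N with log(N) > c (natural logarithm)."""
    n = 1
    while math.log(n) <= c:
        n += 1
    return n


print(smallest_n_exceeding(2.0))   # 8
print(smallest_n_exceeding(3.84))  # 47
```

So the BIC threshold already exceeds the AIC threshold at \(N=8\), and exceeds the large-sample F-test threshold from \(N=47\) onward.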


2-5 If the population value of \(\beta_{P_1+1}\) is not zero, which decision rule has larger statistical power?

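From the results of questions 3 and 4, every rule compares \((RSS_1-RSS_2)/\hat{\sigma}^2\) with a threshold \(c\), so the smaller \(c\) is, the more easily it is reached and the larger the statistical power of that rule. Power here means the probability of detecting that M2 is the better model when \(\beta_{P_1+1}\neq0\). The thresholds are \(c=2\) for AIC, \(c=3.84\) for the F-test at \(\alpha=.05\), and \(c=\log(N)\) for BIC. Since \(\log(N)\) grows with the sample size and exceeds 2 once \(N\geq8\), BIC has the larger threshold when the sample is large. AIC has the smallest \(c\) and is therefore the rule best able to distinguish M1 from M2, that is, it has the larger power.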
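The ordering of the powers can be illustrated with a large-sample approximation. The sketch below assumes that, under the alternative, the statistic behaves roughly like \((Z+\delta)^2\) with \(Z\sim N(0,1)\), that is, a noncentral chi-square with 1 degree of freedom. The noncentrality `delta`, the sample size `n`, and the function name `approx_power` are hypothetical choices for illustration, not values from the problem.

```python
from math import log, sqrt
from statistics import NormalDist


def approx_power(c: float, delta: float) -> float:
    """P((Z + delta)^2 >= c) for Z ~ N(0, 1): approximate power at threshold c."""
    phi = NormalDist().cdf
    r = sqrt(c)
    # (Z + delta)^2 >= c  <=>  Z >= r - delta  or  Z <= -r - delta
    return (1 - phi(r - delta)) + phi(-r - delta)


delta = 2.0  # hypothetical standardized effect of the extra predictor
n = 1000     # hypothetical sample size, used only for the BIC threshold
for name, c in [("AIC", 2.0), ("F-test", 3.84), ("BIC", log(n))]:
    print(f"{name:7s} c={c:5.2f}  power~{approx_power(c, delta):.3f}")
```

For any fixed `delta`, the approximate power decreases as `c` increases, which matches the ordering AIC > F-test > BIC when \(N\) is large.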