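Data on productive fixed-asset value (X, in 10,000 yuan) and gross industrial output value (Y, in 10,000 yuan) are available for 15 enterprises of the same type. The task is to determine the relationship between X and Y.
Assume Y = a + bX. The null and alternative hypotheses for the significance test of the regression equation are then: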
H0:b=0
Ha:b≠0
α=0.05
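# Productive fixed-asset value (X) and gross industrial output (Y), both in 10,000 yuan
x <- c(318, 910, 200, 409, 415, 502, 314, 1210, 1022, 1225, 1800, 540, 2050,
       1303, 890)
y <- c(524, 1019, 638, 815, 913, 928, 605, 1516, 1219, 1624, 2312, 870, 2346,
       1789, 1250)
# Fit the simple linear regression Y = a + bX and print the test results
lm.reg <- lm(y ~ x)
summary(lm.reg)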
##
## Call:
## lm(formula = y ~ x)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -242.11  -60.94    9.14   87.74  152.95
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 339.9673    60.0090    5.67  7.7e-05 ***
## x             1.0122     0.0583   17.36  2.2e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 123 on 13 degrees of freedom
## Multiple R-squared: 0.959, Adjusted R-squared: 0.955
## F-statistic: 301 on 1 and 13 DF, p-value: 2.24e-10
plot(x, y)
abline(lm.reg)
predict(lm.reg, interval = "prediction", level = 0.95)
## Warning: predictions on current data refer to _future_ responses
##       fit    lwr    upr
## 1   661.9  379.0  944.7
## 2  1261.1  987.0 1535.2
## 3   542.4  255.5  829.3
## 4   754.0  473.8 1034.2
## 5   760.0  480.0 1040.1
## 6   848.1  570.1 1126.1
## 7   657.8  374.9  940.8
## 8  1564.8 1287.5 1842.1
## 9  1374.5 1099.8 1649.1
## 10 1580.0 1302.4 1857.5
## 11 2162.0 1864.2 2459.8
## 12  886.6  609.3 1163.8
## 13 2415.1 2103.6 2726.6
## 14 1658.9 1379.6 1938.2
## 15 1240.9  966.8 1514.9
# Draw the four diagnostic plots in a 2 x 2 grid, then restore the graphics settings
op <- par(mfrow = c(2, 2))
plot(lm.reg)
par(op)
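The output shows that the p-value of the F statistic (2.24e-10) and the p-value of the t statistic for the slope (2.2e-10) are both far below α = 0.05. H0 is therefore rejected: both the slope coefficient and the regression equation as a whole are significant.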
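With a single predictor, the F statistic equals the square of the slope's t statistic, which is why the two p-values agree. The following is a minimal sketch that checks this on the fitted model, assuming the same x and y as above; the names `s`, `t_slope`, and `f_value` are introduced here for illustration only.

```r
x <- c(318, 910, 200, 409, 415, 502, 314, 1210, 1022, 1225, 1800, 540, 2050,
       1303, 890)
y <- c(524, 1019, 638, 815, 913, 928, 605, 1516, 1219, 1624, 2312, 870, 2346,
       1789, 1250)
lm.reg <- lm(y ~ x)
s <- summary(lm.reg)

# t statistic of the slope and the overall F statistic (illustrative names)
t_slope <- s$coefficients["x", "t value"]
f_value <- s$fstatistic[["value"]]

# In simple linear regression, t^2 for the slope equals F
all.equal(t_slope^2, f_value)

# p-value of the F test on (1, n - 2) degrees of freedom
pf(f_value, s$fstatistic[["numdf"]], s$fstatistic[["dendf"]],
   lower.tail = FALSE)
```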
The fitted regression equation is Y = 339.9673 + 1.0122X, with R-squared = 0.9587.
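These estimates can also be reproduced from the closed-form least-squares formulas. The sketch below assumes the same data as above; `a`, `b`, and `r2` are illustrative names, and the last line simply compares against `lm`.

```r
x <- c(318, 910, 200, 409, 415, 502, 314, 1210, 1022, 1225, 1800, 540, 2050,
       1303, 890)
y <- c(524, 1019, 638, 815, 913, 928, 605, 1516, 1219, 1624, 2312, 870, 2346,
       1789, 1250)

# Least-squares slope: Sxy / Sxx; intercept: mean(y) - slope * mean(x)
b <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
a <- mean(y) - b * mean(x)

# With one predictor, R-squared is the squared correlation of x and y
r2 <- cor(x, y)^2

c(intercept = a, slope = b, r.squared = r2)

# Compare with the coefficients estimated by lm()
coef(lm(y ~ x))
```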
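In the Residuals vs Fitted plot, the points are spread roughly evenly on both sides of the line y = 0, with no obvious trend.
In the Normal Q-Q plot, the points lie close to a straight line, which suggests that the residuals are approximately normally distributed.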
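The Q-Q plot is only a visual check. As a complementary sketch, the Shapiro-Wilk test (`shapiro.test` in base R) can be applied to the residuals. This test is not part of the original analysis, and with only 15 observations it has limited power, so a large p-value would not prove normality.

```r
x <- c(318, 910, 200, 409, 415, 502, 314, 1210, 1022, 1225, 1800, 540, 2050,
       1303, 890)
y <- c(524, 1019, 638, 815, 913, 928, 605, 1516, 1219, 1624, 2312, 870, 2346,
       1789, 1250)
lm.reg <- lm(y ~ x)

# Shapiro-Wilk test on the residuals; H0: the residuals are normally distributed
sw <- shapiro.test(residuals(lm.reg))
sw$statistic
sw$p.value
```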
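The Scale-Location plot shows the square root of the absolute standardized residuals against the fitted values. The highest point corresponds to the observation with the largest absolute standardized residual.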
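The fourth plot (Residuals vs Leverage) includes Cook's distance contours and shows which points are influential for the regression.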
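The influence measures behind that plot can be inspected numerically as well. The sketch below uses `cooks.distance` and `hatvalues` from base R on the same data; the 4/n cutoff is a common rule of thumb assumed here for illustration, not a threshold taken from the original analysis.

```r
x <- c(318, 910, 200, 409, 415, 502, 314, 1210, 1022, 1225, 1800, 540, 2050,
       1303, 890)
y <- c(524, 1019, 638, 815, 913, 928, 605, 1516, 1219, 1624, 2312, 870, 2346,
       1789, 1250)
lm.reg <- lm(y ~ x)

cd  <- cooks.distance(lm.reg)  # Cook's distance for each observation
lev <- hatvalues(lm.reg)       # leverage (diagonal of the hat matrix)
n   <- length(x)

# Observations above the assumed rule-of-thumb cutoff 4 / n
which(cd > 4 / n)

# The three observations with the largest Cook's distance
round(sort(cd, decreasing = TRUE)[1:3], 3)
```

Points flagged this way are candidates for a closer look, not automatic outliers to drop.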