3.23
##data input
setwd("C:\\Users\\Mr.Young\\Desktop")
data <- read.csv("Table 3_8.csv", header = TRUE)
attach(data)
(1) Scatterplot
plot(Year, NGDP, col = "blue", ylab = "GDP")
points(Year, RGDP, col = "red")
axis(4)
legend("topleft", legend = c("NGDP","RGDP"), pch = c("o","o"),
col = c("blue","red"))
(2) Linear regression model
X <- 1:47
Y1 <- NGDP
Y2 <- RGDP
lm.N <- lm(Y1~X)
summary(lm.N)
##
## Call:
## lm(formula = Y1 ~ X)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1084 -869 -297 773 2308
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1723.2 293.3 -5.87 4.8e-07 ***
## X 252.6 10.6 23.74 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 989 on 45 degrees of freedom
## Multiple R-squared: 0.926, Adjusted R-squared: 0.924
## F-statistic: 564 on 1 and 45 DF, p-value: <2e-16
lm.R <- lm(Y2~X)
summary(lm.R)
##
## Call:
## lm(formula = Y2 ~ X)
##
## Residuals:
## Min 1Q Median 3Q Max
## -757.4 -348.6 -65.2 324.4 955.8
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1620.42 126.16 12.8 <2e-16 ***
## X 180.26 4.58 39.4 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 426 on 45 degrees of freedom
## Multiple R-squared: 0.972, Adjusted R-squared: 0.971
## F-statistic: 1.55e+03 on 1 and 45 DF, p-value: <2e-16
(3) The slope gives the average change in GDP per time period: about 252.6 for nominal GDP and 180.3 for real GDP
(4) The difference between the two represents inflation over time
(5) As the figure and regression results indicate,
nominal GDP has been growing faster than real GDP,
so prices may have been rising over time.
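To make the link in (4) and (5) concrete, the implicit price deflator can be computed as the ratio of nominal to real GDP. The sketch below is only an illustration: the vectors `ngdp` and `rgdp` are hypothetical toy numbers, not the values in Table 3_8.csv, and all variable names are assumptions.

```r
# Hypothetical toy series (NOT the Table 3_8 data), for illustration only
ngdp <- c(500, 560, 630, 710, 800)   # nominal GDP
rgdp <- c(500, 530, 560, 590, 620)   # real GDP (base-period prices)

# Implicit price deflator: nominal GDP relative to real GDP, base = 100
deflator <- 100 * ngdp / rgdp

# Period-to-period inflation rate implied by the deflator, in percent
inflation <- 100 * diff(deflator) / head(deflator, -1)

# Trend slopes, as in the regressions on the time index
x <- seq_along(ngdp)
slope_n <- unname(coef(lm(ngdp ~ x))[2])
slope_r <- unname(coef(lm(rgdp ~ x))[2])

print(round(deflator, 1))
print(round(inflation, 2))
print(c(nominal = slope_n, real = slope_r))
```

When the nominal trend is steeper than the real trend, the deflator rises from one period to the next, which is the pattern described in (5).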
4.3
n <- 1000
X <- rexp(n,2) #assume that theta=0.5
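lnLL <- function(param, data)
# Argument "param" is the rate parameter, i.e. 1/theta
# Argument "data" is the observed data
{
lnDens <- dexp(data, rate = param, log = TRUE)
lnLL <- sum(lnDens)
# nlminb(), used below, minimizes a function,
# but we want to maximize the log-likelihood,
# so we return its negative
return(-lnLL)
}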
MLE <- nlminb(1,lnLL,data=X,lower=0,upper=5)
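The starting value is 1 (for the rate, so theta = 1 as well)
lnLL is the function to be minimized
data is the data used for fitting, here X
lower and upper give the lower and upper bounds of the parameter, here 0 and 5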
theta=1/MLE$par
theta
## [1] 0.4987
The estimate is close to the theoretical value 0.5, which supports the reliability of maximum likelihood estimation
mean(X)
## [1] 0.4987
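Compared with the mean of X, the result is approximately the same; this is expected, because the maximum likelihood estimate of theta for an exponential distribution is the sample mean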
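The agreement with mean(X) is not a coincidence: for the exponential density with mean theta, setting the derivative of the log-likelihood to zero gives a theta estimate equal to the sample mean. A minimal sketch that checks this numerically follows. The seed, the simulated sample, and the names `neg_loglik` and `fit` are assumptions made for the illustration, and `optimize()` is swapped in for `nlminb()` because the problem is one-dimensional, so the numbers will differ from the 0.4987 shown above.

```r
set.seed(1)                     # arbitrary seed, for reproducibility only
x <- rexp(1000, rate = 2)       # true theta = 1/rate = 0.5

# Negative log-likelihood as a function of the rate (1/theta)
neg_loglik <- function(rate, data) -sum(dexp(data, rate = rate, log = TRUE))

# One-dimensional minimization over a bounded interval
fit <- optimize(neg_loglik, interval = c(1e-6, 5), data = x)

theta_numeric <- 1 / fit$minimum  # numerical MLE of theta
theta_closed  <- mean(x)          # closed-form MLE of theta

print(c(numeric = theta_numeric, closed_form = theta_closed))
```

The lower bound is set slightly above zero so that the density is never evaluated at a rate of exactly 0.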
5.8
(1) There is a positive association between the LFPR in 1972 and the LFPR in 1968
(2) Use a one-tailed t test
b2Hat <- 0.6560
se <- 0.1961
(t <- (b2Hat-1)/se)
## [1] -1.754
df <- 17
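The one-tailed critical t value at a = 5% with 17 df is about 1.740. The computed t value of -1.754 is below this critical value, so it is not significant in the right tail. We therefore do not reject the null hypothesis and conclude that the data give no evidence that the slope is greater than 1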
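The critical value quoted above can be read from a t table or computed directly. The sketch below uses base R's `qt()` and `pt()`; the variable names are chosen only for this illustration.

```r
b2_hat <- 0.6560   # estimated slope from the regression output
se_b2  <- 0.1961   # its standard error
dof    <- 17       # degrees of freedom used above

t_stat  <- (b2_hat - 1) / se_b2                  # test statistic for slope = 1
t_crit  <- qt(0.95, dof)                         # right-tail 5% critical value
p_right <- pt(t_stat, dof, lower.tail = FALSE)   # P(T > t_stat)

print(c(t = t_stat, critical = t_crit, p_value = p_right))

# Reject in favor of "slope > 1" only if t_stat exceeds t_crit
reject <- t_stat > t_crit
print(reject)
```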
(3)
mean1972 <- 0.2033+0.6560*0.58
We can use Eq. (5.10.4) to establish a 95% confidence interval. However, we cannot compute it at the moment because we do not know the standard error of the forecast value.
(4) We cannot do this part without the actual data
5.9
##data input
data <- read.csv("Table 5_5.csv", header = TRUE)
attach(data)
(1)
plot(SPENDING, SALARY)
(2)
lm.fit <- lm(SALARY~SPENDING)
summary(lm.fit)
##
## Call:
## lm(formula = SALARY ~ SPENDING)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3848 -1845 -218 1660 5529
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.21e+04 1.20e+03 10.1 1.3e-13 ***
## SPENDING 3.31e+00 3.12e-01 10.6 2.7e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2320 on 49 degrees of freedom
## Multiple R-squared: 0.697, Adjusted R-squared: 0.691
## F-statistic: 113 on 1 and 49 DF, p-value: 2.71e-14
SALARY = 12129.4 + 3.308*SPENDING
se = (1197) (0.3117)
Multiple R-squared: 0.6968
(RSS <- sum(residuals(lm.fit)^2))
## [1] 264825250
(3) When spending increases by one dollar, average pay increases by about $3.31. The intercept has no meaningful interpretation here.
(4)
confint(lm.fit)
## 2.5 % 97.5 %
## (Intercept) 9723.204 14535.538
## SPENDING 2.681 3.934
The 95% confidence interval for beta2 is (2.681192, 3.933978). Since 3 lies inside this interval, we do not reject the null hypothesis that the slope is 3 at the 5% level
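The same conclusion follows from a two-sided t test, since a 95% confidence interval contains exactly the values that a 5% two-sided test would not reject. The sketch below uses the rounded estimates printed in the summary, so its results are approximate; the variable names are assumptions for the illustration.

```r
b2_hat <- 3.308    # slope estimate (rounded, from the summary above)
se_b2  <- 0.3117   # standard error of the slope
dof    <- 49       # residual degrees of freedom

t_stat <- (b2_hat - 3) / se_b2       # test statistic for H0: slope = 3
t_crit <- qt(0.975, dof)             # two-sided 5% critical value
p_val  <- 2 * pt(abs(t_stat), dof, lower.tail = FALSE)

print(c(t = t_stat, critical = t_crit, p_value = p_val))

# Interval rebuilt from the rounded inputs; close to the confint() output
ci <- b2_hat + c(-1, 1) * t_crit * se_b2
print(ci)
```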
(5)
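Confidence interval:
the range in which the mean response is likely to fall for given values of the predictors
Prediction interval:
the range in which a single observation is likely to fall for given values of the predictors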
predict(lm.fit, newdata = data.frame(SPENDING = 5000), interval = "confidence", se.fit = TRUE)
## $fit
## fit lwr upr
## 1 28667 27621 29713
##
## $se.fit
## [1] 520.6
##
## $df
## [1] 49
##
## $residual.scale
## [1] 2325
The mean forecast value is 28667.3
The standard error of the mean forecast is 520.6056, as given by Eq. 5.10.2
The 95% confidence interval for the mean prediction is (27621.1, 29713.49)
predict(lm.fit, newdata = data.frame(SPENDING=5000),interval = "prediction", se.fit = TRUE)
## $fit
## fit lwr upr
## 1 28667 23880 33455
##
## $se.fit
## [1] 520.6
##
## $df
## [1] 49
##
## $residual.scale
## [1] 2325
The individual forecast value is 28667.3
Comparing Eq. 5.10.6 with Eq. 5.10.2,
we can compute the standard error of the individual forecast ourselves; it is 2382.357
sigma2 <- RSS/(51-2)
sqrt(sigma2+520.6056^2)
## [1] 2382
The 95% prediction interval for the individual value is (23879.77, 33454.82)
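The relationship between the two standard errors can be checked by hand. The sketch below fits a regression to simulated data (hypothetical, not Table 5_5) and computes the variance of the mean forecast, sigma^2 * [1/n + (x0 - xbar)^2 / sum((x - xbar)^2)], and of the individual forecast, which adds one more sigma^2. These are the standard textbook formulas, assumed here to be the ones that Eq. 5.10.2 and Eq. 5.10.6 refer to; all names and simulated values are assumptions.

```r
set.seed(42)                                   # arbitrary seed
n <- 51
x <- runif(n, 2000, 8000)                      # hypothetical regressor
y <- 12000 + 3.3 * x + rnorm(n, sd = 2300)     # hypothetical response
fit <- lm(y ~ x)

x0     <- 5000                                 # point at which to forecast
sigma2 <- sum(residuals(fit)^2) / (n - 2)      # estimated error variance
sxx    <- sum((x - mean(x))^2)

# Standard error of the mean forecast
se_mean <- sqrt(sigma2 * (1 / n + (x0 - mean(x))^2 / sxx))
# Standard error of the individual forecast: one extra sigma2
se_ind  <- sqrt(sigma2 + se_mean^2)

# Compare with predict(): se.fit is the mean-forecast standard error
p <- predict(fit, newdata = data.frame(x = x0),
             interval = "prediction", se.fit = TRUE)
t_crit    <- qt(0.975, n - 2)
manual_pi <- as.numeric(p$fit[1, "fit"]) + c(-1, 1) * t_crit * se_ind

print(c(se_mean = se_mean, se_fit = unname(p$se.fit), se_ind = se_ind))
print(rbind(manual = manual_pi, predict = unname(p$fit[1, c("lwr", "upr")])))
```

The hand-computed interval should match the one from `predict()` up to floating-point error, which shows that the prediction interval is wider than the confidence interval only because of the extra sigma^2 term.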
(6)
plot(lm.fit,which = 2)
The Q-Q plot shows that most points lie close to a straight line,
so we do not reject the normality assumption for the residuals
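A formal test can complement the visual check. The sketch below computes the Jarque-Bera statistic by hand on the residuals of a regression fitted to simulated data (hypothetical, not Table 5_5); the function name `jarque_bera` is introduced only for this illustration. Under normality the statistic is asymptotically chi-squared with 2 degrees of freedom.

```r
# Jarque-Bera statistic: n * (S^2 / 6 + (K - 3)^2 / 24),
# where S is the sample skewness and K the sample kurtosis
jarque_bera <- function(e) {
  n  <- length(e)
  e  <- e - mean(e)
  m2 <- mean(e^2)
  s  <- mean(e^3) / m2^1.5        # skewness
  k  <- mean(e^4) / m2^2          # kurtosis
  jb <- n * (s^2 / 6 + (k - 3)^2 / 24)
  c(statistic = jb, p_value = pchisq(jb, df = 2, lower.tail = FALSE))
}

set.seed(7)                                    # arbitrary seed
x <- runif(51, 2000, 8000)                     # hypothetical regressor
y <- 12000 + 3.3 * x + rnorm(51, sd = 2300)    # normal errors by construction
res <- residuals(lm(y ~ x))

out <- jarque_bera(res)
print(out)
# A large p-value means normality of the residuals is not rejected
```

With the actual data, `residuals(lm.fit)` would take the place of `res`. Note that the chi-squared approximation is rough for a sample of about 50 observations.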