3.23

##data input
setwd("C:\\Users\\Mr.Young\\Desktop")
data <- read.csv("Table 3_8.csv", header = TRUE)
attach(data)

(1)scatterplot

plot(Year, NGDP, col = "blue", ylab = "GDP")
points(Year, RGDP, col = "red")
axis(4)
legend("topleft", pch = c("o","o"), c("NGDP","RGDP"), 
       col = c("blue","red"))


(2)linear regression model

X <- 1:47
Y1 <- NGDP
Y2 <- RGDP
lm.N <- lm(Y1~X)
summary(lm.N)
## 
## Call:
## lm(formula = Y1 ~ X)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##  -1084   -869   -297    773   2308 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -1723.2      293.3   -5.87  4.8e-07 ***
## X              252.6       10.6   23.74  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 989 on 45 degrees of freedom
## Multiple R-squared:  0.926,  Adjusted R-squared:  0.924 
## F-statistic:  564 on 1 and 45 DF,  p-value: <2e-16
lm.R <- lm(Y2~X)
summary(lm.R)
## 
## Call:
## lm(formula = Y2 ~ X)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -757.4 -348.6  -65.2  324.4  955.8 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1620.42     126.16    12.8   <2e-16 ***
## X             180.26       4.58    39.4   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 426 on 45 degrees of freedom
## Multiple R-squared:  0.972,  Adjusted R-squared:  0.971 
## F-statistic: 1.55e+03 on 1 and 45 DF,  p-value: <2e-16

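(3)The slope gives the average change in GDP per time period (one year): about 252.6 per year for nominal GDP and about 180.3 per year for real GDP, in the units of the data.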

(4)The difference between the two slopes reflects inflation over the period, since nominal GDP is measured at current prices and real GDP at constant prices.

(5)As the figure and the regression results indicate, nominal GDP has been growing faster than real GDP, so prices have been rising over time, that is, there has been inflation.
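One way to make the link between the two series and inflation concrete is the implicit GDP deflator, 100 * NGDP / RGDP. The sketch below uses made-up numbers rather than the values in Table 3_8.csv, and the names ngdp_demo and rgdp_demo are hypothetical.

```r
# Hypothetical nominal and real GDP values, for illustration only
ngdp_demo <- c(500, 550, 610)
rgdp_demo <- c(500, 520, 545)

# Implicit price deflator: nominal GDP relative to real GDP, base = 100
deflator <- 100 * ngdp_demo / rgdp_demo

# Period-to-period inflation rate in percent
inflation <- 100 * diff(deflator) / head(deflator, -1)

print(round(deflator, 1))
print(round(inflation, 1))
```

When nominal GDP grows faster than real GDP, the deflator rises from one period to the next, which is what a positive inflation rate means.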

4.3

n <- 1000
X <- rexp(n, 2)   # assume theta = 0.5, i.e. rate = 1/theta = 2
lnLL <- function(param, data)
  # "param" is the rate parameter of the exponential, i.e. 1/theta
  # "data" is the observed sample
{
  lnDens <- dexp(data, rate = param, log = TRUE)
  lnLL <- sum(lnDens)
  return(-lnLL)
  # nlminb(), used below, minimizes a function,
  # but we want to maximize the log-likelihood,
  # so we return its negative
}
MLE <- nlminb(1, lnLL, data = X, lower = 0, upper = 5)

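The starting value is 1 (rate = 1, so theta = 1)

lnLL is the function being minimized

data is the data used for fitting, here X

lower and upper set the lower and upper bounds of the parameter, here 0 and 5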

theta=1/MLE$par
theta
## [1] 0.4987

The result is close to the theoretical value 0.5, which illustrates that maximum likelihood recovers the parameter well

mean(X)
## [1] 0.4987

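Compared with the sample mean of X, the result is nearly identical, as expected: for the exponential distribution the MLE of the mean is the sample mean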
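The agreement with mean(X) follows from the log-likelihood n*log(rate) - rate*sum(x), which is maximized at rate = 1/mean(x). A minimal sketch of that check on a fresh simulated sample is given below; the seed, the sample size, and the names x_demo and neg_ll are assumptions made for illustration, and the lower bound is set slightly above 0 to keep the rate strictly positive.

```r
set.seed(123)                     # arbitrary seed, for reproducibility only
x_demo <- rexp(500, rate = 2)     # hypothetical sample, true mean 0.5

# Negative log-likelihood as a function of the rate
neg_ll <- function(rate, data) -sum(dexp(data, rate = rate, log = TRUE))

fit <- nlminb(1, neg_ll, data = x_demo, lower = 1e-8, upper = 5)

theta_numeric <- 1 / fit$par      # numerical MLE of the mean
theta_closed  <- mean(x_demo)     # closed-form MLE of the mean

print(c(theta_numeric, theta_closed))
```

The two printed values should agree up to the tolerance of the optimizer.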

5.8

(1)There is a positive association between the LFPR in 1972 and the LFPR in 1968.

(2)Use a one-tailed t test

b2Hat <- 0.6560
se <- 0.1961
(t <- (b2Hat-1)/se)
## [1] -1.754
df <- 17

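With 17 df, the one-tailed critical t value at α = 5% is about 1.740. Because |t| = 1.754 is slightly larger than 1.740, the statistic falls in the lower-tail rejection region. At the 5% level we therefore reject the hypothesis that the true slope is 1 or greater and conclude that the slope is less than 1. The margin is narrow, so the conclusion is sensitive to the chosen significance level.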
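The critical value and the p-value can be checked directly with qt() and pt(). The sketch below reuses the estimate, standard error, and degrees of freedom given above; the variable names are chosen for illustration, and the null hypothesis is assumed to be that the slope is 1 or greater, tested against a slope below 1.

```r
b2_hat <- 0.6560    # estimated slope
se_b2  <- 0.1961    # its standard error
df_res <- 17        # residual degrees of freedom

t_stat <- (b2_hat - 1) / se_b2   # test statistic under slope = 1
t_crit <- qt(0.05, df_res)       # lower-tail 5% critical value (negative)
p_val  <- pt(t_stat, df_res)     # one-tailed p-value, lower tail

print(c(t_stat = t_stat, t_crit = t_crit, p_val = p_val))
print(t_stat < t_crit)           # TRUE means reject at the 5% level
```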

(3)

mean1972 <- 0.2033+0.6560*0.58

We can use Eq.(5.10.4) to establish a 95% confidence interval. However, we cannot compute it at the moment because we do not know the standard error of the forecast value.

(4)We cannot do this part without the actual data.

5.9

##data input

data <- read.csv("Table 5_5.csv", header = TRUE)
attach(data)

(1)

plot(SALARY,SPENDING)


(2)

lm.fit <- lm(SALARY~SPENDING)
summary(lm.fit)
## 
## Call:
## lm(formula = SALARY ~ SPENDING)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##  -3848  -1845   -218   1660   5529 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.21e+04   1.20e+03    10.1  1.3e-13 ***
## SPENDING    3.31e+00   3.12e-01    10.6  2.7e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2320 on 49 degrees of freedom
## Multiple R-squared:  0.697,  Adjusted R-squared:  0.691 
## F-statistic:  113 on 1 and 49 DF,  p-value: 2.71e-14

SALARY = 12129.37 + 3.308*SPENDING
se     = (1197)     (0.3117)
Multiple R-squared: 0.6968

RSS=sum(residuals(lm.fit)^2)

RSS=264825250

(3)When spending increases by one dollar, average pay increases by about $3.31. The intercept has no specific economic meaning.

(4)

confint(lm.fit)
##                2.5 %    97.5 %
## (Intercept) 9723.204 14535.538
## SPENDING       2.681     3.934

The 95% confidence interval for beta2 is (2.681192, 3.933978). Since 3 lies inside this interval, we do not reject the null hypothesis that the slope is 3.
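The same conclusion can be reached with a two-sided t test built from the reported estimate and standard error. The sketch below is a hand computation under that assumption; the variable names are illustrative, and the rounded inputs mean the interval matches confint() only to about two decimals.

```r
b2_hat <- 3.308     # reported slope estimate
se_b2  <- 0.3117    # reported standard error
df_res <- 49        # residual degrees of freedom

t_crit <- qt(0.975, df_res)                 # two-sided 5% critical value
t_stat <- (b2_hat - 3) / se_b2              # test of slope = 3
ci     <- b2_hat + c(-1, 1) * t_crit * se_b2

print(round(c(t_stat = t_stat, t_crit = t_crit), 3))
print(round(ci, 3))
```

Because |t_stat| is well below t_crit, the test and the interval agree: the data do not contradict a slope of 3.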

(5)

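Confidence interval

The range in which the mean response is likely to fall, given the specified values of the predictor

Prediction interval

The range in which a single observation is likely to fall, given the specified values of the predictor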

predict(lm.fit, newdata = data.frame(SPENDING = 5000), interval = "confidence", se.fit = TRUE)
## $fit
##     fit   lwr   upr
## 1 28667 27621 29713
## 
## $se.fit
## [1] 520.6
## 
## $df
## [1] 49
## 
## $residual.scale
## [1] 2325

The mean forecast value is 28667.3.

The standard error of the mean forecast is 520.6056, as specified by Eq.5.10.2.

The 95% confidence interval for the mean prediction is (27621.1, 29713.49).

predict(lm.fit, newdata = data.frame(SPENDING=5000),interval = "prediction", se.fit = TRUE)
## $fit
##     fit   lwr   upr
## 1 28667 23880 33455
## 
## $se.fit
## [1] 520.6
## 
## $df
## [1] 49
## 
## $residual.scale
## [1] 2325

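The individual forecast value is 28667.3.

Following Eq.5.10.6, and comparing it with Eq.5.10.2, the variance of the individual forecast equals the variance of the mean forecast plus the error variance sigma^2.

We can therefore compute the standard error of the individual forecast ourselves, which gives 2382.357.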

sigma2 <- RSS/(51-2)
sqrt(sigma2+520.6056^2)
## [1] 2382

The 95% prediction interval for the individual value is (23879.77, 33454.82).
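The relation between the two standard errors can be verified against predict() on any fitted line. The sketch below uses simulated data, not Table 5_5.csv; the seed, the coefficients, and the names x_demo, y_demo, and fit_demo are assumptions made for illustration.

```r
set.seed(1)                                  # arbitrary seed
n      <- 51
x_demo <- runif(n, 2000, 8000)               # hypothetical predictor
y_demo <- 12000 + 3.3 * x_demo + rnorm(n, sd = 2300)
fit_demo <- lm(y_demo ~ x_demo)

x0     <- 5000
sigma2 <- sum(residuals(fit_demo)^2) / (n - 2)   # estimated error variance

# Standard error of the mean forecast at x0
se_mean <- sqrt(sigma2 * (1 / n +
  (x0 - mean(x_demo))^2 / sum((x_demo - mean(x_demo))^2)))

# Standard error of the individual forecast adds the error variance
se_ind <- sqrt(sigma2 + se_mean^2)

p <- predict(fit_demo, newdata = data.frame(x_demo = x0),
             interval = "prediction", se.fit = TRUE)

# Prediction interval rebuilt by hand from se_ind
manual_pi <- p$fit[1, "fit"] + c(-1, 1) * qt(0.975, n - 2) * se_ind

print(c(se_mean = se_mean, se_fit = as.numeric(p$se.fit)))
print(rbind(manual = manual_pi, predict = p$fit[1, c("lwr", "upr")]))
```

The se.fit component returned by predict() is always the standard error of the mean forecast, even when interval = "prediction" is requested, which is why the individual standard error has to be computed separately.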

(6)

plot(lm.fit,which = 2)


In the QQ plot most points lie close to a straight line, so we do not reject the normality assumption for the residuals.
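A formal test can complement the visual check. One common option is the Shapiro-Wilk test, available in base R as shapiro.test(). It is shown here on simulated data as a sketch only; the seed and the names x_sim, y_sim, and fit_sim are hypothetical, and this test is a swapped-in alternative, not the method used in the answer above.

```r
set.seed(42)                                  # arbitrary seed
x_sim <- runif(51, 2000, 8000)                # hypothetical predictor
y_sim <- 12000 + 3.3 * x_sim + rnorm(51, sd = 2300)
fit_sim <- lm(y_sim ~ x_sim)

# Null hypothesis: the residuals come from a normal distribution
sw <- shapiro.test(residuals(fit_sim))
print(sw)
```

A large p-value means the test finds no evidence against normality; with the real data the call would be shapiro.test(residuals(lm.fit)).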