1-step-ahead forecast \(R_{100}(1) = 0.2a_{100} = 0.002\)
The standard deviation of the 1-step forecast error \(e(1) = a_{101}\) is \(\sigma_{a} = 0.025\). The lag-1 autocorrelation of the return series is: \[ corr(R_{t},R_{t-1}) = cov(R_{t},R_{t-1})/(1.04\,var(a_t)) = 0.2\,var(a_t)/(1.04\,var(a_t)) = 0.2/1.04 = 0.1923 \]
2-step-ahead forecast \(R_{100}(2) = E[a_{102}+0.2a_{101}\mid F_{100}] = 0\)
The standard deviation of the 2-step forecast error \(e(2) = a_{102}+0.2a_{101}\) is \((1+0.2^2)^{0.5}\,\sigma_{a} = 0.0255\). Because an MA(1) model has a memory of only one lag, the lag-2 autocorrelation of the return series is 0.
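A quick numerical check of these results (a minimal sketch, assuming the MA(1) model \(R_t = a_t + 0.2a_{t-1}\) with \(\sigma_a = 0.025\) and \(a_{100} = 0.01\), the values implied by the computations above):
# assumed model inputs, inferred from the answers above
theta<-0.2
sigma_a<-0.025
a_100<-0.01
c(fcst_1=theta*a_100,              # 1-step-ahead forecast: 0.002
  sd_e1=sigma_a,                   # sd of the 1-step forecast error: 0.025
  fcst_2=0,                        # 2-step-ahead forecast: the unconditional mean
  sd_e2=sqrt(1+theta^2)*sigma_a,   # sd of the 2-step forecast error: 0.0255
  rho_1=theta/(1+theta^2),         # lag-1 autocorrelation: 0.1923
  rho_2=0)                         # lag-2 autocorrelation: the MA(1) memory cuts off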
The mean of the return series is \(E(r_t) = 0.01/(1-0.2) = 0.0125\).
The variance of the return series is \(Var(r_t) = Var(a_t)/(1-0.2^2) = 0.02/0.96 = 0.0208\).
Lag-1 and lag-2 autocorrelations: \[ corr(r_{t},r_{t-1}) = cov(r_{t},r_{t-1})/var(r_t) = 0, \qquad corr(r_{t},r_{t-2}) = cov(r_{t},r_{t-2})/var(r_t) = 0.2 \]
1-step-ahead forecast \[ r_{100}(1) = 0.01+0.2\,r_{99} = 0.014 \] The standard deviation of the 1-step forecast error \(e(1) = a_{101}\) equals the standard deviation of \(a_t\), i.e. 0.1414.
2-step-ahead forecast \[ r_{100}(2) = 0.01+0.2\,r_{100} = 0.008 \] Because the model involves only the lag-2 term, the 2-step forecast error is \(e(2) = a_{102}\), so its standard deviation also equals the standard deviation of \(a_t\), i.e. 0.1414.
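The same kind of check for this model (a sketch, assuming \(r_t = 0.01 + 0.2r_{t-2} + a_t\) with \(Var(a_t) = 0.02\), \(r_{99} = 0.02\) and \(r_{100} = -0.01\), as implied by the computations above):
# assumed model inputs, inferred from the answers above
phi2<-0.2
sigma2a<-0.02
r_99<-0.02
r_100<- -0.01
c(mean_r=0.01/(1-phi2),       # mean of the return series: 0.0125
  var_r=sigma2a/(1-phi2^2),   # variance of the return series: 0.0208
  rho_1=0,                    # lag-1 autocorrelation
  rho_2=phi2,                 # lag-2 autocorrelation: 0.2
  fcst_1=0.01+phi2*r_99,      # 1-step-ahead forecast: 0.014
  fcst_2=0.01+phi2*r_100,     # 2-step-ahead forecast: 0.008
  sd_e=sqrt(sigma2a))         # sd of both forecast errors: 0.1414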
First, I loaded the dataset and attached the packages needed.
#load the dataset
path = "D:\\university\\23春夏\\金融计量\\作业1"
setwd(path)
library(readxl)
## Warning: package 'readxl' was built under R version 4.2.2
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.2.2
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0 ✔ purrr 0.3.5
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.1 ✔ stringr 1.4.1
## ✔ readr 2.1.3 ✔ forcats 0.5.2
## Warning: package 'ggplot2' was built under R version 4.2.2
## Warning: package 'tidyr' was built under R version 4.2.2
## Warning: package 'dplyr' was built under R version 4.2.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(ggplot2)
library(astsa)
## Warning: package 'astsa' was built under R version 4.2.2
library(forecast)
## Warning: package 'forecast' was built under R version 4.2.2
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
##
## Attaching package: 'forecast'
##
## The following object is masked from 'package:astsa':
##
## gas
library(tseries)
## Warning: package 'tseries' was built under R version 4.2.2
library(agricolae)
## Warning: package 'agricolae' was built under R version 4.2.3
data<-read_excel("quarterly.xls")
glimpse(data)
## Rows: 212
## Columns: 19
## $ DATE <chr> "1960Q1", "1960Q2", "1960Q3", "1960Q4", "1961Q1", "1961Q2", "…
## $ FFR <dbl> 3.93, 3.70, 2.94, 2.30, 2.00, 1.73, 1.68, 2.40, 2.46, 2.61, 2…
## $ Tbill <dbl> 3.87, 2.99, 2.36, 2.31, 2.35, 2.30, 2.30, 2.46, 2.72, 2.72, 2…
## $ Tb1yr <dbl> 4.57, 3.87, 3.07, 2.99, 2.87, 2.94, 3.01, 3.10, 3.21, 3.02, 3…
## $ r5 <dbl> 4.64, 4.30, 3.67, 3.75, 3.64, 3.62, 3.90, 3.84, 3.84, 3.63, 3…
## $ r10 <dbl> 4.49, 4.26, 3.83, 3.89, 3.79, 3.79, 3.98, 3.97, 4.02, 3.87, 3…
## $ PPINSA <dbl> 31.67, 31.73, 31.63, 31.70, 31.80, 31.47, 31.50, 31.53, 31.70…
## $ Finished <dbl> 33.20, 33.40, 33.43, 33.67, 33.63, 33.33, 33.33, 33.37, 33.53…
## $ CPI <dbl> 29.40, 29.57, 29.59, 29.78, 29.84, 29.83, 29.95, 29.99, 30.11…
## $ CPICORE <dbl> 18.92, 19.00, 19.07, 19.14, 19.17, 19.23, 19.32, 19.37, 19.44…
## $ M1NSA <dbl> 140.53, 138.40, 139.60, 142.67, 142.23, 141.40, 142.00, 146.6…
## $ M2SA <dbl> 896.1, 903.3, 919.4, 932.8, 948.9, 966.4, 982.7, 1000.0, 1020…
## $ M2NSA <dbl> 299.40, 300.03, 305.50, 312.30, 317.10, 320.97, 326.50, 334.7…
## $ Unemp <dbl> 5.13, 5.23, 5.53, 6.27, 6.80, 7.00, 6.77, 6.20, 5.63, 5.53, 5…
## $ IndProd <dbl> 23.93, 23.41, 23.02, 22.47, 22.13, 23.00, 23.74, 24.57, 24.94…
## $ RGDP <dbl> 2845.3, 2832.0, 2836.6, 2800.2, 2816.9, 2869.6, 2915.9, 2975.…
## $ Potent <dbl> 2824.2, 2851.2, 2878.7, 2906.7, 2934.8, 2962.9, 2991.3, 3019.…
## $ Deflator <dbl> 18.521, 18.579, 18.648, 18.700, 18.743, 18.785, 18.843, 18.90…
## $ Curr <dbl> 31.830, 31.862, 32.217, 32.624, 32.073, 32.131, 32.699, 33.42…
y<-diff(log(data$IndProd),lag=1)
mdl_a1_lag1<-lm(y~lag(y,1))
summary(mdl_a1_lag1)
##
## Call:
## lm(formula = y ~ lag(y, 1))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.047714 -0.006337 0.000170 0.006327 0.044898
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.0028088 0.0009497 2.957 0.00346 **
## lag(y, 1) 0.5999273 0.0548041 10.947 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.0127 on 208 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.3655, Adjusted R-squared: 0.3625
## F-statistic: 119.8 on 1 and 208 DF, p-value: < 2.2e-16
After adding the lag-8 AR term, the adjusted R^2 of the equation increases, which means the explanatory power of the model improves.
Box.test(y)
##
## Box-Pierce test
##
## data: y
## X-squared = 75.941, df = 1, p-value < 2.2e-16
mdl_a2_lag8<-lm(y~lag(y,1)+lag(y,8))
summary(mdl_a2_lag8)
##
## Call:
## lm(formula = y ~ lag(y, 1) + lag(y, 8))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.044260 -0.006154 0.000062 0.006414 0.046069
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.003618 0.001015 3.565 0.000455 ***
## lag(y, 1) 0.588308 0.055857 10.532 < 2e-16 ***
## lag(y, 8) -0.131225 0.053174 -2.468 0.014434 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.01225 on 200 degrees of freedom
## (8 observations deleted due to missingness)
## Multiple R-squared: 0.3802, Adjusted R-squared: 0.374
## F-statistic: 61.33 on 2 and 200 DF, p-value: < 2.2e-16
However, simply adding a lag-8 AR term raises two concerns: first, the equation may omit the effects of intermediate lags (the ACF plot suggests that lags 2 and 3 should also be included); second, the high order reduces the usable sample size.
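One way to address both concerns is to let an information criterion choose the lag length rather than imposing a single high-order term. A minimal sketch using stats::ar, which fits AR models up to order.max and keeps the AIC-minimizing order:
#let AIC pick the AR order for the industrial production growth series
ar_fit<-ar(y,order.max=8,aic=TRUE)
ar_fit$order   # selected lag length
ar_fit$ar      # coefficients of the selected AR model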
acf(y)
ggplot(data=data,aes(x=DATE,y=Unemp))+geom_point()
acf(data$Unemp)
y=data$Unemp
mdl_b1_lag2<-lm(y~lag(y,1)+lag(y,2))
summary(mdl_b1_lag2)
##
## Call:
## lm(formula = y ~ lag(y, 1) + lag(y, 2))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.71003 -0.15973 -0.02581 0.13814 1.02401
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.22616 0.06725 3.363 0.000918 ***
## lag(y, 1) 1.64595 0.05096 32.299 < 2e-16 ***
## lag(y, 2) -0.68267 0.05110 -13.360 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2485 on 207 degrees of freedom
## (2 observations deleted due to missingness)
## Multiple R-squared: 0.9767, Adjusted R-squared: 0.9765
## F-statistic: 4335 on 2 and 207 DF, p-value: < 2.2e-16
Its characteristic equation is \(\lambda^2 - \phi_{1}\lambda - \phi_{2} = 0\).
the characteristic roots are \(\lambda_{1}=\frac{\phi_1+\sqrt{\phi_1^2+4 \phi_2}}{2}, \lambda_2=\frac{\phi_1-\sqrt{\phi_1^2+4 \phi_2}}{2}\).
Therefore, the characteristic roots of the deterministic part of the difference equation are \(\lambda_{1} = 0.82+0.074i,\ \lambda_{2} = 0.82-0.074i\).
The roots are complex conjugates with modulus \(\sqrt{0.82^2+0.074^2} \approx 0.83 < 1\), so the implied adjustment process is stationary and converges to its mean through damped oscillations.
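The modulus can be checked directly from the estimated coefficients, for example with base R's polyroot (a quick sketch):
#roots of lambda^2 - phi1*lambda - phi2 = 0, coefficients passed in increasing order of power
phi1<-1.64595
phi2<- -0.68267
roots<-polyroot(c(-phi2,-phi1,1))
roots          # approximately 0.82 +/- 0.07i
Mod(roots)     # modulus approximately 0.83, inside the unit circle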
y=data$Unemp
mdl_b1_lag1<-lm(y~lag(y,1))
summary(mdl_b1_lag1)
##
## Call:
## lm(formula = y ~ lag(y, 1))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.78687 -0.18887 -0.06627 0.08743 1.66773
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.13427 0.09070 1.48 0.14
## lag(y, 1) 0.98000 0.01443 67.89 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3376 on 209 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.9566, Adjusted R-squared: 0.9564
## F-statistic: 4609 on 1 and 209 DF, p-value: < 2.2e-16
dly<-diff(log(data$CPICORE),lag=1)
acf(dly)
pacf(dly)
The d2ly series (constructed below) looks more stationary than dly, so the modeler may prefer to work with the second difference of the logarithm of core CPI.
d2ly<-diff(dly,lag=1)
acf(d2ly)
pacf(d2ly)
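The visual impression can be checked more formally, for instance with an augmented Dickey-Fuller test from the tseries package loaded above (a sketch; rejecting the null favors stationarity):
adf.test(dly)    #first difference of log core CPI
adf.test(d2ly)   #second difference of log core CPI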
ii. Both the AR(1) and the MA(1) models have significant coefficients. Based on the AIC, the MA(1) is slightly better than the AR(1). The fitted MA(1) model is \(d2ly_{t} = a_{t} - 0.2704\,a_{t-1}\).
# ar(1) model
mdl_d2ly_ar1<-arima(d2ly,order=c(1,0,0))
mdl_d2ly_ar1
##
## Call:
## arima(x = d2ly, order = c(1, 0, 0))
##
## Coefficients:
## ar1 intercept
## -0.2497 0e+00
## s.e. 0.0667 1e-04
##
## sigma^2 estimated as 4.169e-06: log likelihood = 1002.71, aic = -1999.41
# ma(1) model
mdl_d2ly_ma1<-arima(d2ly,order=c(0,0,1))
mdl_d2ly_ma1
##
## Call:
## arima(x = d2ly, order = c(0, 0, 1))
##
## Coefficients:
## ma1 intercept
## -0.2704 0e+00
## s.e. 0.0679 1e-04
##
## sigma^2 estimated as 4.15e-06: log likelihood = 1003.2, aic = -2000.39
d2ly<-data.frame(d2ly)
train_set<-d2ly[1:round(0.8*length(d2ly$d2ly)),]
evaluation_set<-d2ly[(round(0.8*length(d2ly$d2ly))+1):length(d2ly$d2ly),]
mdl_train_ar1<-arima(train_set,order=c(1,0,0))
mdl_train_ma1<-arima(train_set,order=c(0,0,1))
summary(mdl_train_ar1)
##
## Call:
## arima(x = train_set, order = c(1, 0, 0))
##
## Coefficients:
## ar1 intercept
## -0.2438 0e+00
## s.e. 0.0749 2e-04
##
## sigma^2 estimated as 4.636e-06: log likelihood = 793.25, aic = -1580.5
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -6.9603e-07 0.002153073 0.001521556 128.5028 164.8044 0.6221351
## ACF1
## Training set -0.006486651
summary(mdl_train_ma1)
##
## Call:
## arima(x = train_set, order = c(0, 0, 1))
##
## Coefficients:
## ma1 intercept
## -0.2490 0e+00
## s.e. 0.0746 1e-04
##
## sigma^2 estimated as 4.633e-06: log likelihood = 793.3, aic = -1580.6
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -1.131661e-06 0.002152453 0.00152763 133.2132 167.0032 0.6246186
## ACF1
## Training set -0.004947497
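The error measures above are in-sample only. A sketch of an out-of-sample comparison on the held-out evaluation set, using forecast::accuracy with the horizon set to the length of the evaluation set:
h<-length(evaluation_set)
accuracy(forecast(mdl_train_ar1,h=h),evaluation_set)
accuracy(forecast(mdl_train_ma1,h=h),evaluation_set)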
mdl_dly_ar2<-arima(dly,order=c(2,0,0))
forecast(mdl_dly_ar2,h=12)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 212 0.002762031 0.0001771710 0.005346891 -0.001191171 0.006715233
## 213 0.002967140 -0.0002182688 0.006152549 -0.001904522 0.007838803
## 214 0.003202030 -0.0005130304 0.006917091 -0.002479665 0.008883726
## 215 0.003416511 -0.0007064766 0.007539500 -0.002889055 0.009722078
## 216 0.003622875 -0.0008407211 0.008086471 -0.003203606 0.010449356
## 217 0.003818882 -0.0009318913 0.008569656 -0.003446799 0.011084564
## 218 0.004005638 -0.0009914016 0.009002678 -0.003636675 0.011647951
## 219 0.004183443 -0.0010268106 0.009393698 -0.003784953 0.012151839
## 220 0.004352758 -0.0010435481 0.009749065 -0.003900180 0.012605697
## 221 0.004513981 -0.0010456408 0.010073602 -0.003988727 0.013016688
## 222 0.004667500 -0.0010361614 0.010371161 -0.004055497 0.013390496
## 223 0.004813682 -0.0010175062 0.010644871 -0.004104351 0.013731716
plot(forecast(mdl_dly_ar2,h=12))
Then I fit the MA(1) model to the d2ly series and make 12-step-ahead forecasts. After transforming these forecasts into forecasts of core CPI, the plot below also shows a rising tendency, but, unlike the AR(2) model of the dly series, the point forecasts are flat from the second step onward.
mdl_d2ly_ma1<-arima(d2ly,order=c(0,0,1))
summary(mdl_d2ly_ma1)
##
## Call:
## arima(x = d2ly, order = c(0, 0, 1))
##
## Coefficients:
## ma1 intercept
## -0.2704 0e+00
## s.e. 0.0679 1e-04
##
## sigma^2 estimated as 4.15e-06: log likelihood = 1003.2, aic = -2000.39
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -9.937184e-07 0.002037075 0.001477732 124.407 159.0768 0.6175522
## ACF1
## Training set 0.001615847
forecasts<-forecast(mdl_d2ly_ma1,h=12)
forecasts
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 211 2.126180e-04 -0.002397999 0.002823235 -0.003779976 0.004205212
## 212 -6.932459e-06 -0.002711293 0.002697428 -0.004142895 0.004129030
## 213 -6.932459e-06 -0.002711293 0.002697428 -0.004142895 0.004129030
## 214 -6.932459e-06 -0.002711293 0.002697428 -0.004142895 0.004129030
## 215 -6.932459e-06 -0.002711293 0.002697428 -0.004142895 0.004129030
## 216 -6.932459e-06 -0.002711293 0.002697428 -0.004142895 0.004129030
## 217 -6.932459e-06 -0.002711293 0.002697428 -0.004142895 0.004129030
## 218 -6.932459e-06 -0.002711293 0.002697428 -0.004142895 0.004129030
## 219 -6.932459e-06 -0.002711293 0.002697428 -0.004142895 0.004129030
## 220 -6.932459e-06 -0.002711293 0.002697428 -0.004142895 0.004129030
## 221 -6.932459e-06 -0.002711293 0.002697428 -0.004142895 0.004129030
## 222 -6.932459e-06 -0.002711293 0.002697428 -0.004142895 0.004129030
#transform the forecast to the cpicore forecasts
x<-as.numeric(forecasts$mean)
#previous-period values: the last observed d2ly followed by the first 11 forecasts
x.lag<-c(tail(as.numeric(forecasts$x),1),head(x,11))
cpicore_forecasts<-x-0.2704*x.lag
plot(cpicore_forecasts,type="l")
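As an alternative to the transformation above, the d2ly forecasts can be mapped all the way back to the core CPI level by undoing each difference with a cumulative sum (a sketch that reuses the dly series and the forecasts object defined earlier):
#invert the second difference: dly forecasts = last observed dly + cumulated d2ly forecasts
dly_fc<-tail(dly,1)+cumsum(as.numeric(forecasts$mean))
#invert the first difference on the log scale, then exponentiate to get the CPI level
logcpi_fc<-tail(log(data$CPICORE),1)+cumsum(dly_fc)
cpi_fc<-exp(logcpi_fc)
plot(cpi_fc,type="l")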