The first step is to import the monthly time series for Real Gross Private Domestic Investment, FRED/GPDIC1, using Quandl packages. We will denote FRED/GPDIC1 as \(INV_{t}\).
inv <- Quandl("FRED/GPDIC1", type="ts")
str(inv)
Time-Series [1:276] from 1947 to 2016: 223 206 200 238 263 ...
plot(inv, ylab="Real Gross Private Domestic Investment")
Judging by the movement of the data that exhibit an exponential trend and unit used in the dataset (billions of chained 2005 dollars seasonally adjusted), the proper transformation for this dataset is a log transformation. We define log changes of quarterly Real Gross Private Domestic Investment as \(LINV_{t}= \Delta \log{INV_{t}} = \log{INV_{t}} - \log{INV_{t-1}}\), where \(INV_{t}\) is the original quarterly Real Gross Private Domestic Investment, we can implement the calculation in R using the following code:
linv <- diff(log(inv), differences=1)
The plot of the new transformed \(LINV_{t}\) can be seen as follow:
plot(linv, ylab="Log of Real Gross Private Domestic Investment")
To check for the unit-root, we will run ADF test and KPSS test for \(LINV_{t}\) as follow:
linv.urdf <- ur.df(linv, type = "trend", selectlags = "BIC")
summary(linv.urdf)
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression trend
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.180696 -0.021800 0.000491 0.023798 0.206680
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.044e-02 5.878e-03 1.777 0.0767 .
## z.lag.1 -7.283e-01 7.575e-02 -9.614 <2e-16 ***
## tt -2.402e-05 3.654e-05 -0.657 0.5116
## z.diff.lag -7.447e-02 6.038e-02 -1.233 0.2186
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.04753 on 269 degrees of freedom
## Multiple R-squared: 0.3973, Adjusted R-squared: 0.3906
## F-statistic: 59.11 on 3 and 269 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -9.6143 30.8294 46.2414
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau3 -3.98 -3.42 -3.13
## phi2 6.15 4.71 4.05
## phi3 8.34 6.30 5.36
test1 <- kpss.test(linv, null = "Trend")
test1
KPSS Test for Trend Stationarity
data: linv
KPSS Trend = 0.01618, Truncation lag parameter = 3, p-value = 0.1
From ADF test we conclude that \(LINV_{t}\) is trend stationary. KPSS test confirms our finding form ADF-test, it shows a borderline rejection of \(H_{0}: LINV_{t}\), our series is Trend Stationary.
The next step is to decide whether to use zero mean or non-zero mean in our specification for \(LINV_{t}\). To make the determination, we can compare the ADF test from both specification as follow:
linv.urdf.nonzeromean <- ur.df(linv, type ="drift", selectlags = "BIC")
linv.urdf.zeromean <- ur.df(linv, selectlags = "BIC")
We then compare the \(\tau_{3}\) statistics for non-zero and zero mean and found the following:
linv.urdf.nonzeromean@teststat[1]
[1] -9.604417
linv.urdf.zeromean@teststat[1]
[1] -9.219451
We found that non-zero mean specification is more preferable compared to zero mean. Hence, we will estimate ARIMA(p,0,q)(p,0,q)[4] with non-zero mean as our preferred specification if we want to conduct forecasting.
The first step is to import the daily closing price of NYSE Composite Index, YAHOO/INDEX_NYA, using Quandl packages. We will denote closing price of YAHOO/INDEX_NYA as \(NYSE_{t}\).
nyse <- Quandl("YAHOO/INDEX_NYA", type="zoo")
str(nyse[,4])
'zoo' series from 1965-12-31 to 2016-02-22
Data: num [1:12623] 529 527 528 531 532 ...
Index: Date[1:12623], format: "1965-12-31" "1966-01-03" "1966-01-04" "1966-01-05" ...
plot(nyse[,4], ylab="Closing Price of NYSE Composite Index")
Judging by the movement of the data that exhibit an exponential trend and unit used in the dataset (current US$), the proper transformation for this dataset is a log transformation. We define log changes of quarterly Real Gross Private Domestic Investment as \(LNYSE_{t}= \Delta \log{NYSE_{t}} = \log{NYSE_{t}} - \log{NYSE_{t-1}}\), where \(NYSE_{t}\) is the original quarterly Real Gross Private Domestic Investment, we can implement the calculation in R using the following code:
lnyse <- diff(log(nyse[,4]), differences=1)
The plot of the new transformed \(LNYSE_{t}\) can be seen as follow:
plot(lnyse, ylab="Log of Closing Price of NYSE Composite Index")
To check for the unit-root, we will run ADF test and KPSS test for \(LNYSE_{t}\) as follow:
lnyse.urdf <- ur.df(lnyse, type = "trend", selectlags = "BIC")
summary(lnyse.urdf)
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression trend
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.211468 -0.004575 0.000215 0.004864 0.113151
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.972e-04 1.804e-04 1.093 0.274426
## z.lag.1 -9.867e-01 1.230e-02 -80.227 < 2e-16 ***
## tt 4.715e-09 2.475e-08 0.190 0.848928
## z.diff.lag 3.306e-02 8.899e-03 3.716 0.000204 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.01013 on 12616 degrees of freedom
## Multiple R-squared: 0.4781, Adjusted R-squared: 0.478
## F-statistic: 3853 on 3 and 12616 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -80.2268 2145.445 3218.168
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau3 -3.96 -3.41 -3.12
## phi2 6.09 4.68 4.03
## phi3 8.27 6.25 5.34
test2 <- kpss.test(lnyse, null = "Trend")
test2
KPSS Test for Trend Stationarity
data: lnyse
KPSS Trend = 0.097619, Truncation lag parameter = 25, p-value =
0.1
From ADF test we conclude that \(LNYSE_{t}\) is trend stationary. KPSS test confirms our finding form ADF-test, it shows a borderline rejection of \(H_{0}: LNYSE_{t}\), our series is Trend Stationary.
The next step is to decide whether to use zero mean or non-zero mean in our specification for \(LNYSE_{t}\). To make the determination, we can compare the ADF test from both specification as follow:
lnyse.urdf.nonzeromean <- ur.df(lnyse, type ="drift", selectlags = "BIC")
lnyse.urdf.zeromean <- ur.df(lnyse, selectlags = "BIC")
We then compare the \(\tau_{3}\) statistics for non-zero and zero mean and found the following:
lnyse.urdf.nonzeromean@teststat[1]
[1] -80.22961
lnyse.urdf.zeromean@teststat[1]
[1] -80.17325
We found that non-zero mean specification is more preferable compared to zero mean. Hence, we will estimate ARIMA(p,0,q)(p,0,q) with non-zero mean as our preferred specification if we want to conduct forecasting.
The first step is to import the monthly 3-Month Treasury Bill Secondary Market Rate, FRED/TB3MS, using Quandl packages. We will denote FRED/TB3MS as \(TB_{t}\).
tb <- Quandl("FRED/TB3MS", type="ts")
str(tb)
Time-Series [1:985] from 1934 to 2016: 0.72 0.62 0.24 0.15 0.16 0.15 0.15 0.19 0.21 0.27 ...
plot(tb, ylab="3-Month Treasury Bill: Secondary Market Rate (%)")
Since the unit used in the dataset is percentage, the proper transformation for this dataset is to leave it as is. Hence, we can move to the next stage to check for the unit-root, we will run ADF test and KPSS test for \(TB_{t}\) as follow:
tb.urdf <- ur.df(tb, type = "trend", selectlags = "BIC")
summary(tb.urdf)
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression trend
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.8517 -0.0666 -0.0222 0.0851 2.5890
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.111e-02 2.342e-02 1.329 0.1843
## z.lag.1 -9.025e-03 3.631e-03 -2.485 0.0131 *
## tt 1.886e-06 4.062e-05 0.046 0.9630
## z.diff.lag 3.406e-01 3.009e-02 11.322 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3472 on 979 degrees of freedom
## Multiple R-squared: 0.1189, Adjusted R-squared: 0.1162
## F-statistic: 44.02 on 3 and 979 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -2.4853 2.2068 3.31
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau3 -3.96 -3.41 -3.12
## phi2 6.09 4.68 4.03
## phi3 8.27 6.25 5.34
From the result of ADF-test, using \(\tau_{3}\) statistics, we failed reject \(H_{0}: \gamma = 0\). This indicate that \(TB_{t}\) is not trend stationary. Furthermore, using \(\phi_{3}\) statistics, we also failed to reject \(H_{0}: \gamma = \beta = 0\).
Further testing using alternative specification of non-zero mean (drift, Model B as in the notes) and zero mean (Model A as in the notes) are conducted using the following code:
tb.urdf <- ur.df(tb, type = "drift", selectlags = "BIC")
summary(tb.urdf)
###############################################
# Augmented Dickey-Fuller Test Unit Root Test #
###############################################
Test regression drift
Call:
lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
Residuals:
Min 1Q Median 3Q Max
-3.8522 -0.0671 -0.0216 0.0852 2.5887
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.031875 0.016666 1.913 0.0561 .
z.lag.1 -0.008979 0.003488 -2.574 0.0102 *
z.diff.lag 0.340576 0.030038 11.338 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.347 on 980 degrees of freedom
Multiple R-squared: 0.1189, Adjusted R-squared: 0.1171
F-statistic: 66.09 on 2 and 980 DF, p-value: < 2.2e-16
Value of test-statistic is: -2.5738 3.3125
Critical values for test statistics:
1pct 5pct 10pct
tau2 -3.43 -2.86 -2.57
phi1 6.43 4.59 3.78
tb.urdf <- ur.df(tb, selectlags = "BIC")
summary(tb.urdf)
###############################################
# Augmented Dickey-Fuller Test Unit Root Test #
###############################################
Test regression none
Call:
lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
Residuals:
Min 1Q Median 3Q Max
-3.8914 -0.0461 0.0035 0.0967 2.5531
Coefficients:
Estimate Std. Error t value Pr(>|t|)
z.lag.1 -0.00399 0.00232 -1.72 0.0857 .
z.diff.lag 0.33797 0.03005 11.25 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.3474 on 981 degrees of freedom
Multiple R-squared: 0.1156, Adjusted R-squared: 0.1138
F-statistic: 64.09 on 2 and 981 DF, p-value: < 2.2e-16
Value of test-statistic is: -1.7202
Critical values for test statistics:
1pct 5pct 10pct
tau1 -2.58 -1.95 -1.62
From both non-zero mean and zero mean model for ADF-test result, the rejection of \(H_{0}\) are borderline at best, to get more conclusive result we will run KPPS test as follow:
test3 <- kpss.test(tb, null = "Trend")
test3
KPSS Test for Trend Stationarity
data: tb
KPSS Trend = 2.2343, Truncation lag parameter = 7, p-value = 0.01
KPSS test confirms our finding form ADF-test, this indicate that \(TB_{t}\) is not trend stationary and following a random walk that warrant for differencing before we can use the time series for forecasting purpose.
The next step is to transform \(TB_{t}\) into first difference form and re-checking for the existence of unit root and stationary of our new dataset \(\Delta TB_{t}\) using both ADF test and KPSS test:
dtb <- diff(tb, differences=1)
dtb.urdf <- ur.df(dtb, type = "trend", selectlags = "BIC")
summary(dtb.urdf)
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression trend
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.3199 -0.0564 -0.0001 0.0790 2.1806
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.665e-02 2.180e-02 0.764 0.445
## z.lag.1 -8.035e-01 3.604e-02 -22.298 < 2e-16 ***
## tt -3.369e-05 3.836e-05 -0.878 0.380
## z.diff.lag 2.085e-01 3.126e-02 6.672 4.22e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3406 on 978 degrees of freedom
## Multiple R-squared: 0.3618, Adjusted R-squared: 0.3598
## F-statistic: 184.8 on 3 and 978 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -22.2981 165.7374 248.6052
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau3 -3.96 -3.41 -3.12
## phi2 6.09 4.68 4.03
## phi3 8.27 6.25 5.34
test4 <- kpss.test(dtb, null = "Trend")
test4
KPSS Test for Trend Stationarity
data: dtb
KPSS Trend = 0.03477, Truncation lag parameter = 7, p-value = 0.1
From ADF test we conclude that \(\Delta TB_{t}\) is difference stationary. KPSS test confirms our finding form ADF-test, it shows a borderline rejection of \(H_{0}: \Delta TB_{t}\), our series is difference stationary.
The next step is to decide whether to use zero mean or non-zero mean in our specification for \(\Delta TB_{t}\). To make the determination, we can compare the ADF test from both specification as follow:
dtb.urdf.nonzeromean <- ur.df(dtb, type ="drift", selectlags = "BIC")
dtb.urdf.zeromean <- ur.df(dtb, selectlags = "BIC")
We then compare the \(\tau_{3}\) statistics for non-zero and zero mean and found the following:
dtb.urdf.nonzeromean@teststat[1]
[1] -22.28352
dtb.urdf.zeromean@teststat[1]
[1] -22.29494
We found that zero mean specification is more preferable compared to zero mean. Hence, we will estimate ARIMA(p,0,q)(p,0,q)[12] with zero mean as our preferred specification if we want to conduct forecasting.