The dataset we will analyse includes monthly averages of the ASX Price Index, Gold price, Crude Oil, and Copper price starting from January 2004.
library(TSA)
##
## Attaching package: 'TSA'
## The following objects are masked from 'package:stats':
##
## acf, arima
## The following object is masked from 'package:utils':
##
## tar
library(urca)
library(x12)
## Loading required package: x13binary
## x12 is ready to use.
## Use the package x12GUI for a Graphical User Interface.
## By default the X13-ARIMA-SEATS binaries provided by the R package x13binary
## are used but this can be changed with x12path(validpath)
## ---------------
## Suggestions and bug-reports can be submitted at: https://github.com/statistikat/x12/issues
library(forecast)
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
## Registered S3 methods overwritten by 'forecast':
## method from
## fitted.Arima TSA
## plot.Arima TSA
library(dLagM)
## Loading required package: nardl
## Loading required package: dynlm
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
##
## Attaching package: 'dLagM'
## The following object is masked from 'package:forecast':
##
## forecast
library(tseries)
library(readr)
##
## Attaching package: 'readr'
## The following object is masked from 'package:TSA':
##
## spec
asx_df <- read.csv("ASX_data.csv")
class(asx_df)
## [1] "data.frame"
str(asx_df)
## 'data.frame': 161 obs. of 4 variables:
## $ ASX.price : num 2935 2778 2849 2971 2980 ...
## $ Gold.price : Factor w/ 160 levels "1,009.10","1,010.10",..: 136 135 128 112 119 110 109 122 130 117 ...
## $ Crude.Oil..Brent._USD.bbl: num 31.3 32.6 30.3 25 25.8 ...
## $ Copper_USD.tonne : Factor w/ 158 levels "1,588","1,650",..: 2 5 4 1 3 6 7 8 9 10 ...
#convert numeric class to time series for ASX,Price column
asx_price = ts(asx_df$ASX.price, start = c(2004, 1),frequency = 12)
head(asx_price)
## Jan Feb Mar Apr May Jun
## 2004 2935.4 2778.4 2848.6 2970.9 2979.8 2999.7
#Converting factor class to numeric in order to
#convert numeric to time series for Gold.Price column
asx_df$Gold.price = sub(",","", asx_df$Gold.price)
asx_df$Gold.price = as.numeric(as.character(asx_df$Gold.price))
gold_price= ts(asx_df$Gold.price, start = c(2004, 1),frequency = 12)
head(gold_price)
## Jan Feb Mar Apr May Jun
## 2004 611.9 603.3 565.7 538.6 549.4 535.9
#convert numeric class to time series for Crude oil column
asx_df$Crude.Oil..Brent._USD.bbl = as.numeric(asx_df$Crude.Oil..Brent._USD.bbl)
crude_oil_price= ts(asx_df$Crude.Oil..Brent._USD.bbl, start = c(2004, 1),frequency = 12)
head(crude_oil_price)
## Jan Feb Mar Apr May Jun
## 2004 31.29 32.65 30.34 25.02 25.81 27.55
##Converting factor class to numeric in order to
#convert numeric to time series for Copper_USD.tonne column
asx_df$Copper_USD.tonne = sub(",","", asx_df$Copper_USD.tonne)
asx_df$Copper_USD.tonne = as.numeric(as.character(asx_df$Copper_USD.tonne))
copper_price= ts(asx_df$Copper_USD.tonne, start = c(2004, 1),frequency = 12)
head(copper_price)
## Jan Feb Mar Apr May Jun
## 2004 1650 1682 1656 1588 1651 1685
The following are the five basic patterns we can deduce from a time series plot:
Existence of a recurring pattern or trend
Seasonality evidence
Evidence of fluctuating variation across time which is also the variance
Moving average and autoregression behaviour patterns
Any indication of a possible intervention
#plots the visualization of series
par(mfrow=c(2,2))
plot(asx_price, ylab="Avg monthly ASX price", xlab = "Year", main = "ASX time series", type='o')
plot(gold_price, ylab="Avg monthly Gold price", xlab = "Year", main = "Gold time series", type='o')
plot(crude_oil_price, ylab="Avg monthly ASX price", xlab = "Year", main = "Crude oil time series", type='o')
plot(copper_price, ylab="Avg monthly Copper price", xlab = "Year", main = "Copper time series", type='o')
There is a deifinite upwards trend observed in the time series, with the exception of a dip around the years 2008-2009. No clear evidence of changing variance can be observed.
With gold prices it is very clear that an upwards trend is observed. Although as with ASX prices, there is not any intervention or dip in the prices. There is a constant increase. No clear evidence of changing variance can be observed.
No clear trend is visible. There are regular upshifts and dips in the plot.
AS with crude oil prices, there isn’t a clear trend. The prices had a constant upwards trend until the huge dip in 2008.
The goal here is to determine whether or not the time series is stationary. Visualization, as recommended by the ACF and PACF, is the most effective technique to check for it. Unit root tests are used to ensure the authenticity of this. I’m just interested in the augmented Dickey-Fuller test (ADF) and not the Phillips-Perron (PP) test.
#ACF and PACF
par(mfrow=c(2,4))
acf(asx_price, main = "ACF ASX price")
pacf(asx_price, main = "PACF ASX price")
acf(gold_price, main = "ACF Gold price")
pacf(gold_price, main = "PACF Gold price")
acf(crude_oil_price, main = "ACF Crude oil price")
pacf(crude_oil_price, main = "PACF Crude oil price")
acf(copper_price, main = "ACF Copper price")
pacf(copper_price, main = "PACF Copper price")
We deduce that there is a trend that dominates the series correlation characteristics because we observe a progressively declining pattern in ACF. We need to remove the trend and then re-display sample ACF to see more serial correlation properties of these series.
#The next steps are to apply ADF test to asx_price column with the first approach to determine the lag length.
k1 = ar(asx_price)$order
k1
## [1] 2
asx_test =adf.test(asx_price, k = k1)
asx_test$parameter
## Lag order
## 2
print(asx_test)
##
## Augmented Dickey-Fuller Test
##
## data: asx_price
## Dickey-Fuller = -2.442, Lag order = 2, p-value = 0.392
## alternative hypothesis: stationary
#The next steps are to apply ADF test to gold_price column with the first approach to determine the lag length.
k2 = ar(gold_price)$order
k2
## [1] 1
gold_test =adf.test(gold_price, k = k2)
gold_test$parameter
## Lag order
## 1
print(gold_test)
##
## Augmented Dickey-Fuller Test
##
## data: gold_price
## Dickey-Fuller = -2.2824, Lag order = 1, p-value = 0.4585
## alternative hypothesis: stationary
#The next steps are to apply ADF test to crude_oil_price column with the first approach to determine the lag length.
k3 = ar(crude_oil_price)$order
k3
## [1] 3
crude_test =adf.test(crude_oil_price, k = k3)
crude_test$parameter
## Lag order
## 3
print(crude_test)
##
## Augmented Dickey-Fuller Test
##
## data: crude_oil_price
## Dickey-Fuller = -2.0703, Lag order = 3, p-value = 0.547
## alternative hypothesis: stationary
#The next steps are to apply ADF test to copper_price column with the first approach to determine the lag length.
k4 = ar(copper_price)$order
k4
## [1] 2
copper_test =adf.test(copper_price, k = k1)
copper_test$parameter
## Lag order
## 2
print(copper_test)
##
## Augmented Dickey-Fuller Test
##
## data: copper_price
## Dickey-Fuller = -2.3847, Lag order = 2, p-value = 0.4159
## alternative hypothesis: stationary
Decomposition of time series into multiple components is useful for observing the individual effects of existing components as well as historical influences. The components that may be dissected from a time series i.e. seasonality, trend, and the rest that include impacts that the seasonal and trend components do not capture.
The time series may need ome transformations before they are decomposed. To achieve that we need to check their lambda values, to get a better idea, if any of the series need a transformation.
#Lambda values
asx_lambda = BoxCox.lambda(asx_price)
gold_lambda = BoxCox.lambda(gold_price)
crude_oil_lambda = BoxCox.lambda(crude_oil_price)
copper_lambda = BoxCox.lambda(copper_price)
cbind(asx_lambda, gold_lambda, crude_oil_lambda, copper_lambda)
## asx_lambda gold_lambda crude_oil_lambda copper_lambda
## [1,] 1.999924 0.976695 -0.8304136 0.9336783
As we can see the lambda values for copper and gold are very close to 1, hence they need no transformation before decomposition.
However lambda value for asx and crude oil suggest that they may need transformations, since they are not close to 1. Close to 2 in the case of asx_price and negative in case crude_oil_prices.
In the next steps we transform both the time series using boxcox transformations.
#First we transform asx_price, since lambda is not equal to 0 we use the following formula
asx_boxcox = ((asx_price^(asx_lambda)) - 1) / asx_lambda
#Similarly for crude_oil_price
crude_boxcox = ((crude_oil_price^(crude_oil_lambda)) - 1) / crude_oil_lambda
plot(asx_boxcox,ylab='ASX price Index',xlab='Year',type='o', main="Box-Cox Transformed ASX price Index")
plot(crude_boxcox,ylab='Crude Oil price Index',xlab='Year',type='o', main="Box-Cox Transformed Crude Oil price Index")
Now that the transformation has been done, we can infer that all the time series are ready for differencing.
#ASX differencing
asx_diff = diff(asx_boxcox)
gold_diff = diff(gold_price)
crude_diff = diff(crude_boxcox)
copper_diff = diff(copper_price)
par(mfrow=c(2,2))
plot(asx_diff,ylab='ASX prices',xlab='Year', main = "First differences of ASX prices")
plot(gold_diff,ylab='Gold prices',xlab='Year', main = "First differences of Gold prices")
plot(crude_diff,ylab='Crude oil prices',xlab='Year', main = "First differences of Crude Oil prices")
plot(copper_diff,ylab='Copper prices',xlab='Year', main = "First differences of Copper prices")
Lets confirm if the above plotted time series’ are stationary or not by using ADF tests on each.
asx_diff_adf = adf.test(asx_diff)
## Warning in adf.test(asx_diff): p-value smaller than printed p-value
asx_diff_adf
##
## Augmented Dickey-Fuller Test
##
## data: asx_diff
## Dickey-Fuller = -4.4343, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
gold_diff_adf = adf.test(gold_diff)
## Warning in adf.test(gold_diff): p-value smaller than printed p-value
gold_diff_adf
##
## Augmented Dickey-Fuller Test
##
## data: gold_diff
## Dickey-Fuller = -5.8718, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
crude_diff_adf = adf.test(crude_diff)
## Warning in adf.test(crude_diff): p-value smaller than printed p-value
crude_diff_adf
##
## Augmented Dickey-Fuller Test
##
## data: crude_diff
## Dickey-Fuller = -5.5931, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
copper_diff_adf = adf.test(copper_diff)
## Warning in adf.test(copper_diff): p-value smaller than printed p-value
copper_diff_adf
##
## Augmented Dickey-Fuller Test
##
## data: copper_diff
## Dickey-Fuller = -5.478, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
Now it is time to actually analyze the effects of components on each of the time series in the dataset. This can be tested by decomposing above used series. We will decompose each of them using the X12 and STL decompostions.
#ASX Price decomposition
asx_decomp = x12(asx_price)
plot(asx_decomp , sa=TRUE , trend=TRUE , forecast = TRUE)
#Seasonal Factors ASX Price
plotSeasFac(asx_decomp)
#STL Test
asx_stl <- stl(asx_price,t.window=15, s.window="periodic", robust=TRUE)
plot(asx_stl)
#Gold Price decomposition
gold_decomp = x12(gold_price)
plot(gold_decomp , sa=TRUE , trend=TRUE , forecast = TRUE)
#Seasonal Factors Gold Price
plotSeasFac(gold_decomp)
#STL test
gold_stl <- stl(gold_price,t.window=15, s.window="periodic", robust=TRUE)
plot(gold_stl)
#Crude oil Price decomposition
crude_oil_decomp = x12(crude_oil_price)
plot(crude_oil_decomp , sa=TRUE , trend=TRUE , forecast = TRUE)
#Seasonal Factors Crude Oil Price
plotSeasFac(crude_oil_decomp)
#STL test
crude_stl <- stl(crude_oil_price,t.window=15, s.window="periodic", robust=TRUE)
plot(crude_stl)
#Copper Price decomposition
copper_decomp = x12(copper_price)
plot(copper_decomp , sa=TRUE , trend=TRUE , forecast = TRUE)
#Seasonal Factors Copper Price
plotSeasFac(copper_decomp)
#STL test
copper_stl <- stl(copper_price,t.window=15, s.window="periodic", robust=TRUE)
plot(copper_stl)
Ou next piece of business is to select the best possible model with respect to ASX Price index. We will see how different models perform with each of the columns namely, Gold prices, crude oil price and copper prices.
# Finite dlm
gold_model1 =dlm(x =as.vector(gold_price) , y =as.vector(asx_price) , q =9)
summary(gold_model1)
##
## Call:
## lm(formula = model.formula, data = design)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1484.01 -590.23 13.97 473.71 1991.28
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4452.20100 223.27538 19.940 <2e-16 ***
## x.t -0.35288 1.27113 -0.278 0.782
## x.1 -0.10591 1.88058 -0.056 0.955
## x.2 0.02149 1.91708 0.011 0.991
## x.3 -0.07972 1.93268 -0.041 0.967
## x.4 -0.25836 1.93610 -0.133 0.894
## x.5 0.36953 1.93817 0.191 0.849
## x.6 0.16311 1.95298 0.084 0.934
## x.7 0.70404 1.94616 0.362 0.718
## x.8 -0.21513 1.91991 -0.112 0.911
## x.9 0.16715 1.29373 0.129 0.897
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 802.4 on 141 degrees of freedom
## Multiple R-squared: 0.05795, Adjusted R-squared: -0.008866
## F-statistic: 0.8673 on 10 and 141 DF, p-value: 0.5654
##
## AIC and BIC values for the model:
## AIC BIC
## 1 2476.983 2513.27
checkresiduals(gold_model1$model)
##
## Breusch-Godfrey test for serial correlation of order up to 14
##
## data: Residuals
## LM test = 140.36, df = 14, p-value < 2.2e-16
#ploymonial dlm
gold_model2 =polyDlm(x =as.vector(gold_price) , y =as.vector(asx_price) , q =2, k = 2, show.beta = TRUE)
## Estimates and t-tests for beta coefficients:
## Estimate Std. Error t value P(>|t|)
## beta.0 0.396 1.28 0.310 0.757
## beta.1 -0.233 1.90 -0.122 0.903
## beta.2 0.536 1.27 0.421 0.675
summary(gold_model2)
##
## Call:
## "Y ~ (Intercept) + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -1632.50 -700.82 4.61 549.72 2213.87
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3998.3587 212.3161 18.832 <2e-16 ***
## z.t0 0.3958 1.2767 0.310 0.757
## z.t1 -1.3268 5.7723 -0.230 0.819
## z.t2 0.6983 2.8546 0.245 0.807
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 834.5 on 155 degrees of freedom
## Multiple R-squared: 0.1015, Adjusted R-squared: 0.08409
## F-statistic: 5.835 on 3 and 155 DF, p-value: 0.0008385
checkresiduals(gold_model2$model)
##
## Breusch-Godfrey test for serial correlation of order up to 10
##
## data: Residuals
## LM test = 147.24, df = 10, p-value < 2.2e-16
#Koyk's transform
gold_model3 = koyckDlm(x =as.vector(gold_price) , y =as.vector(asx_price) )
summary(gold_model3)
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -682.19 -105.44 15.86 135.04 783.60
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.902e+02 8.958e+01 2.123 0.0353 *
## Y.1 9.635e-01 1.909e-02 50.469 <2e-16 ***
## X.t 2.595e-03 4.304e-02 0.060 0.9520
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 201.4 on 157 degrees of freedom
## Multiple R-Squared: 0.9488, Adjusted R-squared: 0.9481
## Wald test: 1454 on 2 and 157 DF, p-value: < 2.2e-16
##
## Diagnostic tests:
## NULL
##
## alpha beta phi
## Geometric coefficients: 5205.15 0.002595168 0.9634602
checkresiduals(gold_model3$model)
#ARDLM
gold_model4.1 =ardlDlm(x =as.vector(gold_price) , y =as.vector(asx_price) ,p =1 ,q =1 )$model
summary(gold_model4.1)
##
## Time series regression with "ts" data:
## Start = 2, End = 161
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -598.9 -102.9 10.4 119.5 724.6
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 182.80452 85.05812 2.149 0.0332 *
## X.t -1.24911 0.29162 -4.283 3.21e-05 ***
## X.1 1.23169 0.28976 4.251 3.66e-05 ***
## Y.1 0.97172 0.01812 53.624 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 191.1 on 156 degrees of freedom
## Multiple R-squared: 0.9542, Adjusted R-squared: 0.9533
## F-statistic: 1083 on 3 and 156 DF, p-value: < 2.2e-16
gold_model4.2 =ardlDlm(x =as.vector(gold_price) , y =as.vector(asx_price) ,p =2 ,q =2 )$model
summary(gold_model4.2)
##
## Time series regression with "ts" data:
## Start = 3, End = 161
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -590.39 -101.05 5.98 120.08 717.11
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 206.07177 86.92387 2.371 0.01900 *
## X.t -1.23319 0.30048 -4.104 6.58e-05 ***
## X.1 1.20172 0.43989 2.732 0.00704 **
## X.2 0.01244 0.30803 0.040 0.96783
## Y.1 0.99348 0.08164 12.169 < 2e-16 ***
## Y.2 -0.02596 0.08198 -0.317 0.75195
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 191.7 on 153 degrees of freedom
## Multiple R-squared: 0.9532, Adjusted R-squared: 0.9517
## F-statistic: 623 on 5 and 153 DF, p-value: < 2.2e-16
gold_model4.3 =ardlDlm(x =as.vector(gold_price) , y =as.vector(asx_price) ,p =3 ,q =3 )$model
summary(gold_model4.3)
##
## Time series regression with "ts" data:
## Start = 4, End = 161
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -557.60 -111.52 14.36 115.19 697.04
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 219.25957 88.32755 2.482 0.01415 *
## X.t -1.22010 0.30229 -4.036 8.63e-05 ***
## X.1 1.29275 0.44643 2.896 0.00435 **
## X.2 0.15014 0.45449 0.330 0.74160
## X.3 -0.23147 0.31042 -0.746 0.45703
## Y.1 0.98562 0.08183 12.045 < 2e-16 ***
## Y.2 0.14973 0.11517 1.300 0.19555
## Y.3 -0.17417 0.08164 -2.133 0.03453 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 190.6 on 150 degrees of freedom
## Multiple R-squared: 0.9531, Adjusted R-squared: 0.9509
## F-statistic: 435.1 on 7 and 150 DF, p-value: < 2.2e-16
gold_models.AIC =AIC(gold_model4.1,gold_model4.2,gold_model4.3)
gold_models.BIC =BIC(gold_model4.1,gold_model4.2,gold_model4.3)
sortScore(gold_models.AIC, "aic")
## df AIC
## gold_model4.3 9 2117.305
## gold_model4.2 7 2130.523
## gold_model4.1 5 2140.897
sortScore(gold_models.BIC, "bic")
## df BIC
## gold_model4.3 9 2144.868
## gold_model4.2 7 2152.005
## gold_model4.1 5 2156.273
#According to both AIC and BIC, ARDL(3,3) model is the most accurate one. We will displaydiagnostic plots of residuals from ARDL(3,3) to examine residuals.
checkresiduals(gold_model4.3$residuals)
# Finite dlm
crude_model1 =dlm(x =as.vector(crude_oil_price) , y =as.vector(asx_price) , q =9)
summary(crude_model1)
##
## Call:
## lm(formula = model.formula, data = design)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1563.08 -643.18 -11.17 571.05 1707.84
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4641.7675 196.4286 23.631 <2e-16 ***
## x.t 8.8037 11.3647 0.775 0.440
## x.1 2.8911 19.0087 0.152 0.879
## x.2 -2.1607 19.2799 -0.112 0.911
## x.3 2.9569 19.4045 0.152 0.879
## x.4 -7.0213 19.3913 -0.362 0.718
## x.5 0.7784 19.3589 0.040 0.968
## x.6 1.7254 19.4044 0.089 0.929
## x.7 -2.5268 19.5635 -0.129 0.897
## x.8 -2.7015 19.3336 -0.140 0.889
## x.9 0.8312 11.3876 0.073 0.942
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 802.1 on 141 degrees of freedom
## Multiple R-squared: 0.05864, Adjusted R-squared: -0.008123
## F-statistic: 0.8783 on 10 and 141 DF, p-value: 0.5551
##
## AIC and BIC values for the model:
## AIC BIC
## 1 2476.871 2513.158
checkresiduals(crude_model1$model)
##
## Breusch-Godfrey test for serial correlation of order up to 14
##
## data: Residuals
## LM test = 140.06, df = 14, p-value < 2.2e-16
#ploymonial dlm
crude_model2 =polyDlm(x =as.vector(crude_oil_price) , y =as.vector(asx_price) , q =2, k = 2, show.beta = TRUE)
## Estimates and t-tests for beta coefficients:
## Estimate Std. Error t value P(>|t|)
## beta.0 13.70 11.6 1.180 0.239
## beta.1 3.18 19.1 0.167 0.868
## beta.2 -8.41 11.5 -0.730 0.467
summary(crude_model2)
##
## Call:
## "Y ~ (Intercept) + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -1616.4 -703.8 -77.9 657.5 1783.6
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4209.0519 179.6342 23.431 <2e-16 ***
## z.t0 13.6870 11.5777 1.182 0.239
## z.t1 -9.9541 57.9232 -0.172 0.864
## z.t2 -0.5483 28.7836 -0.019 0.985
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 835.6 on 155 degrees of freedom
## Multiple R-squared: 0.09909, Adjusted R-squared: 0.08165
## F-statistic: 5.683 on 3 and 155 DF, p-value: 0.001019
checkresiduals(crude_model2$model)
##
## Breusch-Godfrey test for serial correlation of order up to 10
##
## data: Residuals
## LM test = 146.42, df = 10, p-value < 2.2e-16
#Koyk's transform
crude_model3 = koyckDlm(x =as.vector(crude_oil_price) , y =as.vector(asx_price) )
summary(crude_model3)
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -683.91 -108.66 13.68 139.77 762.55
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 209.89536 87.89368 2.388 0.0181 *
## Y.1 0.97537 0.01905 51.193 <2e-16 ***
## X.t -0.99907 0.58045 -1.721 0.0872 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 201.1 on 157 degrees of freedom
## Multiple R-Squared: 0.949, Adjusted R-squared: 0.9483
## Wald test: 1461 on 2 and 157 DF, p-value: < 2.2e-16
##
## Diagnostic tests:
## NULL
##
## alpha beta phi
## Geometric coefficients: 8522.034 -0.9990694 0.9753703
checkresiduals(crude_model3$model)
#ARDLM
crude_model4.1 =ardlDlm(x =as.vector(crude_oil_price) , y =as.vector(asx_price) ,p =1 ,q =1 )$model
summary(crude_model4.1)
##
## Time series regression with "ts" data:
## Start = 2, End = 161
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -684.06 -111.57 -1.77 138.90 719.35
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 211.45397 85.03705 2.487 0.01395 *
## X.t 7.45111 2.46199 3.026 0.00290 **
## X.1 -8.17917 2.44422 -3.346 0.00103 **
## Y.1 0.97067 0.01837 52.827 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 194.5 on 156 degrees of freedom
## Multiple R-squared: 0.9525, Adjusted R-squared: 0.9516
## F-statistic: 1044 on 3 and 156 DF, p-value: < 2.2e-16
crude_model4.2 =ardlDlm(x =as.vector(crude_oil_price) , y =as.vector(asx_price) ,p =2 ,q =2 )$model
summary(crude_model4.2)
##
## Time series regression with "ts" data:
## Start = 3, End = 161
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -676.1 -115.8 -10.3 141.0 731.0
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 244.07731 86.68388 2.816 0.00551 **
## X.t 6.24419 2.79912 2.231 0.02715 *
## X.1 -3.49738 4.46844 -0.783 0.43502
## X.2 -3.63452 2.70013 -1.346 0.18028
## Y.1 0.94407 0.08337 11.324 < 2e-16 ***
## Y.2 0.02266 0.08323 0.272 0.78584
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 194 on 153 degrees of freedom
## Multiple R-squared: 0.9521, Adjusted R-squared: 0.9505
## F-statistic: 607.9 on 5 and 153 DF, p-value: < 2.2e-16
crude_model4.3 =ardlDlm(x =as.vector(crude_oil_price) , y =as.vector(asx_price) ,p =3 ,q =3 )$model
summary(crude_model4.3)
##
## Time series regression with "ts" data:
## Start = 4, End = 161
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -649.54 -119.25 -12.53 138.52 709.15
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 249.79681 89.55878 2.789 0.00597 **
## X.t 6.52707 2.82900 2.307 0.02241 *
## X.1 -4.76939 4.71360 -1.012 0.31325
## X.2 -3.02318 4.58005 -0.660 0.51022
## X.3 0.47726 2.73635 0.174 0.86178
## Y.1 0.94185 0.08455 11.139 < 2e-16 ***
## Y.2 0.12269 0.11766 1.043 0.29876
## Y.3 -0.10083 0.08400 -1.200 0.23187
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 194.9 on 150 degrees of freedom
## Multiple R-squared: 0.9509, Adjusted R-squared: 0.9486
## F-statistic: 415.2 on 7 and 150 DF, p-value: < 2.2e-16
crude_models.AIC =AIC(crude_model4.1,crude_model4.2,crude_model4.3)
crude_models.BIC =BIC(crude_model4.1,crude_model4.2,crude_model4.3)
sortScore(crude_models.AIC, "aic")
## df AIC
## crude_model4.3 9 2124.324
## crude_model4.2 7 2134.235
## crude_model4.1 5 2146.524
sortScore(crude_models.BIC, "bic")
## df BIC
## crude_model4.3 9 2151.887
## crude_model4.2 7 2155.718
## crude_model4.1 5 2161.900
#According to both AIC and BIC, ARDL(3,3) model is the most accurate one. We will displaydiagnostic plots of residuals from ARDL(3,3) to examine residuals.
checkresiduals(crude_model4.3$residuals)
# Finite dlm
copper_model1 =dlm(x =as.vector(copper_price) , y =as.vector(asx_price) , q =9)
summary(copper_model1)
##
## Call:
## lm(formula = model.formula, data = design)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1163.95 -653.62 -5.48 601.46 1422.56
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.921e+03 2.114e+02 18.546 <2e-16 ***
## x.t 1.576e-01 1.351e-01 1.166 0.245
## x.1 1.829e-02 2.177e-01 0.084 0.933
## x.2 4.688e-02 2.177e-01 0.215 0.830
## x.3 2.755e-02 2.164e-01 0.127 0.899
## x.4 2.061e-02 2.157e-01 0.096 0.924
## x.5 -5.263e-02 2.157e-01 -0.244 0.808
## x.6 3.688e-02 2.165e-01 0.170 0.865
## x.7 -5.357e-03 2.186e-01 -0.025 0.980
## x.8 -2.372e-04 2.195e-01 -0.001 0.999
## x.9 -9.203e-02 1.337e-01 -0.688 0.493
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 736.7 on 141 degrees of freedom
## Multiple R-squared: 0.2059, Adjusted R-squared: 0.1496
## F-statistic: 3.656 on 10 and 141 DF, p-value: 0.000233
##
## AIC and BIC values for the model:
## AIC BIC
## 1 2451.016 2487.302
checkresiduals(copper_model1$model)
##
## Breusch-Godfrey test for serial correlation of order up to 14
##
## data: Residuals
## LM test = 140.43, df = 14, p-value < 2.2e-16
#ploymonial dlm
copper_model2 =polyDlm(x =as.vector(copper_price) , y =as.vector(asx_price) , q =2, k = 2, show.beta = TRUE)
## Estimates and t-tests for beta coefficients:
## Estimate Std. Error t value P(>|t|)
## beta.0 0.17800 0.131 1.3600 0.176
## beta.1 0.05290 0.207 0.2550 0.799
## beta.2 -0.00654 0.129 -0.0507 0.960
summary(copper_model2)
##
## Call:
## "Y ~ (Intercept) + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -1302.7 -694.4 -135.3 635.5 1512.0
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3484.26812 181.89988 19.155 <2e-16 ***
## z.t0 0.17781 0.13067 1.361 0.176
## z.t1 -0.15771 0.62898 -0.251 0.802
## z.t2 0.03277 0.31191 0.105 0.916
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 743 on 155 degrees of freedom
## Multiple R-squared: 0.2877, Adjusted R-squared: 0.274
## F-statistic: 20.87 on 3 and 155 DF, p-value: 2.065e-11
checkresiduals(copper_model2$model)
##
## Breusch-Godfrey test for serial correlation of order up to 10
##
## data: Residuals
## LM test = 146.69, df = 10, p-value < 2.2e-16
#Koyk's transform
copper_model3 = koyckDlm(x =as.vector(copper_price) , y =as.vector(asx_price) )
summary(copper_model3)
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -689.64 -108.62 12.78 140.20 771.79
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 189.368812 87.644648 2.161 0.0322 *
## Y.1 0.971621 0.021895 44.376 <2e-16 ***
## X.t -0.005864 0.009517 -0.616 0.5387
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 201.9 on 157 degrees of freedom
## Multiple R-Squared: 0.9485, Adjusted R-squared: 0.9479
## Wald test: 1448 on 2 and 157 DF, p-value: < 2.2e-16
##
## Diagnostic tests:
## NULL
##
## alpha beta phi
## Geometric coefficients: 6672.885 -0.005863623 0.9716211
checkresiduals(copper_model3$model)
#ARDLM
copper_model4.1 =ardlDlm(x =as.vector(copper_price) , y =as.vector(asx_price) ,p =1 ,q =1 )$model
summary(copper_model4.1)
##
## Time series regression with "ts" data:
## Start = 2, End = 161
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -734.34 -101.82 16.56 123.05 774.55
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 173.61066 84.94572 2.044 0.04266 *
## X.t 0.10629 0.03258 3.263 0.00136 **
## X.1 -0.10695 0.03228 -3.313 0.00115 **
## Y.1 0.96784 0.02103 46.027 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 195.2 on 156 degrees of freedom
## Multiple R-squared: 0.9522, Adjusted R-squared: 0.9513
## F-statistic: 1035 on 3 and 156 DF, p-value: < 2.2e-16
copper_model4.2 =ardlDlm(x =as.vector(copper_price) , y =as.vector(asx_price) ,p =2 ,q =2 )$model
summary(copper_model4.2)
##
## Time series regression with "ts" data:
## Start = 3, End = 161
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -693.17 -110.95 7.57 121.75 765.87
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 195.791146 85.366884 2.294 0.0232 *
## X.t 0.079371 0.034097 2.328 0.0212 *
## X.1 -0.004626 0.053845 -0.086 0.9316
## X.2 -0.078848 0.034420 -2.291 0.0233 *
## Y.1 0.972757 0.079625 12.217 <2e-16 ***
## Y.2 -0.005210 0.079957 -0.065 0.9481
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 192.6 on 153 degrees of freedom
## Multiple R-squared: 0.9527, Adjusted R-squared: 0.9512
## F-statistic: 616.8 on 5 and 153 DF, p-value: < 2.2e-16
copper_model4.3 =ardlDlm(x =as.vector(copper_price) , y =as.vector(asx_price) ,p =3 ,q =3 )$model
summary(copper_model4.3)
##
## Time series regression with "ts" data:
## Start = 4, End = 161
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -633.75 -106.38 -7.83 119.11 746.52
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 211.80399 86.99784 2.435 0.0161 *
## X.t 0.08242 0.03418 2.412 0.0171 *
## X.1 -0.02677 0.05491 -0.488 0.6266
## X.2 -0.01751 0.05445 -0.322 0.7482
## X.3 -0.04240 0.03493 -1.214 0.2267
## Y.1 0.94704 0.08119 11.665 <2e-16 ***
## Y.2 0.11444 0.11280 1.015 0.3120
## Y.3 -0.09745 0.08010 -1.217 0.2257
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 192.1 on 150 degrees of freedom
## Multiple R-squared: 0.9523, Adjusted R-squared: 0.9501
## F-statistic: 428.2 on 7 and 150 DF, p-value: < 2.2e-16
copper_models.AIC =AIC(copper_model4.1,copper_model4.2,copper_model4.3)
copper_models.BIC =BIC(copper_model4.1,copper_model4.2,copper_model4.3)
sortScore(copper_models.AIC, "aic")
## df AIC
## copper_model4.3 9 2119.696
## copper_model4.2 7 2132.038
## copper_model4.1 5 2147.741
sortScore(copper_models.BIC, "bic")
## df BIC
## copper_model4.3 9 2147.259
## copper_model4.2 7 2153.520
## copper_model4.1 5 2163.116
#According to both AIC and BIC, ARDL(3,3) model is the most accurate one. We will displaydiagnostic plots of residuals from ARDL(3,3) to examine residuals.
checkresiduals(copper_model4.3$residuals)
The residual standard deviation values for the ARDLM are lower when compared to any of the other. Although the Koyk transform also has a relatively low residual SD values, it still performs a little less optimally when compared to ARDLM. The polynomial DLM and finite lag DLM fall short of our expectation as soon as well have a look at the residual SD, which is very high in comparison to the other two models. There is also the case of the ACF values falling out of 95% confidence interval, which is not a case in ARDLM and Koyk transform.
The best fit model for this particular ARDL model is with p,q = (3,3), which give us the least AIC and BIC values.
When we initially started with our dataset, we had to make sure if the data was stationary or not. For those purpose we tested the data using Augmented Dickey Fuller test. The given data was non-stationary hence we had to transform it using transromation and differencing.
We also had to check the effects of seasonality and trend on the time series. These components were responsible in making the data non-stationary. The process of decomposition was crucial in helping solve the effects of seasonality and trend on the data and hence in turn making it Stationary.
Then it was time to find the best fit model for ASX price. We started with comparing 4 models namely Finite lag DLM, polynomialDLM, Koyk transform and ARDLM. After looking at the residual plots and summary we concluded that it was the ARDLM which was the best fit model when it comes to our dataset.