The project report consist of three tasks: [a] Task1: Your task is to give best 4 weeks ahead forecasts in terms of R squared, AIC, BIC, MASE etc (as is appropriate) for the mortality series. Provide the point forecasts and confidence intervals and corresponding plot for the most optimal model for each method used.
[b] Task2: Your task is to model FFD and forecast FFD. Single climate predictors (univariate models) are to be tested. Your task is to give best FFD 4 years ahead forecasts for the FFD series. Point forecasts and confidence intervals are required for the forecasts with appropriate graphs.
[c] Task 3(a)Carry out your analysis based on univariate climate regressors (model one climate indicator at a time, i.e., univariate regressor). • Modelling methods to try (DLM, ARDL, polyck, koyck, dynlm). • Choice of optimal models within EACH a specific method can be assessed from values of R squared, AIC, BIC, MASE etc (as is appropriate to the method).
[c] Task 3(b)Perform the appropriate analysis and obtain the 3 year ahead forecasts (suggest using the dynlm package) only for part (b))
The aim of the investigation is to perform the analysis on disease specific mortality between the years 2010-2020 effected by both climate and pollution and observe the results to reach upon conclusions
The dataset used ASX.csv includes averaged weekly mortality in Paris, France and the city’s local climate (temperature degrees Fahrenheit), size of pollutants and levels of noxious chemical emissions from cars and industry in the air - all measured at the same points between 2010-2020. All 5 series i.e. mortality, temperature, pollutants particle size and two chemical emissions (chem1, chem2) between 2010-2020 (508 time points) are given here in mort.csv. You will use this data for the calculation of 4 weeks ahead forecasts for mortality
mort_data <- read_csv("D:/Drive data/Rmit/Sem4/Forecasting/mort.csv")
## Warning: Missing column names filled in: 'X1' [1]
##
## -- Column specification --------------------------------------------------------
## cols(
## X1 = col_double(),
## mortality = col_double(),
## temp = col_double(),
## chem1 = col_double(),
## chem2 = col_double(),
## `particle size` = col_double()
## )
colnames(mort_data)
## [1] "X1" "mortality" "temp" "chem1"
## [5] "chem2" "particle size"
head(mort_data,5)
## # A tibble: 5 x 6
## X1 mortality temp chem1 chem2 `particle size`
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 97.8 72.4 11.5 3.37 72.7
## 2 2 105. 67.2 8.92 2.59 49.6
## 3 3 94.4 62.9 9.48 3.29 55.7
## 4 4 98.0 72.5 10.3 3.04 55.2
## 5 5 95.8 74.2 10.6 3.39 66.0
summary(mort_data)
## X1 mortality temp chem1
## Min. : 1.0 Min. : 68.11 Min. :50.91 Min. : 2.520
## 1st Qu.:127.8 1st Qu.: 81.90 1st Qu.:67.23 1st Qu.: 4.970
## Median :254.5 Median : 87.33 Median :74.06 Median : 6.865
## Mean :254.5 Mean : 88.70 Mean :74.26 Mean : 7.909
## 3rd Qu.:381.2 3rd Qu.: 94.36 3rd Qu.:81.49 3rd Qu.:10.080
## Max. :508.0 Max. :132.04 Max. :99.88 Max. :22.390
## chem2 particle size
## Min. :0.860 Min. :20.25
## 1st Qu.:2.050 1st Qu.:35.85
## Median :2.740 Median :44.25
## Mean :2.844 Mean :47.41
## 3rd Qu.:3.465 3rd Qu.:57.54
## Max. :6.570 Max. :97.94
class(mort_data)
## [1] "spec_tbl_df" "tbl_df" "tbl" "data.frame"
#tail(mort_data)
#508 points from 2010 to 2020 weekly so we have 52 weeks
T1_ts <- ts(mort_data[,2:6], start = c(2010,7), frequency = 52)
class(T1_ts)
## [1] "mts" "ts" "matrix"
#T1_ts
#tail(T1_ts)
Timeseries on each column
##Mortality
mortal_ts <- ts(mort_data$mortality, start = c(2010,7),frequency = 52)
##temperature
temp_ts <- ts(mort_data$temp, start = c(2010,7),frequency = 52)
##Chemical 1 &chemical2
chem1_ts <- ts(mort_data$chem1, start = c(2010,7),frequency = 52)
chem2_ts <- ts(mort_data$chem2, start = c(2010,7),frequency = 52)
#Particle size
part_ts <- ts(mort_data$`particle size`, start = c(2010,7),frequency = 52)
Plotting each column in data
1)Mortality
##PLOtting
plot(mortal_ts,type = "o", ylab="mortality index", xlab="Year", main = "Time series plot of mortality rates between 2010 to 2020")
In Mortality series we do not find any trend. We do not see seasonality in the series. Intervention point is observed. Moving average is found. Changing varince is present
2)Temperature
plot(temp_ts,type = "o", ylab="temperature index", xlab="Year", main = "Time series plot of temperature change rates between 2010 to 2020")
In Temperature series we do not see any trend. We found seasonality in the series. No intervention point is observed. Moving average was visible. Changing varince is present.
plot(chem1_ts,type = "o", ylab="chemical1 index", xlab="Year", main = "Time series plot of chemical 1 change rates between 2010 to 2020")
In Chemical 1 series we see downward any trend. We see no seasonality in the series. No intervention point is observed. Moving average is visible. Changing variance was present.
plot(chem2_ts,type = "o", ylab="chemical 2 index", xlab="Year", main = "Time series plot of chemical 2 change rates between 2010 to 2020")
In Chemical 2 series we find no trend. We found slight seasonality in the series. Intervention point is observed. Moving average is visible. Changing variance was present.
plot(part_ts,type = "o", ylab="particle size", xlab="Year", main = "Time series plot of particle size change rates between 2010 to 2020")
In Particle Size series we did not find any trend. We observed seasonality in the series. No intervention point is observed. Moving average is visible. Changing variance was found.
Successive points as well as fluctuations around mean level suggest autoregressive and moving average behaviour.
To further explore the relationship between the all ords index and our independent series, we display them within the same plot. Standartisation is performed over all variables by centering and scaling to clearly plot them on the same scale.
T1scale_date = scale(T1_ts)
plot(T1scale_date, plot.type = "s" ,col = c("Red","blue", "Green", "black","brown"),main="Time series plot of Scaled mortality data")
legend("bottomright",lty=1, text.width = 3, col=c("Red","blue", "Green", "black","brown"), c("Mortality", "Temperature", "Chemical1", "Chemical2","Particle Size"))
#Find the correlation between them
cor(T1_ts)
## mortality temp chem1 chem2 particle size
## mortality 1.0000000 -0.43863962 0.55744759 0.2569989 0.44387133
## temp -0.4386396 1.00000000 -0.09785582 0.4043740 -0.01723095
## chem1 0.5574476 -0.09785582 1.00000000 0.5130047 0.86611747
## chem2 0.2569989 0.40437401 0.51300467 1.0000000 0.46793404
## particle size 0.4438713 -0.01723095 0.86611747 0.4679340 1.00000000
It is observed that there is a moderate positive correlation between the dependent series all parameters From value obtained we see that Size is has good correlation than other predictors
To find the suitable lag and find stationarity unit root test is performed
##Augmented dicky fuller test
par(mfrow=c(1,2))
acf(mortal_ts, main = "ACF for the mortality rate",cex.main=0.65)
pacf(mortal_ts, main = "PACF of Mortality",cex.main=0.05)
par(mfrow=c(1,1))
ar(mortal_ts)
##
## Call:
## ar(x = mortal_ts)
##
## Coefficients:
## 1 2
## 0.4339 0.4376
##
## Order selected 2 sigma^2 estimated as 32.84
#order selected=2
adf.test(mortal_ts,k = 2)
## Warning in adf.test(mortal_ts, k = 2): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: mortal_ts
## Dickey-Fuller = -5.161, Lag order = 2, p-value = 0.01
## alternative hypothesis: stationary
Since the p value is smaller than 0.05, we reject the null hypothesis that implies stationarity.
PP.test(mortal_ts)
##
## Phillips-Perron Unit Root Test
##
## data: mortal_ts
## Dickey-Fuller = -9.454, Truncation lag parameter = 6, p-value = 0.01
According to PP test,p value is lower than 5% thus mortal_ts series is stationarity.
par(mfrow=c(1,2))
acf(temp_ts, main = "ACF for the temperature series",cex.main=0.65)
pacf(temp_ts, main = "PACF of temperature",cex.main=0.05)
par(mfrow=c(1,1))
ar(temp_ts)
##
## Call:
## ar(x = temp_ts)
##
## Coefficients:
## 1 2 3 4 5 6 7 8
## 0.1479 0.2072 0.0702 0.1794 0.0486 0.0769 0.0191 0.0618
## 9 10 11 12 13 14 15 16
## 0.0934 -0.0328 -0.0889 -0.0992 -0.0092 -0.0335 -0.0240 0.0094
## 17 18 19 20 21 22
## 0.0180 0.0004 -0.0465 -0.0204 -0.0373 -0.1382
##
## Order selected 22 sigma^2 estimated as 36.83
#order selected=22
adf.test(temp_ts,k = 22)
## Warning in adf.test(temp_ts, k = 22): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: temp_ts
## Dickey-Fuller = -8.2554, Lag order = 22, p-value = 0.01
## alternative hypothesis: stationary
Since the p value is lower than 0.05, we reject the null hypothesis that implies stationarity.
PP.test(temp_ts)
##
## Phillips-Perron Unit Root Test
##
## data: temp_ts
## Dickey-Fuller = -12.095, Truncation lag parameter = 6, p-value = 0.01
According to PP test,p value is lower than 5% thus temp series is stationarity and alternate hypothesis is stationary.
par(mfrow=c(1,2))
acf(chem1_ts, main = "ACF for the chemical1 values",cex.main=0.65)
pacf(chem1_ts, main = "PACF of chemical1 values",cex.main=0.05)
par(mfrow=c(1,1))
ar(chem1_ts)
##
## Call:
## ar(x = chem1_ts)
##
## Coefficients:
## 1 2 3 4 5 6 7 8
## 0.0883 0.3275 0.1834 0.1018 0.1016 0.1447 0.0522 0.0184
## 9 10 11 12 13 14 15 16
## -0.0052 0.0542 -0.1058 -0.1009 0.0225 -0.0643 -0.0490 -0.0802
##
## Order selected 16 sigma^2 estimated as 5.609
#order selected=16
adf.test(chem1_ts,k = 16)
## Warning in adf.test(chem1_ts, k = 16): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: chem1_ts
## Dickey-Fuller = -8.1588, Lag order = 16, p-value = 0.01
## alternative hypothesis: stationary
Since the p value is lower than 0.05, we reject the null hypothesis that implies stationarity.
PP.test(chem1_ts)
##
## Phillips-Perron Unit Root Test
##
## data: chem1_ts
## Dickey-Fuller = -12.819, Truncation lag parameter = 6, p-value = 0.01
According to PP test,p value is lower than 5% thus temp series is stationarity and alternate hypothesis is stationary.
par(mfrow=c(1,2))
acf(chem2_ts, main = "ACF for the chemical2 values",cex.main=0.65)
pacf(chem2_ts, main = "PACF of chemical2",cex.main=0.05)
par(mfrow=c(1,1))
ar(chem2_ts)
##
## Call:
## ar(x = chem2_ts)
##
## Coefficients:
## 1 2 3 4 5 6 7 8
## 0.1319 0.2025 0.0119 0.1425 0.1070 0.0523 0.0631 0.0958
##
## Order selected 8 sigma^2 estimated as 0.7765
#order selected=8
adf.test(chem2_ts,k = 8)
## Warning in adf.test(chem2_ts, k = 8): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: chem2_ts
## Dickey-Fuller = -5.3362, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary
Since the p value is lower than 0.05, we reject the null hypothesis that implies stationarity.
PP.test(chem2_ts)
##
## Phillips-Perron Unit Root Test
##
## data: chem2_ts
## Dickey-Fuller = -20.014, Truncation lag parameter = 6, p-value = 0.01
According to PP test,p value is lower than 5% thus temp series is stationarity and alternate hypothesis is stationary.
par(mfrow=c(1,2))
acf(part_ts, main = "ACF for the particle size series",cex.main=0.65)
pacf(part_ts, main = "PACF of particle size",cex.main=0.05)
par(mfrow=c(1,1))
ar(part_ts)
##
## Call:
## ar(x = part_ts)
##
## Coefficients:
## 1 2 3 4 5 6 7 8
## 0.1272 0.2584 0.1620 0.1593 0.0681 0.1083 0.0666 0.0256
## 9 10 11 12 13 14
## -0.0359 0.0504 -0.0827 -0.0989 -0.0665 -0.1112
##
## Order selected 14 sigma^2 estimated as 114.6
#order selected=14
adf.test(part_ts,k = 14)
## Warning in adf.test(part_ts, k = 14): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: part_ts
## Dickey-Fuller = -7.2956, Lag order = 14, p-value = 0.01
## alternative hypothesis: stationary
Since the p value is lower than 0.05, we reject the null hypothesis that implies stationarity.
PP.test(part_ts)
##
## Phillips-Perron Unit Root Test
##
## data: part_ts
## Dickey-Fuller = -13.343, Truncation lag parameter = 6, p-value = 0.01
According to PP test,p value is lower than 5% thus temp series is stationarity and alternate hypothesis is stationary.
Based on the slowly decaying pattern of significant lags in the sample ACF plots in Figures 6-9, we can conclude that all explored series have a trend. The ADF test reports p-values > 0.05 for all the series, so we fail to reject the H0 that the series are nonstationary at 5% level. Overall, from a descriptive analysis of time series plots, sample ACF plots and the ADF test results, we can observe that there is nonstationarity existent in the asx data. Augmented Dickey-Fuller Test states tha t all the series is stationary as we get p-value less than 5% hence we reject null hypothesis.
#Decomposition ##Decomposition of time series Decomposition of time series into different components is useful to observe the individualeffects of the existing components and historical effects occurred in the past. Thecomponents that a time series can be decomposed into are * seasonal * trend and * remainder , which includes other effects that are not captured by the seasonal and trend components. Basically, there are three main decomposition methods for time series. The most basic oneis the * classical decomposition, which provides the basis for other decomposition methods. * The X-12-ARIMA decomposition is another decomposition which is more complex than the classical decomposition. It is mostly used for quarterly and monthly data.
One of thevery robust and commonly used decomposition methods is the Seasonal and Trend decomposition using Loss (STL) decomposition). When a time series is displayed, it includes trend and seasonal effect in a confounded way;hence, it would be very difficult to infer about the main characteristics of the series underthe effect of seasonality.
Therefore, we use time series decomposition to extract eachcomponent from the series and adjust the series for various effects like seasonality.
#Decomposition function giving output STL plot.
decompose <- function(x){
stldeco=mstl(x, t.window=15, s.window="periodic", robust=TRUE)
plot(stldeco)
}
#decomposition of Mortality
decompose(mortal_ts)
#Decomposition of Temperature
decompose(temp_ts)
#Decomposition of Chemical 1
decompose(chem1_ts)
#Decomposition of chemical 2
decompose(chem2_ts)
#Decomposition of Particle Size
decompose(part_ts)
STL handles any type of seasonality, whiles others are somewhat limited to onlymonthly and/or quarterly series. The seasonal component can change by the time and the rate of change can becontrolled by the user. The smoothness of the trend-cycle can also be controlled by the user. We can make it robust to outliers by sending the effect of occasional unusualobservations to the remainder component.
1)Mortality
stl_mort = stl(mortal_ts,t.window = 15, s.window = "periodic", robust = T)
plot(stl_mort, main = "STL decomposition of all price index series")
#Seasonal period too large.thus cannot fit x12 model
##Naive forcast
mortadj =seasadj(stl_mort)
plot(naive(mortadj), xlab="mortality rates", main= "Naive forecasts of seasonally adjusted mortality rates")
From the remainder series of the all ords series in Figure 10, it is observed that the spikes in the raw data are caused by other external factors, they happen around the intervention point.
stl_temp = stl(temp_ts,t.window = 15, s.window = "periodic", robust = T)
plot(stl_temp, main = "STL decomposition of gold price index series")
##Naive forcast
tempadj =seasadj(stl_temp)
plot(naive(tempadj), ylab="temperature rates",xlab="Time period", main= "Naive forecasts of seasonally adjusted temperature rates")
There was no seasonal effect found in the gold price series at the data visualisation stage, so the seasonally adjusted data in X12 decomposition is very close to the original series,and the seasonal pattern in STL decomposition is meaningless (Figure 11). The remainder component of this series is not smooth at all, meaning there are other unknown factors that have an impact on the series.
3)Chemical1
stl_c1 = stl(chem1_ts,t.window = 15, s.window = "periodic", robust = T)
plot(stl_c1, main = "STL decomposition of chemical1 value index series")
##Naive forcast
c1adj =seasadj(stl_c1)
plot(naive(c1adj), ylab="chemical 1 values",xlab="Time period", main= "Naive forecasts of seasonally adjusted chemical1 rates")
From the STL decomposition in Figure 12,it is observed that the remainder has a major peak when the intervention happened, but otherwise is rather smooth.
4)Chemical2
stl_c2 = stl(chem2_ts,t.window = 15, s.window = "periodic", robust = T)
plot(stl_c2, main = "STL decomposition of chemical2 value index series")
##Naive forcast
c2adj =seasadj(stl_c2)
plot(naive(c2adj), ylab="chemical2 values",xlab="Time period", main= "Naive forecasts of seasonally adjusted chemical 2 value")
5)particle space
stl_p = stl(part_ts,t.window = 15, s.window = "periodic", robust = T)
plot(stl_p, main = "STL decomposition of particle space value index series")
##Naive forecast
partadj =seasadj(stl_c2)
plot(naive(partadj), ylab="Particle size",xlab="Time period", main= "Naive forecasts of seasonally adjusted particle space value")
The conclusions that we can make from the decomposition of the mortality data are similar to all the other previously analysed series. There are ups and downs around the intervention in the remainder of the series (Figure 13). There was no seasonality found prior to decomposition, so there is no seasonal effect here.
Overall, it is observed that there is no seasonality effect on the mortal data, and all the fluctuations are due to other external factors. Since we did not find any evidence of seasonality from the time series and ACF plots, the seasonal pattern from STL decomposition is not meaningful. From the X12 decomposition we can observe that the seasonally adjusted data is very close to the original series.
In the modelling process we attempt to find the best appropriate model for the all ordinaries price index.
These predictors were chosen based on their correlation with each other and with the dependent variable.
#1 Finite DLM
dataf = mort_data
colnames(dataf) <- c("mortality", "temp", "X1", "X2","X3")
for ( i in 1:10){
model1.1 = dlm(formula = mortality ~ temp + +X1+X2+X3, data = data.frame(dataf), q = i )
cat("q = ", i, "AIC = ", AIC(model1.1$model), "BIC = ", BIC(model1.1$model),"Mase =",MASE(model1.1)$MASE, "\n")
}
## q = 1 AIC = 6162.854 BIC = 6205.139 Mase = 84.97144
## q = 2 AIC = 6102.699 BIC = 6161.871 Mase = 79.95024
## q = 3 AIC = 6049.467 BIC = 6125.509 Mase = 76.15284
## q = 4 AIC = 6010.61 BIC = 6103.507 Mase = 73.65758
## q = 5 AIC = 5975.72 BIC = 6085.455 Mase = 71.25103
## q = 6 AIC = 5939.706 BIC = 6066.264 Mase = 68.52489
## q = 7 AIC = 5901.759 BIC = 6045.123 Mase = 65.89878
## q = 8 AIC = 5861.617 BIC = 6021.772 Mase = 62.95227
## q = 9 AIC = 5820.038 BIC = 5996.968 Mase = 60.43504
## q = 10 AIC = 5776.348 BIC = 5970.035 Mase = 58.22366
Finite dlm
Multiple predictors For all indexes
Model1.AllIndexes = dlm(formula = mortality ~ temp + X3, data = data.frame(dataf), q=10)
summary(Model1.AllIndexes)
##
## Call:
## lm(formula = as.formula(model.formula), data = design)
##
## Residuals:
## Min 1Q Median 3Q Max
## -243.910 -55.471 8.942 54.356 220.948
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1228.49183 40.41868 30.394 < 2e-16 ***
## temp.t -1.15116 0.66343 -1.735 0.083358 .
## temp.1 -0.36794 0.72348 -0.509 0.611288
## temp.2 -0.32930 0.76406 -0.431 0.666675
## temp.3 -0.09361 0.76381 -0.123 0.902506
## temp.4 0.17080 0.76156 0.224 0.822640
## temp.5 0.11340 0.76171 0.149 0.881713
## temp.6 -0.07927 0.76292 -0.104 0.917287
## temp.7 -0.33629 0.76079 -0.442 0.658672
## temp.8 -0.46105 0.76066 -0.606 0.544719
## temp.9 -1.48477 0.71179 -2.086 0.037514 *
## temp.10 -2.46332 0.65084 -3.785 0.000173 ***
## X3.t -20.06364 4.14000 -4.846 1.71e-06 ***
## X3.1 -19.55329 4.26072 -4.589 5.70e-06 ***
## X3.2 -15.02953 4.32428 -3.476 0.000556 ***
## X3.3 -16.71820 4.31411 -3.875 0.000121 ***
## X3.4 -13.09503 4.37328 -2.994 0.002894 **
## X3.5 -10.46192 4.38709 -2.385 0.017484 *
## X3.6 -9.49201 4.36805 -2.173 0.030270 *
## X3.7 -9.38949 4.37222 -2.148 0.032256 *
## X3.8 -7.44280 4.39035 -1.695 0.090681 .
## X3.9 -8.14088 4.33394 -1.878 0.060938 .
## X3.10 -8.62138 4.30207 -2.004 0.045637 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 79.12 on 475 degrees of freedom
## Multiple R-squared: 0.7111, Adjusted R-squared: 0.6977
## F-statistic: 53.14 on 22 and 475 DF, p-value: < 2.2e-16
##
## AIC and BIC values for the model:
## AIC BIC
## 1 5791.183 5892.238
residualcheck=function(x){
shapiro.test(x$residuals)
}
residualcheck(Model1.AllIndexes$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.99173, p-value = 0.00714
checkresiduals(Model1.AllIndexes$model)
##
## Breusch-Godfrey test for serial correlation of order up to 26
##
## data: Residuals
## LM test = 474.68, df = 26, p-value < 2.2e-16
VIF_m1 = vif(Model1.AllIndexes$model)
VIF_m1
## temp.t temp.1 temp.2 temp.3 temp.4 temp.5 temp.6 temp.7
## 3.522721 4.188573 4.672719 4.660662 4.619670 4.617936 4.615754 4.590513
## temp.8 temp.9 temp.10 X3.t X3.1 X3.2 X3.3 X3.4
## 4.588034 4.035825 3.375787 1.523050 1.608759 1.658110 1.648431 1.689063
## X3.5 X3.6 X3.7 X3.8 X3.9 X3.10
## 1.696054 1.681233 1.679486 1.691992 1.649012 1.625328
VIF_m1 > 10
## temp.t temp.1 temp.2 temp.3 temp.4 temp.5 temp.6 temp.7 temp.8 temp.9
## FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## temp.10 X3.t X3.1 X3.2 X3.3 X3.4 X3.5 X3.6 X3.7 X3.8
## FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## X3.9 X3.10
## FALSE FALSE
If the value of VIF is greater than 10, we can conclude that the effect of multicollinearity is high.
#Temp
model1.temp <- dlm(x=as.vector(dataf$temp), y=as.vector(dataf$mortality), q=10)
summary(model1.temp)
##
## Call:
## lm(formula = model.formula, data = design)
##
## Residuals:
## Min 1Q Median 3Q Max
## -262.11 -89.47 -2.66 93.64 275.67
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.174e+03 6.016e+01 19.518 < 2e-16 ***
## x.t -2.899e+00 9.459e-01 -3.065 0.00230 **
## x.1 -1.431e+00 1.024e+00 -1.397 0.16304
## x.2 -6.515e-01 1.095e+00 -0.595 0.55218
## x.3 -3.835e-01 1.095e+00 -0.350 0.72639
## x.4 -1.668e-03 1.096e+00 -0.002 0.99879
## x.5 6.399e-02 1.096e+00 0.058 0.95346
## x.6 4.227e-02 1.097e+00 0.039 0.96929
## x.7 -1.910e-01 1.096e+00 -0.174 0.86177
## x.8 -3.774e-01 1.097e+00 -0.344 0.73087
## x.9 -1.514e+00 1.025e+00 -1.477 0.14031
## x.10 -2.970e+00 9.479e-01 -3.133 0.00183 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 119.6 on 486 degrees of freedom
## Multiple R-squared: 0.3243, Adjusted R-squared: 0.309
## F-statistic: 21.2 on 11 and 486 DF, p-value: < 2.2e-16
##
## AIC and BIC values for the model:
## AIC BIC
## 1 6192.336 6247.073
checkresiduals(model1.temp)
## 1 2 3 4 5 6
## -173.9110411 -165.2324112 -174.8008560 -189.3055931 -180.6809853 -221.4638279
## 7 8 9 10 11 12
## -247.3735674 -227.1723497 -245.3420008 -205.4787361 -221.9912166 -202.9837259
## 13 14 15 16 17 18
## -208.2602349 -228.4757166 -220.9296073 -256.6840365 -246.8174666 -251.4165222
## 19 20 21 22 23 24
## -254.1753122 -224.2394737 -262.1047517 -207.0276056 -219.0632537 -223.2998477
## 25 26 27 28 29 30
## -217.9343181 -230.7859978 -230.3158899 -210.1819458 -203.6393006 -197.2741253
## 31 32 33 34 35 36
## -194.0331829 -171.5325787 -174.4802410 -172.2179381 -131.6226821 -105.3451398
## 37 38 39 40 41 42
## -48.6899586 -53.1170919 -61.8432574 -88.6530015 -110.8604109 -89.6887571
## 43 44 45 46 47 48
## -72.2442783 -54.1794219 -20.7490259 5.8095577 5.6322319 -38.7757658
## 49 50 51 52 53 54
## -39.7721468 -85.4694422 -98.4607513 -88.8176746 -87.4717181 -110.9609435
## 55 56 57 58 59 60
## -100.2946092 -117.4186719 -118.8705575 -135.9659968 -147.5358642 -179.7906844
## 61 62 63 64 65 66
## -178.0871199 -201.2200879 -173.5349275 -182.3780260 -175.4744648 -208.2497122
## 67 68 69 70 71 72
## -134.8915080 -151.1722332 -192.2902827 -194.6058011 -202.1978437 -179.9241408
## 73 74 75 76 77 78
## -157.3069215 -168.8761442 -196.0763584 -134.2858767 -106.4080165 -134.7091395
## 79 80 81 82 83 84
## -149.9134025 -110.8794141 -155.7726334 -97.6465106 -119.9820358 -128.0926220
## 85 86 87 88 89 90
## -92.8433508 -36.4216330 -47.6779134 0.6172922 18.2685926 15.5064126
## 91 92 93 94 95 96
## -17.9884370 10.6357179 -3.5140161 13.7314847 25.5753221 23.8991052
## 97 98 99 100 101 102
## 18.4211074 6.1616208 3.7415844 -38.0688099 -32.2155699 -53.8971423
## 103 104 105 106 107 108
## -39.3222936 -39.2794719 -73.9907584 -103.0364305 -106.7168873 -138.5681816
## 109 110 111 112 113 114
## -111.7302957 -159.4940562 -140.6790148 -146.0943322 -152.2750188 -121.4275755
## 115 116 117 118 119 120
## -152.3556562 -140.8921654 -123.7577516 -126.6159927 -93.2879471 -150.4030344
## 121 122 123 124 125 126
## -140.6874649 -174.6404861 -157.0349134 -146.0061261 -159.5970659 -122.6880306
## 127 128 129 130 131 132
## -130.2045384 -108.1279414 -129.8213431 -128.2992776 -139.3149877 -126.7641326
## 133 134 135 136 137 138
## -97.0783600 -106.5736113 -78.0101038 -71.0953055 -81.2066496 -55.9815085
## 139 140 141 142 143 144
## -28.9066180 26.6882114 88.8513420 135.9963432 110.3244287 98.3749746
## 145 146 147 148 149 150
## 104.1927651 57.2438324 34.3560483 51.4163753 62.0871469 102.1742134
## 151 152 153 154 155 156
## 115.5886114 91.8571490 74.8493395 58.2456571 1.4443511 -10.3959732
## 157 158 159 160 161 162
## -34.0853401 -34.8477166 -41.1870225 -71.1625015 -97.2893636 -86.6236138
## 163 164 165 166 167 168
## -97.9256322 -116.6323228 -75.6753600 -43.7854109 -63.3130468 -78.2767021
## 169 170 171 172 173 174
## -106.8780085 -131.3438858 -113.1327713 -129.5157462 -122.3572612 -106.8655809
## 175 176 177 178 179 180
## -94.1544865 -125.7118683 -107.8625569 -131.5631704 -102.6837570 -103.7311777
## 181 182 183 184 185 186
## -109.0075678 -129.9077939 -100.6835113 -129.6657779 -109.0005020 -88.5667733
## 187 188 189 190 191 192
## -50.9169580 -56.6397917 -21.9340464 -44.3238750 -35.7959609 -3.4875565
## 193 194 195 196 197 198
## 15.8105871 -9.9731550 -6.9839119 -14.0156600 -9.2261644 -2.5210838
## 199 200 201 202 203 204
## 31.3415099 50.5351638 71.3315129 108.0761944 79.5420133 33.5288103
## 205 206 207 208 209 210
## -4.0611845 9.7126412 19.4947916 0.1845762 25.6009758 25.3531766
## 211 212 213 214 215 216
## 47.0822098 32.5389403 13.2655778 -2.4361076 -26.1738947 -19.5852577
## 217 218 219 220 221 222
## -35.5725741 -25.9144264 -36.2276628 -46.1004604 -28.2305516 -71.0238228
## 223 224 225 226 227 228
## -33.2528133 -78.4121254 -83.9972637 -88.0313576 -61.1217263 -33.9119232
## 229 230 231 232 233 234
## -80.1138492 -42.0712671 -60.3759753 -60.2872359 -42.7516725 -63.6687115
## 235 236 237 238 239 240
## -32.0303724 -31.5691248 -33.1001406 -6.9103383 -35.6704511 13.9655184
## 241 242 243 244 245 246
## -7.0651455 12.8894679 41.7735716 83.7673601 97.0447985 90.9364386
## 247 248 249 250 251 252
## 117.0436708 121.2470288 114.0451237 151.6632081 93.7949207 100.3520305
## 253 254 255 256 257 258
## 98.8787510 106.9725834 108.8316053 102.1952407 113.3214993 95.8947674
## 259 260 261 262 263 264
## 111.3481214 111.3393414 31.4938755 11.2431719 -33.3549514 -42.4940337
## 265 266 267 268 269 270
## -11.9078781 -26.4498547 -26.6982475 -49.1462300 -35.4142297 -17.1022903
## 271 272 273 274 275 276
## -62.0353873 -75.9187542 -92.9345614 -67.3451004 -57.4499227 -71.0669274
## 277 278 279 280 281 282
## -95.2828227 -106.3500125 -62.8642128 -47.0413772 -58.6385604 -79.1655618
## 283 284 285 286 287 288
## -77.4430311 -90.7417161 -101.1776738 -81.1180183 -67.1323958 -26.2725043
## 289 290 291 292 293 294
## -16.9002758 -16.8197327 -7.4938436 5.6728464 -6.5250404 -24.5922283
## 295 296 297 298 299 300
## -17.5443565 33.8692648 79.6887738 76.7411639 62.1276778 66.3711272
## 301 302 303 304 305 306
## 80.3892910 103.2368524 113.5093210 124.3178664 120.5872121 148.7913364
## 307 308 309 310 311 312
## 170.6713758 128.7406993 110.6687384 86.5476827 85.7993029 102.7617846
## 313 314 315 316 317 318
## 110.0108287 104.3184657 75.2865080 77.4013079 68.1129219 25.9307189
## 319 320 321 322 323 324
## 24.8773584 22.5704773 21.5079489 48.2350046 21.3235909 -2.7998949
## 325 326 327 328 329 330
## -28.6147523 -46.9380522 -50.7033023 -17.8418280 -22.4094737 -8.6282053
## 331 332 333 334 335 336
## -12.9582795 5.4533145 -48.0516536 -49.3854554 -51.4804530 -44.3979682
## 337 338 339 340 341 342
## -4.4050588 -11.1578846 -40.3930242 -8.2757927 -4.1708865 3.0312284
## 343 344 345 346 347 348
## -9.1948511 5.3571259 25.3198118 29.5265549 58.6581666 70.2916994
## 349 350 351 352 353 354
## 81.6243173 105.0612008 98.8015038 93.1666441 61.2774768 79.5090566
## 355 356 357 358 359 360
## 77.7378484 56.2428133 95.5294256 109.7464431 107.5546921 103.9317380
## 361 362 363 364 365 366
## 122.0513358 113.7015140 85.5641022 94.8729890 84.9217364 66.0685108
## 367 368 369 370 371 372
## 68.6454496 46.2888535 83.4792172 89.3688022 96.5279139 73.9755551
## 373 374 375 376 377 378
## 64.7687383 59.8849744 53.2985000 35.3215966 17.3624197 24.0752905
## 379 380 381 382 383 384
## 77.2737364 49.7921768 13.8205515 3.2374796 4.7531111 -19.1420345
## 385 386 387 388 389 390
## -7.5467576 -16.9386685 -14.7532279 30.8192199 26.6119694 23.3322726
## 391 392 393 394 395 396
## 8.7386887 16.7490077 14.9491508 33.1096089 25.3681342 36.0473420
## 397 398 399 400 401 402
## 54.2410084 76.7943775 61.0960212 101.7692150 109.7963259 117.4093029
## 403 404 405 406 407 408
## 156.6606119 177.6374809 183.2556861 199.5284671 174.2559457 195.8519721
## 409 410 411 412 413 414
## 174.7081311 180.8513014 171.7538240 192.9447516 208.3689367 226.6724946
## 415 416 417 418 419 420
## 214.6019846 183.3621545 169.7319925 154.2075359 113.4402415 113.9344483
## 421 422 423 424 425 426
## 128.3864246 110.7443819 109.8350659 111.9560800 89.7233961 92.8190013
## 427 428 429 430 431 432
## 65.5205748 55.8287225 89.0948209 79.5177601 63.5130770 43.9827485
## 433 434 435 436 437 438
## 76.4482458 72.3550218 67.0843620 80.4266190 63.1989710 78.0503284
## 439 440 441 442 443 444
## 122.3753821 118.5729562 98.7910630 90.9281151 81.4551768 89.9618333
## 445 446 447 448 449 450
## 97.1036751 117.1531973 112.9403018 114.7317245 189.1447389 184.4545147
## 451 452 453 454 455 456
## 166.0084300 192.0168165 221.8667952 222.5538436 190.9321043 188.1357716
## 457 458 459 460 461 462
## 172.0766387 210.8810949 260.9855969 252.3104862 263.2185253 275.6710715
## 463 464 465 466 467 468
## 273.8045108 215.5562524 203.4027388 201.7934596 229.4717186 258.9219239
## 469 470 471 472 473 474
## 219.8017599 201.8511522 179.3778020 199.8654510 202.1286520 196.7796983
## 475 476 477 478 479 480
## 180.3718445 180.1995478 207.0451458 164.1815072 154.3418304 165.9926964
## 481 482 483 484 485 486
## 156.6109438 159.3341912 147.1279771 126.7553044 120.0005996 138.6958912
## 487 488 489 490 491 492
## 143.4101241 108.0228197 143.1792720 159.2367441 147.0819503 129.9027004
## 493 494 495 496 497 498
## 105.2081827 109.5904347 117.5496196 122.6991169 163.9425184 171.0521727
#chemical1
model1.c1 <- dlm(x=as.vector(dataf$X1), y=as.vector(dataf$mortality), q=10)
summary(model1.c1)
##
## Call:
## lm(formula = model.formula, data = design)
##
## Residuals:
## Min 1Q Median 3Q Max
## -247.62 -136.81 10.29 123.43 240.01
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 160.28008 73.78138 2.172 0.0303 *
## x.t 0.06170 1.03202 0.060 0.9524
## x.1 0.12744 1.05214 0.121 0.9036
## x.2 0.22030 1.09015 0.202 0.8399
## x.3 0.12043 1.09395 0.110 0.9124
## x.4 0.12271 1.11476 0.110 0.9124
## x.5 0.22306 1.11401 0.200 0.8414
## x.6 0.12467 1.11382 0.112 0.9109
## x.7 0.12565 1.09305 0.115 0.9085
## x.8 0.15956 1.08662 0.147 0.8833
## x.9 0.07092 1.05232 0.067 0.9463
## x.10 -0.02110 1.03575 -0.020 0.9838
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 145.2 on 486 degrees of freedom
## Multiple R-squared: 0.004279, Adjusted R-squared: -0.01826
## F-statistic: 0.1899 on 11 and 486 DF, p-value: 0.9981
##
## AIC and BIC values for the model:
## AIC BIC
## 1 6385.399 6440.137
checkresiduals(model1.c1)
## 1 2 3 4 5 6
## -241.707607 -243.109962 -244.648658 -241.809855 -243.135078 -242.393354
## 7 8 9 10 11 12
## -240.376199 -242.036307 -241.205730 -243.107096 -244.435117 -243.893069
## 13 14 15 16 17 18
## -243.373263 -245.410050 -245.532661 -246.076783 -247.620538 -246.603628
## 19 20 21 22 23 24
## -245.020807 -243.972870 -242.269183 -240.819006 -240.637417 -239.633138
## 25 26 27 28 29 30
## -235.957558 -232.249299 -231.192134 -227.826895 -226.132129 -223.517645
## 31 32 33 34 35 36
## -220.296983 -216.010295 -211.460938 -210.332169 -207.266866 -202.063001
## 37 38 39 40 41 42
## -199.976681 -197.721036 -193.080959 -195.697644 -194.874592 -193.683107
## 43 44 45 46 47 48
## -194.062050 -195.065990 -193.239460 -192.511112 -195.190833 -191.755051
## 49 50 51 52 53 54
## -192.143821 -192.104145 -192.092660 -189.983417 -188.507222 -189.007978
## 55 56 57 58 59 60
## -188.171122 -187.037512 -187.263141 -185.550802 -181.932425 -183.892040
## 61 62 63 64 65 66
## -185.571251 -186.439206 -187.569105 -188.888228 -190.646542 -190.885953
## 67 68 69 70 71 72
## -192.933098 -196.361827 -196.457459 -194.602667 -195.066419 -196.485921
## 73 74 75 76 77 78
## -194.226974 -194.032488 -191.491656 -188.708697 -188.017697 -186.251151
## 79 80 81 82 83 84
## -179.890548 -177.963479 -175.610289 -169.394043 -166.347924 -163.514246
## 85 86 87 88 89 90
## -159.443357 -153.535980 -150.878738 -149.166845 -146.077885 -144.283246
## 91 92 93 94 95 96
## -143.864930 -141.719173 -140.177819 -140.847512 -141.236090 -141.056024
## 97 98 99 100 101 102
## -142.155981 -141.839772 -142.388548 -140.994028 -141.130968 -142.337726
## 103 104 105 106 107 108
## -141.518684 -140.441259 -140.472151 -139.829070 -140.000373 -138.319667
## 109 110 111 112 113 114
## -139.253025 -139.305191 -139.206683 -140.612000 -138.696213 -139.892801
## 115 116 117 118 119 120
## -140.090157 -140.068954 -140.995821 -141.822565 -141.587145 -143.068214
## 121 122 123 124 125 126
## -141.397029 -140.077380 -141.802096 -139.530372 -137.208771 -135.593034
## 127 128 129 130 131 132
## -133.610598 -131.143729 -128.240519 -126.216804 -123.981962 -120.457121
## 133 134 135 136 137 138
## -117.914104 -115.293525 -111.844544 -109.928619 -107.802674 -103.759569
## 139 140 141 142 143 144
## -100.220867 -101.816462 -100.268385 -95.242475 -95.134891 -95.780899
## 145 146 147 148 149 150
## -90.907081 -89.850443 -91.349866 -87.938730 -87.560793 -86.648335
## 151 152 153 154 155 156
## -84.682066 -83.732016 -83.418075 -81.786998 -82.224679 -84.038323
## 157 158 159 160 161 162
## -81.347749 -81.569960 -82.927819 -82.507069 -83.283348 -83.980174
## 163 164 165 166 167 168
## -85.024793 -83.972521 -85.928361 -87.005380 -89.161744 -88.823517
## 169 170 171 172 173 174
## -88.165888 -90.055108 -88.864919 -88.936978 -88.207600 -86.967150
## 175 176 177 178 179 180
## -85.639004 -82.996036 -80.673793 -80.027040 -78.220425 -75.745381
## 181 182 183 184 185 186
## -75.364330 -72.087981 -71.467105 -71.451618 -68.618419 -67.187829
## 187 188 189 190 191 192
## -64.979494 -60.729505 -57.745514 -56.878933 -54.104328 -49.943196
## 193 194 195 196 197 198
## -46.441153 -43.609364 -40.250175 -41.266461 -40.195318 -37.770646
## 199 200 201 202 203 204
## -37.420161 -36.952939 -35.141299 -36.393237 -37.050493 -34.560707
## 205 206 207 208 209 210
## -35.445392 -33.613637 -32.611631 -34.576646 -33.022922 -31.720386
## 211 212 213 214 215 216
## -33.160233 -32.037619 -31.300872 -32.492439 -33.303660 -31.287912
## 217 218 219 220 221 222
## -31.583067 -34.532919 -34.018610 -35.005979 -36.323365 -37.590688
## 223 224 225 226 227 228
## -38.622943 -37.777874 -37.330498 -36.729041 -34.723768 -34.272087
## 229 230 231 232 233 234
## -34.024457 -32.568275 -29.523362 -28.034390 -25.675868 -23.586562
## 235 236 237 238 239 240
## -22.902647 -19.548371 -14.623281 -12.900319 -13.222337 -8.252571
## 241 242 243 244 245 246
## -6.339807 -6.058185 -2.144757 0.460595 1.938088 5.659977
## 247 248 249 250 251 252
## 6.411807 6.848453 9.148531 11.429171 14.117572 13.945599
## 253 254 255 256 257 258
## 15.400067 16.216068 16.179546 19.967615 22.854585 22.434129
## 259 260 261 262 263 264
## 24.063138 25.748472 25.807806 26.937739 27.392202 26.349142
## 265 266 267 268 269 270
## 25.601281 26.044741 25.165574 23.411984 23.708631 23.014884
## 271 272 273 274 275 276
## 22.189555 21.042982 20.568352 20.033186 19.387419 18.455926
## 277 278 279 280 281 282
## 18.287178 18.495051 18.761782 19.447334 20.783027 22.372669
## 283 284 285 286 287 288
## 22.484112 21.686344 24.327206 27.168906 27.797959 29.990351
## 289 290 291 292 293 294
## 33.498904 35.087891 37.007005 41.237433 44.101312 47.026663
## 295 296 297 298 299 300
## 48.021072 51.147866 53.097315 53.938108 56.411862 56.659320
## 301 302 303 304 305 306
## 54.724842 58.073152 58.825816 59.968876 61.455170 61.640462
## 307 308 309 310 311 312
## 64.711876 66.451206 66.593973 68.084044 71.038678 70.921426
## 313 314 315 316 317 318
## 71.327805 72.110971 72.045989 70.975224 72.044652 72.177394
## 319 320 321 322 323 324
## 73.551364 73.643968 70.973893 70.258415 69.046000 66.419296
## 325 326 327 328 329 330
## 68.199519 68.209204 67.116105 68.082348 68.084043 69.117098
## 331 332 333 334 335 336
## 70.930573 72.628327 72.440492 74.397808 76.687281 77.908762
## 337 338 339 340 341 342
## 78.255835 79.549659 79.746450 82.608036 84.665366 84.613235
## 343 344 345 346 347 348
## 85.668583 89.019029 88.548625 90.911083 95.234619 95.529067
## 349 350 351 352 353 354
## 98.617079 103.126496 107.440452 108.974325 109.654973 111.642717
## 355 356 357 358 359 360
## 113.634614 112.380973 113.188938 114.880779 114.978642 114.078020
## 361 362 363 364 365 366
## 116.799854 119.926679 120.169167 122.011203 123.601249 124.140841
## 367 368 369 370 371 372
## 123.672930 125.443106 127.166257 126.242410 125.312605 126.188149
## 373 374 375 376 377 378
## 125.604067 126.321991 126.884058 124.913077 123.890644 122.346856
## 379 380 381 382 383 384
## 120.710299 119.545983 119.413816 119.550184 119.801193 120.345021
## 385 386 387 388 389 390
## 121.895636 122.921317 124.712660 128.812822 130.177271 131.082292
## 391 392 393 394 395 396
## 133.631237 135.026994 137.043372 138.838345 140.890756 141.694849
## 397 398 399 400 401 402
## 142.926190 143.684770 146.931826 150.966958 152.329341 155.246363
## 403 404 405 406 407 408
## 159.327050 161.484596 163.789648 167.261742 168.986773 170.590094
## 409 410 411 412 413 414
## 172.213226 171.576549 172.607060 174.047214 172.395068 171.698751
## 415 416 417 418 419 420
## 174.466895 175.282536 174.345207 176.052925 176.221119 175.487023
## 421 422 423 424 425 426
## 174.720434 174.218172 175.104246 173.922175 171.504938 170.313520
## 427 428 429 430 431 432
## 169.761755 169.045632 169.712829 169.864909 168.672060 170.146874
## 433 434 435 436 437 438
## 169.905134 169.924429 172.898488 173.588225 173.529363 175.748166
## 439 440 441 442 443 444
## 177.974098 178.170490 176.602281 179.222444 181.920123 180.591079
## 445 446 447 448 449 450
## 184.164461 188.301527 188.908809 194.372287 200.096362 203.789895
## 451 452 453 454 455 456
## 207.796207 211.633559 213.980929 216.995779 219.503874 220.230946
## 457 458 459 460 461 462
## 222.607166 224.645075 225.767827 226.895371 227.221856 228.800971
## 463 464 465 466 467 468
## 229.014771 227.877525 228.836346 230.663465 228.624926 227.427643
## 469 470 471 472 473 474
## 229.278886 228.079408 227.941082 228.359989 228.902517 226.196806
## 475 476 477 478 479 480
## 226.944540 227.491844 225.711728 224.573995 224.591752 224.943760
## 481 482 483 484 485 486
## 222.619397 223.584937 223.780780 222.174632 222.627556 224.436403
## 487 488 489 490 491 492
## 225.266965 226.696767 226.484550 226.807417 226.636800 226.522123
## 493 494 495 496 497 498
## 228.899481 230.714377 231.959976 235.017308 238.179584 240.006664
#chem2
model1.c2 <- dlm(x=as.vector(dataf$X2), y=as.vector(dataf$mortality), q=10)
summary(model1.c2)
##
## Call:
## lm(formula = model.formula, data = design)
##
## Residuals:
## Min 1Q Median 3Q Max
## -253.260 -104.880 1.077 113.113 255.319
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 428.37924 17.13593 24.999 <2e-16 ***
## x.t -5.42914 2.42679 -2.237 0.0257 *
## x.1 -4.31465 2.44531 -1.764 0.0783 .
## x.2 -1.46195 2.59464 -0.563 0.5734
## x.3 -0.21991 2.64517 -0.083 0.9338
## x.4 0.38362 2.66043 0.144 0.8854
## x.5 0.63485 2.65979 0.239 0.8115
## x.6 0.53842 2.66269 0.202 0.8398
## x.7 0.01805 2.64856 0.007 0.9946
## x.8 -1.46525 2.59724 -0.564 0.5729
## x.9 -4.42015 2.44851 -1.805 0.0717 .
## x.10 -5.63263 2.42882 -2.319 0.0208 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 131.4 on 486 degrees of freedom
## Multiple R-squared: 0.1848, Adjusted R-squared: 0.1663
## F-statistic: 10.01 on 11 and 486 DF, p-value: < 2.2e-16
##
## AIC and BIC values for the model:
## AIC BIC
## 1 6285.802 6340.54
checkresiduals(model1.c2)
## 1 2 3 4 5 6
## -209.9881945 -232.7902552 -203.8387637 -218.8249413 -236.7919802 -245.3285501
## 7 8 9 10 11 12
## -247.6496339 -252.5770017 -248.5150022 -218.8721104 -216.5962028 -224.7032445
## 13 14 15 16 17 18
## -226.1531620 -253.2596834 -237.2812101 -241.5297272 -243.5194066 -219.1116587
## 19 20 21 22 23 24
## -211.1211117 -205.7081414 -196.2162861 -159.2298951 -149.9909165 -151.3228120
## 25 26 27 28 29 30
## -176.8992754 -196.2974058 -135.0947429 -131.3681316 -109.3156601 -55.3007704
## 31 32 33 34 35 36
## -68.0628251 -51.1554509 -61.0561422 -109.4304743 -103.8296268 -65.4548562
## 37 38 39 40 41 42
## -41.0639857 -36.0168270 -0.6665881 6.7024160 -42.0977938 3.9059892
## 43 44 45 46 47 48
## -66.2371484 -132.3317164 -109.7171234 -128.9994327 -104.8394083 -90.2848172
## 49 50 51 52 53 54
## -84.7852728 -75.2222641 -137.3600951 -150.7694332 -217.5418074 -207.3569552
## 55 56 57 58 59 60
## -196.6760405 -209.1025999 -212.9332877 -222.7682562 -209.0606193 -219.1308582
## 61 62 63 64 65 66
## -232.3006851 -237.1031334 -219.7000874 -213.6408281 -225.4063371 -232.2632779
## 67 68 69 70 71 72
## -225.7905463 -215.8335217 -221.5400319 -203.4569356 -222.9155048 -184.3527165
## 73 74 75 76 77 78
## -182.2072466 -189.3292379 -177.2312837 -120.0085624 -140.6924543 -153.0541883
## 79 80 81 82 83 84
## -160.3849560 -116.0851321 -123.7622281 -100.8758138 -111.8966819 -100.1557526
## 85 86 87 88 89 90
## -88.4351106 -54.3591692 -109.9538481 -83.0803062 -48.3631864 13.1249322
## 91 92 93 94 95 96
## -38.5314976 -41.2122367 -70.0534773 -77.1121293 -61.6930613 -72.6103126
## 97 98 99 100 101 102
## -106.2037308 -66.9957326 -56.4107316 -77.7293611 -112.3752471 -92.9044066
## 103 104 105 106 107 108
## -100.4830944 -97.9109463 -115.4059789 -153.7021296 -161.0053130 -174.2685116
## 109 110 111 112 113 114
## -173.2543350 -163.7630420 -162.4840592 -166.5019874 -176.7853327 -173.7475650
## 115 116 117 118 119 120
## -175.5961444 -181.7754373 -177.1465272 -163.7896798 -155.2895500 -180.4477761
## 121 122 123 124 125 126
## -178.6238444 -177.9953049 -173.8802283 -175.6294589 -162.8955033 -147.7765284
## 127 128 129 130 131 132
## -132.4170181 -121.4513808 -139.6066979 -126.6810150 -109.8157141 -88.2986065
## 133 134 135 136 137 138
## -101.7974110 -107.7658767 -70.6099922 -40.7625987 -70.8860154 -65.9128220
## 139 140 141 142 143 144
## -19.1195187 -23.4363362 -29.5166189 -15.2017475 -52.8157406 -29.6497917
## 145 146 147 148 149 150
## -8.9960912 -43.2577036 -65.4522767 -30.6181485 -36.2893383 -81.2827991
## 151 152 153 154 155 156
## -68.8930741 -77.2961829 -103.3275600 -81.6342945 -102.8229794 -116.9891218
## 157 158 159 160 161 162
## -114.8104247 -124.8028071 -129.4521267 -144.1610648 -138.1495439 -149.9398337
## 163 164 165 166 167 168
## -130.5510074 -128.1570079 -125.1606299 -128.5330484 -136.6135380 -145.6452774
## 169 170 171 172 173 174
## -150.9655140 -142.9030694 -137.6842848 -141.9087558 -115.3253699 -113.7155985
## 175 176 177 178 179 180
## -118.5858481 -144.7367485 -145.0951684 -137.0914468 -90.2725671 -94.9000611
## 181 182 183 184 185 186
## -74.0265468 -55.0542976 -35.8967605 -59.4743040 -60.7709897 -74.5336413
## 187 188 189 190 191 192
## -75.1385480 -36.3295143 14.6311883 25.2146907 47.5251054 14.6290454
## 193 194 195 196 197 198
## -17.5567563 -15.1119217 3.9236992 2.3084051 39.5489595 67.2417883
## 199 200 201 202 203 204
## 57.0345634 12.9846997 -26.3960795 -57.6258969 -30.2206820 -9.5802171
## 205 206 207 208 209 210
## -23.0973250 -18.0912716 -23.9185305 -40.4369594 -58.7547638 -72.1909833
## 211 212 213 214 215 216
## -78.8764700 -81.1610143 -77.7001943 -112.7969178 -106.8970532 -93.9671011
## 217 218 219 220 221 222
## -88.1870830 -83.4120139 -104.8930309 -111.4937045 -100.6458644 -96.5513912
## 223 224 225 226 227 228
## -97.0615453 -115.9869883 -104.9830019 -88.6846645 -71.9650968 -60.6287389
## 229 230 231 232 233 234
## -74.7576004 -45.7871844 -50.2674391 -68.4174295 -61.4951619 -30.7637186
## 235 236 237 238 239 240
## -29.9457992 -24.1556358 10.7920935 33.3635289 26.0473413 48.1944823
## 241 242 243 244 245 246
## 39.7719225 55.7519028 109.4013933 59.9599667 33.5319012 61.9285473
## 247 248 249 250 251 252
## 126.8435325 142.0800349 103.1133509 98.6120189 83.7253408 86.0213248
## 253 254 255 256 257 258
## 94.5471665 31.7083752 61.2925222 76.1403418 86.5115126 38.1849652
## 259 260 261 262 263 264
## -0.7787071 2.8460324 1.9348245 11.7305187 16.6908787 -24.5324712
## 265 266 267 268 269 270
## -20.3692699 -33.7760850 -39.0249007 -37.5406085 -50.2965691 -33.4294139
## 271 272 273 274 275 276
## -30.8469309 -29.4269252 -43.6114078 -53.7674902 -49.3170568 -40.2452893
## 277 278 279 280 281 282
## -46.2073789 -41.9586980 -20.9386120 -9.6749396 -24.2584989 -11.7406964
## 283 284 285 286 287 288
## -10.4570347 -2.4878244 -11.3131865 8.3350048 -4.7174457 41.2231208
## 289 290 291 292 293 294
## 60.4031013 70.4017123 87.3142150 85.3704508 79.6488986 65.3621861
## 295 296 297 298 299 300
## 92.9960962 118.2774125 102.8271271 146.0609556 156.0451076 143.5081053
## 301 302 303 304 305 306
## 144.2842677 101.5183109 98.3015043 95.1065631 115.1688639 81.1744214
## 307 308 309 310 311 312
## 89.3622452 121.1108539 104.5983007 75.7719766 57.9429176 15.0771369
## 313 314 315 316 317 318
## 33.0006810 28.2202623 18.5818998 23.4253967 30.2798225 17.8529691
## 319 320 321 322 323 324
## 5.7146168 5.4140443 6.5924938 25.7482481 18.7971723 14.2832713
## 325 326 327 328 329 330
## 0.2191319 4.0867184 -8.5816990 2.4535726 16.4435134 19.8384762
## 331 332 333 334 335 336
## 43.7384733 48.4447210 32.2441071 24.5944529 13.3566701 21.0224254
## 337 338 339 340 341 342
## 47.3127094 59.2938845 50.5669672 75.1268315 126.3299729 108.4279664
## 343 344 345 346 347 348
## 118.0432086 104.5002027 120.2943425 147.4597031 162.7031515 143.5422830
## 349 350 351 352 353 354
## 141.9487650 148.9174182 169.5321362 161.2489978 174.5101467 177.2883739
## 355 356 357 358 359 360
## 206.3586791 200.2778320 159.0934339 128.2154117 102.6452944 107.4465810
## 361 362 363 364 365 366
## 118.2481728 105.3217904 110.4897165 113.5588200 114.1830490 84.0868657
## 367 368 369 370 371 372
## 56.8388653 62.9070746 57.1507391 51.7943208 51.5015250 51.2297637
## 373 374 375 376 377 378
## 47.5201374 49.2373302 52.9696740 38.2211561 41.3586587 48.2215940
## 379 380 381 382 383 384
## 65.0199311 61.5127160 61.1150081 61.1170146 58.0945208 55.9213795
## 385 386 387 388 389 390
## 66.0542639 66.2044573 72.5230389 93.8007157 114.2714706 110.5336753
## 391 392 393 394 395 396
## 117.4262777 124.7738515 121.0272461 145.8713249 154.6894194 149.2499998
## 397 398 399 400 401 402
## 184.4926822 208.2010612 197.3065684 158.8696850 155.6021020 174.6119107
## 403 404 405 406 407 408
## 185.4110696 200.8488758 213.8629441 228.0696740 219.6385379 187.5155099
## 409 410 411 412 413 414
## 156.7994331 133.3906608 146.3867163 156.8293565 154.5456048 151.3156790
## 415 416 417 418 419 420
## 150.6667354 121.5560465 116.4767238 122.3440498 114.0188320 113.7085002
## 421 422 423 424 425 426
## 121.0377816 117.1614444 102.7018432 96.8569025 101.8718132 103.8807619
## 427 428 429 430 431 432
## 101.5864983 96.2556889 111.1992996 116.9678250 111.7768712 106.6165714
## 433 434 435 436 437 438
## 101.0621174 110.1123166 117.7941915 113.6625563 110.7917528 117.0654527
## 439 440 441 442 443 444
## 146.9226317 161.4561567 146.6701930 158.2138222 158.0312240 157.7489617
## 445 446 447 448 449 450
## 176.3441544 170.4006661 186.5676528 186.4470579 225.4475520 215.1551294
## 451 452 453 454 455 456
## 227.2281865 254.0061166 235.1189857 235.8351428 230.3778223 216.7961277
## 457 458 459 460 461 462
## 222.2352891 200.4408875 255.3186526 248.4187754 248.2821427 231.7141565
## 463 464 465 466 467 468
## 217.4050744 192.7229436 182.1615573 179.7366678 173.6628765 187.5911162
## 469 470 471 472 473 474
## 207.5326987 170.2405038 161.8518850 169.2445786 157.2040551 142.0110209
## 475 476 477 478 479 480
## 140.0525141 145.7508405 154.4634711 154.5290637 155.8091198 148.9884873
## 481 482 483 484 485 486
## 146.9323980 152.7195437 144.5051349 147.4576015 155.3846221 154.8711854
## 487 488 489 490 491 492
## 161.5463399 160.8844333 170.6959262 188.3453217 196.0092555 181.7248785
## 493 494 495 496 497 498
## 187.8677640 183.2258036 177.4403401 192.6258538 217.9486337 220.7019342
#particle space
model1.part <- dlm(x=as.vector(dataf$X3), y=as.vector(dataf$mortality), q=10)
summary(model1.part)
##
## Call:
## lm(formula = model.formula, data = design)
##
## Residuals:
## Min 1Q Median 3Q Max
## -263.86 -63.66 11.95 76.78 181.16
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 719.374 17.893 40.203 < 2e-16 ***
## x.t -20.379 4.777 -4.266 2.39e-05 ***
## x.1 -18.325 4.810 -3.810 0.000157 ***
## x.2 -13.553 4.885 -2.774 0.005747 **
## x.3 -15.384 4.864 -3.163 0.001661 **
## x.4 -13.032 4.903 -2.658 0.008126 **
## x.5 -11.305 4.923 -2.297 0.022065 *
## x.6 -12.551 4.902 -2.561 0.010751 *
## x.7 -13.555 4.868 -2.784 0.005571 **
## x.8 -12.030 4.885 -2.462 0.014148 *
## x.9 -14.863 4.803 -3.094 0.002086 **
## x.10 -16.192 4.774 -3.391 0.000752 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 93.12 on 486 degrees of freedom
## Multiple R-squared: 0.5905, Adjusted R-squared: 0.5812
## F-statistic: 63.71 on 11 and 486 DF, p-value: < 2.2e-16
##
## AIC and BIC values for the model:
## AIC BIC
## 1 5942.914 5997.652
checkresiduals(model1.part)
## 1 2 3 4 5 6
## -251.1089667 -263.8642991 -211.3376274 -215.0273046 -214.8033521 -227.6847095
## 7 8 9 10 11 12
## -253.7116178 -244.9425670 -232.6368604 -149.7716057 -136.4534390 -115.1587036
## 13 14 15 16 17 18
## -91.0679420 -98.9333576 -52.2685786 -57.3521835 -41.4061256 4.1375118
## 19 20 21 22 23 24
## 19.8090726 17.2487898 10.7132418 20.7386341 11.5104162 -0.3629868
## 25 26 27 28 29 30
## -17.4537538 -80.5947698 -54.0560633 -63.9811758 -82.3041537 -77.8017437
## 31 32 33 34 35 36
## -104.8487502 -133.1787262 -152.1382413 -186.1357856 -199.1821690 -199.2515678
## 37 38 39 40 41 42
## -172.5672209 -192.8760742 -185.7342358 -170.8506565 -172.0418286 -130.0022939
## 43 44 45 46 47 48
## -160.9817327 -169.1660982 -134.6361060 -132.8234062 -113.9580230 -108.3776224
## 49 50 51 52 53 54
## -80.5774312 -78.7206649 -117.7860029 -146.2419839 -180.0809135 -169.0826853
## 55 56 57 58 59 60
## -158.5799195 -162.1756560 -174.8357723 -200.1275696 -202.7068437 -187.6806515
## 61 62 63 64 65 66
## -189.0272987 -187.3150468 -133.4839948 -126.5817556 -113.0672132 -94.5010224
## 67 68 69 70 71 72
## -29.4487327 -14.8737105 -7.8350884 41.0424805 4.2661380 58.3637597
## 73 74 75 76 77 78
## 64.3539541 33.0953219 13.6798878 76.0542543 81.5352408 20.4929508
## 79 80 81 82 83 84
## -8.2449011 45.5236255 37.0291627 20.8761838 -9.6107655 -19.6834660
## 85 86 87 88 89 90
## -34.6326556 10.4894013 -62.6845095 -82.7064040 -18.5618475 45.7878178
## 91 92 93 94 95 96
## 19.9954096 -9.4886964 -0.3938329 -18.6952430 25.2323463 65.7580293
## 97 98 99 100 101 102
## 69.5022540 120.6194892 139.3414700 105.0023557 70.6084157 56.6231570
## 103 104 105 106 107 108
## 52.3825363 30.8449790 79.3735003 31.1895149 30.1536093 -9.1383300
## 109 110 111 112 113 114
## -54.2827263 -59.4739182 -79.3666718 -83.9956879 -112.0359826 -66.1187264
## 115 116 117 118 119 120
## -14.0625364 -44.7614407 -45.8379287 -32.0370769 -11.5535344 -21.4827147
## 121 122 123 124 125 126
## -26.8586067 -20.9645887 -25.5956123 -15.1180799 -16.3262990 -62.4427306
## 127 128 129 130 131 132
## -54.7425499 -50.2796597 -107.2471128 -87.3470544 -84.3895856 -66.5035170
## 133 134 135 136 137 138
## -68.5953079 -79.8943940 -66.8617192 -66.4588217 -82.1458531 -95.0828805
## 139 140 141 142 143 144
## -72.8562869 -51.9277953 -85.3986234 -69.0532668 -105.1744423 -106.8055732
## 145 146 147 148 149 150
## -109.0196258 -129.0687041 -137.6923437 -132.0309133 -144.0765882 -167.3905097
## 151 152 153 154 155 156
## -176.8168714 -178.7437586 -196.4651024 -188.9266306 -179.8145321 -176.5510401
## 157 158 159 160 161 162
## -162.6570608 -165.4638249 -147.4936832 -143.8837648 -111.1786168 -111.1088852
## 163 164 165 166 167 168
## -77.3913934 -88.6510168 -46.1548513 -5.7155419 21.4259253 18.4720485
## 169 170 171 172 173 174
## 46.9544197 70.2801699 85.2337766 86.8083079 147.9217138 133.9549952
## 175 176 177 178 179 180
## 138.7918447 111.9577168 85.5521967 79.7178000 101.1523246 94.4823123
## 181 182 183 184 185 186
## 92.9327808 94.4086063 98.5689709 47.4451812 81.8703356 55.5182195
## 187 188 189 190 191 192
## 51.1634638 72.1619682 56.0498828 54.2471370 56.0509817 20.3452644
## 193 194 195 196 197 198
## -34.5120172 -47.0351881 -40.9557516 -77.3627167 -45.8972246 -45.7558593
## 199 200 201 202 203 204
## -47.5812747 -68.2810584 -84.2119174 -105.5131036 -79.6324299 -75.6629441
## 205 206 207 208 209 210
## -80.1759450 -83.5421214 -99.6304185 -105.7856106 -107.0071231 -121.1015667
## 211 212 213 214 215 216
## -121.4813366 -131.5847656 -116.8195330 -138.6834553 -122.7439747 -114.7840556
## 217 218 219 220 221 222
## -70.0790799 -42.8053689 -67.7609877 -67.4961215 -52.2403170 -55.1679440
## 223 224 225 226 227 228
## -45.1405851 -56.2154170 -58.9166570 -45.8991658 -20.3400656 -23.2069563
## 229 230 231 232 233 234
## -45.8886078 4.3003009 -1.5094107 -25.4398991 -24.1738000 1.4292449
## 235 236 237 238 239 240
## -3.6007735 -0.7099392 5.5369084 12.6778544 13.9833885 51.0955964
## 241 242 243 244 245 246
## 20.2064211 30.6665311 68.9243769 43.6042486 28.3939280 46.2850729
## 247 248 249 250 251 252
## 85.7456518 128.4149746 91.0495468 87.2960178 62.8137289 50.5170650
## 253 254 255 256 257 258
## 105.0391315 63.5661319 50.7750096 58.2314517 48.6709913 -0.6306388
## 259 260 261 262 263 264
## -44.2775594 -19.8568842 -39.0217869 -32.0749845 -9.7917565 -77.3920671
## 265 266 267 268 269 270
## -69.5620657 -57.0008295 -66.5532576 -66.1498651 -69.8245393 -35.8279263
## 271 272 273 274 275 276
## -40.3044048 -17.0651123 -26.9917946 -23.5635412 -10.0636022 20.3763529
## 277 278 279 280 281 282
## 14.2154963 15.2757634 32.4496350 68.6424176 48.3351503 66.2149997
## 283 284 285 286 287 288
## 67.8144337 94.9153034 67.1190773 67.5148006 28.5344339 54.4460703
## 289 290 291 292 293 294
## 80.5839916 88.2135506 76.5507755 71.7357421 88.6309813 78.9172611
## 295 296 297 298 299 300
## 92.1469138 114.1185760 99.3557959 136.2806074 134.4300241 115.2513218
## 301 302 303 304 305 306
## 131.5223330 96.0873099 78.4914226 52.9151048 60.4531758 7.2239323
## 307 308 309 310 311 312
## 13.5840492 52.7307318 13.7788790 -12.0655241 -30.8161856 -82.1002561
## 313 314 315 316 317 318
## -78.7152171 -49.0088845 -52.2007204 -36.0386683 -33.7582506 -59.5621879
## 319 320 321 322 323 324
## -90.1717980 -97.3018467 -86.4103588 -42.2306336 -26.9382216 -5.6298502
## 325 326 327 328 329 330
## -30.7242754 -9.6167732 -36.1013108 -26.7199932 -19.1090085 -19.9446432
## 331 332 333 334 335 336
## 23.5595964 37.1326038 0.8339304 -11.1404103 -34.3262918 -34.4480047
## 337 338 339 340 341 342
## -27.2572520 -12.2833017 -32.6140897 -12.0625789 19.1405222 -5.2160942
## 343 344 345 346 347 348
## 10.8466480 28.8524848 38.9666023 66.0000652 88.7732445 79.0310451
## 349 350 351 352 353 354
## 76.5496205 78.3436854 81.9689738 101.6010858 102.4395595 101.5798373
## 355 356 357 358 359 360
## 125.3032084 141.5681456 94.4269073 73.9757388 68.6319384 56.1809460
## 361 362 363 364 365 366
## 59.9898736 50.5057063 48.0495668 48.8891913 69.9738419 47.3123086
## 367 368 369 370 371 372
## -8.1819699 -7.4455645 6.0202023 1.9460809 44.6129884 48.4286213
## 373 374 375 376 377 378
## 48.2185083 70.1692618 74.2789935 57.6931886 73.1912686 98.8887121
## 379 380 381 382 383 384
## 130.6446412 138.7502892 159.2877151 132.4732470 137.3337449 137.0115923
## 385 386 387 388 389 390
## 148.6866239 122.6424508 110.2641468 110.5219435 108.3533186 109.2557456
## 391 392 393 394 395 396
## 102.0973642 97.8299322 79.4450669 97.0745257 112.6289640 85.2823962
## 397 398 399 400 401 402
## 145.1137810 181.1614317 179.1809027 140.0051597 115.0153045 95.1519879
## 403 404 405 406 407 408
## 72.1972391 83.1602345 91.5874710 87.5500718 71.7498232 36.0406079
## 409 410 411 412 413 414
## 10.2800935 -26.0484580 -13.2756629 -0.2504821 11.7659814 18.7049880
## 415 416 417 418 419 420
## 30.2066442 11.5991093 13.8227506 31.5288295 34.3692422 54.4355516
## 421 422 423 424 425 426
## 75.9294906 79.6803845 75.2903501 83.9984485 87.9943993 97.5453124
## 427 428 429 430 431 432
## 98.4563732 88.4590243 118.4382533 132.7767398 116.1473552 115.1013761
## 433 434 435 436 437 438
## 95.2310773 105.2945010 101.1802145 116.3874672 96.7917050 90.2404692
## 439 440 441 442 443 444
## 92.5984178 97.2948463 80.2578167 94.9717118 84.9770546 91.0345053
## 445 446 447 448 449 450
## 97.8274876 91.4357111 81.0725611 72.4204580 94.2029100 95.2806407
## 451 452 453 454 455 456
## 84.7579309 115.7535580 100.4549598 100.1156190 88.0725024 71.3632186
## 457 458 459 460 461 462
## 65.3479171 47.0915150 106.2141420 89.7210549 76.8629899 72.9213085
## 463 464 465 466 467 468
## 61.5234540 37.2156448 28.6813417 31.4089815 25.8302962 32.0941413
## 469 470 471 472 473 474
## 53.2624263 13.0973958 3.3791553 7.3617172 12.1354495 -5.8010303
## 475 476 477 478 479 480
## -13.1170560 -5.0578490 8.2328925 11.2345539 22.7323273 21.8574786
## 481 482 483 484 485 486
## 24.9499766 41.0606609 56.9257360 63.9562460 78.7920862 86.3545800
## 487 488 489 490 491 492
## 97.8657151 94.4608478 110.3441804 129.7288846 130.5962367 121.9210835
## 493 494 495 496 497 498
## 129.2408448 120.4898468 103.1299711 103.0238337 106.2878889 93.5768932
Thus multicolinearity is low in model1
finiteDLMauto(x= as.vector(dataf$temp)+as.vector(dataf$X1)+as.vector(dataf$X2)+as.vector(dataf$X3), y= as.vector(dataf$mortality),q.min = 1,q.max =10, k.order =1, model.type ="poly", error.type="AIC", trace= TRUE)
## q - k MASE AIC BIC GMRAE MBRAE R.Adj.Sq Ljung-Box
## 10 10 - 1 79.94845 5990.046 6006.888 54.60191 1.00430 0.53139 0
## 9 9 - 1 81.50397 6024.690 6041.541 52.59946 1.01309 0.51162 0
## 8 8 - 1 83.22529 6057.642 6074.501 52.63208 0.99098 0.49282 0
## 7 7 - 1 85.09156 6089.884 6106.750 55.00458 0.97363 0.47412 0
## 6 6 - 1 87.18937 6122.132 6139.006 55.69802 0.97473 0.45481 0
## 5 5 - 1 89.46396 6154.815 6171.697 60.47708 1.00926 0.43439 0
## 4 4 - 1 91.57364 6189.154 6206.045 58.50551 0.96670 0.41136 0
## 3 3 - 1 93.80353 6224.082 6240.980 63.56677 0.99422 0.38678 0
## 2 2 - 1 97.79697 6270.098 6287.005 67.40276 0.99672 0.34713 0
## 1 1 - 1 102.11181 6320.577 6337.491 71.45618 0.99780 0.29896 0
#Since partcle space and chem1 have highest correlation
finiteDLMauto(x= as.vector(dataf$X1), y= as.vector(dataf$mortality),q.min = 1,q.max =10, k.order =1, model.type ="poly", error.type="AIC", trace= TRUE)
## q - k MASE AIC BIC GMRAE MBRAE R.Adj.Sq Ljung-Box
## 10 10 - 1 124.2686 6367.430 6384.272 94.96389 0.99841 0.00020 0
## 9 9 - 1 124.5120 6382.053 6398.904 95.40471 0.99967 0.00050 0
## 8 8 - 1 124.7485 6396.690 6413.548 95.19124 0.99577 0.00079 0
## 7 7 - 1 124.9915 6411.377 6428.243 95.11694 1.06490 0.00098 0
## 6 6 - 1 125.2432 6426.110 6442.984 95.88065 1.00492 0.00109 0
## 5 5 - 1 125.4861 6440.861 6457.743 95.92584 1.00078 0.00117 0
## 4 4 - 1 125.7361 6455.690 6472.580 95.88237 0.99756 0.00110 0
## 3 3 - 1 125.9870 6470.544 6487.442 96.13404 1.01179 0.00099 0
## 2 2 - 1 126.2369 6485.400 6502.306 96.13934 0.99961 0.00089 0
## 1 1 - 1 126.5063 6500.374 6517.288 95.91392 0.99990 0.00056 0
finiteDLMauto(x= as.vector(dataf$X3), y= as.vector(dataf$mortality),q.min = 1,q.max =10, k.order =1, model.type ="poly", error.type="AIC", trace= TRUE)
## q - k MASE AIC BIC GMRAE MBRAE R.Adj.Sq Ljung-Box
## 10 10 - 1 76.09484 5927.164 5944.007 53.25451 1.01353 0.58698 0
## 9 9 - 1 77.07088 5956.737 5973.587 52.51940 1.01040 0.57380 0
## 8 8 - 1 78.30526 5987.006 6003.864 50.91376 1.01841 0.55964 0
## 7 7 - 1 80.12804 6016.111 6032.978 57.26881 0.99311 0.54612 0
## 6 6 - 1 81.83584 6048.052 6064.927 58.29537 1.00044 0.52961 0
## 5 5 - 1 83.61434 6081.504 6098.386 59.21895 0.98981 0.51110 0
## 4 4 - 1 85.63026 6117.644 6134.534 60.55434 1.01811 0.48922 0
## 3 3 - 1 88.48603 6159.437 6176.336 63.54922 1.01534 0.46046 0
## 2 2 - 1 92.83973 6213.661 6230.567 66.36473 0.98818 0.41604 0
## 1 1 - 1 98.26132 6274.512 6291.427 72.52020 0.97310 0.35984 0
From the VIF values, it is obvious that the estimates of the finite DLM coefficients are suffering from the multicollinearity. To deal with this issue, we can use the restricted leastsquares method to find parameter estimates. In this approach, some restrictions are placed on the model parameters to reduce the variances of the estimators. In the context of DLMs,we translate the pattern of time effects into the restrictions on parameters. In the nextsection, we will use polynomial curves to restrict lag weights. According to the significance tests of model coefficients obtained from the summary, all lag weights of predictors are not statistically significant at 5% level. Following this inference, the adjusted R2 is reported to be about 8% which is very low. F-test of the overall significance of the model reports the model is not statistically significant at 5% level (p-value > 0.05). Therefore, we can conclude that the model is not a good fit to the data. VIF values are reported > 10 so the effect of multicollinearity is high. The residualcheck function was created to apply a diagnostic check in a dynamic way. It displays residual analysis plots as well as performs the Breusch-Godfrey test of serial correlation and the Shapiro-Wilk normality test of the residuals. From looking at the diagnostic check plots in Figure 14, we can observe that the residuals are not randomly distributed and clearly have a trend. ACF plot shows that there is serial correlation in the residuals, the Beusch-Godfrey test supports that at 5% level of significance. The histogram and Shapiro-Wilk (p-value < 0.05) test report that the normality of the residuals does not hold. Overall, we can conclude that the finite DLM of lag 10 is not appropriate for further analysis.
##Polynomial Distributed Lags model To reduce the harmful effect of multicollinearity, we will impose a polynomial shape on thelag distribution. Suppose, lag weights follow a smooth polynomial pattern. Because this idea first introduced by Shirley Almon, the resulting model is called Almon Distributed LagModel or Polynomial Distributed Lag model.
To deal with multicollinearity problem in the finite DLM,we will attempt to use polynomial curves to restrict lag weights. We specify the optimal lag length using a function that fits finite DLMs for a range of lag lengths from 1 to 10 and orders the fitted models according to their AIC values.
Model2.AllIndexes <- polyDlm(x= as.vector(dataf$temp)+as.vector(dataf$X1)+as.vector(dataf$X2)+as.vector(dataf$X3), y= as.vector(dataf$mortality),q=10,k=1, show.beta = T)
## Estimates and t-tests for beta coefficients:
## Estimate Std. Error t value P(>|t|)
## beta.0 -1.21 0.1630 -7.43 4.84e-13
## beta.1 -1.21 0.1340 -9.03 3.98e-18
## beta.2 -1.20 0.1060 -11.40 7.20e-27
## beta.3 -1.20 0.0797 -15.10 2.04e-42
## beta.4 -1.20 0.0590 -20.30 6.02e-67
## beta.5 -1.20 0.0503 -23.80 1.37e-83
## beta.6 -1.19 0.0591 -20.20 2.98e-66
## beta.7 -1.19 0.0799 -14.90 1.34e-41
## beta.8 -1.19 0.1060 -11.20 3.94e-26
## beta.9 -1.18 0.1340 -8.84 1.72e-17
## beta.10 -1.18 0.1630 -7.24 1.71e-12
#Model2.AllIndexes
summary(Model2.AllIndexes)
##
## Call:
## "Y ~ (Intercept) + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -246.207 -63.823 -6.922 75.755 194.858
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.545e+03 9.620e+01 26.454 < 2e-16 ***
## z.t0 -1.210e+00 1.629e-01 -7.432 4.75e-13 ***
## z.t1 2.875e-03 3.101e-02 0.093 0.926
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 98.51 on 495 degrees of freedom
## Multiple R-squared: 0.5333, Adjusted R-squared: 0.5314
## F-statistic: 282.8 on 2 and 495 DF, p-value: < 2.2e-16
vif(Model2.AllIndexes$model)
## z.t0 z.t1
## 10.48814 10.48814
residualcheck(Model2.AllIndexes$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.98345, p-value = 1.913e-05
checkresiduals(Model2.AllIndexes$model)
##
## Breusch-Godfrey test for serial correlation of order up to 10
##
## data: Residuals
## LM test = 479.75, df = 10, p-value < 2.2e-16
vif(Model2.AllIndexes$model)>10
## z.t0 z.t1
## TRUE TRUE
pmodel1 = polyDlm(x = as.vector(dataf$temp) , y = as.vector(dataf$mortality),q=2,k = 2 , show.beta = TRUE)
## Estimates and t-tests for beta coefficients:
## Estimate Std. Error t value P(>|t|)
## beta.0 -2.42 0.995 -2.43 0.01560
## beta.1 -2.65 0.988 -2.69 0.00748
## beta.2 -2.81 0.993 -2.83 0.00483
model2.c1 =polyDlm(x=as.vector(dataf$X1),y= as.vector(dataf$mortality),q=10,k=1, show.beta = T)
## Estimates and t-tests for beta coefficients:
## Estimate Std. Error t value P(>|t|)
## beta.0 0.1600 0.3110 0.516 0.606
## beta.1 0.1530 0.2540 0.603 0.547
## beta.2 0.1460 0.1990 0.735 0.463
## beta.3 0.1390 0.1470 0.944 0.346
## beta.4 0.1320 0.1050 1.260 0.209
## beta.5 0.1240 0.0863 1.440 0.150
## beta.6 0.1170 0.1050 1.120 0.265
## beta.7 0.1100 0.1480 0.745 0.456
## beta.8 0.1030 0.1990 0.516 0.606
## beta.9 0.0958 0.2550 0.376 0.707
## beta.10 0.0886 0.3110 0.285 0.776
model2.p =polyDlm(x=as.vector(dataf$X3),y= as.vector(dataf$mortality),q=10,k=1, show.beta = T)
## Estimates and t-tests for beta coefficients:
## Estimate Std. Error t value P(>|t|)
## beta.0 -16.6 2.190 -7.57 1.84e-13
## beta.1 -16.2 1.780 -9.09 2.49e-18
## beta.2 -15.8 1.380 -11.40 5.91e-27
## beta.3 -15.4 1.010 -15.30 1.82e-43
## beta.4 -15.0 0.691 -21.70 8.60e-74
## beta.5 -14.6 0.550 -26.60 6.32e-97
## beta.6 -14.2 0.700 -20.30 5.96e-67
## beta.7 -13.8 1.020 -13.60 8.79e-36
## beta.8 -13.4 1.400 -9.62 3.61e-20
## beta.9 -13.0 1.800 -7.26 1.51e-12
## beta.10 -12.7 2.210 -5.74 1.69e-08
summary(model2.c1, diagnostics=T)
##
## Call:
## "Y ~ (Intercept) + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -246.30 -136.53 10.72 122.59 241.65
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 157.776535 70.798238 2.229 0.0263 *
## z.t0 0.160321 0.310712 0.516 0.6061
## z.t1 -0.007169 0.059776 -0.120 0.9046
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 143.9 on 495 degrees of freedom
## Multiple R-squared: 0.004219, Adjusted R-squared: 0.0001951
## F-statistic: 1.049 on 2 and 495 DF, p-value: 0.3512
#p value is large adjusted r square is lower
summary(model2.p, diagnostics=T)
##
## Call:
## "Y ~ (Intercept) + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -264.69 -63.04 11.17 78.56 177.80
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 718.3915 17.7578 40.455 < 2e-16 ***
## z.t0 -16.5867 2.1901 -7.573 1.8e-13 ***
## z.t1 0.3932 0.4256 0.924 0.356
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 92.48 on 495 degrees of freedom
## Multiple R-squared: 0.5886, Adjusted R-squared: 0.587
## F-statistic: 354.2 on 2 and 495 DF, p-value: < 2.2e-16
# pvalue is lowest and adjusted r square is acceptable low
vif(model2.c1$model)>10
## z.t0 z.t1
## TRUE TRUE
vif(model2.p$model)>10
## z.t0 z.t1
## TRUE TRUE
Like in the finite DLM fitting, the lowest AIC and BIC measures in the given range are for q = 10. We set the order of polynomial to 1 as it minimises the information criteria. According to the model summary, all lag weights are significant at 5% level except lag 6 (p-value > 0.05). The adjusted R2 = 17.5% is slightly better than the finite DLM but still very low. The overall significance test reports the model is statistically significant at 5% level. VIF values are > 10 and suggest there is still multicollinearity effect on this model. Diagnostic checking in Figure 15 shows that the residuals are not randomly spread. There are a lot of highly significant lags in the ACF plot, so there is autocorrelation present in the residuals. That is also supported by Beusch-Godfrey test at 5% level of significance. The normality of the residuals is also violated, as observed from the histogram and Shapiro-Wilk normality test report (p-value < 0.05). We conclude that the polynomial DLM of lag 10 is not appropriate for further analysis.
One way to deal with this infinite DLM is to use Koyck transformation
Model3.AllIndexes <- koyckDlm(x=as.vector(dataf$temp)+as.vector(dataf$X1)+as.vector(dataf$X2)+as.vector(dataf$X3), y= as.vector(dataf$mortality))
Model3.AllIndexes
## $model
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Coefficients:
## (Intercept) Y.1 X.t
## 1.00e+00 1.00e+00 4.33e-16
##
##
## $geometric.coefficients
## alpha beta phi
## Geometric coefficients: -2.2518e+15 4.329597e-16 1
##
## $call
## koyckDlm.default(x = as.vector(dataf$temp) + as.vector(dataf$X1) +
## as.vector(dataf$X2) + as.vector(dataf$X3), y = as.vector(dataf$mortality))
##
## attr(,"class")
## [1] "koyckDlm" "dLagM"
summary(Model3.AllIndexes)
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.684e-14 0.000e+00 5.684e-14 8.527e-14 1.403e-13
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.000e+00 3.356e-13 2.979e+12 <2e-16 ***
## Y.1 1.000e+00 7.533e-17 1.328e+16 <2e-16 ***
## X.t 4.330e-16 1.826e-15 2.370e-01 0.813
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.958e-14 on 504 degrees of freedom
## Multiple R-Squared: 1, Adjusted R-squared: 1
## Wald test: 1.122e+33 on 2 and 504 DF, p-value: < 2.2e-16
##
## Diagnostic tests:
## NULL
##
## alpha beta phi
## Geometric coefficients: -2.2518e+15 4.329597e-16 1
vif(Model3.AllIndexes$model)
## Y.1 X.t
## 12.72753 12.72753
residualcheck(Model3.AllIndexes$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.93121, p-value = 1.64e-14
checkresiduals(Model3.AllIndexes$model)
vif(Model3.AllIndexes$model)>10
## Y.1 X.t
## TRUE TRUE
#residuals presents strange vague plot hence we can say multicolinearity exist.
model3.c1 =koyckDlm(x=as.vector(dataf$X1),y= as.vector(dataf$mortality))
model3.c1
## $model
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Coefficients:
## (Intercept) Y.1 X.t
## 1.000e+00 1.000e+00 -3.021e-16
##
##
## $geometric.coefficients
## alpha beta phi
## Geometric coefficients: 4.5036e+15 -3.021027e-16 1
##
## $call
## koyckDlm.default(x = as.vector(dataf$X1), y = as.vector(dataf$mortality))
##
## attr(,"class")
## [1] "koyckDlm" "dLagM"
model3.p =koyckDlm(x=as.vector(dataf$X3),y= as.vector(dataf$mortality))
model3.p
## $model
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Coefficients:
## (Intercept) Y.1 X.t
## 1.000e+00 1.000e+00 -1.874e-14
##
##
## $geometric.coefficients
## alpha beta phi
## Geometric coefficients: 1.5012e+15 -1.874194e-14 1
##
## $call
## koyckDlm.default(x = as.vector(dataf$X3), y = as.vector(dataf$mortality))
##
## attr(,"class")
## [1] "koyckDlm" "dLagM"
summary(model3.c1, diagnostics=T)
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.837e-14 -5.684e-14 -5.684e-14 -2.842e-14 5.684e-14
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.000e+00 3.176e-14 3.148e+13 <2e-16 ***
## Y.1 1.000e+00 1.590e-17 6.291e+16 <2e-16 ***
## X.t -3.021e-16 4.285e-16 -7.050e-01 0.481
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.213e-14 on 504 degrees of freedom
## Multiple R-Squared: 1, Adjusted R-squared: 1
## Wald test: 1.998e+33 on 2 and 504 DF, p-value: < 2.2e-16
##
## Diagnostic tests:
## df1 df2 statistic p-value
## Weak instruments 1 504 284.38450569 6.381405e-51
## Wu-Hausman 1 503 0.03198651 8.581293e-01
##
## alpha beta phi
## Geometric coefficients: 4.5036e+15 -3.021027e-16 1
#p value is smaller adjusted r square is higher
summary(model3.p, diagnostics=T)
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.545e-13 -5.684e-14 5.684e-14 1.137e-13 2.274e-13
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.000e+00 1.234e-13 8.106e+12 <2e-16 ***
## Y.1 1.000e+00 1.223e-16 8.179e+15 <2e-16 ***
## X.t -1.874e-14 3.275e-14 -5.720e-01 0.567
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.116e-13 on 504 degrees of freedom
## Multiple R-Squared: 1, Adjusted R-squared: 1
## Wald test: 4.363e+32 on 2 and 504 DF, p-value: < 2.2e-16
##
## Diagnostic tests:
## df1 df2 statistic p-value
## Weak instruments 1 504 14.335197 0.0001714073
## Wu-Hausman 1 503 1.246699 0.2647170467
##
## alpha beta phi
## Geometric coefficients: 1.5012e+15 -1.874194e-14 1
# pvalue is lowest and adjusted r square is high
vif(model3.c1$model)>10
## Y.1 X.t
## FALSE FALSE
#chemical 1 has no multicolinearity since values lie below lag value 10
vif(model3.p$model)>10
## Y.1 X.t
## TRUE TRUE
#particle size contains multicolinearity
#changed attribute of model to obtain aic value
attr(model3.c1$model, "class") ="lm"
AIC(model3.c1$model)
## [1] -29569.36
The AIC measure is reported to be -29569.36 which is lower than the finite and polynomial DLMs. From the residual analysis in Figure 16, it is observed that the errors are spread randomly as desired. There are no significant lags in the ACF plot which suggests there is no serial correlation in the residuals. However, the error terms are not perfectly normal. The histogram of the residuals seems left-skewed, and the Shapiro-Wilk normality test suggests not normal residuals (p-value < 0.05).
Autoregressive DLMs are useful when we cannot find suitable solutions with neither polynomial nor Kyock DLMs. Actually, the autoregressive DLM is a flexible andparsimonious infinite DLM.
We attempt to fit autoregressive DLMs in order to find a more suitable model than the Koyck model.
For specifying the parameters of ARDL(p,q), we use a loop that fits autoregressive DLMs for a range of lag lengths and orders of the AR process and fit the models that minimise the infrormation criteria. Based on the information criteria, we select the following models: ARDL(1,5) ARDL(2,5) ARDL(3,5) ARDL(4,5) ARDL(5,5)
#for(i in 1:5){
# for (j in 1:5) {
# model4 = ardlDlm(formula = mortality ~ temp +x3, data = data.frame(dataf), p= i, q=j)
# cat("p= ", i, "q= ", j,"AIC =", AIC(model4$model), "BIC =", BIC(model4$model),"Mase=", MASE(model4)$MASE, "\n")
# }
#}
for(i in 1:5){
for (j in 1:5) {
model4.allIndexes = ardlDlm(formula = mortality ~ temp + X3, data = data.frame(dataf), p= i, q=j)
cat("p= ", i, "q= ", j,"AIC =", AIC(model4.allIndexes$model), "BIC =", BIC(model4.allIndexes$model),"Mase=", MASE(model4.allIndexes)$MASE,"\n")
}
}
## p= 1 q= 1 AIC = -28835.99 BIC = -28806.39 Mase= 1.2438e-14
## p= 1 q= 2 AIC = -29003.22 BIC = -28973.63 Mase= 1.794261e-14
## p= 1 q= 3 AIC = -28412.27 BIC = -28382.69 Mase= 1.975098e-14
## p= 1 q= 4 AIC = -29522.48 BIC = -29492.92 Mase= 5.073543e-15
## p= 1 q= 5 AIC = -28709.65 BIC = -28680.1 Mase= 1.57453e-14
## p= 2 q= 1 AIC = -29487.23 BIC = -29449.19 Mase= 1.606444e-14
## p= 2 q= 2 AIC = -29487.23 BIC = -29449.19 Mase= 1.606444e-14
## p= 2 q= 3 AIC = -30338.17 BIC = -30300.15 Mase= 2.474588e-15
## p= 2 q= 4 AIC = -31396.88 BIC = -31358.88 Mase= 1.13137e-15
## p= 2 q= 5 AIC = -27996.53 BIC = -27958.54 Mase= 3.00056e-14
## p= 3 q= 1 AIC = -28584.12 BIC = -28537.65 Mase= 1.589576e-14
## p= 3 q= 2 AIC = -28584.12 BIC = -28537.65 Mase= 1.589576e-14
## p= 3 q= 3 AIC = -28584.12 BIC = -28537.65 Mase= 1.589576e-14
## p= 3 q= 4 AIC = -28111.5 BIC = -28065.05 Mase= 2.685154e-14
## p= 3 q= 5 AIC = -28562.62 BIC = -28516.2 Mase= 1.895016e-14
## p= 4 q= 1 AIC = -32376.74 BIC = -32321.84 Mase= 1.867994e-16
## p= 4 q= 2 AIC = -32376.74 BIC = -32321.84 Mase= 1.867994e-16
## p= 4 q= 3 AIC = -32376.74 BIC = -32321.84 Mase= 1.867994e-16
## p= 4 q= 4 AIC = -32376.74 BIC = -32321.84 Mase= 1.867994e-16
## p= 4 q= 5 AIC = -29504.13 BIC = -29449.26 Mase= 1.453575e-14
## p= 5 q= 1 AIC = -29423.44 BIC = -29360.13 Mase= 1.83604e-14
## p= 5 q= 2 AIC = -29423.44 BIC = -29360.13 Mase= 1.83604e-14
## p= 5 q= 3 AIC = -29423.44 BIC = -29360.13 Mase= 1.83604e-14
## p= 5 q= 4 AIC = -29423.44 BIC = -29360.13 Mase= 1.83604e-14
## p= 5 q= 5 AIC = -29423.44 BIC = -29360.13 Mase= 1.83604e-14
for (i in c(3,4,5)){
model4_ardl <- ardlDlm(formula = mortality ~ temp + X3, data = data.frame(dataf), p
= i, q = 5)
summary(model4_ardl)
residualcheck(model4_ardl$model)
}
##
## Time series regression with "ts" data:
## Start = 6, End = 508
##
## Call:
## dynlm(formula = as.formula(model.text), data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.282e-12 -8.440e-15 3.810e-15 1.511e-14 7.850e-13
##
## Coefficients: (4 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.000e+00 7.409e-14 1.350e+13 < 2e-16 ***
## temp.t 1.984e-15 9.123e-16 2.175e+00 0.0301 *
## temp.1 4.201e-16 9.976e-16 4.210e-01 0.6739
## temp.2 1.465e-15 9.931e-16 1.475e+00 0.1408
## temp.3 1.951e-15 9.082e-16 2.148e+00 0.0322 *
## X3.t 4.290e-14 5.816e-15 7.377e+00 6.91e-13 ***
## X3.1 3.601e-14 5.959e-15 6.044e+00 2.97e-09 ***
## X3.2 1.053e-14 5.946e-15 1.770e+00 0.0773 .
## X3.3 2.892e-14 5.874e-15 4.924e+00 1.16e-06 ***
## mortality.1 1.000e+00 5.240e-17 1.909e+16 < 2e-16 ***
## mortality.2 NA NA NA NA
## mortality.3 NA NA NA NA
## mortality.4 NA NA NA NA
## mortality.5 NA NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.117e-13 on 493 degrees of freedom
## Multiple R-squared: 1, Adjusted R-squared: 1
## F-statistic: 9.446e+31 on 9 and 493 DF, p-value: < 2.2e-16
##
##
## Time series regression with "ts" data:
## Start = 6, End = 508
##
## Call:
## dynlm(formula = as.formula(model.text), data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.370e-13 -9.910e-15 -3.200e-16 1.074e-14 6.579e-13
##
## Coefficients: (4 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.000e+00 3.017e-14 3.315e+13 < 2e-16 ***
## temp.t 5.421e-16 3.600e-16 1.506e+00 0.132699
## temp.1 3.387e-16 3.916e-16 8.650e-01 0.387474
## temp.2 -1.144e-16 4.160e-16 -2.750e-01 0.783494
## temp.3 3.294e-16 3.888e-16 8.470e-01 0.397294
## temp.4 -7.356e-16 3.553e-16 -2.070e+00 0.038963 *
## X3.t -1.290e-15 2.292e-15 -5.630e-01 0.573758
## X3.1 8.987e-15 2.338e-15 3.844e+00 0.000137 ***
## X3.2 -2.385e-16 2.371e-15 -1.010e-01 0.919918
## X3.3 1.548e-14 2.330e-15 6.643e+00 8.17e-11 ***
## X3.4 8.938e-15 2.336e-15 3.827e+00 0.000147 ***
## mortality.1 1.000e+00 2.117e-17 4.724e+16 < 2e-16 ***
## mortality.2 NA NA NA NA
## mortality.3 NA NA NA NA
## mortality.4 NA NA NA NA
## mortality.5 NA NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.372e-14 on 491 degrees of freedom
## Multiple R-squared: 1, Adjusted R-squared: 1
## F-statistic: 5.043e+32 on 11 and 491 DF, p-value: < 2.2e-16
##
##
## Time series regression with "ts" data:
## Start = 6, End = 508
##
## Call:
## dynlm(formula = as.formula(model.text), data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.287e-13 -1.405e-14 1.240e-15 1.249e-14 6.350e-13
##
## Coefficients: (4 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.000e+00 3.386e-14 2.953e+13 < 2e-16 ***
## temp.t 4.055e-16 3.893e-16 1.042e+00 0.2981
## temp.1 -1.769e-16 4.261e-16 -4.150e-01 0.6782
## temp.2 2.586e-16 4.509e-16 5.730e-01 0.5666
## temp.3 -1.007e-17 4.501e-16 -2.200e-02 0.9822
## temp.4 -8.529e-16 4.200e-16 -2.031e+00 0.0428 *
## temp.5 -3.321e-16 3.853e-16 -8.620e-01 0.3892
## X3.t 2.158e-14 2.494e-15 8.653e+00 < 2e-16 ***
## X3.1 -2.542e-15 2.542e-15 -1.000e+00 0.3178
## X3.2 -2.135e-14 2.567e-15 -8.317e+00 9.00e-16 ***
## X3.3 2.387e-15 2.567e-15 9.300e-01 0.3529
## X3.4 -2.117e-14 2.553e-15 -8.291e+00 1.10e-15 ***
## X3.5 -1.505e-14 2.538e-15 -5.929e+00 5.77e-09 ***
## mortality.1 1.000e+00 2.353e-17 4.250e+16 < 2e-16 ***
## mortality.2 NA NA NA NA
## mortality.3 NA NA NA NA
## mortality.4 NA NA NA NA
## mortality.5 NA NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.728e-14 on 489 degrees of freedom
## Multiple R-squared: 1, Adjusted R-squared: 1
## F-statistic: 3.649e+32 on 13 and 489 DF, p-value: < 2.2e-16
checkresiduals(model4_ardl$model)
##
## Breusch-Godfrey test for serial correlation of order up to 21
##
## data: Residuals
## LM test = 91.724, df = 21, p-value = 8.125e-11
#Based on the observation about model estimates made earlier, we can try to decrease the
#number of lags for predictor series. We will fit ARDL(1,5) and perform diagnostic checking.
#for p=1, q=5
model4_1 = ardlDlm(formula = mortality ~ temp + X3, data = data.frame(dataf),p=1 ,q =5)$model
summary(model4_1)
##
## Time series regression with "ts" data:
## Start = 6, End = 508
##
## Call:
## dynlm(formula = as.formula(model.text), data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.244e-13 -1.460e-14 -5.560e-15 4.650e-15 2.046e-12
##
## Coefficients: (4 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.000e+00 5.747e-14 1.740e+13 < 2e-16 ***
## temp.t -3.572e-15 7.207e-16 -4.956e+00 9.87e-07 ***
## temp.1 -2.276e-15 7.166e-16 -3.176e+00 0.00159 **
## X3.t -4.133e-14 4.899e-15 -8.437e+00 3.58e-16 ***
## X3.1 -4.882e-14 4.971e-15 -9.820e+00 < 2e-16 ***
## mortality.1 1.000e+00 4.122e-17 2.426e+16 < 2e-16 ***
## mortality.2 NA NA NA NA
## mortality.3 NA NA NA NA
## mortality.4 NA NA NA NA
## mortality.5 NA NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.688e-14 on 497 degrees of freedom
## Multiple R-squared: 1, Adjusted R-squared: 1
## F-statistic: 2.26e+32 on 5 and 497 DF, p-value: < 2.2e-16
residualcheck(model4_1)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.13369, p-value < 2.2e-16
checkresiduals(model4_1)
##
## Breusch-Godfrey test for serial correlation of order up to 13
##
## data: Residuals
## LM test = 25.695, df = 13, p-value = 0.01868
#attr(Model3.AllIndexes$model,"class")=lm
#for p=3, q=5
model4_3 = ardlDlm(formula = mortality ~ temp + X3, data = data.frame(dataf), p =3, q=5)$model
summary(model4_3)
##
## Time series regression with "ts" data:
## Start = 6, End = 508
##
## Call:
## dynlm(formula = as.formula(model.text), data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.282e-12 -8.440e-15 3.810e-15 1.511e-14 7.850e-13
##
## Coefficients: (4 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.000e+00 7.409e-14 1.350e+13 < 2e-16 ***
## temp.t 1.984e-15 9.123e-16 2.175e+00 0.0301 *
## temp.1 4.201e-16 9.976e-16 4.210e-01 0.6739
## temp.2 1.465e-15 9.931e-16 1.475e+00 0.1408
## temp.3 1.951e-15 9.082e-16 2.148e+00 0.0322 *
## X3.t 4.290e-14 5.816e-15 7.377e+00 6.91e-13 ***
## X3.1 3.601e-14 5.959e-15 6.044e+00 2.97e-09 ***
## X3.2 1.053e-14 5.946e-15 1.770e+00 0.0773 .
## X3.3 2.892e-14 5.874e-15 4.924e+00 1.16e-06 ***
## mortality.1 1.000e+00 5.240e-17 1.909e+16 < 2e-16 ***
## mortality.2 NA NA NA NA
## mortality.3 NA NA NA NA
## mortality.4 NA NA NA NA
## mortality.5 NA NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.117e-13 on 493 degrees of freedom
## Multiple R-squared: 1, Adjusted R-squared: 1
## F-statistic: 9.446e+31 on 9 and 493 DF, p-value: < 2.2e-16
residualcheck(model4_3)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.14877, p-value < 2.2e-16
checkresiduals(model4_3)
##
## Breusch-Godfrey test for serial correlation of order up to 17
##
## data: Residuals
## LM test = 61.371, df = 17, p-value = 6.232e-07
#for p=4, q=5
model4_4 = ardlDlm(formula = mortality ~ temp + X3, data = data.frame(dataf), p =4, q=5)$model
summary(model4_4)
##
## Time series regression with "ts" data:
## Start = 6, End = 508
##
## Call:
## dynlm(formula = as.formula(model.text), data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.370e-13 -9.910e-15 -3.200e-16 1.074e-14 6.579e-13
##
## Coefficients: (4 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.000e+00 3.017e-14 3.315e+13 < 2e-16 ***
## temp.t 5.421e-16 3.600e-16 1.506e+00 0.132699
## temp.1 3.387e-16 3.916e-16 8.650e-01 0.387474
## temp.2 -1.144e-16 4.160e-16 -2.750e-01 0.783494
## temp.3 3.294e-16 3.888e-16 8.470e-01 0.397294
## temp.4 -7.356e-16 3.553e-16 -2.070e+00 0.038963 *
## X3.t -1.290e-15 2.292e-15 -5.630e-01 0.573758
## X3.1 8.987e-15 2.338e-15 3.844e+00 0.000137 ***
## X3.2 -2.385e-16 2.371e-15 -1.010e-01 0.919918
## X3.3 1.548e-14 2.330e-15 6.643e+00 8.17e-11 ***
## X3.4 8.938e-15 2.336e-15 3.827e+00 0.000147 ***
## mortality.1 1.000e+00 2.117e-17 4.724e+16 < 2e-16 ***
## mortality.2 NA NA NA NA
## mortality.3 NA NA NA NA
## mortality.4 NA NA NA NA
## mortality.5 NA NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.372e-14 on 491 degrees of freedom
## Multiple R-squared: 1, Adjusted R-squared: 1
## F-statistic: 5.043e+32 on 11 and 491 DF, p-value: < 2.2e-16
residualcheck(model4_4)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.46774, p-value < 2.2e-16
checkresiduals(model4_4)
##
## Breusch-Godfrey test for serial correlation of order up to 19
##
## data: Residuals
## LM test = 110.01, df = 19, p-value = 7.932e-15
#vif(model4.4)
#for p=5, q=5
model4_5 = ardlDlm(formula = mortality ~ temp + X3, data = data.frame(dataf), p =5, q=5)$model
summary(model4_5)
##
## Time series regression with "ts" data:
## Start = 6, End = 508
##
## Call:
## dynlm(formula = as.formula(model.text), data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.287e-13 -1.405e-14 1.240e-15 1.249e-14 6.350e-13
##
## Coefficients: (4 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.000e+00 3.386e-14 2.953e+13 < 2e-16 ***
## temp.t 4.055e-16 3.893e-16 1.042e+00 0.2981
## temp.1 -1.769e-16 4.261e-16 -4.150e-01 0.6782
## temp.2 2.586e-16 4.509e-16 5.730e-01 0.5666
## temp.3 -1.007e-17 4.501e-16 -2.200e-02 0.9822
## temp.4 -8.529e-16 4.200e-16 -2.031e+00 0.0428 *
## temp.5 -3.321e-16 3.853e-16 -8.620e-01 0.3892
## X3.t 2.158e-14 2.494e-15 8.653e+00 < 2e-16 ***
## X3.1 -2.542e-15 2.542e-15 -1.000e+00 0.3178
## X3.2 -2.135e-14 2.567e-15 -8.317e+00 9.00e-16 ***
## X3.3 2.387e-15 2.567e-15 9.300e-01 0.3529
## X3.4 -2.117e-14 2.553e-15 -8.291e+00 1.10e-15 ***
## X3.5 -1.505e-14 2.538e-15 -5.929e+00 5.77e-09 ***
## mortality.1 1.000e+00 2.353e-17 4.250e+16 < 2e-16 ***
## mortality.2 NA NA NA NA
## mortality.3 NA NA NA NA
## mortality.4 NA NA NA NA
## mortality.5 NA NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.728e-14 on 489 degrees of freedom
## Multiple R-squared: 1, Adjusted R-squared: 1
## F-statistic: 3.649e+32 on 13 and 489 DF, p-value: < 2.2e-16
residualcheck(model4_5)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.54252, p-value < 2.2e-16
checkresiduals(model4_5)
##
## Breusch-Godfrey test for serial correlation of order up to 21
##
## data: Residuals
## LM test = 91.724, df = 21, p-value = 8.125e-11
#vif(model4.5)
All the fitted ARDL models were reported to be statistically significant at 5% level. The best model in terms of the significance of coefficients, AIC and adjusted R2 was marginally ARDL(1,5). According to model summary, it is suggested that the all ords price index might be related to its previous year levels, gold prices of the previous year and copper prices of the previous year as well. The AIC (2086.8) and adjusted R2 (95%) values are the best compared to those of all previously fitted models.
Residual analysis supports that this model is appropriate: the errors are randomly spread and have no discernible trend, there is no serial autocorrelation in the residuals based on ACF and Beusch-Godfrey test. The normality assumption is not violated at 5% level according to the histogram and the results of Shapiro-Wilk test. However, all ARDL models fitted suffer from multicollinearity with VIFs > 10. Overall,we failed to find an appropriate ARDL model in terms of multicollinearity.
Mort1<-ts(mort_data$mortality,start = 2010,end =2018,frequency = 12)
hw1 <- hw(Mort1)
summary(hw1,)
##
## Forecast method: Holt-Winters' additive method
##
## Model Information:
## Holt-Winters' additive method
##
## Call:
## hw(y = Mort1)
##
## Smoothing parameters:
## alpha = 0.3026
## beta = 0.0634
## gamma = 1e-04
##
## Initial states:
## l = 98.503
## b = -0.6389
## s = -1.525 2.7413 0.8021 -0.7569 1.7547 -4.0959
## 1.0148 1.5662 -2.6838 -1.0263 1.5713 0.6374
##
## sigma: 6.4866
##
## AIC AICc BIC
## 822.9894 830.7362 866.7595
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.3845843 5.927486 4.747907 0.1818436 4.998855 0.4851326
## ACF1
## Training set -0.1279376
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Feb 2018 109.0174 100.70458 117.3303 96.30402 121.7308
## Mar 2018 108.1470 99.29468 116.9994 94.60853 121.6855
## Apr 2018 108.2180 98.67271 117.7633 93.61974 122.8162
## May 2018 114.1949 103.80714 124.5826 98.30821 130.0815
## Jun 2018 115.3714 104.00047 126.7423 97.98107 132.7617
## Jul 2018 111.9884 99.50449 124.4722 92.89593 131.0808
## Aug 2018 119.5674 105.85214 133.2827 98.59170 140.5432
## Sep 2018 118.7832 103.72866 133.8378 95.75924 141.8073
## Oct 2018 122.0702 105.57787 138.5626 96.84735 147.2931
## Nov 2018 125.7372 107.71678 143.7576 98.17735 153.2971
## Dec 2018 123.1990 103.56711 142.8309 93.17460 153.2234
## Jan 2019 127.0892 105.76832 148.4102 94.48170 159.6968
## Feb 2019 129.7506 106.66765 152.8335 94.44830 165.0528
## Mar 2019 128.8802 103.96736 153.7930 90.77931 166.9810
## Apr 2019 128.9511 102.14370 155.7585 87.95271 169.9495
## May 2019 134.9280 106.16437 163.6917 90.93782 178.9182
## Jun 2019 136.1045 105.32577 166.8833 89.03249 183.1766
## Jul 2019 132.7215 99.87113 165.5719 82.48119 182.9618
## Aug 2019 140.3006 105.32414 175.2770 86.80873 193.7924
## Sep 2019 139.5164 102.36132 176.6715 82.69262 196.3402
## Oct 2019 142.8034 103.41875 182.1880 82.56980 203.0369
## Nov 2019 146.4703 104.80679 188.1339 82.75143 210.1893
## Dec 2019 143.9322 99.94159 187.9227 76.65439 211.2099
## Jan 2020 147.8224 101.45800 194.1868 76.91418 218.7306
checkresiduals(hw1)
##
## Ljung-Box test
##
## data: Residuals from Holt-Winters' additive method
## Q* = 34.909, df = 3, p-value = 1.273e-07
##
## Model df: 16. Total lags used: 19
hw2 <- hw(Mort1,seasonal="multiplicative")
summary(hw2)
##
## Forecast method: Holt-Winters' multiplicative method
##
## Model Information:
## Holt-Winters' multiplicative method
##
## Call:
## hw(y = Mort1, seasonal = "multiplicative")
##
## Smoothing parameters:
## alpha = 0.2679
## beta = 0.0791
## gamma = 1e-04
##
## Initial states:
## l = 97.498
## b = -0.3153
## s = 0.9886 1.0287 1.0079 0.9898 1.0204 0.9511
## 1.0112 1.0136 0.9762 0.9865 1.0161 1.0098
##
## sigma: 0.0696
##
## AIC AICc BIC
## 823.5972 831.3440 867.3673
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.2988484 5.931923 4.731838 0.09075421 4.98262 0.4834907
## ACF1
## Training set -0.1040063
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Feb 2018 108.9713 99.24873 118.6939 94.10191 123.8407
## Mar 2018 107.7472 97.58845 117.9060 92.21073 123.2837
## Apr 2018 108.5439 97.55894 119.5288 91.74385 125.3439
## May 2018 114.7023 102.09534 127.3093 95.42160 133.9831
## Jun 2018 116.4298 102.42923 130.4304 95.01776 137.8419
## Jul 2018 111.3913 96.68473 126.0979 88.89954 133.8831
## Aug 2018 121.5252 103.89809 139.1523 94.56686 148.4835
## Sep 2018 119.8386 100.76717 138.9099 90.67140 149.0057
## Oct 2018 124.0145 102.41667 145.6123 90.98347 157.0455
## Nov 2018 128.6109 104.17956 153.0423 91.24638 165.9755
## Dec 2018 125.5508 99.62937 151.4721 85.90742 165.1941
## Jan 2019 130.2332 101.11772 159.3488 85.70489 174.7616
## Feb 2019 133.0516 100.95882 165.1445 83.96991 182.1334
## Mar 2019 131.1265 97.12273 165.1303 79.12220 183.1309
## Apr 2019 131.6778 95.08920 168.2664 75.72037 187.6352
## May 2019 138.7222 97.54971 179.8946 75.75434 201.6900
## Jun 2019 140.3932 96.01671 184.7697 72.52521 208.2612
## Jul 2019 133.9311 88.96981 178.8924 65.16874 202.6935
## Aug 2019 145.7078 93.89023 197.5253 66.45968 224.9559
## Sep 2019 143.2965 89.44097 197.1521 60.93157 225.6615
## Oct 2019 147.9003 89.28635 206.5142 58.25802 237.5425
## Nov 2019 152.9907 89.18828 216.7931 55.41332 250.5681
## Dec 2019 148.9803 83.72640 214.2342 49.18306 248.7776
## Jan 2020 154.1645 83.37155 224.9574 45.89605 262.4329
checkresiduals(hw2)
##
## Ljung-Box test
##
## data: Residuals from Holt-Winters' multiplicative method
## Q* = 37.742, df = 3, p-value = 3.206e-08
##
## Model df: 16. Total lags used: 19
hw3 <- hw(Mort1,seasonal="additive",damped = TRUE, h=5*frequency(Mort1))
summary(hw3)
##
## Forecast method: Damped Holt-Winters' additive method
##
## Model Information:
## Damped Holt-Winters' additive method
##
## Call:
## hw(y = Mort1, h = 5 * frequency(Mort1), seasonal = "additive",
##
## Call:
## damped = TRUE)
##
## Smoothing parameters:
## alpha = 0.224
## beta = 0.0802
## gamma = 4e-04
## phi = 0.8531
##
## Initial states:
## l = 98.1632
## b = -0.438
## s = -1.567 2.4215 1.5828 -0.7804 1.9009 -4.6792
## 0.9772 1.6293 -2.8195 -0.7759 1.352 0.7583
##
## sigma: 6.4048
##
## AIC AICc BIC
## 821.3245 830.0938 867.6693
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.2770002 5.816561 4.675037 0.01609047 4.93011 0.4776868
## ACF1
## Training set -0.09199321
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Feb 2018 106.7660 98.55785 114.9741 94.21274 119.3192
## Mar 2018 105.9188 97.36706 114.4706 92.84004 118.9976
## Apr 2018 104.9735 95.95014 113.9968 91.17348 118.7735
## May 2018 110.3536 100.75004 119.9573 95.66619 125.0411
## Jun 2018 110.4993 100.23031 120.7683 94.79424 126.2043
## Jul 2018 105.5240 94.52737 116.5207 88.70609 122.3419
## Aug 2018 112.6836 100.91663 124.4506 94.68758 130.6796
## Sep 2018 110.4972 97.93307 123.0612 91.28205 129.7123
## Oct 2018 113.2808 99.90504 126.6566 92.82433 133.7373
## Nov 2018 114.4836 100.29076 128.6765 92.77751 136.1897
## Dec 2018 110.8025 95.79399 125.8111 87.84893 133.7562
## Jan 2019 113.3896 97.57164 129.2076 89.19811 137.5812
## Feb 2019 114.2065 97.58796 130.8251 88.79062 139.6224
## Mar 2019 112.2663 94.86017 129.6724 85.64594 138.8866
## Apr 2019 110.3884 92.20850 128.5683 82.58464 138.1922
## May 2019 114.9731 96.03401 133.9121 86.00829 143.9378
## Jun 2019 114.4401 94.75711 134.1230 84.33759 144.5425
## Jul 2019 108.8858 88.47445 129.2972 77.66931 140.1024
## Aug 2019 115.5515 94.42712 136.6759 83.24454 147.8585
## Sep 2019 112.9437 91.12159 134.7659 79.56963 146.3179
## Oct 2019 115.3680 92.86299 137.8730 80.94958 149.7864
## Nov 2019 116.2642 93.09090 139.4374 80.82371 151.7046
## Dec 2019 112.3215 88.49406 136.1489 75.88058 148.7624
## Jan 2020 114.6854 90.21744 139.1534 77.26487 152.1060
## Feb 2020 115.3120 90.21589 140.4080 76.93083 153.6931
## Mar 2020 113.2093 87.49842 138.9202 73.88791 152.5307
## Apr 2020 111.1929 84.87935 137.5065 70.94980 151.4360
## May 2020 115.6594 88.75474 142.5640 74.51229 156.8064
## Jun 2020 115.0255 87.54098 142.5101 72.99154 157.0595
## Jul 2020 109.3853 81.33149 137.4391 66.48070 152.2899
## Aug 2020 115.9776 87.36476 144.5904 72.21803 159.7372
## Sep 2020 113.3072 84.14515 142.4693 68.70767 157.9068
## Oct 2020 115.6781 85.97611 145.3800 70.25285 161.1033
## Nov 2020 116.5287 86.29586 146.7615 70.29158 162.7658
## Dec 2020 112.5472 81.79209 143.3022 65.51134 159.5830
## Jan 2021 114.8779 83.60888 146.1470 67.05604 162.6998
## Feb 2021 115.4762 83.70055 147.2518 66.87954 164.0728
## Mar 2021 113.3494 81.07532 145.6235 63.99045 162.7083
## Apr 2021 111.3124 78.54722 144.0776 61.20237 161.4225
## May 2021 115.7613 82.51201 149.0106 64.91089 166.6118
## Jun 2021 115.1125 81.38586 148.8392 63.53204 166.6930
## Jul 2021 109.4595 75.26200 143.6570 57.15894 161.7601
## Aug 2021 116.0409 81.37882 150.7030 63.02982 169.0520
## Sep 2021 113.3612 78.24059 148.4819 59.64885 167.0736
## Oct 2021 115.7241 80.15075 151.2975 61.31934 170.1289
## Nov 2021 116.5680 80.54747 152.5885 61.47936 171.6566
## Dec 2021 112.5807 76.11846 149.0429 56.81652 168.3449
## Jan 2022 114.9065 78.00782 151.8053 58.47482 171.3383
## Feb 2022 115.5006 78.16999 152.8312 58.40837 172.5928
## Mar 2022 113.3702 75.61308 151.1273 55.62566 171.1148
## Apr 2022 111.3302 73.15124 149.5091 52.94053 169.7198
## May 2022 115.7765 77.18030 154.3726 56.74872 174.8042
## Jun 2022 115.1254 76.11647 154.1344 55.46638 174.7845
## Jul 2022 109.4705 70.05308 148.8880 49.18674 169.7543
## Aug 2022 116.0503 76.22854 155.8721 55.14817 176.9524
## Sep 2022 113.3693 73.14722 153.5913 51.85497 174.8836
## Oct 2022 115.7310 75.11261 156.3493 53.61054 177.8514
## Nov 2022 116.5738 75.56294 157.5847 53.85309 179.2946
## Dec 2022 112.5857 71.18598 153.9854 49.27031 175.9010
## Jan 2023 114.9108 73.12591 156.6957 51.00633 178.8153
checkresiduals(hw3)
##
## Ljung-Box test
##
## data: Residuals from Damped Holt-Winters' additive method
## Q* = 32.827, df = 3, p-value = 3.503e-07
##
## Model df: 17. Total lags used: 20
hw4 <- hw(Mort1,seasonal="multiplicative",damped = TRUE, h=5*frequency(Mort1))
summary(hw4)
##
## Forecast method: Damped Holt-Winters' multiplicative method
##
## Model Information:
## Damped Holt-Winters' multiplicative method
##
## Call:
## hw(y = Mort1, h = 5 * frequency(Mort1), seasonal = "multiplicative",
##
## Call:
## damped = TRUE)
##
## Smoothing parameters:
## alpha = 0.2373
## beta = 0.071
## gamma = 1e-04
## phi = 0.946
##
## Initial states:
## l = 98.2254
## b = -0.2986
## s = 0.9893 1.0314 1.001 0.9921 1.0153 0.9495
## 1.0092 1.0147 0.9811 0.9926 1.017 1.0066
##
## sigma: 0.0691
##
## AIC AICc BIC
## 823.0094 831.7786 869.3542
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.2973536 5.848648 4.651999 0.06786496 4.897167 0.4753329
## ACF1
## Training set -0.09439883
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Feb 2018 108.1279 98.55184 117.7039 93.4826050 122.7731
## Mar 2018 107.1958 97.27297 117.1187 92.0201182 122.3716
## Apr 2018 107.5107 96.96826 118.0532 91.3874201 123.6340
## May 2018 112.7263 100.89672 124.5558 94.6345414 130.8180
## Jun 2018 113.5512 100.71458 126.3878 93.9192940 133.1831
## Jul 2018 108.1114 94.90092 121.3219 87.9077117 128.3151
## Aug 2018 116.8894 101.43664 132.3422 93.2564450 140.5224
## Sep 2018 115.4130 98.91974 131.9063 90.1887368 140.6373
## Oct 2018 117.5915 99.46155 135.7214 89.8641370 145.3188
## Nov 2018 122.2738 101.98852 142.5591 91.2501512 153.2974
## Dec 2018 118.2880 97.23458 139.3415 86.0895518 150.4865
## Jan 2019 121.3315 98.23526 144.4277 86.0088522 156.6541
## Feb 2019 123.5114 98.44360 148.5792 85.1735101 161.8493
## Mar 2019 121.3994 95.20880 147.5900 81.3443165 161.4545
## Apr 2019 120.7919 93.17156 148.4122 78.5502289 163.0336
## May 2019 125.7220 95.33606 156.1080 79.2507127 172.1933
## Jun 2019 125.7791 93.72976 157.8284 76.7638708 174.7943
## Jul 2019 118.9952 87.10633 150.8840 70.2254041 167.7649
## Aug 2019 127.8988 91.93270 163.8649 72.8934080 182.9042
## Sep 2019 125.5903 88.60879 162.5718 69.0319615 182.1486
## Oct 2019 127.3062 88.12966 166.4827 67.3908562 187.2215
## Nov 2019 131.7433 89.45149 174.0352 67.0635311 196.4231
## Dec 2019 126.8807 84.46441 169.2969 62.0106137 191.7507
## Jan 2020 129.6029 84.55581 174.6499 60.7093399 198.4964
## Feb 2020 131.4172 83.99589 178.8384 58.8925781 203.9418
## Mar 2020 128.6988 80.55330 176.8443 55.0666097 202.3310
## Apr 2020 127.6173 78.18817 177.0464 52.0219844 203.2125
## May 2020 132.4007 79.37056 185.4308 51.2980967 213.5033
## Jun 2020 132.0632 77.42788 186.6985 48.5057026 215.6207
## Jul 2020 124.5885 71.40725 177.7697 43.2548263 205.9221
## Aug 2020 133.5566 74.79526 192.3180 43.6888717 223.4244
## Sep 2020 130.8205 71.55103 190.0900 40.1756473 221.4654
## Oct 2020 132.2987 70.63264 193.9647 37.9886003 226.6088
## Nov 2020 136.6099 71.15590 202.0638 36.5066588 236.7131
## Dec 2020 131.2965 66.68370 195.9093 32.4797447 230.1133
## Jan 2021 133.8536 66.24937 201.4579 30.4618398 237.2454
## Feb 2021 135.4801 65.30490 205.6552 28.1564085 242.8037
## Mar 2021 132.4501 62.13908 202.7611 24.9186813 239.9815
## Apr 2021 131.1249 59.83392 202.4159 22.0947492 240.1551
## May 2021 135.8330 60.24331 211.4227 20.2285555 251.4374
## Jun 2021 135.2927 58.27654 212.3088 17.5066621 253.0787
## Jul 2021 127.4629 53.28134 201.6446 14.0119712 240.9139
## Aug 2021 136.4643 55.31174 217.6169 12.3521699 260.5764
## Sep 2021 133.5084 52.42362 214.5933 9.4999123 257.5170
## Oct 2021 134.8644 51.25358 218.4753 6.9926592 262.7362
## Nov 2021 139.1109 51.11608 227.1057 4.5344558 273.6873
## Dec 2021 133.5659 47.40165 219.7302 1.7890404 265.3428
## Jan 2022 136.0382 46.57560 225.5008 -0.7830294 272.8594
## Feb 2022 137.5681 45.38111 229.7550 -3.4197201 278.5559
## Mar 2022 134.3779 42.65484 226.1010 -5.9004293 274.6563
## Apr 2022 132.9276 40.54283 225.3123 -8.3627074 274.2179
## May 2022 137.5969 40.26157 234.9323 -11.2646518 286.4585
## Jun 2022 136.9524 38.37969 235.5251 -13.8015346 287.7063
## Jul 2022 128.9402 34.54394 223.3365 -15.4264338 273.3069
## Aug 2022 137.9586 35.26234 240.6549 -19.1017965 295.0191
## Sep 2022 134.8898 32.82179 236.9579 -21.2097607 290.9894
## Oct 2022 136.1830 31.46813 240.8980 -23.9645926 296.3307
## Nov 2022 140.3962 30.72544 250.0670 -27.3307510 308.1232
## Dec 2022 134.7322 27.84235 241.6221 -28.7417368 298.2062
## Jan 2023 137.1609 26.67447 247.6473 -31.8135168 306.1353
checkresiduals(hw4)
##
## Ljung-Box test
##
## data: Residuals from Damped Holt-Winters' multiplicative method
## Q* = 36.493, df = 3, p-value = 5.891e-08
##
## Model df: 17. Total lags used: 20
hw5 <- hw(Mort1,seasonal="multiplicative",exponential = TRUE, h=5*frequency(Mort1))
summary(hw5)
##
## Forecast method: Holt-Winters' multiplicative method with exponential trend
##
## Model Information:
## Holt-Winters' multiplicative method with exponential trend
##
## Call:
## hw(y = Mort1, h = 5 * frequency(Mort1), seasonal = "multiplicative",
##
## Call:
## exponential = TRUE)
##
## Smoothing parameters:
## alpha = 0.1702
## beta = 0.0826
## gamma = 6e-04
##
## Initial states:
## l = 98.1619
## b = 0.9888
## s = 0.9861 1.0325 1.0065 0.9966 1.015 0.95
## 1.0078 1.0165 0.975 0.9877 1.0245 1.0018
##
## sigma: 0.0698
##
## AIC AICc BIC
## 824.2251 831.9720 867.9952
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.2476226 6.041892 4.736294 0.05095547 4.972325 0.483946
## ACF1
## Training set 0.02318078
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Feb 2018 109.4988 99.83673 119.5230 94.46214 124.5544
## Mar 2018 108.0580 98.37236 118.0374 92.94384 123.8599
## Apr 2018 109.1812 98.55099 120.1720 93.47974 126.0201
## May 2018 116.5184 104.29342 128.8025 98.60449 135.5483
## Jun 2018 118.2492 105.55911 132.2012 98.60745 140.0028
## Jul 2018 114.1012 100.38163 129.3366 93.27904 137.6412
## Aug 2018 124.7755 108.13226 143.5460 100.18277 153.4979
## Sep 2018 125.3974 107.15307 146.5052 98.27312 158.1772
## Oct 2018 129.6403 108.90381 153.7712 98.41594 167.7733
## Nov 2018 136.1257 112.39892 164.6797 100.65249 182.1086
## Dec 2018 133.0759 107.52995 165.2210 96.38835 183.0898
## Jan 2019 138.4000 109.56289 176.1737 96.13983 197.2343
## Feb 2019 144.8668 112.11239 186.8451 97.24755 215.1107
## Mar 2019 142.9607 108.37087 189.7698 93.02552 217.4693
## Apr 2019 144.4467 106.33689 195.8011 89.30283 227.9146
## May 2019 154.1539 111.31489 214.4449 91.81972 254.4071
## Jun 2019 156.4436 110.06103 224.4994 89.89264 267.8307
## Jul 2019 150.9558 103.39080 221.6239 82.88167 269.2519
## Aug 2019 165.0779 109.23115 249.9013 86.92044 309.2957
## Sep 2019 165.9007 106.70006 257.2551 83.37932 323.4817
## Oct 2019 171.5141 106.61312 276.9840 83.22425 347.2831
## Nov 2019 180.0942 108.59901 298.9394 82.27630 387.3839
## Dec 2019 176.0594 102.83667 298.6024 77.32540 396.6695
## Jan 2020 183.1032 102.63906 324.2812 75.57144 429.6073
## Feb 2020 191.6587 104.12938 346.0328 76.59149 474.4412
## Mar 2020 189.1369 99.90952 355.6159 72.24514 490.6191
## Apr 2020 191.1029 97.70887 374.8353 68.50357 521.0252
## May 2020 203.9455 100.80766 413.0005 70.47600 586.8126
## Jun 2020 206.9749 97.53154 432.9401 67.66125 633.6323
## Jul 2020 199.7145 90.96275 433.8695 61.63919 633.5279
## Aug 2020 218.3980 96.16393 492.4960 63.11115 735.6280
## Sep 2020 219.4866 92.27964 512.6324 60.61528 785.6791
## Oct 2020 226.9130 92.88519 550.6302 58.40900 855.9951
## Nov 2020 238.2646 93.13383 592.4569 59.02745 947.2777
## Dec 2020 232.9265 88.79074 600.7463 54.53260 1013.7950
## Jan 2021 242.2454 87.18876 647.9502 54.19609 1080.2958
## Feb 2021 253.5644 88.25287 704.4843 53.19231 1218.2414
## Mar 2021 250.2280 83.72571 727.2705 48.65207 1273.4423
## Apr 2021 252.8290 81.10593 763.1807 47.02147 1359.4674
## May 2021 269.8198 83.53507 850.0836 46.78386 1555.7785
## Jun 2021 273.8277 80.90388 895.4948 44.65058 1682.5248
## Jul 2021 264.2222 74.81528 901.9133 40.72937 1712.5776
## Aug 2021 288.9404 78.95063 1017.5552 41.82568 2001.6934
## Sep 2021 290.3807 75.66547 1069.6307 39.28722 2178.6194
## Oct 2021 300.2058 74.51540 1147.5222 37.99232 2375.2249
## Nov 2021 315.2240 75.34337 1262.6635 37.21553 2684.3866
## Dec 2021 308.1617 69.65201 1286.3090 34.03295 2828.7544
## Jan 2022 320.4906 69.40450 1391.9039 32.97809 3146.3126
## Feb 2022 335.4656 69.48091 1532.4272 32.44381 3523.4407
## Mar 2022 331.0516 64.87783 1563.7035 30.24014 3773.6356
## Apr 2022 334.4927 63.40979 1639.9787 28.31597 3940.3574
## May 2022 356.9715 64.01448 1847.7022 28.07284 4547.3062
## Jun 2022 362.2739 62.19522 1955.6994 26.43109 4863.9921
## Jul 2022 349.5658 57.56158 1975.4967 23.60280 5221.6762
## Aug 2022 382.2681 59.86427 2264.9282 23.42085 5916.0922
## Sep 2022 384.1735 56.66129 2346.4940 21.88930 6654.0246
## Oct 2022 397.1722 56.81254 2505.8187 21.00711 7235.6332
## Nov 2022 417.0412 56.64708 2808.5879 21.10919 8169.1868
## Dec 2022 407.6978 52.50600 2848.0398 18.41054 8714.6095
## Jan 2023 424.0090 51.44345 3125.0802 17.59570 9914.5010
checkresiduals(hw5)
##
## Ljung-Box test
##
## data: Residuals from Holt-Winters' multiplicative method with exponential trend
## Q* = 33.217, df = 3, p-value = 2.899e-07
##
## Model df: 16. Total lags used: 19
fit.expo = ets(mortal_ts, model="ZZZ", ic ="bic")
fit.expo$method
## [1] "ETS(M,N,N)"
#"ETS(M,N,N)"
fit.tem = ets(temp_ts, model="ZZZ", ic ="bic")
fit.tem$method
## [1] "ETS(A,N,N)"
#ETS(A,N,N)
fit.ch1 = ets(chem1_ts, model="ZZZ", ic ="bic")
fit.ch1$method
## [1] "ETS(M,Ad,N)"
#ETS(M,Ad,N)"
fit.ch2 = ets(chem2_ts, model="ZZZ", ic ="bic")
fit.ch2$method
## [1] "ETS(M,N,N)"
#"ETS(M,N,N)"
fit.par = ets(part_ts, model="ZZZ", ic ="bic")
fit.par$method
## [1] "ETS(M,Ad,N)"
#ETS(M,Ad,N)"
We append the accuracy measures for exponential smoothing models to the accuracy data frame. The format of model names is: trend (multiplicative or additive), seasonality (multiplicative or additive) and if the trend is damped or not damped
ssmodel1=ets(mortal_ts,model = "ANN")
summary(ssmodel1)
## ETS(A,N,N)
##
## Call:
## ets(y = mortal_ts, model = "ANN")
##
## Smoothing parameters:
## alpha = 0.511
##
## Initial states:
## l = 98.9364
##
## sigma: 5.8932
##
## AIC AICc BIC
## 4971.256 4971.303 4983.947
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.05497816 5.88156 4.588578 -0.3769459 5.156433 0.6871475
## ACF1
## Training set -0.07517189
ssmodel2=ets(mortal_ts,model = "MNN")
summary(ssmodel2)
## ETS(M,N,N)
##
## Call:
## ets(y = mortal_ts, model = "MNN")
##
## Smoothing parameters:
## alpha = 0.4843
##
## Initial states:
## l = 98.5582
##
## sigma: 0.0656
##
## AIC AICc BIC
## 4954.111 4954.159 4966.803
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.05730399 5.88508 4.593891 -0.3849281 5.159608 0.6879431
## ACF1
## Training set -0.04372931
ssmodel3=ets(mortal_ts,model = "AAN")
summary(ssmodel3)
## ETS(A,A,N)
##
## Call:
## ets(y = mortal_ts, model = "AAN")
##
## Smoothing parameters:
## alpha = 0.5122
## beta = 1e-04
##
## Initial states:
## l = 100.9765
## b = -0.029
##
## sigma: 5.906
##
## AIC AICc BIC
## 4975.460 4975.579 4996.612
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.004457548 5.88274 4.590352 -0.3178407 5.155916 0.6874132
## ACF1
## Training set -0.07651869
ssmodel4=ets(mortal_ts,model = "MAN",damped = TRUE)
summary(ssmodel4)
## ETS(M,Ad,N)
##
## Call:
## ets(y = mortal_ts, model = "MAN", damped = TRUE)
##
## Smoothing parameters:
## alpha = 0.4311
## beta = 0.0441
## phi = 0.8
##
## Initial states:
## l = 101.9204
## b = -0.7466
##
## sigma: 0.0657
##
## AIC AICc BIC
## 4957.818 4957.986 4983.201
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.041505 5.883233 4.604242 -0.3409889 5.167291 0.6894931
## ACF1
## Training set -0.02166677
ssmodel5=ets(mortal_ts,model = "MAN")
summary(ssmodel5)
## ETS(M,Ad,N)
##
## Call:
## ets(y = mortal_ts, model = "MAN")
##
## Smoothing parameters:
## alpha = 0.4311
## beta = 0.0441
## phi = 0.8
##
## Initial states:
## l = 101.9204
## b = -0.7466
##
## sigma: 0.0657
##
## AIC AICc BIC
## 4957.818 4957.986 4983.201
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.041505 5.883233 4.604242 -0.3409889 5.167291 0.6894931
## ACF1
## Training set -0.02166677
vlist <- c("AAA", "MAA", "MAM", "MMM")
damp <- c(T,F)
ets_models <- expand.grid(vlist, damp)
ets_aic <- array(NA, 8)
ets_mase <- array(NA,8)
ets_bic <- array(NA,8)
auto_ets <- ets(head(mortal_ts,50))
summary(auto_ets)
## ETS(M,A,N)
##
## Call:
## ets(y = head(mortal_ts, 50))
##
## Smoothing parameters:
## alpha = 0.0546
## beta = 0.0546
##
## Initial states:
## l = 97.2676
## b = -0.9257
##
## sigma: 0.0581
##
## AIC AICc BIC
## 369.3093 370.6729 378.8694
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 1.211525 5.256707 3.928646 1.037072 4.099653 NaN -0.01514436
checkresiduals(auto_ets)
##
## Ljung-Box test
##
## data: Residuals from ETS(M,A,N)
## Q* = 17.502, df = 6, p-value = 0.007605
##
## Model df: 4. Total lags used: 10
#We append the accuracy measures for state-space models to the accuracy data frame calculate <- data.frame(mod, ets_mase, ets_aic, ets_bic) calculate\(X2 <- factor(calculate\)X2, levels = c(T,F), labels = c(“Damped”,“N”)) calculate <- unite(calculate, “Model”, c(“X1”,“X2”)) colnames(calculate) <- c(“Model”, “MASE”, “AIC”, “BIC”) accuracy <- rbind(accuracy,calculate)
#The data frame accuracy is sorted by ascending MASE value. accuracy <- arrange(accuracy, MASE) kable(accuracy, caption = “Models and their accuracy parameters (sorted by MASE)”)
#dlmfore=forecast(model4, x= as.vector(dataf), h = 4)$forecast
#polfore = forecast(pmodel1, x= as.vector(dataf), h = 4)$forecast
#koyfore = forecast(model3.p, x= as.vector(dataf), h = 4)$forecast
#arfore = forecast(armodel1, x= as.vector(dataf), h = 4)$forecast
#hwforecast=hw4$mean
#etsforecast = as.vector(forecast(ssm4,h=4)$mean)
#ets = c(Mort,etsforecast)
#hw=c(Mort,hwforecast)
#dlm = c(Mort,dlmfore)
#poly=c(Mort,polfore)
#koyn=c(Mort,koyfore)
#ardlm=c(Mort,arfore)
#Dataforecast = ts.intersect(
# ts(dlm,start=2010,frequency = 52),
#ts(poly,start=2010,frequency = 52),
#ts(koyn,start=2010,frequency = 52),
# ts(ardlm,start=2010,frequency = 52),
# ts(hw,start=2010,frequency = 52),
#ts(ets,start=2010,frequency = 52))
#
#ts.plot(Dataforecast,xlim=c(2019.75,2019.85),
# plot.type = c("single"),gpars=list(col=c("red","blue","gray","green","black","brown")),main = "Forecasting of next 4 Weeks of Mortality")
#legend("topleft", col=c("red","blue","gray","green","black","brown"), lty=1, cex=.65,c("DLM","Poly","koyn","ardlm","HW","ETS"))
The Aim of Task 1 was to give predictions of solar radiation amount for the next 2 years,and so we used three methods and compared them based on residual analysis and MASE value. The three methods were: • Time series regression models • Exponential smoothing • State-space models From the model fitting plots,the results and findings obtained says that though the time series regression, some exponential smoothing models and the automatically suggested .
In a study of 81 species of Australian plants Hudson & Keatley (2021) investigated whether the day of occurrence of a species first flowering (first flowering day, FFD, a number between 1 -365) is impacted by climate factors such as rainfall (rain), temperature (temp), radiation level (rad), and relative humidity (RH). The study by Hudson & Keatley essentially explores the influence of long-term climate on the FFD of 81 species of plants from 1984 to 2014. For this task, we will apply time series regression method to fit distributed lag models using yearly FFD series as an independent explanatory series and Provide the point forecasts and confidence intervals and corresponding plot for the most optimal model for each method used . We will also apply exponential smoothing methods with corresponding state-space models to forecast solar radiation series. We will then demonstrate an appropriate comparison between these methods in terms of residual assumptions and goodness of fit measures. The final goal of this analysis is to give 4 years ahead forecasts from the best suitable model in terms of its mean absolute scaled error (MASE) measure.
Your data focuses on one species (of the 81) and contains 5 time series, the FFD time series of the given plant species and the contemporaneous yearly averaged climate variables measured from 1984 – 2014 (31 years). All series are available here in “FFD .csv”
# Load the data
ffdata <- read.csv("D:/Drive data/Rmit/Sem4/Forecasting/FFD.csv")
ffdata
## ï..Year Temperature Rainfall Radiation RelHumidity FFD
## 1 1984 9.371585 2.489344 14.87158 93.92650 217
## 2 1985 9.656164 2.475890 14.68493 94.93589 186
## 3 1986 9.273973 2.421370 14.51507 94.09507 233
## 4 1987 9.219178 2.319726 14.67397 94.49699 222
## 5 1988 10.202186 2.465301 14.74863 94.08142 214
## 6 1989 9.441096 2.735890 14.78356 96.08685 237
## 7 1990 9.943836 2.398630 14.67671 93.77918 213
## 8 1991 9.690411 2.635616 14.41096 93.15562 206
## 9 1992 9.691257 2.795902 13.39617 94.09863 188
## 10 1993 9.947945 2.878630 14.26575 94.91973 234
## 11 1994 9.316438 1.974795 14.52329 93.26932 264
## 12 1995 9.164384 2.843288 13.90411 94.45863 196
## 13 1996 8.967213 2.814754 14.33060 94.60000 229
## 14 1997 9.038356 1.403014 14.77534 93.74685 212
## 15 1998 8.934247 2.289041 14.60000 94.60822 244
## 16 1999 9.547945 2.126301 14.61370 96.22603 178
## 17 2000 9.680328 2.471858 14.65574 95.65738 154
## 18 2001 9.561644 2.227945 14.14521 94.70712 207
## 19 2002 9.389041 1.740000 14.63836 93.53233 182
## 20 2003 9.210959 2.270411 15.11233 94.47096 218
## 21 2004 9.300546 2.620492 14.64481 95.01421 192
## 22 2005 9.623288 2.284110 15.09315 94.30356 199
## 23 2006 8.715068 1.781370 15.41096 94.84493 200
## 24 2007 9.801370 2.191233 15.19452 94.11068 225
## 25 2008 9.034153 1.743169 14.80328 94.39508 216
## 26 2009 9.457534 2.038630 15.12877 94.63096 197
## 27 2010 9.765753 2.777808 14.29315 96.05205 230
## 28 2011 9.826027 2.886301 14.01096 95.70603 204
## 29 2012 9.767760 2.599454 14.40710 94.90519 233
## 30 2013 10.097260 2.540274 14.43014 93.83479 174
## 31 2014 10.247253 2.239286 14.60165 94.21016 189
#Converting into timeseries
ffdata_ts <- ts(ffdata, start=c(1984,1), frequency= 1)
head(ffdata_ts)
## Time Series:
## Start = 1984
## End = 1989
## Frequency = 1
## ï..Year Temperature Rainfall Radiation RelHumidity FFD
## 1984 1984 9.371585 2.489344 14.87158 93.92650 217
## 1985 1985 9.656164 2.475890 14.68493 94.93589 186
## 1986 1986 9.273973 2.421370 14.51507 94.09507 233
## 1987 1987 9.219178 2.319726 14.67397 94.49699 222
## 1988 1988 10.202186 2.465301 14.74863 94.08142 214
## 1989 1989 9.441096 2.735890 14.78356 96.08685 237
tail(ffdata_ts)
## Time Series:
## Start = 2009
## End = 2014
## Frequency = 1
## ï..Year Temperature Rainfall Radiation RelHumidity FFD
## 2009 2009 9.457534 2.038630 15.12877 94.63096 197
## 2010 2010 9.765753 2.777808 14.29315 96.05205 230
## 2011 2011 9.826027 2.886301 14.01096 95.70603 204
## 2012 2012 9.767760 2.599454 14.40710 94.90519 233
## 2013 2013 10.097260 2.540274 14.43014 93.83479 174
## 2014 2014 10.247253 2.239286 14.60165 94.21016 189
Ts_tempo<- ts(ffdata$Temperature, start =c(1984,1), frequency = 1)
head(Ts_tempo)
## Time Series:
## Start = 1984
## End = 1989
## Frequency = 1
## [1] 9.371585 9.656164 9.273973 9.219178 10.202186 9.441096
Ts_Rain <- ts(ffdata$Rainfall, start =c(1984,1), frequency = 1)
head(Ts_Rain)
## Time Series:
## Start = 1984
## End = 1989
## Frequency = 1
## [1] 2.489344 2.475890 2.421370 2.319726 2.465301 2.735890
Ts_Rad<- ts(ffdata$Radiation, start =c(1984,1), frequency = 1)
head(Ts_Rad)
## Time Series:
## Start = 1984
## End = 1989
## Frequency = 1
## [1] 14.87158 14.68493 14.51507 14.67397 14.74863 14.78356
Ts_Hum <- ts(ffdata$RelHumidity, start =c(1984,1), frequency = 1)
head(Ts_Hum)
## Time Series:
## Start = 1984
## End = 1989
## Frequency = 1
## [1] 93.92650 94.93589 94.09507 94.49699 94.08142 96.08685
Ts_FFD <- ts(ffdata$FFD, start =c(1984,1), frequency = 1)
head(Ts_FFD)
## Time Series:
## Start = 1984
## End = 1989
## Frequency = 1
## [1] 217 186 233 222 214 237
plot(Ts_FFD, main = "Fig.1 Time series plot of First flowering day series", ylab = "occurence of FFD series", xlab = "Time")
acf(Ts_FFD, lag.max = 48, main="Fig.2 ACF plot of first flowering day series")
adf.test(Ts_FFD, k=ar(Ts_FFD)$order)
##
## Augmented Dickey-Fuller Test
##
## data: Ts_FFD
## Dickey-Fuller = -5.4552, Lag order = 0, p-value = 0.01
## alternative hypothesis: stationary
From the plot in Figure 1, we can observe the following characteristics of the series:
There is no apparent trend.
There is obvious seasonality, with lower values in December and January and higher values in June and July. The seasonal pattern is not consistent across the observed time.
Changing variance and behaviour of the series are not obvious due to the presence of seasonality.
There are two potential intervention points .
We will further display sample ACF and conduct an Augmented Dickey-Fuller test to study stationarity and seasonality in the series. The length of our data allows to display more lags in the ACF plot to better observe any evidence of trend.
The ACF plot in Figure 2 shows no seasonal patterns and suggests no trend. ADF test with lag order = 0 reports stationarity in the series at 5% level of significance (p-value < 0.05). Overall, we conclude that FFD series has a no seasonality pattern.
We will display a time series plot of factors affecting ffd which we will use as a predictor series for distributed lag models.
#
par(mfrow=c(2,2))
plot(Ts_tempo, main ="Fig.3.1 Time series plot of temperature effects on ffd", ylab="Temperature change", xlab = "Time")
plot(Ts_Rain, main ="Fig3.2.Time series plot of Rain effects on ffd series", ylab="Rainfall", xlab = "Time")
plot(Ts_Rad, main ="Fig 3.3 Time series plot of Radiations on ffd series", ylab="Radiations", xlab = "Time")
plot(Ts_Hum, main ="Fig 3.4 Time series plot of Humidity effects on ffd series", ylab="Humidity", xlab = "Time")
par(mfrow=c(1,1))
Based on the plot in Figure 3, we can make the following comments on the characteristics of the series:
There might be a slight downward trend, especially in the beginning of the series.
There is a clear seasonality, while the pattern changes overtime, we can say that lower values are observed in July and August and higher values in December-January.
The existence of changing variance and behaviour of the series is not apparent due to seasonality.
There are no obvious intervention points.
To further explore the trend and seasonality components in precipitation series, we will create a sample ACF plot and conduct an ADF test over the series.
par(mfrow=c(2,2))
acf(Ts_tempo,lag.max = 48, main = "Fig.4.1 ACF plot of Temperature on FFD series")
adf.test(Ts_tempo,k=ar(Ts_tempo)$order)
##
## Augmented Dickey-Fuller Test
##
## data: Ts_tempo
## Dickey-Fuller = -1.1484, Lag order = 2, p-value = 0.9002
## alternative hypothesis: stationary
acf(Ts_Rain,lag.max = 48, main = "Fig 4.2 ACF plot of Rain on FFD series")
adf.test(Ts_Rain,k=ar(Ts_Rain)$order)
##
## Augmented Dickey-Fuller Test
##
## data: Ts_Rain
## Dickey-Fuller = -4.5622, Lag order = 0, p-value = 0.01
## alternative hypothesis: stationary
acf(Ts_Rad,lag.max = 48, main = "Fig.4.3 ACF plot of Radiation on FFD series")
adf.test(Ts_Rad,k=ar(Ts_Rad)$order)
##
## Augmented Dickey-Fuller Test
##
## data: Ts_Rad
## Dickey-Fuller = -2.7317, Lag order = 4, p-value = 0.2911
## alternative hypothesis: stationary
acf(Ts_Hum,lag.max = 48, main = "Fig 4.4 ACF plot of Humidity on FFD series")
adf.test(Ts_Hum,k=ar(Ts_Hum)$order)
##
## Augmented Dickey-Fuller Test
##
## data: Ts_Hum
## Dickey-Fuller = -4.5749, Lag order = 0, p-value = 0.01
## alternative hypothesis: stationary
par(mfrow=c(1,1))
From Figure 4, we can observe that there is a slight seasonal pattern in temperature, a decaying pattern in rain and decaying seasonal lags in radiation and humidity also suggests the possible existence of trend. The ADF test reports 1)lag value 2 and p-value = 0.9002 > 0.05 which suggests the series is nonstationary at 5% level of significance. 2)lag value 0 and p-value = 0.01 < 0.05 which suggests the series is stationary at 5% level of significance. 3)lag value 4 and p-value = 0.2911 > 0.05 which suggests the series is nonstationary at 5% level of significance.4)lag value 0 and p-value = 0.01 < 0.05 which suggests the series is stationary at 5% level of significance.
To clearly display the dependent radiation series versus the explanatory precipitation series within the same plot, we will standardise the data. The following code creates a time series plot to explore the relationship of the series.
# scaling of data
shift<- scale(ffdata_ts)
plot(shift, plot.type="s",col=c("Red", "Blue", "Brown","Black","Green"),main= "Fig.5 FFD rate versus factor affecting ffd wrt time(Scaled)")
legend("bottomright", lty=1, text.width = 7, col = c("Red", "Blue", "Brown","Black","Green"), c("Temperature", "Rain", "Radiation", "Humidity","FFD"))
The plot in Figure 5 shows that the dependent and the independent series are likely to be negatively correlated. High values of radiation correspond to low values of precipitation and vice versa.
#We also calculate the correlation coefficient to check the relationship.
cor(ffdata_ts)
## ï..Year Temperature Rainfall Radiation RelHumidity
## ï..Year 1.0000000 0.148410676 -0.1752091 0.11881829 0.206355767
## Temperature 0.1484107 1.000000000 0.3933255 -0.24096625 0.009646021
## Rainfall -0.1752091 0.393325545 1.0000000 -0.58131610 0.338461007
## Radiation 0.1188183 -0.240966245 -0.5813161 1.00000000 -0.055209652
## RelHumidity 0.2063558 0.009646021 0.3384610 -0.05520965 1.000000000
## FFD -0.2329975 -0.247933708 0.0506911 0.04677758 -0.128502440
## FFD
## ï..Year -0.23299747
## Temperature -0.24793371
## Rainfall 0.05069110
## Radiation 0.04677758
## RelHumidity -0.12850244
## FFD 1.00000000
cor(Ts_tempo,Ts_FFD)
## [1] -0.2479337
cor(Ts_Rain,Ts_FFD)
## [1] 0.0506911
cor(Ts_Rad,Ts_FFD)
## [1] 0.04677758
cor(Ts_Hum,Ts_FFD)
## [1] -0.1285024
The correlation coefficient is reported as FFD have r=−0.2479 wrt temperature which suggests a moderate negative correlation between the series and confirms the conclusion made from the plot in Figure 5. After we have explored the characteristics of the individual series and found the evidence of relationship between them, we proceed to modelling stage.
To specify the finite lag length for the model, we create a loop that computes accuracy measures like AIC/BIC and MASE for the models with different lag lengths and select a model with the lowest values.
for (i in 1:10){
model1 <- dlm(x = as.vector(ffdata$Temperature)+as.vector(ffdata$Rainfall)+as.vector(ffdata$Radiation)+as.vector(ffdata$RelHumidity), y = ffdata$FFD, q = i)
cat("q =", i, "AIC =", AIC(model1$model), "BIC =", BIC(model1$model), "MASE =", MASE(model1)$MASE, "\n")
}
## q = 1 AIC = 281.5591 BIC = 287.1639 MASE = 0.6780763
## q = 2 AIC = 273.1301 BIC = 279.9665 MASE = 0.6577795
## q = 3 AIC = 266.1371 BIC = 274.1303 MASE = 0.6544162
## q = 4 AIC = 259.4614 BIC = 268.5323 MASE = 0.6307995
## q = 5 AIC = 248.5012 BIC = 258.566 MASE = 0.6235341
## q = 6 AIC = 238.0214 BIC = 248.9912 MASE = 0.5711917
## q = 7 AIC = 231.115 BIC = 242.8955 MASE = 0.5544822
## q = 8 AIC = 219.0449 BIC = 231.5353 MASE = 0.4915635
## q = 9 AIC = 209.6412 BIC = 222.7337 MASE = 0.4791517
## q = 10 AIC = 202.0655 BIC = 215.6443 MASE = 0.4567299
It is observed that the values of information criteria as well as MASE decrease as the lag q increases, so we will fit a finite DLM with a number of lags = 10.
1)Temperature
ftem_dlm <- dlm(x = ffdata$FFD, y = ffdata$Temperature, q=10)
summary(ftem_dlm)
##
## Call:
## lm(formula = model.formula, data = design)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.40860 -0.19427 -0.03581 0.18360 0.47061
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 13.5227528 2.4798858 5.453 0.000404 ***
## x.t -0.0025541 0.0042765 -0.597 0.565061
## x.1 -0.0032419 0.0042724 -0.759 0.467381
## x.2 -0.0045529 0.0044176 -1.031 0.329614
## x.3 0.0018560 0.0043032 0.431 0.676395
## x.4 0.0009147 0.0042016 0.218 0.832518
## x.5 -0.0008784 0.0037899 -0.232 0.821907
## x.6 0.0056978 0.0045033 1.265 0.237563
## x.7 -0.0006914 0.0043958 -0.157 0.878495
## x.8 -0.0073904 0.0043644 -1.693 0.124640
## x.9 -0.0013518 0.0041645 -0.325 0.752907
## x.10 -0.0072004 0.0041466 -1.736 0.116496
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3893 on 9 degrees of freedom
## Multiple R-squared: 0.5724, Adjusted R-squared: 0.0497
## F-statistic: 1.095 on 11 and 9 DF, p-value: 0.4532
##
## AIC and BIC values for the model:
## AIC BIC
## 1 28.17827 41.75706
vif(ftem_dlm$model)
## x.t x.1 x.2 x.3 x.4 x.5 x.6 x.7
## 1.582467 1.618708 1.628589 1.475913 1.406887 1.177330 1.644082 1.580178
## x.8 x.9 x.10
## 1.595176 1.507594 1.485141
From the temperature series, we obtained Adjusted R-squared: 0.0497,p-value: 0.4532 >0.05 and AIC:28.17827
2)Rain
frain_dlm <- dlm(x = ffdata$FFD, y = ffdata$Rainfall, q=10)
summary(frain_dlm)
##
## Call:
## lm(formula = model.formula, data = design)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.62480 -0.13413 -0.05632 0.12456 0.86560
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.3362531 2.9267216 0.457 0.659
## x.t 0.0017720 0.0050470 0.351 0.734
## x.1 0.0012178 0.0050423 0.242 0.815
## x.2 0.0057334 0.0052136 1.100 0.300
## x.3 0.0002185 0.0050786 0.043 0.967
## x.4 -0.0059988 0.0049586 -1.210 0.257
## x.5 -0.0007208 0.0044728 -0.161 0.876
## x.6 0.0045630 0.0053148 0.859 0.413
## x.7 0.0020808 0.0051878 0.401 0.698
## x.8 -0.0007027 0.0051509 -0.136 0.894
## x.9 0.0015957 0.0049149 0.325 0.753
## x.10 -0.0052402 0.0048938 -1.071 0.312
##
## Residual standard error: 0.4594 on 9 degrees of freedom
## Multiple R-squared: 0.4289, Adjusted R-squared: -0.2692
## F-statistic: 0.6143 on 11 and 9 DF, p-value: 0.7798
##
## AIC and BIC values for the model:
## AIC BIC
## 1 35.13643 48.71522
vif(frain_dlm$model)
## x.t x.1 x.2 x.3 x.4 x.5 x.6 x.7
## 1.582467 1.618708 1.628589 1.475913 1.406887 1.177330 1.644082 1.580178
## x.8 x.9 x.10
## 1.595176 1.507594 1.485141
From the temperature series, we obtained Adjusted R-squared: -0.2692,p-value: 0.7798 >0.05 and AIC:35.13643
3)Radiation
frad_dlm <- dlm(x = ffdata$FFD, y = ffdata$Radiation, q=10)
summary(frad_dlm)
##
## Call:
## lm(formula = model.formula, data = design)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.63400 -0.18651 -0.02998 0.18646 0.42512
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 19.108071 2.373738 8.050 2.11e-05 ***
## x.t -0.001632 0.004093 -0.399 0.6994
## x.1 -0.006388 0.004090 -1.562 0.1527
## x.2 0.001004 0.004228 0.237 0.8176
## x.3 -0.004638 0.004119 -1.126 0.2893
## x.4 -0.001052 0.004022 -0.262 0.7996
## x.5 -0.003036 0.003628 -0.837 0.4244
## x.6 -0.004475 0.004311 -1.038 0.3262
## x.7 -0.008065 0.004208 -1.917 0.0875 .
## x.8 0.001065 0.004178 0.255 0.8045
## x.9 0.000136 0.003986 0.034 0.9735
## x.10 0.005649 0.003969 1.423 0.1884
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3726 on 9 degrees of freedom
## Multiple R-squared: 0.6004, Adjusted R-squared: 0.1121
## F-statistic: 1.229 on 11 and 9 DF, p-value: 0.3843
##
## AIC and BIC values for the model:
## AIC BIC
## 1 26.34091 39.9197
vif(frad_dlm$model)
## x.t x.1 x.2 x.3 x.4 x.5 x.6 x.7
## 1.582467 1.618708 1.628589 1.475913 1.406887 1.177330 1.644082 1.580178
## x.8 x.9 x.10
## 1.595176 1.507594 1.485141
From the temperature series, we obtained Adjusted R-squared: 0.1121,p-value: 0.3843 >0.05 and AIC:26.34091
4)Humidity
fhum_dlm <- dlm(x = ffdata$FFD, y = ffdata$RelHumidity, q=10)
summary(fhum_dlm)
##
## Call:
## lm(formula = model.formula, data = design)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.3664 -0.3400 -0.1628 0.5044 1.3298
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 98.209806 5.866260 16.741 4.33e-08 ***
## x.t -0.010228 0.010116 -1.011 0.338
## x.1 -0.003821 0.010107 -0.378 0.714
## x.2 0.007267 0.010450 0.695 0.504
## x.3 0.008712 0.010179 0.856 0.414
## x.4 -0.009341 0.009939 -0.940 0.372
## x.5 0.008470 0.008965 0.945 0.369
## x.6 -0.002341 0.010653 -0.220 0.831
## x.7 -0.004620 0.010398 -0.444 0.667
## x.8 -0.006071 0.010324 -0.588 0.571
## x.9 0.001441 0.009851 0.146 0.887
## x.10 -0.006567 0.009809 -0.670 0.520
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9209 on 9 degrees of freedom
## Multiple R-squared: 0.3739, Adjusted R-squared: -0.3913
## F-statistic: 0.4887 on 11 and 9 DF, p-value: 0.869
##
## AIC and BIC values for the model:
## AIC BIC
## 1 64.34047 77.91927
vif(fhum_dlm$model)
## x.t x.1 x.2 x.3 x.4 x.5 x.6 x.7
## 1.582467 1.618708 1.628589 1.475913 1.406887 1.177330 1.644082 1.580178
## x.8 x.9 x.10
## 1.595176 1.507594 1.485141
From the temperature series, we obtained Adjusted R-squared: -0.3913,p-value: 0.869 >0.05 and AIC:64.34047
From the above all dlm models, radiation sries have given considerably better results than others. According to the significance tests of model coefficients obtained from the summary, nearly all lag weights of a predictor series are not statistically significant at 5% level. The adjusted R2 for finite_dlm is 0.1121, which means that the model explains only 11% of the variability in radiation. F-test of the overall significance of the model reports the model is statistically significant at 5% level (p-value < 0.05). However, we conclude that the model is not a good fit to the data due to insignificant terms and low explainability.
There is no issue with multicollinearity in the model, VIF values are reported < 10.
##residual check loop
residualcheck <- function(x){
checkresiduals(x)
#bgtest(x)
shapiro.test(x$residuals)
}
Univariate poly modelling for all features 1)Temperature
temp_polyd <- polyDlm(x=as.vector(ffdata$Temperature), y=as.vector(ffdata$FFD), q=10,k=2)
## Estimates and t-tests for beta coefficients:
## Estimate Std. Error t value P(>|t|)
## beta.0 -6.2400 9.09 -0.68600 0.507
## beta.1 -0.6310 6.86 -0.09200 0.928
## beta.2 3.7500 5.93 0.63100 0.541
## beta.3 6.8900 5.90 1.17000 0.267
## beta.4 8.8100 6.10 1.44000 0.177
## beta.5 9.5000 6.15 1.55000 0.151
## beta.6 8.9700 5.96 1.51000 0.160
## beta.7 7.2000 5.73 1.26000 0.235
## beta.8 4.2000 6.03 0.69700 0.500
## beta.9 -0.0216 7.49 -0.00289 0.998
## beta.10 -5.4700 10.30 -0.53200 0.605
summary(temp_polyd)
##
## Call:
## "Y ~ (Intercept) + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -41.641 -14.230 -5.104 16.274 43.700
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -141.0618 533.0958 -0.265 0.794
## z.t0 -6.2363 9.0912 -0.686 0.502
## z.t1 6.2197 3.9083 1.591 0.130
## z.t2 -0.6143 0.3942 -1.559 0.138
##
## Residual standard error: 25.48 on 17 degrees of freedom
## Multiple R-squared: 0.1583, Adjusted R-squared: 0.009816
## F-statistic: 1.066 on 3 and 17 DF, p-value: 0.3896
##
residualcheck(temp_polyd$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.96973, p-value = 0.7269
checkresiduals(temp_polyd$model)
##
## Breusch-Godfrey test for serial correlation of order up to 7
##
## data: Residuals
## LM test = 6.2838, df = 7, p-value = 0.507
From the temperature series, we obtained Adjusted R-squared: 0.009816,p-value: 0.3896 >0.05
2)Rainfall
rain_polyd <- polyDlm(x=as.vector(ffdata$Rainfall), y=as.vector(ffdata$FFD), q=10,k=2)
## Estimates and t-tests for beta coefficients:
## Estimate Std. Error t value P(>|t|)
## beta.0 -3.23 10.90 -0.297 0.772
## beta.1 2.47 7.22 0.341 0.739
## beta.2 6.48 5.64 1.150 0.275
## beta.3 8.82 5.69 1.550 0.149
## beta.4 9.47 6.06 1.560 0.146
## beta.5 8.44 5.99 1.410 0.187
## beta.6 5.73 5.35 1.070 0.308
## beta.7 1.33 4.64 0.287 0.779
## beta.8 -4.74 5.38 -0.882 0.397
## beta.9 -12.50 8.55 -1.460 0.171
## beta.10 -21.90 13.60 -1.620 0.134
summary(rain_polyd)
##
## Call:
## "Y ~ (Intercept) + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -46.494 -10.638 -2.134 15.543 46.891
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 209.0810 100.8835 2.073 0.0537 .
## z.t0 -3.2319 10.8654 -0.297 0.7697
## z.t1 6.5390 5.4708 1.195 0.2484
## z.t2 -0.8410 0.5707 -1.474 0.1589
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 25.4 on 17 degrees of freedom
## Multiple R-squared: 0.1636, Adjusted R-squared: 0.01605
## F-statistic: 1.109 on 3 and 17 DF, p-value: 0.3729
residualcheck(rain_polyd$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.98714, p-value = 0.9902
checkresiduals(rain_polyd$model)
##
## Breusch-Godfrey test for serial correlation of order up to 7
##
## data: Residuals
## LM test = 14.791, df = 7, p-value = 0.03878
From the temperature series, we obtained Adjusted R-squared: 0.01605,p-value: 0.3729 >0.05
3)Radiation
rad_polyd <- polyDlm(x=as.vector(ffdata$Radiation), y=as.vector(ffdata$FFD), q=10,k=2)
## Estimates and t-tests for beta coefficients:
## Estimate Std. Error t value P(>|t|)
## beta.0 -0.4200 8.60 -0.04880 0.962
## beta.1 -0.6790 5.50 -0.12300 0.904
## beta.2 -0.6930 3.82 -0.18200 0.859
## beta.3 -0.4620 3.58 -0.12900 0.900
## beta.4 0.0143 3.90 0.00367 0.997
## beta.5 0.7360 4.03 0.18300 0.858
## beta.6 1.7000 3.80 0.44800 0.663
## beta.7 2.9100 3.57 0.81500 0.432
## beta.8 4.3700 4.28 1.02000 0.329
## beta.9 6.0700 6.50 0.93300 0.371
## beta.10 8.0200 10.00 0.80000 0.440
summary(rad_polyd)
##
## Call:
## "Y ~ (Intercept) + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -46.417 -11.818 -1.942 17.330 51.949
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -106.9869 505.1443 -0.212 0.835
## z.t0 -0.4198 8.6001 -0.049 0.962
## z.t1 -0.3816 4.1407 -0.092 0.928
## z.t2 0.1225 0.4193 0.292 0.774
##
## Residual standard error: 26.9 on 17 degrees of freedom
## Multiple R-squared: 0.06187, Adjusted R-squared: -0.1037
## F-statistic: 0.3737 on 3 and 17 DF, p-value: 0.773
residualcheck(rad_polyd$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.97843, p-value = 0.9012
checkresiduals(rad_polyd$model)
##
## Breusch-Godfrey test for serial correlation of order up to 7
##
## data: Residuals
## LM test = 6.0156, df = 7, p-value = 0.5379
From the temperature series, we obtained Adjusted R-squared: -0.1037,p-value: 0.773 >0.05 4)Humidity
hum_polyd <- polyDlm(x=as.vector(ffdata$RelHumidity), y=as.vector(ffdata$FFD), q=10,k=2)
## Estimates and t-tests for beta coefficients:
## Estimate Std. Error t value P(>|t|)
## beta.0 -8.8300 5.30 -1.6600 0.1240
## beta.1 -7.1000 3.82 -1.8600 0.0898
## beta.2 -5.5600 3.13 -1.7800 0.1030
## beta.3 -4.2000 3.07 -1.3700 0.1990
## beta.4 -3.0300 3.24 -0.9330 0.3710
## beta.5 -2.0400 3.35 -0.6080 0.5560
## beta.6 -1.2300 3.32 -0.3720 0.7170
## beta.7 -0.6160 3.25 -0.1900 0.8530
## beta.8 -0.1830 3.43 -0.0534 0.9580
## beta.9 0.0648 4.21 0.0154 0.9880
## beta.10 0.1270 5.74 0.0222 0.9830
summary(hum_polyd)
##
## Call:
## "Y ~ (Intercept) + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -37.56 -16.91 -6.12 13.94 43.19
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3290.68298 2900.02326 1.135 0.272
## z.t0 -8.83169 5.30493 -1.665 0.114
## z.t1 1.82201 2.32542 0.784 0.444
## z.t2 -0.09261 0.22708 -0.408 0.688
##
## Residual standard error: 25.03 on 17 degrees of freedom
## Multiple R-squared: 0.1878, Adjusted R-squared: 0.04447
## F-statistic: 1.31 on 3 and 17 DF, p-value: 0.3036
##
residualcheck(hum_polyd$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.95686, p-value = 0.4553
checkresiduals(hum_polyd$model)
##
## Breusch-Godfrey test for serial correlation of order up to 7
##
## data: Residuals
## LM test = 5.8819, df = 7, p-value = 0.5536
From the temperature series, we obtained Adjusted R-squared: 0.04447,p-value: 0.3036 >0.05 The analysis of residuals from polynomial model in Figure 7 shows the following:
The errors are not randomly spread. There are a lot of highly significant lags in the ACF plot as well as a wavy pattern at seasonal lags, so there is autocorrelation and seasonality still present in the residuals. Beusch-Godfrey test reports a p-value < 0.05, therefore there is serial correlation in the residuals at 5% level of significance.
The normality of the residuals is also violated, as observed from the histogram and Shapiro-Wilk normality test report (p-value < 0.05).
Overall, we can conclude that the second order polynomial of lag 10 is not successful at capturing the autocorrelation and seasonality in the series and has low explainability.
We will implement Koyck transformation model with precipitation predictor series as follows.
First we design multivariate model and then univariate models for each parameter
K_trans = koyckDlm(x=as.vector(ffdata$Temperature)+as.vector(ffdata$Rainfall)+as.vector(ffdata$Radiation)+as.vector(ffdata$RelHumidity), y=as.vector(ffdata$FFD))
summary(K_trans$model, diagnostics=T)
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -47.998 -14.267 -3.171 17.086 44.040
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 963.41380 3083.76190 0.312 0.757
## Y.1 -0.05027 0.26344 -0.191 0.850
## X.t -6.14434 25.16739 -0.244 0.809
##
## Diagnostic tests:
## df1 df2 statistic p-value
## Weak instruments 1 27 0.826 0.371
## Wu-Hausman 1 26 0.012 0.914
## Sargan 0 NA NA NA
##
## Residual standard error: 24.54 on 27 degrees of freedom
## Multiple R-Squared: 0.007908, Adjusted R-squared: -0.06558
## Wald test: 0.0304 on 2 and 27 DF, p-value: 0.9701
vif(K_trans$model)
## Y.1 X.t
## 1.845706 1.845706
1)Temperature
temp_Koyck <- koyckDlm(x=as.vector(ffdata$Temperature), y=as.vector(ffdata$FFD))
summary(temp_Koyck)
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -57.755 -12.972 -4.079 17.329 58.541
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -19.87216 608.22407 -0.033 0.974
## Y.1 0.03846 0.25384 0.152 0.881
## X.t 23.22042 61.08917 0.380 0.707
##
## Residual standard error: 28.38 on 27 degrees of freedom
## Multiple R-Squared: -0.3272, Adjusted R-squared: -0.4255
## Wald test: 0.07269 on 2 and 27 DF, p-value: 0.9301
##
## Diagnostic tests:
## NULL
##
## alpha beta phi
## Geometric coefficients: -20.66698 23.22042 0.0384583
vif(temp_Koyck$model, diagnostics =T)
## Y.1 X.t
## 1.280988 1.280988
From the temperature series, we obtained Adjusted R-squared: -0.4255,p-value: 0.9301 >0.05
2)Rain
rain_Koyck <- koyckDlm(x=as.vector(ffdata$Rainfall), y=as.vector(ffdata$FFD))
summary(rain_Koyck)
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -58.691 -21.222 2.697 14.856 68.192
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.266e+02 2.185e+02 0.579 0.567
## Y.1 4.591e-03 2.196e-01 0.021 0.983
## X.t 3.448e+01 8.772e+01 0.393 0.697
##
## Residual standard error: 27.55 on 27 degrees of freedom
## Multiple R-Squared: -0.2505, Adjusted R-squared: -0.3431
## Wald test: 0.07773 on 2 and 27 DF, p-value: 0.9254
##
## Diagnostic tests:
## NULL
##
## alpha beta phi
## Geometric coefficients: 127.2254 34.48095 0.004590669
vif(rain_Koyck$model,diagnostics =T)
## Y.1 X.t
## 1.017508 1.017508
From the temperature series, we obtained Adjusted R-squared: -0.3431,p-value: 0.9254 >0.05 3)Radiation
rad_Koyck <- koyckDlm(x=as.vector(ffdata$Radiation), y=as.vector(ffdata$FFD))
summary(rad_Koyck)
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -55.229 -19.662 3.956 16.232 54.756
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 418.31843 384.12678 1.089 0.286
## Y.1 -0.03254 0.20729 -0.157 0.876
## X.t -13.87153 25.49529 -0.544 0.591
##
## Residual standard error: 25.54 on 27 degrees of freedom
## Multiple R-Squared: -0.07436, Adjusted R-squared: -0.1539
## Wald test: 0.1486 on 2 and 27 DF, p-value: 0.8626
##
## Diagnostic tests:
## NULL
##
## alpha beta phi
## Geometric coefficients: 405.1353 -13.87153 -0.03254005
vif(rad_Koyck$model)
## Y.1 X.t
## 1.055257 1.055257
From the temperature series, we obtained Adjusted R-squared: -0.1539,p-value: 0.8626 >0.05
humidity
hum_Koyck <- koyckDlm(x=as.vector(ffdata$RelHumidity), y=as.vector(ffdata$FFD))
summary(hum_Koyck)
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -46.787 -14.896 -3.024 15.673 55.019
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1898.18932 4211.29973 0.451 0.656
## Y.1 -0.05904 0.25016 -0.236 0.815
## X.t -17.72952 44.24103 -0.401 0.692
##
## Residual standard error: 27 on 27 degrees of freedom
## Multiple R-Squared: -0.2016, Adjusted R-squared: -0.2906
## Wald test: 0.0808 on 2 and 27 DF, p-value: 0.9226
##
## Diagnostic tests:
## NULL
##
## alpha beta phi
## Geometric coefficients: 1792.364 -17.72952 -0.05904222
##From the temperature series, we obtained Adjusted R-squared: -0.2906,p-value: 0.9226 >0.05
vif(hum_Koyck$model)
## Y.1 X.t
## 1.374145 1.374145
From the model summary, we can conclude that all terms of Koyck model are not significant at 5% level. The model is reported to be overall statistically nonsignificant at 5% level (p-value > 0.05) and its adjusted R2 is negative which means the model explains about negative variability in ffd
According to the Weak instruments test (p-value > 0.05), the model at the first stage of least-squares estimation is notsignificant at 5% level.
From the Wu-Hausman test (p-value > 0.05), we can conclude that there is no significant correlation between the explanatory variable and the error term at 5% level. There is no effect of multicollinearity as all VIFs are less than 10.
par(mfrow=c(1,2))
#residualcheck(temp_Koyck$model)
checkresiduals(temp_Koyck$model)
par(mfrow=c(1,1))
par(mfrow=c(1,2))
residualcheck(rain_Koyck$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.98582, p-value = 0.9504
checkresiduals(rain_Koyck$model)
par(mfrow=c(1,1))
par(mfrow=c(1,2))
residualcheck(rad_Koyck$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.98748, p-value = 0.9716
checkresiduals(rad_Koyck$model)
par(mfrow=c(1,1))
par(mfrow=c(1,2))
residualcheck(hum_Koyck$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.98176, p-value = 0.8702
checkresiduals(hum_Koyck$model)
par(mfrow=c(1,1))
From the residual analysis in Figure 8, we can conclude the following:
The errors are not spread randomly.
All the lags in ACF plot are significant and have a wave-like pattern, which suggests serial correlation and seasonality remaining in the residuals.
The errors are not normal. The histogram and the Shapiro-Wilk normality test with p-value < 0.05 suggest not normal residuals.
Overall, we can conclude that the Koyck model is also not successful at capturing the autocorrelation and seasonality in the series.
Autoregressive distributed lag models The final model type from time series regression method is Autoregressive distributed lag models. For specifying the parameters of ARDL(p,q), we create a loop that fits autoregressive DLMs for a range of lag lengths and orders of the AR process and obtains their accuracy measures, like AIC/BIC and MASE.
Three models with lowest values of MASE were chosen for fitting and analysis. The models were:
ARDL(3,5)
ARDL(4,5)
ARDL(5,5)
We create a loop to fit these candidate models and do residual analysis in a dynamical way.
for (i in 1:5){
for(j in 1:5){
model2 = ardlDlm(x = as.vector(ffdata$Temperature)+as.vector(ffdata$Rainfall)+as.vector(ffdata$Radiation)+as.vector(ffdata$RelHumidity), y = as.vector(ffdata$FFD), p = i , q = j)
cat("p =", i, "q =", j, "AIC =", AIC(model2$model), "BIC =", BIC(model2$model), "MASE =", MASE(model2)$MASE, "\n")
}
}
## p = 1 q = 1 AIC = 283.5269 BIC = 290.5329 MASE = 0.6795852
## p = 1 q = 2 AIC = 276.4449 BIC = 284.6487 MASE = 0.6858828
## p = 1 q = 3 AIC = 269.4214 BIC = 278.7469 MASE = 0.6627738
## p = 1 q = 4 AIC = 262.8803 BIC = 273.247 MASE = 0.6451817
## p = 1 q = 5 AIC = 255.9729 BIC = 267.2957 MASE = 0.6432902
## p = 2 q = 1 AIC = 275.0869 BIC = 283.2907 MASE = 0.6547865
## p = 2 q = 2 AIC = 276.9991 BIC = 286.5702 MASE = 0.6507562
## p = 2 q = 3 AIC = 270.1351 BIC = 280.7927 MASE = 0.6370252
## p = 2 q = 4 AIC = 262.9649 BIC = 274.6274 MASE = 0.6008803
## p = 2 q = 5 AIC = 255.8111 BIC = 268.3921 MASE = 0.5647418
## p = 3 q = 1 AIC = 268.1205 BIC = 277.446 MASE = 0.6534486
## p = 3 q = 2 AIC = 270.0886 BIC = 280.7462 MASE = 0.6510693
## p = 3 q = 3 AIC = 271.8702 BIC = 283.86 MASE = 0.6428087
## p = 3 q = 4 AIC = 264.8975 BIC = 277.8559 MASE = 0.6035375
## p = 3 q = 5 AIC = 257.5522 BIC = 271.3913 MASE = 0.5623997
## p = 4 q = 1 AIC = 261.3992 BIC = 271.7659 MASE = 0.6330189
## p = 4 q = 2 AIC = 263.3272 BIC = 274.9897 MASE = 0.6237364
## p = 4 q = 3 AIC = 264.9645 BIC = 277.9228 MASE = 0.6073044
## p = 4 q = 4 AIC = 266.7086 BIC = 280.9628 MASE = 0.598053
## p = 4 q = 5 AIC = 259.5419 BIC = 274.639 MASE = 0.5621408
## p = 5 q = 1 AIC = 250.4857 BIC = 261.8085 MASE = 0.6196805
## p = 5 q = 2 AIC = 252.4857 BIC = 265.0666 MASE = 0.6196831
## p = 5 q = 3 AIC = 254.4675 BIC = 268.3065 MASE = 0.620152
## p = 5 q = 4 AIC = 255.8517 BIC = 270.9489 MASE = 0.6083909
## p = 5 q = 5 AIC = 256.0737 BIC = 272.4289 MASE = 0.5493301
for (i in c(3,4,5)){
ardl <- ardlDlm(x = as.vector(ffdata$Temperature)+as.vector(ffdata$Rainfall)+as.vector(ffdata$Radiation)+as.vector(ffdata$RelHumidity), y = as.vector(ffdata$FFD), p = i, q = 5)
summary(ardl)
#bgtest(ardl$model)
}
##
## Time series regression with "ts" data:
## Start = 6, End = 31
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -63.367 -10.203 -0.015 18.350 36.453
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1477.26374 1365.76665 1.082 0.295
## X.t -4.64092 6.28682 -0.738 0.471
## X.1 2.58737 6.76133 0.383 0.707
## X.2 -6.30493 6.42318 -0.982 0.341
## X.3 -2.46544 6.16105 -0.400 0.694
## Y.1 -0.06198 0.25392 -0.244 0.810
## Y.2 0.08573 0.27328 0.314 0.758
## Y.3 -0.15581 0.27753 -0.561 0.582
## Y.4 0.09237 0.27418 0.337 0.741
## Y.5 0.23504 0.26418 0.890 0.387
##
## Residual standard error: 28.61 on 16 degrees of freedom
## Multiple R-squared: 0.1318, Adjusted R-squared: -0.3565
## F-statistic: 0.27 on 9 and 16 DF, p-value: 0.9741
##
##
## Time series regression with "ts" data:
## Start = 6, End = 31
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -63.26 -10.12 0.49 17.61 36.88
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1534.76078 1594.70318 0.962 0.351
## X.t -4.55557 6.58510 -0.692 0.500
## X.1 2.53173 7.01876 0.361 0.723
## X.2 -6.46004 6.92989 -0.932 0.366
## X.3 -2.31162 6.66630 -0.347 0.734
## X.4 -0.48565 6.28801 -0.077 0.939
## Y.1 -0.06130 0.26234 -0.234 0.818
## Y.2 0.08177 0.28681 0.285 0.779
## Y.3 -0.15693 0.28695 -0.547 0.593
## Y.4 0.09315 0.28330 0.329 0.747
## Y.5 0.22827 0.28653 0.797 0.438
##
## Residual standard error: 29.54 on 15 degrees of freedom
## Multiple R-squared: 0.1322, Adjusted R-squared: -0.4464
## F-statistic: 0.2285 on 10 and 15 DF, p-value: 0.9884
##
##
## Time series regression with "ts" data:
## Start = 6, End = 31
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -46.322 -10.557 1.106 13.976 41.797
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3.650e+02 1.819e+03 -0.201 0.8438
## X.t -1.865e+00 6.313e+00 -0.295 0.7720
## X.1 -2.276e-01 6.715e+00 -0.034 0.9734
## X.2 -4.653e+00 6.534e+00 -0.712 0.4880
## X.3 1.302e+00 6.524e+00 0.200 0.8447
## X.4 -1.937e+00 5.914e+00 -0.328 0.7481
## X.5 1.148e+01 6.344e+00 1.810 0.0918 .
## Y.1 -3.483e-02 2.449e-01 -0.142 0.8889
## Y.2 -1.330e-03 2.712e-01 -0.005 0.9962
## Y.3 4.191e-03 2.818e-01 0.015 0.9883
## Y.4 1.405e-01 2.653e-01 0.530 0.6048
## Y.5 2.666e-01 2.678e-01 0.995 0.3364
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 27.52 on 14 degrees of freedom
## Multiple R-squared: 0.2968, Adjusted R-squared: -0.2558
## F-statistic: 0.5371 on 11 and 14 DF, p-value: 0.8474
From model summaries, we can conclude that all the fitted ARDL models were reported to be statistically significant at 5% level with p-value < 0.05. All models have an adjusted R2=0.933, which means they explain about 93.3% of the variability in radiation.
Regarding model coefficient estimates, we can observe for ARDL(3,5) only X.2 lag of predictor series is significant at 5% level (p-value = 0.0393 < 0.05), for ARDL(4,5) only X.4 lag of predictor series is significant at 5% level (p-value = 03014 < 0.05), and all lags of predictor series are not statistically significant at 5% level for ARDL(5,5). All lags of independent series are statistically significant in all models except Y.2 (p-value = 0.7829 > 0.05).
The plots from diagnostic checking in Figure 9 show that there is a very similar overall picture in residuals from all three fitted models:
The residuals are not as randomly spread as desired, they show evidence of changing variance.
There are a some highly significant lags in the ACF plot. The seasonal lags are also highly significant. Therefore, there is autocorrelation and seasonality still present in the residuals.
Beusch-Godfrey test reports a p-value < 0.05, therefore there is serial correlation in the residuals at 5% level of significance.
Long tails on the histogram of residuals suggest the normality of the residuals is violated.
Based on the observation about model estimates made earlier, we can try to decrease the number of lags for predictor series. We will fit ARDL(1,5) and perform diagnostic checking.
ardl_15 <- ardlDlm(x = as.vector(ffdata$Temperature)+as.vector(ffdata$Rainfall)+as.vector(ffdata$Radiation)+as.vector(ffdata$RelHumidity), y = as.vector(ffdata$FFD), p=1, q=5)
summary(ardl_15)
##
## Time series regression with "ts" data:
## Start = 6, End = 31
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -53.157 -14.444 -0.465 19.927 47.433
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 362.62412 872.79704 0.415 0.683
## X.t -2.19478 5.89426 -0.372 0.714
## X.1 0.50563 5.89138 0.086 0.933
## Y.1 -0.01082 0.24148 -0.045 0.965
## Y.2 0.07149 0.26885 0.266 0.793
## Y.3 -0.05385 0.25949 -0.208 0.838
## Y.4 0.05257 0.25078 0.210 0.836
## Y.5 0.17975 0.25215 0.713 0.485
##
## Residual standard error: 28.26 on 18 degrees of freedom
## Multiple R-squared: 0.04712, Adjusted R-squared: -0.3234
## F-statistic: 0.1272 on 7 and 18 DF, p-value: 0.9951
residualcheck(ardl_15$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.98628, p-value = 0.9724
#temperature
ardl_temp15 <- ardlDlm(x = as.vector(ffdata$Temperature), y = as.vector(ffdata$FFD), p=1, q=5)
summary(ardl_temp15)
##
## Time series regression with "ts" data:
## Start = 6, End = 31
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -55.363 -17.901 -1.096 15.657 45.469
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 246.044968 196.309815 1.253 0.226
## X.t -20.508096 16.166491 -1.269 0.221
## X.1 10.580366 16.043160 0.659 0.518
## Y.1 -0.005384 0.231828 -0.023 0.982
## Y.2 0.060271 0.234783 0.257 0.800
## Y.3 -0.041509 0.235489 -0.176 0.862
## Y.4 0.168807 0.239335 0.705 0.490
## Y.5 0.087847 0.249085 0.353 0.728
##
## Residual standard error: 27.15 on 18 degrees of freedom
## Multiple R-squared: 0.1206, Adjusted R-squared: -0.2214
## F-statistic: 0.3525 on 7 and 18 DF, p-value: 0.9179
residualcheck(ardl_temp15$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.98258, p-value = 0.9233
##Based on the observation about model ,p-value: 0.9179 >0.05, Residual standard error: 27.15 and Adjusted R-squared: -0.2214
#rainfall
ardl_rain15 <- ardlDlm(x = as.vector(ffdata$Rainfall), y = as.vector(ffdata$FFD), p=1, q=5)
summary(ardl_rain15)
##
## Time series regression with "ts" data:
## Start = 6, End = 31
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -51.777 -12.609 -1.348 16.712 45.413
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 143.617855 109.253360 1.315 0.205
## X.t 1.677519 16.237143 0.103 0.919
## X.1 9.943103 16.455578 0.604 0.553
## Y.1 0.003725 0.233996 0.016 0.987
## Y.2 0.031718 0.279802 0.113 0.911
## Y.3 -0.136841 0.280969 -0.487 0.632
## Y.4 0.080008 0.239877 0.334 0.743
## Y.5 0.199777 0.242849 0.823 0.421
##
## Residual standard error: 28.04 on 18 degrees of freedom
## Multiple R-squared: 0.06175, Adjusted R-squared: -0.3031
## F-statistic: 0.1692 on 7 and 18 DF, p-value: 0.9884
residualcheck(ardl_rain15$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.9879, p-value = 0.9852
##Based on the observation about model ,p-value: 0.9884 >0.05, Residual standard error: 28.04 and Adjusted R-squared: -0.3031
#radiation
ardl_rad15 <- ardlDlm(x = as.vector(ffdata$Radiation), y = as.vector(ffdata$FFD), p=1, q=5)
summary(ardl_rad15)
##
## Time series regression with "ts" data:
## Start = 6, End = 31
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -54.212 -11.254 0.832 19.508 48.780
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.178e+01 3.381e+02 0.212 0.834
## X.t 9.774e+00 1.597e+01 0.612 0.548
## X.1 -5.631e+00 1.535e+01 -0.367 0.718
## Y.1 3.118e-02 2.371e-01 0.131 0.897
## Y.2 4.365e-02 2.447e-01 0.178 0.860
## Y.3 7.349e-04 2.596e-01 0.003 0.998
## Y.4 8.326e-02 2.555e-01 0.326 0.748
## Y.5 2.043e-01 2.535e-01 0.806 0.431
##
## Residual standard error: 28.06 on 18 degrees of freedom
## Multiple R-squared: 0.06065, Adjusted R-squared: -0.3047
## F-statistic: 0.166 on 7 and 18 DF, p-value: 0.9891
residualcheck(ardl_rad15$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.9804, p-value = 0.8823
##Based on the observation about model ,p-value: 0.9891 >0.05, Residual standard error: 28.06 and Adjusted R-squared: -0.3047
#humidity
ardl_hum15 <- ardlDlm(x = as.vector(ffdata$RelHumidity), y = as.vector(ffdata$FFD), p=1, q=5)
summary(ardl_hum15)
##
## Time series regression with "ts" data:
## Start = 6, End = 31
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -49.635 -15.409 0.681 20.856 49.830
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 523.482390 884.751156 0.592 0.561
## X.t -1.938592 7.785492 -0.249 0.806
## X.1 -1.937931 7.797149 -0.249 0.807
## Y.1 -0.007652 0.238367 -0.032 0.975
## Y.2 0.037000 0.256453 0.144 0.887
## Y.3 -0.023502 0.251293 -0.094 0.927
## Y.4 0.071939 0.263173 0.273 0.788
## Y.5 0.167932 0.258863 0.649 0.525
##
## Residual standard error: 28.23 on 18 degrees of freedom
## Multiple R-squared: 0.04908, Adjusted R-squared: -0.3207
## F-statistic: 0.1327 on 7 and 18 DF, p-value: 0.9944
residualcheck(ardl_hum15$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.98316, p-value = 0.9329
##Based on the observation about model ,p-value: 0.9944 >0.05, Residual standard error: 28.23 and Adjusted R-squared: -0.3207
The p-value of the overall significance test is > 0.05, therefore ARDL(1,5) model is statistically not significant at 5% level. All the estimated terms are significant at 5% level except Y.2 - second lag of independent series (p-value = 0.9724 > 0.05). Adjusted R2=0.932, which means they explain 93.2% of the variability in radiation.
The plots from diagnostic checking in Figure 10 show the same picture as the diagnostic checkings in Figure 9, so the comments are the same as for previously fitted models.
Overall, none of the models from time series regression method were successful at capturing the autocorrelation and seasonal pattern in radiation series.
We create a data frame accuracy to store the accuracy measures, like AIC/BIC and MASE from the models fitted so far. The accuracy measures for further models will be appended to this data frame.
attr(K_trans$model,"class") = "lm"
#temperature
ardl_temp35 <- ardlDlm(x = (ffdata$Temperature), y = (ffdata$FFD), p=3, q=5)
ardl_temp45 <- ardlDlm(x = as.vector(ffdata$Temperature), y = as.vector(ffdata$FFD), p=4, q=5)
ardl_temp55 <- ardlDlm(x = as.vector(ffdata$Temperature), y = as.vector(ffdata$FFD), p=5, q=5)
models <- c("FFtemp_DLM", "temp_PolyD", "temp_Koyck", "ARDL_temp15", "ARDL_temp35", "ARDL_temp45", "ARDL_temp55")
aic_1 <- AIC(ftem_dlm, temp_polyd, temp_Koyck, ardl_temp15, ardl_temp35, ardl_temp45, ardl_temp55)
## [1] 28.17827
bic_1 <- BIC(ftem_dlm, temp_polyd, temp_Koyck, ardl_temp15, ardl_temp35, ardl_temp45, ardl_temp55)
## [1] 41.75706
MASE_1 <- MASE(ftem_dlm, temp_polyd, temp_Koyck, ardl_temp15, ardl_temp35, ardl_temp45, ardl_temp55)
accuracy_1 <- data.frame(models, MASE_1, aic_1, bic_1 )
colnames(accuracy_1) <- c("Model", "MASE", "AIC", "BIC")
head(accuracy_1)
## Model MASE AIC BIC NA
## ftem_dlm FFtemp_DLM 21 0.7001861 28.17827 41.75706
## temp_polyd temp_PolyD 21 0.6461061 28.17827 41.75706
## temp_Koyck temp_Koyck 30 0.7927338 28.17827 41.75706
## ardl_temp15 ARDL_temp15 26 0.6113428 28.17827 41.75706
## ardl_temp35 ARDL_temp35 26 0.6133766 28.17827 41.75706
## ardl_temp45 ARDL_temp45 26 0.6145971 28.17827 41.75706
2)rainfall
ardl_rain35 <- ardlDlm(x = (ffdata$Rainfall), y = (ffdata$FFD), p=3, q=5)
ardl_rain45 <- ardlDlm(x = as.vector(ffdata$Rainfall), y = as.vector(ffdata$FFD), p=4, q=5)
ardl_rain55 <- ardlDlm(x = as.vector(ffdata$Rainfall), y = as.vector(ffdata$FFD), p=5, q=5)
#better compared to others
models <- c("FFrain_DLM", "rain_PolyD", "rain_Koyck", "ARDL_rain15", "ARDL_rain35", "ARDL_rain45", "ARDL_rain55")
aic_2 <- AIC(frain_dlm, rain_polyd, rain_Koyck, ardl_rain15, ardl_rain35, ardl_rain45, ardl_rain55)
## [1] 35.13643
bic_2 <- BIC(frain_dlm, rain_polyd, rain_Koyck, ardl_rain15, ardl_rain35, ardl_rain45, ardl_rain55)
## [1] 48.71522
MASE_2 <- MASE(frain_dlm, rain_polyd, rain_Koyck, ardl_rain15, ardl_rain35, ardl_rain45, ardl_rain55)
accuracy_2 <- data.frame(models, MASE_2, aic_2, bic_2 )
colnames(accuracy_2) <- c("Model", "MASE", "AIC", "BIC")
head(accuracy_2)
## Model MASE AIC BIC NA
## frain_dlm FFrain_DLM 21 0.5137970 35.13643 48.71522
## rain_polyd rain_PolyD 21 0.6248857 35.13643 48.71522
## rain_Koyck rain_Koyck 30 0.7267872 35.13643 48.71522
## ardl_rain15 ARDL_rain15 26 0.6396905 35.13643 48.71522
## ardl_rain35 ARDL_rain35 26 0.6369725 35.13643 48.71522
## ardl_rain45 ARDL_rain45 26 0.6175106 35.13643 48.71522
#radiation
ardl_rad35 <- ardlDlm(x = as.vector(ffdata$Radiation), y = as.vector(ffdata$FFD), p=3, q=5)
ardl_rad45 <- ardlDlm(x = as.vector(ffdata$Radiation), y = as.vector(ffdata$FFD), p=4, q=5)
ardl_rad55 <- ardlDlm(x = as.vector(ffdata$Radiation), y = as.vector(ffdata$FFD), p=5, q=5)
models <- c("FFrad_DLM", "rad_PolyD", "rad_Koyck", "ARDL_rad15", "ARDL_rad35", "ARDL_rad45", "ARDL_rad55")
aic_3 <- AIC(frad_dlm, rad_polyd, rad_Koyck, ardl_rad15, ardl_rad35, ardl_rad45, ardl_rad55)
## [1] 26.34091
bic_3 <- BIC(frad_dlm, rad_polyd, rad_Koyck, ardl_rad15, ardl_rad35, ardl_rad45, ardl_rad55)
## [1] 39.9197
MASE_3 <- MASE(frad_dlm, rad_polyd, rad_Koyck, ardl_rad15, ardl_rad35, ardl_rad45, ardl_rad55)
accuracy_3 <- data.frame(models, MASE_3, aic_3, bic_3 )
colnames(accuracy_3) <- c("Model", "MASE", "AIC", "BIC")
head(accuracy_3)
## Model MASE AIC BIC NA
## frad_dlm FFrad_DLM 21 0.5356254 26.34091 39.9197
## rad_polyd rad_PolyD 21 0.6463697 26.34091 39.9197
## rad_Koyck rad_Koyck 30 0.7136301 26.34091 39.9197
## ardl_rad15 ARDL_rad15 26 0.6196376 26.34091 39.9197
## ardl_rad35 ARDL_rad35 26 0.6164068 26.34091 39.9197
## ardl_rad45 ARDL_rad45 26 0.6045879 26.34091 39.9197
#humidity
ardl_hum35 <- ardlDlm(x = as.vector(ffdata$RelHumidity), y = as.vector(ffdata$FFD), p=3, q=5)
ardl_hum45 <- ardlDlm(x = as.vector(ffdata$RelHumidity), y = as.vector(ffdata$FFD), p=4, q=5)
ardl_hum55 <- ardlDlm(x = as.vector(ffdata$RelHumidity), y = as.vector(ffdata$FFD), p=5, q=5)
#worst accuracy
models <- c("Fhum_DLM", "hum_PolyD", "hum_Koyck", "ARDL_hum15", "ARDL_hum35", "ARDL_hum45", "ARDL_hum55")
aic_4 <- AIC(fhum_dlm, hum_polyd, hum_Koyck, ardl_hum15, ardl_hum35, ardl_hum45, ardl_hum55)
## [1] 64.34047
bic_4 <- BIC(fhum_dlm, hum_polyd, hum_Koyck, ardl_hum15, ardl_hum35, ardl_hum45, ardl_hum55)
## [1] 77.91927
MASE_4 <- MASE(fhum_dlm, hum_polyd, hum_Koyck, ardl_hum15, ardl_hum35, ardl_hum45, ardl_hum55)
accuracy_4 <- data.frame(models, MASE_4, aic_4, bic_4 )
colnames(accuracy_4) <- c("Model", "MASE", "AIC", "BIC")
head(accuracy_4)
## Model MASE AIC BIC NA
## fhum_dlm Fhum_DLM 21 0.6141544 64.34047 77.91927
## hum_polyd hum_PolyD 21 0.6520980 64.34047 77.91927
## hum_Koyck hum_Koyck 30 0.7335956 64.34047 77.91927
## ardl_hum15 ARDL_hum15 26 0.6497544 64.34047 77.91927
## ardl_hum35 ARDL_hum35 26 0.5175561 64.34047 77.91927
## ardl_hum45 ARDL_hum45 26 0.5186368 64.34047 77.91927
Another forecasting method we can try is exponential smoothing. Because we have found a weak seasonal component in ffd series also the frequncy is only 1 year based we cannot use expo smoothening.
ffd_ts <- ts(Ts_FFD, start=c(2015,1), frequency= 12)
ffd_ts
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 2015 217 186 233 222 214 237 213 206 188 234 264 196
## 2016 229 212 244 178 154 207 182 218 192 199 200 225
## 2017 216 197 230 204 233 174 189
ES = c(T,F)
seasonality <- c("additive","multiplicative")
damped <- c(T,F)
expa <- expand.grid(ES, seasonality, damped)
expa <- expa[-c(1,5),]
f_aic <- array(NA, 6)
f_bic <- array(NA, 6)
f_mase <- array(NA, 6)
levels <- array(NA, dim=c(6,3))
for (i in 1:6){
holt_w <- hw(ffd_ts, ES = expa[i,1], seasonal = toString(expa[i,2], damped = expa[i,3]))
f_aic[i] <- holt_w$model$aic
f_bic[i] <- holt_w$model$bic
f_mase[i] <- accuracy(holt_w)[6]
levels[i,1] <- expa[i,1]
levels[i,2] <- toString(expa[i,2])
levels[i,3] <- expa[i,3]
checkresiduals(holt_w)
}
##
## Ljung-Box test
##
## data: Residuals from Holt-Winters' additive method
## Q* = 10.781, df = 3, p-value = 0.01297
##
## Model df: 16. Total lags used: 19
##
## Ljung-Box test
##
## data: Residuals from Holt-Winters' multiplicative method
## Q* = 20.214, df = 3, p-value = 0.0001533
##
## Model df: 16. Total lags used: 19
##
## Ljung-Box test
##
## data: Residuals from Holt-Winters' multiplicative method
## Q* = 20.214, df = 3, p-value = 0.0001533
##
## Model df: 16. Total lags used: 19
##
## Ljung-Box test
##
## data: Residuals from Holt-Winters' additive method
## Q* = 10.781, df = 3, p-value = 0.01297
##
## Model df: 16. Total lags used: 19
##
## Ljung-Box test
##
## data: Residuals from Holt-Winters' multiplicative method
## Q* = 20.214, df = 3, p-value = 0.0001533
##
## Model df: 16. Total lags used: 19
##
## Ljung-Box test
##
## data: Residuals from Holt-Winters' multiplicative method
## Q* = 20.214, df = 3, p-value = 0.0001533
##
## Model df: 16. Total lags used: 19
#Overall, we can observe a slight improvement in residuals from exponential smoothing models in terms of serial correlation and especially seasonality. Holt-Winters’ multiplicative method is the most successful at capturing autocorrelation and seasonality in radiation series.
#We append the accuracy measures for exponential smoothing models to the accuracy data frame. The format of model names is: trend (multiplicative or additive), seasonality (multiplicative or additive) and if the trend is damped (damped) or no (N).
newvalues <- data.frame(levels, f_mase, f_aic, f_bic, NA)
colnames(newvalues) <- c("Trend", "Seasonality", "damped", "MASE", "AIC", "BIC","NA")
newvalues$Trend <- factor(newvalues$Trend, levels = c(T,F), labels = c("multiplicative","additive"))
newvalues$damped <- factor(newvalues$damped, levels = c(T,F), labels = c("damped","N"))
newvalues <- unite(newvalues, col = "Model", c("Trend","Seasonality","damped"))
accuracy_T <- rbind(accuracy_1, accuracy_2,accuracy_3,accuracy_4)
accuracy_T
## Model MASE AIC BIC NA
## ftem_dlm FFtemp_DLM 21 0.7001861 28.17827 41.75706
## temp_polyd temp_PolyD 21 0.6461061 28.17827 41.75706
## temp_Koyck temp_Koyck 30 0.7927338 28.17827 41.75706
## ardl_temp15 ARDL_temp15 26 0.6113428 28.17827 41.75706
## ardl_temp35 ARDL_temp35 26 0.6133766 28.17827 41.75706
## ardl_temp45 ARDL_temp45 26 0.6145971 28.17827 41.75706
## ardl_temp55 ARDL_temp55 26 0.5872934 28.17827 41.75706
## frain_dlm FFrain_DLM 21 0.5137970 35.13643 48.71522
## rain_polyd rain_PolyD 21 0.6248857 35.13643 48.71522
## rain_Koyck rain_Koyck 30 0.7267872 35.13643 48.71522
## ardl_rain15 ARDL_rain15 26 0.6396905 35.13643 48.71522
## ardl_rain35 ARDL_rain35 26 0.6369725 35.13643 48.71522
## ardl_rain45 ARDL_rain45 26 0.6175106 35.13643 48.71522
## ardl_rain55 ARDL_rain55 26 0.5306522 35.13643 48.71522
## frad_dlm FFrad_DLM 21 0.5356254 26.34091 39.91970
## rad_polyd rad_PolyD 21 0.6463697 26.34091 39.91970
## rad_Koyck rad_Koyck 30 0.7136301 26.34091 39.91970
## ardl_rad15 ARDL_rad15 26 0.6196376 26.34091 39.91970
## ardl_rad35 ARDL_rad35 26 0.6164068 26.34091 39.91970
## ardl_rad45 ARDL_rad45 26 0.6045879 26.34091 39.91970
## ardl_rad55 ARDL_rad55 26 0.5314267 26.34091 39.91970
## fhum_dlm Fhum_DLM 21 0.6141544 64.34047 77.91927
## hum_polyd hum_PolyD 21 0.6520980 64.34047 77.91927
## hum_Koyck hum_Koyck 30 0.7335956 64.34047 77.91927
## ardl_hum15 ARDL_hum15 26 0.6497544 64.34047 77.91927
## ardl_hum35 ARDL_hum35 26 0.5175561 64.34047 77.91927
## ardl_hum45 ARDL_hum45 26 0.5186368 64.34047 77.91927
## ardl_hum55 ARDL_hum55 26 0.5164104 64.34047 77.91927
For each exponential smoothing method, there are two corresponding state-space models (with additive or multiplicative errors). There are 8 state-space variations which include seasonality that we cam implement in R (some combinations are forbidden due to their stability issues). We create a loop to fit these models and capture their accuracy measures for further comparison.
vlist <- c("AAA", "MAA", "MAM", "MMM")
damp <- c(T,F)
ets_models <- expand.grid(vlist, damp)
ets_aic <- array(NA, 8)
ets_mase <- array(NA,8)
ets_bic <- array(NA,8)
mod <- array(NA, dim=c(8,2))
#Auto ETS fitted to see what the software automatically suggested model is
auto_ets <- ets(ffd_ts)
summary(auto_ets)
## ETS(M,N,N)
##
## Call:
## ets(y = ffd_ts)
##
## Smoothing parameters:
## alpha = 1e-04
##
## Initial states:
## l = 209.448
##
## sigma: 0.1137
##
## AIC AICc BIC
## 306.9448 307.8337 311.2468
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.001214657 23.03386 18.696 -1.263579 9.173601 0.6517872
## ACF1
## Training set -0.006864096
#The model suggested automatically is ETS(M,N,N) which is a model with multiplicative errors, No damped trend and no seasonality.
checkresiduals(auto_ets)
##
## Ljung-Box test
##
## data: Residuals from ETS(M,N,N)
## Q* = 7.489, df = 4, p-value = 0.1122
##
## Model df: 2. Total lags used: 6
Overall, the residual analysis suggest this model is not successful at capturing autocorrelation and seasonality in ffd series.
We append the accuracy measures for state-space models to the accuracy data frame.
library(tidyr)
calculate <- data.frame(mod, ets_mase, ets_aic, ets_bic,"NA")
calculate$X2 <- factor(calculate$X2, levels = c(T,F), labels = c("Damped","N"))
calculate <- unite(calculate, "Model", c("X1","X2"))
colnames(calculate) <- c("Model", "MASE", "AIC", "BIC","NA")
accuracy_T <- rbind(accuracy_1, accuracy_2,accuracy_3,accuracy_4)
accuracy_T
## Model MASE AIC BIC NA
## ftem_dlm FFtemp_DLM 21 0.7001861 28.17827 41.75706
## temp_polyd temp_PolyD 21 0.6461061 28.17827 41.75706
## temp_Koyck temp_Koyck 30 0.7927338 28.17827 41.75706
## ardl_temp15 ARDL_temp15 26 0.6113428 28.17827 41.75706
## ardl_temp35 ARDL_temp35 26 0.6133766 28.17827 41.75706
## ardl_temp45 ARDL_temp45 26 0.6145971 28.17827 41.75706
## ardl_temp55 ARDL_temp55 26 0.5872934 28.17827 41.75706
## frain_dlm FFrain_DLM 21 0.5137970 35.13643 48.71522
## rain_polyd rain_PolyD 21 0.6248857 35.13643 48.71522
## rain_Koyck rain_Koyck 30 0.7267872 35.13643 48.71522
## ardl_rain15 ARDL_rain15 26 0.6396905 35.13643 48.71522
## ardl_rain35 ARDL_rain35 26 0.6369725 35.13643 48.71522
## ardl_rain45 ARDL_rain45 26 0.6175106 35.13643 48.71522
## ardl_rain55 ARDL_rain55 26 0.5306522 35.13643 48.71522
## frad_dlm FFrad_DLM 21 0.5356254 26.34091 39.91970
## rad_polyd rad_PolyD 21 0.6463697 26.34091 39.91970
## rad_Koyck rad_Koyck 30 0.7136301 26.34091 39.91970
## ardl_rad15 ARDL_rad15 26 0.6196376 26.34091 39.91970
## ardl_rad35 ARDL_rad35 26 0.6164068 26.34091 39.91970
## ardl_rad45 ARDL_rad45 26 0.6045879 26.34091 39.91970
## ardl_rad55 ARDL_rad55 26 0.5314267 26.34091 39.91970
## fhum_dlm Fhum_DLM 21 0.6141544 64.34047 77.91927
## hum_polyd hum_PolyD 21 0.6520980 64.34047 77.91927
## hum_Koyck hum_Koyck 30 0.7335956 64.34047 77.91927
## ardl_hum15 ARDL_hum15 26 0.6497544 64.34047 77.91927
## ardl_hum35 ARDL_hum35 26 0.5175561 64.34047 77.91927
## ardl_hum45 ARDL_hum45 26 0.5186368 64.34047 77.91927
## ardl_hum55 ARDL_hum55 26 0.5164104 64.34047 77.91927
#The data frame accuracy is sorted by ascending MASE value.
The accuracy table will be used to compare all methods we have tried at the modelling stage in terms of their MASE. The model that minimises MASE is Holt-Winters’ multiplicative method with additive trend (there is no difference in models in terms of damping). The best state-space model in terms of lowest MASE is ETS(A,Ad,A) which is the model with additive errors, additive damped trend and additive seasonality. ETS(A,Ad,A) was also the model suggested automatically. We can see from the table that time series regression methods perform the worst in terms of MASE but this approach has lower AIC/BIC measures than the exponential smoothing approach.
For deciding on the final model to give four years ahead forecasts of FFD value, we compare forecasts from three models:
Holt-Winters’ multiplicative method which has the lowest MASE and is the most successful at capturing the autocorrelation and seasonality in the series
Holt-Winters’ multiplicative method with multiplicative trend which has the second lowest MASE and is also good at capturing the autocorrelation and seasonality in the series
ETS(A,Ad,A) model was suggested by an automatic algorithm and has the lowest MASE of all state-space models but does not capture autocorrelation in the series
The fitted values and 4 year forecasts are displayed in Figure 13.
fitm1 <- hw(ffd_ts, seasonal = "multiplicative", h = 2*frequency(ffd_ts))
fitm2 <- hw(ffd_ts, seasonal = "multiplicative", exponential = T, h = 2*frequency(ffd_ts))
fitm3 <- ets(ffd_ts,model="AAA", damped=T)
#class(fit3)
#methods(forecast())
for_fit3 <- forecast.ets(fitm3)
plot(for_fit3, fcol = "black", main = "FFD occurences series with four years ahead forecasts", ylab = "ffd", ylim = c(-10,55))
lines(fitted(fitm1), col = "darkgreen")
lines(fitm1$mean, col = "darkgreen", lwd = 2)
lines(fitted(fitm2), col = "brown2")
lines(fitm2$mean, col = "brown2", lwd = 2)
lines(fitted(fitm3), col = "dodgerblue3")
lines(for_fit3$mean, col = "dodgerblue3", lwd = 2)
legend("bottomleft", lty = 1, col = c("black", "darkgreen", "brown2", "dodgerblue3"), c("Data", "Holt-Winters' Multiplicative", "Holt-Winters' Multiplicative Exponential", "ETS(M,N,N)"))
Overall, based on the residual analysis and the lowest MASE value, we take Holt-Winters’ multiplicative model to give 4 years ahead forecasts of the amount of solar radiation.
The final forecasts are displayed in Figure 14.
plot(fitm1, fcol = "white", main = "FFD series with four years ahead forecasts", ylab = "ffd occurences")
lines(fitted(fitm1), col = "darkgreen")
lines(fitm1$mean, col = "darkgreen", lwd = 2)
legend("topleft", lty = 1, col = c("black", "darkgreen"), c("Data", "Forecasts"))
#The solar radiation 2 years ahead point forecast values with corresponding 95% confidence intervals are as follows:
forc <- fitm1$mean
ub <- fitm1$upper[,2]
lb <- fitm1$lower[,2]
forecasts <- ts.intersect(ts(lb, start = c(2015,1),end =c(2018,1) , frequency = 1), ts(forc,start = c(2015,1),end =c(2018,1), frequency = 1), ts(ub,start = c(2015,1),end =c(2018,1), frequency = 1))
colnames(forecasts) <- c("Lower bound", "Point forecast", "Upper bound")
forecasts
## Time Series:
## Start = 2015
## End = 2018
## Frequency = 1
## Lower bound Point forecast Upper bound
## 2015 152.4610 202.5019 252.5427
## 2016 139.4139 185.1746 230.9353
## 2017 156.2027 207.4778 258.7530
## 2018 172.0050 228.4733 284.9416
plot(forecasts)
#However, we can observe that the 95% confidence intervals for the forecasts from selected approach are very wide and do not provide reliable forecasts
This task consists of two parts. Task 3[A] The objective of the first task is Carry out your analysis based on univariate climate regressors (model one climate indicator at a time, i.e., univariate regressor). For this task, we will apply Modelling methods like (DLM, ARDL, polyck, koyck, dynlm)and will make Choice of optimal models within EACH a specific method can be assessed from values of R squared, AIC, BIC, MASE etc (as is appropriate to the method). The final goal of this analysis is to forecast RBO three years ahead using each regressor one at a time and (use percentiles for the regressors) in forecasting for each of the best models within the methods utilised.
task 3[B] The aim of the second task is to analyse the correlation structure between quarterly Residential Property Price Index (PPI) in Melbourne and population change over the previous quarter in Victoria from September 2003 to December 2016. We will explore and demonstrate if the correlation between the two series is found spurious or not.
Dataset explores the relative flowering order similarity of 81 species of plants from 1983 to 2014. The species were ranked annually by the time taken to flower (FFD), and changes in flowering order were measured by computing the similarity between annual flowering order and the flowering order of 1983 using the Rank-based Order similarity metric (RBO). The earliest flowering species is ranked 1 and latest ranked 81 for the given year under study. RBO values are therefore numbers between 0 and 1. Higher RBO values indicate higher similarity of the order of the first flowering occurrence (based on FFD) of the 81 species from 1983 compared to each of the subsequent years, 1984 to 2014, so the time series are of length 31.
Flowering orders became more dissimilar over the most recent decades, particularly during the Millennium Drought (1997 – 2009), suggesting that flora in Australia is responding to changes in their environment. According to the BoM the drought period for Australia occurred from 1996 to 2009.
Primary Dataset “RBO.csv” consist of four dependent factors :Temperature,Rainfall, radiation ,Humidity with Target variable RBO which gives order value wrt FFD. Secondary dataset given “Covariate x-values for Task 3.csv” consist of future values 2015 to 2018 of dependent variables.
#Load data
RBOdata <- read.csv("D:/Drive data/Rmit/Sem4/Forecasting/RBO.csv")
RBOdata
## ï..Year RBO Temperature Rainfall Radiation RelHumidity
## 1 1984 0.7550088 9.371585 2.489344 14.87158 93.92650
## 2 1985 0.7407520 9.656164 2.475890 14.68493 94.93589
## 3 1986 0.8423860 9.273973 2.421370 14.51507 94.09507
## 4 1987 0.7484425 9.219178 2.319726 14.67397 94.49699
## 5 1988 0.7984084 10.202186 2.465301 14.74863 94.08142
## 6 1989 0.7938803 9.441096 2.735890 14.78356 96.08685
## 7 1990 0.7925678 9.943836 2.398630 14.67671 93.77918
## 8 1991 0.8138698 9.690411 2.635616 14.41096 93.15562
## 9 1992 0.8152843 9.691257 2.795902 13.39617 94.09863
## 10 1993 0.7758007 9.947945 2.878630 14.26575 94.91973
## 11 1994 0.7471853 9.316438 1.974795 14.52329 93.26932
## 12 1995 0.7508197 9.164384 2.843288 13.90411 94.45863
## 13 1996 0.6644419 8.967213 2.814754 14.33060 94.60000
## 14 1997 0.6941213 9.038356 1.403014 14.77534 93.74685
## 15 1998 0.7045545 8.934247 2.289041 14.60000 94.60822
## 16 1999 0.6992259 9.547945 2.126301 14.61370 96.22603
## 17 2000 0.7137116 9.680328 2.471858 14.65574 95.65738
## 18 2001 0.7267423 9.561644 2.227945 14.14521 94.70712
## 19 2002 0.6629484 9.389041 1.740000 14.63836 93.53233
## 20 2003 0.7118227 9.210959 2.270411 15.11233 94.47096
## 21 2004 0.7039938 9.300546 2.620492 14.64481 95.01421
## 22 2005 0.7321166 9.623288 2.284110 15.09315 94.30356
## 23 2006 0.7258027 8.715068 1.781370 15.41096 94.84493
## 24 2007 0.7007718 9.801370 2.191233 15.19452 94.11068
## 25 2008 0.7445151 9.034153 1.743169 14.80328 94.39508
## 26 2009 0.6853045 9.457534 2.038630 15.12877 94.63096
## 27 2010 0.7022626 9.765753 2.777808 14.29315 96.05205
## 28 2011 0.7582674 9.826027 2.886301 14.01096 95.70603
## 29 2012 0.7346374 9.767760 2.599454 14.40710 94.90519
## 30 2013 0.7255165 10.097260 2.540274 14.43014 93.83479
## 31 2014 0.7090916 10.247253 2.239286 14.60165 94.21016
#Converting into timeseries
RBO_ts<- ts(RBOdata$RBO, start =c(1984,1), frequency = 1)
head(RBO_ts)
## Time Series:
## Start = 1984
## End = 1989
## Frequency = 1
## [1] 0.7550088 0.7407520 0.8423860 0.7484425 0.7984084 0.7938803
Temperature_ts <- ts(RBOdata$Temperature, start =c(1984,1), frequency = 1)
head(Temperature_ts)
## Time Series:
## Start = 1984
## End = 1989
## Frequency = 1
## [1] 9.371585 9.656164 9.273973 9.219178 10.202186 9.441096
RainFall_ts <-ts(RBOdata$Temperature, start =c(1984,1), frequency = 1)
head(RainFall_ts)
## Time Series:
## Start = 1984
## End = 1989
## Frequency = 1
## [1] 9.371585 9.656164 9.273973 9.219178 10.202186 9.441096
Radiation_ts <-ts(RBOdata$Radiation, start =c(1984,1), frequency = 1)
head(Radiation_ts)
## Time Series:
## Start = 1984
## End = 1989
## Frequency = 1
## [1] 14.87158 14.68493 14.51507 14.67397 14.74863 14.78356
Humidity_ts <-ts(RBOdata$RelHumidity, start =c(1984,1), frequency = 1)
head(Humidity_ts)
## Time Series:
## Start = 1984
## End = 1989
## Frequency = 1
## [1] 93.92650 94.93589 94.09507 94.49699 94.08142 96.08685
RBOdata_ts <-ts(RBOdata[,2:6], start =c(1984,1), frequency = 1)
head(RBOdata_ts)
## Time Series:
## Start = 1984
## End = 1989
## Frequency = 1
## RBO Temperature Rainfall Radiation RelHumidity
## 1984 0.7550088 9.371585 2.489344 14.87158 93.92650
## 1985 0.7407520 9.656164 2.475890 14.68493 94.93589
## 1986 0.8423860 9.273973 2.421370 14.51507 94.09507
## 1987 0.7484425 9.219178 2.319726 14.67397 94.49699
## 1988 0.7984084 10.202186 2.465301 14.74863 94.08142
## 1989 0.7938803 9.441096 2.735890 14.78356 96.08685
#Load Covariate x-values for Task 3
xvalues <- read.csv("D:/Drive data/Rmit/Sem4/Forecasting/Covariate x-values for Task 3 .csv")
head(xvalues)
## ï..Year Temperature Rainfall Radiation RelHumidity
## 1 2015 10.23 2.27 14.60 94.45
## 2 2016 10.10 2.38 14.56 94.03
## 3 2017 9.53 2.26 14.79 95.04
## 4 2018 9.54 2.27 14.79 95.06
# **Data exploration and visualisation
plot(RBO_ts, main = "Fig.1 Time series plot of the order of the FFD of the 81 species from 1983 ", ylab = "Similarity values for the flowering orders ", xlab = "Time")
#points(RBO_ts, x=time(RBO_ts), pch = as.vector(season(RBO_ts)))
From the plot in Figure 1, we can observe the following characteristics of the series:
There is no apparent trend.
There is obvious seasonality, with lower values in December and January and higher values in June and July. The seasonal pattern is not consistent across the observed time.
Changing variance and behaviour of the series are not obvious due to the presence of seasonality.
There are two potential intervention points around 1965 and 1987.
We will further display sample ACF and conduct an Augmented Dickey-Fuller test to study stationarity and seasonality in the series. The length of our data allows to display more lags in the ACF plot to better observe any evidence of trend.
acf(RBO_ts, lag.max = 48, main="Fig.2 ACF plot of RBO values series")
adf.test(RBO_ts, k=ar(RBO_ts)$order)
##
## Augmented Dickey-Fuller Test
##
## data: RBO_ts
## Dickey-Fuller = -1.4542, Lag order = 2, p-value = 0.7829
## alternative hypothesis: stationary
The ACF plot in Figure 2 shows strong seasonal patterns and suggests no trend. ADF test with lag order = 25 reports stationarity in the series at 5% level of significance (p-value < 0.05). Overall, we conclude that solar radiation series has a strong seasonality pattern.
We will display a time series plot of precipitation series which we will use as a predictor series for distributed lag models.
#Dependent variables plotting
par(mfrow=c(2,2))
plot(Temperature_ts, main ="Fig.3.1 Time series plot of temperature effects on rbo value", ylab="Temperature change", xlab = "Time")
plot(RainFall_ts, main ="Fig3.2.Time series plot of Rain effects on rbo value series", ylab="Rainfall", xlab = "Time")
plot(Radiation_ts, main ="Fig 3.3 Time series plot of Radiations on rbo value series", ylab="Radiations", xlab = "Time")
plot(Humidity_ts, main ="Fig 3.4 Time series plot of Humidity effects on ffd series", ylab="Humidity", xlab = "Time")
#points (precip, x= time(precip), pch = as.vector(season(precip)))
par(mfrow=c(1,1))
Plots 3.1 to 3.4 concludes: Based on the plot in Figure 3, we can make the following comments on the characteristics of the series:
There might be a slight downward trend, especially in the beginning of the series.
There is a clear seasonality, while the pattern changes overtime, we can say that lower values are observed in July and August and higher values in December-January.
The existence of changing variance and behaviour of the series is not apparent due to seasonality.
There are no obvious intervention points
To study more on trend and seasonality we further display ADF test and acf plot for each feature
par(mfrow=c(2,2))
#Temperature
acf(Temperature_ts,lag.max = 48, main = "Fig. 4.1 ACF plot of Temperature values")
adf.test(Temperature_ts,k=ar(Temperature_ts)$order)
##
## Augmented Dickey-Fuller Test
##
## data: Temperature_ts
## Dickey-Fuller = -1.1484, Lag order = 2, p-value = 0.9002
## alternative hypothesis: stationary
#Rainfall
acf(RainFall_ts,lag.max = 48, main = "Fig.4.2 ACF plot of rainfall values")
adf.test(RainFall_ts,k=ar(RainFall_ts)$order)
##
## Augmented Dickey-Fuller Test
##
## data: RainFall_ts
## Dickey-Fuller = -1.1484, Lag order = 2, p-value = 0.9002
## alternative hypothesis: stationary
#radiation
acf(Radiation_ts,lag.max = 48, main = "Fig 4.3 ACF plot of radiation values")
adf.test(Radiation_ts,k=ar(Radiation_ts)$order)
##
## Augmented Dickey-Fuller Test
##
## data: Radiation_ts
## Dickey-Fuller = -2.7317, Lag order = 4, p-value = 0.2911
## alternative hypothesis: stationary
#humidity
acf(Humidity_ts,lag.max = 48, main = "Fig.4.4 ACF plot of humidity values")
adf.test(Humidity_ts,k=ar(Humidity_ts)$order)
##
## Augmented Dickey-Fuller Test
##
## data: Humidity_ts
## Dickey-Fuller = -4.5749, Lag order = 0, p-value = 0.01
## alternative hypothesis: stationary
par(mfrow=c(1,1))
From Figure 4, we can observe that there is a strong seasonal pattern, a decaying pattern of seasonal lags also suggests the possible existence of trend. The ADF test reports p-value = 0.078 > 0.05 which suggests the series is nonstationary at 5% level of significance.
To clearly display the dependent radiation series versus the explanatory precipitation series within the same plot, we will standardise the data. The following code creates a time series plot to explore the relationship of the series.
#Scaling of data
shift<- scale(RBOdata_ts)
plot(shift, plot.type="s",col=c("Red", "Blue", "Brown","Black","Green"),main= "Fig.5 RBO similarity values for the flowering orders versus factor affecting RBO wrt time(Scaled)")
legend("bottomright", lty=1, text.width = 7, col = c("Red", "Blue", "Brown","Black","Green"), c("Temperature", "Rain", "Radiation", "Humidity","FFD"))
The plot in Figure 5 shows that the dependent and the independent series are likely to be negatively correlated. High values of radiation correspond to low values of precipitation and vice versa.
We also calculate the correlation coefficient to check the relationship.
#correlation between each variable with RBO
cor(RBO_ts,Temperature_ts)
## [1] 0.2610007
cor(RBO_ts,RainFall_ts)
## [1] 0.2610007
cor(RBO_ts,Radiation_ts)
## [1] -0.3173602
cor(RBO_ts,Humidity_ts)
## [1] -0.1776349
The correlation between temperature and rainfall is same i.e 0.2610007 whereas for radiation and humidity coefficient is reported r=-0.3173602 and -0.1776349 respectively which suggests a moderate negative correlation between the series and confirms the conclusion made from the plot in Figure 5. After we have explored the characteristics of the individual series and found the evidence of relationship between them, we proceed to modelling stage.
##Finite distributed lag model To find a suitable model for forecasting solar radiation values, we will try fitting distributed lag models which include an independent explanatory series and its lags to help explain the overall variation and correlation structure in our dependent series.
To specify the finite lag length for the model, we create a loop that computes accuracy measures like AIC/BIC and MASE for the models with different lag lengths and select a model with the lowest values.
for (i in 1:10){
model1 <- dlm(x = RBOdata$Temperature, y = RBOdata$RBO, q = i)
cat("q =", i, "AIC =", AIC(model1$model), "BIC =", BIC(model1$model), "MASE =", MASE(model1)$MASE, "\n")
}
## q = 1 AIC = -101.8617 BIC = -96.2569 MASE = 0.9239038
## q = 2 AIC = -95.49894 BIC = -88.66246 MASE = 1.032564
## q = 3 AIC = -96.76727 BIC = -88.77404 MASE = 1.033663
## q = 4 AIC = -92.75653 BIC = -83.68567 MASE = 1.005373
## q = 5 AIC = -91.46337 BIC = -81.3986 MASE = 0.8594175
## q = 6 AIC = -85.74127 BIC = -74.77139 MASE = 0.8103361
## q = 7 AIC = -82.04015 BIC = -70.25962 MASE = 0.7518958
## q = 8 AIC = -83.28717 BIC = -70.79674 MASE = 0.6497633
## q = 9 AIC = -88.10651 BIC = -75.014 MASE = 0.5120906
## q = 10 AIC = -87.92965 BIC = -74.35085 MASE = 0.458588
#Temperature has lowest Aic at q=1 than rainfall data
for (i in 1:10){
model1_r <- dlm(x = RBOdata$Rainfall, y = RBOdata$RBO, q = i)
cat("q =", i, "AIC =", AIC(model1_r$model), "BIC =", BIC(model1_r$model), "MASE =", MASE(model1_r)$MASE, "\n")
}
## q = 1 AIC = -100.898 BIC = -95.29319 MASE = 0.9417954
## q = 2 AIC = -96.70956 BIC = -89.87308 MASE = 0.9993747
## q = 3 AIC = -97.19966 BIC = -89.20643 MASE = 0.9796852
## q = 4 AIC = -90.46187 BIC = -81.39101 MASE = 1.038827
## q = 5 AIC = -87.24242 BIC = -77.17765 MASE = 0.925677
## q = 6 AIC = -82.31788 BIC = -71.348 MASE = 0.8543964
## q = 7 AIC = -77.98405 BIC = -66.20351 MASE = 0.829337
## q = 8 AIC = -76.81922 BIC = -64.32879 MASE = 0.7233794
## q = 9 AIC = -80.79432 BIC = -67.70181 MASE = 0.6205897
## q = 10 AIC = -76.93255 BIC = -63.35376 MASE = 0.585617
Finite dlm of each variate
1)Temperature
temp_dlm <- dlm(x = RBOdata$Temperature, y = RBOdata$RBO, q=10)
summary(temp_dlm)
##
## Call:
## lm(formula = model.formula, data = design)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.033097 -0.011942 0.005304 0.008460 0.030820
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.521379 0.537055 0.971 0.3570
## x.t 0.003982 0.016965 0.235 0.8197
## x.1 0.047512 0.018674 2.544 0.0315 *
## x.2 -0.010070 0.019814 -0.508 0.6235
## x.3 -0.024214 0.020051 -1.208 0.2580
## x.4 0.011690 0.021762 0.537 0.6042
## x.5 -0.001764 0.023152 -0.076 0.9409
## x.6 0.017653 0.019245 0.917 0.3829
## x.7 0.015177 0.018673 0.813 0.4373
## x.8 -0.011418 0.020339 -0.561 0.5882
## x.9 -0.036343 0.019400 -1.873 0.0938 .
## x.10 0.008222 0.018870 0.436 0.6733
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.02453 on 9 degrees of freedom
## Multiple R-squared: 0.6008, Adjusted R-squared: 0.1129
## F-statistic: 1.232 on 11 and 9 DF, p-value: 0.3834
##
## AIC and BIC values for the model:
## AIC BIC
## 1 -87.92965 -74.35085
vif(temp_dlm$model)
## x.t x.1 x.2 x.3 x.4 x.5 x.6 x.7
## 1.525783 1.621374 1.577203 1.582646 1.951362 2.096894 1.824277 1.649730
## x.8 x.9 x.10
## 1.882237 1.408012 1.321229
From the temperature series, we obtained Adjusted R-squared: 0.1129,p-value: 0.3834 >0.05 and AIC:-87.92965
2)Rain
rain_dlm <- dlm(x = RBOdata$Rainfall, y = RBOdata$RBO, q=10)
summary(rain_dlm)
##
## Call:
## lm(formula = model.formula, data = design)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.050274 -0.013229 -0.001445 0.015071 0.039030
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.709414 0.129781 5.466 0.000397 ***
## x.t 0.016449 0.019478 0.845 0.420260
## x.1 0.009100 0.017831 0.510 0.622079
## x.2 0.014628 0.018404 0.795 0.447163
## x.3 -0.006321 0.018174 -0.348 0.735980
## x.4 -0.006181 0.020176 -0.306 0.766285
## x.5 0.004570 0.020010 0.228 0.824453
## x.6 -0.007054 0.019391 -0.364 0.724424
## x.7 -0.011836 0.021444 -0.552 0.594424
## x.8 0.004407 0.020473 0.215 0.834366
## x.9 -0.021710 0.021728 -0.999 0.343817
## x.10 0.006802 0.022752 0.299 0.771745
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.03187 on 9 degrees of freedom
## Multiple R-squared: 0.3261, Adjusted R-squared: -0.4975
## F-statistic: 0.3959 on 11 and 9 DF, p-value: 0.9251
##
## AIC and BIC values for the model:
## AIC BIC
## 1 -76.93255 -63.35376
vif(rain_dlm$model)
## x.t x.1 x.2 x.3 x.4 x.5 x.6 x.7
## 1.242349 1.147054 1.282021 1.257098 1.420090 1.381663 1.279639 1.407959
## x.8 x.9 x.10
## 1.274726 1.277607 1.399164
From the temperature series, we obtained Adjusted R-squared: -0.4975,p-value: 0.9251 >0.05 and AIC:-76.93255
3)Radiation
rad_dlm <- dlm(x = RBOdata$Radiation, y = RBOdata$RBO, q=10)
summary(rad_dlm)
##
## Call:
## lm(formula = model.formula, data = design)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.036879 -0.014268 -0.000611 0.013122 0.040675
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.428718 0.623265 0.688 0.509
## x.t -0.036410 0.037502 -0.971 0.357
## x.1 0.048104 0.044770 1.074 0.311
## x.2 -0.021862 0.026884 -0.813 0.437
## x.3 0.002035 0.025673 0.079 0.939
## x.4 0.002219 0.026890 0.083 0.936
## x.5 0.002622 0.025560 0.103 0.921
## x.6 0.027607 0.026512 1.041 0.325
## x.7 -0.014768 0.025996 -0.568 0.584
## x.8 0.008387 0.026420 0.317 0.758
## x.9 0.010569 0.032643 0.324 0.754
## x.10 -0.008962 0.032801 -0.273 0.791
##
## Residual standard error: 0.03102 on 9 degrees of freedom
## Multiple R-squared: 0.3616, Adjusted R-squared: -0.4187
## F-statistic: 0.4634 on 11 and 9 DF, p-value: 0.8854
##
## AIC and BIC values for the model:
## AIC BIC
## 1 -78.06879 -64.49
vif(rad_dlm$model)
## x.t x.1 x.2 x.3 x.4 x.5 x.6 x.7
## 4.571066 6.783792 3.502952 3.193621 3.262380 2.898270 2.938461 2.800495
## x.8 x.9 x.10
## 2.625446 3.207576 3.013446
From the temperature series, we obtained Adjusted R-squared: -0.4187,p-value: 0.8854 >0.05 and AIC:-78.06879
4)humidity
hum_dlm <- dlm(x = RBOdata$RelHumidity, y = RBOdata$RBO, q=10)
summary(hum_dlm)
##
## Call:
## lm(formula = model.formula, data = design)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.0305118 -0.0079610 -0.0002443 0.0159831 0.0256032
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3.3474469 2.8680763 -1.167 0.2731
## x.t 0.0090536 0.0094493 0.958 0.3630
## x.1 -0.0067710 0.0098800 -0.685 0.5104
## x.2 0.0214895 0.0097786 2.198 0.0556 .
## x.3 -0.0116432 0.0094885 -1.227 0.2509
## x.4 -0.0052385 0.0092682 -0.565 0.5857
## x.5 0.0177998 0.0087535 2.033 0.0725 .
## x.6 0.0005203 0.0081145 0.064 0.9503
## x.7 0.0091601 0.0080459 1.138 0.2843
## x.8 0.0074338 0.0079817 0.931 0.3760
## x.9 -0.0031835 0.0081450 -0.391 0.7050
## x.10 0.0043326 0.0084918 0.510 0.6222
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.02378 on 9 degrees of freedom
## Multiple R-squared: 0.6249, Adjusted R-squared: 0.1663
## F-statistic: 1.363 on 11 and 9 DF, p-value: 0.3263
##
## AIC and BIC values for the model:
## AIC BIC
## 1 -89.23348 -75.65468
vif(hum_dlm$model)
## x.t x.1 x.2 x.3 x.4 x.5 x.6 x.7
## 1.925010 2.083771 1.978215 2.200686 1.986807 1.787024 1.553067 1.526096
## x.8 x.9 x.10
## 1.503189 1.573971 1.745843
From the temperature series, we obtained Adjusted R-squared: 0.1663 ,p-value: 0.3263 >0.05 and AIC:-89.23348
It is observed that the values of information criteria as well as MASE decrease as the lag q increases, so we will fit a finite DLM with a number of lags = 10.
According to the significance tests of model coefficients obtained from the summary, nearly all lag weights of a predictor series are not statistically significant at 5% level. The adjusted R2 for finite_dlm is 0.296, which means that the model explains only 29.6% of the variability in radiation. F-test of the overall significance of the model reports the model is statistically significant at 5% level (p-value < 0.05). However, we conclude that the model is not a good fit to the data due to insignificant terms and low explainability.
There is no issue with multicollinearity in the model, VIF values are reported < 10.
residualcheck <- function(x){
checkresiduals(x)
# bgtest(x)
shapiro.test(x$residuals)
}
Polynomial modelling on univariate
1)Temperature
Temp_polyd3 <- polyDlm(x=as.vector(RBOdata$Temperature), y=as.vector(RBOdata$RBO), q=10,k=2)
## Estimates and t-tests for beta coefficients:
## Estimate Std. Error t value P(>|t|)
## beta.0 0.009240 0.00932 0.991 0.343
## beta.1 0.008290 0.00704 1.180 0.264
## beta.2 0.007160 0.00608 1.180 0.264
## beta.3 0.005860 0.00605 0.969 0.353
## beta.4 0.004380 0.00626 0.699 0.499
## beta.5 0.002720 0.00631 0.431 0.675
## beta.6 0.000882 0.00611 0.144 0.888
## beta.7 -0.001130 0.00588 -0.192 0.851
## beta.8 -0.003320 0.00618 -0.537 0.602
## beta.9 -0.005690 0.00768 -0.741 0.474
## beta.10 -0.008230 0.01050 -0.780 0.452
summary(Temp_polyd3, diagnostics=T)
##
## Call:
## "Y ~ (Intercept) + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.048188 -0.013552 0.000488 0.011598 0.039393
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.246e-01 5.467e-01 0.960 0.351
## z.t0 9.240e-03 9.322e-03 0.991 0.335
## z.t1 -8.622e-04 4.008e-03 -0.215 0.832
## z.t2 -8.849e-05 4.042e-04 -0.219 0.829
##
## Residual standard error: 0.02613 on 17 degrees of freedom
## Multiple R-squared: 0.1445, Adjusted R-squared: -0.00647
## F-statistic: 0.9571 on 3 and 17 DF, p-value: 0.4354
residualcheck(Temp_polyd3$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.96513, p-value = 0.6248
checkresiduals(Temp_polyd3$model)
##
## Breusch-Godfrey test for serial correlation of order up to 7
##
## data: Residuals
## LM test = 11.321, df = 7, p-value = 0.1252
From the temperature series, we obtained Adjusted R-squared: -0.00647 ,p-value: 0.4354 >0.05
2)Rain
#better than others
Rain_polyd3 <- polyDlm(x=as.vector(RBOdata$Rainfall), y=as.vector(RBOdata$RBO), q=10,k=2)
## Estimates and t-tests for beta coefficients:
## Estimate Std. Error t value P(>|t|)
## beta.0 0.015300 0.01110 1.380 0.196
## beta.1 0.009620 0.00737 1.310 0.218
## beta.2 0.004830 0.00576 0.839 0.419
## beta.3 0.000907 0.00580 0.156 0.879
## beta.4 -0.002160 0.00618 -0.349 0.734
## beta.5 -0.004360 0.00611 -0.712 0.491
## beta.6 -0.005690 0.00546 -1.040 0.320
## beta.7 -0.006170 0.00474 -1.300 0.220
## beta.8 -0.005780 0.00549 -1.050 0.315
## beta.9 -0.004520 0.00872 -0.519 0.614
## beta.10 -0.002410 0.01380 -0.174 0.865
summary(Rain_polyd3)
##
## Call:
## "Y ~ (Intercept) + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.056127 -0.014819 0.002352 0.012277 0.038053
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.7166742 0.1029308 6.963 2.29e-06 ***
## z.t0 0.0152725 0.0110859 1.378 0.186
## z.t1 -0.0060829 0.0055819 -1.090 0.291
## z.t2 0.0004315 0.0005823 0.741 0.469
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.02592 on 17 degrees of freedom
## Multiple R-squared: 0.1584, Adjusted R-squared: 0.009865
## F-statistic: 1.066 on 3 and 17 DF, p-value: 0.3894
residualcheck(Rain_polyd3$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.97008, p-value = 0.7348
checkresiduals(Rain_polyd3$model)
##
## Breusch-Godfrey test for serial correlation of order up to 7
##
## data: Residuals
## LM test = 9.9973, df = 7, p-value = 0.1887
From the temperature series, we obtained Adjusted R-squared: 0.009865,p-value: 0.3894 >0.05
3)Radiation
Rad_polyd3 <- polyDlm(x=as.vector(RBOdata$Radiation), y=as.vector(RBOdata$RBO), q=10,k=2)
## Estimates and t-tests for beta coefficients:
## Estimate Std. Error t value P(>|t|)
## beta.0 -0.005740 0.00831 -0.691 0.504
## beta.1 -0.002240 0.00532 -0.422 0.681
## beta.2 0.000648 0.00369 0.176 0.864
## beta.3 0.002930 0.00346 0.847 0.415
## beta.4 0.004600 0.00377 1.220 0.248
## beta.5 0.005650 0.00389 1.450 0.174
## beta.6 0.006100 0.00367 1.660 0.125
## beta.7 0.005940 0.00345 1.720 0.113
## beta.8 0.005160 0.00413 1.250 0.237
## beta.9 0.003780 0.00628 0.601 0.560
## beta.10 0.001780 0.00968 0.184 0.857
summary(Rad_polyd3)
##
## Call:
## "Y ~ (Intercept) + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.048403 -0.012315 0.001545 0.021358 0.033699
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.2975083 0.4879633 0.610 0.550
## z.t0 -0.0057426 0.0083076 -0.691 0.499
## z.t1 0.0038059 0.0039999 0.952 0.355
## z.t2 -0.0003054 0.0004050 -0.754 0.461
##
## Residual standard error: 0.02599 on 17 degrees of freedom
## Multiple R-squared: 0.1538, Adjusted R-squared: 0.004456
## F-statistic: 1.03 on 3 and 17 DF, p-value: 0.4043
residualcheck(Rad_polyd3$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.951, p-value = 0.3557
checkresiduals(Rad_polyd3$model)
##
## Breusch-Godfrey test for serial correlation of order up to 7
##
## data: Residuals
## LM test = 7.121, df = 7, p-value = 0.4164
From the temperature series, we obtained Adjusted R-squared: 0.004456,p-value: 0.4043 >0.05
4)Humidity
Humi_polyd3 <- polyDlm(x=as.vector(RBOdata$RelHumidity), y=as.vector(RBOdata$RBO), q=10,k=2)
## Estimates and t-tests for beta coefficients:
## Estimate Std. Error t value P(>|t|)
## beta.0 0.002210 0.00585 0.3770 0.713
## beta.1 0.002620 0.00421 0.6210 0.547
## beta.2 0.002890 0.00345 0.8390 0.419
## beta.3 0.003030 0.00339 0.8960 0.390
## beta.4 0.003040 0.00358 0.8490 0.414
## beta.5 0.002910 0.00370 0.7870 0.448
## beta.6 0.002640 0.00366 0.7220 0.486
## beta.7 0.002240 0.00358 0.6260 0.544
## beta.8 0.001710 0.00378 0.4520 0.660
## beta.9 0.001040 0.00464 0.2230 0.828
## beta.10 0.000229 0.00633 0.0362 0.972
summary(Humi_polyd3)
##
## Call:
## "Y ~ (Intercept) + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.053351 -0.013376 -0.000361 0.013300 0.044624
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.6065641 3.1981382 -0.502 0.622
## z.t0 0.0022052 0.0058503 0.377 0.711
## z.t1 0.0004784 0.0025645 0.187 0.854
## z.t2 -0.0000676 0.0002504 -0.270 0.790
##
## Residual standard error: 0.0276 on 17 degrees of freedom
## Multiple R-squared: 0.04517, Adjusted R-squared: -0.1233
## F-statistic: 0.2681 on 3 and 17 DF, p-value: 0.8475
residualcheck(Humi_polyd3$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.96765, p-value = 0.6806
checkresiduals(Humi_polyd3$model)
##
## Breusch-Godfrey test for serial correlation of order up to 7
##
## data: Residuals
## LM test = 5.2541, df = 7, p-value = 0.629
From the temperature series, we obtained Adjusted R-squared: -0.1233,p-value: 0.8475 >0.05
The analysis of residuals from polynomial model in Figure 7 shows the following:
The errors are not randomly spread.
There are a lot of highly significant lags in the ACF plot as well as a wavy pattern at seasonal lags, so there is autocorrelation and seasonality still present in the residuals.
Beusch-Godfrey test reports a p-value < 0.05, therefore there is serial correlation in the residuals at 5% level of significance.
The normality of the residuals is also violated, as observed from the histogram and Shapiro-Wilk normality test report (p-value < 0.05).
Overall, we can conclude that the second order polynomial of lag 10 is not successful at capturing the autocorrelation and seasonality in the series and has low explainability.
We will implement Koyck transformation model with precipitation predictor series as follows
K_total = koyckDlm(x=as.vector(RBOdata$Temperature)+as.vector(RBOdata$Rainfall)+as.vector(RBOdata$Radiation)+as.vector(RBOdata$RelHumidity), y=as.vector(RBOdata$RBO))
summary(K_total$model, diagnostics=T)
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.0877331 -0.0443882 0.0004844 0.0327202 0.1376518
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.37256 7.12386 -0.754 0.4573
## Y.1 0.63395 0.27436 2.311 0.0287 *
## X.t 0.04661 0.05834 0.799 0.4313
##
## Diagnostic tests:
## df1 df2 statistic p-value
## Weak instruments 1 27 1.017 0.322
## Wu-Hausman 1 26 1.787 0.193
## Sargan 0 NA NA NA
##
## Residual standard error: 0.06357 on 27 degrees of freedom
## Multiple R-Squared: -0.8376, Adjusted R-squared: -0.9738
## Wald test: 2.677 on 2 and 27 DF, p-value: 0.08698
vif(K_total$model)
## Y.1 X.t
## 1.095691 1.095691
1)Temperature
Temp_Koyck3 <- koyckDlm(x=as.vector(RBOdata$Temperature), y=as.vector(RBOdata$RBO))
summary(Temp_Koyck3, diagnostics=T)
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.15981 -0.04678 -0.01440 0.04750 0.14952
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.2032 1.6184 -0.743 0.464
## Y.1 0.2469 0.4609 0.536 0.597
## X.t 0.1847 0.1947 0.949 0.351
##
## Residual standard error: 0.07557 on 27 degrees of freedom
## Multiple R-Squared: -1.597, Adjusted R-squared: -1.789
## Wald test: 2.119 on 2 and 27 DF, p-value: 0.1397
##
## Diagnostic tests:
## df1 df2 statistic p-value
## Weak instruments 1 27 1.010792 0.3236389
## Wu-Hausman 1 26 3.246834 0.0831690
##
## alpha beta phi
## Geometric coefficients: -1.597614 0.1847273 0.246894
vif(Temp_Koyck3$model, diagnostics =T)
## Y.1 X.t
## 2.188316 2.188316
From the temperature series, we obtained Adjusted R-squared:-1.789,p-value: 0.1397 >0.05
2)Rainfall
Rain_Koyck3 <- koyckDlm(x=as.vector(RBOdata$Rainfall), y=as.vector(RBOdata$RBO))
summary(Rain_Koyck3,diagnostics=T)
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.3665 -0.4155 -0.1142 0.3241 1.6012
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.3207 2.4302 0.132 0.896
## Y.1 -6.5147 243.8216 -0.027 0.979
## X.t 2.2101 76.0635 0.029 0.977
##
## Residual standard error: 0.7951 on 27 degrees of freedom
## Multiple R-Squared: -286.5, Adjusted R-squared: -307.8
## Wald test: 0.01549 on 2 and 27 DF, p-value: 0.9846
##
## Diagnostic tests:
## df1 df2 statistic p-value
## Weak instruments 1 27 0.0008275768 0.9772615
## Wu-Hausman 1 26 0.3602689549 0.5535531
##
## alpha beta phi
## Geometric coefficients: 0.04267914 2.21011 -6.514689
vif(Rain_Koyck3$model,diagnostics =T)
## Y.1 X.t
## 5531.807 5531.807
From the temperature series, we obtained Adjusted R-squared:-307.8,p-value: 0.9846 >0.05 3)Radiation
Rad_Koyck3 <- koyckDlm(x=as.vector(RBOdata$Radiation), y=as.vector(RBOdata$RBO))
summary(Rad_Koyck3, diagnostics=T)
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.082255 -0.017008 -0.001036 0.021424 0.106984
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.48011 0.94819 -0.506 0.6167
## Y.1 0.69801 0.24502 2.849 0.0083 **
## X.t 0.04812 0.05661 0.850 0.4028
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.0467 on 27 degrees of freedom
## Multiple R-Squared: 0.008467, Adjusted R-squared: -0.06498
## Wald test: 4.731 on 2 and 27 DF, p-value: 0.01732
##
## Diagnostic tests:
## df1 df2 statistic p-value
## Weak instruments 1 27 4.941539 0.03478221
## Wu-Hausman 1 26 2.764873 0.10836470
##
## alpha beta phi
## Geometric coefficients: -1.589802 0.04811971 0.6980071
vif(Rad_Koyck3$model)
## Y.1 X.t
## 1.619594 1.619594
From the temperature series, we obtained Adjusted R-squared: -0.06498,p-value: 0.01732 <0.05 Results are much good.
4)humidity
Humi_Koyck3 <- koyckDlm(x=as.vector(RBOdata$RelHumidity), y=as.vector(RBOdata$RBO))
summary(Humi_Koyck3,diagnostics=T)
##
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.080897 -0.021103 -0.004676 0.022673 0.111041
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.16679 8.04941 -0.145 0.8858
## Y.1 0.62503 0.34753 1.798 0.0833 .
## X.t 0.01525 0.08274 0.184 0.8551
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.04127 on 27 degrees of freedom
## Multiple R-Squared: 0.2256, Adjusted R-squared: 0.1682
## Wald test: 5.612 on 2 and 27 DF, p-value: 0.009161
##
## Diagnostic tests:
## df1 df2 statistic p-value
## Weak instruments 1 27 0.39261559 0.5361901
## Wu-Hausman 1 26 0.05497691 0.8164556
##
## alpha beta phi
## Geometric coefficients: -3.111733 0.01525207 0.6250339
vif(Humi_Koyck3$model)
## Y.1 X.t
## 4.171591 4.171591
From the temperature series, we obtained Adjusted R-squared: 0.1682,p-value: 0.009161 <0.05.Results obtained are better as compared to other koyck models.
From the model summary, we can conclude that all terms of Koyck model are not significant at 5% level except rainfall and humidity. The model is reported to be overall statistically significant at 5% level (p-value < 0.05) and its adjusted R2 is negative which means the model explains about negative variability in rbo
According to the Weak instruments test (p-value > 0.05), the model at the first stage of least-squares estimation is notsignificant at 5% level.
From the Wu-Hausman test (p-value > 0.05), we can conclude that there is no significant correlation between the explanatory variable and the error term at 5% level.
There is no effect of multicollinearity as all VIFs are less than 10.
#Residual analysis univariately
residualcheck(Temp_Koyck3$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.983, p-value = 0.8984
checkresiduals(Temp_Koyck3$model)
residualcheck(Rain_Koyck3$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.96568, p-value = 0.4287
checkresiduals(Rain_Koyck3$model)
residualcheck(Rad_Koyck3$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.96083, p-value = 0.3253
checkresiduals(Rad_Koyck3$model)
residualcheck(Humi_Koyck3$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.97232, p-value = 0.6044
checkresiduals(Humi_Koyck3$model)
From the residual analysis in Figure 8, we can conclude the following:
The errors are not spread randomly.
All the lags in ACF plot are significant and have a wave-like pattern, which suggests serial correlation and seasonality remaining in the residuals.
The errors are not normal. The histogram and the Shapiro-Wilk normality test with p-value < 0.05 suggest not normal residuals.
Overall, we can conclude that the Koyck model is also not successful at capturing the autocorrelation and seasonality in the series.
The final model type from time series regression method is Autoregressive distributed lag models. For specifying the parameters of ARDL(p,q), we create a loop that fits autoregressive DLMs for a range of lag lengths and orders of the AR process and obtains their accuracy measures, like AIC/BIC and MASE.
Three models with lowest values of MASE were chosen for fitting and analysis. The models were:
ARDL(3,5)
ARDL(4,5)
ARDL(5,5)
We create a loop to fit these candidate models and do residual analysis in a dynamical way.
1)Temperature
for (i in 1:5){
for(j in 1:5){
modtemp = ardlDlm(x=as.vector(RBOdata$Temperature), y=as.vector(RBOdata$RBO))
cat("p =", i, "q =", j, "AIC =", AIC(modtemp$model), "BIC =", BIC(modtemp$model), "MASE =", MASE(modtemp)$MASE, "\n")
}
}
## p = 1 q = 1 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 1 q = 2 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 1 q = 3 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 1 q = 4 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 1 q = 5 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 2 q = 1 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 2 q = 2 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 2 q = 3 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 2 q = 4 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 2 q = 5 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 3 q = 1 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 3 q = 2 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 3 q = 3 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 3 q = 4 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 3 q = 5 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 4 q = 1 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 4 q = 2 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 4 q = 3 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 4 q = 4 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 4 q = 5 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 5 q = 1 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 5 q = 2 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 5 q = 3 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 5 q = 4 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
## p = 5 q = 5 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316
2)Rainfall
for (i in 1:5){
for(j in 1:5){
modrain = ardlDlm(x=as.vector(RBOdata$Rainfall), y=as.vector(RBOdata$RBO))
cat("p =", i, "q =", j, "AIC =", AIC(modrain$model), "BIC =", BIC(modrain$model), "MASE =", MASE(modrain)$MASE, "\n")
}
}
## p = 1 q = 1 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 1 q = 2 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 1 q = 3 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 1 q = 4 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 1 q = 5 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 2 q = 1 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 2 q = 2 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 2 q = 3 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 2 q = 4 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 2 q = 5 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 3 q = 1 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 3 q = 2 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 3 q = 3 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 3 q = 4 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 3 q = 5 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 4 q = 1 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 4 q = 2 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 4 q = 3 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 4 q = 4 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 4 q = 5 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 5 q = 1 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 5 q = 2 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 5 q = 3 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 5 q = 4 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
## p = 5 q = 5 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275
3)Radiation
#
for (i in 1:5){
for(j in 1:5){
modrad = ardlDlm(x=as.vector(RBOdata$Radiation), y=as.vector(RBOdata$RBO))
cat("p =", i, "q =", j, "AIC =", AIC(modrad$model), "BIC =", BIC(modrad$model), "MASE =", MASE(modrad)$MASE, "\n")
}
}
## p = 1 q = 1 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 1 q = 2 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 1 q = 3 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 1 q = 4 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 1 q = 5 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 2 q = 1 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 2 q = 2 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 2 q = 3 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 2 q = 4 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 2 q = 5 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 3 q = 1 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 3 q = 2 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 3 q = 3 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 3 q = 4 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 3 q = 5 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 4 q = 1 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 4 q = 2 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 4 q = 3 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 4 q = 4 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 4 q = 5 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 5 q = 1 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 5 q = 2 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 5 q = 3 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 5 q = 4 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
## p = 5 q = 5 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648
4)Humidity
for (i in 1:5){
for(j in 1:5){
modhum = ardlDlm(x=as.vector(RBOdata$RelHumidity), y=as.vector(RBOdata$RBO))
cat("p =", i, "q =", j, "AIC =", AIC(modhum$model), "BIC =", BIC(modhum$model), "MASE =", MASE(modhum)$MASE, "\n")
}
}
## p = 1 q = 1 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 1 q = 2 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 1 q = 3 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 1 q = 4 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 1 q = 5 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 2 q = 1 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 2 q = 2 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 2 q = 3 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 2 q = 4 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 2 q = 5 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 3 q = 1 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 3 q = 2 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 3 q = 3 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 3 q = 4 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 3 q = 5 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 4 q = 1 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 4 q = 2 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 4 q = 3 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 4 q = 4 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 4 q = 5 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 5 q = 1 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 5 q = 2 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 5 q = 3 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 5 q = 4 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
## p = 5 q = 5 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564
Lowest Mase with lowest aic in temperature so taking up temp data further
for (i in c(3,4,5)){
ardl3_temp <- ardlDlm(x=as.vector(RBOdata$Temperature), y=as.vector(RBOdata$RBO), p = i, q = 5)
summary(ardl3_temp)
#bgtest(ardl3_temp$model)
#residualcheck(ardl3_temp$model)
}
##
## Time series regression with "ts" data:
## Start = 6, End = 31
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.051613 -0.010047 0.000777 0.020277 0.040746
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.09995 0.32029 0.312 0.759
## X.t 0.01654 0.02173 0.761 0.458
## X.1 0.03858 0.02603 1.482 0.158
## X.2 -0.01886 0.02632 -0.717 0.484
## X.3 -0.03100 0.02251 -1.377 0.187
## Y.1 0.30160 0.25501 1.183 0.254
## Y.2 0.26910 0.28224 0.953 0.355
## Y.3 0.11888 0.23217 0.512 0.616
## Y.4 0.05585 0.24529 0.228 0.823
## Y.5 0.04069 0.22768 0.179 0.860
##
## Residual standard error: 0.03121 on 16 degrees of freedom
## Multiple R-squared: 0.6393, Adjusted R-squared: 0.4364
## F-statistic: 3.151 on 9 and 16 DF, p-value: 0.02187
##
##
## Time series regression with "ts" data:
## Start = 6, End = 31
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.052918 -0.009863 0.003109 0.020643 0.043277
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.039481 0.360467 0.110 0.914
## X.t 0.017090 0.022358 0.764 0.457
## X.1 0.038753 0.026735 1.450 0.168
## X.2 -0.022368 0.028347 -0.789 0.442
## X.3 -0.029645 0.023352 -1.269 0.224
## X.4 0.009898 0.024127 0.410 0.687
## Y.1 0.332351 0.272427 1.220 0.241
## Y.2 0.273800 0.290099 0.944 0.360
## Y.3 0.064754 0.272518 0.238 0.815
## Y.4 0.052400 0.252066 0.208 0.838
## Y.5 0.036593 0.234047 0.156 0.878
##
## Residual standard error: 0.03206 on 15 degrees of freedom
## Multiple R-squared: 0.6433, Adjusted R-squared: 0.4055
## F-statistic: 2.705 on 10 and 15 DF, p-value: 0.04003
##
##
## Time series regression with "ts" data:
## Start = 6, End = 31
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.054872 -0.009819 0.003510 0.019692 0.041115
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.088818 0.433502 0.205 0.841
## X.t 0.016838 0.023130 0.728 0.479
## X.1 0.037499 0.028194 1.330 0.205
## X.2 -0.023297 0.029586 -0.787 0.444
## X.3 -0.027820 0.025485 -1.092 0.293
## X.4 0.009152 0.025155 0.364 0.721
## X.5 -0.005800 0.026075 -0.222 0.827
## Y.1 0.346273 0.288366 1.201 0.250
## Y.2 0.259300 0.306758 0.845 0.412
## Y.3 0.063980 0.281607 0.227 0.824
## Y.4 0.077358 0.283595 0.273 0.789
## Y.5 0.037752 0.241891 0.156 0.878
##
## Residual standard error: 0.03312 on 14 degrees of freedom
## Multiple R-squared: 0.6446, Adjusted R-squared: 0.3653
## F-statistic: 2.308 on 11 and 14 DF, p-value: 0.07142
#checkresiduals(ardl3_temp$model)
From model summaries, we can conclude that all the fitted ARDL models were reported to be statistically significant at 5% level with p-value < 0.05. All models have an adjusted R2=0.933, which means they explain about 93.3% of the variability in radiation.
Regarding model coefficient estimates, we can observe for ARDL(3,5) only X.2 lag of predictor series is significant at 5% level (p-value = 0.02187 < 0.05), for ARDL(4,5) only X.4 lag of predictor series is significant at 5% level (p-value = 0.04003 < 0.05), and all lags of predictor series are not statistically significant at 5% level for ARDL(5,5). All lags of independent series are statistically significant in all models except Y.2 (p-value = 0.7829 > 0.05).
The plots from diagnostic checking in Figure 9 show that there is a very similar overall picture in residuals from all three fitted models:
The residuals are not as randomly spread as desired, they show evidence of changing variance.
There are a some highly significant lags in the ACF plot. The seasonal lags are also highly significant. Therefore, there is autocorrelation and seasonality still present in the residuals.
Beusch-Godfrey test reports a p-value < 0.05, therefore there is serial correlation in the residuals at 5% level of significance.
Long tails on the histogram of residuals suggest the normality of the residuals is violated.
Based on the observation about model estimates made earlier, we can try to decrease the number of lags for predictor series. We will fit ARDL(1,5) and perform diagnostic checking.
1)Temperature
ardl3_Temp15 <- ardlDlm(x = as.vector(RBOdata$Temperature), y = as.vector(RBOdata$RBO), p=1, q=5)
summary(ardl3_Temp15)
##
## Time series regression with "ts" data:
## Start = 6, End = 31
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.065387 -0.009906 0.006212 0.016715 0.038292
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.14864 0.26458 -0.562 0.581
## X.t 0.01261 0.02137 0.590 0.563
## X.1 0.03374 0.02549 1.323 0.202
## Y.1 0.32065 0.24308 1.319 0.204
## Y.2 0.07062 0.25490 0.277 0.785
## Y.3 0.10359 0.23352 0.444 0.663
## Y.4 0.03316 0.24215 0.137 0.893
## Y.5 0.06797 0.19677 0.345 0.734
##
## Residual standard error: 0.03159 on 18 degrees of freedom
## Multiple R-squared: 0.5843, Adjusted R-squared: 0.4226
## F-statistic: 3.614 on 7 and 18 DF, p-value: 0.01312
residualcheck(ardl3_Temp15$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.92583, p-value = 0.06171
Based on the observation about model ,p-value: 0.01312 <0.05, Residual standard error: 0.03159 and Adjusted R-squared: 0.4226
2)Rainfall
ardl3_Rain15 <- ardlDlm(x = as.vector(RBOdata$Rainfall), y = as.vector(RBOdata$RBO), p=1, q=5)
summary(ardl3_Rain15)
##
## Time series regression with "ts" data:
## Start = 6, End = 31
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.078127 -0.017340 0.005343 0.014866 0.039246
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.228993 0.134816 1.699 0.107
## X.t 0.019647 0.018162 1.082 0.294
## X.1 0.007268 0.018640 0.390 0.701
## Y.1 0.372964 0.248591 1.500 0.151
## Y.2 0.260567 0.244988 1.064 0.302
## Y.3 0.174384 0.214252 0.814 0.426
## Y.4 -0.203452 0.196005 -1.038 0.313
## Y.5 -0.008021 0.196922 -0.041 0.968
##
## Residual standard error: 0.0326 on 18 degrees of freedom
## Multiple R-squared: 0.5575, Adjusted R-squared: 0.3854
## F-statistic: 3.239 on 7 and 18 DF, p-value: 0.02091
residualcheck(ardl3_Rain15$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.94165, p-value = 0.147
Based on the observation about model ,p-value: 0.02091 <0.05, Residual standard error: 0.0326 and Adjusted R-squared: 0.3854
3)Radiation
#best results
ardl3_Rad15 <- ardlDlm(x = as.vector(RBOdata$Radiation), y = as.vector(RBOdata$RBO), p=1, q=5)
summary(ardl3_Rad15)
##
## Time series regression with "ts" data:
## Start = 6, End = 31
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.058184 -0.015578 0.002216 0.017187 0.043461
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.12704 0.37571 0.338 0.7392
## X.t -0.02852 0.01654 -1.724 0.1017
## X.1 0.03195 0.01691 1.889 0.0751 .
## Y.1 0.54526 0.21425 2.545 0.0203 *
## Y.2 0.24093 0.21721 1.109 0.2819
## Y.3 0.06064 0.19795 0.306 0.7629
## Y.4 -0.21453 0.18353 -1.169 0.2577
## Y.5 0.12092 0.18766 0.644 0.5275
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.02995 on 18 degrees of freedom
## Multiple R-squared: 0.6264, Adjusted R-squared: 0.4811
## F-statistic: 4.311 on 7 and 18 DF, p-value: 0.005806
residualcheck(ardl3_Rad15$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.98447, p-value = 0.9519
Based on the observation about model ,p-value: 0.005806 <0.05, Residual standard error: 0.02995 and Adjusted R-squared: 0.4811
4)Humidity
ardl3_Hum15 <- ardlDlm(x = as.vector(RBOdata$RelHumidity), y = as.vector(RBOdata$RBO), p=1, q=5)
summary(ardl3_Hum15)
##
## Time series regression with "ts" data:
## Start = 6, End = 31
##
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.07050 -0.01625 0.00118 0.01782 0.04034
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.831753 1.210326 -0.687 0.5007
## X.t 0.006641 0.009130 0.727 0.4764
## X.1 0.003776 0.009215 0.410 0.6868
## Y.1 0.446855 0.235475 1.898 0.0739 .
## Y.2 0.312764 0.264204 1.184 0.2519
## Y.3 0.211856 0.234960 0.902 0.3791
## Y.4 -0.188857 0.198286 -0.952 0.3535
## Y.5 0.002712 0.198356 0.014 0.9892
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.03306 on 18 degrees of freedom
## Multiple R-squared: 0.5449, Adjusted R-squared: 0.3679
## F-statistic: 3.079 on 7 and 18 DF, p-value: 0.02568
residualcheck(ardl3_Hum15$model)
##
## Shapiro-Wilk normality test
##
## data: x$residuals
## W = 0.95958, p-value = 0.3833
Based on the observation about model ,p-value: 0.02568 >0.05, Residual standard error: 0.03306 and Adjusted R-squared: 0.3679
The p-value of the overall significance test is > 0.05, therefore ARDL(1,5) model is statistically not significant at 5% level. All the estimated terms are significant at 5% level except Y.2 - second lag of independent series (p-value = 0.9724 > 0.05). Adjusted R2=0.932, which means they explain 93.2% of the variability in radiation.
The plots from diagnostic checking in Figure 10 show the same picture as the diagnostic checkings in Figure 9, so the comments are the same as for previously fitted models.
Overall, none of the models from time series regression method were successful at capturing the autocorrelation and seasonal pattern in radiation series.
We create a data frame accuracy to store the accuracy measures, like AIC/BIC and MASE from the models fitted so far. The accuracy measures for further models will be appended to this data frame.
attr(K_total$model,"class") = "lm"
Univariate Ardl modelling 1) Temperature
ardl3_Temp35 <- ardlDlm(x = as.vector(RBOdata$Temperature), y = as.vector(RBOdata$RBO), p=3, q=5)
ardl3_Temp45 <- ardlDlm(x = as.vector(RBOdata$Temperature), y = as.vector(RBOdata$RBO), p=4, q=5)
ardl3_Temp55 <- ardlDlm(x = as.vector(RBOdata$Temperature), y = as.vector(RBOdata$RBO), p=5, q=5)
models <- c("Temp_DLM3", "Temp_PolyD3", "Temp_Koyck3", "ARDL3_temp15", "ARDL3_temp35", "ARDL3_temp45", "ARDL3_temp55")
aic_a <- AIC(temp_dlm, Temp_polyd3, Temp_Koyck3, ardl3_Temp15, ardl3_Temp35, ardl3_Temp45, ardl3_Temp55)
## [1] -87.92965
bic_a <- BIC(temp_dlm, Temp_polyd3, Temp_Koyck3, ardl3_Temp15, ardl3_Temp35, ardl3_Temp45, ardl3_Temp55)
## [1] -74.35085
MASE_a <- MASE(temp_dlm, Temp_polyd3, Temp_Koyck3, ardl3_Temp15, ardl3_Temp35, ardl3_Temp45, ardl3_Temp55)
accuracy_a <- data.frame(models, MASE_a, aic_a, bic_a )
colnames(accuracy_a) <- c("Model", "MASE", "AIC", "BIC")
head(accuracy_a)
## Model MASE AIC BIC NA
## temp_dlm Temp_DLM3 21 0.4585880 -87.92965 -74.35085
## Temp_polyd3 Temp_PolyD3 21 0.6742057 -87.92965 -74.35085
## Temp_Koyck3 Temp_Koyck3 30 1.9150155 -87.92965 -74.35085
## ardl3_Temp15 ARDL3_temp15 26 0.7735245 -87.92965 -74.35085
## ardl3_Temp35 ARDL3_temp35 26 0.7661286 -87.92965 -74.35085
## ardl3_Temp45 ARDL3_temp45 26 0.7643666 -87.92965 -74.35085
2)Rainfall
ardl3_Rain35 <- ardlDlm(x = as.vector(RBOdata$Rainfall), y = as.vector(RBOdata$RBO), p=3, q=5)
ardl3_Rain45 <- ardlDlm(x = as.vector(RBOdata$Rainfall), y = as.vector(RBOdata$RBO), p=4, q=5)
ardl3_Rain55 <- ardlDlm(x = as.vector(RBOdata$Rainfall), y = as.vector(RBOdata$RBO), p=5, q=5)
models <- c("Rain_DLM", "Rain_PolyD3", "Rain_Koyck3", "ARDL3_Rain15", "ARDL3_Rain35", "ARDL3_Rain45", "ARDL3_Rain55")
aic_b <- AIC(rain_dlm, Rain_polyd3, Rain_Koyck3, ardl3_Rain15, ardl3_Rain35, ardl3_Rain45, ardl3_Rain55)
## [1] -76.93255
bic_b <- BIC(rain_dlm, Rain_polyd3, Rain_Koyck3, ardl3_Rain15, ardl3_Rain35, ardl3_Rain45, ardl3_Rain55)
## [1] -63.35376
MASE_b <- MASE(rain_dlm, Rain_polyd3, Rain_Koyck3, ardl3_Rain15, ardl3_Rain35, ardl3_Rain45, ardl3_Rain55)
accuracy_b <- data.frame(models, MASE_b, aic_b, bic_b )
colnames(accuracy_b) <- c("Model", "MASE", "AIC", "BIC")
head(accuracy_b)
## Model MASE AIC BIC NA
## rain_dlm Rain_DLM 21 0.5856170 -76.93255 -63.35376
## Rain_polyd3 Rain_PolyD3 21 0.6451804 -76.93255 -63.35376
## Rain_Koyck3 Rain_Koyck3 30 19.1057647 -76.93255 -63.35376
## ardl3_Rain15 ARDL3_Rain15 26 0.8152025 -76.93255 -63.35376
## ardl3_Rain35 ARDL3_Rain35 26 0.8095313 -76.93255 -63.35376
## ardl3_Rain45 ARDL3_Rain45 26 0.7390848 -76.93255 -63.35376
Rainfall model results better compared to others.
3)Radiation
ardl3_Rad35 <- ardlDlm(x = as.vector(RBOdata$Radiation), y = as.vector(RBOdata$RBO), p=3, q=5)
ardl3_Rad45 <- ardlDlm(x = as.vector(RBOdata$Radiation), y = as.vector(RBOdata$RBO), p=4, q=5)
ardl3_Rad55 <- ardlDlm(x = as.vector(RBOdata$Radiation), y = as.vector(RBOdata$RBO), p=5, q=5)
models <- c("Rad_DLM", "Rain_PolyD3", "Rain_Koyck3", "ARDL3_Rain15", "ARDL3_Rain35", "ARDL3_Rain45", "ARDL3_Rain55")
aic_c <- AIC(rad_dlm, Rad_polyd3, Rad_Koyck3, ardl3_Rad15, ardl3_Rad35, ardl3_Rad45, ardl3_Rad55)
## [1] -78.06879
bic_c <- BIC(rad_dlm, Rad_polyd3, Rad_Koyck3, ardl3_Rad15, ardl3_Rad35, ardl3_Rad45, ardl3_Rad55)
## [1] -64.49
MASE_c <- MASE(rad_dlm, Rad_polyd3, Rad_Koyck3, ardl3_Rad15, ardl3_Rad35, ardl3_Rad45, ardl3_Rad55)
accuracy_c <- data.frame(models, MASE_c, aic_c, bic_c )
colnames(accuracy_c) <- c("Model", "MASE", "AIC", "BIC")
head(accuracy_c)
## Model MASE AIC BIC NA
## rad_dlm Rad_DLM 21 0.6037290 -78.06879 -64.49
## Rad_polyd3 Rain_PolyD3 21 0.6711964 -78.06879 -64.49
## Rad_Koyck3 Rain_Koyck3 30 1.0314227 -78.06879 -64.49
## ardl3_Rad15 ARDL3_Rain15 26 0.7627672 -78.06879 -64.49
## ardl3_Rad35 ARDL3_Rain35 26 0.7009101 -78.06879 -64.49
## ardl3_Rad45 ARDL3_Rain45 26 0.7052516 -78.06879 -64.49
4)Humidity
ardl3_Humi35 <- ardlDlm(x = as.vector(RBOdata$Rainfall), y = as.vector(RBOdata$RBO), p=3, q=5)
ardl3_Humi45 <- ardlDlm(x = as.vector(RBOdata$Rainfall), y = as.vector(RBOdata$RBO), p=4, q=5)
ardl3_Humi55 <- ardlDlm(x = as.vector(RBOdata$Rainfall), y = as.vector(RBOdata$RBO), p=5, q=5)
models <- c("Hum_DLM", "Humi_PolyD3", "Humi_Koyck3", "ARDL3_Hum15", "ARDL3_Humi35", "ARDL3_Humi45", "ARDL3_Humi55")
aic_d <- AIC(hum_dlm, Humi_polyd3, Humi_Koyck3, ardl3_Hum15, ardl3_Humi35, ardl3_Humi45, ardl3_Humi55)
## [1] -89.23348
bic_d <- BIC(hum_dlm, Humi_polyd3, Humi_Koyck3, ardl3_Hum15, ardl3_Humi35, ardl3_Humi45, ardl3_Humi55)
## [1] -75.65468
MASE_d <- MASE(hum_dlm, Humi_polyd3, Humi_Koyck3, ardl3_Hum15, ardl3_Humi35, ardl3_Humi45, ardl3_Humi55)
accuracy_d <- data.frame(models, MASE_d, aic_d, bic_d )
colnames(accuracy_d) <- c("Model", "MASE", "AIC", "BIC")
head(accuracy_d)
## Model MASE AIC BIC NA
## hum_dlm Hum_DLM 21 0.4518490 -89.23348 -75.65468
## Humi_polyd3 Humi_PolyD3 21 0.6821919 -89.23348 -75.65468
## Humi_Koyck3 Humi_Koyck3 30 0.9559618 -89.23348 -75.65468
## ardl3_Hum15 ARDL3_Hum15 26 0.8156142 -89.23348 -75.65468
## ardl3_Humi35 ARDL3_Humi35 26 0.8095313 -89.23348 -75.65468
## ardl3_Humi45 ARDL3_Humi45 26 0.7390848 -89.23348 -75.65468
Humidity results have best accuracy
For deciding on the final model to give three years ahead forecasts of solar radiation, we compare forecasts from three models:
fit.auto =ets(RBO_ts,model="ZZZ",ic="bic")
fit.auto$method
## [1] "ETS(M,N,N)"
f1.etsM = ets(RBO_ts, model="MNN")
summary(f1.etsM)
## ETS(M,N,N)
##
## Call:
## ets(y = RBO_ts, model = "MNN")
##
## Smoothing parameters:
## alpha = 0.4421
##
## Initial states:
## l = 0.7685
##
## sigma: 0.0479
##
## AIC AICc BIC
## -96.69180 -95.80291 -92.38984
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.003529451 0.03466016 0.02607253 -0.6459794 3.549133 0.8460683
## ACF1
## Training set -0.0753464
checkresiduals(f1.etsM)
##
## Ljung-Box test
##
## data: Residuals from ETS(M,N,N)
## Q* = 2.2852, df = 4, p-value = 0.6835
##
## Model df: 2. Total lags used: 6
Holt-Winters’ multiplicative method which has the lowest MASE and is the most successful at capturing the autocorrelation and seasonality in the series
Holt-Winters’ multiplicative method with multiplicative trend which has the second lowest MASE and is also good at capturing the autocorrelation and seasonality in the series
ETS(M,N,N) model was suggested by an automatic algorithm and has the lowest MASE of all state-space models but does not capture autocorrelation in the series
1)simple exponential forecast
f1 <- ses(RBO_ts, alpha=0.1, initial="simple", h=3) # Set alpha to a small value
summary(f1)
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = RBO_ts, h = 3, initial = "simple", alpha = 0.1)
##
## Smoothing parameters:
## alpha = 0.1
##
## Initial states:
## l = 0.755
##
## sigma: 0.0406
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.009995034 0.0406462 0.03161539 -1.640631 4.337625 1.025937
## ACF1
## Training set 0.4122565
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2015 0.7240242 0.6719340 0.7761144 0.6443591 0.8036893
## 2016 0.7240242 0.6716742 0.7763742 0.6439618 0.8040866
## 2017 0.7240242 0.6714157 0.7766327 0.6435664 0.8044820
checkresiduals(f1)
##
## Ljung-Box test
##
## data: Residuals from Simple exponential smoothing
## Q* = 17.699, df = 4, p-value = 0.001413
##
## Model df: 2. Total lags used: 6
2)Holts simple forecast
f2 <- holt(RBO_ts,initial = "simple",h=3)
summary(f2)
##
## Forecast method: Holt's method
##
## Model Information:
## Holt's method
##
## Call:
## holt(y = RBO_ts, h = 3, initial = "simple")
##
## Smoothing parameters:
## alpha = 0.5678
## beta = 0.0888
##
## Initial states:
## l = 0.755
## b = -0.0143
##
## sigma: 0.0376
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.008262872 0.03758621 0.02896041 0.9541908 3.905009 0.9397818
## ACF1
## Training set -0.1449597
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2015 0.7162102 0.6680415 0.7643789 0.6425426 0.7898778
## 2016 0.7148677 0.6572445 0.7724909 0.6267407 0.8029948
## 2017 0.7135252 0.6456320 0.7814184 0.6096915 0.8173589
checkresiduals(f2)
##
## Ljung-Box test
##
## data: Residuals from Holt's method
## Q* = 2.8477, df = 3, p-value = 0.4157
##
## Model df: 4. Total lags used: 7
3)Holts with exponential trend
f3 <- holt(RBO_ts, initial="simple", exponential=TRUE, h=3)
# Fit with exponential trend
summary(f3)
##
## Forecast method: Holt's method with exponential trend
##
## Model Information:
## Holt's method with exponential trend
##
## Call:
## holt(y = RBO_ts, h = 3, initial = "simple", exponential = TRUE)
##
## Smoothing parameters:
## alpha = 0.5667
## beta = 0.0845
##
## Initial states:
## l = 0.755
## b = 0.9811
##
## sigma: 0.0514
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.008053192 0.03737566 0.02865753 0.9221333 3.861994 0.9299532
## ACF1
## Training set -0.1461854
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2015 0.7164020 0.6685812 0.7621690 0.6454951 0.7897139
## 2016 0.7151840 0.6604416 0.7717697 0.6324023 0.8036000
## 2017 0.7139681 0.6478010 0.7801527 0.6163925 0.8181239
checkresiduals(f3)
##
## Ljung-Box test
##
## data: Residuals from Holt's method with exponential trend
## Q* = 2.8343, df = 3, p-value = 0.4179
##
## Model df: 4. Total lags used: 7
4)Additive damped holts method
f4 <- holt(RBO_ts, damped=TRUE, initial="simple", h=3)
# Fit with additive damped trend
summary(f4)
##
## Forecast method: Damped Holt's method
##
## Model Information:
## Damped Holt's method
##
## Call:
## holt(y = RBO_ts, h = 3, damped = TRUE, initial = "simple")
##
## Smoothing parameters:
## alpha = 0.4773
## beta = 1e-04
## phi = 0.8
##
## Initial states:
## l = 0.7542
## b = 0.0082
##
## sigma: 0.0377
##
## AIC AICc BIC
## -90.15879 -86.65879 -81.55487
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.004553134 0.03457183 0.02527824 -0.7722916 3.450154 0.8202932
## ACF1
## Training set -0.1237046
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2015 0.7195752 0.6711967 0.7679537 0.6455866 0.7935638
## 2016 0.7195801 0.6659717 0.7731885 0.6375931 0.8015670
## 2017 0.7195840 0.6612112 0.7779567 0.6303106 0.8088574
checkresiduals(f4)
##
## Ljung-Box test
##
## data: Residuals from Damped Holt's method
## Q* = 3.3959, df = 3, p-value = 0.3345
##
## Model df: 5. Total lags used: 8
plot(f1, type="l", ylab="Similarity of RBO order wrt FFD", xlab="Year",main="Fig.13 forecasting of fitted models",
fcol="white", plot.conf=FALSE)
lines(fitted(f1), col="blue")
lines(fitted(f2), col="red")
lines(fitted(f3), col="green")
lines(fitted(f4), col="cyan")
lines(f1$mean, col="blue", type="l")
lines(f2$mean, col="red", type="l")
lines(f3$mean, col="green", type="l")
lines(f4$mean, col="brown", type="l")
legend("topright", lty=1, col=c("black","blue","red","green","cyan"),c("Data","SES", "Holt's linear trend", "Exponential trend","Additive damped trend"))
The fitted values and 3 year forecasts are displayed in Figure 13
However, we can observe that the 95% confidence intervals for the forecasts from selected approach are very precise and provide reliable forecasts.
knitr::opts_chunk$set(echo = TRUE)
library(TSA)
library(car)
library(dynlm)
library(Hmisc)
library(forecast)
library(stats)
Loading dataset RBO.csv for task 3(b)
#recalling RBO data
class(RBOdata)
## [1] "data.frame"
head(RBOdata)
## ï..Year RBO Temperature Rainfall Radiation RelHumidity
## 1 1984 0.7550088 9.371585 2.489344 14.87158 93.92650
## 2 1985 0.7407520 9.656164 2.475890 14.68493 94.93589
## 3 1986 0.8423860 9.273973 2.421370 14.51507 94.09507
## 4 1987 0.7484425 9.219178 2.319726 14.67397 94.49699
## 5 1988 0.7984084 10.202186 2.465301 14.74863 94.08142
## 6 1989 0.7938803 9.441096 2.735890 14.78356 96.08685
Convert data into a time series object
RBO.ts = matrix(RBOdata$RBO, nrow = 25, ncol = 12)
RBO.ts = as.vector(t(RBO.ts))
RBO.ts = ts(RBO.ts,start=c(1984,1), end=c(2014,1), frequency=2)
class(RBO.ts)
## [1] "ts"
plot(RBO.ts,ylab='RBO similarity of the order of FFD',xlab='Year',type='o',
main = "Time series plot of RBOs.")
acf(RBO.ts,max.lag = 48, main="Sample ACF for RBOs")
# Intervention results in an immediate and permanent shift in the mean function
RBO.tr = log(RBO.ts)
plot(RBO.tr,ylab='Log of landings in metric tons',xlab='Year',
main = "Fig.14 Time series plot of the logarithm of yearly
similarity of order of RBOs.")
points(y=RBO.tr,x=time(RBO.tr), pch=as.vector(season(RBO.tr)))
Observations made from Figure 14 plot, we can make the following comments on the characteristics of the series: • There is a possibility of a slight downward trend, especially in the beginning of the series. • Seasonality is present strongly, though the pattern changes overtime, we can say that lower values are observed in July and August and higher values in December-January. • Since the series is Seasonal,the changing variance and behaviour of the series is not apparent due to seasonality. • Intervention on points is absent.
Y.t = RBO.tr
T = 96
S.t = 1*(seq(Y.t) >= T)
S.t.1 = Lag(S.t,+1)
model31 = dynlm(Y.t ~ L(Y.t , k = 1 ) + S.t + trend(Y.t) + season(Y.t))
summary(model31)
##
## Time series regression with "ts" data:
## Start = 1984(2), End = 2014(1)
##
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 1) + S.t + trend(Y.t) + season(Y.t))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.122950 -0.043817 -0.009939 0.030096 0.119378
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.343e-01 4.350e-02 -5.387 1.47e-06 ***
## L(Y.t, k = 1) 1.730e-01 1.331e-01 1.300 0.199
## S.t NA NA NA NA
## trend(Y.t) -9.609e-05 8.560e-04 -0.112 0.911
## season(Y.t)2 -2.037e-02 1.494e-02 -1.363 0.178
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05729 on 56 degrees of freedom
## Multiple R-squared: 0.05307, Adjusted R-squared: 0.002347
## F-statistic: 1.046 on 3 and 56 DF, p-value: 0.3793
model31.2 = dynlm(Y.t ~ L(Y.t , k = 1 ) + S.t + season(Y.t))
summary(model31.2)
##
## Time series regression with "ts" data:
## Start = 1984(2), End = 2014(1)
##
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 1) + S.t + season(Y.t))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.124162 -0.044216 -0.009184 0.029622 0.119925
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.23559 0.04166 -5.655 5.24e-07 ***
## L(Y.t, k = 1) 0.17391 0.13166 1.321 0.192
## S.t NA NA NA NA
## season(Y.t)2 -0.02034 0.01481 -1.373 0.175
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05679 on 57 degrees of freedom
## Multiple R-squared: 0.05286, Adjusted R-squared: 0.01963
## F-statistic: 1.591 on 2 and 57 DF, p-value: 0.2127
model31.3 = dynlm(Y.t ~ L(Y.t , k = 1 ) + S.t + trend(Y.t) )
summary(model31.3)
##
## Time series regression with "ts" data:
## Start = 1984(2), End = 2014(1)
##
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 1) + S.t + trend(Y.t))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.11298 -0.04540 -0.01214 0.02424 0.12738
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.525e-01 4.173e-02 -6.050 1.19e-07 ***
## L(Y.t, k = 1) 1.476e-01 1.327e-01 1.112 0.271
## S.t NA NA NA NA
## trend(Y.t) -7.265e-05 8.622e-04 -0.084 0.933
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05772 on 57 degrees of freedom
## Multiple R-squared: 0.02165, Adjusted R-squared: -0.01267
## F-statistic: 0.6308 on 2 and 57 DF, p-value: 0.5358
aic = AIC(model31, model31.2, model31.3)
bic = BIC(model31, model31.2, model31.3)
aic
## df AIC
## model31 5 -167.0321
## model31.2 4 -169.0186
## model31.3 4 -167.0735
bic
## df BIC
## model31 5 -156.5604
## model31.2 4 -160.6413
## model31.3 4 -158.6961
model32 = dynlm(Y.t ~ L(Y.t , k = 2 ) + S.t + trend(Y.t) + season(Y.t))
summary(model32)
##
## Time series regression with "ts" data:
## Start = 1985(1), End = 2014(1)
##
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 2) + S.t + trend(Y.t) + season(Y.t))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.127912 -0.040066 -0.007146 0.034405 0.118657
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.3483867 0.0407243 -8.555 1.1e-11 ***
## L(Y.t, k = 2) -0.2344598 0.1307355 -1.793 0.0784 .
## S.t NA NA NA NA
## trend(Y.t) -0.0005389 0.0008606 -0.626 0.5338
## season(Y.t)2 -0.0190425 0.0147801 -1.288 0.2030
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05613 on 55 degrees of freedom
## Multiple R-squared: 0.07605, Adjusted R-squared: 0.02565
## F-statistic: 1.509 on 3 and 55 DF, p-value: 0.2224
model33 = dynlm(Y.t ~ L(Y.t , k = 1 ) + S.t + S.t.1 + trend(Y.t) + season(Y.t))
summary(model33)
##
## Time series regression with "ts" data:
## Start = 1984(2), End = 2014(1)
##
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 1) + S.t + S.t.1 + trend(Y.t) +
## season(Y.t))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.122950 -0.043817 -0.009939 0.030096 0.119378
##
## Coefficients: (2 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.343e-01 4.350e-02 -5.387 1.47e-06 ***
## L(Y.t, k = 1) 1.730e-01 1.331e-01 1.300 0.199
## S.t NA NA NA NA
## S.t.1 NA NA NA NA
## trend(Y.t) -9.609e-05 8.560e-04 -0.112 0.911
## season(Y.t)2 -2.037e-02 1.494e-02 -1.363 0.178
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05729 on 56 degrees of freedom
## Multiple R-squared: 0.05307, Adjusted R-squared: 0.002347
## F-statistic: 1.046 on 3 and 56 DF, p-value: 0.3793
. 1) Simple exponential forecasting
f31 <- ses(RBO.tr, alpha=0.1, initial="simple", h=3) # Set alpha to a small value
summary(f31)
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = RBO.tr, h = 3, initial = "simple", alpha = 0.1)
##
## Smoothing parameters:
## alpha = 0.1
##
## Initial states:
## l = -0.281
##
## sigma: 0.0591
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.003149581 0.05912707 0.04786464 -3.448812 17.99458 0.6576632
## ACF1
## Training set 0.135882
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2014.50 -0.3002383 -0.3760127 -0.224464 -0.4161253 -0.1843514
## 2015.00 -0.3002383 -0.3763907 -0.224086 -0.4167033 -0.1837734
## 2015.50 -0.3002383 -0.3767667 -0.223710 -0.4172784 -0.1831983
checkresiduals(f31)
##
## Ljung-Box test
##
## data: Residuals from Simple exponential smoothing
## Q* = 20.508, df = 3, p-value = 0.0001332
##
## Model df: 2. Total lags used: 5
f32 <- holt(RBO.tr,initial = "simple",h=3)
summary(f32)
##
## Forecast method: Holt's method
##
## Model Information:
## Holt's method
##
## Call:
## holt(y = RBO.tr, h = 3, initial = "simple")
##
## Smoothing parameters:
## alpha = 0.7249
## beta = 0.1837
##
## Initial states:
## l = -0.281
## b = -0.0969
##
## sigma: 0.0796
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.01355766 0.07956267 0.06302001 -8.828896 24.08236 0.865899
## ACF1
## Training set 0.01998612
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2014.50 -0.2288297 -0.3307933 -0.12686599 -0.3847696 -0.072889690
## 2015.00 -0.2155438 -0.3533143 -0.07777332 -0.4262456 -0.004842037
## 2015.50 -0.2022579 -0.3794222 -0.02509370 -0.4732073 0.068691395
checkresiduals(f32)
##
## Ljung-Box test
##
## data: Residuals from Holt's method
## Q* = 15.226, df = 3, p-value = 0.001634
##
## Model df: 4. Total lags used: 7
f33 <- holt(RBO.tr, initial="simple", exponential=TRUE, h=3)
# Fit with exponential trend
summary(f33)
##
## Forecast method: Holt's method with exponential trend
##
## Model Information:
## Holt's method with exponential trend
##
## Call:
## holt(y = RBO.tr, h = 3, initial = "simple", exponential = TRUE)
##
## Smoothing parameters:
## alpha = 0.7417
## beta = 0.2581
##
## Initial states:
## l = -0.281
## b = 1.3447
##
## sigma: 0.2658
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.01935805 0.08295608 0.06495641 -10.75976 24.88312 0.8925052
## ACF1
## Training set -0.002607442
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2014.50 -0.2301822 -0.3103707 -0.15370514 -0.3492260 -0.11004616
## 2015.00 -0.2186152 -0.3347460 -0.12546123 -0.4108465 -0.08717152
## 2015.50 -0.2076295 -0.3691222 -0.09681279 -0.4845434 -0.06043851
checkresiduals(f33)
##
## Ljung-Box test
##
## data: Residuals from Holt's method with exponential trend
## Q* = 15.743, df = 3, p-value = 0.00128
##
## Model df: 4. Total lags used: 7
4)Additive Damped holts method
#
f34 <- holt(RBO.tr, damped=TRUE, initial="simple", h=3)
# Fit with additive damped trend
summary(f34)
##
## Forecast method: Damped Holt's method
##
## Model Information:
## Damped Holt's method
##
## Call:
## holt(y = RBO.tr, h = 3, damped = TRUE, initial = "simple")
##
## Smoothing parameters:
## alpha = 1e-04
## beta = 1e-04
## phi = 0.98
##
## Initial states:
## l = -0.2809
## b = -6e-04
##
## sigma: 0.0593
##
## AIC AICc BIC
## -87.03774 -85.48219 -74.37250
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.004210585 0.05685667 0.04609657 -2.873118 17.1346 0.6333698
## ACF1
## Training set 0.1502685
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2014.50 -0.3013437 -0.3773918 -0.2252956 -0.4176492 -0.1850381
## 2015.00 -0.3015122 -0.3775603 -0.2254641 -0.4178177 -0.1852066
## 2015.50 -0.3016773 -0.3777254 -0.2256292 -0.4179829 -0.1853718
checkresiduals(f34)
##
## Ljung-Box test
##
## data: Residuals from Damped Holt's method
## Q* = 30.622, df = 3, p-value = 1.021e-06
##
## Model df: 5. Total lags used: 8
The fitted values and 3 year forecasts are displayed in Figure 15
plot(f31, type="l", ylab="Similarity of RBO order wrt FFD", xlab="Year",main="Fig.15 Forecasting of RBO wrt FFd values",
fcol="white", plot.conf=FALSE)
lines(fitted(f31), col="blue")
lines(fitted(f32), col="red")
lines(fitted(f33), col="green")
lines(fitted(f34), col="cyan")
lines(f31$mean, col="blue", type="l")
lines(f32$mean, col="red", type="l")
lines(f33$mean, col="green", type="l")
lines(f34$mean, col="brown", type="l")
legend("topright", lty=1, col=c("black","blue","red","green","cyan"),c("Data","SES", "Holt's linear trend", "Exponential trend","Additive damped trend"))
Conclusion: Using Various timeseries analysis and modelling techniques, we have obtained the forescasting for next three years 2015,2016,2017.