About the Final Project:

The project report consist of three tasks: [a] Task1: Your task is to give best 4 weeks ahead forecasts in terms of R squared, AIC, BIC, MASE etc (as is appropriate) for the mortality series. Provide the point forecasts and confidence intervals and corresponding plot for the most optimal model for each method used.

[b] Task2: Your task is to model FFD and forecast FFD. Single climate predictors (univariate models) are to be tested. Your task is to give best FFD 4 years ahead forecasts for the FFD series. Point forecasts and confidence intervals are required for the forecasts with appropriate graphs.

[c] Task 3(a)Carry out your analysis based on univariate climate regressors (model one climate indicator at a time, i.e., univariate regressor). • Modelling methods to try (DLM, ARDL, polyck, koyck, dynlm). • Choice of optimal models within EACH a specific method can be assessed from values of R squared, AIC, BIC, MASE etc (as is appropriate to the method).

[c] Task 3(b)Perform the appropriate analysis and obtain the 3 year ahead forecasts (suggest using the dynlm package) only for part (b))

Task 1

Introduction

The aim of the investigation is to perform the analysis on disease specific mortality between the years 2010-2020 effected by both climate and pollution and observe the results to reach upon conclusions

using various techniques, we will analyse the nature of series.
investigate about Stationarity/ non stationarity of data set
perform transformation and decomposition
explore the various model fit to find accurate model for mortality rate prediction.

About Dataset

The dataset used ASX.csv includes averaged weekly mortality in Paris, France and the city’s local climate (temperature degrees Fahrenheit), size of pollutants and levels of noxious chemical emissions from cars and industry in the air - all measured at the same points between 2010-2020. All 5 series i.e. mortality, temperature, pollutants particle size and two chemical emissions (chem1, chem2) between 2010-2020 (508 time points) are given here in mort.csv. You will use this data for the calculation of 4 weeks ahead forecasts for mortality

mort_data <- read_csv("D:/Drive data/Rmit/Sem4/Forecasting/mort.csv")

## Warning: Missing column names filled in: 'X1' [1]

## 
## -- Column specification --------------------------------------------------------
## cols(
##   X1 = col_double(),
##   mortality = col_double(),
##   temp = col_double(),
##   chem1 = col_double(),
##   chem2 = col_double(),
##   `particle size` = col_double()
## )

colnames(mort_data)

## [1] "X1"            "mortality"     "temp"          "chem1"        
## [5] "chem2"         "particle size"

head(mort_data,5)

## # A tibble: 5 x 6
##      X1 mortality  temp chem1 chem2 `particle size`
##   <dbl>     <dbl> <dbl> <dbl> <dbl>           <dbl>
## 1     1      97.8  72.4 11.5   3.37            72.7
## 2     2     105.   67.2  8.92  2.59            49.6
## 3     3      94.4  62.9  9.48  3.29            55.7
## 4     4      98.0  72.5 10.3   3.04            55.2
## 5     5      95.8  74.2 10.6   3.39            66.0

summary(mort_data)

##        X1          mortality           temp           chem1       
##  Min.   :  1.0   Min.   : 68.11   Min.   :50.91   Min.   : 2.520  
##  1st Qu.:127.8   1st Qu.: 81.90   1st Qu.:67.23   1st Qu.: 4.970  
##  Median :254.5   Median : 87.33   Median :74.06   Median : 6.865  
##  Mean   :254.5   Mean   : 88.70   Mean   :74.26   Mean   : 7.909  
##  3rd Qu.:381.2   3rd Qu.: 94.36   3rd Qu.:81.49   3rd Qu.:10.080  
##  Max.   :508.0   Max.   :132.04   Max.   :99.88   Max.   :22.390  
##      chem2       particle size  
##  Min.   :0.860   Min.   :20.25  
##  1st Qu.:2.050   1st Qu.:35.85  
##  Median :2.740   Median :44.25  
##  Mean   :2.844   Mean   :47.41  
##  3rd Qu.:3.465   3rd Qu.:57.54  
##  Max.   :6.570   Max.   :97.94

class(mort_data)

## [1] "spec_tbl_df" "tbl_df"      "tbl"         "data.frame"

#tail(mort_data)
#508 points from 2010 to 2020 weekly so we have 52 weeks
T1_ts <- ts(mort_data[,2:6], start = c(2010,7), frequency = 52)
class(T1_ts)

## [1] "mts"    "ts"     "matrix"

#T1_ts
#tail(T1_ts)

Timeseries on each column

##Mortality
mortal_ts <- ts(mort_data$mortality, start = c(2010,7),frequency = 52)

##temperature
temp_ts <- ts(mort_data$temp, start = c(2010,7),frequency = 52)

##Chemical 1 &chemical2
chem1_ts <- ts(mort_data$chem1, start = c(2010,7),frequency = 52)

chem2_ts <- ts(mort_data$chem2, start = c(2010,7),frequency = 52)

#Particle size
part_ts <- ts(mort_data$`particle size`, start = c(2010,7),frequency = 52)

Plot the time series dataset

Plotting each column in data

1)Mortality

##PLOtting

plot(mortal_ts,type = "o", ylab="mortality index", xlab="Year", main = "Time series plot of mortality rates between 2010 to 2020")

In Mortality series we do not find any trend. We do not see seasonality in the series. Intervention point is observed. Moving average is found. Changing varince is present

2)Temperature

plot(temp_ts,type = "o", ylab="temperature index", xlab="Year", main = "Time series plot of temperature change rates between 2010 to 2020")

In Temperature series we do not see any trend. We found seasonality in the series. No intervention point is observed. Moving average was visible. Changing varince is present.

plot(chem1_ts,type = "o", ylab="chemical1 index", xlab="Year", main = "Time series plot of chemical 1 change rates between 2010 to 2020")

In Chemical 1 series we see downward any trend. We see no seasonality in the series. No intervention point is observed. Moving average is visible. Changing variance was present.

plot(chem2_ts,type = "o", ylab="chemical 2 index", xlab="Year", main = "Time series plot of chemical 2 change rates between 2010 to 2020")

In Chemical 2 series we find no trend. We found slight seasonality in the series. Intervention point is observed. Moving average is visible. Changing variance was present.

plot(part_ts,type = "o", ylab="particle size", xlab="Year", main = "Time series plot of particle size change rates between 2010 to 2020")

In Particle Size series we did not find any trend. We observed seasonality in the series. No intervention point is observed. Moving average is visible. Changing variance was found.

Successive points as well as fluctuations around mean level suggest autoregressive and moving average behaviour.

To further explore the relationship between the all ords index and our independent series, we display them within the same plot. Standartisation is performed over all variables by centering and scaling to clearly plot them on the same scale.

Scaling down to avoid mismatch

T1scale_date = scale(T1_ts)
plot(T1scale_date, plot.type = "s" ,col = c("Red","blue", "Green", "black","brown"),main="Time series plot of Scaled mortality data")
legend("bottomright",lty=1, text.width = 3, col=c("Red","blue", "Green", "black","brown"), c("Mortality", "Temperature", "Chemical1", "Chemical2","Particle Size"))

#Find the correlation between them

cor(T1_ts)

##                mortality        temp       chem1     chem2 particle size
## mortality      1.0000000 -0.43863962  0.55744759 0.2569989    0.44387133
## temp          -0.4386396  1.00000000 -0.09785582 0.4043740   -0.01723095
## chem1          0.5574476 -0.09785582  1.00000000 0.5130047    0.86611747
## chem2          0.2569989  0.40437401  0.51300467 1.0000000    0.46793404
## particle size  0.4438713 -0.01723095  0.86611747 0.4679340    1.00000000

It is observed that there is a moderate positive correlation between the dependent series all parameters From value obtained we see that Size is has good correlation than other predictors

Analysing the non stationarity of data

To find the suitable lag and find stationarity unit root test is performed

##Augmented dicky fuller test

mortality data

par(mfrow=c(1,2))
acf(mortal_ts, main = "ACF for the mortality rate",cex.main=0.65)
pacf(mortal_ts, main = "PACF of Mortality",cex.main=0.05)

par(mfrow=c(1,1))
ar(mortal_ts)

## 
## Call:
## ar(x = mortal_ts)
## 
## Coefficients:
##      1       2  
## 0.4339  0.4376  
## 
## Order selected 2  sigma^2 estimated as  32.84

#order selected=2
adf.test(mortal_ts,k = 2)

## Warning in adf.test(mortal_ts, k = 2): p-value smaller than printed p-value

## 
##  Augmented Dickey-Fuller Test
## 
## data:  mortal_ts
## Dickey-Fuller = -5.161, Lag order = 2, p-value = 0.01
## alternative hypothesis: stationary

Since the p value is smaller than 0.05, we reject the null hypothesis that implies stationarity.

PP.test(mortal_ts)

## 
##  Phillips-Perron Unit Root Test
## 
## data:  mortal_ts
## Dickey-Fuller = -9.454, Truncation lag parameter = 6, p-value = 0.01

According to PP test,p value is lower than 5% thus mortal_ts series is stationarity.

Temperature data

par(mfrow=c(1,2))
acf(temp_ts, main = "ACF for the temperature series",cex.main=0.65)
pacf(temp_ts, main = "PACF of temperature",cex.main=0.05)

par(mfrow=c(1,1))
ar(temp_ts)

## 
## Call:
## ar(x = temp_ts)
## 
## Coefficients:
##       1        2        3        4        5        6        7        8  
##  0.1479   0.2072   0.0702   0.1794   0.0486   0.0769   0.0191   0.0618  
##       9       10       11       12       13       14       15       16  
##  0.0934  -0.0328  -0.0889  -0.0992  -0.0092  -0.0335  -0.0240   0.0094  
##      17       18       19       20       21       22  
##  0.0180   0.0004  -0.0465  -0.0204  -0.0373  -0.1382  
## 
## Order selected 22  sigma^2 estimated as  36.83

#order selected=22
adf.test(temp_ts,k = 22)

## Warning in adf.test(temp_ts, k = 22): p-value smaller than printed p-value

## 
##  Augmented Dickey-Fuller Test
## 
## data:  temp_ts
## Dickey-Fuller = -8.2554, Lag order = 22, p-value = 0.01
## alternative hypothesis: stationary

Since the p value is lower than 0.05, we reject the null hypothesis that implies stationarity.

PP.test(temp_ts)

## 
##  Phillips-Perron Unit Root Test
## 
## data:  temp_ts
## Dickey-Fuller = -12.095, Truncation lag parameter = 6, p-value = 0.01

According to PP test,p value is lower than 5% thus temp series is stationarity and alternate hypothesis is stationary.

Chemical1 data

par(mfrow=c(1,2))
acf(chem1_ts, main = "ACF for the chemical1 values",cex.main=0.65)
pacf(chem1_ts, main = "PACF of chemical1 values",cex.main=0.05)

par(mfrow=c(1,1))
ar(chem1_ts)

## 
## Call:
## ar(x = chem1_ts)
## 
## Coefficients:
##       1        2        3        4        5        6        7        8  
##  0.0883   0.3275   0.1834   0.1018   0.1016   0.1447   0.0522   0.0184  
##       9       10       11       12       13       14       15       16  
## -0.0052   0.0542  -0.1058  -0.1009   0.0225  -0.0643  -0.0490  -0.0802  
## 
## Order selected 16  sigma^2 estimated as  5.609

#order selected=16
adf.test(chem1_ts,k = 16)

## Warning in adf.test(chem1_ts, k = 16): p-value smaller than printed p-value

## 
##  Augmented Dickey-Fuller Test
## 
## data:  chem1_ts
## Dickey-Fuller = -8.1588, Lag order = 16, p-value = 0.01
## alternative hypothesis: stationary

Since the p value is lower than 0.05, we reject the null hypothesis that implies stationarity.

PP.test(chem1_ts)

## 
##  Phillips-Perron Unit Root Test
## 
## data:  chem1_ts
## Dickey-Fuller = -12.819, Truncation lag parameter = 6, p-value = 0.01

According to PP test,p value is lower than 5% thus temp series is stationarity and alternate hypothesis is stationary.

chemical2 data

par(mfrow=c(1,2))
acf(chem2_ts, main = "ACF for the chemical2 values",cex.main=0.65)
pacf(chem2_ts, main = "PACF of chemical2",cex.main=0.05)

par(mfrow=c(1,1))
ar(chem2_ts)

## 
## Call:
## ar(x = chem2_ts)
## 
## Coefficients:
##      1       2       3       4       5       6       7       8  
## 0.1319  0.2025  0.0119  0.1425  0.1070  0.0523  0.0631  0.0958  
## 
## Order selected 8  sigma^2 estimated as  0.7765

#order selected=8
adf.test(chem2_ts,k = 8)

## Warning in adf.test(chem2_ts, k = 8): p-value smaller than printed p-value

## 
##  Augmented Dickey-Fuller Test
## 
## data:  chem2_ts
## Dickey-Fuller = -5.3362, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary

Since the p value is lower than 0.05, we reject the null hypothesis that implies stationarity.

PP.test(chem2_ts)

## 
##  Phillips-Perron Unit Root Test
## 
## data:  chem2_ts
## Dickey-Fuller = -20.014, Truncation lag parameter = 6, p-value = 0.01

According to PP test,p value is lower than 5% thus temp series is stationarity and alternate hypothesis is stationary.

Particle size data

par(mfrow=c(1,2))
acf(part_ts, main = "ACF for the particle size series",cex.main=0.65)
pacf(part_ts, main = "PACF of particle size",cex.main=0.05)

par(mfrow=c(1,1))
ar(part_ts)

## 
## Call:
## ar(x = part_ts)
## 
## Coefficients:
##       1        2        3        4        5        6        7        8  
##  0.1272   0.2584   0.1620   0.1593   0.0681   0.1083   0.0666   0.0256  
##       9       10       11       12       13       14  
## -0.0359   0.0504  -0.0827  -0.0989  -0.0665  -0.1112  
## 
## Order selected 14  sigma^2 estimated as  114.6

#order selected=14
adf.test(part_ts,k = 14)

## Warning in adf.test(part_ts, k = 14): p-value smaller than printed p-value

## 
##  Augmented Dickey-Fuller Test
## 
## data:  part_ts
## Dickey-Fuller = -7.2956, Lag order = 14, p-value = 0.01
## alternative hypothesis: stationary

Since the p value is lower than 0.05, we reject the null hypothesis that implies stationarity.

PP.test(part_ts)

## 
##  Phillips-Perron Unit Root Test
## 
## data:  part_ts
## Dickey-Fuller = -13.343, Truncation lag parameter = 6, p-value = 0.01

According to PP test,p value is lower than 5% thus temp series is stationarity and alternate hypothesis is stationary.

Based on the slowly decaying pattern of significant lags in the sample ACF plots in Figures 6-9, we can conclude that all explored series have a trend. The ADF test reports p-values > 0.05 for all the series, so we fail to reject the H0 that the series are nonstationary at 5% level. Overall, from a descriptive analysis of time series plots, sample ACF plots and the ADF test results, we can observe that there is nonstationarity existent in the asx data. Augmented Dickey-Fuller Test states tha t all the series is stationary as we get p-value less than 5% hence we reject null hypothesis.

#Decomposition ##Decomposition of time series Decomposition of time series into different components is useful to observe the individualeffects of the existing components and historical effects occurred in the past. Thecomponents that a time series can be decomposed into are * seasonal * trend and * remainder , which includes other effects that are not captured by the seasonal and trend components. Basically, there are three main decomposition methods for time series. The most basic oneis the * classical decomposition, which provides the basis for other decomposition methods. * The X-12-ARIMA decomposition is another decomposition which is more complex than the classical decomposition. It is mostly used for quarterly and monthly data.

One of thevery robust and commonly used decomposition methods is the Seasonal and Trend decomposition using Loss (STL) decomposition). When a time series is displayed, it includes trend and seasonal effect in a confounded way;hence, it would be very difficult to infer about the main characteristics of the series underthe effect of seasonality.

Therefore, we use time series decomposition to extract eachcomponent from the series and adjust the series for various effects like seasonality.

#Decomposition function giving output STL plot.
decompose <- function(x){
  stldeco=mstl(x, t.window=15, s.window="periodic", robust=TRUE)
  plot(stldeco)
}
#decomposition of Mortality
decompose(mortal_ts)

#Decomposition of Temperature
decompose(temp_ts)

#Decomposition of Chemical 1
decompose(chem1_ts)

#Decomposition of chemical 2
decompose(chem2_ts)

#Decomposition of Particle Size
decompose(part_ts)

STL decomposition and x12 decomposition

STL handles any type of seasonality, whiles others are somewhat limited to onlymonthly and/or quarterly series. The seasonal component can change by the time and the rate of change can becontrolled by the user. The smoothness of the trend-cycle can also be controlled by the user. We can make it robust to outliers by sending the effect of occasional unusualobservations to the remainder component.

1)Mortality

stl_mort = stl(mortal_ts,t.window = 15, s.window = "periodic", robust = T)
plot(stl_mort, main = "STL decomposition of all price index series")

#Seasonal period too large.thus cannot fit x12 model

##Naive forcast
mortadj =seasadj(stl_mort)
plot(naive(mortadj), xlab="mortality rates", main= "Naive forecasts of seasonally adjusted mortality rates")

From the remainder series of the all ords series in Figure 10, it is observed that the spikes in the raw data are caused by other external factors, they happen around the intervention point.

Temperature

stl_temp = stl(temp_ts,t.window = 15, s.window = "periodic", robust = T)
plot(stl_temp, main = "STL decomposition of gold price index series")

##Naive forcast
tempadj =seasadj(stl_temp)
plot(naive(tempadj), ylab="temperature rates",xlab="Time period",  main= "Naive forecasts of seasonally adjusted temperature rates")

There was no seasonal effect found in the gold price series at the data visualisation stage, so the seasonally adjusted data in X12 decomposition is very close to the original series,and the seasonal pattern in STL decomposition is meaningless (Figure 11). The remainder component of this series is not smooth at all, meaning there are other unknown factors that have an impact on the series.

3)Chemical1

stl_c1 = stl(chem1_ts,t.window = 15, s.window = "periodic", robust = T)
plot(stl_c1, main = "STL decomposition of chemical1 value index series")

##Naive forcast
c1adj =seasadj(stl_c1)
plot(naive(c1adj), ylab="chemical 1 values",xlab="Time period",  main= "Naive forecasts of seasonally adjusted chemical1 rates")

From the STL decomposition in Figure 12,it is observed that the remainder has a major peak when the intervention happened, but otherwise is rather smooth.

4)Chemical2

stl_c2 = stl(chem2_ts,t.window = 15, s.window = "periodic", robust = T)
plot(stl_c2, main = "STL decomposition of chemical2 value index series")

##Naive forcast
c2adj =seasadj(stl_c2)
plot(naive(c2adj), ylab="chemical2 values",xlab="Time period",  main= "Naive forecasts of seasonally adjusted chemical 2 value")

5)particle space

stl_p = stl(part_ts,t.window = 15, s.window = "periodic", robust = T)
plot(stl_p, main = "STL decomposition of particle space value index series")

##Naive forecast
partadj =seasadj(stl_c2)
plot(naive(partadj), ylab="Particle size",xlab="Time period", main= "Naive forecasts of seasonally adjusted particle space value")

The conclusions that we can make from the decomposition of the mortality data are similar to all the other previously analysed series. There are ups and downs around the intervention in the remainder of the series (Figure 13). There was no seasonality found prior to decomposition, so there is no seasonal effect here.

Overall, it is observed that there is no seasonality effect on the mortal data, and all the fluctuations are due to other external factors. Since we did not find any evidence of seasonality from the time series and ACF plots, the seasonal pattern from STL decomposition is not meaningful. From the X12 decomposition we can observe that the seasonally adjusted data is very close to the original series.

Modelling process:

In the modelling process we attempt to find the best appropriate model for the all ordinaries price index.

These predictors were chosen based on their correlation with each other and with the dependent variable.

#1 Finite DLM

dataf = mort_data
colnames(dataf) <- c("mortality", "temp", "X1", "X2","X3")
for ( i in 1:10){
  model1.1 = dlm(formula = mortality ~ temp + +X1+X2+X3, data = data.frame(dataf), q = i )
  cat("q = ", i, "AIC = ", AIC(model1.1$model), "BIC = ", BIC(model1.1$model),"Mase =",MASE(model1.1)$MASE, "\n")
}

## q =  1 AIC =  6162.854 BIC =  6205.139 Mase = 84.97144 
## q =  2 AIC =  6102.699 BIC =  6161.871 Mase = 79.95024 
## q =  3 AIC =  6049.467 BIC =  6125.509 Mase = 76.15284 
## q =  4 AIC =  6010.61 BIC =  6103.507 Mase = 73.65758 
## q =  5 AIC =  5975.72 BIC =  6085.455 Mase = 71.25103 
## q =  6 AIC =  5939.706 BIC =  6066.264 Mase = 68.52489 
## q =  7 AIC =  5901.759 BIC =  6045.123 Mase = 65.89878 
## q =  8 AIC =  5861.617 BIC =  6021.772 Mase = 62.95227 
## q =  9 AIC =  5820.038 BIC =  5996.968 Mase = 60.43504 
## q =  10 AIC =  5776.348 BIC =  5970.035 Mase = 58.22366

Finite dlm

Multiple predictors For all indexes

Model1.AllIndexes = dlm(formula = mortality ~ temp + X3, data = data.frame(dataf), q=10)
summary(Model1.AllIndexes)

## 
## Call:
## lm(formula = as.formula(model.formula), data = design)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -243.910  -55.471    8.942   54.356  220.948 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1228.49183   40.41868  30.394  < 2e-16 ***
## temp.t        -1.15116    0.66343  -1.735 0.083358 .  
## temp.1        -0.36794    0.72348  -0.509 0.611288    
## temp.2        -0.32930    0.76406  -0.431 0.666675    
## temp.3        -0.09361    0.76381  -0.123 0.902506    
## temp.4         0.17080    0.76156   0.224 0.822640    
## temp.5         0.11340    0.76171   0.149 0.881713    
## temp.6        -0.07927    0.76292  -0.104 0.917287    
## temp.7        -0.33629    0.76079  -0.442 0.658672    
## temp.8        -0.46105    0.76066  -0.606 0.544719    
## temp.9        -1.48477    0.71179  -2.086 0.037514 *  
## temp.10       -2.46332    0.65084  -3.785 0.000173 ***
## X3.t         -20.06364    4.14000  -4.846 1.71e-06 ***
## X3.1         -19.55329    4.26072  -4.589 5.70e-06 ***
## X3.2         -15.02953    4.32428  -3.476 0.000556 ***
## X3.3         -16.71820    4.31411  -3.875 0.000121 ***
## X3.4         -13.09503    4.37328  -2.994 0.002894 ** 
## X3.5         -10.46192    4.38709  -2.385 0.017484 *  
## X3.6          -9.49201    4.36805  -2.173 0.030270 *  
## X3.7          -9.38949    4.37222  -2.148 0.032256 *  
## X3.8          -7.44280    4.39035  -1.695 0.090681 .  
## X3.9          -8.14088    4.33394  -1.878 0.060938 .  
## X3.10         -8.62138    4.30207  -2.004 0.045637 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 79.12 on 475 degrees of freedom
## Multiple R-squared:  0.7111, Adjusted R-squared:  0.6977 
## F-statistic: 53.14 on 22 and 475 DF,  p-value: < 2.2e-16
## 
## AIC and BIC values for the model:
##        AIC      BIC
## 1 5791.183 5892.238

residualcheck=function(x){
  shapiro.test(x$residuals)
}
residualcheck(Model1.AllIndexes$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.99173, p-value = 0.00714

checkresiduals(Model1.AllIndexes$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 26
## 
## data:  Residuals
## LM test = 474.68, df = 26, p-value < 2.2e-16

VIF_m1 = vif(Model1.AllIndexes$model)
VIF_m1

##   temp.t   temp.1   temp.2   temp.3   temp.4   temp.5   temp.6   temp.7 
## 3.522721 4.188573 4.672719 4.660662 4.619670 4.617936 4.615754 4.590513 
##   temp.8   temp.9  temp.10     X3.t     X3.1     X3.2     X3.3     X3.4 
## 4.588034 4.035825 3.375787 1.523050 1.608759 1.658110 1.648431 1.689063 
##     X3.5     X3.6     X3.7     X3.8     X3.9    X3.10 
## 1.696054 1.681233 1.679486 1.691992 1.649012 1.625328

VIF_m1 > 10

##  temp.t  temp.1  temp.2  temp.3  temp.4  temp.5  temp.6  temp.7  temp.8  temp.9 
##   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE 
## temp.10    X3.t    X3.1    X3.2    X3.3    X3.4    X3.5    X3.6    X3.7    X3.8 
##   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE 
##    X3.9   X3.10 
##   FALSE   FALSE

If the value of VIF is greater than 10, we can conclude that the effect of multicollinearity is high.

#Temp
model1.temp <- dlm(x=as.vector(dataf$temp), y=as.vector(dataf$mortality), q=10)
summary(model1.temp)

## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -262.11  -89.47   -2.66   93.64  275.67 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.174e+03  6.016e+01  19.518  < 2e-16 ***
## x.t         -2.899e+00  9.459e-01  -3.065  0.00230 ** 
## x.1         -1.431e+00  1.024e+00  -1.397  0.16304    
## x.2         -6.515e-01  1.095e+00  -0.595  0.55218    
## x.3         -3.835e-01  1.095e+00  -0.350  0.72639    
## x.4         -1.668e-03  1.096e+00  -0.002  0.99879    
## x.5          6.399e-02  1.096e+00   0.058  0.95346    
## x.6          4.227e-02  1.097e+00   0.039  0.96929    
## x.7         -1.910e-01  1.096e+00  -0.174  0.86177    
## x.8         -3.774e-01  1.097e+00  -0.344  0.73087    
## x.9         -1.514e+00  1.025e+00  -1.477  0.14031    
## x.10        -2.970e+00  9.479e-01  -3.133  0.00183 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 119.6 on 486 degrees of freedom
## Multiple R-squared:  0.3243, Adjusted R-squared:  0.309 
## F-statistic:  21.2 on 11 and 486 DF,  p-value: < 2.2e-16
## 
## AIC and BIC values for the model:
##        AIC      BIC
## 1 6192.336 6247.073

checkresiduals(model1.temp)

##            1            2            3            4            5            6 
## -173.9110411 -165.2324112 -174.8008560 -189.3055931 -180.6809853 -221.4638279 
##            7            8            9           10           11           12 
## -247.3735674 -227.1723497 -245.3420008 -205.4787361 -221.9912166 -202.9837259 
##           13           14           15           16           17           18 
## -208.2602349 -228.4757166 -220.9296073 -256.6840365 -246.8174666 -251.4165222 
##           19           20           21           22           23           24 
## -254.1753122 -224.2394737 -262.1047517 -207.0276056 -219.0632537 -223.2998477 
##           25           26           27           28           29           30 
## -217.9343181 -230.7859978 -230.3158899 -210.1819458 -203.6393006 -197.2741253 
##           31           32           33           34           35           36 
## -194.0331829 -171.5325787 -174.4802410 -172.2179381 -131.6226821 -105.3451398 
##           37           38           39           40           41           42 
##  -48.6899586  -53.1170919  -61.8432574  -88.6530015 -110.8604109  -89.6887571 
##           43           44           45           46           47           48 
##  -72.2442783  -54.1794219  -20.7490259    5.8095577    5.6322319  -38.7757658 
##           49           50           51           52           53           54 
##  -39.7721468  -85.4694422  -98.4607513  -88.8176746  -87.4717181 -110.9609435 
##           55           56           57           58           59           60 
## -100.2946092 -117.4186719 -118.8705575 -135.9659968 -147.5358642 -179.7906844 
##           61           62           63           64           65           66 
## -178.0871199 -201.2200879 -173.5349275 -182.3780260 -175.4744648 -208.2497122 
##           67           68           69           70           71           72 
## -134.8915080 -151.1722332 -192.2902827 -194.6058011 -202.1978437 -179.9241408 
##           73           74           75           76           77           78 
## -157.3069215 -168.8761442 -196.0763584 -134.2858767 -106.4080165 -134.7091395 
##           79           80           81           82           83           84 
## -149.9134025 -110.8794141 -155.7726334  -97.6465106 -119.9820358 -128.0926220 
##           85           86           87           88           89           90 
##  -92.8433508  -36.4216330  -47.6779134    0.6172922   18.2685926   15.5064126 
##           91           92           93           94           95           96 
##  -17.9884370   10.6357179   -3.5140161   13.7314847   25.5753221   23.8991052 
##           97           98           99          100          101          102 
##   18.4211074    6.1616208    3.7415844  -38.0688099  -32.2155699  -53.8971423 
##          103          104          105          106          107          108 
##  -39.3222936  -39.2794719  -73.9907584 -103.0364305 -106.7168873 -138.5681816 
##          109          110          111          112          113          114 
## -111.7302957 -159.4940562 -140.6790148 -146.0943322 -152.2750188 -121.4275755 
##          115          116          117          118          119          120 
## -152.3556562 -140.8921654 -123.7577516 -126.6159927  -93.2879471 -150.4030344 
##          121          122          123          124          125          126 
## -140.6874649 -174.6404861 -157.0349134 -146.0061261 -159.5970659 -122.6880306 
##          127          128          129          130          131          132 
## -130.2045384 -108.1279414 -129.8213431 -128.2992776 -139.3149877 -126.7641326 
##          133          134          135          136          137          138 
##  -97.0783600 -106.5736113  -78.0101038  -71.0953055  -81.2066496  -55.9815085 
##          139          140          141          142          143          144 
##  -28.9066180   26.6882114   88.8513420  135.9963432  110.3244287   98.3749746 
##          145          146          147          148          149          150 
##  104.1927651   57.2438324   34.3560483   51.4163753   62.0871469  102.1742134 
##          151          152          153          154          155          156 
##  115.5886114   91.8571490   74.8493395   58.2456571    1.4443511  -10.3959732 
##          157          158          159          160          161          162 
##  -34.0853401  -34.8477166  -41.1870225  -71.1625015  -97.2893636  -86.6236138 
##          163          164          165          166          167          168 
##  -97.9256322 -116.6323228  -75.6753600  -43.7854109  -63.3130468  -78.2767021 
##          169          170          171          172          173          174 
## -106.8780085 -131.3438858 -113.1327713 -129.5157462 -122.3572612 -106.8655809 
##          175          176          177          178          179          180 
##  -94.1544865 -125.7118683 -107.8625569 -131.5631704 -102.6837570 -103.7311777 
##          181          182          183          184          185          186 
## -109.0075678 -129.9077939 -100.6835113 -129.6657779 -109.0005020  -88.5667733 
##          187          188          189          190          191          192 
##  -50.9169580  -56.6397917  -21.9340464  -44.3238750  -35.7959609   -3.4875565 
##          193          194          195          196          197          198 
##   15.8105871   -9.9731550   -6.9839119  -14.0156600   -9.2261644   -2.5210838 
##          199          200          201          202          203          204 
##   31.3415099   50.5351638   71.3315129  108.0761944   79.5420133   33.5288103 
##          205          206          207          208          209          210 
##   -4.0611845    9.7126412   19.4947916    0.1845762   25.6009758   25.3531766 
##          211          212          213          214          215          216 
##   47.0822098   32.5389403   13.2655778   -2.4361076  -26.1738947  -19.5852577 
##          217          218          219          220          221          222 
##  -35.5725741  -25.9144264  -36.2276628  -46.1004604  -28.2305516  -71.0238228 
##          223          224          225          226          227          228 
##  -33.2528133  -78.4121254  -83.9972637  -88.0313576  -61.1217263  -33.9119232 
##          229          230          231          232          233          234 
##  -80.1138492  -42.0712671  -60.3759753  -60.2872359  -42.7516725  -63.6687115 
##          235          236          237          238          239          240 
##  -32.0303724  -31.5691248  -33.1001406   -6.9103383  -35.6704511   13.9655184 
##          241          242          243          244          245          246 
##   -7.0651455   12.8894679   41.7735716   83.7673601   97.0447985   90.9364386 
##          247          248          249          250          251          252 
##  117.0436708  121.2470288  114.0451237  151.6632081   93.7949207  100.3520305 
##          253          254          255          256          257          258 
##   98.8787510  106.9725834  108.8316053  102.1952407  113.3214993   95.8947674 
##          259          260          261          262          263          264 
##  111.3481214  111.3393414   31.4938755   11.2431719  -33.3549514  -42.4940337 
##          265          266          267          268          269          270 
##  -11.9078781  -26.4498547  -26.6982475  -49.1462300  -35.4142297  -17.1022903 
##          271          272          273          274          275          276 
##  -62.0353873  -75.9187542  -92.9345614  -67.3451004  -57.4499227  -71.0669274 
##          277          278          279          280          281          282 
##  -95.2828227 -106.3500125  -62.8642128  -47.0413772  -58.6385604  -79.1655618 
##          283          284          285          286          287          288 
##  -77.4430311  -90.7417161 -101.1776738  -81.1180183  -67.1323958  -26.2725043 
##          289          290          291          292          293          294 
##  -16.9002758  -16.8197327   -7.4938436    5.6728464   -6.5250404  -24.5922283 
##          295          296          297          298          299          300 
##  -17.5443565   33.8692648   79.6887738   76.7411639   62.1276778   66.3711272 
##          301          302          303          304          305          306 
##   80.3892910  103.2368524  113.5093210  124.3178664  120.5872121  148.7913364 
##          307          308          309          310          311          312 
##  170.6713758  128.7406993  110.6687384   86.5476827   85.7993029  102.7617846 
##          313          314          315          316          317          318 
##  110.0108287  104.3184657   75.2865080   77.4013079   68.1129219   25.9307189 
##          319          320          321          322          323          324 
##   24.8773584   22.5704773   21.5079489   48.2350046   21.3235909   -2.7998949 
##          325          326          327          328          329          330 
##  -28.6147523  -46.9380522  -50.7033023  -17.8418280  -22.4094737   -8.6282053 
##          331          332          333          334          335          336 
##  -12.9582795    5.4533145  -48.0516536  -49.3854554  -51.4804530  -44.3979682 
##          337          338          339          340          341          342 
##   -4.4050588  -11.1578846  -40.3930242   -8.2757927   -4.1708865    3.0312284 
##          343          344          345          346          347          348 
##   -9.1948511    5.3571259   25.3198118   29.5265549   58.6581666   70.2916994 
##          349          350          351          352          353          354 
##   81.6243173  105.0612008   98.8015038   93.1666441   61.2774768   79.5090566 
##          355          356          357          358          359          360 
##   77.7378484   56.2428133   95.5294256  109.7464431  107.5546921  103.9317380 
##          361          362          363          364          365          366 
##  122.0513358  113.7015140   85.5641022   94.8729890   84.9217364   66.0685108 
##          367          368          369          370          371          372 
##   68.6454496   46.2888535   83.4792172   89.3688022   96.5279139   73.9755551 
##          373          374          375          376          377          378 
##   64.7687383   59.8849744   53.2985000   35.3215966   17.3624197   24.0752905 
##          379          380          381          382          383          384 
##   77.2737364   49.7921768   13.8205515    3.2374796    4.7531111  -19.1420345 
##          385          386          387          388          389          390 
##   -7.5467576  -16.9386685  -14.7532279   30.8192199   26.6119694   23.3322726 
##          391          392          393          394          395          396 
##    8.7386887   16.7490077   14.9491508   33.1096089   25.3681342   36.0473420 
##          397          398          399          400          401          402 
##   54.2410084   76.7943775   61.0960212  101.7692150  109.7963259  117.4093029 
##          403          404          405          406          407          408 
##  156.6606119  177.6374809  183.2556861  199.5284671  174.2559457  195.8519721 
##          409          410          411          412          413          414 
##  174.7081311  180.8513014  171.7538240  192.9447516  208.3689367  226.6724946 
##          415          416          417          418          419          420 
##  214.6019846  183.3621545  169.7319925  154.2075359  113.4402415  113.9344483 
##          421          422          423          424          425          426 
##  128.3864246  110.7443819  109.8350659  111.9560800   89.7233961   92.8190013 
##          427          428          429          430          431          432 
##   65.5205748   55.8287225   89.0948209   79.5177601   63.5130770   43.9827485 
##          433          434          435          436          437          438 
##   76.4482458   72.3550218   67.0843620   80.4266190   63.1989710   78.0503284 
##          439          440          441          442          443          444 
##  122.3753821  118.5729562   98.7910630   90.9281151   81.4551768   89.9618333 
##          445          446          447          448          449          450 
##   97.1036751  117.1531973  112.9403018  114.7317245  189.1447389  184.4545147 
##          451          452          453          454          455          456 
##  166.0084300  192.0168165  221.8667952  222.5538436  190.9321043  188.1357716 
##          457          458          459          460          461          462 
##  172.0766387  210.8810949  260.9855969  252.3104862  263.2185253  275.6710715 
##          463          464          465          466          467          468 
##  273.8045108  215.5562524  203.4027388  201.7934596  229.4717186  258.9219239 
##          469          470          471          472          473          474 
##  219.8017599  201.8511522  179.3778020  199.8654510  202.1286520  196.7796983 
##          475          476          477          478          479          480 
##  180.3718445  180.1995478  207.0451458  164.1815072  154.3418304  165.9926964 
##          481          482          483          484          485          486 
##  156.6109438  159.3341912  147.1279771  126.7553044  120.0005996  138.6958912 
##          487          488          489          490          491          492 
##  143.4101241  108.0228197  143.1792720  159.2367441  147.0819503  129.9027004 
##          493          494          495          496          497          498 
##  105.2081827  109.5904347  117.5496196  122.6991169  163.9425184  171.0521727

#chemical1
model1.c1 <- dlm(x=as.vector(dataf$X1), y=as.vector(dataf$mortality), q=10)
summary(model1.c1)

## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -247.62 -136.81   10.29  123.43  240.01 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept) 160.28008   73.78138   2.172   0.0303 *
## x.t           0.06170    1.03202   0.060   0.9524  
## x.1           0.12744    1.05214   0.121   0.9036  
## x.2           0.22030    1.09015   0.202   0.8399  
## x.3           0.12043    1.09395   0.110   0.9124  
## x.4           0.12271    1.11476   0.110   0.9124  
## x.5           0.22306    1.11401   0.200   0.8414  
## x.6           0.12467    1.11382   0.112   0.9109  
## x.7           0.12565    1.09305   0.115   0.9085  
## x.8           0.15956    1.08662   0.147   0.8833  
## x.9           0.07092    1.05232   0.067   0.9463  
## x.10         -0.02110    1.03575  -0.020   0.9838  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 145.2 on 486 degrees of freedom
## Multiple R-squared:  0.004279,   Adjusted R-squared:  -0.01826 
## F-statistic: 0.1899 on 11 and 486 DF,  p-value: 0.9981
## 
## AIC and BIC values for the model:
##        AIC      BIC
## 1 6385.399 6440.137

checkresiduals(model1.c1)

##           1           2           3           4           5           6 
## -241.707607 -243.109962 -244.648658 -241.809855 -243.135078 -242.393354 
##           7           8           9          10          11          12 
## -240.376199 -242.036307 -241.205730 -243.107096 -244.435117 -243.893069 
##          13          14          15          16          17          18 
## -243.373263 -245.410050 -245.532661 -246.076783 -247.620538 -246.603628 
##          19          20          21          22          23          24 
## -245.020807 -243.972870 -242.269183 -240.819006 -240.637417 -239.633138 
##          25          26          27          28          29          30 
## -235.957558 -232.249299 -231.192134 -227.826895 -226.132129 -223.517645 
##          31          32          33          34          35          36 
## -220.296983 -216.010295 -211.460938 -210.332169 -207.266866 -202.063001 
##          37          38          39          40          41          42 
## -199.976681 -197.721036 -193.080959 -195.697644 -194.874592 -193.683107 
##          43          44          45          46          47          48 
## -194.062050 -195.065990 -193.239460 -192.511112 -195.190833 -191.755051 
##          49          50          51          52          53          54 
## -192.143821 -192.104145 -192.092660 -189.983417 -188.507222 -189.007978 
##          55          56          57          58          59          60 
## -188.171122 -187.037512 -187.263141 -185.550802 -181.932425 -183.892040 
##          61          62          63          64          65          66 
## -185.571251 -186.439206 -187.569105 -188.888228 -190.646542 -190.885953 
##          67          68          69          70          71          72 
## -192.933098 -196.361827 -196.457459 -194.602667 -195.066419 -196.485921 
##          73          74          75          76          77          78 
## -194.226974 -194.032488 -191.491656 -188.708697 -188.017697 -186.251151 
##          79          80          81          82          83          84 
## -179.890548 -177.963479 -175.610289 -169.394043 -166.347924 -163.514246 
##          85          86          87          88          89          90 
## -159.443357 -153.535980 -150.878738 -149.166845 -146.077885 -144.283246 
##          91          92          93          94          95          96 
## -143.864930 -141.719173 -140.177819 -140.847512 -141.236090 -141.056024 
##          97          98          99         100         101         102 
## -142.155981 -141.839772 -142.388548 -140.994028 -141.130968 -142.337726 
##         103         104         105         106         107         108 
## -141.518684 -140.441259 -140.472151 -139.829070 -140.000373 -138.319667 
##         109         110         111         112         113         114 
## -139.253025 -139.305191 -139.206683 -140.612000 -138.696213 -139.892801 
##         115         116         117         118         119         120 
## -140.090157 -140.068954 -140.995821 -141.822565 -141.587145 -143.068214 
##         121         122         123         124         125         126 
## -141.397029 -140.077380 -141.802096 -139.530372 -137.208771 -135.593034 
##         127         128         129         130         131         132 
## -133.610598 -131.143729 -128.240519 -126.216804 -123.981962 -120.457121 
##         133         134         135         136         137         138 
## -117.914104 -115.293525 -111.844544 -109.928619 -107.802674 -103.759569 
##         139         140         141         142         143         144 
## -100.220867 -101.816462 -100.268385  -95.242475  -95.134891  -95.780899 
##         145         146         147         148         149         150 
##  -90.907081  -89.850443  -91.349866  -87.938730  -87.560793  -86.648335 
##         151         152         153         154         155         156 
##  -84.682066  -83.732016  -83.418075  -81.786998  -82.224679  -84.038323 
##         157         158         159         160         161         162 
##  -81.347749  -81.569960  -82.927819  -82.507069  -83.283348  -83.980174 
##         163         164         165         166         167         168 
##  -85.024793  -83.972521  -85.928361  -87.005380  -89.161744  -88.823517 
##         169         170         171         172         173         174 
##  -88.165888  -90.055108  -88.864919  -88.936978  -88.207600  -86.967150 
##         175         176         177         178         179         180 
##  -85.639004  -82.996036  -80.673793  -80.027040  -78.220425  -75.745381 
##         181         182         183         184         185         186 
##  -75.364330  -72.087981  -71.467105  -71.451618  -68.618419  -67.187829 
##         187         188         189         190         191         192 
##  -64.979494  -60.729505  -57.745514  -56.878933  -54.104328  -49.943196 
##         193         194         195         196         197         198 
##  -46.441153  -43.609364  -40.250175  -41.266461  -40.195318  -37.770646 
##         199         200         201         202         203         204 
##  -37.420161  -36.952939  -35.141299  -36.393237  -37.050493  -34.560707 
##         205         206         207         208         209         210 
##  -35.445392  -33.613637  -32.611631  -34.576646  -33.022922  -31.720386 
##         211         212         213         214         215         216 
##  -33.160233  -32.037619  -31.300872  -32.492439  -33.303660  -31.287912 
##         217         218         219         220         221         222 
##  -31.583067  -34.532919  -34.018610  -35.005979  -36.323365  -37.590688 
##         223         224         225         226         227         228 
##  -38.622943  -37.777874  -37.330498  -36.729041  -34.723768  -34.272087 
##         229         230         231         232         233         234 
##  -34.024457  -32.568275  -29.523362  -28.034390  -25.675868  -23.586562 
##         235         236         237         238         239         240 
##  -22.902647  -19.548371  -14.623281  -12.900319  -13.222337   -8.252571 
##         241         242         243         244         245         246 
##   -6.339807   -6.058185   -2.144757    0.460595    1.938088    5.659977 
##         247         248         249         250         251         252 
##    6.411807    6.848453    9.148531   11.429171   14.117572   13.945599 
##         253         254         255         256         257         258 
##   15.400067   16.216068   16.179546   19.967615   22.854585   22.434129 
##         259         260         261         262         263         264 
##   24.063138   25.748472   25.807806   26.937739   27.392202   26.349142 
##         265         266         267         268         269         270 
##   25.601281   26.044741   25.165574   23.411984   23.708631   23.014884 
##         271         272         273         274         275         276 
##   22.189555   21.042982   20.568352   20.033186   19.387419   18.455926 
##         277         278         279         280         281         282 
##   18.287178   18.495051   18.761782   19.447334   20.783027   22.372669 
##         283         284         285         286         287         288 
##   22.484112   21.686344   24.327206   27.168906   27.797959   29.990351 
##         289         290         291         292         293         294 
##   33.498904   35.087891   37.007005   41.237433   44.101312   47.026663 
##         295         296         297         298         299         300 
##   48.021072   51.147866   53.097315   53.938108   56.411862   56.659320 
##         301         302         303         304         305         306 
##   54.724842   58.073152   58.825816   59.968876   61.455170   61.640462 
##         307         308         309         310         311         312 
##   64.711876   66.451206   66.593973   68.084044   71.038678   70.921426 
##         313         314         315         316         317         318 
##   71.327805   72.110971   72.045989   70.975224   72.044652   72.177394 
##         319         320         321         322         323         324 
##   73.551364   73.643968   70.973893   70.258415   69.046000   66.419296 
##         325         326         327         328         329         330 
##   68.199519   68.209204   67.116105   68.082348   68.084043   69.117098 
##         331         332         333         334         335         336 
##   70.930573   72.628327   72.440492   74.397808   76.687281   77.908762 
##         337         338         339         340         341         342 
##   78.255835   79.549659   79.746450   82.608036   84.665366   84.613235 
##         343         344         345         346         347         348 
##   85.668583   89.019029   88.548625   90.911083   95.234619   95.529067 
##         349         350         351         352         353         354 
##   98.617079  103.126496  107.440452  108.974325  109.654973  111.642717 
##         355         356         357         358         359         360 
##  113.634614  112.380973  113.188938  114.880779  114.978642  114.078020 
##         361         362         363         364         365         366 
##  116.799854  119.926679  120.169167  122.011203  123.601249  124.140841 
##         367         368         369         370         371         372 
##  123.672930  125.443106  127.166257  126.242410  125.312605  126.188149 
##         373         374         375         376         377         378 
##  125.604067  126.321991  126.884058  124.913077  123.890644  122.346856 
##         379         380         381         382         383         384 
##  120.710299  119.545983  119.413816  119.550184  119.801193  120.345021 
##         385         386         387         388         389         390 
##  121.895636  122.921317  124.712660  128.812822  130.177271  131.082292 
##         391         392         393         394         395         396 
##  133.631237  135.026994  137.043372  138.838345  140.890756  141.694849 
##         397         398         399         400         401         402 
##  142.926190  143.684770  146.931826  150.966958  152.329341  155.246363 
##         403         404         405         406         407         408 
##  159.327050  161.484596  163.789648  167.261742  168.986773  170.590094 
##         409         410         411         412         413         414 
##  172.213226  171.576549  172.607060  174.047214  172.395068  171.698751 
##         415         416         417         418         419         420 
##  174.466895  175.282536  174.345207  176.052925  176.221119  175.487023 
##         421         422         423         424         425         426 
##  174.720434  174.218172  175.104246  173.922175  171.504938  170.313520 
##         427         428         429         430         431         432 
##  169.761755  169.045632  169.712829  169.864909  168.672060  170.146874 
##         433         434         435         436         437         438 
##  169.905134  169.924429  172.898488  173.588225  173.529363  175.748166 
##         439         440         441         442         443         444 
##  177.974098  178.170490  176.602281  179.222444  181.920123  180.591079 
##         445         446         447         448         449         450 
##  184.164461  188.301527  188.908809  194.372287  200.096362  203.789895 
##         451         452         453         454         455         456 
##  207.796207  211.633559  213.980929  216.995779  219.503874  220.230946 
##         457         458         459         460         461         462 
##  222.607166  224.645075  225.767827  226.895371  227.221856  228.800971 
##         463         464         465         466         467         468 
##  229.014771  227.877525  228.836346  230.663465  228.624926  227.427643 
##         469         470         471         472         473         474 
##  229.278886  228.079408  227.941082  228.359989  228.902517  226.196806 
##         475         476         477         478         479         480 
##  226.944540  227.491844  225.711728  224.573995  224.591752  224.943760 
##         481         482         483         484         485         486 
##  222.619397  223.584937  223.780780  222.174632  222.627556  224.436403 
##         487         488         489         490         491         492 
##  225.266965  226.696767  226.484550  226.807417  226.636800  226.522123 
##         493         494         495         496         497         498 
##  228.899481  230.714377  231.959976  235.017308  238.179584  240.006664

#chem2
model1.c2 <- dlm(x=as.vector(dataf$X2), y=as.vector(dataf$mortality), q=10)
summary(model1.c2)

## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -253.260 -104.880    1.077  113.113  255.319 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 428.37924   17.13593  24.999   <2e-16 ***
## x.t          -5.42914    2.42679  -2.237   0.0257 *  
## x.1          -4.31465    2.44531  -1.764   0.0783 .  
## x.2          -1.46195    2.59464  -0.563   0.5734    
## x.3          -0.21991    2.64517  -0.083   0.9338    
## x.4           0.38362    2.66043   0.144   0.8854    
## x.5           0.63485    2.65979   0.239   0.8115    
## x.6           0.53842    2.66269   0.202   0.8398    
## x.7           0.01805    2.64856   0.007   0.9946    
## x.8          -1.46525    2.59724  -0.564   0.5729    
## x.9          -4.42015    2.44851  -1.805   0.0717 .  
## x.10         -5.63263    2.42882  -2.319   0.0208 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 131.4 on 486 degrees of freedom
## Multiple R-squared:  0.1848, Adjusted R-squared:  0.1663 
## F-statistic: 10.01 on 11 and 486 DF,  p-value: < 2.2e-16
## 
## AIC and BIC values for the model:
##        AIC     BIC
## 1 6285.802 6340.54

checkresiduals(model1.c2)

##            1            2            3            4            5            6 
## -209.9881945 -232.7902552 -203.8387637 -218.8249413 -236.7919802 -245.3285501 
##            7            8            9           10           11           12 
## -247.6496339 -252.5770017 -248.5150022 -218.8721104 -216.5962028 -224.7032445 
##           13           14           15           16           17           18 
## -226.1531620 -253.2596834 -237.2812101 -241.5297272 -243.5194066 -219.1116587 
##           19           20           21           22           23           24 
## -211.1211117 -205.7081414 -196.2162861 -159.2298951 -149.9909165 -151.3228120 
##           25           26           27           28           29           30 
## -176.8992754 -196.2974058 -135.0947429 -131.3681316 -109.3156601  -55.3007704 
##           31           32           33           34           35           36 
##  -68.0628251  -51.1554509  -61.0561422 -109.4304743 -103.8296268  -65.4548562 
##           37           38           39           40           41           42 
##  -41.0639857  -36.0168270   -0.6665881    6.7024160  -42.0977938    3.9059892 
##           43           44           45           46           47           48 
##  -66.2371484 -132.3317164 -109.7171234 -128.9994327 -104.8394083  -90.2848172 
##           49           50           51           52           53           54 
##  -84.7852728  -75.2222641 -137.3600951 -150.7694332 -217.5418074 -207.3569552 
##           55           56           57           58           59           60 
## -196.6760405 -209.1025999 -212.9332877 -222.7682562 -209.0606193 -219.1308582 
##           61           62           63           64           65           66 
## -232.3006851 -237.1031334 -219.7000874 -213.6408281 -225.4063371 -232.2632779 
##           67           68           69           70           71           72 
## -225.7905463 -215.8335217 -221.5400319 -203.4569356 -222.9155048 -184.3527165 
##           73           74           75           76           77           78 
## -182.2072466 -189.3292379 -177.2312837 -120.0085624 -140.6924543 -153.0541883 
##           79           80           81           82           83           84 
## -160.3849560 -116.0851321 -123.7622281 -100.8758138 -111.8966819 -100.1557526 
##           85           86           87           88           89           90 
##  -88.4351106  -54.3591692 -109.9538481  -83.0803062  -48.3631864   13.1249322 
##           91           92           93           94           95           96 
##  -38.5314976  -41.2122367  -70.0534773  -77.1121293  -61.6930613  -72.6103126 
##           97           98           99          100          101          102 
## -106.2037308  -66.9957326  -56.4107316  -77.7293611 -112.3752471  -92.9044066 
##          103          104          105          106          107          108 
## -100.4830944  -97.9109463 -115.4059789 -153.7021296 -161.0053130 -174.2685116 
##          109          110          111          112          113          114 
## -173.2543350 -163.7630420 -162.4840592 -166.5019874 -176.7853327 -173.7475650 
##          115          116          117          118          119          120 
## -175.5961444 -181.7754373 -177.1465272 -163.7896798 -155.2895500 -180.4477761 
##          121          122          123          124          125          126 
## -178.6238444 -177.9953049 -173.8802283 -175.6294589 -162.8955033 -147.7765284 
##          127          128          129          130          131          132 
## -132.4170181 -121.4513808 -139.6066979 -126.6810150 -109.8157141  -88.2986065 
##          133          134          135          136          137          138 
## -101.7974110 -107.7658767  -70.6099922  -40.7625987  -70.8860154  -65.9128220 
##          139          140          141          142          143          144 
##  -19.1195187  -23.4363362  -29.5166189  -15.2017475  -52.8157406  -29.6497917 
##          145          146          147          148          149          150 
##   -8.9960912  -43.2577036  -65.4522767  -30.6181485  -36.2893383  -81.2827991 
##          151          152          153          154          155          156 
##  -68.8930741  -77.2961829 -103.3275600  -81.6342945 -102.8229794 -116.9891218 
##          157          158          159          160          161          162 
## -114.8104247 -124.8028071 -129.4521267 -144.1610648 -138.1495439 -149.9398337 
##          163          164          165          166          167          168 
## -130.5510074 -128.1570079 -125.1606299 -128.5330484 -136.6135380 -145.6452774 
##          169          170          171          172          173          174 
## -150.9655140 -142.9030694 -137.6842848 -141.9087558 -115.3253699 -113.7155985 
##          175          176          177          178          179          180 
## -118.5858481 -144.7367485 -145.0951684 -137.0914468  -90.2725671  -94.9000611 
##          181          182          183          184          185          186 
##  -74.0265468  -55.0542976  -35.8967605  -59.4743040  -60.7709897  -74.5336413 
##          187          188          189          190          191          192 
##  -75.1385480  -36.3295143   14.6311883   25.2146907   47.5251054   14.6290454 
##          193          194          195          196          197          198 
##  -17.5567563  -15.1119217    3.9236992    2.3084051   39.5489595   67.2417883 
##          199          200          201          202          203          204 
##   57.0345634   12.9846997  -26.3960795  -57.6258969  -30.2206820   -9.5802171 
##          205          206          207          208          209          210 
##  -23.0973250  -18.0912716  -23.9185305  -40.4369594  -58.7547638  -72.1909833 
##          211          212          213          214          215          216 
##  -78.8764700  -81.1610143  -77.7001943 -112.7969178 -106.8970532  -93.9671011 
##          217          218          219          220          221          222 
##  -88.1870830  -83.4120139 -104.8930309 -111.4937045 -100.6458644  -96.5513912 
##          223          224          225          226          227          228 
##  -97.0615453 -115.9869883 -104.9830019  -88.6846645  -71.9650968  -60.6287389 
##          229          230          231          232          233          234 
##  -74.7576004  -45.7871844  -50.2674391  -68.4174295  -61.4951619  -30.7637186 
##          235          236          237          238          239          240 
##  -29.9457992  -24.1556358   10.7920935   33.3635289   26.0473413   48.1944823 
##          241          242          243          244          245          246 
##   39.7719225   55.7519028  109.4013933   59.9599667   33.5319012   61.9285473 
##          247          248          249          250          251          252 
##  126.8435325  142.0800349  103.1133509   98.6120189   83.7253408   86.0213248 
##          253          254          255          256          257          258 
##   94.5471665   31.7083752   61.2925222   76.1403418   86.5115126   38.1849652 
##          259          260          261          262          263          264 
##   -0.7787071    2.8460324    1.9348245   11.7305187   16.6908787  -24.5324712 
##          265          266          267          268          269          270 
##  -20.3692699  -33.7760850  -39.0249007  -37.5406085  -50.2965691  -33.4294139 
##          271          272          273          274          275          276 
##  -30.8469309  -29.4269252  -43.6114078  -53.7674902  -49.3170568  -40.2452893 
##          277          278          279          280          281          282 
##  -46.2073789  -41.9586980  -20.9386120   -9.6749396  -24.2584989  -11.7406964 
##          283          284          285          286          287          288 
##  -10.4570347   -2.4878244  -11.3131865    8.3350048   -4.7174457   41.2231208 
##          289          290          291          292          293          294 
##   60.4031013   70.4017123   87.3142150   85.3704508   79.6488986   65.3621861 
##          295          296          297          298          299          300 
##   92.9960962  118.2774125  102.8271271  146.0609556  156.0451076  143.5081053 
##          301          302          303          304          305          306 
##  144.2842677  101.5183109   98.3015043   95.1065631  115.1688639   81.1744214 
##          307          308          309          310          311          312 
##   89.3622452  121.1108539  104.5983007   75.7719766   57.9429176   15.0771369 
##          313          314          315          316          317          318 
##   33.0006810   28.2202623   18.5818998   23.4253967   30.2798225   17.8529691 
##          319          320          321          322          323          324 
##    5.7146168    5.4140443    6.5924938   25.7482481   18.7971723   14.2832713 
##          325          326          327          328          329          330 
##    0.2191319    4.0867184   -8.5816990    2.4535726   16.4435134   19.8384762 
##          331          332          333          334          335          336 
##   43.7384733   48.4447210   32.2441071   24.5944529   13.3566701   21.0224254 
##          337          338          339          340          341          342 
##   47.3127094   59.2938845   50.5669672   75.1268315  126.3299729  108.4279664 
##          343          344          345          346          347          348 
##  118.0432086  104.5002027  120.2943425  147.4597031  162.7031515  143.5422830 
##          349          350          351          352          353          354 
##  141.9487650  148.9174182  169.5321362  161.2489978  174.5101467  177.2883739 
##          355          356          357          358          359          360 
##  206.3586791  200.2778320  159.0934339  128.2154117  102.6452944  107.4465810 
##          361          362          363          364          365          366 
##  118.2481728  105.3217904  110.4897165  113.5588200  114.1830490   84.0868657 
##          367          368          369          370          371          372 
##   56.8388653   62.9070746   57.1507391   51.7943208   51.5015250   51.2297637 
##          373          374          375          376          377          378 
##   47.5201374   49.2373302   52.9696740   38.2211561   41.3586587   48.2215940 
##          379          380          381          382          383          384 
##   65.0199311   61.5127160   61.1150081   61.1170146   58.0945208   55.9213795 
##          385          386          387          388          389          390 
##   66.0542639   66.2044573   72.5230389   93.8007157  114.2714706  110.5336753 
##          391          392          393          394          395          396 
##  117.4262777  124.7738515  121.0272461  145.8713249  154.6894194  149.2499998 
##          397          398          399          400          401          402 
##  184.4926822  208.2010612  197.3065684  158.8696850  155.6021020  174.6119107 
##          403          404          405          406          407          408 
##  185.4110696  200.8488758  213.8629441  228.0696740  219.6385379  187.5155099 
##          409          410          411          412          413          414 
##  156.7994331  133.3906608  146.3867163  156.8293565  154.5456048  151.3156790 
##          415          416          417          418          419          420 
##  150.6667354  121.5560465  116.4767238  122.3440498  114.0188320  113.7085002 
##          421          422          423          424          425          426 
##  121.0377816  117.1614444  102.7018432   96.8569025  101.8718132  103.8807619 
##          427          428          429          430          431          432 
##  101.5864983   96.2556889  111.1992996  116.9678250  111.7768712  106.6165714 
##          433          434          435          436          437          438 
##  101.0621174  110.1123166  117.7941915  113.6625563  110.7917528  117.0654527 
##          439          440          441          442          443          444 
##  146.9226317  161.4561567  146.6701930  158.2138222  158.0312240  157.7489617 
##          445          446          447          448          449          450 
##  176.3441544  170.4006661  186.5676528  186.4470579  225.4475520  215.1551294 
##          451          452          453          454          455          456 
##  227.2281865  254.0061166  235.1189857  235.8351428  230.3778223  216.7961277 
##          457          458          459          460          461          462 
##  222.2352891  200.4408875  255.3186526  248.4187754  248.2821427  231.7141565 
##          463          464          465          466          467          468 
##  217.4050744  192.7229436  182.1615573  179.7366678  173.6628765  187.5911162 
##          469          470          471          472          473          474 
##  207.5326987  170.2405038  161.8518850  169.2445786  157.2040551  142.0110209 
##          475          476          477          478          479          480 
##  140.0525141  145.7508405  154.4634711  154.5290637  155.8091198  148.9884873 
##          481          482          483          484          485          486 
##  146.9323980  152.7195437  144.5051349  147.4576015  155.3846221  154.8711854 
##          487          488          489          490          491          492 
##  161.5463399  160.8844333  170.6959262  188.3453217  196.0092555  181.7248785 
##          493          494          495          496          497          498 
##  187.8677640  183.2258036  177.4403401  192.6258538  217.9486337  220.7019342

#particle space
model1.part <- dlm(x=as.vector(dataf$X3), y=as.vector(dataf$mortality), q=10)
summary(model1.part)

## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -263.86  -63.66   11.95   76.78  181.16 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  719.374     17.893  40.203  < 2e-16 ***
## x.t          -20.379      4.777  -4.266 2.39e-05 ***
## x.1          -18.325      4.810  -3.810 0.000157 ***
## x.2          -13.553      4.885  -2.774 0.005747 ** 
## x.3          -15.384      4.864  -3.163 0.001661 ** 
## x.4          -13.032      4.903  -2.658 0.008126 ** 
## x.5          -11.305      4.923  -2.297 0.022065 *  
## x.6          -12.551      4.902  -2.561 0.010751 *  
## x.7          -13.555      4.868  -2.784 0.005571 ** 
## x.8          -12.030      4.885  -2.462 0.014148 *  
## x.9          -14.863      4.803  -3.094 0.002086 ** 
## x.10         -16.192      4.774  -3.391 0.000752 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 93.12 on 486 degrees of freedom
## Multiple R-squared:  0.5905, Adjusted R-squared:  0.5812 
## F-statistic: 63.71 on 11 and 486 DF,  p-value: < 2.2e-16
## 
## AIC and BIC values for the model:
##        AIC      BIC
## 1 5942.914 5997.652

checkresiduals(model1.part)

##            1            2            3            4            5            6 
## -251.1089667 -263.8642991 -211.3376274 -215.0273046 -214.8033521 -227.6847095 
##            7            8            9           10           11           12 
## -253.7116178 -244.9425670 -232.6368604 -149.7716057 -136.4534390 -115.1587036 
##           13           14           15           16           17           18 
##  -91.0679420  -98.9333576  -52.2685786  -57.3521835  -41.4061256    4.1375118 
##           19           20           21           22           23           24 
##   19.8090726   17.2487898   10.7132418   20.7386341   11.5104162   -0.3629868 
##           25           26           27           28           29           30 
##  -17.4537538  -80.5947698  -54.0560633  -63.9811758  -82.3041537  -77.8017437 
##           31           32           33           34           35           36 
## -104.8487502 -133.1787262 -152.1382413 -186.1357856 -199.1821690 -199.2515678 
##           37           38           39           40           41           42 
## -172.5672209 -192.8760742 -185.7342358 -170.8506565 -172.0418286 -130.0022939 
##           43           44           45           46           47           48 
## -160.9817327 -169.1660982 -134.6361060 -132.8234062 -113.9580230 -108.3776224 
##           49           50           51           52           53           54 
##  -80.5774312  -78.7206649 -117.7860029 -146.2419839 -180.0809135 -169.0826853 
##           55           56           57           58           59           60 
## -158.5799195 -162.1756560 -174.8357723 -200.1275696 -202.7068437 -187.6806515 
##           61           62           63           64           65           66 
## -189.0272987 -187.3150468 -133.4839948 -126.5817556 -113.0672132  -94.5010224 
##           67           68           69           70           71           72 
##  -29.4487327  -14.8737105   -7.8350884   41.0424805    4.2661380   58.3637597 
##           73           74           75           76           77           78 
##   64.3539541   33.0953219   13.6798878   76.0542543   81.5352408   20.4929508 
##           79           80           81           82           83           84 
##   -8.2449011   45.5236255   37.0291627   20.8761838   -9.6107655  -19.6834660 
##           85           86           87           88           89           90 
##  -34.6326556   10.4894013  -62.6845095  -82.7064040  -18.5618475   45.7878178 
##           91           92           93           94           95           96 
##   19.9954096   -9.4886964   -0.3938329  -18.6952430   25.2323463   65.7580293 
##           97           98           99          100          101          102 
##   69.5022540  120.6194892  139.3414700  105.0023557   70.6084157   56.6231570 
##          103          104          105          106          107          108 
##   52.3825363   30.8449790   79.3735003   31.1895149   30.1536093   -9.1383300 
##          109          110          111          112          113          114 
##  -54.2827263  -59.4739182  -79.3666718  -83.9956879 -112.0359826  -66.1187264 
##          115          116          117          118          119          120 
##  -14.0625364  -44.7614407  -45.8379287  -32.0370769  -11.5535344  -21.4827147 
##          121          122          123          124          125          126 
##  -26.8586067  -20.9645887  -25.5956123  -15.1180799  -16.3262990  -62.4427306 
##          127          128          129          130          131          132 
##  -54.7425499  -50.2796597 -107.2471128  -87.3470544  -84.3895856  -66.5035170 
##          133          134          135          136          137          138 
##  -68.5953079  -79.8943940  -66.8617192  -66.4588217  -82.1458531  -95.0828805 
##          139          140          141          142          143          144 
##  -72.8562869  -51.9277953  -85.3986234  -69.0532668 -105.1744423 -106.8055732 
##          145          146          147          148          149          150 
## -109.0196258 -129.0687041 -137.6923437 -132.0309133 -144.0765882 -167.3905097 
##          151          152          153          154          155          156 
## -176.8168714 -178.7437586 -196.4651024 -188.9266306 -179.8145321 -176.5510401 
##          157          158          159          160          161          162 
## -162.6570608 -165.4638249 -147.4936832 -143.8837648 -111.1786168 -111.1088852 
##          163          164          165          166          167          168 
##  -77.3913934  -88.6510168  -46.1548513   -5.7155419   21.4259253   18.4720485 
##          169          170          171          172          173          174 
##   46.9544197   70.2801699   85.2337766   86.8083079  147.9217138  133.9549952 
##          175          176          177          178          179          180 
##  138.7918447  111.9577168   85.5521967   79.7178000  101.1523246   94.4823123 
##          181          182          183          184          185          186 
##   92.9327808   94.4086063   98.5689709   47.4451812   81.8703356   55.5182195 
##          187          188          189          190          191          192 
##   51.1634638   72.1619682   56.0498828   54.2471370   56.0509817   20.3452644 
##          193          194          195          196          197          198 
##  -34.5120172  -47.0351881  -40.9557516  -77.3627167  -45.8972246  -45.7558593 
##          199          200          201          202          203          204 
##  -47.5812747  -68.2810584  -84.2119174 -105.5131036  -79.6324299  -75.6629441 
##          205          206          207          208          209          210 
##  -80.1759450  -83.5421214  -99.6304185 -105.7856106 -107.0071231 -121.1015667 
##          211          212          213          214          215          216 
## -121.4813366 -131.5847656 -116.8195330 -138.6834553 -122.7439747 -114.7840556 
##          217          218          219          220          221          222 
##  -70.0790799  -42.8053689  -67.7609877  -67.4961215  -52.2403170  -55.1679440 
##          223          224          225          226          227          228 
##  -45.1405851  -56.2154170  -58.9166570  -45.8991658  -20.3400656  -23.2069563 
##          229          230          231          232          233          234 
##  -45.8886078    4.3003009   -1.5094107  -25.4398991  -24.1738000    1.4292449 
##          235          236          237          238          239          240 
##   -3.6007735   -0.7099392    5.5369084   12.6778544   13.9833885   51.0955964 
##          241          242          243          244          245          246 
##   20.2064211   30.6665311   68.9243769   43.6042486   28.3939280   46.2850729 
##          247          248          249          250          251          252 
##   85.7456518  128.4149746   91.0495468   87.2960178   62.8137289   50.5170650 
##          253          254          255          256          257          258 
##  105.0391315   63.5661319   50.7750096   58.2314517   48.6709913   -0.6306388 
##          259          260          261          262          263          264 
##  -44.2775594  -19.8568842  -39.0217869  -32.0749845   -9.7917565  -77.3920671 
##          265          266          267          268          269          270 
##  -69.5620657  -57.0008295  -66.5532576  -66.1498651  -69.8245393  -35.8279263 
##          271          272          273          274          275          276 
##  -40.3044048  -17.0651123  -26.9917946  -23.5635412  -10.0636022   20.3763529 
##          277          278          279          280          281          282 
##   14.2154963   15.2757634   32.4496350   68.6424176   48.3351503   66.2149997 
##          283          284          285          286          287          288 
##   67.8144337   94.9153034   67.1190773   67.5148006   28.5344339   54.4460703 
##          289          290          291          292          293          294 
##   80.5839916   88.2135506   76.5507755   71.7357421   88.6309813   78.9172611 
##          295          296          297          298          299          300 
##   92.1469138  114.1185760   99.3557959  136.2806074  134.4300241  115.2513218 
##          301          302          303          304          305          306 
##  131.5223330   96.0873099   78.4914226   52.9151048   60.4531758    7.2239323 
##          307          308          309          310          311          312 
##   13.5840492   52.7307318   13.7788790  -12.0655241  -30.8161856  -82.1002561 
##          313          314          315          316          317          318 
##  -78.7152171  -49.0088845  -52.2007204  -36.0386683  -33.7582506  -59.5621879 
##          319          320          321          322          323          324 
##  -90.1717980  -97.3018467  -86.4103588  -42.2306336  -26.9382216   -5.6298502 
##          325          326          327          328          329          330 
##  -30.7242754   -9.6167732  -36.1013108  -26.7199932  -19.1090085  -19.9446432 
##          331          332          333          334          335          336 
##   23.5595964   37.1326038    0.8339304  -11.1404103  -34.3262918  -34.4480047 
##          337          338          339          340          341          342 
##  -27.2572520  -12.2833017  -32.6140897  -12.0625789   19.1405222   -5.2160942 
##          343          344          345          346          347          348 
##   10.8466480   28.8524848   38.9666023   66.0000652   88.7732445   79.0310451 
##          349          350          351          352          353          354 
##   76.5496205   78.3436854   81.9689738  101.6010858  102.4395595  101.5798373 
##          355          356          357          358          359          360 
##  125.3032084  141.5681456   94.4269073   73.9757388   68.6319384   56.1809460 
##          361          362          363          364          365          366 
##   59.9898736   50.5057063   48.0495668   48.8891913   69.9738419   47.3123086 
##          367          368          369          370          371          372 
##   -8.1819699   -7.4455645    6.0202023    1.9460809   44.6129884   48.4286213 
##          373          374          375          376          377          378 
##   48.2185083   70.1692618   74.2789935   57.6931886   73.1912686   98.8887121 
##          379          380          381          382          383          384 
##  130.6446412  138.7502892  159.2877151  132.4732470  137.3337449  137.0115923 
##          385          386          387          388          389          390 
##  148.6866239  122.6424508  110.2641468  110.5219435  108.3533186  109.2557456 
##          391          392          393          394          395          396 
##  102.0973642   97.8299322   79.4450669   97.0745257  112.6289640   85.2823962 
##          397          398          399          400          401          402 
##  145.1137810  181.1614317  179.1809027  140.0051597  115.0153045   95.1519879 
##          403          404          405          406          407          408 
##   72.1972391   83.1602345   91.5874710   87.5500718   71.7498232   36.0406079 
##          409          410          411          412          413          414 
##   10.2800935  -26.0484580  -13.2756629   -0.2504821   11.7659814   18.7049880 
##          415          416          417          418          419          420 
##   30.2066442   11.5991093   13.8227506   31.5288295   34.3692422   54.4355516 
##          421          422          423          424          425          426 
##   75.9294906   79.6803845   75.2903501   83.9984485   87.9943993   97.5453124 
##          427          428          429          430          431          432 
##   98.4563732   88.4590243  118.4382533  132.7767398  116.1473552  115.1013761 
##          433          434          435          436          437          438 
##   95.2310773  105.2945010  101.1802145  116.3874672   96.7917050   90.2404692 
##          439          440          441          442          443          444 
##   92.5984178   97.2948463   80.2578167   94.9717118   84.9770546   91.0345053 
##          445          446          447          448          449          450 
##   97.8274876   91.4357111   81.0725611   72.4204580   94.2029100   95.2806407 
##          451          452          453          454          455          456 
##   84.7579309  115.7535580  100.4549598  100.1156190   88.0725024   71.3632186 
##          457          458          459          460          461          462 
##   65.3479171   47.0915150  106.2141420   89.7210549   76.8629899   72.9213085 
##          463          464          465          466          467          468 
##   61.5234540   37.2156448   28.6813417   31.4089815   25.8302962   32.0941413 
##          469          470          471          472          473          474 
##   53.2624263   13.0973958    3.3791553    7.3617172   12.1354495   -5.8010303 
##          475          476          477          478          479          480 
##  -13.1170560   -5.0578490    8.2328925   11.2345539   22.7323273   21.8574786 
##          481          482          483          484          485          486 
##   24.9499766   41.0606609   56.9257360   63.9562460   78.7920862   86.3545800 
##          487          488          489          490          491          492 
##   97.8657151   94.4608478  110.3441804  129.7288846  130.5962367  121.9210835 
##          493          494          495          496          497          498 
##  129.2408448  120.4898468  103.1299711  103.0238337  106.2878889   93.5768932

Thus multicolinearity is low in model1

finiteDLMauto(x= as.vector(dataf$temp)+as.vector(dataf$X1)+as.vector(dataf$X2)+as.vector(dataf$X3), y= as.vector(dataf$mortality),q.min = 1,q.max =10, k.order =1, model.type ="poly", error.type="AIC", trace= TRUE)

##     q - k      MASE      AIC      BIC    GMRAE   MBRAE R.Adj.Sq Ljung-Box
## 10 10 - 1  79.94845 5990.046 6006.888 54.60191 1.00430  0.53139         0
## 9   9 - 1  81.50397 6024.690 6041.541 52.59946 1.01309  0.51162         0
## 8   8 - 1  83.22529 6057.642 6074.501 52.63208 0.99098  0.49282         0
## 7   7 - 1  85.09156 6089.884 6106.750 55.00458 0.97363  0.47412         0
## 6   6 - 1  87.18937 6122.132 6139.006 55.69802 0.97473  0.45481         0
## 5   5 - 1  89.46396 6154.815 6171.697 60.47708 1.00926  0.43439         0
## 4   4 - 1  91.57364 6189.154 6206.045 58.50551 0.96670  0.41136         0
## 3   3 - 1  93.80353 6224.082 6240.980 63.56677 0.99422  0.38678         0
## 2   2 - 1  97.79697 6270.098 6287.005 67.40276 0.99672  0.34713         0
## 1   1 - 1 102.11181 6320.577 6337.491 71.45618 0.99780  0.29896         0

#Since partcle space and chem1 have highest correlation
finiteDLMauto(x= as.vector(dataf$X1), y= as.vector(dataf$mortality),q.min = 1,q.max =10, k.order =1, model.type ="poly", error.type="AIC", trace= TRUE)

##     q - k     MASE      AIC      BIC    GMRAE   MBRAE R.Adj.Sq Ljung-Box
## 10 10 - 1 124.2686 6367.430 6384.272 94.96389 0.99841  0.00020         0
## 9   9 - 1 124.5120 6382.053 6398.904 95.40471 0.99967  0.00050         0
## 8   8 - 1 124.7485 6396.690 6413.548 95.19124 0.99577  0.00079         0
## 7   7 - 1 124.9915 6411.377 6428.243 95.11694 1.06490  0.00098         0
## 6   6 - 1 125.2432 6426.110 6442.984 95.88065 1.00492  0.00109         0
## 5   5 - 1 125.4861 6440.861 6457.743 95.92584 1.00078  0.00117         0
## 4   4 - 1 125.7361 6455.690 6472.580 95.88237 0.99756  0.00110         0
## 3   3 - 1 125.9870 6470.544 6487.442 96.13404 1.01179  0.00099         0
## 2   2 - 1 126.2369 6485.400 6502.306 96.13934 0.99961  0.00089         0
## 1   1 - 1 126.5063 6500.374 6517.288 95.91392 0.99990  0.00056         0

finiteDLMauto(x= as.vector(dataf$X3), y= as.vector(dataf$mortality),q.min = 1,q.max =10, k.order =1, model.type ="poly", error.type="AIC", trace= TRUE)

##     q - k     MASE      AIC      BIC    GMRAE   MBRAE R.Adj.Sq Ljung-Box
## 10 10 - 1 76.09484 5927.164 5944.007 53.25451 1.01353  0.58698         0
## 9   9 - 1 77.07088 5956.737 5973.587 52.51940 1.01040  0.57380         0
## 8   8 - 1 78.30526 5987.006 6003.864 50.91376 1.01841  0.55964         0
## 7   7 - 1 80.12804 6016.111 6032.978 57.26881 0.99311  0.54612         0
## 6   6 - 1 81.83584 6048.052 6064.927 58.29537 1.00044  0.52961         0
## 5   5 - 1 83.61434 6081.504 6098.386 59.21895 0.98981  0.51110         0
## 4   4 - 1 85.63026 6117.644 6134.534 60.55434 1.01811  0.48922         0
## 3   3 - 1 88.48603 6159.437 6176.336 63.54922 1.01534  0.46046         0
## 2   2 - 1 92.83973 6213.661 6230.567 66.36473 0.98818  0.41604         0
## 1   1 - 1 98.26132 6274.512 6291.427 72.52020 0.97310  0.35984         0

From the VIF values, it is obvious that the estimates of the finite DLM coefficients are suffering from the multicollinearity. To deal with this issue, we can use the restricted leastsquares method to find parameter estimates. In this approach, some restrictions are placed on the model parameters to reduce the variances of the estimators. In the context of DLMs,we translate the pattern of time effects into the restrictions on parameters. In the nextsection, we will use polynomial curves to restrict lag weights. According to the significance tests of model coefficients obtained from the summary, all lag weights of predictors are not statistically significant at 5% level. Following this inference, the adjusted R2 is reported to be about 8% which is very low. F-test of the overall significance of the model reports the model is not statistically significant at 5% level (p-value > 0.05). Therefore, we can conclude that the model is not a good fit to the data. VIF values are reported > 10 so the effect of multicollinearity is high. The residualcheck function was created to apply a diagnostic check in a dynamic way. It displays residual analysis plots as well as performs the Breusch-Godfrey test of serial correlation and the Shapiro-Wilk normality test of the residuals. From looking at the diagnostic check plots in Figure 14, we can observe that the residuals are not randomly distributed and clearly have a trend. ACF plot shows that there is serial correlation in the residuals, the Beusch-Godfrey test supports that at 5% level of significance. The histogram and Shapiro-Wilk (p-value < 0.05) test report that the normality of the residuals does not hold. Overall, we can conclude that the finite DLM of lag 10 is not appropriate for further analysis.

##Polynomial Distributed Lags model To reduce the harmful effect of multicollinearity, we will impose a polynomial shape on thelag distribution. Suppose, lag weights follow a smooth polynomial pattern. Because this idea first introduced by Shirley Almon, the resulting model is called Almon Distributed LagModel or Polynomial Distributed Lag model.

To deal with multicollinearity problem in the finite DLM,we will attempt to use polynomial curves to restrict lag weights. We specify the optimal lag length using a function that fits finite DLMs for a range of lag lengths from 1 to 10 and orders the fitted models according to their AIC values.

Model2.AllIndexes <- polyDlm(x= as.vector(dataf$temp)+as.vector(dataf$X1)+as.vector(dataf$X2)+as.vector(dataf$X3), y= as.vector(dataf$mortality),q=10,k=1, show.beta = T)

## Estimates and t-tests for beta coefficients:
##         Estimate Std. Error t value  P(>|t|)
## beta.0     -1.21     0.1630   -7.43 4.84e-13
## beta.1     -1.21     0.1340   -9.03 3.98e-18
## beta.2     -1.20     0.1060  -11.40 7.20e-27
## beta.3     -1.20     0.0797  -15.10 2.04e-42
## beta.4     -1.20     0.0590  -20.30 6.02e-67
## beta.5     -1.20     0.0503  -23.80 1.37e-83
## beta.6     -1.19     0.0591  -20.20 2.98e-66
## beta.7     -1.19     0.0799  -14.90 1.34e-41
## beta.8     -1.19     0.1060  -11.20 3.94e-26
## beta.9     -1.18     0.1340   -8.84 1.72e-17
## beta.10    -1.18     0.1630   -7.24 1.71e-12

#Model2.AllIndexes
summary(Model2.AllIndexes)

## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -246.207  -63.823   -6.922   75.755  194.858 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.545e+03  9.620e+01  26.454  < 2e-16 ***
## z.t0        -1.210e+00  1.629e-01  -7.432 4.75e-13 ***
## z.t1         2.875e-03  3.101e-02   0.093    0.926    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 98.51 on 495 degrees of freedom
## Multiple R-squared:  0.5333, Adjusted R-squared:  0.5314 
## F-statistic: 282.8 on 2 and 495 DF,  p-value: < 2.2e-16

vif(Model2.AllIndexes$model)

##     z.t0     z.t1 
## 10.48814 10.48814

residualcheck(Model2.AllIndexes$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.98345, p-value = 1.913e-05

checkresiduals(Model2.AllIndexes$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 479.75, df = 10, p-value < 2.2e-16

vif(Model2.AllIndexes$model)>10

## z.t0 z.t1 
## TRUE TRUE

pmodel1 = polyDlm(x = as.vector(dataf$temp) , y = as.vector(dataf$mortality),q=2,k = 2 , show.beta = TRUE)

## Estimates and t-tests for beta coefficients:
##        Estimate Std. Error t value P(>|t|)
## beta.0    -2.42      0.995   -2.43 0.01560
## beta.1    -2.65      0.988   -2.69 0.00748
## beta.2    -2.81      0.993   -2.83 0.00483

model2.c1 =polyDlm(x=as.vector(dataf$X1),y= as.vector(dataf$mortality),q=10,k=1, show.beta = T)

## Estimates and t-tests for beta coefficients:
##         Estimate Std. Error t value P(>|t|)
## beta.0    0.1600     0.3110   0.516   0.606
## beta.1    0.1530     0.2540   0.603   0.547
## beta.2    0.1460     0.1990   0.735   0.463
## beta.3    0.1390     0.1470   0.944   0.346
## beta.4    0.1320     0.1050   1.260   0.209
## beta.5    0.1240     0.0863   1.440   0.150
## beta.6    0.1170     0.1050   1.120   0.265
## beta.7    0.1100     0.1480   0.745   0.456
## beta.8    0.1030     0.1990   0.516   0.606
## beta.9    0.0958     0.2550   0.376   0.707
## beta.10   0.0886     0.3110   0.285   0.776

model2.p =polyDlm(x=as.vector(dataf$X3),y= as.vector(dataf$mortality),q=10,k=1, show.beta = T)

## Estimates and t-tests for beta coefficients:
##         Estimate Std. Error t value  P(>|t|)
## beta.0     -16.6      2.190   -7.57 1.84e-13
## beta.1     -16.2      1.780   -9.09 2.49e-18
## beta.2     -15.8      1.380  -11.40 5.91e-27
## beta.3     -15.4      1.010  -15.30 1.82e-43
## beta.4     -15.0      0.691  -21.70 8.60e-74
## beta.5     -14.6      0.550  -26.60 6.32e-97
## beta.6     -14.2      0.700  -20.30 5.96e-67
## beta.7     -13.8      1.020  -13.60 8.79e-36
## beta.8     -13.4      1.400   -9.62 3.61e-20
## beta.9     -13.0      1.800   -7.26 1.51e-12
## beta.10    -12.7      2.210   -5.74 1.69e-08

summary(model2.c1, diagnostics=T)

## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -246.30 -136.53   10.72  122.59  241.65 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)  
## (Intercept) 157.776535  70.798238   2.229   0.0263 *
## z.t0          0.160321   0.310712   0.516   0.6061  
## z.t1         -0.007169   0.059776  -0.120   0.9046  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 143.9 on 495 degrees of freedom
## Multiple R-squared:  0.004219,   Adjusted R-squared:  0.0001951 
## F-statistic: 1.049 on 2 and 495 DF,  p-value: 0.3512

#p value is large adjusted r square is lower
summary(model2.p, diagnostics=T)

## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -264.69  -63.04   11.17   78.56  177.80 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 718.3915    17.7578  40.455  < 2e-16 ***
## z.t0        -16.5867     2.1901  -7.573  1.8e-13 ***
## z.t1          0.3932     0.4256   0.924    0.356    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 92.48 on 495 degrees of freedom
## Multiple R-squared:  0.5886, Adjusted R-squared:  0.587 
## F-statistic: 354.2 on 2 and 495 DF,  p-value: < 2.2e-16

# pvalue is lowest and adjusted r square is acceptable low
vif(model2.c1$model)>10

## z.t0 z.t1 
## TRUE TRUE

vif(model2.p$model)>10

## z.t0 z.t1 
## TRUE TRUE

Like in the finite DLM fitting, the lowest AIC and BIC measures in the given range are for q = 10. We set the order of polynomial to 1 as it minimises the information criteria. According to the model summary, all lag weights are significant at 5% level except lag 6 (p-value > 0.05). The adjusted R2 = 17.5% is slightly better than the finite DLM but still very low. The overall significance test reports the model is statistically significant at 5% level. VIF values are > 10 and suggest there is still multicollinearity effect on this model. Diagnostic checking in Figure 15 shows that the residuals are not randomly spread. There are a lot of highly significant lags in the ACF plot, so there is autocorrelation present in the residuals. That is also supported by Beusch-Godfrey test at 5% level of significance. The normality of the residuals is also violated, as observed from the histogram and Shapiro-Wilk normality test report (p-value < 0.05). We conclude that the polynomial DLM of lag 10 is not appropriate for further analysis.

3 KOYCK Transformation DL modeling

One way to deal with this infinite DLM is to use Koyck transformation

Model3.AllIndexes <- koyckDlm(x=as.vector(dataf$temp)+as.vector(dataf$X1)+as.vector(dataf$X2)+as.vector(dataf$X3), y= as.vector(dataf$mortality))
Model3.AllIndexes

## $model
## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Coefficients:
## (Intercept)          Y.1          X.t  
##    1.00e+00     1.00e+00     4.33e-16  
## 
## 
## $geometric.coefficients
##                                alpha         beta phi
## Geometric coefficients:  -2.2518e+15 4.329597e-16   1
## 
## $call
## koyckDlm.default(x = as.vector(dataf$temp) + as.vector(dataf$X1) + 
##     as.vector(dataf$X2) + as.vector(dataf$X3), y = as.vector(dataf$mortality))
## 
## attr(,"class")
## [1] "koyckDlm" "dLagM"

summary(Model3.AllIndexes)

## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -5.684e-14  0.000e+00  5.684e-14  8.527e-14  1.403e-13 
## 
## Coefficients:
##              Estimate Std. Error   t value Pr(>|t|)    
## (Intercept) 1.000e+00  3.356e-13 2.979e+12   <2e-16 ***
## Y.1         1.000e+00  7.533e-17 1.328e+16   <2e-16 ***
## X.t         4.330e-16  1.826e-15 2.370e-01    0.813    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.958e-14 on 504 degrees of freedom
## Multiple R-Squared:     1,   Adjusted R-squared:     1 
## Wald test: 1.122e+33 on 2 and 504 DF,  p-value: < 2.2e-16 
## 
## Diagnostic tests:
## NULL
## 
##                                alpha         beta phi
## Geometric coefficients:  -2.2518e+15 4.329597e-16   1

vif(Model3.AllIndexes$model)

##      Y.1      X.t 
## 12.72753 12.72753

residualcheck(Model3.AllIndexes$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.93121, p-value = 1.64e-14

checkresiduals(Model3.AllIndexes$model)

vif(Model3.AllIndexes$model)>10

##  Y.1  X.t 
## TRUE TRUE

#residuals presents strange vague plot hence we can say multicolinearity exist.


model3.c1 =koyckDlm(x=as.vector(dataf$X1),y= as.vector(dataf$mortality))
model3.c1

## $model
## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Coefficients:
## (Intercept)          Y.1          X.t  
##   1.000e+00    1.000e+00   -3.021e-16  
## 
## 
## $geometric.coefficients
##                               alpha          beta phi
## Geometric coefficients:  4.5036e+15 -3.021027e-16   1
## 
## $call
## koyckDlm.default(x = as.vector(dataf$X1), y = as.vector(dataf$mortality))
## 
## attr(,"class")
## [1] "koyckDlm" "dLagM"

model3.p =koyckDlm(x=as.vector(dataf$X3),y= as.vector(dataf$mortality))
model3.p

## $model
## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Coefficients:
## (Intercept)          Y.1          X.t  
##   1.000e+00    1.000e+00   -1.874e-14  
## 
## 
## $geometric.coefficients
##                               alpha          beta phi
## Geometric coefficients:  1.5012e+15 -1.874194e-14   1
## 
## $call
## koyckDlm.default(x = as.vector(dataf$X3), y = as.vector(dataf$mortality))
## 
## attr(,"class")
## [1] "koyckDlm" "dLagM"

summary(model3.c1, diagnostics=T)

## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -8.837e-14 -5.684e-14 -5.684e-14 -2.842e-14  5.684e-14 
## 
## Coefficients:
##               Estimate Std. Error    t value Pr(>|t|)    
## (Intercept)  1.000e+00  3.176e-14  3.148e+13   <2e-16 ***
## Y.1          1.000e+00  1.590e-17  6.291e+16   <2e-16 ***
## X.t         -3.021e-16  4.285e-16 -7.050e-01    0.481    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.213e-14 on 504 degrees of freedom
## Multiple R-Squared:     1,   Adjusted R-squared:     1 
## Wald test: 1.998e+33 on 2 and 504 DF,  p-value: < 2.2e-16 
## 
## Diagnostic tests:
##                  df1 df2    statistic      p-value
## Weak instruments   1 504 284.38450569 6.381405e-51
## Wu-Hausman         1 503   0.03198651 8.581293e-01
## 
##                               alpha          beta phi
## Geometric coefficients:  4.5036e+15 -3.021027e-16   1

#p value is smaller adjusted r square is higher
summary(model3.p, diagnostics=T)

## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -1.545e-13 -5.684e-14  5.684e-14  1.137e-13  2.274e-13 
## 
## Coefficients:
##               Estimate Std. Error    t value Pr(>|t|)    
## (Intercept)  1.000e+00  1.234e-13  8.106e+12   <2e-16 ***
## Y.1          1.000e+00  1.223e-16  8.179e+15   <2e-16 ***
## X.t         -1.874e-14  3.275e-14 -5.720e-01    0.567    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.116e-13 on 504 degrees of freedom
## Multiple R-Squared:     1,   Adjusted R-squared:     1 
## Wald test: 4.363e+32 on 2 and 504 DF,  p-value: < 2.2e-16 
## 
## Diagnostic tests:
##                  df1 df2 statistic      p-value
## Weak instruments   1 504 14.335197 0.0001714073
## Wu-Hausman         1 503  1.246699 0.2647170467
## 
##                               alpha          beta phi
## Geometric coefficients:  1.5012e+15 -1.874194e-14   1

# pvalue is lowest and adjusted r square is high
vif(model3.c1$model)>10

##   Y.1   X.t 
## FALSE FALSE

#chemical 1 has no multicolinearity since values lie below lag value 10
vif(model3.p$model)>10

##  Y.1  X.t 
## TRUE TRUE

#particle size contains multicolinearity

#changed attribute of model to obtain aic value
attr(model3.c1$model, "class") ="lm"
AIC(model3.c1$model)

## [1] -29569.36

The AIC measure is reported to be -29569.36 which is lower than the finite and polynomial DLMs. From the residual analysis in Figure 16, it is observed that the errors are spread randomly as desired. There are no significant lags in the ACF plot which suggests there is no serial correlation in the residuals. However, the error terms are not perfectly normal. The histogram of the residuals seems left-skewed, and the Shapiro-Wilk normality test suggests not normal residuals (p-value < 0.05).

4 Autoregressive DLM

Autoregressive DLMs are useful when we cannot find suitable solutions with neither polynomial nor Kyock DLMs. Actually, the autoregressive DLM is a flexible andparsimonious infinite DLM.

We attempt to fit autoregressive DLMs in order to find a more suitable model than the Koyck model.

For specifying the parameters of ARDL(p,q), we use a loop that fits autoregressive DLMs for a range of lag lengths and orders of the AR process and fit the models that minimise the infrormation criteria. Based on the information criteria, we select the following models: ARDL(1,5) ARDL(2,5) ARDL(3,5) ARDL(4,5) ARDL(5,5)

#for(i in 1:5){
#  for (j in 1:5) {
 #   model4 = ardlDlm(formula = mortality ~ temp +x3, data = data.frame(dataf), p= i, q=j)
  #  cat("p= ", i, "q= ", j,"AIC =", AIC(model4$model), "BIC =", BIC(model4$model),"Mase=", MASE(model4)$MASE, "\n")
    
 # }
#}


for(i in 1:5){
  for (j in 1:5) {
    model4.allIndexes = ardlDlm(formula = mortality ~ temp + X3, data = data.frame(dataf), p= i, q=j)
    cat("p= ", i, "q= ", j,"AIC =", AIC(model4.allIndexes$model), "BIC =", BIC(model4.allIndexes$model),"Mase=", MASE(model4.allIndexes)$MASE,"\n")
    
  }
}

## p=  1 q=  1 AIC = -28835.99 BIC = -28806.39 Mase= 1.2438e-14 
## p=  1 q=  2 AIC = -29003.22 BIC = -28973.63 Mase= 1.794261e-14 
## p=  1 q=  3 AIC = -28412.27 BIC = -28382.69 Mase= 1.975098e-14 
## p=  1 q=  4 AIC = -29522.48 BIC = -29492.92 Mase= 5.073543e-15 
## p=  1 q=  5 AIC = -28709.65 BIC = -28680.1 Mase= 1.57453e-14 
## p=  2 q=  1 AIC = -29487.23 BIC = -29449.19 Mase= 1.606444e-14 
## p=  2 q=  2 AIC = -29487.23 BIC = -29449.19 Mase= 1.606444e-14 
## p=  2 q=  3 AIC = -30338.17 BIC = -30300.15 Mase= 2.474588e-15 
## p=  2 q=  4 AIC = -31396.88 BIC = -31358.88 Mase= 1.13137e-15 
## p=  2 q=  5 AIC = -27996.53 BIC = -27958.54 Mase= 3.00056e-14 
## p=  3 q=  1 AIC = -28584.12 BIC = -28537.65 Mase= 1.589576e-14 
## p=  3 q=  2 AIC = -28584.12 BIC = -28537.65 Mase= 1.589576e-14 
## p=  3 q=  3 AIC = -28584.12 BIC = -28537.65 Mase= 1.589576e-14 
## p=  3 q=  4 AIC = -28111.5 BIC = -28065.05 Mase= 2.685154e-14 
## p=  3 q=  5 AIC = -28562.62 BIC = -28516.2 Mase= 1.895016e-14 
## p=  4 q=  1 AIC = -32376.74 BIC = -32321.84 Mase= 1.867994e-16 
## p=  4 q=  2 AIC = -32376.74 BIC = -32321.84 Mase= 1.867994e-16 
## p=  4 q=  3 AIC = -32376.74 BIC = -32321.84 Mase= 1.867994e-16 
## p=  4 q=  4 AIC = -32376.74 BIC = -32321.84 Mase= 1.867994e-16 
## p=  4 q=  5 AIC = -29504.13 BIC = -29449.26 Mase= 1.453575e-14 
## p=  5 q=  1 AIC = -29423.44 BIC = -29360.13 Mase= 1.83604e-14 
## p=  5 q=  2 AIC = -29423.44 BIC = -29360.13 Mase= 1.83604e-14 
## p=  5 q=  3 AIC = -29423.44 BIC = -29360.13 Mase= 1.83604e-14 
## p=  5 q=  4 AIC = -29423.44 BIC = -29360.13 Mase= 1.83604e-14 
## p=  5 q=  5 AIC = -29423.44 BIC = -29360.13 Mase= 1.83604e-14

for (i in c(3,4,5)){
  model4_ardl <- ardlDlm(formula = mortality ~ temp + X3, data = data.frame(dataf), p 
                  = i, q = 5)
  summary(model4_ardl)
 
  residualcheck(model4_ardl$model)
  
}

## 
## Time series regression with "ts" data:
## Start = 6, End = 508
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -2.282e-12 -8.440e-15  3.810e-15  1.511e-14  7.850e-13 
## 
## Coefficients: (4 not defined because of singularities)
##              Estimate Std. Error   t value Pr(>|t|)    
## (Intercept) 1.000e+00  7.409e-14 1.350e+13  < 2e-16 ***
## temp.t      1.984e-15  9.123e-16 2.175e+00   0.0301 *  
## temp.1      4.201e-16  9.976e-16 4.210e-01   0.6739    
## temp.2      1.465e-15  9.931e-16 1.475e+00   0.1408    
## temp.3      1.951e-15  9.082e-16 2.148e+00   0.0322 *  
## X3.t        4.290e-14  5.816e-15 7.377e+00 6.91e-13 ***
## X3.1        3.601e-14  5.959e-15 6.044e+00 2.97e-09 ***
## X3.2        1.053e-14  5.946e-15 1.770e+00   0.0773 .  
## X3.3        2.892e-14  5.874e-15 4.924e+00 1.16e-06 ***
## mortality.1 1.000e+00  5.240e-17 1.909e+16  < 2e-16 ***
## mortality.2        NA         NA        NA       NA    
## mortality.3        NA         NA        NA       NA    
## mortality.4        NA         NA        NA       NA    
## mortality.5        NA         NA        NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.117e-13 on 493 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 9.446e+31 on 9 and 493 DF,  p-value: < 2.2e-16
## 
## 
## Time series regression with "ts" data:
## Start = 6, End = 508
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -4.370e-13 -9.910e-15 -3.200e-16  1.074e-14  6.579e-13 
## 
## Coefficients: (4 not defined because of singularities)
##               Estimate Std. Error    t value Pr(>|t|)    
## (Intercept)  1.000e+00  3.017e-14  3.315e+13  < 2e-16 ***
## temp.t       5.421e-16  3.600e-16  1.506e+00 0.132699    
## temp.1       3.387e-16  3.916e-16  8.650e-01 0.387474    
## temp.2      -1.144e-16  4.160e-16 -2.750e-01 0.783494    
## temp.3       3.294e-16  3.888e-16  8.470e-01 0.397294    
## temp.4      -7.356e-16  3.553e-16 -2.070e+00 0.038963 *  
## X3.t        -1.290e-15  2.292e-15 -5.630e-01 0.573758    
## X3.1         8.987e-15  2.338e-15  3.844e+00 0.000137 ***
## X3.2        -2.385e-16  2.371e-15 -1.010e-01 0.919918    
## X3.3         1.548e-14  2.330e-15  6.643e+00 8.17e-11 ***
## X3.4         8.938e-15  2.336e-15  3.827e+00 0.000147 ***
## mortality.1  1.000e+00  2.117e-17  4.724e+16  < 2e-16 ***
## mortality.2         NA         NA         NA       NA    
## mortality.3         NA         NA         NA       NA    
## mortality.4         NA         NA         NA       NA    
## mortality.5         NA         NA         NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.372e-14 on 491 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 5.043e+32 on 11 and 491 DF,  p-value: < 2.2e-16
## 
## 
## Time series regression with "ts" data:
## Start = 6, End = 508
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -3.287e-13 -1.405e-14  1.240e-15  1.249e-14  6.350e-13 
## 
## Coefficients: (4 not defined because of singularities)
##               Estimate Std. Error    t value Pr(>|t|)    
## (Intercept)  1.000e+00  3.386e-14  2.953e+13  < 2e-16 ***
## temp.t       4.055e-16  3.893e-16  1.042e+00   0.2981    
## temp.1      -1.769e-16  4.261e-16 -4.150e-01   0.6782    
## temp.2       2.586e-16  4.509e-16  5.730e-01   0.5666    
## temp.3      -1.007e-17  4.501e-16 -2.200e-02   0.9822    
## temp.4      -8.529e-16  4.200e-16 -2.031e+00   0.0428 *  
## temp.5      -3.321e-16  3.853e-16 -8.620e-01   0.3892    
## X3.t         2.158e-14  2.494e-15  8.653e+00  < 2e-16 ***
## X3.1        -2.542e-15  2.542e-15 -1.000e+00   0.3178    
## X3.2        -2.135e-14  2.567e-15 -8.317e+00 9.00e-16 ***
## X3.3         2.387e-15  2.567e-15  9.300e-01   0.3529    
## X3.4        -2.117e-14  2.553e-15 -8.291e+00 1.10e-15 ***
## X3.5        -1.505e-14  2.538e-15 -5.929e+00 5.77e-09 ***
## mortality.1  1.000e+00  2.353e-17  4.250e+16  < 2e-16 ***
## mortality.2         NA         NA         NA       NA    
## mortality.3         NA         NA         NA       NA    
## mortality.4         NA         NA         NA       NA    
## mortality.5         NA         NA         NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.728e-14 on 489 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 3.649e+32 on 13 and 489 DF,  p-value: < 2.2e-16

checkresiduals(model4_ardl$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 21
## 
## data:  Residuals
## LM test = 91.724, df = 21, p-value = 8.125e-11

#Based on the observation about model estimates made earlier, we can try to decrease the 
#number of lags for predictor series. We will fit ARDL(1,5) and perform diagnostic checking.

#for p=1, q=5
model4_1 = ardlDlm(formula = mortality ~ temp + X3, data = data.frame(dataf),p=1 ,q =5)$model
summary(model4_1)

## 
## Time series regression with "ts" data:
## Start = 6, End = 508
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -1.244e-13 -1.460e-14 -5.560e-15  4.650e-15  2.046e-12 
## 
## Coefficients: (4 not defined because of singularities)
##               Estimate Std. Error    t value Pr(>|t|)    
## (Intercept)  1.000e+00  5.747e-14  1.740e+13  < 2e-16 ***
## temp.t      -3.572e-15  7.207e-16 -4.956e+00 9.87e-07 ***
## temp.1      -2.276e-15  7.166e-16 -3.176e+00  0.00159 ** 
## X3.t        -4.133e-14  4.899e-15 -8.437e+00 3.58e-16 ***
## X3.1        -4.882e-14  4.971e-15 -9.820e+00  < 2e-16 ***
## mortality.1  1.000e+00  4.122e-17  2.426e+16  < 2e-16 ***
## mortality.2         NA         NA         NA       NA    
## mortality.3         NA         NA         NA       NA    
## mortality.4         NA         NA         NA       NA    
## mortality.5         NA         NA         NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.688e-14 on 497 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 2.26e+32 on 5 and 497 DF,  p-value: < 2.2e-16

residualcheck(model4_1)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.13369, p-value < 2.2e-16

checkresiduals(model4_1)

## 
##  Breusch-Godfrey test for serial correlation of order up to 13
## 
## data:  Residuals
## LM test = 25.695, df = 13, p-value = 0.01868

#attr(Model3.AllIndexes$model,"class")=lm

#for p=3, q=5
model4_3 = ardlDlm(formula = mortality ~ temp + X3, data = data.frame(dataf), p =3, q=5)$model
summary(model4_3)

## 
## Time series regression with "ts" data:
## Start = 6, End = 508
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -2.282e-12 -8.440e-15  3.810e-15  1.511e-14  7.850e-13 
## 
## Coefficients: (4 not defined because of singularities)
##              Estimate Std. Error   t value Pr(>|t|)    
## (Intercept) 1.000e+00  7.409e-14 1.350e+13  < 2e-16 ***
## temp.t      1.984e-15  9.123e-16 2.175e+00   0.0301 *  
## temp.1      4.201e-16  9.976e-16 4.210e-01   0.6739    
## temp.2      1.465e-15  9.931e-16 1.475e+00   0.1408    
## temp.3      1.951e-15  9.082e-16 2.148e+00   0.0322 *  
## X3.t        4.290e-14  5.816e-15 7.377e+00 6.91e-13 ***
## X3.1        3.601e-14  5.959e-15 6.044e+00 2.97e-09 ***
## X3.2        1.053e-14  5.946e-15 1.770e+00   0.0773 .  
## X3.3        2.892e-14  5.874e-15 4.924e+00 1.16e-06 ***
## mortality.1 1.000e+00  5.240e-17 1.909e+16  < 2e-16 ***
## mortality.2        NA         NA        NA       NA    
## mortality.3        NA         NA        NA       NA    
## mortality.4        NA         NA        NA       NA    
## mortality.5        NA         NA        NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.117e-13 on 493 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 9.446e+31 on 9 and 493 DF,  p-value: < 2.2e-16

residualcheck(model4_3)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.14877, p-value < 2.2e-16

checkresiduals(model4_3)

## 
##  Breusch-Godfrey test for serial correlation of order up to 17
## 
## data:  Residuals
## LM test = 61.371, df = 17, p-value = 6.232e-07

#for p=4, q=5
model4_4 = ardlDlm(formula = mortality ~ temp + X3, data = data.frame(dataf), p =4, q=5)$model
summary(model4_4)

## 
## Time series regression with "ts" data:
## Start = 6, End = 508
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -4.370e-13 -9.910e-15 -3.200e-16  1.074e-14  6.579e-13 
## 
## Coefficients: (4 not defined because of singularities)
##               Estimate Std. Error    t value Pr(>|t|)    
## (Intercept)  1.000e+00  3.017e-14  3.315e+13  < 2e-16 ***
## temp.t       5.421e-16  3.600e-16  1.506e+00 0.132699    
## temp.1       3.387e-16  3.916e-16  8.650e-01 0.387474    
## temp.2      -1.144e-16  4.160e-16 -2.750e-01 0.783494    
## temp.3       3.294e-16  3.888e-16  8.470e-01 0.397294    
## temp.4      -7.356e-16  3.553e-16 -2.070e+00 0.038963 *  
## X3.t        -1.290e-15  2.292e-15 -5.630e-01 0.573758    
## X3.1         8.987e-15  2.338e-15  3.844e+00 0.000137 ***
## X3.2        -2.385e-16  2.371e-15 -1.010e-01 0.919918    
## X3.3         1.548e-14  2.330e-15  6.643e+00 8.17e-11 ***
## X3.4         8.938e-15  2.336e-15  3.827e+00 0.000147 ***
## mortality.1  1.000e+00  2.117e-17  4.724e+16  < 2e-16 ***
## mortality.2         NA         NA         NA       NA    
## mortality.3         NA         NA         NA       NA    
## mortality.4         NA         NA         NA       NA    
## mortality.5         NA         NA         NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.372e-14 on 491 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 5.043e+32 on 11 and 491 DF,  p-value: < 2.2e-16

residualcheck(model4_4)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.46774, p-value < 2.2e-16

checkresiduals(model4_4)

## 
##  Breusch-Godfrey test for serial correlation of order up to 19
## 
## data:  Residuals
## LM test = 110.01, df = 19, p-value = 7.932e-15

#vif(model4.4)

#for p=5, q=5
model4_5 = ardlDlm(formula = mortality ~ temp + X3, data = data.frame(dataf), p =5, q=5)$model
summary(model4_5)

## 
## Time series regression with "ts" data:
## Start = 6, End = 508
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -3.287e-13 -1.405e-14  1.240e-15  1.249e-14  6.350e-13 
## 
## Coefficients: (4 not defined because of singularities)
##               Estimate Std. Error    t value Pr(>|t|)    
## (Intercept)  1.000e+00  3.386e-14  2.953e+13  < 2e-16 ***
## temp.t       4.055e-16  3.893e-16  1.042e+00   0.2981    
## temp.1      -1.769e-16  4.261e-16 -4.150e-01   0.6782    
## temp.2       2.586e-16  4.509e-16  5.730e-01   0.5666    
## temp.3      -1.007e-17  4.501e-16 -2.200e-02   0.9822    
## temp.4      -8.529e-16  4.200e-16 -2.031e+00   0.0428 *  
## temp.5      -3.321e-16  3.853e-16 -8.620e-01   0.3892    
## X3.t         2.158e-14  2.494e-15  8.653e+00  < 2e-16 ***
## X3.1        -2.542e-15  2.542e-15 -1.000e+00   0.3178    
## X3.2        -2.135e-14  2.567e-15 -8.317e+00 9.00e-16 ***
## X3.3         2.387e-15  2.567e-15  9.300e-01   0.3529    
## X3.4        -2.117e-14  2.553e-15 -8.291e+00 1.10e-15 ***
## X3.5        -1.505e-14  2.538e-15 -5.929e+00 5.77e-09 ***
## mortality.1  1.000e+00  2.353e-17  4.250e+16  < 2e-16 ***
## mortality.2         NA         NA         NA       NA    
## mortality.3         NA         NA         NA       NA    
## mortality.4         NA         NA         NA       NA    
## mortality.5         NA         NA         NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.728e-14 on 489 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 3.649e+32 on 13 and 489 DF,  p-value: < 2.2e-16

residualcheck(model4_5)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.54252, p-value < 2.2e-16

checkresiduals(model4_5)

## 
##  Breusch-Godfrey test for serial correlation of order up to 21
## 
## data:  Residuals
## LM test = 91.724, df = 21, p-value = 8.125e-11

#vif(model4.5)

All the fitted ARDL models were reported to be statistically significant at 5% level. The best model in terms of the significance of coefficients, AIC and adjusted R2 was marginally ARDL(1,5). According to model summary, it is suggested that the all ords price index might be related to its previous year levels, gold prices of the previous year and copper prices of the previous year as well. The AIC (2086.8) and adjusted R2 (95%) values are the best compared to those of all previously fitted models.

Residual analysis supports that this model is appropriate: the errors are randomly spread and have no discernible trend, there is no serial autocorrelation in the residuals based on ACF and Beusch-Godfrey test. The normality assumption is not violated at 5% level according to the histogram and the results of Shapiro-Wilk test. However, all ARDL models fitted suffer from multicollinearity with VIFs > 10. Overall,we failed to find an appropriate ARDL model in terms of multicollinearity.

Exponential smoothening

Mort1<-ts(mort_data$mortality,start = 2010,end =2018,frequency = 12)
hw1 <- hw(Mort1)
summary(hw1,)

## 
## Forecast method: Holt-Winters' additive method
## 
## Model Information:
## Holt-Winters' additive method 
## 
## Call:
##  hw(y = Mort1) 
## 
##   Smoothing parameters:
##     alpha = 0.3026 
##     beta  = 0.0634 
##     gamma = 1e-04 
## 
##   Initial states:
##     l = 98.503 
##     b = -0.6389 
##     s = -1.525 2.7413 0.8021 -0.7569 1.7547 -4.0959
##            1.0148 1.5662 -2.6838 -1.0263 1.5713 0.6374
## 
##   sigma:  6.4866
## 
##      AIC     AICc      BIC 
## 822.9894 830.7362 866.7595 
## 
## Error measures:
##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 0.3845843 5.927486 4.747907 0.1818436 4.998855 0.4851326
##                    ACF1
## Training set -0.1279376
## 
## Forecasts:
##          Point Forecast     Lo 80    Hi 80    Lo 95    Hi 95
## Feb 2018       109.0174 100.70458 117.3303 96.30402 121.7308
## Mar 2018       108.1470  99.29468 116.9994 94.60853 121.6855
## Apr 2018       108.2180  98.67271 117.7633 93.61974 122.8162
## May 2018       114.1949 103.80714 124.5826 98.30821 130.0815
## Jun 2018       115.3714 104.00047 126.7423 97.98107 132.7617
## Jul 2018       111.9884  99.50449 124.4722 92.89593 131.0808
## Aug 2018       119.5674 105.85214 133.2827 98.59170 140.5432
## Sep 2018       118.7832 103.72866 133.8378 95.75924 141.8073
## Oct 2018       122.0702 105.57787 138.5626 96.84735 147.2931
## Nov 2018       125.7372 107.71678 143.7576 98.17735 153.2971
## Dec 2018       123.1990 103.56711 142.8309 93.17460 153.2234
## Jan 2019       127.0892 105.76832 148.4102 94.48170 159.6968
## Feb 2019       129.7506 106.66765 152.8335 94.44830 165.0528
## Mar 2019       128.8802 103.96736 153.7930 90.77931 166.9810
## Apr 2019       128.9511 102.14370 155.7585 87.95271 169.9495
## May 2019       134.9280 106.16437 163.6917 90.93782 178.9182
## Jun 2019       136.1045 105.32577 166.8833 89.03249 183.1766
## Jul 2019       132.7215  99.87113 165.5719 82.48119 182.9618
## Aug 2019       140.3006 105.32414 175.2770 86.80873 193.7924
## Sep 2019       139.5164 102.36132 176.6715 82.69262 196.3402
## Oct 2019       142.8034 103.41875 182.1880 82.56980 203.0369
## Nov 2019       146.4703 104.80679 188.1339 82.75143 210.1893
## Dec 2019       143.9322  99.94159 187.9227 76.65439 211.2099
## Jan 2020       147.8224 101.45800 194.1868 76.91418 218.7306

checkresiduals(hw1)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt-Winters' additive method
## Q* = 34.909, df = 3, p-value = 1.273e-07
## 
## Model df: 16.   Total lags used: 19

hw2 <- hw(Mort1,seasonal="multiplicative")
summary(hw2)

## 
## Forecast method: Holt-Winters' multiplicative method
## 
## Model Information:
## Holt-Winters' multiplicative method 
## 
## Call:
##  hw(y = Mort1, seasonal = "multiplicative") 
## 
##   Smoothing parameters:
##     alpha = 0.2679 
##     beta  = 0.0791 
##     gamma = 1e-04 
## 
##   Initial states:
##     l = 97.498 
##     b = -0.3153 
##     s = 0.9886 1.0287 1.0079 0.9898 1.0204 0.9511
##            1.0112 1.0136 0.9762 0.9865 1.0161 1.0098
## 
##   sigma:  0.0696
## 
##      AIC     AICc      BIC 
## 823.5972 831.3440 867.3673 
## 
## Error measures:
##                     ME     RMSE      MAE        MPE    MAPE      MASE
## Training set 0.2988484 5.931923 4.731838 0.09075421 4.98262 0.4834907
##                    ACF1
## Training set -0.1040063
## 
## Forecasts:
##          Point Forecast     Lo 80    Hi 80    Lo 95    Hi 95
## Feb 2018       108.9713  99.24873 118.6939 94.10191 123.8407
## Mar 2018       107.7472  97.58845 117.9060 92.21073 123.2837
## Apr 2018       108.5439  97.55894 119.5288 91.74385 125.3439
## May 2018       114.7023 102.09534 127.3093 95.42160 133.9831
## Jun 2018       116.4298 102.42923 130.4304 95.01776 137.8419
## Jul 2018       111.3913  96.68473 126.0979 88.89954 133.8831
## Aug 2018       121.5252 103.89809 139.1523 94.56686 148.4835
## Sep 2018       119.8386 100.76717 138.9099 90.67140 149.0057
## Oct 2018       124.0145 102.41667 145.6123 90.98347 157.0455
## Nov 2018       128.6109 104.17956 153.0423 91.24638 165.9755
## Dec 2018       125.5508  99.62937 151.4721 85.90742 165.1941
## Jan 2019       130.2332 101.11772 159.3488 85.70489 174.7616
## Feb 2019       133.0516 100.95882 165.1445 83.96991 182.1334
## Mar 2019       131.1265  97.12273 165.1303 79.12220 183.1309
## Apr 2019       131.6778  95.08920 168.2664 75.72037 187.6352
## May 2019       138.7222  97.54971 179.8946 75.75434 201.6900
## Jun 2019       140.3932  96.01671 184.7697 72.52521 208.2612
## Jul 2019       133.9311  88.96981 178.8924 65.16874 202.6935
## Aug 2019       145.7078  93.89023 197.5253 66.45968 224.9559
## Sep 2019       143.2965  89.44097 197.1521 60.93157 225.6615
## Oct 2019       147.9003  89.28635 206.5142 58.25802 237.5425
## Nov 2019       152.9907  89.18828 216.7931 55.41332 250.5681
## Dec 2019       148.9803  83.72640 214.2342 49.18306 248.7776
## Jan 2020       154.1645  83.37155 224.9574 45.89605 262.4329

checkresiduals(hw2)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt-Winters' multiplicative method
## Q* = 37.742, df = 3, p-value = 3.206e-08
## 
## Model df: 16.   Total lags used: 19

hw3 <- hw(Mort1,seasonal="additive",damped = TRUE, h=5*frequency(Mort1))
summary(hw3)

## 
## Forecast method: Damped Holt-Winters' additive method
## 
## Model Information:
## Damped Holt-Winters' additive method 
## 
## Call:
##  hw(y = Mort1, h = 5 * frequency(Mort1), seasonal = "additive",  
## 
##  Call:
##      damped = TRUE) 
## 
##   Smoothing parameters:
##     alpha = 0.224 
##     beta  = 0.0802 
##     gamma = 4e-04 
##     phi   = 0.8531 
## 
##   Initial states:
##     l = 98.1632 
##     b = -0.438 
##     s = -1.567 2.4215 1.5828 -0.7804 1.9009 -4.6792
##            0.9772 1.6293 -2.8195 -0.7759 1.352 0.7583
## 
##   sigma:  6.4048
## 
##      AIC     AICc      BIC 
## 821.3245 830.0938 867.6693 
## 
## Error measures:
##                     ME     RMSE      MAE        MPE    MAPE      MASE
## Training set 0.2770002 5.816561 4.675037 0.01609047 4.93011 0.4776868
##                     ACF1
## Training set -0.09199321
## 
## Forecasts:
##          Point Forecast     Lo 80    Hi 80    Lo 95    Hi 95
## Feb 2018       106.7660  98.55785 114.9741 94.21274 119.3192
## Mar 2018       105.9188  97.36706 114.4706 92.84004 118.9976
## Apr 2018       104.9735  95.95014 113.9968 91.17348 118.7735
## May 2018       110.3536 100.75004 119.9573 95.66619 125.0411
## Jun 2018       110.4993 100.23031 120.7683 94.79424 126.2043
## Jul 2018       105.5240  94.52737 116.5207 88.70609 122.3419
## Aug 2018       112.6836 100.91663 124.4506 94.68758 130.6796
## Sep 2018       110.4972  97.93307 123.0612 91.28205 129.7123
## Oct 2018       113.2808  99.90504 126.6566 92.82433 133.7373
## Nov 2018       114.4836 100.29076 128.6765 92.77751 136.1897
## Dec 2018       110.8025  95.79399 125.8111 87.84893 133.7562
## Jan 2019       113.3896  97.57164 129.2076 89.19811 137.5812
## Feb 2019       114.2065  97.58796 130.8251 88.79062 139.6224
## Mar 2019       112.2663  94.86017 129.6724 85.64594 138.8866
## Apr 2019       110.3884  92.20850 128.5683 82.58464 138.1922
## May 2019       114.9731  96.03401 133.9121 86.00829 143.9378
## Jun 2019       114.4401  94.75711 134.1230 84.33759 144.5425
## Jul 2019       108.8858  88.47445 129.2972 77.66931 140.1024
## Aug 2019       115.5515  94.42712 136.6759 83.24454 147.8585
## Sep 2019       112.9437  91.12159 134.7659 79.56963 146.3179
## Oct 2019       115.3680  92.86299 137.8730 80.94958 149.7864
## Nov 2019       116.2642  93.09090 139.4374 80.82371 151.7046
## Dec 2019       112.3215  88.49406 136.1489 75.88058 148.7624
## Jan 2020       114.6854  90.21744 139.1534 77.26487 152.1060
## Feb 2020       115.3120  90.21589 140.4080 76.93083 153.6931
## Mar 2020       113.2093  87.49842 138.9202 73.88791 152.5307
## Apr 2020       111.1929  84.87935 137.5065 70.94980 151.4360
## May 2020       115.6594  88.75474 142.5640 74.51229 156.8064
## Jun 2020       115.0255  87.54098 142.5101 72.99154 157.0595
## Jul 2020       109.3853  81.33149 137.4391 66.48070 152.2899
## Aug 2020       115.9776  87.36476 144.5904 72.21803 159.7372
## Sep 2020       113.3072  84.14515 142.4693 68.70767 157.9068
## Oct 2020       115.6781  85.97611 145.3800 70.25285 161.1033
## Nov 2020       116.5287  86.29586 146.7615 70.29158 162.7658
## Dec 2020       112.5472  81.79209 143.3022 65.51134 159.5830
## Jan 2021       114.8779  83.60888 146.1470 67.05604 162.6998
## Feb 2021       115.4762  83.70055 147.2518 66.87954 164.0728
## Mar 2021       113.3494  81.07532 145.6235 63.99045 162.7083
## Apr 2021       111.3124  78.54722 144.0776 61.20237 161.4225
## May 2021       115.7613  82.51201 149.0106 64.91089 166.6118
## Jun 2021       115.1125  81.38586 148.8392 63.53204 166.6930
## Jul 2021       109.4595  75.26200 143.6570 57.15894 161.7601
## Aug 2021       116.0409  81.37882 150.7030 63.02982 169.0520
## Sep 2021       113.3612  78.24059 148.4819 59.64885 167.0736
## Oct 2021       115.7241  80.15075 151.2975 61.31934 170.1289
## Nov 2021       116.5680  80.54747 152.5885 61.47936 171.6566
## Dec 2021       112.5807  76.11846 149.0429 56.81652 168.3449
## Jan 2022       114.9065  78.00782 151.8053 58.47482 171.3383
## Feb 2022       115.5006  78.16999 152.8312 58.40837 172.5928
## Mar 2022       113.3702  75.61308 151.1273 55.62566 171.1148
## Apr 2022       111.3302  73.15124 149.5091 52.94053 169.7198
## May 2022       115.7765  77.18030 154.3726 56.74872 174.8042
## Jun 2022       115.1254  76.11647 154.1344 55.46638 174.7845
## Jul 2022       109.4705  70.05308 148.8880 49.18674 169.7543
## Aug 2022       116.0503  76.22854 155.8721 55.14817 176.9524
## Sep 2022       113.3693  73.14722 153.5913 51.85497 174.8836
## Oct 2022       115.7310  75.11261 156.3493 53.61054 177.8514
## Nov 2022       116.5738  75.56294 157.5847 53.85309 179.2946
## Dec 2022       112.5857  71.18598 153.9854 49.27031 175.9010
## Jan 2023       114.9108  73.12591 156.6957 51.00633 178.8153

checkresiduals(hw3)

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt-Winters' additive method
## Q* = 32.827, df = 3, p-value = 3.503e-07
## 
## Model df: 17.   Total lags used: 20

hw4 <- hw(Mort1,seasonal="multiplicative",damped = TRUE, h=5*frequency(Mort1))
summary(hw4)

## 
## Forecast method: Damped Holt-Winters' multiplicative method
## 
## Model Information:
## Damped Holt-Winters' multiplicative method 
## 
## Call:
##  hw(y = Mort1, h = 5 * frequency(Mort1), seasonal = "multiplicative",  
## 
##  Call:
##      damped = TRUE) 
## 
##   Smoothing parameters:
##     alpha = 0.2373 
##     beta  = 0.071 
##     gamma = 1e-04 
##     phi   = 0.946 
## 
##   Initial states:
##     l = 98.2254 
##     b = -0.2986 
##     s = 0.9893 1.0314 1.001 0.9921 1.0153 0.9495
##            1.0092 1.0147 0.9811 0.9926 1.017 1.0066
## 
##   sigma:  0.0691
## 
##      AIC     AICc      BIC 
## 823.0094 831.7786 869.3542 
## 
## Error measures:
##                     ME     RMSE      MAE        MPE     MAPE      MASE
## Training set 0.2973536 5.848648 4.651999 0.06786496 4.897167 0.4753329
##                     ACF1
## Training set -0.09439883
## 
## Forecasts:
##          Point Forecast     Lo 80    Hi 80       Lo 95    Hi 95
## Feb 2018       108.1279  98.55184 117.7039  93.4826050 122.7731
## Mar 2018       107.1958  97.27297 117.1187  92.0201182 122.3716
## Apr 2018       107.5107  96.96826 118.0532  91.3874201 123.6340
## May 2018       112.7263 100.89672 124.5558  94.6345414 130.8180
## Jun 2018       113.5512 100.71458 126.3878  93.9192940 133.1831
## Jul 2018       108.1114  94.90092 121.3219  87.9077117 128.3151
## Aug 2018       116.8894 101.43664 132.3422  93.2564450 140.5224
## Sep 2018       115.4130  98.91974 131.9063  90.1887368 140.6373
## Oct 2018       117.5915  99.46155 135.7214  89.8641370 145.3188
## Nov 2018       122.2738 101.98852 142.5591  91.2501512 153.2974
## Dec 2018       118.2880  97.23458 139.3415  86.0895518 150.4865
## Jan 2019       121.3315  98.23526 144.4277  86.0088522 156.6541
## Feb 2019       123.5114  98.44360 148.5792  85.1735101 161.8493
## Mar 2019       121.3994  95.20880 147.5900  81.3443165 161.4545
## Apr 2019       120.7919  93.17156 148.4122  78.5502289 163.0336
## May 2019       125.7220  95.33606 156.1080  79.2507127 172.1933
## Jun 2019       125.7791  93.72976 157.8284  76.7638708 174.7943
## Jul 2019       118.9952  87.10633 150.8840  70.2254041 167.7649
## Aug 2019       127.8988  91.93270 163.8649  72.8934080 182.9042
## Sep 2019       125.5903  88.60879 162.5718  69.0319615 182.1486
## Oct 2019       127.3062  88.12966 166.4827  67.3908562 187.2215
## Nov 2019       131.7433  89.45149 174.0352  67.0635311 196.4231
## Dec 2019       126.8807  84.46441 169.2969  62.0106137 191.7507
## Jan 2020       129.6029  84.55581 174.6499  60.7093399 198.4964
## Feb 2020       131.4172  83.99589 178.8384  58.8925781 203.9418
## Mar 2020       128.6988  80.55330 176.8443  55.0666097 202.3310
## Apr 2020       127.6173  78.18817 177.0464  52.0219844 203.2125
## May 2020       132.4007  79.37056 185.4308  51.2980967 213.5033
## Jun 2020       132.0632  77.42788 186.6985  48.5057026 215.6207
## Jul 2020       124.5885  71.40725 177.7697  43.2548263 205.9221
## Aug 2020       133.5566  74.79526 192.3180  43.6888717 223.4244
## Sep 2020       130.8205  71.55103 190.0900  40.1756473 221.4654
## Oct 2020       132.2987  70.63264 193.9647  37.9886003 226.6088
## Nov 2020       136.6099  71.15590 202.0638  36.5066588 236.7131
## Dec 2020       131.2965  66.68370 195.9093  32.4797447 230.1133
## Jan 2021       133.8536  66.24937 201.4579  30.4618398 237.2454
## Feb 2021       135.4801  65.30490 205.6552  28.1564085 242.8037
## Mar 2021       132.4501  62.13908 202.7611  24.9186813 239.9815
## Apr 2021       131.1249  59.83392 202.4159  22.0947492 240.1551
## May 2021       135.8330  60.24331 211.4227  20.2285555 251.4374
## Jun 2021       135.2927  58.27654 212.3088  17.5066621 253.0787
## Jul 2021       127.4629  53.28134 201.6446  14.0119712 240.9139
## Aug 2021       136.4643  55.31174 217.6169  12.3521699 260.5764
## Sep 2021       133.5084  52.42362 214.5933   9.4999123 257.5170
## Oct 2021       134.8644  51.25358 218.4753   6.9926592 262.7362
## Nov 2021       139.1109  51.11608 227.1057   4.5344558 273.6873
## Dec 2021       133.5659  47.40165 219.7302   1.7890404 265.3428
## Jan 2022       136.0382  46.57560 225.5008  -0.7830294 272.8594
## Feb 2022       137.5681  45.38111 229.7550  -3.4197201 278.5559
## Mar 2022       134.3779  42.65484 226.1010  -5.9004293 274.6563
## Apr 2022       132.9276  40.54283 225.3123  -8.3627074 274.2179
## May 2022       137.5969  40.26157 234.9323 -11.2646518 286.4585
## Jun 2022       136.9524  38.37969 235.5251 -13.8015346 287.7063
## Jul 2022       128.9402  34.54394 223.3365 -15.4264338 273.3069
## Aug 2022       137.9586  35.26234 240.6549 -19.1017965 295.0191
## Sep 2022       134.8898  32.82179 236.9579 -21.2097607 290.9894
## Oct 2022       136.1830  31.46813 240.8980 -23.9645926 296.3307
## Nov 2022       140.3962  30.72544 250.0670 -27.3307510 308.1232
## Dec 2022       134.7322  27.84235 241.6221 -28.7417368 298.2062
## Jan 2023       137.1609  26.67447 247.6473 -31.8135168 306.1353

checkresiduals(hw4)

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt-Winters' multiplicative method
## Q* = 36.493, df = 3, p-value = 5.891e-08
## 
## Model df: 17.   Total lags used: 20

hw5 <- hw(Mort1,seasonal="multiplicative",exponential = TRUE, h=5*frequency(Mort1))
summary(hw5)

## 
## Forecast method: Holt-Winters' multiplicative method with exponential trend
## 
## Model Information:
## Holt-Winters' multiplicative method with exponential trend 
## 
## Call:
##  hw(y = Mort1, h = 5 * frequency(Mort1), seasonal = "multiplicative",  
## 
##  Call:
##      exponential = TRUE) 
## 
##   Smoothing parameters:
##     alpha = 0.1702 
##     beta  = 0.0826 
##     gamma = 6e-04 
## 
##   Initial states:
##     l = 98.1619 
##     b = 0.9888 
##     s = 0.9861 1.0325 1.0065 0.9966 1.015 0.95
##            1.0078 1.0165 0.975 0.9877 1.0245 1.0018
## 
##   sigma:  0.0698
## 
##      AIC     AICc      BIC 
## 824.2251 831.9720 867.9952 
## 
## Error measures:
##                     ME     RMSE      MAE        MPE     MAPE     MASE
## Training set 0.2476226 6.041892 4.736294 0.05095547 4.972325 0.483946
##                    ACF1
## Training set 0.02318078
## 
## Forecasts:
##          Point Forecast     Lo 80     Hi 80     Lo 95     Hi 95
## Feb 2018       109.4988  99.83673  119.5230  94.46214  124.5544
## Mar 2018       108.0580  98.37236  118.0374  92.94384  123.8599
## Apr 2018       109.1812  98.55099  120.1720  93.47974  126.0201
## May 2018       116.5184 104.29342  128.8025  98.60449  135.5483
## Jun 2018       118.2492 105.55911  132.2012  98.60745  140.0028
## Jul 2018       114.1012 100.38163  129.3366  93.27904  137.6412
## Aug 2018       124.7755 108.13226  143.5460 100.18277  153.4979
## Sep 2018       125.3974 107.15307  146.5052  98.27312  158.1772
## Oct 2018       129.6403 108.90381  153.7712  98.41594  167.7733
## Nov 2018       136.1257 112.39892  164.6797 100.65249  182.1086
## Dec 2018       133.0759 107.52995  165.2210  96.38835  183.0898
## Jan 2019       138.4000 109.56289  176.1737  96.13983  197.2343
## Feb 2019       144.8668 112.11239  186.8451  97.24755  215.1107
## Mar 2019       142.9607 108.37087  189.7698  93.02552  217.4693
## Apr 2019       144.4467 106.33689  195.8011  89.30283  227.9146
## May 2019       154.1539 111.31489  214.4449  91.81972  254.4071
## Jun 2019       156.4436 110.06103  224.4994  89.89264  267.8307
## Jul 2019       150.9558 103.39080  221.6239  82.88167  269.2519
## Aug 2019       165.0779 109.23115  249.9013  86.92044  309.2957
## Sep 2019       165.9007 106.70006  257.2551  83.37932  323.4817
## Oct 2019       171.5141 106.61312  276.9840  83.22425  347.2831
## Nov 2019       180.0942 108.59901  298.9394  82.27630  387.3839
## Dec 2019       176.0594 102.83667  298.6024  77.32540  396.6695
## Jan 2020       183.1032 102.63906  324.2812  75.57144  429.6073
## Feb 2020       191.6587 104.12938  346.0328  76.59149  474.4412
## Mar 2020       189.1369  99.90952  355.6159  72.24514  490.6191
## Apr 2020       191.1029  97.70887  374.8353  68.50357  521.0252
## May 2020       203.9455 100.80766  413.0005  70.47600  586.8126
## Jun 2020       206.9749  97.53154  432.9401  67.66125  633.6323
## Jul 2020       199.7145  90.96275  433.8695  61.63919  633.5279
## Aug 2020       218.3980  96.16393  492.4960  63.11115  735.6280
## Sep 2020       219.4866  92.27964  512.6324  60.61528  785.6791
## Oct 2020       226.9130  92.88519  550.6302  58.40900  855.9951
## Nov 2020       238.2646  93.13383  592.4569  59.02745  947.2777
## Dec 2020       232.9265  88.79074  600.7463  54.53260 1013.7950
## Jan 2021       242.2454  87.18876  647.9502  54.19609 1080.2958
## Feb 2021       253.5644  88.25287  704.4843  53.19231 1218.2414
## Mar 2021       250.2280  83.72571  727.2705  48.65207 1273.4423
## Apr 2021       252.8290  81.10593  763.1807  47.02147 1359.4674
## May 2021       269.8198  83.53507  850.0836  46.78386 1555.7785
## Jun 2021       273.8277  80.90388  895.4948  44.65058 1682.5248
## Jul 2021       264.2222  74.81528  901.9133  40.72937 1712.5776
## Aug 2021       288.9404  78.95063 1017.5552  41.82568 2001.6934
## Sep 2021       290.3807  75.66547 1069.6307  39.28722 2178.6194
## Oct 2021       300.2058  74.51540 1147.5222  37.99232 2375.2249
## Nov 2021       315.2240  75.34337 1262.6635  37.21553 2684.3866
## Dec 2021       308.1617  69.65201 1286.3090  34.03295 2828.7544
## Jan 2022       320.4906  69.40450 1391.9039  32.97809 3146.3126
## Feb 2022       335.4656  69.48091 1532.4272  32.44381 3523.4407
## Mar 2022       331.0516  64.87783 1563.7035  30.24014 3773.6356
## Apr 2022       334.4927  63.40979 1639.9787  28.31597 3940.3574
## May 2022       356.9715  64.01448 1847.7022  28.07284 4547.3062
## Jun 2022       362.2739  62.19522 1955.6994  26.43109 4863.9921
## Jul 2022       349.5658  57.56158 1975.4967  23.60280 5221.6762
## Aug 2022       382.2681  59.86427 2264.9282  23.42085 5916.0922
## Sep 2022       384.1735  56.66129 2346.4940  21.88930 6654.0246
## Oct 2022       397.1722  56.81254 2505.8187  21.00711 7235.6332
## Nov 2022       417.0412  56.64708 2808.5879  21.10919 8169.1868
## Dec 2022       407.6978  52.50600 2848.0398  18.41054 8714.6095
## Jan 2023       424.0090  51.44345 3125.0802  17.59570 9914.5010

checkresiduals(hw5)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt-Winters' multiplicative method with exponential trend
## Q* = 33.217, df = 3, p-value = 2.899e-07
## 
## Model df: 16.   Total lags used: 19

Finding best fit for each attribute

fit.expo = ets(mortal_ts, model="ZZZ", ic ="bic")
fit.expo$method

## [1] "ETS(M,N,N)"

#"ETS(M,N,N)"
fit.tem = ets(temp_ts, model="ZZZ", ic ="bic")
fit.tem$method

## [1] "ETS(A,N,N)"

#ETS(A,N,N)
fit.ch1 = ets(chem1_ts, model="ZZZ", ic ="bic")
fit.ch1$method

## [1] "ETS(M,Ad,N)"

#ETS(M,Ad,N)"

fit.ch2 = ets(chem2_ts, model="ZZZ", ic ="bic")
fit.ch2$method

## [1] "ETS(M,N,N)"

#"ETS(M,N,N)"

fit.par = ets(part_ts, model="ZZZ", ic ="bic")
fit.par$method

## [1] "ETS(M,Ad,N)"

#ETS(M,Ad,N)"

We append the accuracy measures for exponential smoothing models to the accuracy data frame. The format of model names is: trend (multiplicative or additive), seasonality (multiplicative or additive) and if the trend is damped or not damped

State-space models variation

ssmodel1=ets(mortal_ts,model = "ANN")
summary(ssmodel1)

## ETS(A,N,N) 
## 
## Call:
##  ets(y = mortal_ts, model = "ANN") 
## 
##   Smoothing parameters:
##     alpha = 0.511 
## 
##   Initial states:
##     l = 98.9364 
## 
##   sigma:  5.8932
## 
##      AIC     AICc      BIC 
## 4971.256 4971.303 4983.947 
## 
## Training set error measures:
##                       ME    RMSE      MAE        MPE     MAPE      MASE
## Training set -0.05497816 5.88156 4.588578 -0.3769459 5.156433 0.6871475
##                     ACF1
## Training set -0.07517189

ssmodel2=ets(mortal_ts,model = "MNN")
summary(ssmodel2)

## ETS(M,N,N) 
## 
## Call:
##  ets(y = mortal_ts, model = "MNN") 
## 
##   Smoothing parameters:
##     alpha = 0.4843 
## 
##   Initial states:
##     l = 98.5582 
## 
##   sigma:  0.0656
## 
##      AIC     AICc      BIC 
## 4954.111 4954.159 4966.803 
## 
## Training set error measures:
##                       ME    RMSE      MAE        MPE     MAPE      MASE
## Training set -0.05730399 5.88508 4.593891 -0.3849281 5.159608 0.6879431
##                     ACF1
## Training set -0.04372931

ssmodel3=ets(mortal_ts,model = "AAN")
summary(ssmodel3)

## ETS(A,A,N) 
## 
## Call:
##  ets(y = mortal_ts, model = "AAN") 
## 
##   Smoothing parameters:
##     alpha = 0.5122 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 100.9765 
##     b = -0.029 
## 
##   sigma:  5.906
## 
##      AIC     AICc      BIC 
## 4975.460 4975.579 4996.612 
## 
## Training set error measures:
##                        ME    RMSE      MAE        MPE     MAPE      MASE
## Training set -0.004457548 5.88274 4.590352 -0.3178407 5.155916 0.6874132
##                     ACF1
## Training set -0.07651869

ssmodel4=ets(mortal_ts,model = "MAN",damped = TRUE)
summary(ssmodel4)

## ETS(M,Ad,N) 
## 
## Call:
##  ets(y = mortal_ts, model = "MAN", damped = TRUE) 
## 
##   Smoothing parameters:
##     alpha = 0.4311 
##     beta  = 0.0441 
##     phi   = 0.8 
## 
##   Initial states:
##     l = 101.9204 
##     b = -0.7466 
## 
##   sigma:  0.0657
## 
##      AIC     AICc      BIC 
## 4957.818 4957.986 4983.201 
## 
## Training set error measures:
##                     ME     RMSE      MAE        MPE     MAPE      MASE
## Training set -0.041505 5.883233 4.604242 -0.3409889 5.167291 0.6894931
##                     ACF1
## Training set -0.02166677

ssmodel5=ets(mortal_ts,model = "MAN")
summary(ssmodel5)

## ETS(M,Ad,N) 
## 
## Call:
##  ets(y = mortal_ts, model = "MAN") 
## 
##   Smoothing parameters:
##     alpha = 0.4311 
##     beta  = 0.0441 
##     phi   = 0.8 
## 
##   Initial states:
##     l = 101.9204 
##     b = -0.7466 
## 
##   sigma:  0.0657
## 
##      AIC     AICc      BIC 
## 4957.818 4957.986 4983.201 
## 
## Training set error measures:
##                     ME     RMSE      MAE        MPE     MAPE      MASE
## Training set -0.041505 5.883233 4.604242 -0.3409889 5.167291 0.6894931
##                     ACF1
## Training set -0.02166677

vlist <- c("AAA", "MAA", "MAM", "MMM")
damp <- c(T,F)
ets_models <- expand.grid(vlist, damp)
ets_aic <- array(NA, 8)
ets_mase <- array(NA,8)
ets_bic <- array(NA,8)


auto_ets <- ets(head(mortal_ts,50))
summary(auto_ets)

## ETS(M,A,N) 
## 
## Call:
##  ets(y = head(mortal_ts, 50)) 
## 
##   Smoothing parameters:
##     alpha = 0.0546 
##     beta  = 0.0546 
## 
##   Initial states:
##     l = 97.2676 
##     b = -0.9257 
## 
##   sigma:  0.0581
## 
##      AIC     AICc      BIC 
## 369.3093 370.6729 378.8694 
## 
## Training set error measures:
##                    ME     RMSE      MAE      MPE     MAPE MASE        ACF1
## Training set 1.211525 5.256707 3.928646 1.037072 4.099653  NaN -0.01514436

checkresiduals(auto_ets)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(M,A,N)
## Q* = 17.502, df = 6, p-value = 0.007605
## 
## Model df: 4.   Total lags used: 10

#We append the accuracy measures for state-space models to the accuracy data frame calculate <- data.frame(mod, ets_mase, ets_aic, ets_bic) calculate\(X2 <- factor(calculate\)X2, levels = c(T,F), labels = c(“Damped”,“N”)) calculate <- unite(calculate, “Model”, c(“X1”,“X2”)) colnames(calculate) <- c(“Model”, “MASE”, “AIC”, “BIC”) accuracy <- rbind(accuracy,calculate)

#The data frame accuracy is sorted by ascending MASE value. accuracy <- arrange(accuracy, MASE) kable(accuracy, caption = “Models and their accuracy parameters (sorted by MASE)”)

#dlmfore=forecast(model4, x= as.vector(dataf), h = 4)$forecast
#polfore = forecast(pmodel1, x= as.vector(dataf), h = 4)$forecast
#koyfore = forecast(model3.p, x= as.vector(dataf), h = 4)$forecast
#arfore = forecast(armodel1, x= as.vector(dataf), h = 4)$forecast
#hwforecast=hw4$mean
#etsforecast = as.vector(forecast(ssm4,h=4)$mean)
#ets = c(Mort,etsforecast)
#hw=c(Mort,hwforecast)


#dlm = c(Mort,dlmfore)
#poly=c(Mort,polfore)
#koyn=c(Mort,koyfore)
#ardlm=c(Mort,arfore)

#Dataforecast = ts.intersect(
 # ts(dlm,start=2010,frequency = 52),
  #ts(poly,start=2010,frequency = 52),
  #ts(koyn,start=2010,frequency = 52),
#  ts(ardlm,start=2010,frequency = 52),
 # ts(hw,start=2010,frequency = 52),
  #ts(ets,start=2010,frequency = 52))
#

#ts.plot(Dataforecast,xlim=c(2019.75,2019.85),
 #       plot.type = c("single"),gpars=list(col=c("red","blue","gray","green","black","brown")),main = "Forecasting of next 4 Weeks of Mortality")

#legend("topleft", col=c("red","blue","gray","green","black","brown"), lty=1, cex=.65,c("DLM","Poly","koyn","ardlm","HW","ETS"))

Conclusion:

The Aim of Task 1 was to give predictions of solar radiation amount for the next 2 years,and so we used three methods and compared them based on residual analysis and MASE value. The three methods were: • Time series regression models • Exponential smoothing • State-space models From the model fitting plots,the results and findings obtained says that though the time series regression, some exponential smoothing models and the automatically suggested .

Task 2

Introduction

In a study of 81 species of Australian plants Hudson & Keatley (2021) investigated whether the day of occurrence of a species first flowering (first flowering day, FFD, a number between 1 -365) is impacted by climate factors such as rainfall (rain), temperature (temp), radiation level (rad), and relative humidity (RH). The study by Hudson & Keatley essentially explores the influence of long-term climate on the FFD of 81 species of plants from 1984 to 2014. For this task, we will apply time series regression method to fit distributed lag models using yearly FFD series as an independent explanatory series and Provide the point forecasts and confidence intervals and corresponding plot for the most optimal model for each method used . We will also apply exponential smoothing methods with corresponding state-space models to forecast solar radiation series. We will then demonstrate an appropriate comparison between these methods in terms of residual assumptions and goodness of fit measures. The final goal of this analysis is to give 4 years ahead forecasts from the best suitable model in terms of its mean absolute scaled error (MASE) measure.

About Dataset

Your data focuses on one species (of the 81) and contains 5 time series, the FFD time series of the given plant species and the contemporaneous yearly averaged climate variables measured from 1984 – 2014 (31 years). All series are available here in “FFD .csv”

# Load the data

ffdata <- read.csv("D:/Drive data/Rmit/Sem4/Forecasting/FFD.csv")
ffdata

##    ï..Year Temperature Rainfall Radiation RelHumidity FFD
## 1     1984    9.371585 2.489344  14.87158    93.92650 217
## 2     1985    9.656164 2.475890  14.68493    94.93589 186
## 3     1986    9.273973 2.421370  14.51507    94.09507 233
## 4     1987    9.219178 2.319726  14.67397    94.49699 222
## 5     1988   10.202186 2.465301  14.74863    94.08142 214
## 6     1989    9.441096 2.735890  14.78356    96.08685 237
## 7     1990    9.943836 2.398630  14.67671    93.77918 213
## 8     1991    9.690411 2.635616  14.41096    93.15562 206
## 9     1992    9.691257 2.795902  13.39617    94.09863 188
## 10    1993    9.947945 2.878630  14.26575    94.91973 234
## 11    1994    9.316438 1.974795  14.52329    93.26932 264
## 12    1995    9.164384 2.843288  13.90411    94.45863 196
## 13    1996    8.967213 2.814754  14.33060    94.60000 229
## 14    1997    9.038356 1.403014  14.77534    93.74685 212
## 15    1998    8.934247 2.289041  14.60000    94.60822 244
## 16    1999    9.547945 2.126301  14.61370    96.22603 178
## 17    2000    9.680328 2.471858  14.65574    95.65738 154
## 18    2001    9.561644 2.227945  14.14521    94.70712 207
## 19    2002    9.389041 1.740000  14.63836    93.53233 182
## 20    2003    9.210959 2.270411  15.11233    94.47096 218
## 21    2004    9.300546 2.620492  14.64481    95.01421 192
## 22    2005    9.623288 2.284110  15.09315    94.30356 199
## 23    2006    8.715068 1.781370  15.41096    94.84493 200
## 24    2007    9.801370 2.191233  15.19452    94.11068 225
## 25    2008    9.034153 1.743169  14.80328    94.39508 216
## 26    2009    9.457534 2.038630  15.12877    94.63096 197
## 27    2010    9.765753 2.777808  14.29315    96.05205 230
## 28    2011    9.826027 2.886301  14.01096    95.70603 204
## 29    2012    9.767760 2.599454  14.40710    94.90519 233
## 30    2013   10.097260 2.540274  14.43014    93.83479 174
## 31    2014   10.247253 2.239286  14.60165    94.21016 189

#Converting into timeseries
ffdata_ts <- ts(ffdata, start=c(1984,1), frequency= 1)
head(ffdata_ts)

## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
##      ï..Year Temperature Rainfall Radiation RelHumidity FFD
## 1984    1984    9.371585 2.489344  14.87158    93.92650 217
## 1985    1985    9.656164 2.475890  14.68493    94.93589 186
## 1986    1986    9.273973 2.421370  14.51507    94.09507 233
## 1987    1987    9.219178 2.319726  14.67397    94.49699 222
## 1988    1988   10.202186 2.465301  14.74863    94.08142 214
## 1989    1989    9.441096 2.735890  14.78356    96.08685 237

tail(ffdata_ts)

## Time Series:
## Start = 2009 
## End = 2014 
## Frequency = 1 
##      ï..Year Temperature Rainfall Radiation RelHumidity FFD
## 2009    2009    9.457534 2.038630  15.12877    94.63096 197
## 2010    2010    9.765753 2.777808  14.29315    96.05205 230
## 2011    2011    9.826027 2.886301  14.01096    95.70603 204
## 2012    2012    9.767760 2.599454  14.40710    94.90519 233
## 2013    2013   10.097260 2.540274  14.43014    93.83479 174
## 2014    2014   10.247253 2.239286  14.60165    94.21016 189

Ts_tempo<- ts(ffdata$Temperature, start =c(1984,1), frequency = 1)
head(Ts_tempo)

## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1]  9.371585  9.656164  9.273973  9.219178 10.202186  9.441096

Ts_Rain <- ts(ffdata$Rainfall, start =c(1984,1), frequency = 1)
head(Ts_Rain)

## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1] 2.489344 2.475890 2.421370 2.319726 2.465301 2.735890

Ts_Rad<- ts(ffdata$Radiation, start =c(1984,1), frequency = 1)
head(Ts_Rad)

## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1] 14.87158 14.68493 14.51507 14.67397 14.74863 14.78356

Ts_Hum <- ts(ffdata$RelHumidity, start =c(1984,1), frequency = 1)
head(Ts_Hum)

## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1] 93.92650 94.93589 94.09507 94.49699 94.08142 96.08685

Ts_FFD <- ts(ffdata$FFD, start =c(1984,1), frequency = 1)
head(Ts_FFD)

## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1] 217 186 233 222 214 237

**Data exploration and visualisation

plot(Ts_FFD,  main = "Fig.1 Time series plot of First flowering day series", ylab = "occurence of FFD series", xlab = "Time")

acf(Ts_FFD, lag.max = 48, main="Fig.2 ACF plot of first flowering day series")

adf.test(Ts_FFD, k=ar(Ts_FFD)$order)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  Ts_FFD
## Dickey-Fuller = -5.4552, Lag order = 0, p-value = 0.01
## alternative hypothesis: stationary

From the plot in Figure 1, we can observe the following characteristics of the series:

There is no apparent trend.

There is obvious seasonality, with lower values in December and January and higher values in June and July. The seasonal pattern is not consistent across the observed time.

Changing variance and behaviour of the series are not obvious due to the presence of seasonality.

There are two potential intervention points .

We will further display sample ACF and conduct an Augmented Dickey-Fuller test to study stationarity and seasonality in the series. The length of our data allows to display more lags in the ACF plot to better observe any evidence of trend.

The ACF plot in Figure 2 shows no seasonal patterns and suggests no trend. ADF test with lag order = 0 reports stationarity in the series at 5% level of significance (p-value < 0.05). Overall, we conclude that FFD series has a no seasonality pattern.

We will display a time series plot of factors affecting ffd which we will use as a predictor series for distributed lag models.

#
par(mfrow=c(2,2))
plot(Ts_tempo, main ="Fig.3.1 Time series plot of temperature effects on ffd", ylab="Temperature change", xlab = "Time")

plot(Ts_Rain, main ="Fig3.2.Time series plot of Rain effects on ffd series", ylab="Rainfall", xlab = "Time")

plot(Ts_Rad, main ="Fig 3.3 Time series plot of Radiations on ffd series", ylab="Radiations", xlab = "Time")

plot(Ts_Hum, main ="Fig 3.4 Time series plot of Humidity effects on ffd series", ylab="Humidity", xlab = "Time")

par(mfrow=c(1,1))

Based on the plot in Figure 3, we can make the following comments on the characteristics of the series:

There might be a slight downward trend, especially in the beginning of the series.

There is a clear seasonality, while the pattern changes overtime, we can say that lower values are observed in July and August and higher values in December-January.

The existence of changing variance and behaviour of the series is not apparent due to seasonality.

There are no obvious intervention points.

To further explore the trend and seasonality components in precipitation series, we will create a sample ACF plot and conduct an ADF test over the series.

par(mfrow=c(2,2))
acf(Ts_tempo,lag.max = 48, main = "Fig.4.1 ACF plot of Temperature on FFD series")
adf.test(Ts_tempo,k=ar(Ts_tempo)$order)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  Ts_tempo
## Dickey-Fuller = -1.1484, Lag order = 2, p-value = 0.9002
## alternative hypothesis: stationary

acf(Ts_Rain,lag.max = 48, main = "Fig 4.2 ACF plot of Rain on FFD series")
adf.test(Ts_Rain,k=ar(Ts_Rain)$order)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  Ts_Rain
## Dickey-Fuller = -4.5622, Lag order = 0, p-value = 0.01
## alternative hypothesis: stationary

acf(Ts_Rad,lag.max = 48, main = "Fig.4.3 ACF plot of Radiation on FFD series")
adf.test(Ts_Rad,k=ar(Ts_Rad)$order)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  Ts_Rad
## Dickey-Fuller = -2.7317, Lag order = 4, p-value = 0.2911
## alternative hypothesis: stationary

acf(Ts_Hum,lag.max = 48, main = "Fig 4.4 ACF plot of Humidity on FFD series")

adf.test(Ts_Hum,k=ar(Ts_Hum)$order)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  Ts_Hum
## Dickey-Fuller = -4.5749, Lag order = 0, p-value = 0.01
## alternative hypothesis: stationary

par(mfrow=c(1,1))

From Figure 4, we can observe that there is a slight seasonal pattern in temperature, a decaying pattern in rain and decaying seasonal lags in radiation and humidity also suggests the possible existence of trend. The ADF test reports 1)lag value 2 and p-value = 0.9002 > 0.05 which suggests the series is nonstationary at 5% level of significance. 2)lag value 0 and p-value = 0.01 < 0.05 which suggests the series is stationary at 5% level of significance. 3)lag value 4 and p-value = 0.2911 > 0.05 which suggests the series is nonstationary at 5% level of significance.4)lag value 0 and p-value = 0.01 < 0.05 which suggests the series is stationary at 5% level of significance.

To clearly display the dependent radiation series versus the explanatory precipitation series within the same plot, we will standardise the data. The following code creates a time series plot to explore the relationship of the series.

# scaling of data

shift<- scale(ffdata_ts)
plot(shift, plot.type="s",col=c("Red", "Blue", "Brown","Black","Green"),main= "Fig.5 FFD rate versus factor affecting ffd wrt time(Scaled)")
legend("bottomright", lty=1, text.width = 7, col = c("Red", "Blue", "Brown","Black","Green"), c("Temperature", "Rain", "Radiation", "Humidity","FFD"))

The plot in Figure 5 shows that the dependent and the independent series are likely to be negatively correlated. High values of radiation correspond to low values of precipitation and vice versa.

#We also calculate the correlation coefficient to check the relationship.

cor(ffdata_ts)

##                ï..Year  Temperature   Rainfall   Radiation  RelHumidity
## ï..Year      1.0000000  0.148410676 -0.1752091  0.11881829  0.206355767
## Temperature  0.1484107  1.000000000  0.3933255 -0.24096625  0.009646021
## Rainfall    -0.1752091  0.393325545  1.0000000 -0.58131610  0.338461007
## Radiation    0.1188183 -0.240966245 -0.5813161  1.00000000 -0.055209652
## RelHumidity  0.2063558  0.009646021  0.3384610 -0.05520965  1.000000000
## FFD         -0.2329975 -0.247933708  0.0506911  0.04677758 -0.128502440
##                     FFD
## ï..Year     -0.23299747
## Temperature -0.24793371
## Rainfall     0.05069110
## Radiation    0.04677758
## RelHumidity -0.12850244
## FFD          1.00000000

cor(Ts_tempo,Ts_FFD)

## [1] -0.2479337

cor(Ts_Rain,Ts_FFD)

## [1] 0.0506911

cor(Ts_Rad,Ts_FFD)

## [1] 0.04677758

cor(Ts_Hum,Ts_FFD)

## [1] -0.1285024

The correlation coefficient is reported as FFD have r=−0.2479 wrt temperature which suggests a moderate negative correlation between the series and confirms the conclusion made from the plot in Figure 5. After we have explored the characteristics of the individual series and found the evidence of relationship between them, we proceed to modelling stage.

**Time series regression methods ##Finite distributed lag model To find a suitable model for forecasting solar radiation values, we will try fitting distributed lag models which include an independent explanatory series and its lags to help explain the overall variation and correlation structure in our dependent series.

To specify the finite lag length for the model, we create a loop that computes accuracy measures like AIC/BIC and MASE for the models with different lag lengths and select a model with the lowest values.

for (i in 1:10){
  model1 <- dlm(x = as.vector(ffdata$Temperature)+as.vector(ffdata$Rainfall)+as.vector(ffdata$Radiation)+as.vector(ffdata$RelHumidity), y = ffdata$FFD, q = i)
  cat("q =", i, "AIC =", AIC(model1$model), "BIC =", BIC(model1$model), "MASE =", MASE(model1)$MASE, "\n")
}

## q = 1 AIC = 281.5591 BIC = 287.1639 MASE = 0.6780763 
## q = 2 AIC = 273.1301 BIC = 279.9665 MASE = 0.6577795 
## q = 3 AIC = 266.1371 BIC = 274.1303 MASE = 0.6544162 
## q = 4 AIC = 259.4614 BIC = 268.5323 MASE = 0.6307995 
## q = 5 AIC = 248.5012 BIC = 258.566 MASE = 0.6235341 
## q = 6 AIC = 238.0214 BIC = 248.9912 MASE = 0.5711917 
## q = 7 AIC = 231.115 BIC = 242.8955 MASE = 0.5544822 
## q = 8 AIC = 219.0449 BIC = 231.5353 MASE = 0.4915635 
## q = 9 AIC = 209.6412 BIC = 222.7337 MASE = 0.4791517 
## q = 10 AIC = 202.0655 BIC = 215.6443 MASE = 0.4567299

It is observed that the values of information criteria as well as MASE decrease as the lag q increases, so we will fit a finite DLM with a number of lags = 10.

Finite dlm

1)Temperature

ftem_dlm <- dlm(x = ffdata$FFD, y = ffdata$Temperature, q=10)
summary(ftem_dlm)

## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.40860 -0.19427 -0.03581  0.18360  0.47061 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 13.5227528  2.4798858   5.453 0.000404 ***
## x.t         -0.0025541  0.0042765  -0.597 0.565061    
## x.1         -0.0032419  0.0042724  -0.759 0.467381    
## x.2         -0.0045529  0.0044176  -1.031 0.329614    
## x.3          0.0018560  0.0043032   0.431 0.676395    
## x.4          0.0009147  0.0042016   0.218 0.832518    
## x.5         -0.0008784  0.0037899  -0.232 0.821907    
## x.6          0.0056978  0.0045033   1.265 0.237563    
## x.7         -0.0006914  0.0043958  -0.157 0.878495    
## x.8         -0.0073904  0.0043644  -1.693 0.124640    
## x.9         -0.0013518  0.0041645  -0.325 0.752907    
## x.10        -0.0072004  0.0041466  -1.736 0.116496    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3893 on 9 degrees of freedom
## Multiple R-squared:  0.5724, Adjusted R-squared:  0.0497 
## F-statistic: 1.095 on 11 and 9 DF,  p-value: 0.4532
## 
## AIC and BIC values for the model:
##        AIC      BIC
## 1 28.17827 41.75706

vif(ftem_dlm$model)

##      x.t      x.1      x.2      x.3      x.4      x.5      x.6      x.7 
## 1.582467 1.618708 1.628589 1.475913 1.406887 1.177330 1.644082 1.580178 
##      x.8      x.9     x.10 
## 1.595176 1.507594 1.485141

From the temperature series, we obtained Adjusted R-squared: 0.0497,p-value: 0.4532 >0.05 and AIC:28.17827

2)Rain

frain_dlm <- dlm(x = ffdata$FFD, y = ffdata$Rainfall, q=10)
summary(frain_dlm)

## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.62480 -0.13413 -0.05632  0.12456  0.86560 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  1.3362531  2.9267216   0.457    0.659
## x.t          0.0017720  0.0050470   0.351    0.734
## x.1          0.0012178  0.0050423   0.242    0.815
## x.2          0.0057334  0.0052136   1.100    0.300
## x.3          0.0002185  0.0050786   0.043    0.967
## x.4         -0.0059988  0.0049586  -1.210    0.257
## x.5         -0.0007208  0.0044728  -0.161    0.876
## x.6          0.0045630  0.0053148   0.859    0.413
## x.7          0.0020808  0.0051878   0.401    0.698
## x.8         -0.0007027  0.0051509  -0.136    0.894
## x.9          0.0015957  0.0049149   0.325    0.753
## x.10        -0.0052402  0.0048938  -1.071    0.312
## 
## Residual standard error: 0.4594 on 9 degrees of freedom
## Multiple R-squared:  0.4289, Adjusted R-squared:  -0.2692 
## F-statistic: 0.6143 on 11 and 9 DF,  p-value: 0.7798
## 
## AIC and BIC values for the model:
##        AIC      BIC
## 1 35.13643 48.71522

vif(frain_dlm$model)

##      x.t      x.1      x.2      x.3      x.4      x.5      x.6      x.7 
## 1.582467 1.618708 1.628589 1.475913 1.406887 1.177330 1.644082 1.580178 
##      x.8      x.9     x.10 
## 1.595176 1.507594 1.485141

From the temperature series, we obtained Adjusted R-squared: -0.2692,p-value: 0.7798 >0.05 and AIC:35.13643

3)Radiation

frad_dlm <- dlm(x = ffdata$FFD, y = ffdata$Radiation, q=10)
summary(frad_dlm)

## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.63400 -0.18651 -0.02998  0.18646  0.42512 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 19.108071   2.373738   8.050 2.11e-05 ***
## x.t         -0.001632   0.004093  -0.399   0.6994    
## x.1         -0.006388   0.004090  -1.562   0.1527    
## x.2          0.001004   0.004228   0.237   0.8176    
## x.3         -0.004638   0.004119  -1.126   0.2893    
## x.4         -0.001052   0.004022  -0.262   0.7996    
## x.5         -0.003036   0.003628  -0.837   0.4244    
## x.6         -0.004475   0.004311  -1.038   0.3262    
## x.7         -0.008065   0.004208  -1.917   0.0875 .  
## x.8          0.001065   0.004178   0.255   0.8045    
## x.9          0.000136   0.003986   0.034   0.9735    
## x.10         0.005649   0.003969   1.423   0.1884    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3726 on 9 degrees of freedom
## Multiple R-squared:  0.6004, Adjusted R-squared:  0.1121 
## F-statistic: 1.229 on 11 and 9 DF,  p-value: 0.3843
## 
## AIC and BIC values for the model:
##        AIC     BIC
## 1 26.34091 39.9197

vif(frad_dlm$model)

##      x.t      x.1      x.2      x.3      x.4      x.5      x.6      x.7 
## 1.582467 1.618708 1.628589 1.475913 1.406887 1.177330 1.644082 1.580178 
##      x.8      x.9     x.10 
## 1.595176 1.507594 1.485141

From the temperature series, we obtained Adjusted R-squared: 0.1121,p-value: 0.3843 >0.05 and AIC:26.34091

4)Humidity

fhum_dlm <- dlm(x = ffdata$FFD, y = ffdata$RelHumidity, q=10)
summary(fhum_dlm)

## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.3664 -0.3400 -0.1628  0.5044  1.3298 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 98.209806   5.866260  16.741 4.33e-08 ***
## x.t         -0.010228   0.010116  -1.011    0.338    
## x.1         -0.003821   0.010107  -0.378    0.714    
## x.2          0.007267   0.010450   0.695    0.504    
## x.3          0.008712   0.010179   0.856    0.414    
## x.4         -0.009341   0.009939  -0.940    0.372    
## x.5          0.008470   0.008965   0.945    0.369    
## x.6         -0.002341   0.010653  -0.220    0.831    
## x.7         -0.004620   0.010398  -0.444    0.667    
## x.8         -0.006071   0.010324  -0.588    0.571    
## x.9          0.001441   0.009851   0.146    0.887    
## x.10        -0.006567   0.009809  -0.670    0.520    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9209 on 9 degrees of freedom
## Multiple R-squared:  0.3739, Adjusted R-squared:  -0.3913 
## F-statistic: 0.4887 on 11 and 9 DF,  p-value: 0.869
## 
## AIC and BIC values for the model:
##        AIC      BIC
## 1 64.34047 77.91927

vif(fhum_dlm$model)

##      x.t      x.1      x.2      x.3      x.4      x.5      x.6      x.7 
## 1.582467 1.618708 1.628589 1.475913 1.406887 1.177330 1.644082 1.580178 
##      x.8      x.9     x.10 
## 1.595176 1.507594 1.485141

From the temperature series, we obtained Adjusted R-squared: -0.3913,p-value: 0.869 >0.05 and AIC:64.34047

From the above all dlm models, radiation sries have given considerably better results than others. According to the significance tests of model coefficients obtained from the summary, nearly all lag weights of a predictor series are not statistically significant at 5% level. The adjusted R2 for finite_dlm is 0.1121, which means that the model explains only 11% of the variability in radiation. F-test of the overall significance of the model reports the model is statistically significant at 5% level (p-value < 0.05). However, we conclude that the model is not a good fit to the data due to insignificant terms and low explainability.

There is no issue with multicollinearity in the model, VIF values are reported < 10.

##residual check loop
residualcheck <- function(x){
  checkresiduals(x)
  #bgtest(x)
  shapiro.test(x$residuals)
}

Polynomial distributed lag model

Univariate poly modelling for all features 1)Temperature

temp_polyd <- polyDlm(x=as.vector(ffdata$Temperature), y=as.vector(ffdata$FFD), q=10,k=2)

## Estimates and t-tests for beta coefficients:
##         Estimate Std. Error  t value P(>|t|)
## beta.0   -6.2400       9.09 -0.68600   0.507
## beta.1   -0.6310       6.86 -0.09200   0.928
## beta.2    3.7500       5.93  0.63100   0.541
## beta.3    6.8900       5.90  1.17000   0.267
## beta.4    8.8100       6.10  1.44000   0.177
## beta.5    9.5000       6.15  1.55000   0.151
## beta.6    8.9700       5.96  1.51000   0.160
## beta.7    7.2000       5.73  1.26000   0.235
## beta.8    4.2000       6.03  0.69700   0.500
## beta.9   -0.0216       7.49 -0.00289   0.998
## beta.10  -5.4700      10.30 -0.53200   0.605

summary(temp_polyd)

## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -41.641 -14.230  -5.104  16.274  43.700 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -141.0618   533.0958  -0.265    0.794
## z.t0          -6.2363     9.0912  -0.686    0.502
## z.t1           6.2197     3.9083   1.591    0.130
## z.t2          -0.6143     0.3942  -1.559    0.138
## 
## Residual standard error: 25.48 on 17 degrees of freedom
## Multiple R-squared:  0.1583, Adjusted R-squared:  0.009816 
## F-statistic: 1.066 on 3 and 17 DF,  p-value: 0.3896

##
residualcheck(temp_polyd$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.96973, p-value = 0.7269

checkresiduals(temp_polyd$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 7
## 
## data:  Residuals
## LM test = 6.2838, df = 7, p-value = 0.507

From the temperature series, we obtained Adjusted R-squared: 0.009816,p-value: 0.3896 >0.05

2)Rainfall

rain_polyd <- polyDlm(x=as.vector(ffdata$Rainfall), y=as.vector(ffdata$FFD), q=10,k=2)

## Estimates and t-tests for beta coefficients:
##         Estimate Std. Error t value P(>|t|)
## beta.0     -3.23      10.90  -0.297   0.772
## beta.1      2.47       7.22   0.341   0.739
## beta.2      6.48       5.64   1.150   0.275
## beta.3      8.82       5.69   1.550   0.149
## beta.4      9.47       6.06   1.560   0.146
## beta.5      8.44       5.99   1.410   0.187
## beta.6      5.73       5.35   1.070   0.308
## beta.7      1.33       4.64   0.287   0.779
## beta.8     -4.74       5.38  -0.882   0.397
## beta.9    -12.50       8.55  -1.460   0.171
## beta.10   -21.90      13.60  -1.620   0.134

summary(rain_polyd)

## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -46.494 -10.638  -2.134  15.543  46.891 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept) 209.0810   100.8835   2.073   0.0537 .
## z.t0         -3.2319    10.8654  -0.297   0.7697  
## z.t1          6.5390     5.4708   1.195   0.2484  
## z.t2         -0.8410     0.5707  -1.474   0.1589  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 25.4 on 17 degrees of freedom
## Multiple R-squared:  0.1636, Adjusted R-squared:  0.01605 
## F-statistic: 1.109 on 3 and 17 DF,  p-value: 0.3729

residualcheck(rain_polyd$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.98714, p-value = 0.9902

checkresiduals(rain_polyd$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 7
## 
## data:  Residuals
## LM test = 14.791, df = 7, p-value = 0.03878

From the temperature series, we obtained Adjusted R-squared: 0.01605,p-value: 0.3729 >0.05

3)Radiation

rad_polyd <- polyDlm(x=as.vector(ffdata$Radiation), y=as.vector(ffdata$FFD), q=10,k=2)

## Estimates and t-tests for beta coefficients:
##         Estimate Std. Error  t value P(>|t|)
## beta.0   -0.4200       8.60 -0.04880   0.962
## beta.1   -0.6790       5.50 -0.12300   0.904
## beta.2   -0.6930       3.82 -0.18200   0.859
## beta.3   -0.4620       3.58 -0.12900   0.900
## beta.4    0.0143       3.90  0.00367   0.997
## beta.5    0.7360       4.03  0.18300   0.858
## beta.6    1.7000       3.80  0.44800   0.663
## beta.7    2.9100       3.57  0.81500   0.432
## beta.8    4.3700       4.28  1.02000   0.329
## beta.9    6.0700       6.50  0.93300   0.371
## beta.10   8.0200      10.00  0.80000   0.440

summary(rad_polyd)

## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -46.417 -11.818  -1.942  17.330  51.949 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -106.9869   505.1443  -0.212    0.835
## z.t0          -0.4198     8.6001  -0.049    0.962
## z.t1          -0.3816     4.1407  -0.092    0.928
## z.t2           0.1225     0.4193   0.292    0.774
## 
## Residual standard error: 26.9 on 17 degrees of freedom
## Multiple R-squared:  0.06187,    Adjusted R-squared:  -0.1037 
## F-statistic: 0.3737 on 3 and 17 DF,  p-value: 0.773

residualcheck(rad_polyd$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.97843, p-value = 0.9012

checkresiduals(rad_polyd$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 7
## 
## data:  Residuals
## LM test = 6.0156, df = 7, p-value = 0.5379

From the temperature series, we obtained Adjusted R-squared: -0.1037,p-value: 0.773 >0.05 4)Humidity

hum_polyd <- polyDlm(x=as.vector(ffdata$RelHumidity), y=as.vector(ffdata$FFD), q=10,k=2)

## Estimates and t-tests for beta coefficients:
##         Estimate Std. Error t value P(>|t|)
## beta.0   -8.8300       5.30 -1.6600  0.1240
## beta.1   -7.1000       3.82 -1.8600  0.0898
## beta.2   -5.5600       3.13 -1.7800  0.1030
## beta.3   -4.2000       3.07 -1.3700  0.1990
## beta.4   -3.0300       3.24 -0.9330  0.3710
## beta.5   -2.0400       3.35 -0.6080  0.5560
## beta.6   -1.2300       3.32 -0.3720  0.7170
## beta.7   -0.6160       3.25 -0.1900  0.8530
## beta.8   -0.1830       3.43 -0.0534  0.9580
## beta.9    0.0648       4.21  0.0154  0.9880
## beta.10   0.1270       5.74  0.0222  0.9830

summary(hum_polyd)

## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -37.56 -16.91  -6.12  13.94  43.19 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3290.68298 2900.02326   1.135    0.272
## z.t0          -8.83169    5.30493  -1.665    0.114
## z.t1           1.82201    2.32542   0.784    0.444
## z.t2          -0.09261    0.22708  -0.408    0.688
## 
## Residual standard error: 25.03 on 17 degrees of freedom
## Multiple R-squared:  0.1878, Adjusted R-squared:  0.04447 
## F-statistic:  1.31 on 3 and 17 DF,  p-value: 0.3036

## 
residualcheck(hum_polyd$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.95686, p-value = 0.4553

checkresiduals(hum_polyd$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 7
## 
## data:  Residuals
## LM test = 5.8819, df = 7, p-value = 0.5536

From the temperature series, we obtained Adjusted R-squared: 0.04447,p-value: 0.3036 >0.05 The analysis of residuals from polynomial model in Figure 7 shows the following:

The errors are not randomly spread. There are a lot of highly significant lags in the ACF plot as well as a wavy pattern at seasonal lags, so there is autocorrelation and seasonality still present in the residuals. Beusch-Godfrey test reports a p-value < 0.05, therefore there is serial correlation in the residuals at 5% level of significance.

The normality of the residuals is also violated, as observed from the histogram and Shapiro-Wilk normality test report (p-value < 0.05).

Overall, we can conclude that the second order polynomial of lag 10 is not successful at capturing the autocorrelation and seasonality in the series and has low explainability.

Koyck transformation

We will implement Koyck transformation model with precipitation predictor series as follows.

First we design multivariate model and then univariate models for each parameter

K_trans = koyckDlm(x=as.vector(ffdata$Temperature)+as.vector(ffdata$Rainfall)+as.vector(ffdata$Radiation)+as.vector(ffdata$RelHumidity), y=as.vector(ffdata$FFD))
summary(K_trans$model, diagnostics=T)

## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -47.998 -14.267  -3.171  17.086  44.040 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  963.41380 3083.76190   0.312    0.757
## Y.1           -0.05027    0.26344  -0.191    0.850
## X.t           -6.14434   25.16739  -0.244    0.809
## 
## Diagnostic tests:
##                  df1 df2 statistic p-value
## Weak instruments   1  27     0.826   0.371
## Wu-Hausman         1  26     0.012   0.914
## Sargan             0  NA        NA      NA
## 
## Residual standard error: 24.54 on 27 degrees of freedom
## Multiple R-Squared: 0.007908,    Adjusted R-squared: -0.06558 
## Wald test: 0.0304 on 2 and 27 DF,  p-value: 0.9701

vif(K_trans$model)

##      Y.1      X.t 
## 1.845706 1.845706

1)Temperature

temp_Koyck <- koyckDlm(x=as.vector(ffdata$Temperature), y=as.vector(ffdata$FFD))
summary(temp_Koyck)

## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -57.755 -12.972  -4.079  17.329  58.541 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -19.87216  608.22407  -0.033    0.974
## Y.1           0.03846    0.25384   0.152    0.881
## X.t          23.22042   61.08917   0.380    0.707
## 
## Residual standard error: 28.38 on 27 degrees of freedom
## Multiple R-Squared: -0.3272, Adjusted R-squared: -0.4255 
## Wald test: 0.07269 on 2 and 27 DF,  p-value: 0.9301 
## 
## Diagnostic tests:
## NULL
## 
##                              alpha     beta       phi
## Geometric coefficients:  -20.66698 23.22042 0.0384583

vif(temp_Koyck$model, diagnostics =T)

##      Y.1      X.t 
## 1.280988 1.280988

From the temperature series, we obtained Adjusted R-squared: -0.4255,p-value: 0.9301 >0.05

2)Rain

rain_Koyck <- koyckDlm(x=as.vector(ffdata$Rainfall), y=as.vector(ffdata$FFD))
summary(rain_Koyck)

## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -58.691 -21.222   2.697  14.856  68.192 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.266e+02  2.185e+02   0.579    0.567
## Y.1         4.591e-03  2.196e-01   0.021    0.983
## X.t         3.448e+01  8.772e+01   0.393    0.697
## 
## Residual standard error: 27.55 on 27 degrees of freedom
## Multiple R-Squared: -0.2505, Adjusted R-squared: -0.3431 
## Wald test: 0.07773 on 2 and 27 DF,  p-value: 0.9254 
## 
## Diagnostic tests:
## NULL
## 
##                             alpha     beta         phi
## Geometric coefficients:  127.2254 34.48095 0.004590669

vif(rain_Koyck$model,diagnostics =T)

##      Y.1      X.t 
## 1.017508 1.017508

From the temperature series, we obtained Adjusted R-squared: -0.3431,p-value: 0.9254 >0.05 3)Radiation

rad_Koyck <- koyckDlm(x=as.vector(ffdata$Radiation), y=as.vector(ffdata$FFD))
summary(rad_Koyck)

## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -55.229 -19.662   3.956  16.232  54.756 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 418.31843  384.12678   1.089    0.286
## Y.1          -0.03254    0.20729  -0.157    0.876
## X.t         -13.87153   25.49529  -0.544    0.591
## 
## Residual standard error: 25.54 on 27 degrees of freedom
## Multiple R-Squared: -0.07436,    Adjusted R-squared: -0.1539 
## Wald test: 0.1486 on 2 and 27 DF,  p-value: 0.8626 
## 
## Diagnostic tests:
## NULL
## 
##                             alpha      beta         phi
## Geometric coefficients:  405.1353 -13.87153 -0.03254005

vif(rad_Koyck$model)

##      Y.1      X.t 
## 1.055257 1.055257

From the temperature series, we obtained Adjusted R-squared: -0.1539,p-value: 0.8626 >0.05

humidity

hum_Koyck <- koyckDlm(x=as.vector(ffdata$RelHumidity), y=as.vector(ffdata$FFD))
summary(hum_Koyck)

## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -46.787 -14.896  -3.024  15.673  55.019 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1898.18932 4211.29973   0.451    0.656
## Y.1           -0.05904    0.25016  -0.236    0.815
## X.t          -17.72952   44.24103  -0.401    0.692
## 
## Residual standard error: 27 on 27 degrees of freedom
## Multiple R-Squared: -0.2016, Adjusted R-squared: -0.2906 
## Wald test: 0.0808 on 2 and 27 DF,  p-value: 0.9226 
## 
## Diagnostic tests:
## NULL
## 
##                             alpha      beta         phi
## Geometric coefficients:  1792.364 -17.72952 -0.05904222

##From the temperature series, we obtained Adjusted R-squared:  -0.2906,p-value: 0.9226 >0.05 
vif(hum_Koyck$model)

##      Y.1      X.t 
## 1.374145 1.374145

From the model summary, we can conclude that all terms of Koyck model are not significant at 5% level. The model is reported to be overall statistically nonsignificant at 5% level (p-value > 0.05) and its adjusted R2 is negative which means the model explains about negative variability in ffd

According to the Weak instruments test (p-value > 0.05), the model at the first stage of least-squares estimation is notsignificant at 5% level.

From the Wu-Hausman test (p-value > 0.05), we can conclude that there is no significant correlation between the explanatory variable and the error term at 5% level. There is no effect of multicollinearity as all VIFs are less than 10.

par(mfrow=c(1,2))
#residualcheck(temp_Koyck$model)
checkresiduals(temp_Koyck$model)

par(mfrow=c(1,1))
par(mfrow=c(1,2))
residualcheck(rain_Koyck$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.98582, p-value = 0.9504

checkresiduals(rain_Koyck$model)

par(mfrow=c(1,1))
par(mfrow=c(1,2))
residualcheck(rad_Koyck$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.98748, p-value = 0.9716

checkresiduals(rad_Koyck$model)

par(mfrow=c(1,1))
par(mfrow=c(1,2))
residualcheck(hum_Koyck$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.98176, p-value = 0.8702

checkresiduals(hum_Koyck$model)

par(mfrow=c(1,1))

From the residual analysis in Figure 8, we can conclude the following:

The errors are not spread randomly.

All the lags in ACF plot are significant and have a wave-like pattern, which suggests serial correlation and seasonality remaining in the residuals.

The errors are not normal. The histogram and the Shapiro-Wilk normality test with p-value < 0.05 suggest not normal residuals.

Overall, we can conclude that the Koyck model is also not successful at capturing the autocorrelation and seasonality in the series.

Autoregressive distributed lag models The final model type from time series regression method is Autoregressive distributed lag models. For specifying the parameters of ARDL(p,q), we create a loop that fits autoregressive DLMs for a range of lag lengths and orders of the AR process and obtains their accuracy measures, like AIC/BIC and MASE.

Three models with lowest values of MASE were chosen for fitting and analysis. The models were:

ARDL(3,5)

ARDL(4,5)

ARDL(5,5)

We create a loop to fit these candidate models and do residual analysis in a dynamical way.

for (i in 1:5){
  for(j in 1:5){
    model2 = ardlDlm(x = as.vector(ffdata$Temperature)+as.vector(ffdata$Rainfall)+as.vector(ffdata$Radiation)+as.vector(ffdata$RelHumidity), y = as.vector(ffdata$FFD), p = i , q = j)
    cat("p =", i, "q =", j, "AIC =", AIC(model2$model), "BIC =", BIC(model2$model), "MASE =", MASE(model2)$MASE, "\n")
  }
}

## p = 1 q = 1 AIC = 283.5269 BIC = 290.5329 MASE = 0.6795852 
## p = 1 q = 2 AIC = 276.4449 BIC = 284.6487 MASE = 0.6858828 
## p = 1 q = 3 AIC = 269.4214 BIC = 278.7469 MASE = 0.6627738 
## p = 1 q = 4 AIC = 262.8803 BIC = 273.247 MASE = 0.6451817 
## p = 1 q = 5 AIC = 255.9729 BIC = 267.2957 MASE = 0.6432902 
## p = 2 q = 1 AIC = 275.0869 BIC = 283.2907 MASE = 0.6547865 
## p = 2 q = 2 AIC = 276.9991 BIC = 286.5702 MASE = 0.6507562 
## p = 2 q = 3 AIC = 270.1351 BIC = 280.7927 MASE = 0.6370252 
## p = 2 q = 4 AIC = 262.9649 BIC = 274.6274 MASE = 0.6008803 
## p = 2 q = 5 AIC = 255.8111 BIC = 268.3921 MASE = 0.5647418 
## p = 3 q = 1 AIC = 268.1205 BIC = 277.446 MASE = 0.6534486 
## p = 3 q = 2 AIC = 270.0886 BIC = 280.7462 MASE = 0.6510693 
## p = 3 q = 3 AIC = 271.8702 BIC = 283.86 MASE = 0.6428087 
## p = 3 q = 4 AIC = 264.8975 BIC = 277.8559 MASE = 0.6035375 
## p = 3 q = 5 AIC = 257.5522 BIC = 271.3913 MASE = 0.5623997 
## p = 4 q = 1 AIC = 261.3992 BIC = 271.7659 MASE = 0.6330189 
## p = 4 q = 2 AIC = 263.3272 BIC = 274.9897 MASE = 0.6237364 
## p = 4 q = 3 AIC = 264.9645 BIC = 277.9228 MASE = 0.6073044 
## p = 4 q = 4 AIC = 266.7086 BIC = 280.9628 MASE = 0.598053 
## p = 4 q = 5 AIC = 259.5419 BIC = 274.639 MASE = 0.5621408 
## p = 5 q = 1 AIC = 250.4857 BIC = 261.8085 MASE = 0.6196805 
## p = 5 q = 2 AIC = 252.4857 BIC = 265.0666 MASE = 0.6196831 
## p = 5 q = 3 AIC = 254.4675 BIC = 268.3065 MASE = 0.620152 
## p = 5 q = 4 AIC = 255.8517 BIC = 270.9489 MASE = 0.6083909 
## p = 5 q = 5 AIC = 256.0737 BIC = 272.4289 MASE = 0.5493301

for (i in c(3,4,5)){
  ardl <- ardlDlm(x = as.vector(ffdata$Temperature)+as.vector(ffdata$Rainfall)+as.vector(ffdata$Radiation)+as.vector(ffdata$RelHumidity), y = as.vector(ffdata$FFD), p = i, q = 5)
  summary(ardl)
  #bgtest(ardl$model)
  
}

## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -63.367 -10.203  -0.015  18.350  36.453 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1477.26374 1365.76665   1.082    0.295
## X.t           -4.64092    6.28682  -0.738    0.471
## X.1            2.58737    6.76133   0.383    0.707
## X.2           -6.30493    6.42318  -0.982    0.341
## X.3           -2.46544    6.16105  -0.400    0.694
## Y.1           -0.06198    0.25392  -0.244    0.810
## Y.2            0.08573    0.27328   0.314    0.758
## Y.3           -0.15581    0.27753  -0.561    0.582
## Y.4            0.09237    0.27418   0.337    0.741
## Y.5            0.23504    0.26418   0.890    0.387
## 
## Residual standard error: 28.61 on 16 degrees of freedom
## Multiple R-squared:  0.1318, Adjusted R-squared:  -0.3565 
## F-statistic:  0.27 on 9 and 16 DF,  p-value: 0.9741
## 
## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -63.26 -10.12   0.49  17.61  36.88 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1534.76078 1594.70318   0.962    0.351
## X.t           -4.55557    6.58510  -0.692    0.500
## X.1            2.53173    7.01876   0.361    0.723
## X.2           -6.46004    6.92989  -0.932    0.366
## X.3           -2.31162    6.66630  -0.347    0.734
## X.4           -0.48565    6.28801  -0.077    0.939
## Y.1           -0.06130    0.26234  -0.234    0.818
## Y.2            0.08177    0.28681   0.285    0.779
## Y.3           -0.15693    0.28695  -0.547    0.593
## Y.4            0.09315    0.28330   0.329    0.747
## Y.5            0.22827    0.28653   0.797    0.438
## 
## Residual standard error: 29.54 on 15 degrees of freedom
## Multiple R-squared:  0.1322, Adjusted R-squared:  -0.4464 
## F-statistic: 0.2285 on 10 and 15 DF,  p-value: 0.9884
## 
## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -46.322 -10.557   1.106  13.976  41.797 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -3.650e+02  1.819e+03  -0.201   0.8438  
## X.t         -1.865e+00  6.313e+00  -0.295   0.7720  
## X.1         -2.276e-01  6.715e+00  -0.034   0.9734  
## X.2         -4.653e+00  6.534e+00  -0.712   0.4880  
## X.3          1.302e+00  6.524e+00   0.200   0.8447  
## X.4         -1.937e+00  5.914e+00  -0.328   0.7481  
## X.5          1.148e+01  6.344e+00   1.810   0.0918 .
## Y.1         -3.483e-02  2.449e-01  -0.142   0.8889  
## Y.2         -1.330e-03  2.712e-01  -0.005   0.9962  
## Y.3          4.191e-03  2.818e-01   0.015   0.9883  
## Y.4          1.405e-01  2.653e-01   0.530   0.6048  
## Y.5          2.666e-01  2.678e-01   0.995   0.3364  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 27.52 on 14 degrees of freedom
## Multiple R-squared:  0.2968, Adjusted R-squared:  -0.2558 
## F-statistic: 0.5371 on 11 and 14 DF,  p-value: 0.8474

From model summaries, we can conclude that all the fitted ARDL models were reported to be statistically significant at 5% level with p-value < 0.05. All models have an adjusted R2=0.933, which means they explain about 93.3% of the variability in radiation.

Regarding model coefficient estimates, we can observe for ARDL(3,5) only X.2 lag of predictor series is significant at 5% level (p-value = 0.0393 < 0.05), for ARDL(4,5) only X.4 lag of predictor series is significant at 5% level (p-value = 03014 < 0.05), and all lags of predictor series are not statistically significant at 5% level for ARDL(5,5). All lags of independent series are statistically significant in all models except Y.2 (p-value = 0.7829 > 0.05).

The plots from diagnostic checking in Figure 9 show that there is a very similar overall picture in residuals from all three fitted models:

The residuals are not as randomly spread as desired, they show evidence of changing variance.

There are a some highly significant lags in the ACF plot. The seasonal lags are also highly significant. Therefore, there is autocorrelation and seasonality still present in the residuals.

Beusch-Godfrey test reports a p-value < 0.05, therefore there is serial correlation in the residuals at 5% level of significance.

Long tails on the histogram of residuals suggest the normality of the residuals is violated.

Based on the observation about model estimates made earlier, we can try to decrease the number of lags for predictor series. We will fit ARDL(1,5) and perform diagnostic checking.

ardl_15 <- ardlDlm(x = as.vector(ffdata$Temperature)+as.vector(ffdata$Rainfall)+as.vector(ffdata$Radiation)+as.vector(ffdata$RelHumidity), y = as.vector(ffdata$FFD), p=1, q=5)
summary(ardl_15)

## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -53.157 -14.444  -0.465  19.927  47.433 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 362.62412  872.79704   0.415    0.683
## X.t          -2.19478    5.89426  -0.372    0.714
## X.1           0.50563    5.89138   0.086    0.933
## Y.1          -0.01082    0.24148  -0.045    0.965
## Y.2           0.07149    0.26885   0.266    0.793
## Y.3          -0.05385    0.25949  -0.208    0.838
## Y.4           0.05257    0.25078   0.210    0.836
## Y.5           0.17975    0.25215   0.713    0.485
## 
## Residual standard error: 28.26 on 18 degrees of freedom
## Multiple R-squared:  0.04712,    Adjusted R-squared:  -0.3234 
## F-statistic: 0.1272 on 7 and 18 DF,  p-value: 0.9951

residualcheck(ardl_15$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.98628, p-value = 0.9724

#temperature
ardl_temp15 <- ardlDlm(x = as.vector(ffdata$Temperature), y = as.vector(ffdata$FFD), p=1, q=5)
summary(ardl_temp15)

## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -55.363 -17.901  -1.096  15.657  45.469 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) 246.044968 196.309815   1.253    0.226
## X.t         -20.508096  16.166491  -1.269    0.221
## X.1          10.580366  16.043160   0.659    0.518
## Y.1          -0.005384   0.231828  -0.023    0.982
## Y.2           0.060271   0.234783   0.257    0.800
## Y.3          -0.041509   0.235489  -0.176    0.862
## Y.4           0.168807   0.239335   0.705    0.490
## Y.5           0.087847   0.249085   0.353    0.728
## 
## Residual standard error: 27.15 on 18 degrees of freedom
## Multiple R-squared:  0.1206, Adjusted R-squared:  -0.2214 
## F-statistic: 0.3525 on 7 and 18 DF,  p-value: 0.9179

residualcheck(ardl_temp15$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.98258, p-value = 0.9233

##Based on the observation about model ,p-value: 0.9179 >0.05, Residual standard error: 27.15 and Adjusted R-squared:  -0.2214 

#rainfall
ardl_rain15 <- ardlDlm(x = as.vector(ffdata$Rainfall), y = as.vector(ffdata$FFD), p=1, q=5)
summary(ardl_rain15)

## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -51.777 -12.609  -1.348  16.712  45.413 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) 143.617855 109.253360   1.315    0.205
## X.t           1.677519  16.237143   0.103    0.919
## X.1           9.943103  16.455578   0.604    0.553
## Y.1           0.003725   0.233996   0.016    0.987
## Y.2           0.031718   0.279802   0.113    0.911
## Y.3          -0.136841   0.280969  -0.487    0.632
## Y.4           0.080008   0.239877   0.334    0.743
## Y.5           0.199777   0.242849   0.823    0.421
## 
## Residual standard error: 28.04 on 18 degrees of freedom
## Multiple R-squared:  0.06175,    Adjusted R-squared:  -0.3031 
## F-statistic: 0.1692 on 7 and 18 DF,  p-value: 0.9884

residualcheck(ardl_rain15$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.9879, p-value = 0.9852

##Based on the observation about model ,p-value: 0.9884 >0.05, Residual standard error: 28.04 and Adjusted R-squared:  -0.3031 

#radiation
ardl_rad15 <- ardlDlm(x = as.vector(ffdata$Radiation), y = as.vector(ffdata$FFD), p=1, q=5)
summary(ardl_rad15)

## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -54.212 -11.254   0.832  19.508  48.780 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  7.178e+01  3.381e+02   0.212    0.834
## X.t          9.774e+00  1.597e+01   0.612    0.548
## X.1         -5.631e+00  1.535e+01  -0.367    0.718
## Y.1          3.118e-02  2.371e-01   0.131    0.897
## Y.2          4.365e-02  2.447e-01   0.178    0.860
## Y.3          7.349e-04  2.596e-01   0.003    0.998
## Y.4          8.326e-02  2.555e-01   0.326    0.748
## Y.5          2.043e-01  2.535e-01   0.806    0.431
## 
## Residual standard error: 28.06 on 18 degrees of freedom
## Multiple R-squared:  0.06065,    Adjusted R-squared:  -0.3047 
## F-statistic: 0.166 on 7 and 18 DF,  p-value: 0.9891

residualcheck(ardl_rad15$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.9804, p-value = 0.8823

##Based on the observation about model ,p-value: 0.9891 >0.05, Residual standard error: 28.06 and Adjusted R-squared:  -0.3047 

#humidity
ardl_hum15 <- ardlDlm(x = as.vector(ffdata$RelHumidity), y = as.vector(ffdata$FFD), p=1, q=5)
summary(ardl_hum15)

## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -49.635 -15.409   0.681  20.856  49.830 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) 523.482390 884.751156   0.592    0.561
## X.t          -1.938592   7.785492  -0.249    0.806
## X.1          -1.937931   7.797149  -0.249    0.807
## Y.1          -0.007652   0.238367  -0.032    0.975
## Y.2           0.037000   0.256453   0.144    0.887
## Y.3          -0.023502   0.251293  -0.094    0.927
## Y.4           0.071939   0.263173   0.273    0.788
## Y.5           0.167932   0.258863   0.649    0.525
## 
## Residual standard error: 28.23 on 18 degrees of freedom
## Multiple R-squared:  0.04908,    Adjusted R-squared:  -0.3207 
## F-statistic: 0.1327 on 7 and 18 DF,  p-value: 0.9944

residualcheck(ardl_hum15$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.98316, p-value = 0.9329

##Based on the observation about model ,p-value: 0.9944 >0.05, Residual standard error: 28.23 and Adjusted R-squared:  -0.3207

The p-value of the overall significance test is > 0.05, therefore ARDL(1,5) model is statistically not significant at 5% level. All the estimated terms are significant at 5% level except Y.2 - second lag of independent series (p-value = 0.9724 > 0.05). Adjusted R2=0.932, which means they explain 93.2% of the variability in radiation.

The plots from diagnostic checking in Figure 10 show the same picture as the diagnostic checkings in Figure 9, so the comments are the same as for previously fitted models.

Overall, none of the models from time series regression method were successful at capturing the autocorrelation and seasonal pattern in radiation series.

We create a data frame accuracy to store the accuracy measures, like AIC/BIC and MASE from the models fitted so far. The accuracy measures for further models will be appended to this data frame.

attr(K_trans$model,"class") = "lm"

#temperature
ardl_temp35 <- ardlDlm(x = (ffdata$Temperature), y = (ffdata$FFD), p=3, q=5)
ardl_temp45 <- ardlDlm(x = as.vector(ffdata$Temperature), y = as.vector(ffdata$FFD), p=4, q=5)
ardl_temp55 <- ardlDlm(x = as.vector(ffdata$Temperature), y = as.vector(ffdata$FFD), p=5, q=5)

models <- c("FFtemp_DLM", "temp_PolyD", "temp_Koyck", "ARDL_temp15", "ARDL_temp35", "ARDL_temp45", "ARDL_temp55")
aic_1 <- AIC(ftem_dlm, temp_polyd, temp_Koyck, ardl_temp15, ardl_temp35, ardl_temp45, ardl_temp55)

## [1] 28.17827

bic_1 <- BIC(ftem_dlm, temp_polyd, temp_Koyck, ardl_temp15, ardl_temp35, ardl_temp45, ardl_temp55)

## [1] 41.75706

MASE_1 <- MASE(ftem_dlm, temp_polyd, temp_Koyck, ardl_temp15, ardl_temp35, ardl_temp45, ardl_temp55)
accuracy_1 <- data.frame(models, MASE_1, aic_1, bic_1 )
colnames(accuracy_1) <- c("Model", "MASE", "AIC", "BIC")
head(accuracy_1)

##                   Model MASE       AIC      BIC       NA
## ftem_dlm     FFtemp_DLM   21 0.7001861 28.17827 41.75706
## temp_polyd   temp_PolyD   21 0.6461061 28.17827 41.75706
## temp_Koyck   temp_Koyck   30 0.7927338 28.17827 41.75706
## ardl_temp15 ARDL_temp15   26 0.6113428 28.17827 41.75706
## ardl_temp35 ARDL_temp35   26 0.6133766 28.17827 41.75706
## ardl_temp45 ARDL_temp45   26 0.6145971 28.17827 41.75706

2)rainfall

ardl_rain35 <- ardlDlm(x = (ffdata$Rainfall), y = (ffdata$FFD), p=3, q=5)
ardl_rain45 <- ardlDlm(x = as.vector(ffdata$Rainfall), y = as.vector(ffdata$FFD), p=4, q=5)
ardl_rain55 <- ardlDlm(x = as.vector(ffdata$Rainfall), y = as.vector(ffdata$FFD), p=5, q=5)
#better compared to others
models <- c("FFrain_DLM", "rain_PolyD", "rain_Koyck", "ARDL_rain15", "ARDL_rain35", "ARDL_rain45", "ARDL_rain55")
aic_2 <- AIC(frain_dlm, rain_polyd, rain_Koyck, ardl_rain15, ardl_rain35, ardl_rain45, ardl_rain55)

## [1] 35.13643

bic_2 <- BIC(frain_dlm, rain_polyd, rain_Koyck, ardl_rain15, ardl_rain35, ardl_rain45, ardl_rain55)

## [1] 48.71522

MASE_2 <- MASE(frain_dlm, rain_polyd, rain_Koyck, ardl_rain15, ardl_rain35, ardl_rain45, ardl_rain55)
accuracy_2 <- data.frame(models, MASE_2, aic_2, bic_2 )
colnames(accuracy_2) <- c("Model", "MASE", "AIC", "BIC")
head(accuracy_2)

##                   Model MASE       AIC      BIC       NA
## frain_dlm    FFrain_DLM   21 0.5137970 35.13643 48.71522
## rain_polyd   rain_PolyD   21 0.6248857 35.13643 48.71522
## rain_Koyck   rain_Koyck   30 0.7267872 35.13643 48.71522
## ardl_rain15 ARDL_rain15   26 0.6396905 35.13643 48.71522
## ardl_rain35 ARDL_rain35   26 0.6369725 35.13643 48.71522
## ardl_rain45 ARDL_rain45   26 0.6175106 35.13643 48.71522

#radiation
ardl_rad35 <- ardlDlm(x = as.vector(ffdata$Radiation), y = as.vector(ffdata$FFD), p=3, q=5)
ardl_rad45 <- ardlDlm(x = as.vector(ffdata$Radiation), y = as.vector(ffdata$FFD), p=4, q=5)
ardl_rad55 <- ardlDlm(x = as.vector(ffdata$Radiation), y = as.vector(ffdata$FFD), p=5, q=5)

models <- c("FFrad_DLM", "rad_PolyD", "rad_Koyck", "ARDL_rad15", "ARDL_rad35", "ARDL_rad45", "ARDL_rad55")
aic_3 <- AIC(frad_dlm, rad_polyd, rad_Koyck, ardl_rad15, ardl_rad35, ardl_rad45, ardl_rad55)

## [1] 26.34091

bic_3 <- BIC(frad_dlm, rad_polyd, rad_Koyck, ardl_rad15, ardl_rad35, ardl_rad45, ardl_rad55)

## [1] 39.9197

MASE_3 <- MASE(frad_dlm, rad_polyd, rad_Koyck, ardl_rad15, ardl_rad35, ardl_rad45, ardl_rad55)
accuracy_3 <- data.frame(models, MASE_3, aic_3, bic_3 )
colnames(accuracy_3) <- c("Model", "MASE", "AIC", "BIC")
head(accuracy_3)

##                 Model MASE       AIC      BIC      NA
## frad_dlm    FFrad_DLM   21 0.5356254 26.34091 39.9197
## rad_polyd   rad_PolyD   21 0.6463697 26.34091 39.9197
## rad_Koyck   rad_Koyck   30 0.7136301 26.34091 39.9197
## ardl_rad15 ARDL_rad15   26 0.6196376 26.34091 39.9197
## ardl_rad35 ARDL_rad35   26 0.6164068 26.34091 39.9197
## ardl_rad45 ARDL_rad45   26 0.6045879 26.34091 39.9197

#humidity
ardl_hum35 <- ardlDlm(x = as.vector(ffdata$RelHumidity), y = as.vector(ffdata$FFD), p=3, q=5)
ardl_hum45 <- ardlDlm(x = as.vector(ffdata$RelHumidity), y = as.vector(ffdata$FFD), p=4, q=5)
ardl_hum55 <- ardlDlm(x = as.vector(ffdata$RelHumidity), y = as.vector(ffdata$FFD), p=5, q=5)

#worst accuracy
models <- c("Fhum_DLM", "hum_PolyD", "hum_Koyck", "ARDL_hum15", "ARDL_hum35", "ARDL_hum45", "ARDL_hum55")
aic_4 <- AIC(fhum_dlm, hum_polyd, hum_Koyck, ardl_hum15, ardl_hum35, ardl_hum45, ardl_hum55)

## [1] 64.34047

bic_4 <- BIC(fhum_dlm, hum_polyd, hum_Koyck, ardl_hum15, ardl_hum35, ardl_hum45, ardl_hum55)

## [1] 77.91927

MASE_4 <- MASE(fhum_dlm, hum_polyd, hum_Koyck, ardl_hum15, ardl_hum35, ardl_hum45, ardl_hum55)
accuracy_4 <- data.frame(models, MASE_4, aic_4, bic_4 )
colnames(accuracy_4) <- c("Model", "MASE", "AIC", "BIC")
head(accuracy_4)

##                 Model MASE       AIC      BIC       NA
## fhum_dlm     Fhum_DLM   21 0.6141544 64.34047 77.91927
## hum_polyd   hum_PolyD   21 0.6520980 64.34047 77.91927
## hum_Koyck   hum_Koyck   30 0.7335956 64.34047 77.91927
## ardl_hum15 ARDL_hum15   26 0.6497544 64.34047 77.91927
## ardl_hum35 ARDL_hum35   26 0.5175561 64.34047 77.91927
## ardl_hum45 ARDL_hum45   26 0.5186368 64.34047 77.91927

Exponential smoothing methods

Another forecasting method we can try is exponential smoothing. Because we have found a weak seasonal component in ffd series also the frequncy is only 1 year based we cannot use expo smoothening.

ffd_ts <- ts(Ts_FFD, start=c(2015,1), frequency= 12)
ffd_ts

##      Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 2015 217 186 233 222 214 237 213 206 188 234 264 196
## 2016 229 212 244 178 154 207 182 218 192 199 200 225
## 2017 216 197 230 204 233 174 189

ES = c(T,F)
seasonality <- c("additive","multiplicative")
damped <- c(T,F)
expa <- expand.grid(ES, seasonality, damped)
expa <- expa[-c(1,5),]
f_aic <- array(NA, 6)
f_bic <- array(NA, 6)
f_mase <- array(NA, 6)
levels <- array(NA, dim=c(6,3))
for (i in 1:6){
  holt_w <- hw(ffd_ts, ES = expa[i,1], seasonal = toString(expa[i,2], damped = expa[i,3]))
  f_aic[i] <- holt_w$model$aic
  f_bic[i] <- holt_w$model$bic
  f_mase[i] <- accuracy(holt_w)[6]
  levels[i,1] <- expa[i,1]
  levels[i,2] <- toString(expa[i,2])
  levels[i,3] <- expa[i,3]
  checkresiduals(holt_w)
}

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt-Winters' additive method
## Q* = 10.781, df = 3, p-value = 0.01297
## 
## Model df: 16.   Total lags used: 19

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt-Winters' multiplicative method
## Q* = 20.214, df = 3, p-value = 0.0001533
## 
## Model df: 16.   Total lags used: 19

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt-Winters' multiplicative method
## Q* = 20.214, df = 3, p-value = 0.0001533
## 
## Model df: 16.   Total lags used: 19

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt-Winters' additive method
## Q* = 10.781, df = 3, p-value = 0.01297
## 
## Model df: 16.   Total lags used: 19

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt-Winters' multiplicative method
## Q* = 20.214, df = 3, p-value = 0.0001533
## 
## Model df: 16.   Total lags used: 19

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt-Winters' multiplicative method
## Q* = 20.214, df = 3, p-value = 0.0001533
## 
## Model df: 16.   Total lags used: 19

#Overall, we can observe a slight improvement in residuals from exponential smoothing models in terms of serial correlation and especially seasonality. Holt-Winters’ multiplicative method is the most successful at capturing autocorrelation and seasonality in radiation series.

#We append the accuracy measures for exponential smoothing models to the accuracy data frame. The format of model names is: trend (multiplicative or additive), seasonality (multiplicative or additive) and if the trend is damped (damped) or no (N).

newvalues <- data.frame(levels, f_mase, f_aic, f_bic, NA)
colnames(newvalues) <- c("Trend", "Seasonality", "damped", "MASE", "AIC", "BIC","NA")
newvalues$Trend <- factor(newvalues$Trend, levels = c(T,F), labels = c("multiplicative","additive"))
newvalues$damped <- factor(newvalues$damped, levels = c(T,F), labels = c("damped","N"))

newvalues <- unite(newvalues, col = "Model", c("Trend","Seasonality","damped"))
accuracy_T <- rbind(accuracy_1, accuracy_2,accuracy_3,accuracy_4)
accuracy_T

##                   Model MASE       AIC      BIC       NA
## ftem_dlm     FFtemp_DLM   21 0.7001861 28.17827 41.75706
## temp_polyd   temp_PolyD   21 0.6461061 28.17827 41.75706
## temp_Koyck   temp_Koyck   30 0.7927338 28.17827 41.75706
## ardl_temp15 ARDL_temp15   26 0.6113428 28.17827 41.75706
## ardl_temp35 ARDL_temp35   26 0.6133766 28.17827 41.75706
## ardl_temp45 ARDL_temp45   26 0.6145971 28.17827 41.75706
## ardl_temp55 ARDL_temp55   26 0.5872934 28.17827 41.75706
## frain_dlm    FFrain_DLM   21 0.5137970 35.13643 48.71522
## rain_polyd   rain_PolyD   21 0.6248857 35.13643 48.71522
## rain_Koyck   rain_Koyck   30 0.7267872 35.13643 48.71522
## ardl_rain15 ARDL_rain15   26 0.6396905 35.13643 48.71522
## ardl_rain35 ARDL_rain35   26 0.6369725 35.13643 48.71522
## ardl_rain45 ARDL_rain45   26 0.6175106 35.13643 48.71522
## ardl_rain55 ARDL_rain55   26 0.5306522 35.13643 48.71522
## frad_dlm      FFrad_DLM   21 0.5356254 26.34091 39.91970
## rad_polyd     rad_PolyD   21 0.6463697 26.34091 39.91970
## rad_Koyck     rad_Koyck   30 0.7136301 26.34091 39.91970
## ardl_rad15   ARDL_rad15   26 0.6196376 26.34091 39.91970
## ardl_rad35   ARDL_rad35   26 0.6164068 26.34091 39.91970
## ardl_rad45   ARDL_rad45   26 0.6045879 26.34091 39.91970
## ardl_rad55   ARDL_rad55   26 0.5314267 26.34091 39.91970
## fhum_dlm       Fhum_DLM   21 0.6141544 64.34047 77.91927
## hum_polyd     hum_PolyD   21 0.6520980 64.34047 77.91927
## hum_Koyck     hum_Koyck   30 0.7335956 64.34047 77.91927
## ardl_hum15   ARDL_hum15   26 0.6497544 64.34047 77.91927
## ardl_hum35   ARDL_hum35   26 0.5175561 64.34047 77.91927
## ardl_hum45   ARDL_hum45   26 0.5186368 64.34047 77.91927
## ardl_hum55   ARDL_hum55   26 0.5164104 64.34047 77.91927

State-space models variations

For each exponential smoothing method, there are two corresponding state-space models (with additive or multiplicative errors). There are 8 state-space variations which include seasonality that we cam implement in R (some combinations are forbidden due to their stability issues). We create a loop to fit these models and capture their accuracy measures for further comparison.

vlist <- c("AAA", "MAA", "MAM", "MMM")
damp <- c(T,F)
ets_models <- expand.grid(vlist, damp)
ets_aic <- array(NA, 8)
ets_mase <- array(NA,8)
ets_bic <- array(NA,8)
mod <- array(NA, dim=c(8,2))

#Auto ETS fitted to see what the software automatically suggested model is

auto_ets <- ets(ffd_ts)
summary(auto_ets)

## ETS(M,N,N) 
## 
## Call:
##  ets(y = ffd_ts) 
## 
##   Smoothing parameters:
##     alpha = 1e-04 
## 
##   Initial states:
##     l = 209.448 
## 
##   sigma:  0.1137
## 
##      AIC     AICc      BIC 
## 306.9448 307.8337 311.2468 
## 
## Training set error measures:
##                        ME     RMSE    MAE       MPE     MAPE      MASE
## Training set -0.001214657 23.03386 18.696 -1.263579 9.173601 0.6517872
##                      ACF1
## Training set -0.006864096

#The model suggested automatically is ETS(M,N,N) which is a model with multiplicative errors, No damped trend and no seasonality.

checkresiduals(auto_ets)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(M,N,N)
## Q* = 7.489, df = 4, p-value = 0.1122
## 
## Model df: 2.   Total lags used: 6

Overall, the residual analysis suggest this model is not successful at capturing autocorrelation and seasonality in ffd series.

We append the accuracy measures for state-space models to the accuracy data frame.

library(tidyr)
calculate <- data.frame(mod, ets_mase, ets_aic, ets_bic,"NA")
calculate$X2 <- factor(calculate$X2, levels = c(T,F), labels = c("Damped","N"))
calculate <- unite(calculate, "Model", c("X1","X2"))
colnames(calculate) <- c("Model", "MASE", "AIC", "BIC","NA")
accuracy_T <- rbind(accuracy_1, accuracy_2,accuracy_3,accuracy_4)
accuracy_T

##                   Model MASE       AIC      BIC       NA
## ftem_dlm     FFtemp_DLM   21 0.7001861 28.17827 41.75706
## temp_polyd   temp_PolyD   21 0.6461061 28.17827 41.75706
## temp_Koyck   temp_Koyck   30 0.7927338 28.17827 41.75706
## ardl_temp15 ARDL_temp15   26 0.6113428 28.17827 41.75706
## ardl_temp35 ARDL_temp35   26 0.6133766 28.17827 41.75706
## ardl_temp45 ARDL_temp45   26 0.6145971 28.17827 41.75706
## ardl_temp55 ARDL_temp55   26 0.5872934 28.17827 41.75706
## frain_dlm    FFrain_DLM   21 0.5137970 35.13643 48.71522
## rain_polyd   rain_PolyD   21 0.6248857 35.13643 48.71522
## rain_Koyck   rain_Koyck   30 0.7267872 35.13643 48.71522
## ardl_rain15 ARDL_rain15   26 0.6396905 35.13643 48.71522
## ardl_rain35 ARDL_rain35   26 0.6369725 35.13643 48.71522
## ardl_rain45 ARDL_rain45   26 0.6175106 35.13643 48.71522
## ardl_rain55 ARDL_rain55   26 0.5306522 35.13643 48.71522
## frad_dlm      FFrad_DLM   21 0.5356254 26.34091 39.91970
## rad_polyd     rad_PolyD   21 0.6463697 26.34091 39.91970
## rad_Koyck     rad_Koyck   30 0.7136301 26.34091 39.91970
## ardl_rad15   ARDL_rad15   26 0.6196376 26.34091 39.91970
## ardl_rad35   ARDL_rad35   26 0.6164068 26.34091 39.91970
## ardl_rad45   ARDL_rad45   26 0.6045879 26.34091 39.91970
## ardl_rad55   ARDL_rad55   26 0.5314267 26.34091 39.91970
## fhum_dlm       Fhum_DLM   21 0.6141544 64.34047 77.91927
## hum_polyd     hum_PolyD   21 0.6520980 64.34047 77.91927
## hum_Koyck     hum_Koyck   30 0.7335956 64.34047 77.91927
## ardl_hum15   ARDL_hum15   26 0.6497544 64.34047 77.91927
## ardl_hum35   ARDL_hum35   26 0.5175561 64.34047 77.91927
## ardl_hum45   ARDL_hum45   26 0.5186368 64.34047 77.91927
## ardl_hum55   ARDL_hum55   26 0.5164104 64.34047 77.91927

#The data frame accuracy is sorted by ascending MASE value.

The accuracy table will be used to compare all methods we have tried at the modelling stage in terms of their MASE. The model that minimises MASE is Holt-Winters’ multiplicative method with additive trend (there is no difference in models in terms of damping). The best state-space model in terms of lowest MASE is ETS(A,Ad,A) which is the model with additive errors, additive damped trend and additive seasonality. ETS(A,Ad,A) was also the model suggested automatically. We can see from the table that time series regression methods perform the worst in terms of MASE but this approach has lower AIC/BIC measures than the exponential smoothing approach.

Forecasting

For deciding on the final model to give four years ahead forecasts of FFD value, we compare forecasts from three models:

Holt-Winters’ multiplicative method which has the lowest MASE and is the most successful at capturing the autocorrelation and seasonality in the series

Holt-Winters’ multiplicative method with multiplicative trend which has the second lowest MASE and is also good at capturing the autocorrelation and seasonality in the series

ETS(A,Ad,A) model was suggested by an automatic algorithm and has the lowest MASE of all state-space models but does not capture autocorrelation in the series

The fitted values and 4 year forecasts are displayed in Figure 13.

fitm1 <- hw(ffd_ts, seasonal = "multiplicative", h = 2*frequency(ffd_ts))
fitm2 <- hw(ffd_ts, seasonal = "multiplicative", exponential = T, h = 2*frequency(ffd_ts))
fitm3 <- ets(ffd_ts,model="AAA", damped=T)
#class(fit3)
#methods(forecast())
for_fit3 <- forecast.ets(fitm3)
plot(for_fit3, fcol = "black", main = "FFD occurences series with four years ahead forecasts", ylab = "ffd", ylim = c(-10,55))
lines(fitted(fitm1), col = "darkgreen")
lines(fitm1$mean, col = "darkgreen", lwd = 2)
lines(fitted(fitm2), col = "brown2")
lines(fitm2$mean, col = "brown2", lwd = 2)
lines(fitted(fitm3), col = "dodgerblue3")
lines(for_fit3$mean, col = "dodgerblue3", lwd = 2)
legend("bottomleft", lty = 1, col = c("black", "darkgreen", "brown2", "dodgerblue3"), c("Data", "Holt-Winters' Multiplicative", "Holt-Winters' Multiplicative Exponential", "ETS(M,N,N)"))

Overall, based on the residual analysis and the lowest MASE value, we take Holt-Winters’ multiplicative model to give 4 years ahead forecasts of the amount of solar radiation.

The final forecasts are displayed in Figure 14.

plot(fitm1, fcol = "white", main = "FFD series with four years ahead forecasts", ylab = "ffd occurences")
lines(fitted(fitm1), col = "darkgreen")
lines(fitm1$mean, col = "darkgreen", lwd = 2)
legend("topleft", lty = 1, col = c("black", "darkgreen"), c("Data", "Forecasts"))

#The solar radiation 2 years ahead point forecast values with corresponding 95% confidence intervals are as follows:

forc <- fitm1$mean
ub <- fitm1$upper[,2]
lb <- fitm1$lower[,2]
forecasts <- ts.intersect(ts(lb, start = c(2015,1),end =c(2018,1) , frequency = 1), ts(forc,start = c(2015,1),end =c(2018,1), frequency = 1), ts(ub,start = c(2015,1),end =c(2018,1), frequency = 1))
colnames(forecasts) <- c("Lower bound", "Point forecast", "Upper bound")
forecasts

## Time Series:
## Start = 2015 
## End = 2018 
## Frequency = 1 
##      Lower bound Point forecast Upper bound
## 2015    152.4610       202.5019    252.5427
## 2016    139.4139       185.1746    230.9353
## 2017    156.2027       207.4778    258.7530
## 2018    172.0050       228.4733    284.9416

plot(forecasts)

#However, we can observe that the 95% confidence intervals for the forecasts from selected approach are very wide and do not provide reliable forecasts

Task 3

Introduction

This task consists of two parts. Task 3[A] The objective of the first task is Carry out your analysis based on univariate climate regressors (model one climate indicator at a time, i.e., univariate regressor). For this task, we will apply Modelling methods like (DLM, ARDL, polyck, koyck, dynlm)and will make Choice of optimal models within EACH a specific method can be assessed from values of R squared, AIC, BIC, MASE etc (as is appropriate to the method). The final goal of this analysis is to forecast RBO three years ahead using each regressor one at a time and (use percentiles for the regressors) in forecasting for each of the best models within the methods utilised.

task 3[B] The aim of the second task is to analyse the correlation structure between quarterly Residential Property Price Index (PPI) in Melbourne and population change over the previous quarter in Victoria from September 2003 to December 2016. We will explore and demonstrate if the correlation between the two series is found spurious or not.

About dataset

Dataset explores the relative flowering order similarity of 81 species of plants from 1983 to 2014. The species were ranked annually by the time taken to flower (FFD), and changes in flowering order were measured by computing the similarity between annual flowering order and the flowering order of 1983 using the Rank-based Order similarity metric (RBO). The earliest flowering species is ranked 1 and latest ranked 81 for the given year under study. RBO values are therefore numbers between 0 and 1. Higher RBO values indicate higher similarity of the order of the first flowering occurrence (based on FFD) of the 81 species from 1983 compared to each of the subsequent years, 1984 to 2014, so the time series are of length 31.

Flowering orders became more dissimilar over the most recent decades, particularly during the Millennium Drought (1997 – 2009), suggesting that flora in Australia is responding to changes in their environment. According to the BoM the drought period for Australia occurred from 1996 to 2009.

Primary Dataset “RBO.csv” consist of four dependent factors :Temperature,Rainfall, radiation ,Humidity with Target variable RBO which gives order value wrt FFD. Secondary dataset given “Covariate x-values for Task 3.csv” consist of future values 2015 to 2018 of dependent variables.

Task 3(a):

#Load data
RBOdata <- read.csv("D:/Drive data/Rmit/Sem4/Forecasting/RBO.csv")
RBOdata

##    ï..Year       RBO Temperature Rainfall Radiation RelHumidity
## 1     1984 0.7550088    9.371585 2.489344  14.87158    93.92650
## 2     1985 0.7407520    9.656164 2.475890  14.68493    94.93589
## 3     1986 0.8423860    9.273973 2.421370  14.51507    94.09507
## 4     1987 0.7484425    9.219178 2.319726  14.67397    94.49699
## 5     1988 0.7984084   10.202186 2.465301  14.74863    94.08142
## 6     1989 0.7938803    9.441096 2.735890  14.78356    96.08685
## 7     1990 0.7925678    9.943836 2.398630  14.67671    93.77918
## 8     1991 0.8138698    9.690411 2.635616  14.41096    93.15562
## 9     1992 0.8152843    9.691257 2.795902  13.39617    94.09863
## 10    1993 0.7758007    9.947945 2.878630  14.26575    94.91973
## 11    1994 0.7471853    9.316438 1.974795  14.52329    93.26932
## 12    1995 0.7508197    9.164384 2.843288  13.90411    94.45863
## 13    1996 0.6644419    8.967213 2.814754  14.33060    94.60000
## 14    1997 0.6941213    9.038356 1.403014  14.77534    93.74685
## 15    1998 0.7045545    8.934247 2.289041  14.60000    94.60822
## 16    1999 0.6992259    9.547945 2.126301  14.61370    96.22603
## 17    2000 0.7137116    9.680328 2.471858  14.65574    95.65738
## 18    2001 0.7267423    9.561644 2.227945  14.14521    94.70712
## 19    2002 0.6629484    9.389041 1.740000  14.63836    93.53233
## 20    2003 0.7118227    9.210959 2.270411  15.11233    94.47096
## 21    2004 0.7039938    9.300546 2.620492  14.64481    95.01421
## 22    2005 0.7321166    9.623288 2.284110  15.09315    94.30356
## 23    2006 0.7258027    8.715068 1.781370  15.41096    94.84493
## 24    2007 0.7007718    9.801370 2.191233  15.19452    94.11068
## 25    2008 0.7445151    9.034153 1.743169  14.80328    94.39508
## 26    2009 0.6853045    9.457534 2.038630  15.12877    94.63096
## 27    2010 0.7022626    9.765753 2.777808  14.29315    96.05205
## 28    2011 0.7582674    9.826027 2.886301  14.01096    95.70603
## 29    2012 0.7346374    9.767760 2.599454  14.40710    94.90519
## 30    2013 0.7255165   10.097260 2.540274  14.43014    93.83479
## 31    2014 0.7090916   10.247253 2.239286  14.60165    94.21016

#Converting into timeseries

RBO_ts<- ts(RBOdata$RBO, start =c(1984,1), frequency = 1)
head(RBO_ts)

## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1] 0.7550088 0.7407520 0.8423860 0.7484425 0.7984084 0.7938803

Temperature_ts <- ts(RBOdata$Temperature, start =c(1984,1), frequency = 1)
head(Temperature_ts)

## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1]  9.371585  9.656164  9.273973  9.219178 10.202186  9.441096

RainFall_ts <-ts(RBOdata$Temperature, start =c(1984,1), frequency = 1)
head(RainFall_ts)

## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1]  9.371585  9.656164  9.273973  9.219178 10.202186  9.441096

Radiation_ts <-ts(RBOdata$Radiation, start =c(1984,1), frequency = 1)
head(Radiation_ts)

## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1] 14.87158 14.68493 14.51507 14.67397 14.74863 14.78356

Humidity_ts <-ts(RBOdata$RelHumidity, start =c(1984,1), frequency = 1)
head(Humidity_ts)

## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1] 93.92650 94.93589 94.09507 94.49699 94.08142 96.08685

RBOdata_ts <-ts(RBOdata[,2:6], start =c(1984,1), frequency = 1)
head(RBOdata_ts)

## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
##            RBO Temperature Rainfall Radiation RelHumidity
## 1984 0.7550088    9.371585 2.489344  14.87158    93.92650
## 1985 0.7407520    9.656164 2.475890  14.68493    94.93589
## 1986 0.8423860    9.273973 2.421370  14.51507    94.09507
## 1987 0.7484425    9.219178 2.319726  14.67397    94.49699
## 1988 0.7984084   10.202186 2.465301  14.74863    94.08142
## 1989 0.7938803    9.441096 2.735890  14.78356    96.08685

#Load Covariate x-values for Task 3  

xvalues <- read.csv("D:/Drive data/Rmit/Sem4/Forecasting/Covariate x-values for Task 3  .csv")
head(xvalues)

##   ï..Year Temperature Rainfall Radiation RelHumidity
## 1    2015       10.23     2.27     14.60       94.45
## 2    2016       10.10     2.38     14.56       94.03
## 3    2017        9.53     2.26     14.79       95.04
## 4    2018        9.54     2.27     14.79       95.06

# **Data exploration and visualisation

plot(RBO_ts,  main = "Fig.1 Time series plot of the order of the FFD of the 81 species from 1983 ", ylab = "Similarity values for the flowering orders ", xlab = "Time")

#points(RBO_ts, x=time(RBO_ts), pch = as.vector(season(RBO_ts)))

From the plot in Figure 1, we can observe the following characteristics of the series:

There is no apparent trend.

There is obvious seasonality, with lower values in December and January and higher values in June and July. The seasonal pattern is not consistent across the observed time.

Changing variance and behaviour of the series are not obvious due to the presence of seasonality.

There are two potential intervention points around 1965 and 1987.

acf(RBO_ts, lag.max = 48, main="Fig.2 ACF plot of RBO values series")

adf.test(RBO_ts, k=ar(RBO_ts)$order)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  RBO_ts
## Dickey-Fuller = -1.4542, Lag order = 2, p-value = 0.7829
## alternative hypothesis: stationary

The ACF plot in Figure 2 shows strong seasonal patterns and suggests no trend. ADF test with lag order = 25 reports stationarity in the series at 5% level of significance (p-value < 0.05). Overall, we conclude that solar radiation series has a strong seasonality pattern.

We will display a time series plot of precipitation series which we will use as a predictor series for distributed lag models.

#Dependent variables plotting
par(mfrow=c(2,2))
plot(Temperature_ts, main ="Fig.3.1 Time series plot of temperature effects on rbo value", ylab="Temperature change", xlab = "Time")

plot(RainFall_ts, main ="Fig3.2.Time series plot of Rain effects on rbo value series", ylab="Rainfall", xlab = "Time")

plot(Radiation_ts, main ="Fig 3.3 Time series plot of Radiations on rbo value series", ylab="Radiations", xlab = "Time")

plot(Humidity_ts, main ="Fig 3.4 Time series plot of Humidity effects on ffd series", ylab="Humidity", xlab = "Time")

#points (precip, x= time(precip), pch = as.vector(season(precip)))
par(mfrow=c(1,1))

Plots 3.1 to 3.4 concludes: Based on the plot in Figure 3, we can make the following comments on the characteristics of the series:

There might be a slight downward trend, especially in the beginning of the series.

There is a clear seasonality, while the pattern changes overtime, we can say that lower values are observed in July and August and higher values in December-January.

The existence of changing variance and behaviour of the series is not apparent due to seasonality.

There are no obvious intervention points

To study more on trend and seasonality we further display ADF test and acf plot for each feature

par(mfrow=c(2,2))
#Temperature
acf(Temperature_ts,lag.max = 48, main = "Fig. 4.1 ACF plot of Temperature values")
adf.test(Temperature_ts,k=ar(Temperature_ts)$order)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  Temperature_ts
## Dickey-Fuller = -1.1484, Lag order = 2, p-value = 0.9002
## alternative hypothesis: stationary

#Rainfall
acf(RainFall_ts,lag.max = 48, main = "Fig.4.2 ACF plot of rainfall values")
adf.test(RainFall_ts,k=ar(RainFall_ts)$order)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  RainFall_ts
## Dickey-Fuller = -1.1484, Lag order = 2, p-value = 0.9002
## alternative hypothesis: stationary

#radiation
acf(Radiation_ts,lag.max = 48, main = "Fig 4.3 ACF plot of radiation values")
adf.test(Radiation_ts,k=ar(Radiation_ts)$order)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  Radiation_ts
## Dickey-Fuller = -2.7317, Lag order = 4, p-value = 0.2911
## alternative hypothesis: stationary

#humidity
acf(Humidity_ts,lag.max = 48, main = "Fig.4.4 ACF plot of humidity values")

adf.test(Humidity_ts,k=ar(Humidity_ts)$order)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  Humidity_ts
## Dickey-Fuller = -4.5749, Lag order = 0, p-value = 0.01
## alternative hypothesis: stationary

par(mfrow=c(1,1))

From Figure 4, we can observe that there is a strong seasonal pattern, a decaying pattern of seasonal lags also suggests the possible existence of trend. The ADF test reports p-value = 0.078 > 0.05 which suggests the series is nonstationary at 5% level of significance.

#Scaling of data
shift<- scale(RBOdata_ts)
plot(shift, plot.type="s",col=c("Red", "Blue", "Brown","Black","Green"),main= "Fig.5 RBO similarity values for the flowering orders versus factor affecting RBO wrt time(Scaled)")
legend("bottomright", lty=1, text.width = 7, col = c("Red", "Blue", "Brown","Black","Green"), c("Temperature", "Rain", "Radiation", "Humidity","FFD"))

The plot in Figure 5 shows that the dependent and the independent series are likely to be negatively correlated. High values of radiation correspond to low values of precipitation and vice versa.

We also calculate the correlation coefficient to check the relationship.

#correlation between each variable with RBO

cor(RBO_ts,Temperature_ts)

## [1] 0.2610007

cor(RBO_ts,RainFall_ts)

## [1] 0.2610007

cor(RBO_ts,Radiation_ts)

## [1] -0.3173602

cor(RBO_ts,Humidity_ts)

## [1] -0.1776349

The correlation between temperature and rainfall is same i.e 0.2610007 whereas for radiation and humidity coefficient is reported r=-0.3173602 and -0.1776349 respectively which suggests a moderate negative correlation between the series and confirms the conclusion made from the plot in Figure 5. After we have explored the characteristics of the individual series and found the evidence of relationship between them, we proceed to modelling stage.

**Time series regression methods

##Finite distributed lag model To find a suitable model for forecasting solar radiation values, we will try fitting distributed lag models which include an independent explanatory series and its lags to help explain the overall variation and correlation structure in our dependent series.

for (i in 1:10){
  model1 <- dlm(x = RBOdata$Temperature, y = RBOdata$RBO, q = i)
  cat("q =", i, "AIC =", AIC(model1$model), "BIC =", BIC(model1$model), "MASE =", MASE(model1)$MASE, "\n")
}

## q = 1 AIC = -101.8617 BIC = -96.2569 MASE = 0.9239038 
## q = 2 AIC = -95.49894 BIC = -88.66246 MASE = 1.032564 
## q = 3 AIC = -96.76727 BIC = -88.77404 MASE = 1.033663 
## q = 4 AIC = -92.75653 BIC = -83.68567 MASE = 1.005373 
## q = 5 AIC = -91.46337 BIC = -81.3986 MASE = 0.8594175 
## q = 6 AIC = -85.74127 BIC = -74.77139 MASE = 0.8103361 
## q = 7 AIC = -82.04015 BIC = -70.25962 MASE = 0.7518958 
## q = 8 AIC = -83.28717 BIC = -70.79674 MASE = 0.6497633 
## q = 9 AIC = -88.10651 BIC = -75.014 MASE = 0.5120906 
## q = 10 AIC = -87.92965 BIC = -74.35085 MASE = 0.458588

#Temperature has lowest Aic at q=1 than rainfall data
for (i in 1:10){
  model1_r <- dlm(x = RBOdata$Rainfall, y = RBOdata$RBO, q = i)
  cat("q =", i, "AIC =", AIC(model1_r$model), "BIC =", BIC(model1_r$model), "MASE =", MASE(model1_r)$MASE, "\n")
}

## q = 1 AIC = -100.898 BIC = -95.29319 MASE = 0.9417954 
## q = 2 AIC = -96.70956 BIC = -89.87308 MASE = 0.9993747 
## q = 3 AIC = -97.19966 BIC = -89.20643 MASE = 0.9796852 
## q = 4 AIC = -90.46187 BIC = -81.39101 MASE = 1.038827 
## q = 5 AIC = -87.24242 BIC = -77.17765 MASE = 0.925677 
## q = 6 AIC = -82.31788 BIC = -71.348 MASE = 0.8543964 
## q = 7 AIC = -77.98405 BIC = -66.20351 MASE = 0.829337 
## q = 8 AIC = -76.81922 BIC = -64.32879 MASE = 0.7233794 
## q = 9 AIC = -80.79432 BIC = -67.70181 MASE = 0.6205897 
## q = 10 AIC = -76.93255 BIC = -63.35376 MASE = 0.585617

Finite dlm of each variate

1)Temperature

temp_dlm <- dlm(x = RBOdata$Temperature, y = RBOdata$RBO, q=10)
summary(temp_dlm)

## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.033097 -0.011942  0.005304  0.008460  0.030820 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  0.521379   0.537055   0.971   0.3570  
## x.t          0.003982   0.016965   0.235   0.8197  
## x.1          0.047512   0.018674   2.544   0.0315 *
## x.2         -0.010070   0.019814  -0.508   0.6235  
## x.3         -0.024214   0.020051  -1.208   0.2580  
## x.4          0.011690   0.021762   0.537   0.6042  
## x.5         -0.001764   0.023152  -0.076   0.9409  
## x.6          0.017653   0.019245   0.917   0.3829  
## x.7          0.015177   0.018673   0.813   0.4373  
## x.8         -0.011418   0.020339  -0.561   0.5882  
## x.9         -0.036343   0.019400  -1.873   0.0938 .
## x.10         0.008222   0.018870   0.436   0.6733  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02453 on 9 degrees of freedom
## Multiple R-squared:  0.6008, Adjusted R-squared:  0.1129 
## F-statistic: 1.232 on 11 and 9 DF,  p-value: 0.3834
## 
## AIC and BIC values for the model:
##         AIC       BIC
## 1 -87.92965 -74.35085

vif(temp_dlm$model)

##      x.t      x.1      x.2      x.3      x.4      x.5      x.6      x.7 
## 1.525783 1.621374 1.577203 1.582646 1.951362 2.096894 1.824277 1.649730 
##      x.8      x.9     x.10 
## 1.882237 1.408012 1.321229

From the temperature series, we obtained Adjusted R-squared: 0.1129,p-value: 0.3834 >0.05 and AIC:-87.92965

2)Rain

rain_dlm <- dlm(x = RBOdata$Rainfall, y = RBOdata$RBO, q=10)
summary(rain_dlm)

## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.050274 -0.013229 -0.001445  0.015071  0.039030 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.709414   0.129781   5.466 0.000397 ***
## x.t          0.016449   0.019478   0.845 0.420260    
## x.1          0.009100   0.017831   0.510 0.622079    
## x.2          0.014628   0.018404   0.795 0.447163    
## x.3         -0.006321   0.018174  -0.348 0.735980    
## x.4         -0.006181   0.020176  -0.306 0.766285    
## x.5          0.004570   0.020010   0.228 0.824453    
## x.6         -0.007054   0.019391  -0.364 0.724424    
## x.7         -0.011836   0.021444  -0.552 0.594424    
## x.8          0.004407   0.020473   0.215 0.834366    
## x.9         -0.021710   0.021728  -0.999 0.343817    
## x.10         0.006802   0.022752   0.299 0.771745    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.03187 on 9 degrees of freedom
## Multiple R-squared:  0.3261, Adjusted R-squared:  -0.4975 
## F-statistic: 0.3959 on 11 and 9 DF,  p-value: 0.9251
## 
## AIC and BIC values for the model:
##         AIC       BIC
## 1 -76.93255 -63.35376

vif(rain_dlm$model)

##      x.t      x.1      x.2      x.3      x.4      x.5      x.6      x.7 
## 1.242349 1.147054 1.282021 1.257098 1.420090 1.381663 1.279639 1.407959 
##      x.8      x.9     x.10 
## 1.274726 1.277607 1.399164

From the temperature series, we obtained Adjusted R-squared: -0.4975,p-value: 0.9251 >0.05 and AIC:-76.93255

3)Radiation

rad_dlm <- dlm(x = RBOdata$Radiation, y = RBOdata$RBO, q=10)
summary(rad_dlm)

## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.036879 -0.014268 -0.000611  0.013122  0.040675 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.428718   0.623265   0.688    0.509
## x.t         -0.036410   0.037502  -0.971    0.357
## x.1          0.048104   0.044770   1.074    0.311
## x.2         -0.021862   0.026884  -0.813    0.437
## x.3          0.002035   0.025673   0.079    0.939
## x.4          0.002219   0.026890   0.083    0.936
## x.5          0.002622   0.025560   0.103    0.921
## x.6          0.027607   0.026512   1.041    0.325
## x.7         -0.014768   0.025996  -0.568    0.584
## x.8          0.008387   0.026420   0.317    0.758
## x.9          0.010569   0.032643   0.324    0.754
## x.10        -0.008962   0.032801  -0.273    0.791
## 
## Residual standard error: 0.03102 on 9 degrees of freedom
## Multiple R-squared:  0.3616, Adjusted R-squared:  -0.4187 
## F-statistic: 0.4634 on 11 and 9 DF,  p-value: 0.8854
## 
## AIC and BIC values for the model:
##         AIC    BIC
## 1 -78.06879 -64.49

vif(rad_dlm$model)

##      x.t      x.1      x.2      x.3      x.4      x.5      x.6      x.7 
## 4.571066 6.783792 3.502952 3.193621 3.262380 2.898270 2.938461 2.800495 
##      x.8      x.9     x.10 
## 2.625446 3.207576 3.013446

From the temperature series, we obtained Adjusted R-squared: -0.4187,p-value: 0.8854 >0.05 and AIC:-78.06879

4)humidity

hum_dlm <- dlm(x = RBOdata$RelHumidity, y = RBOdata$RBO, q=10)
summary(hum_dlm)

## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0305118 -0.0079610 -0.0002443  0.0159831  0.0256032 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -3.3474469  2.8680763  -1.167   0.2731  
## x.t          0.0090536  0.0094493   0.958   0.3630  
## x.1         -0.0067710  0.0098800  -0.685   0.5104  
## x.2          0.0214895  0.0097786   2.198   0.0556 .
## x.3         -0.0116432  0.0094885  -1.227   0.2509  
## x.4         -0.0052385  0.0092682  -0.565   0.5857  
## x.5          0.0177998  0.0087535   2.033   0.0725 .
## x.6          0.0005203  0.0081145   0.064   0.9503  
## x.7          0.0091601  0.0080459   1.138   0.2843  
## x.8          0.0074338  0.0079817   0.931   0.3760  
## x.9         -0.0031835  0.0081450  -0.391   0.7050  
## x.10         0.0043326  0.0084918   0.510   0.6222  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02378 on 9 degrees of freedom
## Multiple R-squared:  0.6249, Adjusted R-squared:  0.1663 
## F-statistic: 1.363 on 11 and 9 DF,  p-value: 0.3263
## 
## AIC and BIC values for the model:
##         AIC       BIC
## 1 -89.23348 -75.65468

vif(hum_dlm$model)

##      x.t      x.1      x.2      x.3      x.4      x.5      x.6      x.7 
## 1.925010 2.083771 1.978215 2.200686 1.986807 1.787024 1.553067 1.526096 
##      x.8      x.9     x.10 
## 1.503189 1.573971 1.745843

From the temperature series, we obtained Adjusted R-squared: 0.1663 ,p-value: 0.3263 >0.05 and AIC:-89.23348

It is observed that the values of information criteria as well as MASE decrease as the lag q increases, so we will fit a finite DLM with a number of lags = 10.

According to the significance tests of model coefficients obtained from the summary, nearly all lag weights of a predictor series are not statistically significant at 5% level. The adjusted R2 for finite_dlm is 0.296, which means that the model explains only 29.6% of the variability in radiation. F-test of the overall significance of the model reports the model is statistically significant at 5% level (p-value < 0.05). However, we conclude that the model is not a good fit to the data due to insignificant terms and low explainability.

There is no issue with multicollinearity in the model, VIF values are reported < 10.

residualcheck <- function(x){
  checkresiduals(x)
 # bgtest(x)
  shapiro.test(x$residuals)
}

Polynomial distributed lag model

Polynomial modelling on univariate

1)Temperature

Temp_polyd3 <- polyDlm(x=as.vector(RBOdata$Temperature), y=as.vector(RBOdata$RBO), q=10,k=2)

## Estimates and t-tests for beta coefficients:
##          Estimate Std. Error t value P(>|t|)
## beta.0   0.009240    0.00932   0.991   0.343
## beta.1   0.008290    0.00704   1.180   0.264
## beta.2   0.007160    0.00608   1.180   0.264
## beta.3   0.005860    0.00605   0.969   0.353
## beta.4   0.004380    0.00626   0.699   0.499
## beta.5   0.002720    0.00631   0.431   0.675
## beta.6   0.000882    0.00611   0.144   0.888
## beta.7  -0.001130    0.00588  -0.192   0.851
## beta.8  -0.003320    0.00618  -0.537   0.602
## beta.9  -0.005690    0.00768  -0.741   0.474
## beta.10 -0.008230    0.01050  -0.780   0.452

summary(Temp_polyd3, diagnostics=T)

## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.048188 -0.013552  0.000488  0.011598  0.039393 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  5.246e-01  5.467e-01   0.960    0.351
## z.t0         9.240e-03  9.322e-03   0.991    0.335
## z.t1        -8.622e-04  4.008e-03  -0.215    0.832
## z.t2        -8.849e-05  4.042e-04  -0.219    0.829
## 
## Residual standard error: 0.02613 on 17 degrees of freedom
## Multiple R-squared:  0.1445, Adjusted R-squared:  -0.00647 
## F-statistic: 0.9571 on 3 and 17 DF,  p-value: 0.4354

residualcheck(Temp_polyd3$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.96513, p-value = 0.6248

checkresiduals(Temp_polyd3$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 7
## 
## data:  Residuals
## LM test = 11.321, df = 7, p-value = 0.1252

From the temperature series, we obtained Adjusted R-squared: -0.00647 ,p-value: 0.4354 >0.05

2)Rain

#better than others
Rain_polyd3 <- polyDlm(x=as.vector(RBOdata$Rainfall), y=as.vector(RBOdata$RBO), q=10,k=2)

## Estimates and t-tests for beta coefficients:
##          Estimate Std. Error t value P(>|t|)
## beta.0   0.015300    0.01110   1.380   0.196
## beta.1   0.009620    0.00737   1.310   0.218
## beta.2   0.004830    0.00576   0.839   0.419
## beta.3   0.000907    0.00580   0.156   0.879
## beta.4  -0.002160    0.00618  -0.349   0.734
## beta.5  -0.004360    0.00611  -0.712   0.491
## beta.6  -0.005690    0.00546  -1.040   0.320
## beta.7  -0.006170    0.00474  -1.300   0.220
## beta.8  -0.005780    0.00549  -1.050   0.315
## beta.9  -0.004520    0.00872  -0.519   0.614
## beta.10 -0.002410    0.01380  -0.174   0.865

summary(Rain_polyd3)

## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.056127 -0.014819  0.002352  0.012277  0.038053 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.7166742  0.1029308   6.963 2.29e-06 ***
## z.t0         0.0152725  0.0110859   1.378    0.186    
## z.t1        -0.0060829  0.0055819  -1.090    0.291    
## z.t2         0.0004315  0.0005823   0.741    0.469    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02592 on 17 degrees of freedom
## Multiple R-squared:  0.1584, Adjusted R-squared:  0.009865 
## F-statistic: 1.066 on 3 and 17 DF,  p-value: 0.3894

residualcheck(Rain_polyd3$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.97008, p-value = 0.7348

checkresiduals(Rain_polyd3$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 7
## 
## data:  Residuals
## LM test = 9.9973, df = 7, p-value = 0.1887

From the temperature series, we obtained Adjusted R-squared: 0.009865,p-value: 0.3894 >0.05

3)Radiation

Rad_polyd3 <- polyDlm(x=as.vector(RBOdata$Radiation), y=as.vector(RBOdata$RBO), q=10,k=2)

## Estimates and t-tests for beta coefficients:
##          Estimate Std. Error t value P(>|t|)
## beta.0  -0.005740    0.00831  -0.691   0.504
## beta.1  -0.002240    0.00532  -0.422   0.681
## beta.2   0.000648    0.00369   0.176   0.864
## beta.3   0.002930    0.00346   0.847   0.415
## beta.4   0.004600    0.00377   1.220   0.248
## beta.5   0.005650    0.00389   1.450   0.174
## beta.6   0.006100    0.00367   1.660   0.125
## beta.7   0.005940    0.00345   1.720   0.113
## beta.8   0.005160    0.00413   1.250   0.237
## beta.9   0.003780    0.00628   0.601   0.560
## beta.10  0.001780    0.00968   0.184   0.857

summary(Rad_polyd3)

## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.048403 -0.012315  0.001545  0.021358  0.033699 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.2975083  0.4879633   0.610    0.550
## z.t0        -0.0057426  0.0083076  -0.691    0.499
## z.t1         0.0038059  0.0039999   0.952    0.355
## z.t2        -0.0003054  0.0004050  -0.754    0.461
## 
## Residual standard error: 0.02599 on 17 degrees of freedom
## Multiple R-squared:  0.1538, Adjusted R-squared:  0.004456 
## F-statistic:  1.03 on 3 and 17 DF,  p-value: 0.4043

residualcheck(Rad_polyd3$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.951, p-value = 0.3557

checkresiduals(Rad_polyd3$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 7
## 
## data:  Residuals
## LM test = 7.121, df = 7, p-value = 0.4164

From the temperature series, we obtained Adjusted R-squared: 0.004456,p-value: 0.4043 >0.05

4)Humidity

Humi_polyd3 <- polyDlm(x=as.vector(RBOdata$RelHumidity), y=as.vector(RBOdata$RBO), q=10,k=2)

## Estimates and t-tests for beta coefficients:
##         Estimate Std. Error t value P(>|t|)
## beta.0  0.002210    0.00585  0.3770   0.713
## beta.1  0.002620    0.00421  0.6210   0.547
## beta.2  0.002890    0.00345  0.8390   0.419
## beta.3  0.003030    0.00339  0.8960   0.390
## beta.4  0.003040    0.00358  0.8490   0.414
## beta.5  0.002910    0.00370  0.7870   0.448
## beta.6  0.002640    0.00366  0.7220   0.486
## beta.7  0.002240    0.00358  0.6260   0.544
## beta.8  0.001710    0.00378  0.4520   0.660
## beta.9  0.001040    0.00464  0.2230   0.828
## beta.10 0.000229    0.00633  0.0362   0.972

summary(Humi_polyd3)

## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.053351 -0.013376 -0.000361  0.013300  0.044624 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.6065641  3.1981382  -0.502    0.622
## z.t0         0.0022052  0.0058503   0.377    0.711
## z.t1         0.0004784  0.0025645   0.187    0.854
## z.t2        -0.0000676  0.0002504  -0.270    0.790
## 
## Residual standard error: 0.0276 on 17 degrees of freedom
## Multiple R-squared:  0.04517,    Adjusted R-squared:  -0.1233 
## F-statistic: 0.2681 on 3 and 17 DF,  p-value: 0.8475

residualcheck(Humi_polyd3$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.96765, p-value = 0.6806

checkresiduals(Humi_polyd3$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 7
## 
## data:  Residuals
## LM test = 5.2541, df = 7, p-value = 0.629

From the temperature series, we obtained Adjusted R-squared: -0.1233,p-value: 0.8475 >0.05

The analysis of residuals from polynomial model in Figure 7 shows the following:

The errors are not randomly spread.

There are a lot of highly significant lags in the ACF plot as well as a wavy pattern at seasonal lags, so there is autocorrelation and seasonality still present in the residuals.

Beusch-Godfrey test reports a p-value < 0.05, therefore there is serial correlation in the residuals at 5% level of significance.

The normality of the residuals is also violated, as observed from the histogram and Shapiro-Wilk normality test report (p-value < 0.05).

Overall, we can conclude that the second order polynomial of lag 10 is not successful at capturing the autocorrelation and seasonality in the series and has low explainability.

Koyck transformation

We will implement Koyck transformation model with precipitation predictor series as follows

K_total = koyckDlm(x=as.vector(RBOdata$Temperature)+as.vector(RBOdata$Rainfall)+as.vector(RBOdata$Radiation)+as.vector(RBOdata$RelHumidity), y=as.vector(RBOdata$RBO))
summary(K_total$model, diagnostics=T)

## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0877331 -0.0443882  0.0004844  0.0327202  0.1376518 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -5.37256    7.12386  -0.754   0.4573  
## Y.1          0.63395    0.27436   2.311   0.0287 *
## X.t          0.04661    0.05834   0.799   0.4313  
## 
## Diagnostic tests:
##                  df1 df2 statistic p-value
## Weak instruments   1  27     1.017   0.322
## Wu-Hausman         1  26     1.787   0.193
## Sargan             0  NA        NA      NA
## 
## Residual standard error: 0.06357 on 27 degrees of freedom
## Multiple R-Squared: -0.8376, Adjusted R-squared: -0.9738 
## Wald test: 2.677 on 2 and 27 DF,  p-value: 0.08698

vif(K_total$model)

##      Y.1      X.t 
## 1.095691 1.095691

1)Temperature

Temp_Koyck3 <- koyckDlm(x=as.vector(RBOdata$Temperature), y=as.vector(RBOdata$RBO))
summary(Temp_Koyck3, diagnostics=T)

## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.15981 -0.04678 -0.01440  0.04750  0.14952 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -1.2032     1.6184  -0.743    0.464
## Y.1           0.2469     0.4609   0.536    0.597
## X.t           0.1847     0.1947   0.949    0.351
## 
## Residual standard error: 0.07557 on 27 degrees of freedom
## Multiple R-Squared: -1.597,  Adjusted R-squared: -1.789 
## Wald test: 2.119 on 2 and 27 DF,  p-value: 0.1397 
## 
## Diagnostic tests:
##                  df1 df2 statistic   p-value
## Weak instruments   1  27  1.010792 0.3236389
## Wu-Hausman         1  26  3.246834 0.0831690
## 
##                              alpha      beta      phi
## Geometric coefficients:  -1.597614 0.1847273 0.246894

vif(Temp_Koyck3$model, diagnostics =T)

##      Y.1      X.t 
## 2.188316 2.188316

From the temperature series, we obtained Adjusted R-squared:-1.789,p-value: 0.1397 >0.05

2)Rainfall

Rain_Koyck3 <- koyckDlm(x=as.vector(RBOdata$Rainfall), y=as.vector(RBOdata$RBO))
summary(Rain_Koyck3,diagnostics=T)

## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.3665 -0.4155 -0.1142  0.3241  1.6012 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   0.3207     2.4302   0.132    0.896
## Y.1          -6.5147   243.8216  -0.027    0.979
## X.t           2.2101    76.0635   0.029    0.977
## 
## Residual standard error: 0.7951 on 27 degrees of freedom
## Multiple R-Squared: -286.5,  Adjusted R-squared: -307.8 
## Wald test: 0.01549 on 2 and 27 DF,  p-value: 0.9846 
## 
## Diagnostic tests:
##                  df1 df2    statistic   p-value
## Weak instruments   1  27 0.0008275768 0.9772615
## Wu-Hausman         1  26 0.3602689549 0.5535531
## 
##                               alpha    beta       phi
## Geometric coefficients:  0.04267914 2.21011 -6.514689

vif(Rain_Koyck3$model,diagnostics =T)

##      Y.1      X.t 
## 5531.807 5531.807

From the temperature series, we obtained Adjusted R-squared:-307.8,p-value: 0.9846 >0.05 3)Radiation

Rad_Koyck3 <- koyckDlm(x=as.vector(RBOdata$Radiation), y=as.vector(RBOdata$RBO))
summary(Rad_Koyck3, diagnostics=T)

## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.082255 -0.017008 -0.001036  0.021424  0.106984 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept) -0.48011    0.94819  -0.506   0.6167   
## Y.1          0.69801    0.24502   2.849   0.0083 **
## X.t          0.04812    0.05661   0.850   0.4028   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0467 on 27 degrees of freedom
## Multiple R-Squared: 0.008467,    Adjusted R-squared: -0.06498 
## Wald test: 4.731 on 2 and 27 DF,  p-value: 0.01732 
## 
## Diagnostic tests:
##                  df1 df2 statistic    p-value
## Weak instruments   1  27  4.941539 0.03478221
## Wu-Hausman         1  26  2.764873 0.10836470
## 
##                              alpha       beta       phi
## Geometric coefficients:  -1.589802 0.04811971 0.6980071

vif(Rad_Koyck3$model)

##      Y.1      X.t 
## 1.619594 1.619594

From the temperature series, we obtained Adjusted R-squared: -0.06498,p-value: 0.01732 <0.05 Results are much good.

4)humidity

Humi_Koyck3 <- koyckDlm(x=as.vector(RBOdata$RelHumidity), y=as.vector(RBOdata$RBO))
summary(Humi_Koyck3,diagnostics=T)

## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.080897 -0.021103 -0.004676  0.022673  0.111041 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -1.16679    8.04941  -0.145   0.8858  
## Y.1          0.62503    0.34753   1.798   0.0833 .
## X.t          0.01525    0.08274   0.184   0.8551  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04127 on 27 degrees of freedom
## Multiple R-Squared: 0.2256,  Adjusted R-squared: 0.1682 
## Wald test: 5.612 on 2 and 27 DF,  p-value: 0.009161 
## 
## Diagnostic tests:
##                  df1 df2  statistic   p-value
## Weak instruments   1  27 0.39261559 0.5361901
## Wu-Hausman         1  26 0.05497691 0.8164556
## 
##                              alpha       beta       phi
## Geometric coefficients:  -3.111733 0.01525207 0.6250339

vif(Humi_Koyck3$model)

##      Y.1      X.t 
## 4.171591 4.171591

From the temperature series, we obtained Adjusted R-squared: 0.1682,p-value: 0.009161 <0.05.Results obtained are better as compared to other koyck models.

From the model summary, we can conclude that all terms of Koyck model are not significant at 5% level except rainfall and humidity. The model is reported to be overall statistically significant at 5% level (p-value < 0.05) and its adjusted R2 is negative which means the model explains about negative variability in rbo

According to the Weak instruments test (p-value > 0.05), the model at the first stage of least-squares estimation is notsignificant at 5% level.

From the Wu-Hausman test (p-value > 0.05), we can conclude that there is no significant correlation between the explanatory variable and the error term at 5% level.

There is no effect of multicollinearity as all VIFs are less than 10.

#Residual analysis univariately

residualcheck(Temp_Koyck3$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.983, p-value = 0.8984

checkresiduals(Temp_Koyck3$model)

residualcheck(Rain_Koyck3$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.96568, p-value = 0.4287

checkresiduals(Rain_Koyck3$model)

residualcheck(Rad_Koyck3$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.96083, p-value = 0.3253

checkresiduals(Rad_Koyck3$model)

residualcheck(Humi_Koyck3$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.97232, p-value = 0.6044

checkresiduals(Humi_Koyck3$model)

From the residual analysis in Figure 8, we can conclude the following:

The errors are not spread randomly.

All the lags in ACF plot are significant and have a wave-like pattern, which suggests serial correlation and seasonality remaining in the residuals.

The errors are not normal. The histogram and the Shapiro-Wilk normality test with p-value < 0.05 suggest not normal residuals.

Overall, we can conclude that the Koyck model is also not successful at capturing the autocorrelation and seasonality in the series.

Autoregressive distributed lag models

The final model type from time series regression method is Autoregressive distributed lag models. For specifying the parameters of ARDL(p,q), we create a loop that fits autoregressive DLMs for a range of lag lengths and orders of the AR process and obtains their accuracy measures, like AIC/BIC and MASE.

Three models with lowest values of MASE were chosen for fitting and analysis. The models were:

ARDL(3,5)

ARDL(4,5)

ARDL(5,5)

We create a loop to fit these candidate models and do residual analysis in a dynamical way.

1)Temperature

for (i in 1:5){
  for(j in 1:5){
    modtemp = ardlDlm(x=as.vector(RBOdata$Temperature), y=as.vector(RBOdata$RBO))
    cat("p =", i, "q =", j, "AIC =", AIC(modtemp$model), "BIC =", BIC(modtemp$model), "MASE =", MASE(modtemp)$MASE, "\n")
  }
}

## p = 1 q = 1 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 1 q = 2 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 1 q = 3 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 1 q = 4 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 1 q = 5 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 2 q = 1 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 2 q = 2 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 2 q = 3 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 2 q = 4 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 2 q = 5 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 3 q = 1 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 3 q = 2 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 3 q = 3 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 3 q = 4 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 3 q = 5 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 4 q = 1 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 4 q = 2 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 4 q = 3 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 4 q = 4 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 4 q = 5 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 5 q = 1 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 5 q = 2 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 5 q = 3 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 5 q = 4 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 5 q = 5 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316

2)Rainfall

for (i in 1:5){
  for(j in 1:5){
    modrain = ardlDlm(x=as.vector(RBOdata$Rainfall), y=as.vector(RBOdata$RBO))
    cat("p =", i, "q =", j, "AIC =", AIC(modrain$model), "BIC =", BIC(modrain$model), "MASE =", MASE(modrain)$MASE, "\n")
  }
}

## p = 1 q = 1 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 1 q = 2 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 1 q = 3 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 1 q = 4 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 1 q = 5 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 2 q = 1 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 2 q = 2 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 2 q = 3 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 2 q = 4 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 2 q = 5 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 3 q = 1 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 3 q = 2 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 3 q = 3 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 3 q = 4 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 3 q = 5 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 4 q = 1 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 4 q = 2 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 4 q = 3 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 4 q = 4 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 4 q = 5 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 5 q = 1 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 5 q = 2 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 5 q = 3 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 5 q = 4 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 5 q = 5 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275

3)Radiation

#
for (i in 1:5){
  for(j in 1:5){
    modrad = ardlDlm(x=as.vector(RBOdata$Radiation), y=as.vector(RBOdata$RBO))
    cat("p =", i, "q =", j, "AIC =", AIC(modrad$model), "BIC =", BIC(modrad$model), "MASE =", MASE(modrad)$MASE, "\n")
  }
}

## p = 1 q = 1 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 1 q = 2 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 1 q = 3 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 1 q = 4 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 1 q = 5 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 2 q = 1 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 2 q = 2 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 2 q = 3 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 2 q = 4 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 2 q = 5 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 3 q = 1 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 3 q = 2 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 3 q = 3 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 3 q = 4 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 3 q = 5 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 4 q = 1 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 4 q = 2 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 4 q = 3 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 4 q = 4 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 4 q = 5 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 5 q = 1 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 5 q = 2 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 5 q = 3 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 5 q = 4 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 5 q = 5 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648

4)Humidity

for (i in 1:5){
  for(j in 1:5){
    modhum = ardlDlm(x=as.vector(RBOdata$RelHumidity), y=as.vector(RBOdata$RBO))
    cat("p =", i, "q =", j, "AIC =", AIC(modhum$model), "BIC =", BIC(modhum$model), "MASE =", MASE(modhum)$MASE, "\n")
  }
}

## p = 1 q = 1 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 1 q = 2 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 1 q = 3 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 1 q = 4 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 1 q = 5 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 2 q = 1 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 2 q = 2 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 2 q = 3 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 2 q = 4 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 2 q = 5 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 3 q = 1 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 3 q = 2 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 3 q = 3 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 3 q = 4 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 3 q = 5 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 4 q = 1 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 4 q = 2 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 4 q = 3 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 4 q = 4 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 4 q = 5 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 5 q = 1 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 5 q = 2 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 5 q = 3 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 5 q = 4 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 5 q = 5 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564

Lowest Mase with lowest aic in temperature so taking up temp data further

for (i in c(3,4,5)){
  ardl3_temp <- ardlDlm(x=as.vector(RBOdata$Temperature), y=as.vector(RBOdata$RBO), p = i, q = 5)
  summary(ardl3_temp)
  #bgtest(ardl3_temp$model)
  #residualcheck(ardl3_temp$model)
  
}

## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.051613 -0.010047  0.000777  0.020277  0.040746 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.09995    0.32029   0.312    0.759
## X.t          0.01654    0.02173   0.761    0.458
## X.1          0.03858    0.02603   1.482    0.158
## X.2         -0.01886    0.02632  -0.717    0.484
## X.3         -0.03100    0.02251  -1.377    0.187
## Y.1          0.30160    0.25501   1.183    0.254
## Y.2          0.26910    0.28224   0.953    0.355
## Y.3          0.11888    0.23217   0.512    0.616
## Y.4          0.05585    0.24529   0.228    0.823
## Y.5          0.04069    0.22768   0.179    0.860
## 
## Residual standard error: 0.03121 on 16 degrees of freedom
## Multiple R-squared:  0.6393, Adjusted R-squared:  0.4364 
## F-statistic: 3.151 on 9 and 16 DF,  p-value: 0.02187
## 
## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.052918 -0.009863  0.003109  0.020643  0.043277 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.039481   0.360467   0.110    0.914
## X.t          0.017090   0.022358   0.764    0.457
## X.1          0.038753   0.026735   1.450    0.168
## X.2         -0.022368   0.028347  -0.789    0.442
## X.3         -0.029645   0.023352  -1.269    0.224
## X.4          0.009898   0.024127   0.410    0.687
## Y.1          0.332351   0.272427   1.220    0.241
## Y.2          0.273800   0.290099   0.944    0.360
## Y.3          0.064754   0.272518   0.238    0.815
## Y.4          0.052400   0.252066   0.208    0.838
## Y.5          0.036593   0.234047   0.156    0.878
## 
## Residual standard error: 0.03206 on 15 degrees of freedom
## Multiple R-squared:  0.6433, Adjusted R-squared:  0.4055 
## F-statistic: 2.705 on 10 and 15 DF,  p-value: 0.04003
## 
## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.054872 -0.009819  0.003510  0.019692  0.041115 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.088818   0.433502   0.205    0.841
## X.t          0.016838   0.023130   0.728    0.479
## X.1          0.037499   0.028194   1.330    0.205
## X.2         -0.023297   0.029586  -0.787    0.444
## X.3         -0.027820   0.025485  -1.092    0.293
## X.4          0.009152   0.025155   0.364    0.721
## X.5         -0.005800   0.026075  -0.222    0.827
## Y.1          0.346273   0.288366   1.201    0.250
## Y.2          0.259300   0.306758   0.845    0.412
## Y.3          0.063980   0.281607   0.227    0.824
## Y.4          0.077358   0.283595   0.273    0.789
## Y.5          0.037752   0.241891   0.156    0.878
## 
## Residual standard error: 0.03312 on 14 degrees of freedom
## Multiple R-squared:  0.6446, Adjusted R-squared:  0.3653 
## F-statistic: 2.308 on 11 and 14 DF,  p-value: 0.07142

#checkresiduals(ardl3_temp$model)

Regarding model coefficient estimates, we can observe for ARDL(3,5) only X.2 lag of predictor series is significant at 5% level (p-value = 0.02187 < 0.05), for ARDL(4,5) only X.4 lag of predictor series is significant at 5% level (p-value = 0.04003 < 0.05), and all lags of predictor series are not statistically significant at 5% level for ARDL(5,5). All lags of independent series are statistically significant in all models except Y.2 (p-value = 0.7829 > 0.05).

The plots from diagnostic checking in Figure 9 show that there is a very similar overall picture in residuals from all three fitted models:

The residuals are not as randomly spread as desired, they show evidence of changing variance.

There are a some highly significant lags in the ACF plot. The seasonal lags are also highly significant. Therefore, there is autocorrelation and seasonality still present in the residuals.

Beusch-Godfrey test reports a p-value < 0.05, therefore there is serial correlation in the residuals at 5% level of significance.

Long tails on the histogram of residuals suggest the normality of the residuals is violated.

Based on the observation about model estimates made earlier, we can try to decrease the number of lags for predictor series. We will fit ARDL(1,5) and perform diagnostic checking.

1)Temperature

ardl3_Temp15 <- ardlDlm(x = as.vector(RBOdata$Temperature), y = as.vector(RBOdata$RBO), p=1, q=5)
summary(ardl3_Temp15)

## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.065387 -0.009906  0.006212  0.016715  0.038292 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.14864    0.26458  -0.562    0.581
## X.t          0.01261    0.02137   0.590    0.563
## X.1          0.03374    0.02549   1.323    0.202
## Y.1          0.32065    0.24308   1.319    0.204
## Y.2          0.07062    0.25490   0.277    0.785
## Y.3          0.10359    0.23352   0.444    0.663
## Y.4          0.03316    0.24215   0.137    0.893
## Y.5          0.06797    0.19677   0.345    0.734
## 
## Residual standard error: 0.03159 on 18 degrees of freedom
## Multiple R-squared:  0.5843, Adjusted R-squared:  0.4226 
## F-statistic: 3.614 on 7 and 18 DF,  p-value: 0.01312

residualcheck(ardl3_Temp15$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.92583, p-value = 0.06171

Based on the observation about model ,p-value: 0.01312 <0.05, Residual standard error: 0.03159 and Adjusted R-squared: 0.4226

2)Rainfall

ardl3_Rain15 <- ardlDlm(x = as.vector(RBOdata$Rainfall), y = as.vector(RBOdata$RBO), p=1, q=5)
summary(ardl3_Rain15)

## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.078127 -0.017340  0.005343  0.014866  0.039246 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.228993   0.134816   1.699    0.107
## X.t          0.019647   0.018162   1.082    0.294
## X.1          0.007268   0.018640   0.390    0.701
## Y.1          0.372964   0.248591   1.500    0.151
## Y.2          0.260567   0.244988   1.064    0.302
## Y.3          0.174384   0.214252   0.814    0.426
## Y.4         -0.203452   0.196005  -1.038    0.313
## Y.5         -0.008021   0.196922  -0.041    0.968
## 
## Residual standard error: 0.0326 on 18 degrees of freedom
## Multiple R-squared:  0.5575, Adjusted R-squared:  0.3854 
## F-statistic: 3.239 on 7 and 18 DF,  p-value: 0.02091

residualcheck(ardl3_Rain15$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.94165, p-value = 0.147

Based on the observation about model ,p-value: 0.02091 <0.05, Residual standard error: 0.0326 and Adjusted R-squared: 0.3854

3)Radiation

#best results
ardl3_Rad15 <- ardlDlm(x = as.vector(RBOdata$Radiation), y = as.vector(RBOdata$RBO), p=1, q=5)
summary(ardl3_Rad15)

## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.058184 -0.015578  0.002216  0.017187  0.043461 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  0.12704    0.37571   0.338   0.7392  
## X.t         -0.02852    0.01654  -1.724   0.1017  
## X.1          0.03195    0.01691   1.889   0.0751 .
## Y.1          0.54526    0.21425   2.545   0.0203 *
## Y.2          0.24093    0.21721   1.109   0.2819  
## Y.3          0.06064    0.19795   0.306   0.7629  
## Y.4         -0.21453    0.18353  -1.169   0.2577  
## Y.5          0.12092    0.18766   0.644   0.5275  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02995 on 18 degrees of freedom
## Multiple R-squared:  0.6264, Adjusted R-squared:  0.4811 
## F-statistic: 4.311 on 7 and 18 DF,  p-value: 0.005806

residualcheck(ardl3_Rad15$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.98447, p-value = 0.9519

Based on the observation about model ,p-value: 0.005806 <0.05, Residual standard error: 0.02995 and Adjusted R-squared: 0.4811

4)Humidity

ardl3_Hum15 <- ardlDlm(x = as.vector(RBOdata$RelHumidity), y = as.vector(RBOdata$RBO), p=1, q=5)
summary(ardl3_Hum15)

## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.07050 -0.01625  0.00118  0.01782  0.04034 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -0.831753   1.210326  -0.687   0.5007  
## X.t          0.006641   0.009130   0.727   0.4764  
## X.1          0.003776   0.009215   0.410   0.6868  
## Y.1          0.446855   0.235475   1.898   0.0739 .
## Y.2          0.312764   0.264204   1.184   0.2519  
## Y.3          0.211856   0.234960   0.902   0.3791  
## Y.4         -0.188857   0.198286  -0.952   0.3535  
## Y.5          0.002712   0.198356   0.014   0.9892  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.03306 on 18 degrees of freedom
## Multiple R-squared:  0.5449, Adjusted R-squared:  0.3679 
## F-statistic: 3.079 on 7 and 18 DF,  p-value: 0.02568

residualcheck(ardl3_Hum15$model)

## 
##  Shapiro-Wilk normality test
## 
## data:  x$residuals
## W = 0.95958, p-value = 0.3833

Based on the observation about model ,p-value: 0.02568 >0.05, Residual standard error: 0.03306 and Adjusted R-squared: 0.3679

The plots from diagnostic checking in Figure 10 show the same picture as the diagnostic checkings in Figure 9, so the comments are the same as for previously fitted models.

Overall, none of the models from time series regression method were successful at capturing the autocorrelation and seasonal pattern in radiation series.

We create a data frame accuracy to store the accuracy measures, like AIC/BIC and MASE from the models fitted so far. The accuracy measures for further models will be appended to this data frame.

attr(K_total$model,"class") = "lm"

Univariate Ardl modelling 1) Temperature

ardl3_Temp35 <- ardlDlm(x = as.vector(RBOdata$Temperature), y = as.vector(RBOdata$RBO), p=3, q=5)
ardl3_Temp45 <- ardlDlm(x = as.vector(RBOdata$Temperature), y = as.vector(RBOdata$RBO), p=4, q=5)
ardl3_Temp55 <- ardlDlm(x = as.vector(RBOdata$Temperature), y = as.vector(RBOdata$RBO), p=5, q=5)

models <- c("Temp_DLM3", "Temp_PolyD3", "Temp_Koyck3", "ARDL3_temp15", "ARDL3_temp35", "ARDL3_temp45", "ARDL3_temp55")
aic_a <- AIC(temp_dlm, Temp_polyd3, Temp_Koyck3, ardl3_Temp15, ardl3_Temp35, ardl3_Temp45, ardl3_Temp55)

## [1] -87.92965

bic_a <- BIC(temp_dlm, Temp_polyd3, Temp_Koyck3, ardl3_Temp15, ardl3_Temp35, ardl3_Temp45, ardl3_Temp55)

## [1] -74.35085

MASE_a <- MASE(temp_dlm, Temp_polyd3, Temp_Koyck3, ardl3_Temp15, ardl3_Temp35, ardl3_Temp45, ardl3_Temp55)
accuracy_a <- data.frame(models, MASE_a, aic_a, bic_a )
colnames(accuracy_a) <- c("Model", "MASE", "AIC", "BIC")
head(accuracy_a)

##                     Model MASE       AIC       BIC        NA
## temp_dlm        Temp_DLM3   21 0.4585880 -87.92965 -74.35085
## Temp_polyd3   Temp_PolyD3   21 0.6742057 -87.92965 -74.35085
## Temp_Koyck3   Temp_Koyck3   30 1.9150155 -87.92965 -74.35085
## ardl3_Temp15 ARDL3_temp15   26 0.7735245 -87.92965 -74.35085
## ardl3_Temp35 ARDL3_temp35   26 0.7661286 -87.92965 -74.35085
## ardl3_Temp45 ARDL3_temp45   26 0.7643666 -87.92965 -74.35085

2)Rainfall

ardl3_Rain35 <- ardlDlm(x = as.vector(RBOdata$Rainfall), y = as.vector(RBOdata$RBO), p=3, q=5)
ardl3_Rain45 <- ardlDlm(x = as.vector(RBOdata$Rainfall), y = as.vector(RBOdata$RBO), p=4, q=5)
ardl3_Rain55 <- ardlDlm(x = as.vector(RBOdata$Rainfall), y = as.vector(RBOdata$RBO), p=5, q=5)

models <- c("Rain_DLM", "Rain_PolyD3", "Rain_Koyck3", "ARDL3_Rain15", "ARDL3_Rain35", "ARDL3_Rain45", "ARDL3_Rain55")
aic_b <- AIC(rain_dlm, Rain_polyd3, Rain_Koyck3, ardl3_Rain15, ardl3_Rain35, ardl3_Rain45, ardl3_Rain55)

## [1] -76.93255

bic_b <- BIC(rain_dlm, Rain_polyd3, Rain_Koyck3, ardl3_Rain15, ardl3_Rain35, ardl3_Rain45, ardl3_Rain55)

## [1] -63.35376

MASE_b <- MASE(rain_dlm, Rain_polyd3, Rain_Koyck3, ardl3_Rain15, ardl3_Rain35, ardl3_Rain45, ardl3_Rain55)
accuracy_b <- data.frame(models, MASE_b, aic_b, bic_b )
colnames(accuracy_b) <- c("Model", "MASE", "AIC", "BIC")
head(accuracy_b)

##                     Model MASE        AIC       BIC        NA
## rain_dlm         Rain_DLM   21  0.5856170 -76.93255 -63.35376
## Rain_polyd3   Rain_PolyD3   21  0.6451804 -76.93255 -63.35376
## Rain_Koyck3   Rain_Koyck3   30 19.1057647 -76.93255 -63.35376
## ardl3_Rain15 ARDL3_Rain15   26  0.8152025 -76.93255 -63.35376
## ardl3_Rain35 ARDL3_Rain35   26  0.8095313 -76.93255 -63.35376
## ardl3_Rain45 ARDL3_Rain45   26  0.7390848 -76.93255 -63.35376

Rainfall model results better compared to others.

3)Radiation

ardl3_Rad35 <- ardlDlm(x = as.vector(RBOdata$Radiation), y = as.vector(RBOdata$RBO), p=3, q=5)
ardl3_Rad45 <- ardlDlm(x = as.vector(RBOdata$Radiation), y = as.vector(RBOdata$RBO), p=4, q=5)
ardl3_Rad55 <- ardlDlm(x = as.vector(RBOdata$Radiation), y = as.vector(RBOdata$RBO), p=5, q=5)

models <- c("Rad_DLM", "Rain_PolyD3", "Rain_Koyck3", "ARDL3_Rain15", "ARDL3_Rain35", "ARDL3_Rain45", "ARDL3_Rain55")
aic_c <- AIC(rad_dlm, Rad_polyd3, Rad_Koyck3, ardl3_Rad15, ardl3_Rad35, ardl3_Rad45, ardl3_Rad55)

## [1] -78.06879

bic_c <- BIC(rad_dlm, Rad_polyd3, Rad_Koyck3, ardl3_Rad15, ardl3_Rad35, ardl3_Rad45, ardl3_Rad55)

## [1] -64.49

MASE_c <- MASE(rad_dlm, Rad_polyd3, Rad_Koyck3, ardl3_Rad15, ardl3_Rad35, ardl3_Rad45, ardl3_Rad55)
accuracy_c <- data.frame(models, MASE_c, aic_c, bic_c )
colnames(accuracy_c) <- c("Model", "MASE", "AIC", "BIC")
head(accuracy_c)

##                    Model MASE       AIC       BIC     NA
## rad_dlm          Rad_DLM   21 0.6037290 -78.06879 -64.49
## Rad_polyd3   Rain_PolyD3   21 0.6711964 -78.06879 -64.49
## Rad_Koyck3   Rain_Koyck3   30 1.0314227 -78.06879 -64.49
## ardl3_Rad15 ARDL3_Rain15   26 0.7627672 -78.06879 -64.49
## ardl3_Rad35 ARDL3_Rain35   26 0.7009101 -78.06879 -64.49
## ardl3_Rad45 ARDL3_Rain45   26 0.7052516 -78.06879 -64.49

4)Humidity

ardl3_Humi35 <- ardlDlm(x = as.vector(RBOdata$Rainfall), y = as.vector(RBOdata$RBO), p=3, q=5)
ardl3_Humi45 <- ardlDlm(x = as.vector(RBOdata$Rainfall), y = as.vector(RBOdata$RBO), p=4, q=5)
ardl3_Humi55 <- ardlDlm(x = as.vector(RBOdata$Rainfall), y = as.vector(RBOdata$RBO), p=5, q=5)

models <- c("Hum_DLM", "Humi_PolyD3", "Humi_Koyck3", "ARDL3_Hum15", "ARDL3_Humi35", "ARDL3_Humi45", "ARDL3_Humi55")
aic_d <- AIC(hum_dlm, Humi_polyd3, Humi_Koyck3, ardl3_Hum15, ardl3_Humi35, ardl3_Humi45, ardl3_Humi55)

## [1] -89.23348

bic_d <- BIC(hum_dlm, Humi_polyd3, Humi_Koyck3, ardl3_Hum15, ardl3_Humi35, ardl3_Humi45, ardl3_Humi55)

## [1] -75.65468

MASE_d <- MASE(hum_dlm, Humi_polyd3, Humi_Koyck3, ardl3_Hum15, ardl3_Humi35, ardl3_Humi45, ardl3_Humi55)
accuracy_d <- data.frame(models, MASE_d, aic_d, bic_d )
colnames(accuracy_d) <- c("Model", "MASE", "AIC", "BIC")
head(accuracy_d)

##                     Model MASE       AIC       BIC        NA
## hum_dlm           Hum_DLM   21 0.4518490 -89.23348 -75.65468
## Humi_polyd3   Humi_PolyD3   21 0.6821919 -89.23348 -75.65468
## Humi_Koyck3   Humi_Koyck3   30 0.9559618 -89.23348 -75.65468
## ardl3_Hum15   ARDL3_Hum15   26 0.8156142 -89.23348 -75.65468
## ardl3_Humi35 ARDL3_Humi35   26 0.8095313 -89.23348 -75.65468
## ardl3_Humi45 ARDL3_Humi45   26 0.7390848 -89.23348 -75.65468

Humidity results have best accuracy

Forecasting

For deciding on the final model to give three years ahead forecasts of solar radiation, we compare forecasts from three models:

fit.auto =ets(RBO_ts,model="ZZZ",ic="bic")
fit.auto$method

## [1] "ETS(M,N,N)"

f1.etsM = ets(RBO_ts, model="MNN")
summary(f1.etsM)

## ETS(M,N,N) 
## 
## Call:
##  ets(y = RBO_ts, model = "MNN") 
## 
##   Smoothing parameters:
##     alpha = 0.4421 
## 
##   Initial states:
##     l = 0.7685 
## 
##   sigma:  0.0479
## 
##       AIC      AICc       BIC 
## -96.69180 -95.80291 -92.38984 
## 
## Training set error measures:
##                        ME       RMSE        MAE        MPE     MAPE      MASE
## Training set -0.003529451 0.03466016 0.02607253 -0.6459794 3.549133 0.8460683
##                    ACF1
## Training set -0.0753464

checkresiduals(f1.etsM)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(M,N,N)
## Q* = 2.2852, df = 4, p-value = 0.6835
## 
## Model df: 2.   Total lags used: 6

Holt-Winters’ multiplicative method which has the lowest MASE and is the most successful at capturing the autocorrelation and seasonality in the series

Holt-Winters’ multiplicative method with multiplicative trend which has the second lowest MASE and is also good at capturing the autocorrelation and seasonality in the series

ETS(M,N,N) model was suggested by an automatic algorithm and has the lowest MASE of all state-space models but does not capture autocorrelation in the series

1)simple exponential forecast

f1 <- ses(RBO_ts, alpha=0.1, initial="simple", h=3) # Set alpha to a small value
summary(f1)

## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = RBO_ts, h = 3, initial = "simple", alpha = 0.1) 
## 
##   Smoothing parameters:
##     alpha = 0.1 
## 
##   Initial states:
##     l = 0.755 
## 
##   sigma:  0.0406
## Error measures:
##                        ME      RMSE        MAE       MPE     MAPE     MASE
## Training set -0.009995034 0.0406462 0.03161539 -1.640631 4.337625 1.025937
##                   ACF1
## Training set 0.4122565
## 
## Forecasts:
##      Point Forecast     Lo 80     Hi 80     Lo 95     Hi 95
## 2015      0.7240242 0.6719340 0.7761144 0.6443591 0.8036893
## 2016      0.7240242 0.6716742 0.7763742 0.6439618 0.8040866
## 2017      0.7240242 0.6714157 0.7766327 0.6435664 0.8044820

checkresiduals(f1)

## 
##  Ljung-Box test
## 
## data:  Residuals from Simple exponential smoothing
## Q* = 17.699, df = 4, p-value = 0.001413
## 
## Model df: 2.   Total lags used: 6

2)Holts simple forecast

f2 <- holt(RBO_ts,initial = "simple",h=3)
summary(f2)

## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = RBO_ts, h = 3, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.5678 
##     beta  = 0.0888 
## 
##   Initial states:
##     l = 0.755 
##     b = -0.0143 
## 
##   sigma:  0.0376
## Error measures:
##                       ME       RMSE        MAE       MPE     MAPE      MASE
## Training set 0.008262872 0.03758621 0.02896041 0.9541908 3.905009 0.9397818
##                    ACF1
## Training set -0.1449597
## 
## Forecasts:
##      Point Forecast     Lo 80     Hi 80     Lo 95     Hi 95
## 2015      0.7162102 0.6680415 0.7643789 0.6425426 0.7898778
## 2016      0.7148677 0.6572445 0.7724909 0.6267407 0.8029948
## 2017      0.7135252 0.6456320 0.7814184 0.6096915 0.8173589

checkresiduals(f2)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method
## Q* = 2.8477, df = 3, p-value = 0.4157
## 
## Model df: 4.   Total lags used: 7

3)Holts with exponential trend

f3 <- holt(RBO_ts, initial="simple", exponential=TRUE, h=3)
# Fit with exponential trend
summary(f3)

## 
## Forecast method: Holt's method with exponential trend
## 
## Model Information:
## Holt's method with exponential trend 
## 
## Call:
##  holt(y = RBO_ts, h = 3, initial = "simple", exponential = TRUE) 
## 
##   Smoothing parameters:
##     alpha = 0.5667 
##     beta  = 0.0845 
## 
##   Initial states:
##     l = 0.755 
##     b = 0.9811 
## 
##   sigma:  0.0514
## Error measures:
##                       ME       RMSE        MAE       MPE     MAPE      MASE
## Training set 0.008053192 0.03737566 0.02865753 0.9221333 3.861994 0.9299532
##                    ACF1
## Training set -0.1461854
## 
## Forecasts:
##      Point Forecast     Lo 80     Hi 80     Lo 95     Hi 95
## 2015      0.7164020 0.6685812 0.7621690 0.6454951 0.7897139
## 2016      0.7151840 0.6604416 0.7717697 0.6324023 0.8036000
## 2017      0.7139681 0.6478010 0.7801527 0.6163925 0.8181239

checkresiduals(f3)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method with exponential trend
## Q* = 2.8343, df = 3, p-value = 0.4179
## 
## Model df: 4.   Total lags used: 7

4)Additive damped holts method

f4 <- holt(RBO_ts, damped=TRUE, initial="simple", h=3) 
# Fit with additive damped trend
summary(f4)

## 
## Forecast method: Damped Holt's method
## 
## Model Information:
## Damped Holt's method 
## 
## Call:
##  holt(y = RBO_ts, h = 3, damped = TRUE, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.4773 
##     beta  = 1e-04 
##     phi   = 0.8 
## 
##   Initial states:
##     l = 0.7542 
##     b = 0.0082 
## 
##   sigma:  0.0377
## 
##       AIC      AICc       BIC 
## -90.15879 -86.65879 -81.55487 
## 
## Error measures:
##                        ME       RMSE        MAE        MPE     MAPE      MASE
## Training set -0.004553134 0.03457183 0.02527824 -0.7722916 3.450154 0.8202932
##                    ACF1
## Training set -0.1237046
## 
## Forecasts:
##      Point Forecast     Lo 80     Hi 80     Lo 95     Hi 95
## 2015      0.7195752 0.6711967 0.7679537 0.6455866 0.7935638
## 2016      0.7195801 0.6659717 0.7731885 0.6375931 0.8015670
## 2017      0.7195840 0.6612112 0.7779567 0.6303106 0.8088574

checkresiduals(f4)

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt's method
## Q* = 3.3959, df = 3, p-value = 0.3345
## 
## Model df: 5.   Total lags used: 8

plot(f1, type="l", ylab="Similarity of RBO order wrt FFD", xlab="Year",main="Fig.13 forecasting of fitted models",
     fcol="white", plot.conf=FALSE)
lines(fitted(f1), col="blue")
lines(fitted(f2), col="red")
lines(fitted(f3), col="green")
lines(fitted(f4), col="cyan")
lines(f1$mean, col="blue", type="l")
lines(f2$mean, col="red", type="l")
lines(f3$mean, col="green", type="l")
lines(f4$mean, col="brown", type="l")
legend("topright", lty=1, col=c("black","blue","red","green","cyan"),c("Data","SES", "Holt's linear trend", "Exponential trend","Additive damped trend"))

The fitted values and 3 year forecasts are displayed in Figure 13

However, we can observe that the 95% confidence intervals for the forecasts from selected approach are very precise and provide reliable forecasts.

Task 3(b)

knitr::opts_chunk$set(echo = TRUE)

library(TSA)
library(car)
library(dynlm)
library(Hmisc)
library(forecast)
library(stats)

Loading dataset RBO.csv for task 3(b)

#recalling RBO data
class(RBOdata)

## [1] "data.frame"

head(RBOdata)

##   ï..Year       RBO Temperature Rainfall Radiation RelHumidity
## 1    1984 0.7550088    9.371585 2.489344  14.87158    93.92650
## 2    1985 0.7407520    9.656164 2.475890  14.68493    94.93589
## 3    1986 0.8423860    9.273973 2.421370  14.51507    94.09507
## 4    1987 0.7484425    9.219178 2.319726  14.67397    94.49699
## 5    1988 0.7984084   10.202186 2.465301  14.74863    94.08142
## 6    1989 0.7938803    9.441096 2.735890  14.78356    96.08685

Convert data into a time series object

RBO.ts = matrix(RBOdata$RBO, nrow = 25, ncol = 12)
RBO.ts = as.vector(t(RBO.ts))
RBO.ts = ts(RBO.ts,start=c(1984,1), end=c(2014,1), frequency=2)
class(RBO.ts)

## [1] "ts"

plot(RBO.ts,ylab='RBO similarity of the order of FFD',xlab='Year',type='o',
     main = "Time series plot of RBOs.")

acf(RBO.ts,max.lag = 48, main="Sample ACF for RBOs")

# Intervention results in an immediate and permanent shift in the mean function
RBO.tr = log(RBO.ts)
plot(RBO.tr,ylab='Log of landings in metric tons',xlab='Year',
     main = "Fig.14 Time series plot of the logarithm of yearly
similarity of order of RBOs.")
points(y=RBO.tr,x=time(RBO.tr), pch=as.vector(season(RBO.tr)))

Observations made from Figure 14 plot, we can make the following comments on the characteristics of the series: • There is a possibility of a slight downward trend, especially in the beginning of the series. • Seasonality is present strongly, though the pattern changes overtime, we can say that lower values are observed in July and August and higher values in December-January. • Since the series is Seasonal,the changing variance and behaviour of the series is not apparent due to seasonality. • Intervention on points is absent.

Dynlm Modelling univariately

Y.t = RBO.tr
T = 96
S.t = 1*(seq(Y.t) >= T)
S.t.1 = Lag(S.t,+1)
model31 = dynlm(Y.t ~ L(Y.t , k = 1 ) + S.t + trend(Y.t) + season(Y.t))
summary(model31)

## 
## Time series regression with "ts" data:
## Start = 1984(2), End = 2014(1)
## 
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 1) + S.t + trend(Y.t) + season(Y.t))
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.122950 -0.043817 -0.009939  0.030096  0.119378 
## 
## Coefficients: (1 not defined because of singularities)
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -2.343e-01  4.350e-02  -5.387 1.47e-06 ***
## L(Y.t, k = 1)  1.730e-01  1.331e-01   1.300    0.199    
## S.t                   NA         NA      NA       NA    
## trend(Y.t)    -9.609e-05  8.560e-04  -0.112    0.911    
## season(Y.t)2  -2.037e-02  1.494e-02  -1.363    0.178    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.05729 on 56 degrees of freedom
## Multiple R-squared:  0.05307,    Adjusted R-squared:  0.002347 
## F-statistic: 1.046 on 3 and 56 DF,  p-value: 0.3793

model31.2 = dynlm(Y.t ~ L(Y.t , k = 1 ) + S.t + season(Y.t))
summary(model31.2)

## 
## Time series regression with "ts" data:
## Start = 1984(2), End = 2014(1)
## 
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 1) + S.t + season(Y.t))
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.124162 -0.044216 -0.009184  0.029622  0.119925 
## 
## Coefficients: (1 not defined because of singularities)
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -0.23559    0.04166  -5.655 5.24e-07 ***
## L(Y.t, k = 1)  0.17391    0.13166   1.321    0.192    
## S.t                 NA         NA      NA       NA    
## season(Y.t)2  -0.02034    0.01481  -1.373    0.175    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.05679 on 57 degrees of freedom
## Multiple R-squared:  0.05286,    Adjusted R-squared:  0.01963 
## F-statistic: 1.591 on 2 and 57 DF,  p-value: 0.2127

model31.3 = dynlm(Y.t ~ L(Y.t , k = 1 ) + S.t + trend(Y.t) )
summary(model31.3)

## 
## Time series regression with "ts" data:
## Start = 1984(2), End = 2014(1)
## 
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 1) + S.t + trend(Y.t))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.11298 -0.04540 -0.01214  0.02424  0.12738 
## 
## Coefficients: (1 not defined because of singularities)
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -2.525e-01  4.173e-02  -6.050 1.19e-07 ***
## L(Y.t, k = 1)  1.476e-01  1.327e-01   1.112    0.271    
## S.t                   NA         NA      NA       NA    
## trend(Y.t)    -7.265e-05  8.622e-04  -0.084    0.933    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.05772 on 57 degrees of freedom
## Multiple R-squared:  0.02165,    Adjusted R-squared:  -0.01267 
## F-statistic: 0.6308 on 2 and 57 DF,  p-value: 0.5358

aic = AIC(model31, model31.2, model31.3)
bic = BIC(model31, model31.2, model31.3)
aic

##           df       AIC
## model31    5 -167.0321
## model31.2  4 -169.0186
## model31.3  4 -167.0735

bic

##           df       BIC
## model31    5 -156.5604
## model31.2  4 -160.6413
## model31.3  4 -158.6961

model32 = dynlm(Y.t ~ L(Y.t , k = 2 ) + S.t + trend(Y.t) + season(Y.t))
summary(model32)

## 
## Time series regression with "ts" data:
## Start = 1985(1), End = 2014(1)
## 
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 2) + S.t + trend(Y.t) + season(Y.t))
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.127912 -0.040066 -0.007146  0.034405  0.118657 
## 
## Coefficients: (1 not defined because of singularities)
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -0.3483867  0.0407243  -8.555  1.1e-11 ***
## L(Y.t, k = 2) -0.2344598  0.1307355  -1.793   0.0784 .  
## S.t                   NA         NA      NA       NA    
## trend(Y.t)    -0.0005389  0.0008606  -0.626   0.5338    
## season(Y.t)2  -0.0190425  0.0147801  -1.288   0.2030    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.05613 on 55 degrees of freedom
## Multiple R-squared:  0.07605,    Adjusted R-squared:  0.02565 
## F-statistic: 1.509 on 3 and 55 DF,  p-value: 0.2224

model33 = dynlm(Y.t ~ L(Y.t , k = 1 ) + S.t + S.t.1 + trend(Y.t) + season(Y.t))
summary(model33)

## 
## Time series regression with "ts" data:
## Start = 1984(2), End = 2014(1)
## 
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 1) + S.t + S.t.1 + trend(Y.t) + 
##     season(Y.t))
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.122950 -0.043817 -0.009939  0.030096  0.119378 
## 
## Coefficients: (2 not defined because of singularities)
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -2.343e-01  4.350e-02  -5.387 1.47e-06 ***
## L(Y.t, k = 1)  1.730e-01  1.331e-01   1.300    0.199    
## S.t                   NA         NA      NA       NA    
## S.t.1                 NA         NA      NA       NA    
## trend(Y.t)    -9.609e-05  8.560e-04  -0.112    0.911    
## season(Y.t)2  -2.037e-02  1.494e-02  -1.363    0.178    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.05729 on 56 degrees of freedom
## Multiple R-squared:  0.05307,    Adjusted R-squared:  0.002347 
## F-statistic: 1.046 on 3 and 56 DF,  p-value: 0.3793

. 1) Simple exponential forecasting

f31 <- ses(RBO.tr, alpha=0.1, initial="simple", h=3) # Set alpha to a small value
summary(f31)

## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = RBO.tr, h = 3, initial = "simple", alpha = 0.1) 
## 
##   Smoothing parameters:
##     alpha = 0.1 
## 
##   Initial states:
##     l = -0.281 
## 
##   sigma:  0.0591
## Error measures:
##                        ME       RMSE        MAE       MPE     MAPE      MASE
## Training set -0.003149581 0.05912707 0.04786464 -3.448812 17.99458 0.6576632
##                  ACF1
## Training set 0.135882
## 
## Forecasts:
##         Point Forecast      Lo 80     Hi 80      Lo 95      Hi 95
## 2014.50     -0.3002383 -0.3760127 -0.224464 -0.4161253 -0.1843514
## 2015.00     -0.3002383 -0.3763907 -0.224086 -0.4167033 -0.1837734
## 2015.50     -0.3002383 -0.3767667 -0.223710 -0.4172784 -0.1831983

checkresiduals(f31)

## 
##  Ljung-Box test
## 
## data:  Residuals from Simple exponential smoothing
## Q* = 20.508, df = 3, p-value = 0.0001332
## 
## Model df: 2.   Total lags used: 5

Holts simple

f32 <- holt(RBO.tr,initial = "simple",h=3)
summary(f32)

## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = RBO.tr, h = 3, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.7249 
##     beta  = 0.1837 
## 
##   Initial states:
##     l = -0.281 
##     b = -0.0969 
## 
##   sigma:  0.0796
## Error measures:
##                      ME       RMSE        MAE       MPE     MAPE     MASE
## Training set 0.01355766 0.07956267 0.06302001 -8.828896 24.08236 0.865899
##                    ACF1
## Training set 0.01998612
## 
## Forecasts:
##         Point Forecast      Lo 80       Hi 80      Lo 95        Hi 95
## 2014.50     -0.2288297 -0.3307933 -0.12686599 -0.3847696 -0.072889690
## 2015.00     -0.2155438 -0.3533143 -0.07777332 -0.4262456 -0.004842037
## 2015.50     -0.2022579 -0.3794222 -0.02509370 -0.4732073  0.068691395

checkresiduals(f32)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method
## Q* = 15.226, df = 3, p-value = 0.001634
## 
## Model df: 4.   Total lags used: 7

Holts with exponential trend

f33 <- holt(RBO.tr, initial="simple", exponential=TRUE, h=3) 
# Fit with exponential trend
summary(f33)

## 
## Forecast method: Holt's method with exponential trend
## 
## Model Information:
## Holt's method with exponential trend 
## 
## Call:
##  holt(y = RBO.tr, h = 3, initial = "simple", exponential = TRUE) 
## 
##   Smoothing parameters:
##     alpha = 0.7417 
##     beta  = 0.2581 
## 
##   Initial states:
##     l = -0.281 
##     b = 1.3447 
## 
##   sigma:  0.2658
## Error measures:
##                      ME       RMSE        MAE       MPE     MAPE      MASE
## Training set 0.01935805 0.08295608 0.06495641 -10.75976 24.88312 0.8925052
##                      ACF1
## Training set -0.002607442
## 
## Forecasts:
##         Point Forecast      Lo 80       Hi 80      Lo 95       Hi 95
## 2014.50     -0.2301822 -0.3103707 -0.15370514 -0.3492260 -0.11004616
## 2015.00     -0.2186152 -0.3347460 -0.12546123 -0.4108465 -0.08717152
## 2015.50     -0.2076295 -0.3691222 -0.09681279 -0.4845434 -0.06043851

checkresiduals(f33)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method with exponential trend
## Q* = 15.743, df = 3, p-value = 0.00128
## 
## Model df: 4.   Total lags used: 7

4)Additive Damped holts method

#
f34 <- holt(RBO.tr, damped=TRUE, initial="simple", h=3) 
# Fit with additive damped trend
summary(f34)

## 
## Forecast method: Damped Holt's method
## 
## Model Information:
## Damped Holt's method 
## 
## Call:
##  holt(y = RBO.tr, h = 3, damped = TRUE, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 1e-04 
##     beta  = 1e-04 
##     phi   = 0.98 
## 
##   Initial states:
##     l = -0.2809 
##     b = -6e-04 
## 
##   sigma:  0.0593
## 
##       AIC      AICc       BIC 
## -87.03774 -85.48219 -74.37250 
## 
## Error measures:
##                        ME       RMSE        MAE       MPE    MAPE      MASE
## Training set -0.004210585 0.05685667 0.04609657 -2.873118 17.1346 0.6333698
##                   ACF1
## Training set 0.1502685
## 
## Forecasts:
##         Point Forecast      Lo 80      Hi 80      Lo 95      Hi 95
## 2014.50     -0.3013437 -0.3773918 -0.2252956 -0.4176492 -0.1850381
## 2015.00     -0.3015122 -0.3775603 -0.2254641 -0.4178177 -0.1852066
## 2015.50     -0.3016773 -0.3777254 -0.2256292 -0.4179829 -0.1853718

checkresiduals(f34)

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt's method
## Q* = 30.622, df = 3, p-value = 1.021e-06
## 
## Model df: 5.   Total lags used: 8

The fitted values and 3 year forecasts are displayed in Figure 15

plot(f31, type="l", ylab="Similarity of RBO order wrt FFD", xlab="Year",main="Fig.15 Forecasting of RBO wrt FFd values",
     fcol="white", plot.conf=FALSE)
lines(fitted(f31), col="blue")
lines(fitted(f32), col="red")
lines(fitted(f33), col="green")
lines(fitted(f34), col="cyan")
lines(f31$mean, col="blue", type="l")
lines(f32$mean, col="red", type="l")
lines(f33$mean, col="green", type="l")
lines(f34$mean, col="brown", type="l")
legend("topright", lty=1, col=c("black","blue","red","green","cyan"),c("Data","SES", "Holt's linear trend", "Exponential trend","Additive damped trend"))

Conclusion: Using Various timeseries analysis and modelling techniques, we have obtained the forescasting for next three years 2015,2016,2017.

Forecasting Final Project

Rashmi Walavalkar, S3804366

10/23/2021

About the Final Project:

Task 1

Introduction

About Dataset

Plot the time series dataset

Scaling down to avoid mismatch

Analysing the non stationarity of data

STL decomposition and x12 decomposition

Modelling process:

3 KOYCK Transformation DL modeling

4 Autoregressive DLM

Exponential smoothening

Finding best fit for each attribute

State-space models variation

Conclusion:

Task 2

Introduction

About Dataset

**Data exploration and visualisation

Finite dlm

Polynomial distributed lag model

Koyck transformation

Exponential smoothing methods

State-space models variations

Forecasting

Task 3

Introduction

About dataset

Task 3(a):

**Time series regression methods

Polynomial distributed lag model

Koyck transformation

Autoregressive distributed lag models

Forecasting

Task 3(b)

Dynlm Modelling univariately