Required Packages

rm(list=ls()) #clears the memory

library(TSA)
library(car)
library(carData)
library(lmtest)
library(zoo)
library(AER) 
library(dynlm)
library(lattice)
library(Formula)
library(ggplot2)
library(Hmisc)
library(forecast)
library(xts)
library(x13binary)
library(x12) 
library(nardl)
library(dLagM)
library(readr) 
library(uroot)
library(tseries)
library(urca)
library(expsmooth)

Introduction

Air pollution is one of our era’s biggest plagues, not only because it influences climate change but also because of its impact on public and individual health through increased illness and death. Several contaminants are important contributors to human illness. Particulate Matter (PM), particles with varying but extremely small diameters, enters the respiratory system by inhalation and causes respiratory and cardiovascular illnesses, reproductive and central nervous system malfunction, and cancer. Human emissions of carbon dioxide and other greenhouse gases are also a major contributor to climate change, one of the world’s most urgent problems. Turning to flowering, flowering patterns are critical for understanding plant reproductive dynamics and pollinator resource availability. Blossom and leaf phenology are constrained by seasonal climate, and leaf and flower hues are likely to change across seasons. To study these patterns, the Rank-based Order (RBO) similarity metric is used to compute the similarity between the yearly flowering order and the blooming order.

The research is broken into three tasks. The first task aims to assess and provide the best four-week-ahead forecasts for the mortality series, including point forecasts and confidence intervals for the optimal model from each technique employed. For this work, we will use the time-series regression technique to build multivariate distributed lag models with weekly mortality as the dependent series. We will also apply exponential smoothing approaches in combination with the corresponding state-space models to forecast the mortality data. We will then compare these approaches using residual assumptions and goodness-of-fit measures. The final goal of this task is to offer four-week-ahead forecasts from the model that is most suitable in terms of mean absolute scaled error (MASE).

The next goal is to model and forecast the first flowering day (FFD) series to produce the best four-year-ahead forecasts, including point forecasts and confidence intervals. In this section, we will use the time-series regression approach to build univariate distributed lag models for the annual FFD series. We will also show how to forecast the FFD data using exponential smoothing methods in combination with the corresponding state-space models. To achieve this aim, we will assess the techniques using residual assumptions and goodness-of-fit measures.

The final and most important task is separated into two parts. The first part requires the use of several modelling approaches (DLM, ARDL, polynomial DLM, Koyck and dynamic models), with three-year-ahead forecasts produced from the best model among all of the techniques employed. The second part requires appropriate analysis and the generation of three-year-ahead forecasts.

Dataset Details

The datasets were obtained from the MATH1307 Assignment 3 page on Canvas, which provides three separate datasets: the first contains the weekly mortality series, the second the first flowering day (FFD) series, and the third the Rank-based Order (RBO) similarity metric.

Task 1 - Time series analysis and forecasting of disease-specific weekly mortality in relation to temperature, pollutant size, and noxious chemical emissions levels

The objective of this task is to forecast the mortality series. The analysis looks at the possible consequences of climate change and pollution on disease-specific mortality from 2010 to 2020. A group of researchers studied disease-specific average weekly mortality in Paris, France, together with the city’s local climate (temperature), the size of pollutants, and the levels of noxious chemical emissions from cars and industry in the air, all of which were measured at the same points in time between 2010 and 2020.

mort_1 <- read.csv("/Users/zuaibshaikh/Desktop/SEM 4/Forecasting/Final Project/mort  .csv")
mort= mort_1[,2:6]
head(mort)
##   mortality  temp chem1 chem2 particle.size
## 1     97.85 72.38 11.51  3.37         72.72
## 2    104.64 67.19  8.92  2.59         49.60
## 3     94.36 62.94  9.48  3.29         55.68
## 4     98.05 72.49 10.28  3.04         55.16
## 5     95.85 74.25 10.57  3.39         66.02
## 6     95.98 67.88  7.99  2.57         44.01
class(mort$mortality)
## [1] "numeric"
class(mort$temp)
## [1] "numeric"
class(mort$chem1)
## [1] "numeric"
class(mort$chem2)
## [1] "numeric"
class(mort$particle.size)
## [1] "numeric"
mortality.ts <- ts(mort$mortality, start =2010, frequency= 52)
head(mortality.ts)
## Time Series:
## Start = c(2010, 1) 
## End = c(2010, 6) 
## Frequency = 52 
## [1]  97.85 104.64  94.36  98.05  95.85  95.98
tail(mortality.ts)
## Time Series:
## Start = c(2019, 35) 
## End = c(2019, 40) 
## Frequency = 52 
## [1] 73.46 79.03 76.56 78.52 89.43 85.49
str(mortality.ts)
##  Time-Series [1:508] from 2010 to 2020: 97.8 104.6 94.4 98 95.8 ...
temp.ts= ts(mort$temp,start =2010, frequency= 52)
head(temp.ts)
## Time Series:
## Start = c(2010, 1) 
## End = c(2010, 6) 
## Frequency = 52 
## [1] 72.38 67.19 62.94 72.49 74.25 67.88
chem_1.ts = ts(mort$chem1, start =2010, frequency= 52)
head(chem_1.ts)
## Time Series:
## Start = c(2010, 1) 
## End = c(2010, 6) 
## Frequency = 52 
## [1] 11.51  8.92  9.48 10.28 10.57  7.99
chem_2.ts = ts(mort$chem2, start =2010, frequency= 52)
head(chem_2.ts)
## Time Series:
## Start = c(2010, 1) 
## End = c(2010, 6) 
## Frequency = 52 
## [1] 3.37 2.59 3.29 3.04 3.39 2.57
particle.size.ts = ts(mort$particle.size, start =2010, frequency= 52)
head(particle.size.ts)
## Time Series:
## Start = c(2010, 1) 
## End = c(2010, 6) 
## Frequency = 52 
## [1] 72.72 49.60 55.68 55.16 66.02 44.01
mort.ts=ts(mort,start =2010, frequency= 52)
head(mort.ts)
## Time Series:
## Start = c(2010, 1) 
## End = c(2010, 6) 
## Frequency = 52 
##          mortality  temp chem1 chem2 particle.size
## 2010.000     97.85 72.38 11.51  3.37         72.72
## 2010.019    104.64 67.19  8.92  2.59         49.60
## 2010.038     94.36 62.94  9.48  3.29         55.68
## 2010.058     98.05 72.49 10.28  3.04         55.16
## 2010.077     95.85 74.25 10.57  3.39         66.02
## 2010.096     95.98 67.88  7.99  2.57         44.01
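
As a quick check on the time index used above, the following sketch shows how `ts()` with `start = 2010` and `frequency = 52` places 508 weekly observations. The vector `x` is a placeholder of the same length as the mortality data, not the data itself, and `last_six` is a hypothetical name used only here.

```r
# Placeholder series with the same length as the mortality data (508 weeks).
x <- ts(seq_len(508), start = 2010, frequency = 52)

# 508 = 9 * 52 + 40, so the last observation falls in week 40 of 2019,
# which matches the End = c(2019, 40) reported by tail(mortality.ts).
start(x)
end(x)

# window() extracts a sub-period using (year, week) pairs.
last_six <- window(x, start = c(2019, 35))
length(last_six)
```

Because the index is built purely from `start` and `frequency`, it assumes exactly 52 observations per year; calendar years with 53 weeks are not represented.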

Analysis and visualisation of data

Plotting graphs for the converted time series.

  1. Time series plot of the mortality series, which we will use as the dependent series for the distributed lag models.
  2. Time series plot of the temperature data to investigate its properties.
  3. Time series plot of the chemical level 1 data to investigate its properties.
  4. Time series plot of the chemical level 2 data to investigate its properties.
  5. Time series plot of the particle size data to investigate its properties.
  • Mortality Variable:
plot(mortality.ts, xlab='Year',type='o', main = " Figure 1. Time series plot of weekly mortality")

Figure 1, the time series plot of the weekly mortality series, can be interpreted as follows:

  1. Trend - The plot shows no obvious trend.
  2. Seasonality - Seasonality is present, but the seasonal pattern is not constant over time.
  3. Changing Variation - Because of the seasonality, no changing variance is visible.
  4. Behaviour - The series behaviour is not apparent because of the seasonal pattern.
  5. Change Point - There is one possible intervention period in the vicinity of 2013.
  • Temperature Variable:
plot(temp.ts, xlab='Year',type='o', main = " Figure 2. Time series plot of temperature")

Figure 2, the time series plot of the temperature series, can be interpreted as follows:

  1. Trend - The plot shows no obvious trend.
  2. Seasonality - Seasonality is present.
  3. Changing Variation - Because of the seasonality, no changing variance is visible.
  4. Behaviour - The series behaviour is not apparent because of the seasonal pattern.
  5. Change Point - There is no apparent intervention period.
  • First Chemical Variable:
plot(chem_1.ts, xlab='Year',type='o', main = " Figure 3. Time series plot of noxious chemical emission level 1")

Figure 3, the time series plot of the chemical level 1 series, can be interpreted as follows:

  1. Trend - The plot shows a slight downward trend, but it is not obvious.
  2. Seasonality - Seasonality is present, but the seasonal pattern is not constant over time.
  3. Changing Variation - Because of the seasonality, no changing variance is visible.
  4. Behaviour - The series behaviour is not apparent because of the seasonal pattern.
  5. Change Point - There is no apparent intervention period.
  • Second Chemical Variable:
plot(chem_2.ts, xlab='Year',type='o', main = " Figure 4. Time series plot of noxious chemical emission level 2")

Figure 4, the time series plot of the chemical level 2 series, can be interpreted as follows:

  1. Trend - The plot shows no obvious trend.
  2. Seasonality - Seasonality is present.
  3. Changing Variation - Because of the seasonality, no changing variance is visible.
  4. Behaviour - The series behaviour is not apparent because of the seasonal pattern.
  5. Change Point - There is one possible intervention period in the vicinity of 2014.
  • Particle Size Variable:
plot(particle.size.ts, xlab='Year',type='o', main = " Figure 5. Time series plot of pollutants particle size")

Figure 5, the time series plot of the particle size series, can be interpreted as follows:

  1. Trend - The plot shows no obvious trend.
  2. Seasonality - Seasonality is present.
  3. Changing Variation - Because of the seasonality, no changing variance is visible.
  4. Behaviour - The series behaviour is not apparent because of the seasonal pattern.
  5. Change Point - There is no apparent intervention period.
  • To display the dependent mortality series alongside the four explanatory series in the same figure, we standardise the data. The code below produces a combined time series plot to investigate the relationship between the series.
mort.scale = scale(mort.ts)
plot(mort.scale, plot.type="s",col = c("black", "red", "blue", "green", "brown"), main = "Figure 6. Weekly mortality data series")
legend("topleft",lty=1, text.width =1.7, col=c("black", "red", "blue", "green", "brown"), c("Mortality", "Temperature", "Chemical 1", "Chemical 2","Particle size"))

Figure 6 shows all five time series plotted together after centring and scaling.
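
To make the centring and scaling step concrete, here is a minimal sketch of what `scale()` does to a multivariate series. The two simulated columns `a` and `b` are placeholders on deliberately different scales, not the project data.

```r
set.seed(1)

# Two simulated weekly series on very different scales (placeholders).
m <- ts(cbind(a = rnorm(104, mean = 90, sd = 10),
              b = rnorm(104, mean = 3,  sd = 0.5)),
        start = 2010, frequency = 52)

# scale() subtracts each column mean and divides by each column
# standard deviation, so every column ends up with mean 0 and sd 1.
m_scaled <- scale(m)

col_means <- colMeans(m_scaled)
col_sds   <- apply(m_scaled, 2, sd)
print(round(col_means, 10))
print(col_sds)
```

After standardising, the series share a common vertical scale, which is why they can be overlaid in a single panel; the original units are lost, so the plot is only suitable for comparing shapes and timing.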

Stationarity Check

  • Plotting ACF/PACF plots for all attributes and performing ADF test for the same.
  1. Mortality Series
acf(mortality.ts, lag.max = 48, main = "Figure 7. Sample ACF for Mortality Series")

pacf(mortality.ts, lag.max = 48, main = "Figure 8. Sample PACF for Mortality Series")

adf.test(mortality.ts)
## Warning in adf.test(mortality.ts): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  mortality.ts
## Dickey-Fuller = -5.4125, Lag order = 7, p-value = 0.01
## alternative hypothesis: stationary
adf.mortality.ts = ur.df(mortality.ts, type = "none", lags = 1, selectlags = "AIC")
summary(adf.mortality.ts)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -19.1256  -3.7511   0.0502   3.6939  20.9358 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## z.lag.1    -0.001961   0.002903  -0.676      0.5    
## z.diff.lag -0.505434   0.038383 -13.168   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.823 on 504 degrees of freedom
## Multiple R-squared:  0.2574, Adjusted R-squared:  0.2545 
## F-statistic: 87.35 on 2 and 504 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic is: -0.6755 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.58 -1.95 -1.62
pp.mortality.ts = ur.pp(mortality.ts, type = "Z-alpha", lags = "short") 
summary(pp.mortality.ts)
## 
## ################################## 
## # Phillips-Perron Unit Root Test # 
## ################################## 
## 
## Test regression with intercept 
## 
## 
## Call:
## lm(formula = y ~ y.l1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -20.618  -4.155  -0.370   4.019  22.264 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 20.22464    2.52168    8.02 7.41e-15 ***
## y.l1         0.77173    0.02825   27.32  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.359 on 505 degrees of freedom
## Multiple R-squared:  0.5964, Adjusted R-squared:  0.5956 
## F-statistic: 746.3 on 1 and 505 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic, type: Z-alpha  is: -111.1669 
## 
##          aux. Z statistics
## Z-tau-mu            7.8809

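Figures 7 and 8 show the ACF and PACF plots for the mortality series:

  • The ACF shows a wave-like (sinusoidal) pattern rather than a slow decay, which suggests that there is no dominant trend.
  • The ACF plot reveals strong seasonal patterns.
  • The PACF has a large first spike, which on its own can hint at a trend, but alongside the wave-like ACF it is more consistent with strong short-run autocorrelation. There are also several highly significant lags and a large drop at the second lag.
  • The augmented Dickey-Fuller test from adf.test() reports a p-value of 0.01 (the warning indicates that the true p-value is smaller), which is below 0.05, so the series is treated as stationary at the 5% level of significance.
  • The ur.df() test with type = "none" gives a statistic of -0.6755, which is above the 5% critical value of -1.95 and so does not reject a unit root. That specification omits the intercept even though the series has a clearly non-zero mean, so we rely on the adf.test() result.
  • We infer that the mortality series exhibits a significant seasonal pattern.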
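
The link between a wave-like ACF and seasonality can be illustrated on simulated data. The sketch below is an assumption-laden illustration, not the mortality data: it builds a weekly series with an annual cosine cycle plus noise and reads the sample autocorrelation at half a year and at a full year. It uses `lag.max = 104`, which differs from the `lag.max = 48` used in the report, because with `frequency = 52` the seasonal lag itself sits at lag 52.

```r
set.seed(42)
n  <- 508
tt <- seq_len(n)

# Simulated weekly series: constant level + annual cycle + noise (placeholder).
x <- ts(90 + 10 * cos(2 * pi * tt / 52) + rnorm(n, sd = 3),
        start = 2010, frequency = 52)

# Sample ACF out to two years of lags, without drawing the plot.
a <- acf(x, lag.max = 104, plot = FALSE)
r <- drop(a$acf)   # r[1] is lag 0, r[k + 1] is lag k

r26 <- r[27]       # half a year apart: expected to be strongly negative
r52 <- r[53]       # one year apart: expected to be strongly positive
print(c(lag26 = r26, lag52 = r52))
```

A trending series would instead show autocorrelations that stay positive and decay slowly, which is the contrast the ACF bullets above rely on.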
  2. Temperature Series
acf(temp.ts, lag.max = 48, main = "Figure 9. Sample ACF for Temperature Series")

pacf(temp.ts, lag.max = 48, main = "Figure 10. Sample PACF for Temperature Series")

adf.test(temp.ts)
## Warning in adf.test(temp.ts): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  temp.ts
## Dickey-Fuller = -4.4572, Lag order = 7, p-value = 0.01
## alternative hypothesis: stationary
adf.temp.ts = ur.df(temp.ts, type = "none", lags = 1, selectlags = "AIC")
summary(adf.temp.ts)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -22.8049  -4.1532  -0.1683   4.4153  21.5154 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## z.lag.1    -0.002683   0.004093  -0.656    0.512    
## z.diff.lag -0.519365   0.038052 -13.649   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.878 on 504 degrees of freedom
## Multiple R-squared:  0.2719, Adjusted R-squared:  0.269 
## F-statistic: 94.12 on 2 and 504 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic is: -0.6556 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.58 -1.95 -1.62
pp.temp.ts = ur.pp(temp.ts, type = "Z-alpha", lags = "short") 
summary(pp.temp.ts)
## 
## ################################## 
## # Phillips-Perron Unit Root Test # 
## ################################## 
## 
## Test regression with intercept 
## 
## 
## Call:
## lm(formula = y ~ y.l1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -23.6204  -4.6804  -0.0509   4.6633  23.7841 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 29.54652    2.65853   11.11   <2e-16 ***
## y.l1         0.60211    0.03554   16.94   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.211 on 505 degrees of freedom
## Multiple R-squared:  0.3624, Adjusted R-squared:  0.3612 
## F-statistic: 287.1 on 1 and 505 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic, type: Z-alpha  is: -244.5292 
## 
##          aux. Z statistics
## Z-tau-mu           11.9834

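Figures 9 and 10 show the ACF and PACF plots for the temperature series:

  • The ACF shows a wave-like (sinusoidal) pattern rather than a slow decay, which suggests that there is no dominant trend.
  • The ACF plot reveals strong seasonal patterns.
  • The PACF has a large first spike, which on its own can hint at a trend, but alongside the wave-like ACF it is more consistent with strong short-run autocorrelation. There are also several highly significant lags.
  • The augmented Dickey-Fuller test from adf.test() reports a p-value of 0.01 (the warning indicates that the true p-value is smaller), which is below 0.05, so the series is treated as stationary at the 5% level of significance.
  • The ur.df() test with type = "none" gives a statistic of -0.6556, which is above the 5% critical value of -1.95 and so does not reject a unit root. That specification omits the intercept even though the series has a clearly non-zero mean, so we rely on the adf.test() result.
  • We infer that the temperature series exhibits a significant seasonal pattern.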
  3. First Chemical Series
acf(chem_1.ts, lag.max = 48, main = "Figure 11. Sample ACF for noxious chemical emission level 1")

pacf(chem_1.ts, lag.max = 48, main = "Figure 12. Sample PACF for noxious chemical emission level 1")

adf.test(chem_1.ts)
## Warning in adf.test(chem_1.ts): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  chem_1.ts
## Dickey-Fuller = -4.4926, Lag order = 7, p-value = 0.01
## alternative hypothesis: stationary
adf.chem_1.ts = ur.df(chem_1.ts, type = "none", lags = 1, selectlags = "AIC")
summary(adf.chem_1.ts)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.5785 -1.1611  0.0874  1.5744 12.4693 
## 
## Coefficients:
##            Estimate Std. Error t value Pr(>|t|)    
## z.lag.1    -0.02973    0.01351  -2.201   0.0282 *  
## z.diff.lag -0.58331    0.03612 -16.150   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.613 on 504 degrees of freedom
## Multiple R-squared:  0.3641, Adjusted R-squared:  0.3616 
## F-statistic: 144.3 on 2 and 504 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic is: -2.2012 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.58 -1.95 -1.62
pp.chem_1.ts = ur.pp(chem_1.ts, type = "Z-alpha", lags = "short") 
summary(pp.chem_1.ts)
## 
## ################################## 
## # Phillips-Perron Unit Root Test # 
## ################################## 
## 
## Test regression with intercept 
## 
## 
## Call:
## lm(formula = y ~ y.l1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.1240 -1.7605 -0.5254  1.3137 13.1374 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.98653    0.30487   9.796   <2e-16 ***
## y.l1         0.62152    0.03481  17.855   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.949 on 505 degrees of freedom
## Multiple R-squared:  0.387,  Adjusted R-squared:  0.3858 
## F-statistic: 318.8 on 1 and 505 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic, type: Z-alpha  is: -233.4323 
## 
##          aux. Z statistics
## Z-tau-mu           10.5921

Figures 11 and 12 show the ACF and PACF plots for the chemical level 1 series:

  • The ACF shows a wave-like (sinusoidal) pattern rather than a slow decay, which suggests that there is no dominant trend.
  • The ACF plot reveals strong seasonal patterns.
  • The PACF has a large first spike, which on its own can hint at a trend, but alongside the wave-like ACF it is more consistent with strong short-run autocorrelation. There are also several highly significant lags.
  • The augmented Dickey-Fuller test from adf.test() reports a p-value of 0.01 (the warning indicates that the true p-value is smaller), which is below 0.05, so the series is treated as stationary at the 5% level of significance.
  • The ur.df() test with type = "none" gives a statistic of -2.2012, which is below the 5% critical value of -1.95, so it also rejects a unit root at the 5% level.
  • We infer that the chemical level 1 series exhibits a significant seasonal pattern.
  4. Second Chemical Series
acf(chem_2.ts, lag.max = 48, main = "Figure 13. Sample ACF for noxious chemical emission level 2")

pacf(chem_2.ts, lag.max = 48, main = "Figure 14. Sample PACF for noxious chemical emission level 2")

adf.test(chem_2.ts)
## Warning in adf.test(chem_2.ts): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  chem_2.ts
## Dickey-Fuller = -5.2791, Lag order = 7, p-value = 0.01
## alternative hypothesis: stationary
adf.chem_2.ts = ur.df(chem_2.ts, type = "none", lags = 1, selectlags = "AIC")
summary(adf.chem_2.ts)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.6726 -0.4772  0.0717  0.6809  3.6885 
## 
## Coefficients:
##            Estimate Std. Error t value Pr(>|t|)    
## z.lag.1    -0.03645    0.01466  -2.487   0.0132 *  
## z.diff.lag -0.52667    0.03781 -13.931   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9815 on 504 degrees of freedom
## Multiple R-squared:  0.3054, Adjusted R-squared:  0.3027 
## F-statistic: 110.8 on 2 and 504 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic is: -2.4868 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.58 -1.95 -1.62
pp.chem_2.ts = ur.pp(chem_2.ts, type = "Z-alpha", lags = "short") 
summary(pp.chem_2.ts)
## 
## ################################## 
## # Phillips-Perron Unit Root Test # 
## ################################## 
## 
## Test regression with intercept 
## 
## 
## Call:
## lm(formula = y ~ y.l1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.2311 -0.6721 -0.1768  0.5706  3.9163 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.77789    0.12537  14.181   <2e-16 ***
## y.l1         0.37427    0.04132   9.057   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9761 on 505 degrees of freedom
## Multiple R-squared:  0.1397, Adjusted R-squared:  0.138 
## F-statistic: 82.03 on 1 and 505 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic, type: Z-alpha  is: -441.7985 
## 
##          aux. Z statistics
## Z-tau-mu           15.7764

Figures 13 and 14 show the ACF and PACF plots for the chemical level 2 series:

  • The ACF shows a wave-like (sinusoidal) pattern rather than a slow decay, which suggests that there is no dominant trend.
  • The ACF plot reveals strong seasonal patterns.
  • The PACF has a large first spike, which on its own can hint at a trend, but alongside the wave-like ACF it is more consistent with strong short-run autocorrelation. There are also several highly significant lags.
  • The augmented Dickey-Fuller test from adf.test() reports a p-value of 0.01 (the warning indicates that the true p-value is smaller), which is below 0.05, so the series is treated as stationary at the 5% level of significance.
  • The ur.df() test with type = "none" gives a statistic of -2.4868, which is below the 5% critical value of -1.95, so it also rejects a unit root at the 5% level.
  • We infer that the chemical level 2 series exhibits a significant seasonal pattern.
  5. Particle Size Series
acf(particle.size.ts, lag.max = 48, main = "Figure 15. Sample ACF for Particle Size Series")

pacf(particle.size.ts, lag.max = 48, main = "Figure 16. Sample PACF for Particle Size Series")

adf.test(particle.size.ts)
## Warning in adf.test(particle.size.ts): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  particle.size.ts
## Dickey-Fuller = -4.493, Lag order = 7, p-value = 0.01
## alternative hypothesis: stationary
adf.particle.size.ts = ur.df(particle.size.ts, type = "none", lags = 1, selectlags = "AIC")
summary(adf.particle.size.ts)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -33.961  -6.857   0.668   8.580  55.164 
## 
## Coefficients:
##            Estimate Std. Error t value Pr(>|t|)    
## z.lag.1    -0.01843    0.01070  -1.723   0.0855 .  
## z.diff.lag -0.54545    0.03720 -14.663   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.83 on 504 degrees of freedom
## Multiple R-squared:  0.313,  Adjusted R-squared:  0.3103 
## F-statistic: 114.8 on 2 and 504 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic is: -1.7232 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.58 -1.95 -1.62
pp.particle.size.ts = ur.pp(particle.size.ts, type = "Z-alpha", lags = "short") 
summary(pp.particle.size.ts)
## 
## ################################## 
## # Phillips-Perron Unit Root Test # 
## ################################## 
## 
## Test regression with intercept 
## 
## 
## Call:
## lm(formula = y ~ y.l1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -34.935  -8.853  -1.172   7.184  56.511 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 21.16583    1.83956   11.51   <2e-16 ***
## y.l1         0.55288    0.03698   14.95   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 12.59 on 505 degrees of freedom
## Multiple R-squared:  0.3068, Adjusted R-squared:  0.3054 
## F-statistic: 223.5 on 1 and 505 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic, type: Z-alpha  is: -293.0187 
## 
##          aux. Z statistics
## Z-tau-mu           12.6758

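Figures 15 and 16 show the ACF and PACF plots for the particle size series:

  • The ACF shows a wave-like (sinusoidal) pattern rather than a slow decay, which suggests that there is no dominant trend.
  • The ACF plot reveals strong seasonal patterns.
  • The PACF has a large first spike, which on its own can hint at a trend, but alongside the wave-like ACF it is more consistent with strong short-run autocorrelation. There are also several highly significant lags.
  • The augmented Dickey-Fuller test from adf.test() reports a p-value of 0.01 (the warning indicates that the true p-value is smaller), which is below 0.05, so the series is treated as stationary at the 5% level of significance.
  • The ur.df() test with type = "none" gives a statistic of -1.7232, which lies between the 10% critical value of -1.62 and the 5% critical value of -1.95, so it rejects a unit root only at the 10% level.
  • We infer that the particle size series exhibits a significant seasonal pattern.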

As a result of this analysis, the p-values of the adf.test() results are below the 5% significance level for all five series, and the ACF and PACF plots are consistent with this. We therefore conclude that the series are stationary.
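
The regressions printed by `ur.df()` above can be reproduced with `lm()`, which helps show why the deterministic specification matters. The sketch below is a minimal illustration on a simulated stationary AR(1) series with a non-zero mean. The seed, the AR coefficient 0.77, the mean 88 and the noise level are assumptions chosen only to resemble the mortality output; none of it is the project data. It fits the same augmented Dickey-Fuller regression without and with an intercept.

```r
set.seed(1)
n <- 508

# Simulated stationary AR(1) around a mean of 88 (placeholder data).
y <- 88 + as.numeric(arima.sim(list(ar = 0.77), n = n, sd = 6.4))

dy         <- diff(y)             # dy[i] = y[i + 1] - y[i]
z.diff     <- dy[-1]              # current difference, t = 3..n
z.lag.1    <- y[2:(n - 1)]        # level lagged once, y[t - 1]
z.diff.lag <- dy[-length(dy)]     # difference lagged once

# Same form as the ur.df(type = "none") regression: no intercept.
fit_none  <- lm(z.diff ~ z.lag.1 - 1 + z.diff.lag)
# Same regression with an intercept (the "drift" specification).
fit_drift <- lm(z.diff ~ z.lag.1 + z.diff.lag)

tau_none  <- coef(summary(fit_none))["z.lag.1", "t value"]
tau_drift <- coef(summary(fit_drift))["z.lag.1", "t value"]
print(c(none = tau_none, drift = tau_drift))
```

For a series that is stationary around a non-zero mean, the no-intercept statistic tends to sit close to zero, while the statistic with an intercept falls well below its 5% critical value (about -2.86 in large samples). Note that the two statistics must be compared against different critical-value tables.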

Analysing the impact of the time series components on the given dataset

  • Trend, seasonality, and remainder are the key components of a time series.
  • It is useful to decompose a time series into these separate components. This makes it easier to observe the individual effects and how each component has behaved historically, and it allows each component to be studied on its own.

The Influence of Trend and Seasonality

The series can be decomposed to examine the influence of trend and seasonality. We use the STL decomposition in this section.
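
As a minimal sketch of what the STL calls in this section return, the example below decomposes a simulated weekly series (a placeholder with a mild trend and an annual cycle, not the project data) using the same `t.window`, `s.window` and `robust` settings. It then checks two properties: the three components add back up to the original series, and with `s.window = "periodic"` the seasonal component repeats identically every year.

```r
set.seed(7)
n  <- 508
tt <- seq_len(n)

# Simulated weekly series: mild trend + annual cycle + noise (placeholder).
x <- ts(90 + 0.01 * tt + 8 * sin(2 * pi * tt / 52) + rnorm(n, sd = 3),
        start = 2010, frequency = 52)

fit   <- stl(x, t.window = 15, s.window = "periodic", robust = TRUE)
parts <- fit$time.series          # columns: seasonal, trend, remainder

# The decomposition is additive: seasonal + trend + remainder = data.
recon   <- rowSums(parts)
max_err <- max(abs(recon - as.numeric(x)))

# With s.window = "periodic" the seasonal pattern is the same in every year.
seas      <- as.numeric(parts[, "seasonal"])
same_year <- isTRUE(all.equal(seas[1:52], seas[53:104]))
print(c(max_err = max_err, same_year = same_year))
```

This means a constant seasonal panel in the STL figures is a consequence of the `s.window = "periodic"` setting, not evidence about the data; a finite odd `s.window` would be needed to let the seasonal pattern evolve.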

Time Series Decomposition of mortality

mortality.ts.decom.stl <- stl(mortality.ts, t.window=15, s.window="periodic", robust=TRUE)
plot(mortality.ts.decom.stl, main="Figure 17. STL decomposition of the weekly mortality series")

Figure 17 shows that the trend component follows the overall pattern of the original series, first rising and then falling. Because s.window = "periodic" is used, the seasonal component is constant over time by construction, and the remainder shows a few minor interventions.

Time Series Decomposition of temperature

temperature.ts.decom.stl <- stl(temp.ts, t.window=15, s.window="periodic", robust=TRUE)
plot(temperature.ts.decom.stl, main="Figure 18. STL decomposition of the temperature series")

Figure 18 shows that the trend component follows the overall pattern of the original series: it rises, falls, and then rises again. Because s.window = "periodic" is used, the seasonal component is constant over time by construction, and the remainder shows no significant variation or changing variance.

Time Series Decomposition of noxious chemical emission level 1

chem_1.ts.decom.stl <- stl(chem_1.ts, t.window=15, s.window="periodic", robust=TRUE)
plot(chem_1.ts.decom.stl, main="Figure 19. STL decomposition of the noxious chemical emission level 1 series")

Figure 19 shows that the trend component follows the overall pattern of the original series, first rising and then falling. Because s.window = "periodic" is used, the seasonal component is constant over time by construction, and the remainder shows a few minor interventions.

Time Series Decomposition of noxious chemical emission level 2

chem_2.ts.decom.stl <- stl(chem_2.ts, t.window=15, s.window="periodic", robust=TRUE)
plot(chem_2.ts.decom.stl, main="Figure 20. STL decomposition of the noxious chemical emission level 2 series")

Figure 20 shows that the trend component follows the overall pattern of the original series, first rising and then falling. Because s.window = "periodic" is used, the seasonal component is constant over time by construction, and the remainder shows no significant variation or changing variance.

Time Series Decomposition of Particle Size Series

particle.size.ts.decom.stl <- stl(particle.size.ts, t.window=15, s.window="periodic", robust=TRUE)
plot(particle.size.ts.decom.stl, main="Figure 21. STL decomposition of the Particle Size Series")

Figure 21 shows that the trend component follows the overall pattern of the original series: it rises, falls, and then rises again. Because s.window = "periodic" is used, the seasonal component is constant over time by construction, and the remainder shows a few minor interventions.

The Correlation matrix

cor(mort.ts)
##                mortality        temp       chem1     chem2 particle.size
## mortality      1.0000000 -0.43863962  0.55744759 0.2569989    0.44387133
## temp          -0.4386396  1.00000000 -0.09785582 0.4043740   -0.01723095
## chem1          0.5574476 -0.09785582  1.00000000 0.5130047    0.86611747
## chem2          0.2569989  0.40437401  0.51300467 1.0000000    0.46793404
## particle.size  0.4438713 -0.01723095  0.86611747 0.4679340    1.00000000

From the above correlation matrix, we can infer that, with respect to weekly mortality, temperature has a weak-to-moderate negative correlation (-0.44), chem1 has a moderate positive correlation (0.56), chem2 has a weak positive correlation (0.26), and particle size has a weak-to-moderate positive correlation (0.44). Among the predictors, chem1 and particle size are strongly correlated with each other (0.87).

Because weekly mortality is treated as the dependent variable, it is the series placed on the y-axis and compared against the other four variables.

Time series regression methods

Model fitting with dLagM using multiple predictors

Finite distributed lag model

To identify an appropriate model for forecasting weekly mortality, we will consider fitting distributed lag models that incorporate an independent explanatory series and its lags to help explain the overall variation and correlation structure in our dependent series.
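
For reference, a finite distributed lag model of order q with a single predictor series can be written as below. The symbols are assumptions chosen to line up with the fitted output later in this section, where `(Intercept)` corresponds to \alpha and `x.t`, `x.1`, ..., `x.q` correspond to \beta_0, \beta_1, ..., \beta_q.

```latex
Y_t = \alpha + \sum_{s=0}^{q} \beta_s X_{t-s} + \epsilon_t
```

Here Y_t is the dependent series (weekly mortality), X_{t-s} is the predictor lagged by s weeks, and \epsilon_t is the error term. Each additional lag costs one observation at the start of the sample, so a model with lag length q is fitted on n - q observations.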

To determine the model’s finite lag length, we run a loop over q = 1 to 10 that reports accuracy measures (AIC, BIC and MASE) for each lag length, and we select the lag length with the lowest values.

for (i in 1:10) {
  # Note: `+` adds the four vectors element-wise, so dlm() receives their sum as a single predictor series.
  model1.1 = dlm(y = as.vector(mortality.ts), x = as.vector(temp.ts) + as.vector(chem_1.ts) + as.vector(chem_2.ts) + as.vector(particle.size.ts), q = i)
  cat("q = ", i, "AIC = ", AIC(model1.1$model), "BIC = ", BIC(model1.1$model), "MASE =", MASE(model1.1)$MASE, "\n")
}
## q =  1 AIC =  3747.517 BIC =  3764.431 MASE = 1.417338 
## q =  2 AIC =  3730.984 BIC =  3752.116 MASE = 1.400131 
## q =  3 AIC =  3720.465 BIC =  3745.813 MASE = 1.399245 
## q =  4 AIC =  3689.127 BIC =  3718.685 MASE = 1.357431 
## q =  5 AIC =  3669.239 BIC =  3703.004 MASE = 1.327089 
## q =  6 AIC =  3643.467 BIC =  3681.434 MASE = 1.293083 
## q =  7 AIC =  3605.878 BIC =  3648.044 MASE = 1.235668 
## q =  8 AIC =  3575.687 BIC =  3622.048 MASE = 1.2001 
## q =  9 AIC =  3550.289 BIC =  3600.84 MASE = 1.175756 
## q =  10 AIC =  3530.224 BIC =  3584.962 MASE = 1.150523

According to the output, the finite distributed lag model with q = 10 has the lowest MASE (1.150523), AIC (3530.224), and BIC (3584.962). All three measures are still decreasing at q = 10, the largest lag length considered. We therefore choose a lag length of q = 10.
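To make the lag-length search concrete, the following base-R sketch fits finite distributed lag models by ordinary least squares on simulated data and tabulates AIC, BIC, and an in-sample MASE. The data, the helper names fit_finite_dlm and mase_insample, and the MASE definition (mean absolute residual divided by the mean absolute one-step naive change) are assumptions for illustration; they are not taken from dLagM and may differ in detail from what dlm() and MASE() compute.

```r
set.seed(1)
n <- 200
q_true <- 2

# Simulated predictor and a response that depends on x at lags 0, 1 and 2.
x_full <- as.numeric(arima.sim(list(ar = 0.6), n = n + q_true))
true_lags <- embed(x_full, q_true + 1)        # columns: x_t, x_{t-1}, x_{t-2}
y <- drop(10 + true_lags %*% c(0.5, 0.3, 0.2)) + rnorm(n, sd = 0.5)
x <- x_full[-seq_len(q_true)]                 # align x with y

# Hypothetical helper: regress y_t on x_t, ..., x_{t-q}.
fit_finite_dlm <- function(y, x, q) {
  lags <- embed(x, q + 1)
  colnames(lags) <- paste0("x.", 0:q)
  dat <- data.frame(y = y[(q + 1):length(y)], lags)
  lm(y ~ ., data = dat)
}

# Hypothetical helper: one common in-sample MASE definition (an assumption).
mase_insample <- function(fit) {
  y_used <- model.response(model.frame(fit))
  mean(abs(residuals(fit))) / mean(abs(diff(y_used)))
}

results <- do.call(rbind, lapply(1:5, function(q) {
  fit <- fit_finite_dlm(y, x, q)
  data.frame(q = q, AIC = AIC(fit), BIC = BIC(fit), MASE = mase_insample(fit))
}))
print(results)
```

Each additional lag removes one observation from the estimation sample, so information criteria for different q are computed on slightly different samples. The same applies to the output above, where the q = 10 model is fitted to 498 of the 508 weeks.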

  • Fitting a finite DLM with a lag length of 10 and running diagnostic checks, with weekly mortality as the dependent variable.
finite_DLM <- dlm(y =as.vector(mortality.ts), x=as.vector(temp.ts)  + as.vector(chem_1.ts)  + as.vector(chem_2.ts)  + as.vector(particle.size.ts), q = 10)
summary(finite_DLM)
## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -20.990  -5.291  -0.430   4.276  39.026 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 36.730076   3.837959   9.570  < 2e-16 ***
## x.t          0.044659   0.020493   2.179 0.029794 *  
## x.1         -0.047400   0.020611  -2.300 0.021887 *  
## x.2         -0.009657   0.021077  -0.458 0.647031    
## x.3         -0.022598   0.021228  -1.065 0.287616    
## x.4          0.042823   0.021432   1.998 0.046263 *  
## x.5          0.016857   0.021419   0.787 0.431663    
## x.6          0.041028   0.021446   1.913 0.056325 .  
## x.7          0.080844   0.021271   3.801 0.000163 ***
## x.8          0.077446   0.021143   3.663 0.000277 ***
## x.9          0.087036   0.020624   4.220 2.91e-05 ***
## x.10         0.081121   0.020485   3.960 8.62e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.261 on 486 degrees of freedom
## Multiple R-squared:  0.338,  Adjusted R-squared:  0.323 
## F-statistic: 22.56 on 11 and 486 DF,  p-value: < 2.2e-16
## 
## AIC and BIC values for the model:
##        AIC      BIC
## 1 3530.224 3584.962

The finite distributed lag model above has q = 10. Seven of the eleven lag weights (x.t, x.1, x.4, x.7, x.8, x.9, and x.10) are statistically significant at the 5% level. The adjusted R-squared is 0.323, so the model explains only 32.3 percent of the variability in weekly mortality. The overall F-test has a p-value below 2.2e-16, which is less than 0.05, so the model as a whole is statistically significant.

checkresiduals(finite_DLM$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 15
## 
## data:  Residuals
## LM test = 284.1, df = 15, p-value < 2.2e-16
 shapiro.test(residuals(finite_DLM$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(finite_DLM$model)
## W = 0.97601, p-value = 2.717e-07

The residual graphs for the above model are shown in Figure 22:

  • The time series plot shows that the residuals are not randomly distributed.
  • The ACF plot shows serial correlation as well as seasonality remaining in the residuals.
  • The Breusch-Godfrey test has a p-value below 0.05 (p < 2.2e-16), so we reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals suggests a departure from normality.
  • The Shapiro-Wilk test has a p-value below 0.05 (p = 2.717e-07), so we reject the null hypothesis (H0) of normality. The errors are not normally distributed, and the normality assumption is violated.
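The same two numerical checks can be reproduced with base R alone. The sketch below uses simulated data with deliberately autocorrelated errors, and it swaps in the Ljung-Box test (Box.test) for the Breusch-Godfrey test reported by checkresiduals(); both test the null hypothesis of no serial correlation, but they are different tests and their statistics are not interchangeable.

```r
set.seed(42)
n <- 300
x <- rnorm(n)
e <- as.numeric(arima.sim(list(ar = 0.7), n = n))  # autocorrelated errors
y <- 2 + 1.5 * x + e

fit <- lm(y ~ x)
res <- residuals(fit)

# Ljung-Box test for serial correlation up to lag 10 (H0: no autocorrelation).
lb <- Box.test(res, lag = 10, type = "Ljung-Box")

# Shapiro-Wilk test for normality (H0: residuals are normally distributed).
sw <- shapiro.test(res)

cat("Ljung-Box p-value:   ", format.pval(lb$p.value), "\n")
cat("Shapiro-Wilk p-value:", format.pval(sw$p.value), "\n")

# Decisions at the 5% level of significance.
autocorrelated <- lb$p.value < 0.05
non_normal <- sw$p.value < 0.05
```

A rejection in the first test means the regression has left time dependence in the residuals, which is the pattern reported for the models in this section.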

Now checking the multicollinearity issue

vif_dlm =vif(finite_DLM$model)
vif_dlm
##      x.t      x.1      x.2      x.3      x.4      x.5      x.6      x.7 
## 1.353760 1.370024 1.440241 1.457022 1.484833 1.483227 1.484407 1.460836 
##      x.8      x.9     x.10 
## 1.440274 1.364831 1.345552
vif_dlm >10
##   x.t   x.1   x.2   x.3   x.4   x.5   x.6   x.7   x.8   x.9  x.10 
## FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  • According to the VIF values, the above model with q=10 does not have a multicollinearity problem.
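For reference, the variance inflation factor of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the others. The base-R sketch below computes it directly on simulated data. The helper name vif_manual and the data are hypothetical, and the threshold of 10 is the same rule of thumb used in this report.

```r
# Hypothetical helper: VIF_j = 1 / (1 - R^2_j), with R^2_j from regressing
# column j on all remaining columns.
vif_manual <- function(X) {
  X <- as.data.frame(X)
  sapply(names(X), function(v) {
    others <- setdiff(names(X), v)
    r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
    1 / (1 - r2)
  })
}

set.seed(7)
n <- 500
a <- rnorm(n)
b <- rnorm(n)
c_near_a <- a + rnorm(n, sd = 0.1)   # almost a copy of a

vif_independent <- vif_manual(data.frame(a, b))
vif_collinear   <- vif_manual(data.frame(a, b, c_near_a))

print(vif_independent)   # both values close to 1
print(vif_collinear)     # a and c_near_a far above 10
```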

Fitting polynomial distributed lag model

Again, we create a loop for polynomial distributed lag models that calculates accuracy measures (AIC, BIC, and MASE) for models with varying lag lengths (q) and polynomial orders (k), and we select the model with the lowest values.

  • Fitting a polynomial DLM for multiple predictors, with weekly mortality as the dependent variable.
for (i in 1:10) {
  for (j in 1:4) {
    model_2.1 <- polyDlm(y = as.vector(mortality.ts),
                         x = as.vector(temp.ts) + as.vector(chem_1.ts) + as.vector(chem_2.ts) + as.vector(particle.size.ts),
                         q = i, k = j, show.beta = FALSE)
    cat("q:", i, "k:", j, "AIC:", AIC(model_2.1$model), "BIC:", BIC(model_2.1$model), "MASE =", MASE(model_2.1)$MASE, "\n")
  }
}
## q: 1 k: 1 AIC: 3747.517 BIC: 3764.431 MASE = 1.417338 
## q: 1 k: 2 AIC: 3747.517 BIC: 3764.431 MASE = 1.417338 
## q: 1 k: 3 AIC: 3747.517 BIC: 3764.431 MASE = 1.417338 
## q: 1 k: 4 AIC: 3747.517 BIC: 3764.431 MASE = 1.417338 
## q: 2 k: 1 AIC: 3733.257 BIC: 3750.163 MASE = 1.411932 
## q: 2 k: 2 AIC: 3730.984 BIC: 3752.116 MASE = 1.400131 
## q: 2 k: 3 AIC: 3730.984 BIC: 3752.116 MASE = 1.400131 
## q: 2 k: 4 AIC: 3730.984 BIC: 3752.116 MASE = 1.400131 
## q: 3 k: 1 AIC: 3721.951 BIC: 3738.849 MASE = 1.40686 
## q: 3 k: 2 AIC: 3722.182 BIC: 3743.304 MASE = 1.404651 
## q: 3 k: 3 AIC: 3720.465 BIC: 3745.813 MASE = 1.399245 
## q: 3 k: 4 AIC: 3720.465 BIC: 3745.813 MASE = 1.399245 
## q: 4 k: 1 AIC: 3693.894 BIC: 3710.784 MASE = 1.376291 
## q: 4 k: 2 AIC: 3687.596 BIC: 3708.709 MASE = 1.364139 
## q: 4 k: 3 AIC: 3689.177 BIC: 3714.513 MASE = 1.362777 
## q: 4 k: 4 AIC: 3689.127 BIC: 3718.685 MASE = 1.357431 
## q: 5 k: 1 AIC: 3674.814 BIC: 3691.696 MASE = 1.347283 
## q: 5 k: 2 AIC: 3673.125 BIC: 3694.228 MASE = 1.343057 
## q: 5 k: 3 AIC: 3668.849 BIC: 3694.172 MASE = 1.335296 
## q: 5 k: 4 AIC: 3670.842 BIC: 3700.387 MASE = 1.335148 
## q: 6 k: 1 AIC: 3646.166 BIC: 3663.04 MASE = 1.316898 
## q: 6 k: 2 AIC: 3643.92 BIC: 3665.013 MASE = 1.308253 
## q: 6 k: 3 AIC: 3641.666 BIC: 3666.978 MASE = 1.302312 
## q: 6 k: 4 AIC: 3642.301 BIC: 3671.832 MASE = 1.299765 
## q: 7 k: 1 AIC: 3611.471 BIC: 3628.337 MASE = 1.260943 
## q: 7 k: 2 AIC: 3607.355 BIC: 3628.438 MASE = 1.2548 
## q: 7 k: 3 AIC: 3606.084 BIC: 3631.384 MASE = 1.248608 
## q: 7 k: 4 AIC: 3605.261 BIC: 3634.777 MASE = 1.245028 
## q: 8 k: 1 AIC: 3579.211 BIC: 3596.07 MASE = 1.226687 
## q: 8 k: 2 AIC: 3575.903 BIC: 3596.976 MASE = 1.226551 
## q: 8 k: 3 AIC: 3573.528 BIC: 3598.816 MASE = 1.217305 
## q: 8 k: 4 AIC: 3573.857 BIC: 3603.359 MASE = 1.212355 
## q: 9 k: 1 AIC: 3553.872 BIC: 3570.722 MASE = 1.200755 
## q: 9 k: 2 AIC: 3552.577 BIC: 3573.64 MASE = 1.197739 
## q: 9 k: 3 AIC: 3547.694 BIC: 3572.97 MASE = 1.18829 
## q: 9 k: 4 AIC: 3548.258 BIC: 3577.746 MASE = 1.186252 
## q: 10 k: 1 AIC: 3530.848 BIC: 3547.69 MASE = 1.181206 
## q: 10 k: 2 AIC: 3531.085 BIC: 3552.138 MASE = 1.182812 
## q: 10 k: 3 AIC: 3525.658 BIC: 3550.922 MASE = 1.168079 
## q: 10 k: 4 AIC: 3525.611 BIC: 3555.085 MASE = 1.162673

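According to the output, the polynomial distributed lag model with q = 10 and k = 4 has the lowest AIC (3525.611) and MASE (1.162673). Its BIC (3555.085) is not the lowest: the q = 10, k = 1 model has the smallest BIC (3547.69). We proceed with q = 10 and k = 4. The rows with k ≥ q repeat the finite DLM results for the same q, because a polynomial of order k ≥ q places no restriction on the q + 1 lag weights.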

poly_DLM <- polyDlm(y =as.vector(mortality.ts), x=as.vector(temp.ts)  + as.vector(chem_1.ts)  + as.vector(chem_2.ts)  + as.vector(particle.size.ts), q = 10, k = 4)
## Estimates and t-tests for beta coefficients:
##         Estimate Std. Error t value  P(>|t|)
## beta.0    0.0375     0.0197   1.900 5.81e-02
## beta.1   -0.0184     0.0120  -1.530 1.27e-01
## beta.2   -0.0300     0.0124  -2.420 1.60e-02
## beta.3   -0.0156     0.0100  -1.560 1.20e-01
## beta.4    0.0103     0.0104   0.996 3.20e-01
## beta.5    0.0373     0.0114   3.280 1.13e-03
## beta.6    0.0585     0.0104   5.650 2.80e-08
## beta.7    0.0712     0.0100   7.090 4.75e-12
## beta.8    0.0764     0.0124   6.150 1.59e-09
## beta.9    0.0788     0.0120   6.540 1.56e-10
## beta.10   0.0871     0.0197   4.420 1.24e-05
 summary(poly_DLM)
## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -21.756  -5.220  -0.491   4.370  38.027 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 36.6052953  3.8419811   9.528   <2e-16 ***
## z.t0         0.0375034  0.0197442   1.899   0.0581 .  
## z.t1        -0.0850572  0.0335875  -2.532   0.0116 *  
## z.t2         0.0330265  0.0147649   2.237   0.0257 *  
## z.t3        -0.0040073  0.0022738  -1.762   0.0786 .  
## z.t4         0.0001605  0.0001127   1.424   0.1552    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.272 on 492 degrees of freedom
## Multiple R-squared:  0.3281, Adjusted R-squared:  0.3213 
## F-statistic: 48.05 on 5 and 492 DF,  p-value: < 2.2e-16

The polynomial distributed lag model above has q = 10 and k = 4. Of the polynomial terms, z.t1 and z.t2 are significant at the 5% level, z.t0 and z.t3 are significant only at the 10% level, and z.t4 is not significant. Among the implied lag weights, beta.2 and beta.5 to beta.10 are significant at the 5% level. The adjusted R-squared is 0.3213, so the model explains only 32.13 percent of the variability in weekly mortality. The overall F-test has a p-value below 2.2e-16, which is less than 0.05, so the model as a whole is statistically significant.
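The link between the z.t terms and the beta estimates can be checked by hand. The printed values are consistent with the usual Almon form, in which the weight at lag s is a polynomial in s: beta_s = a_0 + a_1 s + ... + a_k s^k. The first part of the sketch below rebuilds the lag weights from the printed z.t coefficients, and they should agree with the printed beta table up to rounding. The second part uses simulated data and a hypothetical helper, almon_fit, to show how the z regressors are formed; it is an illustration of the technique, not a copy of polyDlm internals.

```r
# Part 1: lag weights implied by the printed polynomial coefficients z.t0..z.t4.
a_hat <- c(0.0375034, -0.0850572, 0.0330265, -0.0040073, 0.0001605)
q <- 10
k <- 4
V <- outer(0:q, 0:k, "^")            # (q + 1) x (k + 1) matrix with entries s^j
beta_hat <- drop(V %*% a_hat)
names(beta_hat) <- paste0("beta.", 0:q)
print(round(beta_hat, 4))

# Part 2: building the z regressors on simulated data.
set.seed(3)
n <- 300
q_sim <- 3
x <- rnorm(n + q_sim)
X_lags <- embed(x, q_sim + 1)        # columns: x_t, x_{t-1}, ..., x_{t-3}
y <- drop(5 + X_lags %*% c(0.4, 0.3, 0.2, 0.1)) + rnorm(n, sd = 0.5)

# Hypothetical helper: z_j = sum over s of s^j * x_{t-s}, for j = 0..k.
almon_fit <- function(y, X_lags, k) {
  q <- ncol(X_lags) - 1
  Z <- X_lags %*% outer(0:q, 0:k, "^")
  colnames(Z) <- paste0("z.", 0:k)
  lm(y ~ Z)
}

fit_unrestricted <- lm(y ~ X_lags)             # one free weight per lag
fit_k1 <- almon_fit(y, X_lags, k = 1)          # weights forced onto a line
fit_kq <- almon_fit(y, X_lags, k = q_sim)      # k = q: no restriction left
```

With k = q the polynomial fit reproduces the unrestricted fit, which matches the repeated rows in the grid output. Because the z regressors are built from overlapping powers of the same lags, they are strongly correlated by construction.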

checkresiduals(poly_DLM$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 272.16, df = 10, p-value < 2.2e-16
 shapiro.test(residuals(poly_DLM$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(poly_DLM$model)
## W = 0.97725, p-value = 5.251e-07

The residual graphs for the above model are shown in Figure 23:

  • The time series plot shows that the residuals are not randomly distributed.
  • The ACF plot shows serial correlation as well as seasonality remaining in the residuals.
  • The Breusch-Godfrey test has a p-value below 0.05 (p < 2.2e-16), so we reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals suggests a departure from normality.
  • The Shapiro-Wilk test has a p-value below 0.05 (p = 5.251e-07), so we reject the null hypothesis (H0) of normality. The errors are not normally distributed, and the normality assumption is violated.

Now checking the multicollinearity issue

vif_poly =vif(poly_DLM$model)
vif_poly
##        z.t0        z.t1        z.t2        z.t3        z.t4 
##    57.58862  4594.09441 47097.44738 73859.34321 13523.64713
 vif_poly >10
## z.t0 z.t1 z.t2 z.t3 z.t4 
## TRUE TRUE TRUE TRUE TRUE
  • According to the VIF values, the polynomial DLM with q=10 and k=4 has a severe multicollinearity problem: every VIF exceeds 10.

Fitting Koyck model

Fitting a Koyck model for multiple predictors, with weekly mortality as the dependent variable.

Koyck_model = koyckDlm(y =as.vector(mortality.ts), x=as.vector(temp.ts)  + as.vector(chem_1.ts)  + as.vector(chem_2.ts)  + as.vector(particle.size.ts))
summary(Koyck_model$model, diagnostics=TRUE)
## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -20.4155  -4.4573  -0.6556   4.5012  24.4119 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 29.35359    6.19203   4.741 2.78e-06 ***
## Y.1          0.78704    0.03245  24.256  < 2e-16 ***
## X.t         -0.07922    0.04804  -1.649   0.0997 .  
## 
## Diagnostic tests:
##                  df1 df2 statistic  p-value    
## Weak instruments   1 504     53.77 9.04e-13 ***
## Wu-Hausman         1 503     16.93 4.53e-05 ***
## Sargan             0  NA        NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.999 on 504 degrees of freedom
## Multiple R-Squared: 0.5121,  Adjusted R-squared: 0.5102 
## Wald test: 309.4 on 2 and 504 DF,  p-value: < 2.2e-16

In the Koyck model above, the lagged response Y.1 is significant at the 5% level, whereas X.t is significant only at the 10% level (p = 0.0997). The adjusted R-squared is 0.5102, so the model explains 51.02 percent of the variability in weekly mortality. The Wald test for the whole model has a p-value below 2.2e-16, which is less than 0.05, so the model as a whole is statistically significant.
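The two slope estimates can be translated back into an implied lag structure. In the standard Koyck form, the weight on the predictor at lag s is beta * phi^s, where phi is the coefficient on Y.1 and beta the coefficient on X.t. This gives a long-run multiplier of beta / (1 - phi) and a mean lag of phi / (1 - phi). The sketch below applies these textbook formulas to the printed estimates. It assumes the fitted model follows that form, and since X.t is not significant at the 5% level the implied weights are imprecise.

```r
phi  <- 0.78704    # printed coefficient on Y.1
beta <- -0.07922   # printed coefficient on X.t

# Geometrically decaying weights for lags 0 to 10.
lag_weights <- beta * phi^(0:10)
names(lag_weights) <- paste0("lag.", 0:10)

long_run <- beta / (1 - phi)   # total effect of a sustained unit change in x
mean_lag <- phi / (1 - phi)    # average delay of the effect, in weeks

print(round(lag_weights, 4))
cat("Long-run multiplier:", round(long_run, 3), "\n")
cat("Mean lag (weeks):   ", round(mean_lag, 2), "\n")
```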

 checkresiduals(Koyck_model$model)

 shapiro.test(residuals(Koyck_model$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(Koyck_model$model)
## W = 0.99332, p-value = 0.02396

The residual graphs for the above model are shown in Figure 24:

  • The time series plot shows that the residuals are not randomly distributed.
  • The ACF plot shows serial correlation as well as seasonality remaining in the residuals.
  • The histogram of the residuals suggests a departure from normality.
  • The Shapiro-Wilk test has a p-value below 0.05 (p = 0.02396), so we reject the null hypothesis (H0) of normality. The errors are not normally distributed, and the normality assumption is violated.

Now checking the multicollinearity issue

vif_koyck = vif(Koyck_model$model)
vif_koyck
##      Y.1      X.t 
## 1.089155 1.089155
vif_koyck > 10
##   Y.1   X.t 
## FALSE FALSE
  • According to the VIF values, the above model does not have a multicollinearity problem.

Fitting autoregressive distributed lag models

Autoregressive distributed lag (ARDL) models are the last model type we consider from the time series regression approach. To choose the orders of ARDL(p,q), we build a loop that fits autoregressive distributed lag models over a range of lag lengths for the predictor (p) and orders of the autoregressive part (q), and calculates accuracy measures (AIC, BIC, and MASE) for each.

for (i in 1:5) {
  for (j in 1:5) {
    model_2 = ardlDlm(y = as.vector(mortality.ts), x = as.vector(temp.ts) + as.vector(chem_1.ts) + as.vector(chem_2.ts) + as.vector(particle.size.ts), p = i, q = j)
    cat("p =", i, "q =", j, "AIC =", AIC(model_2$model), "BIC =", BIC(model_2$model), "MASE =", MASE(model_2)$MASE, "\n")
  }
}
## p = 1 q = 1 AIC = 3266.984 BIC = 3288.126 MASE = 0.8931963 
## p = 1 q = 2 AIC = 3145.391 BIC = 3170.751 MASE = 0.7948619 
## p = 1 q = 3 AIC = 3140.519 BIC = 3170.091 MASE = 0.7943807 
## p = 1 q = 4 AIC = 3135.02 BIC = 3168.8 MASE = 0.7930542 
## p = 1 q = 5 AIC = 3131.09 BIC = 3169.076 MASE = 0.7901741 
## p = 2 q = 1 AIC = 3250.602 BIC = 3275.962 MASE = 0.8887641 
## p = 2 q = 2 AIC = 3147.348 BIC = 3176.934 MASE = 0.7947922 
## p = 2 q = 3 AIC = 3142.4 BIC = 3176.196 MASE = 0.7940517 
## p = 2 q = 4 AIC = 3136.76 BIC = 3174.764 MASE = 0.7926527 
## p = 2 q = 5 AIC = 3132.828 BIC = 3175.034 MASE = 0.7898194 
## p = 3 q = 1 AIC = 3245.331 BIC = 3274.903 MASE = 0.8871001 
## p = 3 q = 2 AIC = 3142.864 BIC = 3176.661 MASE = 0.7936563 
## p = 3 q = 3 AIC = 3144.3 BIC = 3182.321 MASE = 0.7939373 
## p = 3 q = 4 AIC = 3138.584 BIC = 3180.81 MASE = 0.7923693 
## p = 3 q = 5 AIC = 3134.595 BIC = 3181.021 MASE = 0.7893653 
## p = 4 q = 1 AIC = 3216.275 BIC = 3250.056 MASE = 0.8595735 
## p = 4 q = 2 AIC = 3119.965 BIC = 3157.968 MASE = 0.7764105 
## p = 4 q = 3 AIC = 3120.931 BIC = 3163.157 MASE = 0.7762804 
## p = 4 q = 4 AIC = 3122.79 BIC = 3169.238 MASE = 0.7758835 
## p = 4 q = 5 AIC = 3118.246 BIC = 3168.894 MASE = 0.7712669 
## p = 5 q = 1 AIC = 3212.646 BIC = 3250.632 MASE = 0.8584651 
## p = 5 q = 2 AIC = 3115.924 BIC = 3158.13 MASE = 0.7735218 
## p = 5 q = 3 AIC = 3116.817 BIC = 3163.244 MASE = 0.7732834 
## p = 5 q = 4 AIC = 3118.643 BIC = 3169.29 MASE = 0.7728108 
## p = 5 q = 5 AIC = 3119.911 BIC = 3174.779 MASE = 0.7703275
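Reading the best model off 25 printed lines is error-prone, so it can help to collect the grid in a data frame and sort it. The base-R sketch below does this on simulated data. The helper fit_ardl is hypothetical and fits the ARDL regression by ordinary least squares, following the convention visible in the summaries of this report (p lags of the predictor, q lags of the response). The MASE definition is the same assumed in-sample version as before and may differ from dLagM's.

```r
set.seed(11)
n <- 400
x <- as.numeric(arima.sim(list(ar = 0.5), n = n))
y <- numeric(n)
for (t in 3:n) {
  # True process: two lags of y, current and first lag of x.
  y[t] <- 1 + 0.4 * y[t - 1] + 0.3 * y[t - 2] +
    0.5 * x[t] - 0.2 * x[t - 1] + rnorm(1, sd = 0.5)
}

# Hypothetical helper: regress y_t on x_t..x_{t-p} and y_{t-1}..y_{t-q}.
fit_ardl <- function(y, x, p, q) {
  m <- max(p, q)
  X <- embed(x, m + 1)[, 1:(p + 1), drop = FALSE]
  Y <- embed(y, m + 1)
  Ylag <- Y[, 2:(q + 1), drop = FALSE]
  colnames(X) <- paste0("x.", 0:p)
  colnames(Ylag) <- paste0("y.", 1:q)
  lm(y ~ ., data = data.frame(y = Y[, 1], X, Ylag))
}

grid <- expand.grid(p = 1:3, q = 1:3)
grid$AIC <- grid$BIC <- grid$MASE <- NA_real_
for (i in seq_len(nrow(grid))) {
  fit <- fit_ardl(y, x, grid$p[i], grid$q[i])
  y_used <- model.response(model.frame(fit))
  grid$AIC[i]  <- AIC(fit)
  grid$BIC[i]  <- BIC(fit)
  grid$MASE[i] <- mean(abs(residuals(fit))) / mean(abs(diff(y_used)))
}

ranked <- grid[order(grid$MASE), ]   # smallest MASE first
print(head(ranked, 5))
best_bic <- grid[which.min(grid$BIC), ]
```

Sorting by one criterion and inspecting the others side by side makes disagreements between MASE, AIC, and BIC easy to spot before choosing candidates.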

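Five candidate models were chosen for fitting and analysis. Of these, only ARDL(5,5) has the lowest MASE in the output above (0.7703275); the other four are the q = 4 models for p = 1 to 4, and for each of those p values the q = 5 model has a slightly lower MASE. The models are as follows: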

  • ARDL(1,4)

  • ARDL(2,4)

  • ARDL(3,4)

  • ARDL(4,4)

  • ARDL(5,5)

  • Fitting an autoregressive distributed lag model for multiple predictors, with weekly mortality as the dependent variable (p=1, q=4).

ardldlm_14 = ardlDlm(y =as.vector(mortality.ts), x=as.vector(temp.ts)  + as.vector(chem_1.ts)  + as.vector(chem_2.ts)  + as.vector(particle.size.ts),p = 1, q =4)
summary(ardldlm_14)
## 
## Time series regression with "ts" data:
## Start = 5, End = 508
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -16.0882  -3.5352  -0.2849   3.4947  22.1967 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.95265    2.92040   0.669  0.50405    
## X.t          0.09903    0.01214   8.156 2.84e-15 ***
## X.1         -0.03391    0.01270  -2.670  0.00783 ** 
## Y.1          0.41830    0.04442   9.417  < 2e-16 ***
## Y.2          0.38769    0.04589   8.447 3.30e-16 ***
## Y.3          0.01034    0.04589   0.225  0.82175    
## Y.4          0.06399    0.04259   1.502  0.13364    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.378 on 497 degrees of freedom
## Multiple R-squared:  0.7138, Adjusted R-squared:  0.7103 
## F-statistic: 206.6 on 6 and 497 DF,  p-value: < 2.2e-16

The autoregressive distributed lag model above has p = 1 and q = 4. The terms X.t, X.1, Y.1, and Y.2 are significant at the 5% level, whereas Y.3 and Y.4 are not. The adjusted R-squared is 0.7103, so the model explains 71.03 percent of the variability in weekly mortality. The overall F-test has a p-value below 2.2e-16, which is less than 0.05, so the model as a whole is statistically significant.

checkresiduals(ardldlm_14$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 23.295, df = 10, p-value = 0.009709
shapiro.test(residuals(ardldlm_14$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ardldlm_14$model)
## W = 0.99307, p-value = 0.02006

The residual graphs for the above model are shown in Figure 25:

  • The time series plot shows that the residuals are not randomly distributed.
  • The ACF plot shows serial correlation as well as seasonality remaining in the residuals.
  • The Breusch-Godfrey test has a p-value below 0.05 (p = 0.009709), so we reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals suggests a departure from normality.
  • The Shapiro-Wilk test has a p-value below 0.05 (p = 0.02006), so we reject the null hypothesis (H0) of normality. The errors are not normally distributed, and the normality assumption is violated.

Now checking the multicollinearity issue

vif_ardldlm_14=vif(ardldlm_14$model)
vif_ardldlm_14
##       X.t L(X.t, 1) L(y.t, 1) L(y.t, 2) L(y.t, 3) L(y.t, 4) 
##  1.131635  1.238256  3.431484  3.665752  3.675594  3.162498
vif_ardldlm_14>10
##       X.t L(X.t, 1) L(y.t, 1) L(y.t, 2) L(y.t, 3) L(y.t, 4) 
##     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE

According to the VIF values, the model with p=1, q=4 does not have a multicollinearity problem.

  • Fitting an autoregressive distributed lag model for multiple predictors, with weekly mortality as the dependent variable (p=2, q=4).
ardldlm_24 = ardlDlm(y =as.vector(mortality.ts), x=as.vector(temp.ts)  + as.vector(chem_1.ts)  + as.vector(chem_2.ts)  + as.vector(particle.size.ts),p = 2, q =4)
summary(ardldlm_24)
## 
## Time series regression with "ts" data:
## Start = 5, End = 508
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -15.9442  -3.4942  -0.2707   3.5021  22.0101 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.681700   2.971382   0.566  0.57167    
## X.t          0.096851   0.012897   7.510 2.78e-13 ***
## X.1         -0.035479   0.013083  -2.712  0.00692 ** 
## X.2          0.006856   0.013569   0.505  0.61359    
## Y.1          0.421353   0.044859   9.393  < 2e-16 ***
## Y.2          0.380035   0.048365   7.858 2.45e-14 ***
## Y.3          0.010976   0.045939   0.239  0.81125    
## Y.4          0.066377   0.042885   1.548  0.12231    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.382 on 496 degrees of freedom
## Multiple R-squared:  0.7139, Adjusted R-squared:  0.7099 
## F-statistic: 176.8 on 7 and 496 DF,  p-value: < 2.2e-16

The autoregressive distributed lag model above has p = 2 and q = 4. The terms X.t, X.1, Y.1, and Y.2 are significant at the 5% level, whereas X.2, Y.3, and Y.4 are not. The adjusted R-squared is 0.7099, so the model explains 70.99 percent of the variability in weekly mortality. The overall F-test has a p-value below 2.2e-16, which is less than 0.05, so the model as a whole is statistically significant.

checkresiduals(ardldlm_24$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 11
## 
## data:  Residuals
## LM test = 26.973, df = 11, p-value = 0.004639
 shapiro.test(residuals(ardldlm_24$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ardldlm_24$model)
## W = 0.99333, p-value = 0.02487

The residual graphs for the above model are shown in Figure 26:

  • The time series plot shows that the residuals are not randomly distributed.
  • The ACF plot shows serial correlation as well as seasonality remaining in the residuals.
  • The Breusch-Godfrey test has a p-value below 0.05 (p = 0.004639), so we reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals suggests a departure from normality.
  • The Shapiro-Wilk test has a p-value below 0.05 (p = 0.02487), so we reject the null hypothesis (H0) of normality. The errors are not normally distributed, and the normality assumption is violated.

Now checking the multicollinearity issue

vif_ardldlm_24=vif(ardldlm_24$model)
vif_ardldlm_24
##       X.t L(X.t, 1) L(X.t, 2) L(y.t, 1) L(y.t, 2) L(y.t, 3) L(y.t, 4) 
##  1.274860  1.311734  1.410571  3.494798  4.064891  3.678327  3.201439
vif_ardldlm_24>10
##       X.t L(X.t, 1) L(X.t, 2) L(y.t, 1) L(y.t, 2) L(y.t, 3) L(y.t, 4) 
##     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE

According to the VIF values, the model with p=2, q=4 does not have a multicollinearity problem.

  • Fitting an autoregressive distributed lag model for multiple predictors, with weekly mortality as the dependent variable (p=3, q=4).
ardldlm_34 = ardlDlm(y =as.vector(mortality.ts), x=as.vector(temp.ts)  + as.vector(chem_1.ts)  + as.vector(chem_2.ts)  + as.vector(particle.size.ts),p = 3, q =4)
summary(ardldlm_34)
## 
## Time series regression with "ts" data:
## Start = 5, End = 508
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -16.0153  -3.4434  -0.2995   3.5496  21.9420 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.552344   2.990073   0.519   0.6039    
## X.t          0.095928   0.013097   7.324 9.81e-13 ***
## X.1         -0.037139   0.013689  -2.713   0.0069 ** 
## X.2          0.005829   0.013803   0.422   0.6730    
## X.3          0.005706   0.013715   0.416   0.6776    
## Y.1          0.421248   0.044898   9.382  < 2e-16 ***
## Y.2          0.382757   0.048846   7.836 2.86e-14 ***
## Y.3          0.005631   0.047738   0.118   0.9061    
## Y.4          0.067442   0.042997   1.569   0.1174    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.387 on 495 degrees of freedom
## Multiple R-squared:  0.714,  Adjusted R-squared:  0.7094 
## F-statistic: 154.5 on 8 and 495 DF,  p-value: < 2.2e-16

The autoregressive distributed lag model above has p = 3 and q = 4. The terms X.t, X.1, Y.1, and Y.2 are significant at the 5% level, whereas X.2, X.3, Y.3, and Y.4 are not. The adjusted R-squared is 0.7094, so the model explains 70.94 percent of the variability in weekly mortality. The overall F-test has a p-value below 2.2e-16, which is less than 0.05, so the model as a whole is statistically significant.

checkresiduals(ardldlm_34$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 12
## 
## data:  Residuals
## LM test = 29.359, df = 12, p-value = 0.003483
 shapiro.test(residuals(ardldlm_34$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ardldlm_34$model)
## W = 0.9934, p-value = 0.02652

The residual graphs for the above model are shown in Figure 27:

  • The time series plot shows that the residuals are not randomly distributed.
  • The ACF plot shows serial correlation as well as seasonality remaining in the residuals.
  • The Breusch-Godfrey test has a p-value below 0.05 (p = 0.003483), so we reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals suggests a departure from normality.
  • The Shapiro-Wilk test has a p-value below 0.05 (p = 0.02652), so we reject the null hypothesis (H0) of normality. The errors are not normally distributed, and the normality assumption is violated.

Now checking the multicollinearity issue

vif_ardldlm_34=vif(ardldlm_34$model)
vif_ardldlm_34
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(y.t, 1) L(y.t, 2) L(y.t, 3) L(y.t, 4) 
##  1.312536  1.433662  1.457195  1.434802  3.494908  4.139196  3.965569  3.212841
vif_ardldlm_34>10
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(y.t, 1) L(y.t, 2) L(y.t, 3) L(y.t, 4) 
##     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE

According to the VIF values, the model with p=3, q=4 does not have a multicollinearity problem.

  • Fitting an autoregressive distributed lag model for multiple predictors, with weekly mortality as the dependent variable (p=4, q=4).
ardldlm_44 = ardlDlm(y =as.vector(mortality.ts), x=as.vector(temp.ts)  + as.vector(chem_1.ts)  + as.vector(chem_2.ts)  + as.vector(particle.size.ts),p = 4, q =4)
summary(ardldlm_44)
## 
## Time series regression with "ts" data:
## Start = 5, End = 508
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -15.6223  -3.5512  -0.4413   3.2751  23.8114 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.114314   2.942560   0.379 0.705082    
## X.t          0.085519   0.013116   6.520 1.74e-10 ***
## X.1         -0.045102   0.013595  -3.318 0.000975 ***
## X.2         -0.007922   0.013962  -0.567 0.570709    
## X.3         -0.003119   0.013650  -0.228 0.819355    
## X.4          0.057646   0.013681   4.213 2.99e-05 ***
## Y.1          0.420537   0.044157   9.524  < 2e-16 ***
## Y.2          0.382992   0.048039   7.972 1.09e-14 ***
## Y.3          0.037189   0.047544   0.782 0.434470    
## Y.4          0.016400   0.043988   0.373 0.709435    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.298 on 494 degrees of freedom
## Multiple R-squared:  0.724,  Adjusted R-squared:  0.7189 
## F-statistic:   144 on 9 and 494 DF,  p-value: < 2.2e-16

The autoregressive distributed lag model above has p = 4 and q = 4. The terms X.t, X.1, X.4, Y.1, and Y.2 are significant at the 5% level, whereas X.2, X.3, Y.3, and Y.4 are not. The adjusted R-squared is 0.7189, so the model explains 71.89 percent of the variability in weekly mortality. The overall F-test has a p-value below 2.2e-16, which is less than 0.05, so the model as a whole is statistically significant.

checkresiduals(ardldlm_44$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 13
## 
## data:  Residuals
## LM test = 23.269, df = 13, p-value = 0.03856
 shapiro.test(residuals(ardldlm_44$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ardldlm_44$model)
## W = 0.9903, p-value = 0.002109

The residual graphs for the above model are shown in Figure 28:

  • The time series plot shows that the residuals are not randomly distributed.
  • The ACF plot shows serial correlation as well as seasonality remaining in the residuals.
  • The Breusch-Godfrey test has a p-value below 0.05 (p = 0.03856), so we reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals suggests a departure from normality.
  • The Shapiro-Wilk test has a p-value below 0.05 (p = 0.002109), so we reject the null hypothesis (H0) of normality. The errors are not normally distributed, and the normality assumption is violated.

Now checking the multicollinearity issue

vif_ardldlm_44=vif(ardldlm_44$model)
vif_ardldlm_44
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(y.t, 1) L(y.t, 2) L(y.t, 3) 
##  1.360812  1.461911  1.541413  1.469395  1.480920  3.494959  4.139202  4.066486 
## L(y.t, 4) 
##  3.476505
vif_ardldlm_44>10
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(y.t, 1) L(y.t, 2) L(y.t, 3) 
##     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE 
## L(y.t, 4) 
##     FALSE

According to the VIF values, the model with p=4, q=4 does not have a multicollinearity problem.

  • Fitting an autoregressive distributed lag model for multiple predictors, with weekly mortality as the dependent variable (p=5, q=5).
ardldlm_55 = ardlDlm(y =as.vector(mortality.ts), x=as.vector(temp.ts)  + as.vector(chem_1.ts)  + as.vector(chem_2.ts)  + as.vector(particle.size.ts),p = 5, q =5)
summary(ardldlm_55)
## 
## Time series regression with "ts" data:
## Start = 6, End = 508
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -16.4243  -3.4712  -0.2898   3.1652  23.8281 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.151977   3.065488   0.050   0.9605    
## X.t          0.086507   0.013252   6.528 1.67e-10 ***
## X.1         -0.045082   0.013733  -3.283   0.0011 ** 
## X.2         -0.009456   0.014123  -0.670   0.5035    
## X.3         -0.004156   0.014107  -0.295   0.7684    
## X.4          0.057624   0.013866   4.156 3.82e-05 ***
## X.5          0.008008   0.014002   0.572   0.5676    
## Y.1          0.412127   0.045098   9.139  < 2e-16 ***
## Y.2          0.383368   0.048248   7.946 1.33e-14 ***
## Y.3          0.022594   0.051224   0.441   0.6593    
## Y.4          0.002833   0.047745   0.059   0.9527    
## Y.5          0.037484   0.044347   0.845   0.3984    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.305 on 491 degrees of freedom
## Multiple R-squared:  0.7246, Adjusted R-squared:  0.7185 
## F-statistic: 117.5 on 11 and 491 DF,  p-value: < 2.2e-16

The autoregressive distributed lag model above has p = 5 and q = 5. The terms X.t, X.1, X.4, Y.1, and Y.2 are significant at the 5% level, whereas X.2, X.3, X.5, Y.3, Y.4, and Y.5 are not. The adjusted R-squared is 0.7185, so the model explains 71.85 percent of the variability in weekly mortality. The overall F-test has a p-value below 2.2e-16, which is less than 0.05, so the model as a whole is statistically significant.

checkresiduals(ardldlm_55$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 15
## 
## data:  Residuals
## LM test = 36.747, df = 15, p-value = 0.001378
shapiro.test(residuals(ardldlm_55$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ardldlm_55$model)
## W = 0.99027, p-value = 0.002086

The residual graphs for the above model are shown in Figure 29:

  • The time series plot shows that the residuals are not randomly distributed.
  • The ACF plot indicates serial correlation as well as seasonality in the residuals.
  • Since the p-value (0.001378) is less than 0.05, the Breusch-Godfrey test rejects the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals suggests a departure from normality.
  • Since the Shapiro-Wilk p-value (0.002086) is less than 0.05, we reject the null hypothesis (H0) of normality. The errors are not normally distributed, so the normality assumption is violated.
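
The mechanics behind the Breusch-Godfrey statistic used above can be sketched in base R. This is a minimal illustration on simulated data, not the mortality model: the helper `bg_lm_test()` and the simulated series are assumptions introduced here, and in practice `checkresiduals()` or `lmtest::bgtest()` would be used.

```r
# Illustrative only: simulated regression with AR(1) errors (not the mortality data)
set.seed(1)
n <- 200
x <- rnorm(n)
e <- as.numeric(arima.sim(list(ar = 0.6), n = n))
y <- 1 + 0.5 * x + e
fit <- lm(y ~ x)

# Hypothetical helper: LM form of the Breusch-Godfrey test.
# Regress the residuals on the original regressors plus `order` lagged
# residuals (missing lags filled with 0); LM = n * R^2 ~ chi-squared(order).
bg_lm_test <- function(fit, order) {
  res <- residuals(fit)
  n <- length(res)
  X <- model.matrix(fit)
  Z <- sapply(seq_len(order), function(k) c(rep(0, k), res[seq_len(n - k)]))
  aux <- lm.fit(cbind(X, Z), res)
  lm_stat <- n * (1 - sum(aux$residuals^2) / sum(res^2))
  list(statistic = lm_stat, df = order,
       p.value = pchisq(lm_stat, df = order, lower.tail = FALSE))
}

bg <- bg_lm_test(fit, order = 5)
sw <- shapiro.test(residuals(fit))   # normality check on the same residuals
bg$p.value < 0.05                    # TRUE when serial correlation is detected
```

A small p-value from either test is read the same way as in the bullets above: the corresponding residual assumption is rejected at the 5% level.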

Next, we check for multicollinearity.

vif_ardldlm_55=vif(ardldlm_55$model)
vif_ardldlm_55
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(X.t, 5) L(y.t, 1) L(y.t, 2) 
##  1.382653  1.487330  1.572984  1.565247  1.511810  1.546630  3.629411  4.161487 
## L(y.t, 3) L(y.t, 4) L(y.t, 5) 
##  4.684157  4.078127  3.517449
vif_ardldlm_55>10
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(X.t, 5) L(y.t, 1) L(y.t, 2) 
##     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE 
## L(y.t, 3) L(y.t, 4) L(y.t, 5) 
##     FALSE     FALSE     FALSE
  • All VIF values are below 10, so the model with p=5, q=5 does not have a multicollinearity problem.
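
To show what a VIF measures, the sketch below computes it by hand as 1 / (1 - R²), where R² comes from regressing one predictor on all the others. The simulated series and the helper `vif_manual()` are assumptions made for illustration; for a model with only numeric terms this should agree with what `vif()` reports.

```r
# Illustrative only: an autocorrelated series and its first three lags
set.seed(42)
x <- as.numeric(arima.sim(list(ar = 0.8), n = 300))
X <- embed(x, 4)                      # columns: x_t, x_{t-1}, x_{t-2}, x_{t-3}
colnames(X) <- c("x.t", "x.1", "x.2", "x.3")

# Hypothetical helper: VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from
# regressing predictor j on all the other predictors.
vif_manual <- function(X) {
  sapply(colnames(X), function(j) {
    others <- X[, colnames(X) != j, drop = FALSE]
    r2 <- summary(lm(X[, j] ~ others))$r.squared
    1 / (1 - r2)
  })
}

v <- vif_manual(X)
v
v > 10    # the same rule-of-thumb threshold used above
```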

Therefore, we may infer that none of the models from the time series regression method successfully captured the autocorrelation and seasonality in the series.

  • The data frame has been constructed to contain the model accuracy values, such as AIC/BIC and MASE, from the models that have been fitted so far.
model_dlm <- data.frame(Model=character(),MASE=numeric(),
                           BIC= numeric(),AICC=numeric(),AIC=numeric())
model_dlm = rbind(model_dlm,cbind(Model="Finite DLM",
                                               AIC = AIC(finite_DLM),
                                              BIC = BIC(finite_DLM),
                                              MASE= MASE(finite_DLM)
))
## [1] 3530.224
## [1] 3584.962
model_dlm = rbind(model_dlm,cbind(Model="Polynomial DLM",
                                               BIC = BIC(poly_DLM),
                                               AIC = AIC(poly_DLM),
                                              MASE= MASE(poly_DLM)
                                              ))
## [1] 3555.085
## [1] 3525.611
model_dlm = rbind(model_dlm,cbind(Model="Koyck Model",
                                               AIC = AIC(Koyck_model),
                                      BIC = BIC(Koyck_model),
                                              MASE= MASE(Koyck_model)
                                              ))
## [1] 3416.809
## [1] 3433.723
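# Note: the result is assigned to model_dlm_ (trailing underscore), so this row
# is not carried into model_dlm and does not appear in the sorted table below.
model_dlm_ = rbind(model_dlm,cbind(Model="autoregressive_dlm_14",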
                                               AIC = AIC(ardldlm_14),
                                      BIC = BIC(ardldlm_14),
                                              MASE= MASE(ardldlm_14)
                                              ))
## [1] 3135.02
## [1] 3168.8
model_dlm = rbind(model_dlm,cbind(Model="autoregressive_dlm_24",
                                               AIC = AIC(ardldlm_24),
                                     BIC = BIC(ardldlm_24),
                                              MASE= MASE(ardldlm_24)
                                              ))
## [1] 3136.76
## [1] 3174.764
model_dlm = rbind(model_dlm,cbind(Model="autoregressive_dlm_34",
                                               AIC = AIC(ardldlm_34),
                                     BIC = BIC(ardldlm_34),
                                              MASE= MASE(ardldlm_34)
                                              ))
## [1] 3138.584
## [1] 3180.81
model_dlm = rbind(model_dlm,cbind(Model="autoregressive_dlm_44",
                                               AIC = AIC(ardldlm_44),
                                     BIC = BIC(ardldlm_44),
                                              MASE= MASE(ardldlm_44)
                                              ))
## [1] 3122.79
## [1] 3169.238
model_dlm = rbind(model_dlm,cbind(Model="autoregressive_dlm_55",
                                               AIC = AIC(ardldlm_55),
                                     BIC = BIC(ardldlm_55),
                                              MASE= MASE(ardldlm_55)
                                              ))
## [1] 3119.911
## [1] 3174.779
sortScore(model_dlm,score = "mase")
##                             Model      AIC      BIC      MASE
## ardldlm_55  autoregressive_dlm_55 3119.911 3174.779 0.7703275
## ardldlm_44  autoregressive_dlm_44 3122.790 3169.238 0.7758835
## ardldlm_34  autoregressive_dlm_34 3138.584 3180.810 0.7923693
## ardldlm_24  autoregressive_dlm_24 3136.760 3174.764 0.7926527
## Koyck_model           Koyck Model 3416.809 3433.723 1.0362819
## finite_DLM             Finite DLM 3530.224 3584.962 1.1505228
## poly_DLM           Polynomial DLM 3525.611 3555.085 1.1626725
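
For readers unfamiliar with the MASE column, the following sketch shows the usual definition: the model's mean absolute error divided by the in-sample mean absolute error of the naive one-step forecast. The helper `mase_manual()` and the toy numbers are assumptions for illustration, and the exact scaling used by `MASE()` in dLagM may differ in detail.

```r
# Hypothetical helper: MASE with the naive one-step forecast as the benchmark
mase_manual <- function(actual, fitted) {
  mae_model <- mean(abs(actual - fitted))
  mae_naive <- mean(abs(diff(actual)))   # |y_t - y_{t-1}|, in-sample
  mae_model / mae_naive
}

# Toy numbers, chosen only to make the arithmetic easy to follow
actual <- c(1, 3, 2, 5)
fitted <- c(1, 2, 2, 4)
mase_manual(actual, fitted)   # 0.5 / 2 = 0.25
```

A value below 1 means the model's fitted errors are smaller on average than those of the naive forecast, which is why the ARDL models rank above the Koyck, finite and polynomial DLMs in the table.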

Dynamic Lag Models

A distributed-lag model is a dynamic model in which the influence of a regressor x on y is spread out across time rather than occurring all at once.

dynlagmod_1 = dynlm(mortality.ts~as.vector(temp.ts) + as.vector(chem_1.ts) + as.vector(chem_1.ts) + as.vector(particle.size.ts) + L(mortality.ts,k=1)+season(mortality.ts))
summary(dynlagmod_1)
## 
## Time series regression with "ts" data:
## Start = 2010(2), End = 2019(40)
## 
## Call:
## dynlm(formula = mortality.ts ~ as.vector(temp.ts) + as.vector(chem_1.ts) + 
##     as.vector(chem_1.ts) + as.vector(particle.size.ts) + L(mortality.ts, 
##     k = 1) + season(mortality.ts))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -15.3110  -3.3959  -0.1698   3.3848  18.1287 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                 34.52112    5.80514   5.947 5.49e-09 ***
## as.vector(temp.ts)           0.01534    0.06059   0.253 0.800219    
## as.vector(chem_1.ts)         0.90919    0.15602   5.827 1.07e-08 ***
## as.vector(particle.size.ts) -0.03940    0.03863  -1.020 0.308214    
## L(mortality.ts, k = 1)       0.54084    0.03813  14.185  < 2e-16 ***
## season(mortality.ts)2        4.48199    2.52585   1.774 0.076664 .  
## season(mortality.ts)3       -1.32140    2.52332  -0.524 0.600763    
## season(mortality.ts)4        3.35606    2.54382   1.319 0.187738    
## season(mortality.ts)5        2.11828    2.53635   0.835 0.404065    
## season(mortality.ts)6        0.25790    2.56852   0.100 0.920065    
## season(mortality.ts)7        1.63545    2.61444   0.626 0.531931    
## season(mortality.ts)8        1.02091    2.60282   0.392 0.695071    
## season(mortality.ts)9        0.96224    2.61285   0.368 0.712844    
## season(mortality.ts)10       2.73645    2.63034   1.040 0.298739    
## season(mortality.ts)11       0.05524    2.64065   0.021 0.983320    
## season(mortality.ts)12      -0.70757    2.66671  -0.265 0.790874    
## season(mortality.ts)13       1.97091    2.66749   0.739 0.460374    
## season(mortality.ts)14      -3.19582    2.71628  -1.177 0.239999    
## season(mortality.ts)15       4.98584    2.75511   1.810 0.071012 .  
## season(mortality.ts)16      -3.25006    2.76229  -1.177 0.239982    
## season(mortality.ts)17      -0.81846    2.77029  -0.295 0.767793    
## season(mortality.ts)18      -2.00392    2.80467  -0.714 0.475291    
## season(mortality.ts)19       0.74507    2.87333   0.259 0.795518    
## season(mortality.ts)20       0.94569    2.87142   0.329 0.742048    
## season(mortality.ts)21      -3.71501    2.87518  -1.292 0.196986    
## season(mortality.ts)22      -0.59093    2.89546  -0.204 0.838376    
## season(mortality.ts)23      -1.74104    2.90740  -0.599 0.549586    
## season(mortality.ts)24      -2.98157    2.90951  -1.025 0.306021    
## season(mortality.ts)25       2.00511    2.93032   0.684 0.494159    
## season(mortality.ts)26      -3.84264    2.91315  -1.319 0.187817    
## season(mortality.ts)27      -2.72631    2.86676  -0.951 0.342110    
## season(mortality.ts)28      -3.30332    2.86857  -1.152 0.250113    
## season(mortality.ts)29      -2.39140    2.85880  -0.837 0.403313    
## season(mortality.ts)30      -1.42690    2.87546  -0.496 0.619971    
## season(mortality.ts)31      -2.76325    2.76973  -0.998 0.318979    
## season(mortality.ts)32      -1.74793    2.76310  -0.633 0.527317    
## season(mortality.ts)33      -3.44268    2.78276  -1.237 0.216677    
## season(mortality.ts)34      -1.08051    2.71754  -0.398 0.691110    
## season(mortality.ts)35      -5.82374    2.64274  -2.204 0.028052 *  
## season(mortality.ts)36      -1.78802    2.63982  -0.677 0.498545    
## season(mortality.ts)37      -3.42434    2.60943  -1.312 0.190089    
## season(mortality.ts)38      -0.22625    2.59614  -0.087 0.930594    
## season(mortality.ts)39      -5.21826    2.57912  -2.023 0.043635 *  
## season(mortality.ts)40      -0.61172    2.58561  -0.237 0.813086    
## season(mortality.ts)41      -4.16338    2.62540  -1.586 0.113484    
## season(mortality.ts)42      -1.43106    2.65094  -0.540 0.589580    
## season(mortality.ts)43      -0.28438    2.67197  -0.106 0.915288    
## season(mortality.ts)44      -1.31043    2.64158  -0.496 0.620081    
## season(mortality.ts)45       1.47508    2.66488   0.554 0.580177    
## season(mortality.ts)46       9.17518    2.60647   3.520 0.000475 ***
## season(mortality.ts)47       8.46256    2.65097   3.192 0.001510 ** 
## season(mortality.ts)48      -0.97593    2.60713  -0.374 0.708334    
## season(mortality.ts)49       3.40194    2.59204   1.312 0.190034    
## season(mortality.ts)50      -0.53472    2.59421  -0.206 0.836791    
## season(mortality.ts)51       2.87166    2.59487   1.107 0.269027    
## season(mortality.ts)52       4.41168    2.58214   1.709 0.088225 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.466 on 451 degrees of freedom
## Multiple R-squared:  0.7337, Adjusted R-squared:  0.7013 
## F-statistic:  22.6 on 55 and 451 DF,  p-value: < 2.2e-16

At the 5% level of significance, most seasonal effects are insignificant; only seasonal terms 35, 39, 46 and 47 are significant. Temperature and particle size are insignificant, while chem_1, the first lag of mortality and the overall model are significant. The adjusted R-squared of this model is 70.13%.

checkresiduals(dynlagmod_1)

## 
##  Breusch-Godfrey test for serial correlation of order up to 101
## 
## data:  Residuals
## LM test = 177.99, df = 101, p-value = 3.558e-06

The residual graphs for the above model are shown in Figure 30:

  • The time series plot shows that the errors are not randomly distributed.
  • The ACF plot has a large number of highly significant lags as well as a wave pattern at seasonal lags, indicating that autocorrelation and seasonality are still present in the residuals.
  • Since the p-value (3.558e-06) is less than 0.05, the Breusch-Godfrey test rejects the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals suggests a departure from normality.

dynlagmod_2 = dynlm(mortality.ts ~as.vector(temp.ts) + as.vector(chem_1.ts) + as.vector(chem_1.ts) + as.vector(particle.size.ts) +L(mortality.ts,k=1)+L(mortality.ts,k=2)+season(mortality.ts))
summary(dynlagmod_2)
## 
## Time series regression with "ts" data:
## Start = 2010(3), End = 2019(40)
## 
## Call:
## dynlm(formula = mortality.ts ~ as.vector(temp.ts) + as.vector(chem_1.ts) + 
##     as.vector(chem_1.ts) + as.vector(particle.size.ts) + L(mortality.ts, 
##     k = 1) + L(mortality.ts, k = 2) + season(mortality.ts))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.0233  -3.1708  -0.1462   3.2140  15.6154 
## 
## Coefficients:
##                              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                 14.828785   5.678118   2.612 0.009315 ** 
## as.vector(temp.ts)           0.058822   0.055445   1.061 0.289306    
## as.vector(chem_1.ts)         0.547931   0.147162   3.723 0.000222 ***
## as.vector(particle.size.ts)  0.002878   0.035504   0.081 0.935420    
## L(mortality.ts, k = 1)       0.325300   0.041415   7.855 2.98e-14 ***
## L(mortality.ts, k = 2)       0.398960   0.041636   9.582  < 2e-16 ***
## season(mortality.ts)2        2.137574   2.370409   0.902 0.367660    
## season(mortality.ts)3       -1.029187   2.301579  -0.447 0.654971    
## season(mortality.ts)4        1.861737   2.325277   0.801 0.423757    
## season(mortality.ts)5        2.644116   2.313923   1.143 0.253773    
## season(mortality.ts)6        0.466847   2.342713   0.199 0.842137    
## season(mortality.ts)7        1.283487   2.384762   0.538 0.590703    
## season(mortality.ts)8        1.831143   2.375428   0.771 0.441190    
## season(mortality.ts)9        1.391915   2.383483   0.584 0.559525    
## season(mortality.ts)10       3.167555   2.399433   1.320 0.187465    
## season(mortality.ts)11       1.027778   2.410600   0.426 0.670052    
## season(mortality.ts)12      -0.660445   2.432183  -0.272 0.786097    
## season(mortality.ts)13       2.149807   2.432971   0.884 0.377378    
## season(mortality.ts)14      -1.647077   2.482757  -0.663 0.507411    
## season(mortality.ts)15       4.632538   2.513054   1.843 0.065931 .  
## season(mortality.ts)16      -0.504275   2.535804  -0.199 0.842461    
## season(mortality.ts)17      -1.574618   2.527837  -0.623 0.533659    
## season(mortality.ts)18      -0.743516   2.561504  -0.290 0.771747    
## season(mortality.ts)19       1.929109   2.623665   0.735 0.462557    
## season(mortality.ts)20       2.850804   2.626630   1.085 0.278351    
## season(mortality.ts)21      -2.337499   2.626405  -0.890 0.373943    
## season(mortality.ts)22      -0.971063   2.641115  -0.368 0.713291    
## season(mortality.ts)23      -0.297867   2.656147  -0.112 0.910760    
## season(mortality.ts)24      -2.100841   2.655336  -0.791 0.429258    
## season(mortality.ts)25       3.114347   2.675254   1.164 0.244989    
## season(mortality.ts)26      -1.511426   2.668324  -0.566 0.571383    
## season(mortality.ts)27      -2.820492   2.614678  -1.079 0.281294    
## season(mortality.ts)28      -1.866550   2.620728  -0.712 0.476695    
## season(mortality.ts)29      -0.752266   2.613147  -0.288 0.773573    
## season(mortality.ts)30       0.648592   2.631703   0.246 0.805444    
## season(mortality.ts)31      -0.460596   2.537724  -0.181 0.856057    
## season(mortality.ts)32       0.194960   2.528367   0.077 0.938571    
## season(mortality.ts)33      -0.915307   2.551875  -0.359 0.720003    
## season(mortality.ts)34       0.508471   2.484176   0.205 0.837912    
## season(mortality.ts)35      -3.546508   2.422111  -1.464 0.143833    
## season(mortality.ts)36      -0.811909   2.409829  -0.337 0.736338    
## season(mortality.ts)37      -0.035615   2.406139  -0.015 0.988197    
## season(mortality.ts)38       2.126348   2.380517   0.893 0.372213    
## season(mortality.ts)39      -1.742840   2.380107  -0.732 0.464396    
## season(mortality.ts)40       0.927363   2.363636   0.392 0.694989    
## season(mortality.ts)41      -0.890727   2.418681  -0.368 0.712846    
## season(mortality.ts)42       0.102491   2.423038   0.042 0.966279    
## season(mortality.ts)43       2.563934   2.454937   1.044 0.296863    
## season(mortality.ts)44       1.282240   2.424304   0.529 0.597129    
## season(mortality.ts)45       3.136695   2.436604   1.287 0.198645    
## season(mortality.ts)46      10.864139   2.383708   4.558 6.68e-06 ***
## season(mortality.ts)47      10.255995   2.424977   4.229 2.84e-05 ***
## season(mortality.ts)48      -0.727981   2.377948  -0.306 0.759641    
## season(mortality.ts)49       0.724424   2.380548   0.304 0.761033    
## season(mortality.ts)50      -0.561303   2.366030  -0.237 0.812584    
## season(mortality.ts)51       1.549623   2.370666   0.654 0.513662    
## season(mortality.ts)52       4.421821   2.355017   1.878 0.061082 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.985 on 449 degrees of freedom
## Multiple R-squared:  0.7784, Adjusted R-squared:  0.7507 
## F-statistic: 28.16 on 56 and 449 DF,  p-value: < 2.2e-16

At the 5% level of significance, most seasonal effects are insignificant; only seasonal terms 46 and 47 are significant. Both lags of mortality, chem_1 and the overall model are significant, while temperature and particle size are not. The adjusted R-squared of this model is 75.07%.

checkresiduals(dynlagmod_2)

## 
##  Breusch-Godfrey test for serial correlation of order up to 101
## 
## data:  Residuals
## LM test = 112.56, df = 101, p-value = 0.203

The residual graphs for the above model are shown in Figure 31:

  • The time series plot shows that the errors are not randomly distributed.
  • The ACF plot has a large number of highly significant lags, indicating that autocorrelation and seasonality are still present in the residuals.
  • Since the p-value (0.203) is greater than 0.05, the Breusch-Godfrey test fails to reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals suggests a departure from normality.

dynlagmod_3 = dynlm(mortality.ts ~as.vector(temp.ts) + as.vector(chem_1.ts) + as.vector(chem_1.ts) + as.vector(particle.size.ts) +L(mortality.ts,k=1)+L(mortality.ts,k=2)+ L(mortality.ts,k=3) +season(mortality.ts))
summary(dynlagmod_3)
## 
## Time series regression with "ts" data:
## Start = 2010(4), End = 2019(40)
## 
## Call:
## dynlm(formula = mortality.ts ~ as.vector(temp.ts) + as.vector(chem_1.ts) + 
##     as.vector(chem_1.ts) + as.vector(particle.size.ts) + L(mortality.ts, 
##     k = 1) + L(mortality.ts, k = 2) + L(mortality.ts, k = 3) + 
##     season(mortality.ts))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.1928  -3.2387  -0.1273   3.0997  15.5334 
## 
## Coefficients:
##                              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                 13.062265   5.877171   2.223 0.026746 *  
## as.vector(temp.ts)           0.063791   0.055836   1.142 0.253875    
## as.vector(chem_1.ts)         0.514918   0.149826   3.437 0.000644 ***
## as.vector(particle.size.ts)  0.006574   0.035663   0.184 0.853838    
## L(mortality.ts, k = 1)       0.303465   0.045447   6.677 7.24e-11 ***
## L(mortality.ts, k = 2)       0.381671   0.043993   8.676  < 2e-16 ***
## L(mortality.ts, k = 3)       0.055078   0.045932   1.199 0.231122    
## season(mortality.ts)2        2.035681   2.373535   0.858 0.391541    
## season(mortality.ts)3       -1.084758   2.374679  -0.457 0.648035    
## season(mortality.ts)4        1.775916   2.327963   0.763 0.445948    
## season(mortality.ts)5        2.397170   2.324952   1.031 0.303069    
## season(mortality.ts)6        0.507092   2.344303   0.216 0.828846    
## season(mortality.ts)7        1.221583   2.387571   0.512 0.609154    
## season(mortality.ts)8        1.717309   2.379632   0.722 0.470875    
## season(mortality.ts)9        1.416208   2.385223   0.594 0.552985    
## season(mortality.ts)10       3.149801   2.401486   1.312 0.190327    
## season(mortality.ts)11       1.012956   2.412694   0.420 0.674801    
## season(mortality.ts)12      -0.618914   2.434358  -0.254 0.799426    
## season(mortality.ts)13       2.033604   2.437891   0.834 0.404633    
## season(mortality.ts)14      -1.676392   2.485486  -0.674 0.500360    
## season(mortality.ts)15       4.668923   2.515860   1.856 0.064141 .  
## season(mortality.ts)16      -0.544343   2.538850  -0.214 0.830329    
## season(mortality.ts)17      -1.367053   2.535064  -0.539 0.589978    
## season(mortality.ts)18      -0.975189   2.573721  -0.379 0.704940    
## season(mortality.ts)19       1.976926   2.627051   0.753 0.452130    
## season(mortality.ts)20       2.932600   2.630056   1.115 0.265435    
## season(mortality.ts)21      -2.112093   2.634499  -0.802 0.423149    
## season(mortality.ts)22      -0.957728   2.644739  -0.362 0.717429    
## season(mortality.ts)23      -0.434230   2.663423  -0.163 0.870565    
## season(mortality.ts)24      -2.024092   2.658943  -0.761 0.446916    
## season(mortality.ts)25       3.115356   2.678968   1.163 0.245493    
## season(mortality.ts)26      -1.376184   2.672923  -0.515 0.606905    
## season(mortality.ts)27      -2.643844   2.620737  -1.009 0.313608    
## season(mortality.ts)28      -1.995720   2.627540  -0.760 0.447930    
## season(mortality.ts)29      -0.700840   2.616360  -0.268 0.788924    
## season(mortality.ts)30       0.761360   2.635758   0.289 0.772824    
## season(mortality.ts)31      -0.250556   2.544881  -0.098 0.921615    
## season(mortality.ts)32       0.419139   2.536279   0.165 0.868816    
## season(mortality.ts)33      -0.746911   2.557209  -0.292 0.770361    
## season(mortality.ts)34       0.731884   2.492031   0.294 0.769131    
## season(mortality.ts)35      -3.397714   2.426351  -1.400 0.162106    
## season(mortality.ts)36      -0.652831   2.414458  -0.270 0.786990    
## season(mortality.ts)37       0.060089   2.408648   0.025 0.980108    
## season(mortality.ts)38       2.474918   2.398811   1.032 0.302758    
## season(mortality.ts)39      -1.445947   2.393689  -0.604 0.546106    
## season(mortality.ts)40       1.280498   2.382850   0.537 0.591272    
## season(mortality.ts)41      -0.793413   2.421309  -0.328 0.743308    
## season(mortality.ts)42       0.446482   2.441056   0.183 0.854955    
## season(mortality.ts)43       2.708216   2.459484   1.101 0.271432    
## season(mortality.ts)44       1.631781   2.443894   0.668 0.504671    
## season(mortality.ts)45       3.447838   2.452209   1.406 0.160415    
## season(mortality.ts)46      11.007431   2.388397   4.609 5.29e-06 ***
## season(mortality.ts)47      10.648313   2.450660   4.345 1.72e-05 ***
## season(mortality.ts)48      -0.319261   2.404332  -0.133 0.894423    
## season(mortality.ts)49       0.631667   2.382950   0.265 0.791072    
## season(mortality.ts)50      -0.885078   2.382348  -0.372 0.710430    
## season(mortality.ts)51       1.439008   2.373546   0.606 0.544644    
## season(mortality.ts)52       4.239843   2.361222   1.796 0.073231 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.988 on 447 degrees of freedom
## Multiple R-squared:  0.779,  Adjusted R-squared:  0.7508 
## F-statistic: 27.64 on 57 and 447 DF,  p-value: < 2.2e-16

At the 5% level of significance, most seasonal effects are insignificant; only seasonal terms 46 and 47 are significant. The first two lags of mortality are significant, but the third lag is not (p-value 0.231). chem_1 and the overall model are significant. The adjusted R-squared of this model is 75.08%.

checkresiduals(dynlagmod_3)

## 
##  Breusch-Godfrey test for serial correlation of order up to 101
## 
## data:  Residuals
## LM test = 114.34, df = 101, p-value = 0.1719

The residual graphs for the above model are shown in Figure 32:

  • The time series plot shows that the errors are not randomly distributed.
  • The ACF plot has a large number of highly significant lags, indicating that autocorrelation and seasonality are still present in the residuals.
  • Since the p-value (0.1719) is greater than 0.05, the Breusch-Godfrey test fails to reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals suggests a departure from normality.

dynlagmod_4 = dynlm(mortality.ts ~as.vector(temp.ts) + as.vector(chem_1.ts) + as.vector(chem_1.ts) + as.vector(particle.size.ts) +L(mortality.ts,k=1)+L(mortality.ts,k=2)+ L(mortality.ts,k=3)  + L(mortality.ts,k=4)+ L(mortality.ts,k=5)+season(mortality.ts))
summary(dynlagmod_4)
## 
## Time series regression with "ts" data:
## Start = 2010(6), End = 2019(40)
## 
## Call:
## dynlm(formula = mortality.ts ~ as.vector(temp.ts) + as.vector(chem_1.ts) + 
##     as.vector(chem_1.ts) + as.vector(particle.size.ts) + L(mortality.ts, 
##     k = 1) + L(mortality.ts, k = 2) + L(mortality.ts, k = 3) + 
##     L(mortality.ts, k = 4) + L(mortality.ts, k = 5) + season(mortality.ts))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.1531  -3.1270  -0.1249   3.1573  15.8406 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                 10.35030    6.22034   1.664  0.09683 .  
## as.vector(temp.ts)           0.06736    0.05640   1.194  0.23299    
## as.vector(chem_1.ts)         0.47475    0.15251   3.113  0.00197 ** 
## as.vector(particle.size.ts)  0.01358    0.03598   0.377  0.70609    
## L(mortality.ts, k = 1)       0.29633    0.04580   6.470 2.59e-10 ***
## L(mortality.ts, k = 2)       0.35363    0.04763   7.424 5.85e-13 ***
## L(mortality.ts, k = 3)       0.02856    0.05060   0.564  0.57271    
## L(mortality.ts, k = 4)       0.06453    0.04779   1.350  0.17760    
## L(mortality.ts, k = 5)       0.01963    0.04610   0.426  0.67053    
## season(mortality.ts)2        2.22859    2.38374   0.935  0.35034    
## season(mortality.ts)3       -0.90562    2.38105  -0.380  0.70387    
## season(mortality.ts)4        1.84440    2.39321   0.771  0.44131    
## season(mortality.ts)5        2.66279    2.39321   1.113  0.26646    
## season(mortality.ts)6        0.49445    2.34801   0.211  0.83331    
## season(mortality.ts)7        1.52071    2.40550   0.632  0.52759    
## season(mortality.ts)8        1.97488    2.38899   0.827  0.40887    
## season(mortality.ts)9        1.56688    2.39052   0.655  0.51251    
## season(mortality.ts)10       3.48466    2.41504   1.443  0.14976    
## season(mortality.ts)11       1.31126    2.42319   0.541  0.58869    
## season(mortality.ts)12      -0.29892    2.44571  -0.122  0.90278    
## season(mortality.ts)13       2.40883    2.45238   0.982  0.32652    
## season(mortality.ts)14      -1.45370    2.49331  -0.583  0.56016    
## season(mortality.ts)15       4.92947    2.52687   1.951  0.05171 .  
## season(mortality.ts)16      -0.12680    2.55564  -0.050  0.96045    
## season(mortality.ts)17      -1.03500    2.54710  -0.406  0.68469    
## season(mortality.ts)18      -0.39415    2.60495  -0.151  0.87980    
## season(mortality.ts)19       2.12271    2.63757   0.805  0.42137    
## season(mortality.ts)20       3.27384    2.65350   1.234  0.21794    
## season(mortality.ts)21      -1.56657    2.65986  -0.589  0.55618    
## season(mortality.ts)22      -0.27978    2.68212  -0.104  0.91697    
## season(mortality.ts)23       0.04672    2.68681   0.017  0.98614    
## season(mortality.ts)24      -1.82012    2.66597  -0.683  0.49514    
## season(mortality.ts)25       3.54733    2.70211   1.313  0.18993    
## season(mortality.ts)26      -0.92599    2.69147  -0.344  0.73098    
## season(mortality.ts)27      -2.06689    2.64927  -0.780  0.43571    
## season(mortality.ts)28      -1.33496    2.66359  -0.501  0.61649    
## season(mortality.ts)29      -0.44441    2.62856  -0.169  0.86582    
## season(mortality.ts)30       1.15614    2.65581   0.435  0.66354    
## season(mortality.ts)31       0.30610    2.57182   0.119  0.90531    
## season(mortality.ts)32       1.12840    2.57865   0.438  0.66190    
## season(mortality.ts)33      -0.02853    2.60218  -0.011  0.99126    
## season(mortality.ts)34       1.37140    2.53038   0.542  0.58811    
## season(mortality.ts)35      -2.71435    2.46915  -1.099  0.27223    
## season(mortality.ts)36      -0.06559    2.45144  -0.027  0.97867    
## season(mortality.ts)37       0.67616    2.44533   0.277  0.78228    
## season(mortality.ts)38       2.95359    2.42636   1.217  0.22414    
## season(mortality.ts)39      -0.65477    2.44860  -0.267  0.78928    
## season(mortality.ts)40       2.07445    2.45307   0.846  0.39820    
## season(mortality.ts)41      -0.02218    2.48320  -0.009  0.99288    
## season(mortality.ts)42       0.96711    2.48206   0.390  0.69699    
## season(mortality.ts)43       3.42244    2.50518   1.366  0.17259    
## season(mortality.ts)44       2.18228    2.49106   0.876  0.38148    
## season(mortality.ts)45       4.21968    2.50565   1.684  0.09287 .  
## season(mortality.ts)46      11.68925    2.44530   4.780 2.39e-06 ***
## season(mortality.ts)47      11.24952    2.50001   4.500 8.70e-06 ***
## season(mortality.ts)48       0.63050    2.48647   0.254  0.79994    
## season(mortality.ts)49       1.50276    2.46933   0.609  0.54312    
## season(mortality.ts)50      -0.57774    2.42206  -0.239  0.81158    
## season(mortality.ts)51       1.19603    2.38244   0.502  0.61590    
## season(mortality.ts)52       4.25772    2.37529   1.793  0.07373 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.995 on 443 degrees of freedom
## Multiple R-squared:  0.7797, Adjusted R-squared:  0.7504 
## F-statistic: 26.58 on 59 and 443 DF,  p-value: < 2.2e-16

At the 5% level of significance, the intercept, temperature and particle size are insignificant. Most seasonal effects are insignificant; only seasonal terms 46 and 47 are significant. The first two lags of mortality and chem_1 are significant, while lags 3 to 5 are not. The overall model is significant (p-value < 2.2e-16). The adjusted R-squared of this model is 75.04%.

checkresiduals(dynlagmod_4)

## 
##  Breusch-Godfrey test for serial correlation of order up to 101
## 
## data:  Residuals
## LM test = 128.18, df = 101, p-value = 0.03518

The residual graphs for the above model are shown in Figure 33:

  • The time series plot shows that the errors are not randomly distributed.
  • The ACF plot has a large number of highly significant lags as well as a wave pattern at seasonal lags, indicating that autocorrelation and seasonality are still present in the residuals.
  • Since the p-value (0.03518) is less than 0.05, the Breusch-Godfrey test rejects the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals suggests a departure from normality.

The data frame below collects the AIC and BIC values of the dynamic lag models and is then sorted by AIC, lowest first. (Note: no MASE value is reported for the dynamic lag models.)

model_dlm_1 <- data.frame(Model=character(),MASE=numeric(),
                           BIC= numeric(),AICC=numeric(),AIC=numeric())


model_dlm_1 = rbind(model_dlm_1,cbind(Model="dynlagmod_1",
                                               AIC = AIC(dynlagmod_1),
                                              BIC = BIC(dynlagmod_1)
                                      ))

model_dlm_1 = rbind(model_dlm_1,cbind(Model="dynlagmod_2",
                                               BIC = BIC(dynlagmod_2),
                                               AIC = AIC(dynlagmod_2)
                                              ))
model_dlm_1 = rbind(model_dlm_1,cbind(Model="dynlagmod_3",
                                               AIC = AIC(dynlagmod_3),
                                      BIC = BIC(dynlagmod_3)
                                              ))
model_dlm_1 = rbind(model_dlm_1,cbind(Model="dynlagmod_4",
                                               AIC = AIC(dynlagmod_4),
                                      BIC = BIC(dynlagmod_4)
                                              ))
#model_dlm_1 = rbind(model_dlm_1,cbind(Model="dynlagmod_5",
                                               #AIC = AIC(dynlagmod_5),
                                     # BIC = BIC(dynlagmod_5)
                                        #      ))

model_dlm_1
##         Model              AIC              BIC
## 1 dynlagmod_1 3215.77841437438 3456.80354157908
## 2 dynlagmod_2 3117.23565359791 3362.37478041658
## 3 dynlagmod_3 3112.55591072037 3361.80485804762
## 4 dynlagmod_4 3103.62264122512  3361.0786416012
sortScore(model_dlm_1,score = "aic")
##         Model              AIC              BIC
## 4 dynlagmod_4 3103.62264122512  3361.0786416012
## 3 dynlagmod_3 3112.55591072037 3361.80485804762
## 2 dynlagmod_2 3117.23565359791 3362.37478041658
## 1 dynlagmod_1 3215.77841437438 3456.80354157908

Here, model 4 of the dynamic lag models has the lowest AIC and BIC values: AIC = 3103.623 and BIC = 3361.079.
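
The long digit strings in the table above arise because `cbind()` with a character column coerces every value to character. The sketch below shows one way to keep the criteria numeric and to fit every candidate on a common sample so that AIC and BIC are directly comparable. It uses simulated data and plain `lm()` fits as stand-ins, so the object names are assumptions rather than the report's models.

```r
# Illustrative only: simulated AR(2) series, not the mortality data
set.seed(7)
y <- as.numeric(arima.sim(list(ar = c(0.4, 0.3)), n = 120))
d <- as.data.frame(embed(y, 4))       # common sample for every candidate
names(d) <- c("y", "y1", "y2", "y3")

fits <- list(
  lag1 = lm(y ~ y1, data = d),
  lag2 = lm(y ~ y1 + y2, data = d),
  lag3 = lm(y ~ y1 + y2 + y3, data = d)
)

# data.frame() keeps AIC and BIC numeric; cbind() with a character
# column would coerce everything to character.
scores <- data.frame(Model = names(fits),
                     AIC = sapply(fits, AIC),
                     BIC = sapply(fits, BIC),
                     row.names = NULL)
scores[order(scores$AIC), ]
```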

Exponential smoothing methods

  • Exponential smoothing is the next forecasting approach we explore. We only evaluate models with either additive or multiplicative seasonality, since we have found a substantial seasonal component in the mortality series for which we wish to make predictions.
  • Given that there is no trend, the models that contain a seasonal component (additive or multiplicative) and may include the error term give the following possible models:
  • No Trend, Additive Seasonality.
  • No Trend, Additive Seasonality, Damped.
  • No Trend, Multiplicative Seasonality.
  • No Trend, Multiplicative Seasonality, Damped.
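As an illustration of the two undamped models in this list, the sketch below fits no-trend exponential smoothing with additive and with multiplicative seasonality. It is a sketch under assumptions, not the fit used in this report: it swaps in stats::HoltWinters() from base R, runs on a simulated weekly series (y) rather than mortality.ts, and sets beta = FALSE to drop the trend. HoltWinters() has no damping parameter, so the damped variants are not covered here.

```r
# Simulated weekly series with a yearly seasonal pattern (illustration only).
set.seed(42)
m <- 52                      # weeks per year
n <- 6 * m                   # six years of data
tt <- seq_len(n)
level <- 100 + cumsum(rnorm(n, sd = 0.5))          # slowly wandering level
y <- ts(level + 10 * sin(2 * pi * tt / m) + rnorm(n, sd = 3),
        frequency = m, start = c(2014, 1))

# beta = FALSE removes the trend component; gamma (seasonal) is estimated.
fit_add <- HoltWinters(y, beta = FALSE, seasonal = "additive")
fit_mul <- HoltWinters(y, beta = FALSE, seasonal = "multiplicative")

# Four-week-ahead point forecasts with 95% prediction intervals.
fc_add <- predict(fit_add, n.ahead = 4, prediction.interval = TRUE, level = 0.95)
fc_mul <- predict(fit_mul, n.ahead = 4, prediction.interval = TRUE, level = 0.95)

# In-sample sum of squared one-step errors for each seasonal form.
print(c(additive = fit_add$SSE, multiplicative = fit_mul$SSE))
print(fc_add)
```

Unlike a fit without seasonal states, these objects carry an estimated gamma and one seasonal coefficient per week, which can be inspected with fit_add$gamma and fit_add$coefficients.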

Fitting Residuals from Holt−Winters’ additive method

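# h = 4 * frequency(mortality.ts) requests 208 weekly steps (four years);
# use h = 4 for a four-week horizon.
fit_1 <- holt(mortality.ts, seasonal = "additive", h = 4 * frequency(mortality.ts))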
summary(fit_1)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = mortality.ts, h = 4 * frequency(mortality.ts), seasonal = "additive") 
## 
##   Smoothing parameters:
##     alpha = 0.5121 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 100.9771 
##     b = -0.0289 
## 
##   sigma:  5.906
## 
##      AIC     AICc      BIC 
## 4975.460 4975.579 4996.612 
## 
## Error measures:
##                       ME    RMSE      MAE        MPE     MAPE     MASE
## Training set -0.00452697 5.88274 4.590378 -0.3179476 5.155929 0.687417
##                     ACF1
## Training set -0.07633466
## 
## Forecasts:
##          Point Forecast    Lo 80     Hi 80       Lo 95     Hi 95
## 2019.769       84.61340 77.04450  92.18229 73.03777546  96.18902
## 2019.788       84.58425 76.08043  93.08806 71.57877856  97.58971
## 2019.808       84.55509 75.20910  93.90109 70.26163215  98.84856
## 2019.827       84.52594 74.40734  94.64455 69.05087673 100.00101
## 2019.846       84.49679 73.66026  95.33333 67.92374057 101.06985
## 2019.865       84.46764 72.95761  95.97767 66.86456948 102.07071
## 2019.885       84.43849 72.29201  96.58497 65.86205442 103.01493
## 2019.904       84.40934 71.65791  97.16077 64.90771006 103.91097
## 2019.923       84.38019 71.05102  97.70936 63.99497650 104.76540
## 2019.942       84.35104 70.46793  98.23415 63.11865817 105.58342
## 2019.962       84.32189 69.90591  98.73786 62.27455686 106.36922
## 2019.981       84.29274 69.36270  99.22277 61.45922266 107.12625
## 2020.000       84.26359 68.83642  99.69075 60.66977964 107.85739
## 2020.019       84.23443 68.32549 100.14338 59.90380047 108.56507
## 2020.038       84.20528 67.82854 100.58203 59.15921417 109.25135
## 2020.058       84.17613 67.34441 101.00785 58.43423691 109.91803
## 2020.077       84.14698 66.87209 101.42187 57.72731926 110.56664
## 2020.096       84.11783 66.41070 101.82496 57.03710516 111.19856
## 2020.115       84.08868 65.95944 102.21792 56.36239977 111.81496
## 2020.135       84.05953 65.51763 102.60143 55.70214386 112.41691
## 2020.154       84.03038 65.08465 102.97610 55.05539320 113.00536
## 2020.173       84.00123 64.65995 103.34250 54.42130180 113.58115
## 2020.192       83.97208 64.24303 103.70112 53.79910818 114.14504
## 2020.212       83.94292 63.83344 104.05241 53.18812403 114.69772
## 2020.231       83.91377 63.43077 104.39678 52.58772470 115.23982
## 2020.250       83.88462 63.03465 104.73459 51.99734127 115.77190
## 2020.269       83.85547 62.64474 105.06621 51.41645383 116.29449
## 2020.288       83.82632 62.26072 105.39192 50.84458579 116.80806
## 2020.308       83.79717 61.88232 105.71202 50.28129894 117.31304
## 2020.327       83.76802 61.50926 106.02677 49.72618932 117.80985
## 2020.346       83.73887 61.14131 106.33643 49.17888354 118.29885
## 2020.365       83.70972 60.77823 106.64120 48.63903568 118.78040
## 2020.385       83.68057 60.41982 106.94131 48.10632448 119.25481
## 2020.404       83.65141 60.06588 107.23695 47.58045104 119.72238
## 2020.423       83.62226 59.71623 107.52830 47.06113661 120.18339
## 2020.442       83.59311 59.37069 107.81553 46.54812080 120.63810
## 2020.462       83.56396 59.02912 108.09880 46.04115990 121.08676
## 2020.481       83.53481 58.69136 108.37827 45.54002546 121.52960
## 2020.500       83.50566 58.35726 108.65406 45.04450293 121.96682
## 2020.519       83.47651 58.02670 108.92631 44.55439054 122.39863
## 2020.538       83.44736 57.69956 109.19516 44.06949827 122.82522
## 2020.558       83.41821 57.37571 109.46070 43.58964689 123.24677
## 2020.577       83.38906 57.05505 109.72306 43.11466713 123.66344
## 2020.596       83.35990 56.73747 109.98234 42.64439893 124.07541
## 2020.615       83.33075 56.42287 110.23864 42.17869074 124.48282
## 2020.635       83.30160 56.11115 110.49205 41.71739892 124.88581
## 2020.654       83.27245 55.80224 110.74266 41.26038714 125.28452
## 2020.673       83.24330 55.49604 110.99056 40.80752593 125.67908
## 2020.692       83.21415 55.19247 111.23583 40.35869213 126.06961
## 2020.712       83.18500 54.89146 111.47854 39.91376853 126.45623
## 2020.731       83.15585 54.59294 111.71876 39.47264345 126.83905
## 2020.750       83.12670 54.29682 111.95657 39.03521039 127.21818
## 2020.769       83.09755 54.00306 112.19203 38.60136768 127.59372
## 2020.788       83.06840 53.71158 112.42521 38.17101820 127.96577
## 2020.808       83.03924 53.42232 112.65617 37.74406911 128.33442
## 2020.827       83.01009 53.13523 112.88496 37.32043155 128.69975
## 2020.846       82.98094 52.85025 113.11164 36.90002046 129.06186
## 2020.865       82.95179 52.56732 113.33626 36.48275430 129.42083
## 2020.885       82.92264 52.28640 113.55888 36.06855489 129.77673
## 2020.904       82.89349 52.00744 113.77954 35.65734720 130.12963
## 2020.923       82.86434 51.73038 113.99830 35.24905920 130.47962
## 2020.942       82.83519 51.45519 114.21519 34.84362166 130.82675
## 2020.962       82.80604 51.18182 114.43025 34.44096802 131.17110
## 2020.981       82.77689 50.91023 114.64355 34.04103424 131.51274
## 2021.000       82.74773 50.64037 114.85510 33.64375869 131.85171
## 2021.019       82.71858 50.37222 115.06495 33.24908199 132.18808
## 2021.038       82.68943 50.10572 115.27314 32.85694694 132.52192
## 2021.058       82.66028 49.84085 115.47971 32.46729835 132.85326
## 2021.077       82.63113 49.57758 115.68468 32.08008302 133.18218
## 2021.096       82.60198 49.31586 115.88810 31.69524956 133.50871
## 2021.115       82.57283 49.05566 116.08999 31.31274838 133.83291
## 2021.135       82.54368 48.79696 116.29039 30.93253153 134.15482
## 2021.154       82.51453 48.53973 116.48933 30.55455270 134.47450
## 2021.173       82.48538 48.28392 116.68683 30.17876708 134.79198
## 2021.192       82.45622 48.02953 116.88292 29.80513133 135.10732
## 2021.212       82.42707 47.77651 117.07764 29.43360350 135.42054
## 2021.231       82.39792 47.52484 117.27101 29.06414296 135.73170
## 2021.250       82.36877 47.27450 117.46305 28.69671038 136.04083
## 2021.269       82.33962 47.02546 117.65378 28.33126761 136.34797
## 2021.288       82.31047 46.77769 117.84325 27.96777770 136.65316
## 2021.308       82.28132 46.53118 118.03145 27.60620478 136.95643
## 2021.327       82.25217 46.28590 118.21843 27.24651409 137.25782
## 2021.346       82.22302 46.04183 118.40420 26.88867187 137.55736
## 2021.365       82.19387 45.79895 118.58878 26.53264535 137.85509
## 2021.385       82.16471 45.55723 118.77220 26.17840269 138.15103
## 2021.404       82.13556 45.31666 118.95447 25.82591299 138.44521
## 2021.423       82.10641 45.07722 119.13561 25.47514619 138.73768
## 2021.442       82.07726 44.83888 119.31564 25.12607307 139.02845
## 2021.462       82.04811 44.60163 119.49459 24.77866524 139.31756
## 2021.481       82.01896 44.36546 119.67246 24.43289505 139.60502
## 2021.500       81.98981 44.13033 119.84929 24.08873563 139.89088
## 2021.519       81.96066 43.89624 120.02507 23.74616079 140.17516
## 2021.538       81.93151 43.66318 120.19984 23.40514506 140.45787
## 2021.558       81.90236 43.43111 120.37360 23.06566362 140.73905
## 2021.577       81.87321 43.20003 120.54638 22.72769230 141.01872
## 2021.596       81.84405 42.96993 120.71818 22.39120754 141.29690
## 2021.615       81.81490 42.74078 120.88903 22.05618639 141.57362
## 2021.635       81.78575 42.51257 121.05893 21.72260646 141.84890
## 2021.654       81.75660 42.28529 121.22791 21.39044593 142.12276
## 2021.673       81.72745 42.05893 121.39597 21.05968351 142.39522
## 2021.692       81.69830 41.83347 121.56313 20.73029842 142.66630
## 2021.712       81.66915 41.60889 121.72941 20.40227039 142.93603
## 2021.731       81.64000 41.38519 121.89481 20.07557964 143.20441
## 2021.750       81.61085 41.16235 122.05934 19.75020686 143.47149
## 2021.769       81.58170 40.94036 122.22303 19.42613317 143.73726
## 2021.788       81.55254 40.71920 122.38588 19.10334015 144.00175
## 2021.808       81.52339 40.49888 122.54791 18.78180979 144.26498
## 2021.827       81.49424 40.27936 122.70912 18.46152451 144.52696
## 2021.846       81.46509 40.06065 122.86953 18.14246710 144.78772
## 2021.865       81.43594 39.84273 123.02915 17.82462077 145.04726
## 2021.885       81.40679 39.62560 123.18798 17.50796908 145.30561
## 2021.904       81.37764 39.40923 123.34605 17.19249594 145.56278
## 2021.923       81.34849 39.19362 123.50335 16.87818564 145.81879
## 2021.942       81.31934 38.97877 123.65991 16.56502279 146.07365
## 2021.962       81.29019 38.76465 123.81572 16.25299235 146.32738
## 2021.981       81.26103 38.55127 123.97080 15.94207957 146.57999
## 2022.000       81.23188 38.33860 124.12517 15.63227004 146.83150
## 2022.019       81.20273 38.12665 124.27882 15.32354964 147.08192
## 2022.038       81.17358 37.91540 124.43176 15.01590453 147.33126
## 2022.058       81.14443 37.70485 124.58401 14.70932118 147.57954
## 2022.077       81.11528 37.49498 124.73558 14.40378632 147.82677
## 2022.096       81.08613 37.28579 124.88647 14.09928694 148.07297
## 2022.115       81.05698 37.07726 125.03669 13.79581032 148.31815
## 2022.135       81.02783 36.86940 125.18625 13.49334398 148.56231
## 2022.154       80.99868 36.66219 125.33516 13.19187566 148.80548
## 2022.173       80.96952 36.45563 125.48342 12.89139338 149.04766
## 2022.192       80.94037 36.24970 125.63105 12.59188538 149.28886
## 2022.212       80.91122 36.04440 125.77804 12.29334011 149.52911
## 2022.231       80.88207 35.83972 125.92442 11.99574627 149.76840
## 2022.250       80.85292 35.63566 126.07018 11.69909276 150.00675
## 2022.269       80.82377 35.43221 126.21533 11.40336868 150.24417
## 2022.288       80.79462 35.22936 126.35988 11.10856336 150.48067
## 2022.308       80.76547 35.02710 126.50384 10.81466629 150.71627
## 2022.327       80.73632 34.82543 126.64721 10.52166720 150.95097
## 2022.346       80.70717 34.62433 126.79000 10.22955598 151.18478
## 2022.365       80.67802 34.42382 126.93221  9.93832270 151.41771
## 2022.385       80.64886 34.22387 127.07386  9.64795762 151.64977
## 2022.404       80.61971 34.02448 127.21495  9.35845119 151.88098
## 2022.423       80.59056 33.82565 127.35548  9.06979399 152.11133
## 2022.442       80.56141 33.62736 127.49546  8.78197682 152.34085
## 2022.462       80.53226 33.42962 127.63490  8.49499058 152.56953
## 2022.481       80.50311 33.23242 127.77380  8.20882639 152.79739
## 2022.500       80.47396 33.03575 127.91217  7.92347548 153.02444
## 2022.519       80.44481 32.83960 128.05001  7.63892924 153.25069
## 2022.538       80.41566 32.64398 128.18733  7.35517923 153.47613
## 2022.558       80.38651 32.44887 128.32414  7.07221713 153.70079
## 2022.577       80.35735 32.25427 128.46044  6.79003476 153.92467
## 2022.596       80.32820 32.06017 128.59623  6.50862408 154.14778
## 2022.615       80.29905 31.86658 128.73153  6.22797721 154.37013
## 2022.635       80.26990 31.67348 128.86632  5.94808636 154.59172
## 2022.654       80.24075 31.48087 129.00063  5.66894388 154.81256
## 2022.673       80.21160 31.28874 129.13446  5.39054227 155.03266
## 2022.692       80.18245 31.09709 129.26780  5.11287412 155.25202
## 2022.712       80.15330 30.90592 129.40068  4.83593216 155.47066
## 2022.731       80.12415 30.71522 129.53308  4.55970922 155.68858
## 2022.750       80.09500 30.52498 129.66501  4.28419826 155.90579
## 2022.769       80.06584 30.33520 129.79649  4.00939233 156.12230
## 2022.788       80.03669 30.14588 129.92750  3.73528462 156.33810
## 2022.808       80.00754 29.95702 130.05807  3.46186840 156.55322
## 2022.827       79.97839 29.76860 130.18819  3.18913706 156.76765
## 2022.846       79.94924 29.58062 130.31786  2.91708407 156.98140
## 2022.865       79.92009 29.39308 130.44710  2.64570303 157.19448
## 2022.885       79.89094 29.20598 130.57589  2.37498762 157.40689
## 2022.904       79.86179 29.01931 130.70426  2.10493163 157.61864
## 2022.923       79.83264 28.83307 130.83220  1.83552891 157.82974
## 2022.942       79.80349 28.64725 130.95972  1.56677344 158.04020
## 2022.962       79.77433 28.46185 131.08682  1.29865927 158.25001
## 2022.981       79.74518 28.27686 131.21350  1.03118054 158.45919
## 2023.000       79.71603 28.09229 131.33978  0.76433149 158.66773
## 2023.019       79.68688 27.90812 131.46564  0.49810642 158.87566
## 2023.038       79.65773 27.72436 131.59110  0.23249972 159.08296
## 2023.058       79.62858 27.54100 131.71616 -0.03249411 159.28965
## 2023.077       79.59943 27.35804 131.84082 -0.29688053 159.49574
## 2023.096       79.57028 27.17547 131.96509 -0.56066491 159.70122
## 2023.115       79.54113 26.99329 132.08896 -0.82385253 159.90611
## 2023.135       79.51198 26.81150 132.21245 -1.08644860 160.11040
## 2023.154       79.48283 26.63009 132.33556 -1.34845828 160.31411
## 2023.173       79.45367 26.44906 132.45829 -1.60988663 160.51723
## 2023.192       79.42452 26.26841 132.58064 -1.87073864 160.71978
## 2023.212       79.39537 26.08813 132.70261 -2.13101926 160.92176
## 2023.231       79.36622 25.90822 132.82422 -2.39073334 161.12318
## 2023.250       79.33707 25.72868 132.94546 -2.64988568 161.32403
## 2023.269       79.30792 25.54950 133.06633 -2.90848101 161.52432
## 2023.288       79.27877 25.37069 133.18685 -3.16652399 161.72406
## 2023.308       79.24962 25.19223 133.30700 -3.42401923 161.92325
## 2023.327       79.22047 25.01413 133.42680 -3.68097128 162.12190
## 2023.346       79.19132 24.83638 133.54625 -3.93738460 162.32002
## 2023.365       79.16216 24.65898 133.66535 -4.19326364 162.51759
## 2023.385       79.13301 24.48193 133.78410 -4.44861274 162.71464
## 2023.404       79.10386 24.30522 133.90251 -4.70343622 162.91116
## 2023.423       79.07471 24.12885 134.02058 -4.95773833 163.10716
## 2023.442       79.04556 23.95281 134.13831 -5.21152327 163.30264
## 2023.462       79.01641 23.77712 134.25570 -5.46479517 163.49761
## 2023.481       78.98726 23.60176 134.37276 -5.71755812 163.69208
## 2023.500       78.95811 23.42672 134.48949 -5.96981616 163.88603
## 2023.519       78.92896 23.25202 134.60590 -6.22157327 164.07949
## 2023.538       78.89981 23.07764 134.72197 -6.47283339 164.27244
## 2023.558       78.87065 22.90358 134.83773 -6.72360040 164.46491
## 2023.577       78.84150 22.72984 134.95317 -6.97387814 164.65689
## 2023.596       78.81235 22.55642 135.06828 -7.22367040 164.84838
## 2023.615       78.78320 22.38332 135.18309 -7.47298091 165.03938
## 2023.635       78.75405 22.21052 135.29758 -7.72181338 165.22991
## 2023.654       78.72490 22.03804 135.41176 -7.97017144 165.41997
## 2023.673       78.69575 21.86586 135.52563 -8.21805872 165.60956
## 2023.692       78.66660 21.69400 135.63920 -8.46547876 165.79867
## 2023.712       78.63745 21.52243 135.75246 -8.71243509 165.98733
## 2023.731       78.60830 21.35116 135.86543 -8.95893117 166.17552
## 2023.750       78.57914 21.18020 135.97809 -9.20497046 166.36326
checkresiduals(fit_1)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method
## Q* = 192.35, df = 98, p-value = 4.133e-08
## 
## Model df: 4.   Total lags used: 102
shapiro.test(residuals(fit_1$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(fit_1$model)
## W = 0.99539, p-value = 0.1384

The residual graphs for the above model are shown in Figure 34:

  • The MASE of this model is 0.687417.
  • The time series plot shows that the residuals are not randomly distributed, and their variance changes over time.
  • The ACF plot has a large number of highly significant lags as well as a wave pattern at seasonal lags, indicating that autocorrelation and seasonality are still present in the residuals. This is consistent with the summary output, which reports Holt's method with only alpha and beta and no seasonal states.
  • The Ljung-Box test gives Q* = 192.35 with p-value = 4.133e-08. Since the p-value is less than 0.05, we reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals shows no obvious departure from normality.
  • The Shapiro-Wilk test gives W = 0.99539 with p-value = 0.1384. Since the p-value is > 0.05, we do not have enough evidence to reject H0, so the normality assumption is not violated.
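The two hypothesis tests used in these checks can be reproduced with base R alone. The sketch below is illustrative only: the residuals (e) are simulated from an AR(1) process rather than taken from fit_1, and the lag of 20 and fitdf = 0 are assumptions chosen for the example.

```r
# Simulated "residuals": autocorrelated by construction (illustration only).
set.seed(7)
e <- as.numeric(arima.sim(model = list(ar = 0.6), n = 300))

# Ljung-Box: H0 = no autocorrelation up to the chosen lag.
# fitdf is the number of estimated model parameters to subtract from the
# degrees of freedom (0 here, since e is not taken from a fitted model).
lb <- Box.test(e, lag = 20, type = "Ljung-Box", fitdf = 0)

# Shapiro-Wilk: H0 = the residuals are normally distributed.
sw <- shapiro.test(e)

# Collect both results and the 5% decisions in one table.
diagnostics <- data.frame(
  test = c("Ljung-Box", "Shapiro-Wilk"),
  statistic = c(unname(lb$statistic), unname(sw$statistic)),
  p_value = c(lb$p.value, sw$p.value),
  reject_H0_at_5pct = c(lb$p.value < 0.05, sw$p.value < 0.05)
)
print(diagnostics)
```

A rejected Ljung-Box test means the model has left structure in the residuals, while a non-rejected Shapiro-Wilk test only says that normality cannot be ruled out.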

Fitting Residuals from Holt−Winters’ multiplicative method

fit_3 <- holt(mortality.ts,seasonal="multiplicative", h=4*frequency(mortality.ts))
summary(fit_3) 
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = mortality.ts, h = 4 * frequency(mortality.ts), seasonal = "multiplicative") 
## 
##   Smoothing parameters:
##     alpha = 0.5121 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 100.9771 
##     b = -0.0289 
## 
##   sigma:  5.906
## 
##      AIC     AICc      BIC 
## 4975.460 4975.579 4996.612 
## 
## Error measures:
##                       ME    RMSE      MAE        MPE     MAPE     MASE
## Training set -0.00452697 5.88274 4.590378 -0.3179476 5.155929 0.687417
##                     ACF1
## Training set -0.07633466
## 
## Forecasts:
##          Point Forecast    Lo 80     Hi 80       Lo 95     Hi 95
## 2019.769       84.61340 77.04450  92.18229 73.03777546  96.18902
## 2019.788       84.58425 76.08043  93.08806 71.57877856  97.58971
## 2019.808       84.55509 75.20910  93.90109 70.26163215  98.84856
## 2019.827       84.52594 74.40734  94.64455 69.05087673 100.00101
## 2019.846       84.49679 73.66026  95.33333 67.92374057 101.06985
## 2019.865       84.46764 72.95761  95.97767 66.86456948 102.07071
## 2019.885       84.43849 72.29201  96.58497 65.86205442 103.01493
## 2019.904       84.40934 71.65791  97.16077 64.90771006 103.91097
## 2019.923       84.38019 71.05102  97.70936 63.99497650 104.76540
## 2019.942       84.35104 70.46793  98.23415 63.11865817 105.58342
## 2019.962       84.32189 69.90591  98.73786 62.27455686 106.36922
## 2019.981       84.29274 69.36270  99.22277 61.45922266 107.12625
## 2020.000       84.26359 68.83642  99.69075 60.66977964 107.85739
## 2020.019       84.23443 68.32549 100.14338 59.90380047 108.56507
## 2020.038       84.20528 67.82854 100.58203 59.15921417 109.25135
## 2020.058       84.17613 67.34441 101.00785 58.43423691 109.91803
## 2020.077       84.14698 66.87209 101.42187 57.72731926 110.56664
## 2020.096       84.11783 66.41070 101.82496 57.03710516 111.19856
## 2020.115       84.08868 65.95944 102.21792 56.36239977 111.81496
## 2020.135       84.05953 65.51763 102.60143 55.70214386 112.41691
## 2020.154       84.03038 65.08465 102.97610 55.05539320 113.00536
## 2020.173       84.00123 64.65995 103.34250 54.42130180 113.58115
## 2020.192       83.97208 64.24303 103.70112 53.79910818 114.14504
## 2020.212       83.94292 63.83344 104.05241 53.18812403 114.69772
## 2020.231       83.91377 63.43077 104.39678 52.58772470 115.23982
## 2020.250       83.88462 63.03465 104.73459 51.99734127 115.77190
## 2020.269       83.85547 62.64474 105.06621 51.41645383 116.29449
## 2020.288       83.82632 62.26072 105.39192 50.84458579 116.80806
## 2020.308       83.79717 61.88232 105.71202 50.28129894 117.31304
## 2020.327       83.76802 61.50926 106.02677 49.72618932 117.80985
## 2020.346       83.73887 61.14131 106.33643 49.17888354 118.29885
## 2020.365       83.70972 60.77823 106.64120 48.63903568 118.78040
## 2020.385       83.68057 60.41982 106.94131 48.10632448 119.25481
## 2020.404       83.65141 60.06588 107.23695 47.58045104 119.72238
## 2020.423       83.62226 59.71623 107.52830 47.06113661 120.18339
## 2020.442       83.59311 59.37069 107.81553 46.54812080 120.63810
## 2020.462       83.56396 59.02912 108.09880 46.04115990 121.08676
## 2020.481       83.53481 58.69136 108.37827 45.54002546 121.52960
## 2020.500       83.50566 58.35726 108.65406 45.04450293 121.96682
## 2020.519       83.47651 58.02670 108.92631 44.55439054 122.39863
## 2020.538       83.44736 57.69956 109.19516 44.06949827 122.82522
## 2020.558       83.41821 57.37571 109.46070 43.58964689 123.24677
## 2020.577       83.38906 57.05505 109.72306 43.11466713 123.66344
## 2020.596       83.35990 56.73747 109.98234 42.64439893 124.07541
## 2020.615       83.33075 56.42287 110.23864 42.17869074 124.48282
## 2020.635       83.30160 56.11115 110.49205 41.71739892 124.88581
## 2020.654       83.27245 55.80224 110.74266 41.26038714 125.28452
## 2020.673       83.24330 55.49604 110.99056 40.80752593 125.67908
## 2020.692       83.21415 55.19247 111.23583 40.35869213 126.06961
## 2020.712       83.18500 54.89146 111.47854 39.91376853 126.45623
## 2020.731       83.15585 54.59294 111.71876 39.47264345 126.83905
## 2020.750       83.12670 54.29682 111.95657 39.03521039 127.21818
## 2020.769       83.09755 54.00306 112.19203 38.60136768 127.59372
## 2020.788       83.06840 53.71158 112.42521 38.17101820 127.96577
## 2020.808       83.03924 53.42232 112.65617 37.74406911 128.33442
## 2020.827       83.01009 53.13523 112.88496 37.32043155 128.69975
## 2020.846       82.98094 52.85025 113.11164 36.90002046 129.06186
## 2020.865       82.95179 52.56732 113.33626 36.48275430 129.42083
## 2020.885       82.92264 52.28640 113.55888 36.06855489 129.77673
## 2020.904       82.89349 52.00744 113.77954 35.65734720 130.12963
## 2020.923       82.86434 51.73038 113.99830 35.24905920 130.47962
## 2020.942       82.83519 51.45519 114.21519 34.84362166 130.82675
## 2020.962       82.80604 51.18182 114.43025 34.44096802 131.17110
## 2020.981       82.77689 50.91023 114.64355 34.04103424 131.51274
## 2021.000       82.74773 50.64037 114.85510 33.64375869 131.85171
## 2021.019       82.71858 50.37222 115.06495 33.24908199 132.18808
## 2021.038       82.68943 50.10572 115.27314 32.85694694 132.52192
## 2021.058       82.66028 49.84085 115.47971 32.46729835 132.85326
## 2021.077       82.63113 49.57758 115.68468 32.08008302 133.18218
## 2021.096       82.60198 49.31586 115.88810 31.69524956 133.50871
## 2021.115       82.57283 49.05566 116.08999 31.31274838 133.83291
## 2021.135       82.54368 48.79696 116.29039 30.93253153 134.15482
## 2021.154       82.51453 48.53973 116.48933 30.55455270 134.47450
## 2021.173       82.48538 48.28392 116.68683 30.17876708 134.79198
## 2021.192       82.45622 48.02953 116.88292 29.80513133 135.10732
## 2021.212       82.42707 47.77651 117.07764 29.43360350 135.42054
## 2021.231       82.39792 47.52484 117.27101 29.06414296 135.73170
## 2021.250       82.36877 47.27450 117.46305 28.69671038 136.04083
## 2021.269       82.33962 47.02546 117.65378 28.33126761 136.34797
## 2021.288       82.31047 46.77769 117.84325 27.96777770 136.65316
## 2021.308       82.28132 46.53118 118.03145 27.60620478 136.95643
## 2021.327       82.25217 46.28590 118.21843 27.24651409 137.25782
## 2021.346       82.22302 46.04183 118.40420 26.88867187 137.55736
## 2021.365       82.19387 45.79895 118.58878 26.53264535 137.85509
## 2021.385       82.16471 45.55723 118.77220 26.17840269 138.15103
## 2021.404       82.13556 45.31666 118.95447 25.82591299 138.44521
## 2021.423       82.10641 45.07722 119.13561 25.47514619 138.73768
## 2021.442       82.07726 44.83888 119.31564 25.12607307 139.02845
## 2021.462       82.04811 44.60163 119.49459 24.77866524 139.31756
## 2021.481       82.01896 44.36546 119.67246 24.43289505 139.60502
## 2021.500       81.98981 44.13033 119.84929 24.08873563 139.89088
## 2021.519       81.96066 43.89624 120.02507 23.74616079 140.17516
## 2021.538       81.93151 43.66318 120.19984 23.40514506 140.45787
## 2021.558       81.90236 43.43111 120.37360 23.06566362 140.73905
## 2021.577       81.87321 43.20003 120.54638 22.72769230 141.01872
## 2021.596       81.84405 42.96993 120.71818 22.39120754 141.29690
## 2021.615       81.81490 42.74078 120.88903 22.05618639 141.57362
## 2021.635       81.78575 42.51257 121.05893 21.72260646 141.84890
## 2021.654       81.75660 42.28529 121.22791 21.39044593 142.12276
## 2021.673       81.72745 42.05893 121.39597 21.05968351 142.39522
## 2021.692       81.69830 41.83347 121.56313 20.73029842 142.66630
## 2021.712       81.66915 41.60889 121.72941 20.40227039 142.93603
## 2021.731       81.64000 41.38519 121.89481 20.07557964 143.20441
## 2021.750       81.61085 41.16235 122.05934 19.75020686 143.47149
## 2021.769       81.58170 40.94036 122.22303 19.42613317 143.73726
## 2021.788       81.55254 40.71920 122.38588 19.10334015 144.00175
## 2021.808       81.52339 40.49888 122.54791 18.78180979 144.26498
## 2021.827       81.49424 40.27936 122.70912 18.46152451 144.52696
## 2021.846       81.46509 40.06065 122.86953 18.14246710 144.78772
## 2021.865       81.43594 39.84273 123.02915 17.82462077 145.04726
## 2021.885       81.40679 39.62560 123.18798 17.50796908 145.30561
## 2021.904       81.37764 39.40923 123.34605 17.19249594 145.56278
## 2021.923       81.34849 39.19362 123.50335 16.87818564 145.81879
## 2021.942       81.31934 38.97877 123.65991 16.56502279 146.07365
## 2021.962       81.29019 38.76465 123.81572 16.25299235 146.32738
## 2021.981       81.26103 38.55127 123.97080 15.94207957 146.57999
## 2022.000       81.23188 38.33860 124.12517 15.63227004 146.83150
## 2022.019       81.20273 38.12665 124.27882 15.32354964 147.08192
## 2022.038       81.17358 37.91540 124.43176 15.01590453 147.33126
## 2022.058       81.14443 37.70485 124.58401 14.70932118 147.57954
## 2022.077       81.11528 37.49498 124.73558 14.40378632 147.82677
## 2022.096       81.08613 37.28579 124.88647 14.09928694 148.07297
## 2022.115       81.05698 37.07726 125.03669 13.79581032 148.31815
## 2022.135       81.02783 36.86940 125.18625 13.49334398 148.56231
## 2022.154       80.99868 36.66219 125.33516 13.19187566 148.80548
## 2022.173       80.96952 36.45563 125.48342 12.89139338 149.04766
## 2022.192       80.94037 36.24970 125.63105 12.59188538 149.28886
## 2022.212       80.91122 36.04440 125.77804 12.29334011 149.52911
## 2022.231       80.88207 35.83972 125.92442 11.99574627 149.76840
## 2022.250       80.85292 35.63566 126.07018 11.69909276 150.00675
## 2022.269       80.82377 35.43221 126.21533 11.40336868 150.24417
## 2022.288       80.79462 35.22936 126.35988 11.10856336 150.48067
## 2022.308       80.76547 35.02710 126.50384 10.81466629 150.71627
## 2022.327       80.73632 34.82543 126.64721 10.52166720 150.95097
## 2022.346       80.70717 34.62433 126.79000 10.22955598 151.18478
## 2022.365       80.67802 34.42382 126.93221  9.93832270 151.41771
## 2022.385       80.64886 34.22387 127.07386  9.64795762 151.64977
## 2022.404       80.61971 34.02448 127.21495  9.35845119 151.88098
## 2022.423       80.59056 33.82565 127.35548  9.06979399 152.11133
## 2022.442       80.56141 33.62736 127.49546  8.78197682 152.34085
## 2022.462       80.53226 33.42962 127.63490  8.49499058 152.56953
## 2022.481       80.50311 33.23242 127.77380  8.20882639 152.79739
## 2022.500       80.47396 33.03575 127.91217  7.92347548 153.02444
## 2022.519       80.44481 32.83960 128.05001  7.63892924 153.25069
## 2022.538       80.41566 32.64398 128.18733  7.35517923 153.47613
## 2022.558       80.38651 32.44887 128.32414  7.07221713 153.70079
## 2022.577       80.35735 32.25427 128.46044  6.79003476 153.92467
## 2022.596       80.32820 32.06017 128.59623  6.50862408 154.14778
## 2022.615       80.29905 31.86658 128.73153  6.22797721 154.37013
## 2022.635       80.26990 31.67348 128.86632  5.94808636 154.59172
## 2022.654       80.24075 31.48087 129.00063  5.66894388 154.81256
## 2022.673       80.21160 31.28874 129.13446  5.39054227 155.03266
## 2022.692       80.18245 31.09709 129.26780  5.11287412 155.25202
## 2022.712       80.15330 30.90592 129.40068  4.83593216 155.47066
## 2022.731       80.12415 30.71522 129.53308  4.55970922 155.68858
## 2022.750       80.09500 30.52498 129.66501  4.28419826 155.90579
## 2022.769       80.06584 30.33520 129.79649  4.00939233 156.12230
## 2022.788       80.03669 30.14588 129.92750  3.73528462 156.33810
## 2022.808       80.00754 29.95702 130.05807  3.46186840 156.55322
## 2022.827       79.97839 29.76860 130.18819  3.18913706 156.76765
## 2022.846       79.94924 29.58062 130.31786  2.91708407 156.98140
## 2022.865       79.92009 29.39308 130.44710  2.64570303 157.19448
## 2022.885       79.89094 29.20598 130.57589  2.37498762 157.40689
## 2022.904       79.86179 29.01931 130.70426  2.10493163 157.61864
## 2022.923       79.83264 28.83307 130.83220  1.83552891 157.82974
## 2022.942       79.80349 28.64725 130.95972  1.56677344 158.04020
## 2022.962       79.77433 28.46185 131.08682  1.29865927 158.25001
## 2022.981       79.74518 28.27686 131.21350  1.03118054 158.45919
## 2023.000       79.71603 28.09229 131.33978  0.76433149 158.66773
## 2023.019       79.68688 27.90812 131.46564  0.49810642 158.87566
## 2023.038       79.65773 27.72436 131.59110  0.23249972 159.08296
## 2023.058       79.62858 27.54100 131.71616 -0.03249411 159.28965
## 2023.077       79.59943 27.35804 131.84082 -0.29688053 159.49574
## 2023.096       79.57028 27.17547 131.96509 -0.56066491 159.70122
## 2023.115       79.54113 26.99329 132.08896 -0.82385253 159.90611
## 2023.135       79.51198 26.81150 132.21245 -1.08644860 160.11040
## 2023.154       79.48283 26.63009 132.33556 -1.34845828 160.31411
## 2023.173       79.45367 26.44906 132.45829 -1.60988663 160.51723
## 2023.192       79.42452 26.26841 132.58064 -1.87073864 160.71978
## 2023.212       79.39537 26.08813 132.70261 -2.13101926 160.92176
## 2023.231       79.36622 25.90822 132.82422 -2.39073334 161.12318
## 2023.250       79.33707 25.72868 132.94546 -2.64988568 161.32403
## 2023.269       79.30792 25.54950 133.06633 -2.90848101 161.52432
## 2023.288       79.27877 25.37069 133.18685 -3.16652399 161.72406
## 2023.308       79.24962 25.19223 133.30700 -3.42401923 161.92325
## 2023.327       79.22047 25.01413 133.42680 -3.68097128 162.12190
## 2023.346       79.19132 24.83638 133.54625 -3.93738460 162.32002
## 2023.365       79.16216 24.65898 133.66535 -4.19326364 162.51759
## 2023.385       79.13301 24.48193 133.78410 -4.44861274 162.71464
## 2023.404       79.10386 24.30522 133.90251 -4.70343622 162.91116
## 2023.423       79.07471 24.12885 134.02058 -4.95773833 163.10716
## 2023.442       79.04556 23.95281 134.13831 -5.21152327 163.30264
## 2023.462       79.01641 23.77712 134.25570 -5.46479517 163.49761
## 2023.481       78.98726 23.60176 134.37276 -5.71755812 163.69208
## 2023.500       78.95811 23.42672 134.48949 -5.96981616 163.88603
## 2023.519       78.92896 23.25202 134.60590 -6.22157327 164.07949
## 2023.538       78.89981 23.07764 134.72197 -6.47283339 164.27244
## 2023.558       78.87065 22.90358 134.83773 -6.72360040 164.46491
## 2023.577       78.84150 22.72984 134.95317 -6.97387814 164.65689
## 2023.596       78.81235 22.55642 135.06828 -7.22367040 164.84838
## 2023.615       78.78320 22.38332 135.18309 -7.47298091 165.03938
## 2023.635       78.75405 22.21052 135.29758 -7.72181338 165.22991
## 2023.654       78.72490 22.03804 135.41176 -7.97017144 165.41997
## 2023.673       78.69575 21.86586 135.52563 -8.21805872 165.60956
## 2023.692       78.66660 21.69400 135.63920 -8.46547876 165.79867
## 2023.712       78.63745 21.52243 135.75246 -8.71243509 165.98733
## 2023.731       78.60830 21.35116 135.86543 -8.95893117 166.17552
## 2023.750       78.57914 21.18020 135.97809 -9.20497046 166.36326
checkresiduals(fit_3)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method
## Q* = 192.35, df = 98, p-value = 4.133e-08
## 
## Model df: 4.   Total lags used: 102
shapiro.test(residuals(fit_3$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(fit_3$model)
## W = 0.99539, p-value = 0.1384

The residual diagnostics for the above model are shown in Figure 35:

  • The MASE of this model is 0.687417.
  • The time series plot shows that the residuals are not randomly distributed, and their variance changes over time.
  • The ACF plot has many highly significant lags and a wave pattern at the seasonal lags, indicating that autocorrelation and seasonality are still present in the residuals.
  • The Ljung-Box p-value (4.133e-08) is less than 0.05, so we reject the null hypothesis of no autocorrelation at the 5% significance level: serial correlation remains in the residuals.
  • The histogram of the residuals shows no obvious departure from normality.
  • The Shapiro-Wilk p-value (0.1384) is greater than 0.05, so we do not have enough evidence to reject H0. The normality assumption for the errors is not violated.
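
The two tests interpreted above can also be run directly on a residual vector. The following is a minimal sketch using base R only; it simulates autocorrelated residuals (an assumption made for illustration, not the mortality residuals) and applies `Box.test()` and `shapiro.test()`. The helper name `diagnose_residuals` and the lag choice are hypothetical.

```r
# Minimal sketch (base R only): Ljung-Box and Shapiro-Wilk on a residual vector.
# The simulated series stands in for model residuals; it is NOT the mortality data.
set.seed(123)
sim_res <- as.numeric(arima.sim(model = list(ar = 0.7), n = 300))

# Hypothetical helper: returns both p-values for a residual vector.
# fitdf is the number of estimated model parameters; it is subtracted from the
# lag count to obtain the degrees of freedom of the Ljung-Box test.
diagnose_residuals <- function(res, lag = 20, fitdf = 0) {
  lb <- Box.test(res, lag = lag, type = "Ljung-Box", fitdf = fitdf)
  sw <- shapiro.test(res)
  c(ljung_box_p = unname(lb$p.value), shapiro_p = unname(sw$p.value))
}

p_values <- diagnose_residuals(sim_res, lag = 20, fitdf = 0)
print(p_values)
# A Ljung-Box p-value below 0.05 signals remaining serial correlation;
# a Shapiro-Wilk p-value above 0.05 gives no evidence against normality.
```

For the model above, the analogous call would presumably use lag = 102 and fitdf = 4, matching the "Total lags used" and "Model df" lines of the checkresiduals() output (102 - 4 = 98 degrees of freedom).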

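Fitting the damped Holt's method with seasonal = "additive"

Note: holt() is called with seasonal = "additive", but the summary below reports "Damped Holt's method" with level and slope states only, so no seasonal component is estimated. The horizon h = 4*frequency(mortality.ts) also produces forecasts through 2023.750, well beyond the four-week target.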

fit_4 <- holt(mortality.ts,seasonal="additive",damped = TRUE, h=4*frequency(mortality.ts)) 
summary(fit_4)
## 
## Forecast method: Damped Holt's method
## 
## Model Information:
## Damped Holt's method 
## 
## Call:
##  holt(y = mortality.ts, h = 4 * frequency(mortality.ts), damped = TRUE,  
## 
##  Call:
##      seasonal = "additive") 
## 
##   Smoothing parameters:
##     alpha = 0.5082 
##     beta  = 1e-04 
##     phi   = 0.9119 
## 
##   Initial states:
##     l = 102.0415 
##     b = -1.1632 
## 
##   sigma:  5.9071
## 
##      AIC     AICc      BIC 
## 4976.632 4976.800 5002.015 
## 
## Error measures:
##                       ME     RMSE      MAE        MPE     MAPE     MASE
## Training set -0.02068605 5.877951 4.588308 -0.3396637 5.154426 0.687107
##                    ACF1
## Training set -0.0738815
## 
## Forecasts:
##          Point Forecast    Lo 80     Hi 80       Lo 95     Hi 95
## 2019.769       84.64364 77.07339  92.21388 73.06594913  96.22133
## 2019.788       84.64444 76.15242  93.13646 71.65701552  97.63187
## 2019.808       84.64517 75.32181  93.96854 70.38632011  98.90403
## 2019.827       84.64584 74.55921  94.73248 69.21966520 100.07202
## 2019.846       84.64645 73.85019  95.44272 68.13498927 101.15791
## 2019.865       84.64701 73.18480  96.10921 67.11707758 102.17694
## 2019.885       84.64751 72.55585  96.73918 66.15491218 103.14012
## 2019.904       84.64798 71.95792  97.33803 65.24021065 104.05574
## 2019.923       84.64840 71.38682  97.90998 64.36656034 104.93023
## 2019.942       84.64878 70.83922  98.45834 63.52887591 105.76869
## 2019.962       84.64913 70.31243  98.98583 62.72304381 106.57522
## 2019.981       84.64945 69.80425  99.49465 61.94568057 107.35322
## 2020.000       84.64974 69.31283  99.98665 61.19396328 108.10552
## 2020.019       84.65001 68.83661 100.46340 60.46550752 108.83451
## 2020.038       84.65025 68.37426 100.92624 59.75827754 109.54222
## 2020.058       84.65047 67.92464 101.37630 59.07051869 110.23042
## 2020.077       84.65067 67.48674 101.81460 58.40070594 110.90064
## 2020.096       84.65086 67.05970 102.24201 57.74750381 111.55421
## 2020.115       84.65102 66.64274 102.65930 57.10973490 112.19231
## 2020.135       84.65118 66.23519 103.06716 56.48635489 112.81600
## 2020.154       84.65132 65.83643 103.46620 55.87643234 113.42620
## 2020.173       84.65144 65.44592 103.85696 55.27913228 114.02375
## 2020.192       84.65156 65.06317 104.23995 54.69370278 114.60941
## 2020.212       84.65166 64.68773 104.61560 54.11946379 115.18386
## 2020.231       84.65176 64.31920 104.98432 53.55579791 115.74772
## 2020.250       84.65185 63.95722 105.34648 53.00214251 116.30155
## 2020.269       84.65193 63.60144 105.70242 52.45798321 116.84587
## 2020.288       84.65200 63.25156 106.05244 51.92284827 117.38115
## 2020.308       84.65207 62.90729 106.39684 51.39630375 117.90783
## 2020.327       84.65213 62.56838 106.73588 50.87794942 118.42631
## 2020.346       84.65218 62.23458 107.06979 50.36741522 118.93695
## 2020.365       84.65223 61.90566 107.39880 49.86435814 119.44011
## 2020.385       84.65228 61.58143 107.72313 49.36845952 119.93610
## 2020.404       84.65232 61.26168 108.04296 48.87942273 120.42522
## 2020.423       84.65236 60.94624 108.35848 48.39697104 120.90775
## 2020.442       84.65239 60.63493 108.66986 47.92084586 121.38394
## 2020.462       84.65243 60.32759 108.97726 47.45080506 121.85405
## 2020.481       84.65246 60.02409 109.28082 46.98662157 122.31829
## 2020.500       84.65248 59.72428 109.58069 46.52808210 122.77688
## 2020.519       84.65251 59.42802 109.87699 46.07498598 123.23003
## 2020.538       84.65253 59.13520 110.16986 45.62714415 123.67791
## 2020.558       84.65255 58.84570 110.45940 45.18437824 124.12072
## 2020.577       84.65257 58.55941 110.74573 44.74651972 124.55861
## 2020.596       84.65258 58.27622 111.02895 44.31340921 124.99176
## 2020.615       84.65260 57.99603 111.30917 43.88489574 125.42030
## 2020.635       84.65261 57.71876 111.58647 43.46083617 125.84439
## 2020.654       84.65263 57.44431 111.86094 43.04109464 126.26416
## 2020.673       84.65264 57.17260 112.13268 42.62554204 126.67973
## 2020.692       84.65265 56.90354 112.40175 42.21405556 127.09124
## 2020.712       84.65266 56.63707 112.66824 41.80651828 127.49880
## 2020.731       84.65267 56.37311 112.93222 41.40281877 127.90251
## 2020.750       84.65267 56.11159 113.19376 41.00285072 128.30250
## 2020.769       84.65268 55.85244 113.45292 40.60651266 128.69885
## 2020.788       84.65269 55.59560 113.70977 40.21370761 129.09167
## 2020.808       84.65269 55.34101 113.96438 39.82434284 129.48104
## 2020.827       84.65270 55.08861 114.21679 39.43832960 129.86707
## 2020.846       84.65270 54.83835 114.46706 39.05558288 130.24983
## 2020.865       84.65271 54.59017 114.71525 38.67602120 130.62940
## 2020.885       84.65271 54.34402 114.96141 38.29956643 131.00586
## 2020.904       84.65272 54.09985 115.20558 37.92614354 131.37929
## 2020.923       84.65272 53.85762 115.44782 37.55568052 131.74976
## 2020.942       84.65272 53.61728 115.68817 37.18810811 132.11734
## 2020.962       84.65273 53.37878 115.92667 36.82335974 132.48209
## 2020.981       84.65273 53.14209 116.16336 36.46137135 132.84409
## 2021.000       84.65273 52.90717 116.39830 36.10208125 133.20338
## 2021.019       84.65273 52.67397 116.63150 35.74543002 133.56004
## 2021.038       84.65274 52.44245 116.86302 35.39136040 133.91411
## 2021.058       84.65274 52.21259 117.09288 35.03981715 134.26566
## 2021.077       84.65274 51.98435 117.32113 34.69074698 134.61473
## 2021.096       84.65274 51.75769 117.54779 34.34409844 134.96138
## 2021.115       84.65274 51.53258 117.77291 33.99982185 135.30566
## 2021.135       84.65274 51.30899 117.99650 33.65786919 135.64762
## 2021.154       84.65274 51.08689 118.21860 33.31819405 135.98729
## 2021.173       84.65275 50.86624 118.43925 32.98075153 136.32474
## 2021.192       84.65275 50.64703 118.65846 32.64549820 136.65999
## 2021.212       84.65275 50.42923 118.87627 32.31239200 136.99310
## 2021.231       84.65275 50.21280 119.09270 31.98139223 137.32410
## 2021.250       84.65275 49.99772 119.30778 31.65245942 137.65304
## 2021.269       84.65275 49.78397 119.52153 31.32555534 137.97994
## 2021.288       84.65275 49.57152 119.73398 31.00064293 138.30486
## 2021.308       84.65275 49.36035 119.94515 30.67768623 138.62781
## 2021.327       84.65275 49.15044 120.15506 30.35665034 138.94885
## 2021.346       84.65275 48.94176 120.36374 30.03750141 139.26800
## 2021.365       84.65275 48.73429 120.57121 29.72020656 139.58530
## 2021.385       84.65275 48.52801 120.77749 29.40473385 139.90077
## 2021.404       84.65275 48.32291 120.98260 29.09105225 140.21445
## 2021.423       84.65275 48.11896 121.18655 28.77913160 140.52637
## 2021.442       84.65275 47.91613 121.38937 28.46894258 140.83656
## 2021.462       84.65275 47.71443 121.59108 28.16045667 141.14505
## 2021.481       84.65275 47.51381 121.79169 27.85364612 141.45186
## 2021.500       84.65275 47.31428 121.99123 27.54848393 141.75702
## 2021.519       84.65275 47.11580 122.18970 27.24494382 142.06056
## 2021.538       84.65275 46.91837 122.38713 26.94300020 142.36251
## 2021.558       84.65275 46.72197 122.58354 26.64262815 142.66288
## 2021.577       84.65275 46.52658 122.77893 26.34380338 142.96171
## 2021.596       84.65275 46.33219 122.97332 26.04650222 143.25901
## 2021.615       84.65275 46.13877 123.16674 25.75070160 143.55481
## 2021.635       84.65276 45.94633 123.35918 25.45637904 143.84913
## 2021.654       84.65276 45.75483 123.55068 25.16351260 144.14200
## 2021.673       84.65276 45.56427 123.74124 24.87208088 144.43343
## 2021.692       84.65276 45.37464 123.93087 24.58206300 144.72345
## 2021.712       84.65276 45.18592 124.11959 24.29343858 145.01207
## 2021.731       84.65276 44.99810 124.30741 24.00618773 145.29932
## 2021.750       84.65276 44.81116 124.49435 23.72029102 145.58522
## 2021.769       84.65276 44.62509 124.68042 23.43572948 145.86978
## 2021.788       84.65276 44.43989 124.86562 23.15248458 146.15303
## 2021.808       84.65276 44.25554 125.04998 22.87053821 146.43497
## 2021.827       84.65276 44.07202 125.23349 22.58987268 146.71564
## 2021.846       84.65276 43.88933 125.41618 22.31047069 146.99504
## 2021.865       84.65276 43.70745 125.59806 22.03231532 147.27320
## 2021.885       84.65276 43.52638 125.77913 21.75539004 147.55012
## 2021.904       84.65276 43.34610 125.95941 21.47967867 147.82583
## 2021.923       84.65276 43.16661 126.13891 21.20516538 148.10035
## 2021.942       84.65276 42.98789 126.31763 20.93183469 148.37368
## 2021.962       84.65276 42.80993 126.49558 20.65967146 148.64584
## 2021.981       84.65276 42.63272 126.67279 20.38866084 148.91685
## 2022.000       84.65276 42.45626 126.84925 20.11878832 149.18672
## 2022.019       84.65276 42.28054 127.02497 19.85003967 149.45547
## 2022.038       84.65276 42.10554 127.19997 19.58240098 149.72311
## 2022.058       84.65276 41.93126 127.37426 19.31585860 149.98965
## 2022.077       84.65276 41.75768 127.54783 19.05039917 150.25511
## 2022.096       84.65276 41.58481 127.72071 18.78600960 150.51950
## 2022.115       84.65276 41.41262 127.89289 18.52267705 150.78284
## 2022.135       84.65276 41.24112 128.06439 18.26038895 151.04512
## 2022.154       84.65276 41.07029 128.23522 17.99913296 151.30638
## 2022.173       84.65276 40.90014 128.40538 17.73889701 151.56662
## 2022.192       84.65276 40.73064 128.57488 17.47966923 151.82584
## 2022.212       84.65276 40.56179 128.74373 17.22143799 152.08407
## 2022.231       84.65276 40.39358 128.91193 16.96419190 152.34132
## 2022.250       84.65276 40.22602 129.07950 16.70791975 152.59759
## 2022.269       84.65276 40.05908 129.24643 16.45261059 152.85290
## 2022.288       84.65276 39.89276 129.41275 16.19825361 153.10726
## 2022.308       84.65276 39.72706 129.57845 15.94483827 153.36067
## 2022.327       84.65276 39.56197 129.74354 15.69235416 153.61316
## 2022.346       84.65276 39.39748 129.90803 15.44079111 153.86472
## 2022.365       84.65276 39.23359 130.07192 15.19013910 154.11537
## 2022.385       84.65276 39.07029 130.23522 14.94038830 154.36512
## 2022.404       84.65276 38.90757 130.39794 14.69152907 154.61398
## 2022.423       84.65276 38.74543 130.56009 14.44355193 154.86196
## 2022.442       84.65276 38.58385 130.72166 14.19644756 155.10906
## 2022.462       84.65276 38.42284 130.88267 13.95020681 155.35531
## 2022.481       84.65276 38.26239 131.04312 13.70482069 155.60069
## 2022.500       84.65276 38.10250 131.20301 13.46028035 155.84523
## 2022.519       84.65276 37.94315 131.36236 13.21657713 156.08894
## 2022.538       84.65276 37.78434 131.52117 12.97370247 156.33181
## 2022.558       84.65276 37.62607 131.67944 12.73164798 156.57386
## 2022.577       84.65276 37.46833 131.83718 12.49040542 156.81511
## 2022.596       84.65276 37.31112 131.99440 12.24996665 157.05555
## 2022.615       84.65276 37.15442 132.15109 12.01032371 157.29519
## 2022.635       84.65276 36.99824 132.30727 11.77146874 157.53404
## 2022.654       84.65276 36.84258 132.46294 11.53339402 157.77212
## 2022.673       84.65276 36.68741 132.61810 11.29609196 158.00942
## 2022.692       84.65276 36.53275 132.77276 11.05955507 158.24596
## 2022.712       84.65276 36.37858 132.92693 10.82377601 158.48174
## 2022.731       84.65276 36.22490 133.08061 10.58874753 158.71676
## 2022.750       84.65276 36.07171 133.23380 10.35446252 158.95105
## 2022.769       84.65276 35.91900 133.38651 10.12091397 159.18460
## 2022.788       84.65276 35.76677 133.53874  9.88809496 159.41742
## 2022.808       84.65276 35.61501 133.69050  9.65599871 159.64951
## 2022.827       84.65276 35.46372 133.84179  9.42461852 159.88089
## 2022.846       84.65276 35.31289 133.99262  9.19394782 160.11156
## 2022.865       84.65276 35.16253 134.14299  8.96398011 160.34153
## 2022.885       84.65276 35.01261 134.29290  8.73470901 160.57080
## 2022.904       84.65276 34.86315 134.44236  8.50612822 160.79938
## 2022.923       84.65276 34.71414 134.59137  8.27823154 161.02728
## 2022.942       84.65276 34.56557 134.73994  8.05101287 161.25450
## 2022.962       84.65276 34.41744 134.88807  7.82446620 161.48105
## 2022.981       84.65276 34.26974 135.03577  7.59858559 161.70693
## 2023.000       84.65276 34.12248 135.18303  7.37336521 161.93215
## 2023.019       84.65276 33.97564 135.32987  7.14879929 162.15671
## 2023.038       84.65276 33.82923 135.47628  6.92488218 162.38063
## 2023.058       84.65276 33.68324 135.62227  6.70160827 162.60390
## 2023.077       84.65276 33.53767 135.76785  6.47897206 162.82654
## 2023.096       84.65276 33.39251 135.91301  6.25696810 163.04854
## 2023.115       84.65276 33.24775 136.05776  6.03559106 163.26992
## 2023.135       84.65276 33.10341 136.20210  5.81483564 163.49068
## 2023.154       84.65276 32.95947 136.34604  5.59469663 163.71082
## 2023.173       84.65276 32.81593 136.48958  5.37516891 163.93034
## 2023.192       84.65276 32.67278 136.63273  5.15624741 164.14927
## 2023.212       84.65276 32.53003 136.77548  4.93792713 164.36759
## 2023.231       84.65276 32.38767 136.91784  4.72020314 164.58531
## 2023.250       84.65276 32.24569 137.05982  4.50307060 164.80244
## 2023.269       84.65276 32.10410 137.20141  4.28652470 165.01899
## 2023.288       84.65276 31.96289 137.34262  4.07056071 165.23495
## 2023.308       84.65276 31.82206 137.48346  3.85517397 165.45034
## 2023.327       84.65276 31.68160 137.62391  3.64035988 165.66515
## 2023.346       84.65276 31.54151 137.76400  3.42611389 165.87940
## 2023.365       84.65276 31.40179 137.90372  3.21243151 166.09308
## 2023.385       84.65276 31.26244 138.04308  2.99930833 166.30620
## 2023.404       84.65276 31.12345 138.18207  2.78673997 166.51877
## 2023.423       84.65276 30.98481 138.32070  2.57472213 166.73079
## 2023.442       84.65276 30.84654 138.45897  2.36325055 166.94226
## 2023.462       84.65276 30.70862 138.59689  2.15232102 167.15319
## 2023.481       84.65276 30.57105 138.73446  1.94192941 167.36358
## 2023.500       84.65276 30.43383 138.87168  1.73207162 167.57344
## 2023.519       84.65276 30.29696 139.00855  1.52274359 167.78277
## 2023.538       84.65276 30.16043 139.14508  1.31394136 167.99157
## 2023.558       84.65276 30.02425 139.28127  1.10566096 168.19985
## 2023.577       84.65276 29.88840 139.41711  0.89789851 168.40761
## 2023.596       84.65276 29.75289 139.55263  0.69065016 168.61486
## 2023.615       84.65276 29.61771 139.68781  0.48391211 168.82160
## 2023.635       84.65276 29.48286 139.82265  0.27768062 169.02783
## 2023.654       84.65276 29.34834 139.95717  0.07195198 169.23356
## 2023.673       84.65276 29.21415 140.09136 -0.13327748 169.43879
## 2023.692       84.65276 29.08028 140.22523 -0.33801136 169.64352
## 2023.712       84.65276 28.94673 140.35878 -0.54225325 169.84777
## 2023.731       84.65276 28.81351 140.49201 -0.74600666 170.05152
## 2023.750       84.65276 28.68060 140.62492 -0.94927510 170.25479
checkresiduals(fit_4)

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt's method
## Q* = 189.16, df = 97, p-value = 6.572e-08
## 
## Model df: 5.   Total lags used: 102
shapiro.test(residuals(fit_4$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(fit_4$model)
## W = 0.99547, p-value = 0.1477

The residual diagnostics for the above model are shown in Figure 36:

  • The MASE of this model is 0.687107.
  • The time series plot shows that the residuals are not randomly distributed, and their variance changes over time.
  • The ACF plot has many highly significant lags and a wave pattern at the seasonal lags, indicating that autocorrelation and seasonality are still present in the residuals.
  • The Ljung-Box p-value (6.572e-08) is less than 0.05, so we reject the null hypothesis of no autocorrelation at the 5% significance level: serial correlation remains in the residuals.
  • The histogram of the residuals shows no obvious departure from normality.
  • The Shapiro-Wilk p-value (0.1477) is greater than 0.05, so we do not have enough evidence to reject H0. The normality assumption for the errors is not violated.
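
The nearly constant point forecasts in the table above follow from the damped-trend forecast function. As a sketch in the standard textbook parameterisation (the symbols \ell_T and b_T for the final level and slope are notation assumed here, not values printed in the output):

```latex
\hat{y}_{T+h|T}
  = \ell_T + \left(\phi + \phi^2 + \dots + \phi^h\right) b_T
  = \ell_T + \frac{\phi\,(1-\phi^h)}{1-\phi}\, b_T ,
\qquad
\lim_{h\to\infty} \hat{y}_{T+h|T} = \ell_T + \frac{\phi}{1-\phi}\, b_T
\quad (0 < \phi < 1).
```

With phi = 0.9119, the term phi^h shrinks towards zero as h grows, so the point forecasts level off (here at about 84.65) while the prediction intervals keep widening with the horizon.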

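Fitting the damped Holt's method with seasonal = "multiplicative"

Note: the summary below again reports "Damped Holt's method", and every estimate, error measure and forecast matches fit_4, so the seasonal argument has no effect on the fitted model.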

fit_5 <- holt(mortality.ts,seasonal="multiplicative", damped = TRUE ,h=4*frequency(mortality.ts))
summary(fit_5) 
## 
## Forecast method: Damped Holt's method
## 
## Model Information:
## Damped Holt's method 
## 
## Call:
##  holt(y = mortality.ts, h = 4 * frequency(mortality.ts), damped = TRUE,  
## 
##  Call:
##      seasonal = "multiplicative") 
## 
##   Smoothing parameters:
##     alpha = 0.5082 
##     beta  = 1e-04 
##     phi   = 0.9119 
## 
##   Initial states:
##     l = 102.0415 
##     b = -1.1632 
## 
##   sigma:  5.9071
## 
##      AIC     AICc      BIC 
## 4976.632 4976.800 5002.015 
## 
## Error measures:
##                       ME     RMSE      MAE        MPE     MAPE     MASE
## Training set -0.02068605 5.877951 4.588308 -0.3396637 5.154426 0.687107
##                    ACF1
## Training set -0.0738815
## 
## Forecasts:
##          Point Forecast    Lo 80     Hi 80       Lo 95     Hi 95
## 2019.769       84.64364 77.07339  92.21388 73.06594913  96.22133
## 2019.788       84.64444 76.15242  93.13646 71.65701552  97.63187
## 2019.808       84.64517 75.32181  93.96854 70.38632011  98.90403
## 2019.827       84.64584 74.55921  94.73248 69.21966520 100.07202
## 2019.846       84.64645 73.85019  95.44272 68.13498927 101.15791
## 2019.865       84.64701 73.18480  96.10921 67.11707758 102.17694
## 2019.885       84.64751 72.55585  96.73918 66.15491218 103.14012
## 2019.904       84.64798 71.95792  97.33803 65.24021065 104.05574
## 2019.923       84.64840 71.38682  97.90998 64.36656034 104.93023
## 2019.942       84.64878 70.83922  98.45834 63.52887591 105.76869
## 2019.962       84.64913 70.31243  98.98583 62.72304381 106.57522
## 2019.981       84.64945 69.80425  99.49465 61.94568057 107.35322
## 2020.000       84.64974 69.31283  99.98665 61.19396328 108.10552
## 2020.019       84.65001 68.83661 100.46340 60.46550752 108.83451
## 2020.038       84.65025 68.37426 100.92624 59.75827754 109.54222
## 2020.058       84.65047 67.92464 101.37630 59.07051869 110.23042
## 2020.077       84.65067 67.48674 101.81460 58.40070594 110.90064
## 2020.096       84.65086 67.05970 102.24201 57.74750381 111.55421
## 2020.115       84.65102 66.64274 102.65930 57.10973490 112.19231
## 2020.135       84.65118 66.23519 103.06716 56.48635489 112.81600
## 2020.154       84.65132 65.83643 103.46620 55.87643234 113.42620
## 2020.173       84.65144 65.44592 103.85696 55.27913228 114.02375
## 2020.192       84.65156 65.06317 104.23995 54.69370278 114.60941
## 2020.212       84.65166 64.68773 104.61560 54.11946379 115.18386
## 2020.231       84.65176 64.31920 104.98432 53.55579791 115.74772
## 2020.250       84.65185 63.95722 105.34648 53.00214251 116.30155
## 2020.269       84.65193 63.60144 105.70242 52.45798321 116.84587
## 2020.288       84.65200 63.25156 106.05244 51.92284827 117.38115
## 2020.308       84.65207 62.90729 106.39684 51.39630375 117.90783
## 2020.327       84.65213 62.56838 106.73588 50.87794942 118.42631
## 2020.346       84.65218 62.23458 107.06979 50.36741522 118.93695
## 2020.365       84.65223 61.90566 107.39880 49.86435814 119.44011
## 2020.385       84.65228 61.58143 107.72313 49.36845952 119.93610
## 2020.404       84.65232 61.26168 108.04296 48.87942273 120.42522
## 2020.423       84.65236 60.94624 108.35848 48.39697104 120.90775
## 2020.442       84.65239 60.63493 108.66986 47.92084586 121.38394
## 2020.462       84.65243 60.32759 108.97726 47.45080506 121.85405
## 2020.481       84.65246 60.02409 109.28082 46.98662157 122.31829
## 2020.500       84.65248 59.72428 109.58069 46.52808210 122.77688
## 2020.519       84.65251 59.42802 109.87699 46.07498598 123.23003
## 2020.538       84.65253 59.13520 110.16986 45.62714415 123.67791
## 2020.558       84.65255 58.84570 110.45940 45.18437824 124.12072
## 2020.577       84.65257 58.55941 110.74573 44.74651972 124.55861
## 2020.596       84.65258 58.27622 111.02895 44.31340921 124.99176
## 2020.615       84.65260 57.99603 111.30917 43.88489574 125.42030
## 2020.635       84.65261 57.71876 111.58647 43.46083617 125.84439
## 2020.654       84.65263 57.44431 111.86094 43.04109464 126.26416
## 2020.673       84.65264 57.17260 112.13268 42.62554204 126.67973
## 2020.692       84.65265 56.90354 112.40175 42.21405556 127.09124
## 2020.712       84.65266 56.63707 112.66824 41.80651828 127.49880
## 2020.731       84.65267 56.37311 112.93222 41.40281877 127.90251
## 2020.750       84.65267 56.11159 113.19376 41.00285072 128.30250
## 2020.769       84.65268 55.85244 113.45292 40.60651266 128.69885
## 2020.788       84.65269 55.59560 113.70977 40.21370761 129.09167
## 2020.808       84.65269 55.34101 113.96438 39.82434284 129.48104
## 2020.827       84.65270 55.08861 114.21679 39.43832960 129.86707
## 2020.846       84.65270 54.83835 114.46706 39.05558288 130.24983
## 2020.865       84.65271 54.59017 114.71525 38.67602120 130.62940
## 2020.885       84.65271 54.34402 114.96141 38.29956643 131.00586
## 2020.904       84.65272 54.09985 115.20558 37.92614354 131.37929
## 2020.923       84.65272 53.85762 115.44782 37.55568052 131.74976
## 2020.942       84.65272 53.61728 115.68817 37.18810811 132.11734
## 2020.962       84.65273 53.37878 115.92667 36.82335974 132.48209
## 2020.981       84.65273 53.14209 116.16336 36.46137135 132.84409
## 2021.000       84.65273 52.90717 116.39830 36.10208125 133.20338
## 2021.019       84.65273 52.67397 116.63150 35.74543002 133.56004
## 2021.038       84.65274 52.44245 116.86302 35.39136040 133.91411
## 2021.058       84.65274 52.21259 117.09288 35.03981715 134.26566
## 2021.077       84.65274 51.98435 117.32113 34.69074698 134.61473
## 2021.096       84.65274 51.75769 117.54779 34.34409844 134.96138
## 2021.115       84.65274 51.53258 117.77291 33.99982185 135.30566
## 2021.135       84.65274 51.30899 117.99650 33.65786919 135.64762
## 2021.154       84.65274 51.08689 118.21860 33.31819405 135.98729
## 2021.173       84.65275 50.86624 118.43925 32.98075153 136.32474
## 2021.192       84.65275 50.64703 118.65846 32.64549820 136.65999
## 2021.212       84.65275 50.42923 118.87627 32.31239200 136.99310
## 2021.231       84.65275 50.21280 119.09270 31.98139223 137.32410
## 2021.250       84.65275 49.99772 119.30778 31.65245942 137.65304
## 2021.269       84.65275 49.78397 119.52153 31.32555534 137.97994
## 2021.288       84.65275 49.57152 119.73398 31.00064293 138.30486
## 2021.308       84.65275 49.36035 119.94515 30.67768623 138.62781
## 2021.327       84.65275 49.15044 120.15506 30.35665034 138.94885
## 2021.346       84.65275 48.94176 120.36374 30.03750141 139.26800
## 2021.365       84.65275 48.73429 120.57121 29.72020656 139.58530
## 2021.385       84.65275 48.52801 120.77749 29.40473385 139.90077
## 2021.404       84.65275 48.32291 120.98260 29.09105225 140.21445
## 2021.423       84.65275 48.11896 121.18655 28.77913160 140.52637
## 2021.442       84.65275 47.91613 121.38937 28.46894258 140.83656
## 2021.462       84.65275 47.71443 121.59108 28.16045667 141.14505
## 2021.481       84.65275 47.51381 121.79169 27.85364612 141.45186
## 2021.500       84.65275 47.31428 121.99123 27.54848393 141.75702
## 2021.519       84.65275 47.11580 122.18970 27.24494382 142.06056
## 2021.538       84.65275 46.91837 122.38713 26.94300020 142.36251
## 2021.558       84.65275 46.72197 122.58354 26.64262815 142.66288
## 2021.577       84.65275 46.52658 122.77893 26.34380338 142.96171
## 2021.596       84.65275 46.33219 122.97332 26.04650222 143.25901
## 2021.615       84.65275 46.13877 123.16674 25.75070160 143.55481
## 2021.635       84.65276 45.94633 123.35918 25.45637904 143.84913
## 2021.654       84.65276 45.75483 123.55068 25.16351260 144.14200
## 2021.673       84.65276 45.56427 123.74124 24.87208088 144.43343
## 2021.692       84.65276 45.37464 123.93087 24.58206300 144.72345
## 2021.712       84.65276 45.18592 124.11959 24.29343858 145.01207
## 2021.731       84.65276 44.99810 124.30741 24.00618773 145.29932
## 2021.750       84.65276 44.81116 124.49435 23.72029102 145.58522
## 2021.769       84.65276 44.62509 124.68042 23.43572948 145.86978
## 2021.788       84.65276 44.43989 124.86562 23.15248458 146.15303
## 2021.808       84.65276 44.25554 125.04998 22.87053821 146.43497
## 2021.827       84.65276 44.07202 125.23349 22.58987268 146.71564
## 2021.846       84.65276 43.88933 125.41618 22.31047069 146.99504
## 2021.865       84.65276 43.70745 125.59806 22.03231532 147.27320
## 2021.885       84.65276 43.52638 125.77913 21.75539004 147.55012
## 2021.904       84.65276 43.34610 125.95941 21.47967867 147.82583
## 2021.923       84.65276 43.16661 126.13891 21.20516538 148.10035
## 2021.942       84.65276 42.98789 126.31763 20.93183469 148.37368
## 2021.962       84.65276 42.80993 126.49558 20.65967146 148.64584
## 2021.981       84.65276 42.63272 126.67279 20.38866084 148.91685
## 2022.000       84.65276 42.45626 126.84925 20.11878832 149.18672
## 2022.019       84.65276 42.28054 127.02497 19.85003967 149.45547
## 2022.038       84.65276 42.10554 127.19997 19.58240098 149.72311
## 2022.058       84.65276 41.93126 127.37426 19.31585860 149.98965
## 2022.077       84.65276 41.75768 127.54783 19.05039917 150.25511
## 2022.096       84.65276 41.58481 127.72071 18.78600960 150.51950
## 2022.115       84.65276 41.41262 127.89289 18.52267705 150.78284
## 2022.135       84.65276 41.24112 128.06439 18.26038895 151.04512
## 2022.154       84.65276 41.07029 128.23522 17.99913296 151.30638
## 2022.173       84.65276 40.90014 128.40538 17.73889701 151.56662
## 2022.192       84.65276 40.73064 128.57488 17.47966923 151.82584
## 2022.212       84.65276 40.56179 128.74373 17.22143799 152.08407
## 2022.231       84.65276 40.39358 128.91193 16.96419190 152.34132
## 2022.250       84.65276 40.22602 129.07950 16.70791975 152.59759
## 2022.269       84.65276 40.05908 129.24643 16.45261059 152.85290
## 2022.288       84.65276 39.89276 129.41275 16.19825361 153.10726
## 2022.308       84.65276 39.72706 129.57845 15.94483827 153.36067
## 2022.327       84.65276 39.56197 129.74354 15.69235416 153.61316
## 2022.346       84.65276 39.39748 129.90803 15.44079111 153.86472
## 2022.365       84.65276 39.23359 130.07192 15.19013910 154.11537
## 2022.385       84.65276 39.07029 130.23522 14.94038830 154.36512
## 2022.404       84.65276 38.90757 130.39794 14.69152907 154.61398
## 2022.423       84.65276 38.74543 130.56009 14.44355193 154.86196
## 2022.442       84.65276 38.58385 130.72166 14.19644756 155.10906
## 2022.462       84.65276 38.42284 130.88267 13.95020681 155.35531
## 2022.481       84.65276 38.26239 131.04312 13.70482069 155.60069
## 2022.500       84.65276 38.10250 131.20301 13.46028035 155.84523
## 2022.519       84.65276 37.94315 131.36236 13.21657713 156.08894
## 2022.538       84.65276 37.78434 131.52117 12.97370247 156.33181
## 2022.558       84.65276 37.62607 131.67944 12.73164798 156.57386
## 2022.577       84.65276 37.46833 131.83718 12.49040542 156.81511
## 2022.596       84.65276 37.31112 131.99440 12.24996665 157.05555
## 2022.615       84.65276 37.15442 132.15109 12.01032371 157.29519
## 2022.635       84.65276 36.99824 132.30727 11.77146874 157.53404
## 2022.654       84.65276 36.84258 132.46294 11.53339402 157.77212
## 2022.673       84.65276 36.68741 132.61810 11.29609196 158.00942
## 2022.692       84.65276 36.53275 132.77276 11.05955507 158.24596
## 2022.712       84.65276 36.37858 132.92693 10.82377601 158.48174
## 2022.731       84.65276 36.22490 133.08061 10.58874753 158.71676
## 2022.750       84.65276 36.07171 133.23380 10.35446252 158.95105
## 2022.769       84.65276 35.91900 133.38651 10.12091397 159.18460
## 2022.788       84.65276 35.76677 133.53874  9.88809496 159.41742
## 2022.808       84.65276 35.61501 133.69050  9.65599871 159.64951
## 2022.827       84.65276 35.46372 133.84179  9.42461852 159.88089
## 2022.846       84.65276 35.31289 133.99262  9.19394782 160.11156
## 2022.865       84.65276 35.16253 134.14299  8.96398011 160.34153
## 2022.885       84.65276 35.01261 134.29290  8.73470901 160.57080
## 2022.904       84.65276 34.86315 134.44236  8.50612822 160.79938
## 2022.923       84.65276 34.71414 134.59137  8.27823154 161.02728
## 2022.942       84.65276 34.56557 134.73994  8.05101287 161.25450
## 2022.962       84.65276 34.41744 134.88807  7.82446620 161.48105
## 2022.981       84.65276 34.26974 135.03577  7.59858559 161.70693
## 2023.000       84.65276 34.12248 135.18303  7.37336521 161.93215
## 2023.019       84.65276 33.97564 135.32987  7.14879929 162.15671
## 2023.038       84.65276 33.82923 135.47628  6.92488218 162.38063
## 2023.058       84.65276 33.68324 135.62227  6.70160827 162.60390
## 2023.077       84.65276 33.53767 135.76785  6.47897206 162.82654
## 2023.096       84.65276 33.39251 135.91301  6.25696810 163.04854
## 2023.115       84.65276 33.24775 136.05776  6.03559106 163.26992
## 2023.135       84.65276 33.10341 136.20210  5.81483564 163.49068
## 2023.154       84.65276 32.95947 136.34604  5.59469663 163.71082
## 2023.173       84.65276 32.81593 136.48958  5.37516891 163.93034
## 2023.192       84.65276 32.67278 136.63273  5.15624741 164.14927
## 2023.212       84.65276 32.53003 136.77548  4.93792713 164.36759
## 2023.231       84.65276 32.38767 136.91784  4.72020314 164.58531
## 2023.250       84.65276 32.24569 137.05982  4.50307060 164.80244
## 2023.269       84.65276 32.10410 137.20141  4.28652470 165.01899
## 2023.288       84.65276 31.96289 137.34262  4.07056071 165.23495
## 2023.308       84.65276 31.82206 137.48346  3.85517397 165.45034
## 2023.327       84.65276 31.68160 137.62391  3.64035988 165.66515
## 2023.346       84.65276 31.54151 137.76400  3.42611389 165.87940
## 2023.365       84.65276 31.40179 137.90372  3.21243151 166.09308
## 2023.385       84.65276 31.26244 138.04308  2.99930833 166.30620
## 2023.404       84.65276 31.12345 138.18207  2.78673997 166.51877
## 2023.423       84.65276 30.98481 138.32070  2.57472213 166.73079
## 2023.442       84.65276 30.84654 138.45897  2.36325055 166.94226
## 2023.462       84.65276 30.70862 138.59689  2.15232102 167.15319
## 2023.481       84.65276 30.57105 138.73446  1.94192941 167.36358
## 2023.500       84.65276 30.43383 138.87168  1.73207162 167.57344
## 2023.519       84.65276 30.29696 139.00855  1.52274359 167.78277
## 2023.538       84.65276 30.16043 139.14508  1.31394136 167.99157
## 2023.558       84.65276 30.02425 139.28127  1.10566096 168.19985
## 2023.577       84.65276 29.88840 139.41711  0.89789851 168.40761
## 2023.596       84.65276 29.75289 139.55263  0.69065016 168.61486
## 2023.615       84.65276 29.61771 139.68781  0.48391211 168.82160
## 2023.635       84.65276 29.48286 139.82265  0.27768062 169.02783
## 2023.654       84.65276 29.34834 139.95717  0.07195198 169.23356
## 2023.673       84.65276 29.21415 140.09136 -0.13327748 169.43879
## 2023.692       84.65276 29.08028 140.22523 -0.33801136 169.64352
## 2023.712       84.65276 28.94673 140.35878 -0.54225325 169.84777
## 2023.731       84.65276 28.81351 140.49201 -0.74600666 170.05152
## 2023.750       84.65276 28.68060 140.62492 -0.94927510 170.25479
checkresiduals(fit_5)

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt's method
## Q* = 189.16, df = 97, p-value = 6.572e-08
## 
## Model df: 5.   Total lags used: 102
shapiro.test(residuals(fit_5$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(fit_5$model)
## W = 0.99547, p-value = 0.1477

The residual graphs for the above model are shown in Figure 37:

  • MASE of this model is 0.687107.
  • The time series plot shows that the residuals are not randomly distributed, and changing variance is also visible.
  • The ACF plot has many highly significant lags and a wave pattern at seasonal lags, indicating that autocorrelation and seasonality remain in the residuals.
  • Since the Ljung-Box p-value (6.572e-08) is less than 0.05, we reject the null hypothesis of no autocorrelation: serial correlation remains in the residuals at the 5% significance level.
  • The histogram of the residuals shows no obvious departure from normality.
  • Since the Shapiro-Wilk p-value (0.1477) is greater than 0.05, we do not have enough evidence to reject H0, so the normality assumption for the errors is not violated.
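
The two tests used above can be run on any residual vector with base R. The sketch below is illustrative only: it uses a simulated AR(1) series (`sim_resid`, a hypothetical stand-in for `residuals(fit_5$model)`), and the lag of 10 and `fitdf = 0` are assumptions, not the settings chosen by `checkresiduals()`.

```r
# Illustrative only: simulated data stand in for the model residuals.
set.seed(123)
sim_resid <- as.numeric(arima.sim(model = list(ar = 0.6), n = 300))

# Ljung-Box test: H0 = no autocorrelation up to the chosen lag.
lb <- Box.test(sim_resid, lag = 10, type = "Ljung-Box", fitdf = 0)

# Shapiro-Wilk test: H0 = the values are normally distributed.
sw <- shapiro.test(sim_resid)

# Small helper that turns a p-value into the decision wording used in the text.
decide <- function(p, alpha = 0.05) {
  if (p < alpha) "reject H0" else "fail to reject H0"
}

cat("Ljung-Box:   ", decide(lb$p.value), "\n")
cat("Shapiro-Wilk:", decide(sw$p.value), "\n")
```

Reading both decisions through the same helper keeps the interpretation consistent: a small Ljung-Box p-value signals leftover autocorrelation, while a small Shapiro-Wilk p-value signals non-normal residuals.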

The data frame below collects the MASE, AIC and BIC values of the exponential smoothing models fitted above.

model_expo <- data.frame(Model=character() , MASE=numeric() ,
                           BIC= numeric() , AICC=numeric() , AIC=numeric())

model_expo = rbind(model_expo,cbind(Model="additive_seasonality",MASE= accuracy(fit_1)[6],
                                               AIC = fit_1$model$aic,
                                              BIC = fit_1$model$bic
                                      ))


model_expo = rbind(model_expo,cbind(Model="multiplicative_ses",MASE= accuracy(fit_3)[6],
                                               AIC = fit_3$model$aic,
                                              BIC = fit_3$model$bic
                                      ))

model_expo = rbind(model_expo,cbind(Model="additive_seasonality_damped",MASE= accuracy(fit_4)[6],
                                              AIC = fit_4$model$aic,
                                              BIC = fit_4$model$bic
                                      ))


model_expo = rbind(model_expo,cbind(Model="multiplicative_ses_damped",MASE= accuracy(fit_5)[6],
                                               AIC = fit_5$model$aic,
                                              BIC = fit_5$model$bic
                                      ))

model_expo
##                         Model              MASE              AIC
## 1        additive_seasonality 0.687417030630797 4975.45953283232
## 2          multiplicative_ses 0.687417030630797 4975.45953283232
## 3 additive_seasonality_damped 0.687106984627893 4976.63206754983
## 4   multiplicative_ses_damped 0.687106984627893 4976.63206754983
##                BIC
## 1 4996.61194007021
## 2 4996.61194007021
## 3  5002.0149562353
## 4  5002.0149562353
#sortScore(model_expo,score = "mase")
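
Because `cbind()` on a mix of text and numbers coerces everything to character, the table above prints MASE, AIC and BIC as long strings. A minimal sketch of an alternative, assuming the same four models and re-using the values printed above (rounded), keeps the score columns numeric so they can be sorted directly. The name `model_expo_num` is hypothetical.

```r
# Values copied (rounded) from the table printed above.
model_expo_num <- data.frame(
  Model = c("additive_seasonality", "multiplicative_ses",
            "additive_seasonality_damped", "multiplicative_ses_damped"),
  MASE  = c(0.687417, 0.687417, 0.687107, 0.687107),
  AIC   = c(4975.460, 4975.460, 4976.632, 4976.632),
  BIC   = c(4996.612, 4996.612, 5002.015, 5002.015),
  stringsAsFactors = FALSE
)

# Numeric columns can be ordered directly; ties keep their original order.
ranked <- model_expo_num[order(model_expo_num$MASE), ]
print(ranked)
```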

State-space models

Each exponential smoothing method has two corresponding state-space models, one with additive errors and one with multiplicative errors. We fit a selection of these state-space variants below (NOTE: some combinations are excluded because of stability problems).

Fitting a model with additive errors, additive trend, and no seasonality.
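
To make the additive/multiplicative distinction concrete, the sketch below hand-codes the innovations recursions for the two simplest cases, ETS(A,N,N) and ETS(M,N,N). It is a base-R illustration, not the `ets()` implementation; the function name `ses_state_space`, the toy series `y`, and the fixed `alpha` and `l0` are assumptions chosen for the example.

```r
# Level recursion for simple exponential smoothing in state-space form.
# error = "A": e_t = y_t - l_{t-1},             l_t = l_{t-1} + alpha * e_t
# error = "M": e_t = (y_t - l_{t-1}) / l_{t-1}, l_t = l_{t-1} * (1 + alpha * e_t)
ses_state_space <- function(y, alpha, l0, error = c("A", "M")) {
  error <- match.arg(error)
  n <- length(y)
  level <- numeric(n)
  fitted <- numeric(n)
  prev <- l0
  for (t in seq_len(n)) {
    fitted[t] <- prev                      # one-step-ahead forecast
    if (error == "A") {
      e <- y[t] - prev
      level[t] <- prev + alpha * e
    } else {
      e <- (y[t] - prev) / prev
      level[t] <- prev * (1 + alpha * e)
    }
    prev <- level[t]
  }
  list(level = level, fitted = fitted)
}

y <- c(98, 102, 95, 101, 99, 104)          # toy data, not the mortality series
fit_a <- ses_state_space(y, alpha = 0.5, l0 = 100, error = "A")
fit_m <- ses_state_space(y, alpha = 0.5, l0 = 100, error = "M")

# With identical alpha and l0 the point forecasts coincide.
all.equal(fit_a$fitted, fit_m$fitted)
```

For fixed parameters the two error types give the same point forecasts and differ only in the likelihood and the prediction intervals. Since `ets()` estimates the parameters separately under each error type, its fitted values can still differ slightly between the two.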

fit.mortality_AAN = ets(y= as.vector(mortality.ts), model = "AAN") 
summary(fit.mortality_AAN)
## ETS(A,A,N) 
## 
## Call:
##  ets(y = as.vector(mortality.ts), model = "AAN") 
## 
##   Smoothing parameters:
##     alpha = 0.5122 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 100.9765 
##     b = -0.029 
## 
##   sigma:  5.906
## 
##      AIC     AICc      BIC 
## 4975.460 4975.579 4996.612 
## 
## Training set error measures:
##                        ME    RMSE      MAE        MPE     MAPE      MASE
## Training set -0.004457548 5.88274 4.590352 -0.3178407 5.155916 0.8649647
##                     ACF1
## Training set -0.07651869
checkresiduals(fit.mortality_AAN)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,A,N)
## Q* = 27.685, df = 6, p-value = 0.0001077
## 
## Model df: 4.   Total lags used: 10

The residual graphs for the above model are shown in Figure 38:

  • MASE of this model is 0.8649647.
  • The time series plot shows that the residuals are not randomly distributed.
  • The ACF plot has many highly significant lags, indicating that autocorrelation and seasonality remain in the residuals.
  • Since the Ljung-Box p-value (0.0001077) is less than 0.05, we reject the null hypothesis of no autocorrelation: serial correlation remains in the residuals at the 5% significance level.
  • The histogram of the residuals reveals a departure from the normality assumption.

Fitting a model with multiplicative errors, no trend, and no seasonality.

fit11.mortality_MNN = ets(mortality.ts, model = "MNN") 
summary(fit11.mortality_MNN)
## ETS(M,N,N) 
## 
## Call:
##  ets(y = mortality.ts, model = "MNN") 
## 
##   Smoothing parameters:
##     alpha = 0.4843 
## 
##   Initial states:
##     l = 98.5582 
## 
##   sigma:  0.0656
## 
##      AIC     AICc      BIC 
## 4954.111 4954.159 4966.803 
## 
## Training set error measures:
##                       ME    RMSE      MAE        MPE     MAPE      MASE
## Training set -0.05730399 5.88508 4.593891 -0.3849281 5.159608 0.6879431
##                     ACF1
## Training set -0.04372931
checkresiduals(fit11.mortality_MNN)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(M,N,N)
## Q* = 207.78, df = 100, p-value = 1.514e-09
## 
## Model df: 2.   Total lags used: 102

The residual graphs for the above model are shown in Figure 39:

  • MASE of this model is 0.6879431.
  • The time series plot shows that the residuals are not randomly distributed.
  • The ACF plot has many highly significant lags, indicating that autocorrelation and seasonality remain in the residuals.
  • Since the Ljung-Box p-value (1.514e-09) is less than 0.05, we reject the null hypothesis of no autocorrelation: serial correlation remains in the residuals at the 5% significance level.
  • The histogram of the residuals reveals a departure from the normality assumption.

We also let ets() select a model automatically, to see which specification the software recommends.

auto_fit_t1 <- ets(mortality.ts)
summary(auto_fit_t1)
## ETS(M,N,N) 
## 
## Call:
##  ets(y = mortality.ts) 
## 
##   Smoothing parameters:
##     alpha = 0.4843 
## 
##   Initial states:
##     l = 98.5582 
## 
##   sigma:  0.0656
## 
##      AIC     AICc      BIC 
## 4954.111 4954.159 4966.803 
## 
## Training set error measures:
##                       ME    RMSE      MAE        MPE     MAPE      MASE
## Training set -0.05730399 5.88508 4.593891 -0.3849281 5.159608 0.6879431
##                     ACF1
## Training set -0.04372931

ETS(M,N,N) is the model that is automatically proposed. It is a model with multiplicative errors, no trend, and no seasonality.

checkresiduals(auto_fit_t1)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(M,N,N)
## Q* = 207.78, df = 100, p-value = 1.514e-09
## 
## Model df: 2.   Total lags used: 102

The residual graphs for the above model are shown in Figure 40:

  • MASE of this model is 0.6879431.
  • The time series plot shows that the residuals are not randomly distributed.
  • The ACF plot has many highly significant lags, indicating that autocorrelation and seasonality remain in the residuals.
  • Since the Ljung-Box p-value (1.514e-09) is less than 0.05, we reject the null hypothesis of no autocorrelation: serial correlation remains in the residuals at the 5% significance level.
  • The histogram of the residuals reveals a departure from the normality assumption.

The data frame below collects the MASE, AIC and BIC values of the state-space models fitted above.

model_SSM <- data.frame(Model=character() , MASE=numeric() ,
                           BIC= numeric() , AICC=numeric() , AIC=numeric())

model_SSM = rbind(model_SSM,cbind(Model="AAN", MASE= accuracy(fit.mortality_AAN)[6],
                                               AIC = fit.mortality_AAN$aic,
                                               BIC = fit.mortality_AAN$bic))

model_SSM = rbind(model_SSM,cbind(Model="MNN", MASE= accuracy(fit11.mortality_MNN)[6],
                                               AIC = fit11.mortality_MNN$aic,
                                               BIC = fit11.mortality_MNN$bic))

model_SSM = rbind(model_SSM,cbind(Model="Auto", MASE= accuracy(auto_fit_t1)[6],
                                               AIC = auto_fit_t1$aic,
                                               BIC = auto_fit_t1$bic))



model_SSM
##   Model              MASE              AIC              BIC
## 1   AAN 0.864964686781363 4975.45959962188 4996.61200685977
## 2   MNN  0.68794305418027 4954.11122387993 4966.80266822267
## 3  Auto  0.68794305418027 4954.11122387993 4966.80266822267

The data frame below combines all the models fitted so far, with their AIC, BIC and MASE values, sorted by ascending MASE, so the models with the lowest MASE appear at the top.

best_overall_model <- rbind(model_dlm,model_expo,model_SSM)

sortScore(best_overall_model,score = "mase")
##                                   Model              AIC              BIC
## 3           additive_seasonality_damped 4976.63206754983  5002.0149562353
## 4             multiplicative_ses_damped 4976.63206754983  5002.0149562353
## 1                  additive_seasonality 4975.45953283232 4996.61194007021
## 2                    multiplicative_ses 4975.45953283232 4996.61194007021
## 21                                  MNN 4954.11122387993 4966.80266822267
## 31                                 Auto 4954.11122387993 4966.80266822267
## ardldlm_55        autoregressive_dlm_55 3119.91146602251  3174.7791382338
## ardldlm_44        autoregressive_dlm_44 3122.78963943741 3169.23797838619
## ardldlm_34        autoregressive_dlm_34 3138.58428896875 3180.81005164946
## ardldlm_24        autoregressive_dlm_24 3136.76047461021 3174.76366102286
## 11                                  AAN 4975.45959962188 4996.61200685977
## Koyck_model                 Koyck Model 3416.80895937194  3433.7230033863
## finite_DLM                   Finite DLM 3530.22409150313 3584.96189250445
## poly_DLM                 Polynomial DLM 3525.61085058087 3555.08505112004
##                          MASE
## 3           0.687106984627893
## 4           0.687106984627893
## 1           0.687417030630797
## 2           0.687417030630797
## 21           0.68794305418027
## 31           0.68794305418027
## ardldlm_55   0.77032749420647
## ardldlm_44  0.775883452441026
## ardldlm_34  0.792369281742784
## ardldlm_24  0.792652667779921
## 11          0.864964686781363
## Koyck_model  1.03628192123151
## finite_DLM   1.15052278860105
## poly_DLM     1.16267251244066

We use MASE to compare all the approaches tried in the modeling step. The lowest MASE (0.6871) is shared by the two damped exponential smoothing fits (additive_seasonality_damped and multiplicative_ses_damped), whose scores are identical.
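
Since the ranking rests on MASE, it helps to see how it is computed: the mean absolute error divided by the in-sample mean absolute error of a naive forecast at lag `m`. The sketch below is a base-R illustration with a hypothetical helper `mase_manual` and toy numbers. It also shows that the scaling lag matters. The AAN model above was fitted to `as.vector(mortality.ts)`, which drops the seasonal frequency, and this is a likely reason its MASE (0.865) differs so much from the other smoothing models despite an almost identical MAE (about 4.59).

```r
# MASE = mean(|residual|) / mean(|y_t - y_{t-m}|), with m the naive lag.
mase_manual <- function(actual, fitted, m = 1) {
  scale <- mean(abs(diff(actual, lag = m)))
  mean(abs(actual - fitted)) / scale
}

actual <- c(10, 12, 11, 13, 12, 14, 13, 15)   # toy data
fitted <- c(10, 11, 12, 12, 13, 13, 14, 14)   # toy one-step fits

mase_lag1 <- mase_manual(actual, fitted, m = 1)  # non-seasonal scaling
mase_lag2 <- mase_manual(actual, fitted, m = 2)  # "seasonal" scaling, period 2
c(lag1 = mase_lag1, lag2 = mase_lag2)
```

The same fitted values produce two different MASE figures, so MASE values are only comparable between models when they share the same scaling lag.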


Task 2 - Time series analysis and modelling of the FFD (first flowering day) series, with the best 4-year-ahead forecasts.

The objective of this task is to predict the first flowering day (FFD) series from climate variables: rainfall (rain), temperature (temp), radiation level (rad), and relative humidity (RH). The data are annual and are used to investigate the impact of long-term climate on the FFD of 81 plant species from 1984 to 2014.

ffd_1 <- read.csv("/Users/zuaibshaikh/Desktop/SEM 4/Forecasting/Final Project/FFD  .csv")

ffd= ffd_1[,2:6] # keep the four climate predictors and the FFD response
head(ffd)
##   Temperature Rainfall Radiation RelHumidity FFD
## 1    9.371585 2.489344  14.87158    93.92650 217
## 2    9.656164 2.475890  14.68493    94.93589 186
## 3    9.273973 2.421370  14.51507    94.09507 233
## 4    9.219178 2.319726  14.67397    94.49699 222
## 5   10.202186 2.465301  14.74863    94.08142 214
## 6    9.441096 2.735890  14.78356    96.08685 237
class(ffd$Temperature)
## [1] "numeric"
class(ffd$Rainfall)
## [1] "numeric"
class(ffd$Radiation)
## [1] "numeric"
class(ffd$RelHumidity)
## [1] "numeric"
class(ffd$FFD)
## [1] "integer"
ffd$FFD=as.numeric(as.integer(ffd$FFD))
class(ffd$FFD)
## [1] "numeric"
ffd_temp.ts=ts(ffd$Temperature,start =1984, frequency = 1)
head(ffd_temp.ts)
## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1]  9.371585  9.656164  9.273973  9.219178 10.202186  9.441096
ffd_rainfall.ts <- ts(ffd$Rainfall,start = 1984,frequency = 1)
head(ffd_rainfall.ts)
## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1] 2.489344 2.475890 2.421370 2.319726 2.465301 2.735890
ffd_radiation.ts= ts(ffd$Radiation, start = 1984,frequency = 1)
head(ffd_radiation.ts)
## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1] 14.87158 14.68493 14.51507 14.67397 14.74863 14.78356
ffd_humidity.ts = ts(ffd$RelHumidity, start = 1984,frequency = 1)
head(ffd_humidity.ts)
## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1] 93.92650 94.93589 94.09507 94.49699 94.08142 96.08685
ffd_FFD.ts = ts(ffd$FFD, start = 1984,frequency = 1)
head(ffd_FFD.ts)
## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1] 217 186 233 222 214 237
ffd.ts= ts(ffd,start = 1984,frequency = 1)
head(ffd.ts)
## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
##      Temperature Rainfall Radiation RelHumidity FFD
## 1984    9.371585 2.489344  14.87158    93.92650 217
## 1985    9.656164 2.475890  14.68493    94.93589 186
## 1986    9.273973 2.421370  14.51507    94.09507 233
## 1987    9.219178 2.319726  14.67397    94.49699 222
## 1988   10.202186 2.465301  14.74863    94.08142 214
## 1989    9.441096 2.735890  14.78356    96.08685 237

Checking for non-stationarity in the dataset.

The aim here is to check whether each time series is stationary or non-stationary. We first inspect the ACF and PACF plots and then confirm the result with unit root tests: the augmented Dickey-Fuller (ADF) test and the Phillips-Perron (PP) test.

The Descriptive Analysis

A time series plot can be described in terms of five major patterns:

  • Trend.
  • Seasonality.
  • Changing Variation.
  • Behaviour.
  • Change Point.

We now plot each converted time series and examine how it behaves with respect to the five patterns listed above.

plot(ffd_temp.ts, xlab='Year', main = " Figure 1. Time series plot of annual FFD temperature series")

From the time series plot of the annual FFD temperature series in Figure 1, we can observe the following:

  1. Trend - The plot shows no clear trend.
  2. Seasonality - No seasonality is noticeable.
  3. Changing Variation - The size of the fluctuations stays roughly constant over time, so no change in variance is apparent.
  4. Behaviour - The series moves up and down around its mean, suggesting moving average behaviour.
  5. Change Point - Two interventions appear to occur, in 1988 and 2006.
plot(ffd_rainfall.ts, xlab='Year', main = " Figure 2. Time series plot of annual FFD rainfall series")

From the time series plot of the annual FFD rainfall series in Figure 2, we can observe the following:

  1. Trend - The plot shows no trend.
  2. Seasonality - No seasonality is noticeable.
  3. Changing Variation - The size of the fluctuations stays roughly constant over time, so no change in variance is apparent.
  4. Behaviour - The series moves up and down around its mean, suggesting moving average behaviour.
  5. Change Point - An intervention appears to take place in 1997.
plot(ffd_radiation.ts, xlab='Year', main = " Figure 3. Time series plot of annual FFD radiation series")

From the time series plot of the annual FFD radiation series in Figure 3, we can observe the following:

  1. Trend - The plot shows a slight upward trend.
  2. Seasonality - No seasonality is noticeable.
  3. Changing Variation - The size of the fluctuations stays roughly constant over time, so no change in variance is apparent.
  4. Behaviour - The series moves up and down around its mean, suggesting moving average behaviour.
  5. Change Point - An intervention appears to take place in 1992.
plot(ffd_humidity.ts, xlab='Year', main = " Figure 4. Time series plot of annual FFD humidity series")

From the time series plot of the annual FFD humidity series in Figure 4, we can observe the following:

  1. Trend - The plot shows no trend.
  2. Seasonality - No seasonality is noticeable.
  3. Changing Variation - The size of the fluctuations stays roughly constant over time, so no change in variance is apparent.
  4. Behaviour - The series moves up and down around its mean, suggesting moving average behaviour.
  5. Change Point - Three interventions appear to occur, in 1990, 2000, and 2010.
plot(ffd_FFD.ts, xlab='Year', main = " Figure 5. Time series plot of annual FFD series")

From the time series plot of the annual FFD series in Figure 5, we can observe the following:

  1. Trend - The plot shows no trend.
  2. Seasonality - No seasonality is noticeable.
  3. Changing Variation - The size of the fluctuations stays roughly constant over time, so no change in variance is apparent.
  4. Behaviour - The series moves up and down around its mean, suggesting moving average behaviour.
  5. Change Point - An intervention appears to occur in 1999.
  • To display the dependent FFD series alongside the explanatory series in the same figure, we standardize the data. The code below produces a time series plot for examining the relationship between the series.
ffd.scaled = scale(ffd.ts)
plot(ffd.scaled, plot.type="s",col = c("black", "red", "blue", "green","brown"), main = "Figure 6. Annual FFD data series")
legend("topleft",lty=1, text.width =5, col=c("black", "red", "blue", "green","brown"), c("Temperature", "Rainfall", "Radiation", "Humidity","FFD"))

Figure 6 shows all five time series plotted together after centering and scaling.
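
`scale()` centres each column on its mean and divides by its standard deviation, which is what puts series measured in different units on a common axis. A small sketch follows, using only the first six rows printed by `head(ffd)` above, so the numbers are illustrative rather than full-sample values; `ffd_head` is a hypothetical name.

```r
# First six rows of two columns, copied from the head(ffd) output above.
ffd_head <- data.frame(
  Temperature = c(9.371585, 9.656164, 9.273973, 9.219178, 10.202186, 9.441096),
  FFD         = c(217, 186, 233, 222, 214, 237)
)

ffd_head_scaled <- scale(ffd_head)   # (x - mean(x)) / sd(x), column by column

# Each standardised column has mean 0 and standard deviation 1.
round(colMeans(ffd_head_scaled), 10)
apply(ffd_head_scaled, 2, sd)

# The same result by hand for one column.
manual_ffd <- (ffd_head$FFD - mean(ffd_head$FFD)) / sd(ffd_head$FFD)
all.equal(as.numeric(ffd_head_scaled[, "FFD"]), manual_ffd)
```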

Analysis of stationarity in data set

  • We plot the ACF and PACF for every series and run the unit root tests on each.
acf(ffd_temp.ts, lag.max = 48, main="Figure 7. Sample ACF for annual FFD temperature series")

Pacf(ffd_temp.ts, lag.max = 48, main="Figure 8. Sample PACF for annual FFD temperature series")

adf.test(ffd_temp.ts)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  ffd_temp.ts
## Dickey-Fuller = -1.6816, Lag order = 3, p-value = 0.6953
## alternative hypothesis: stationary
adf.ffd_temp = ur.df(ffd_temp.ts, type = "none", lags = 1, selectlags = "AIC")
summary(adf.ffd_temp)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.74882 -0.24560 -0.06356  0.26178  0.91921 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## z.lag.1     0.003385   0.007510   0.451 0.655777    
## z.diff.lag -0.594821   0.154274  -3.856 0.000648 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3829 on 27 degrees of freedom
## Multiple R-squared:  0.3554, Adjusted R-squared:  0.3076 
## F-statistic: 7.443 on 2 and 27 DF,  p-value: 0.002663
## 
## 
## Value of test-statistic is: 0.4507 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.62 -1.95 -1.61
pp.ffd_temp = ur.pp(ffd_temp.ts, type = "Z-alpha", lags = "short")
summary(pp.ffd_temp)
## 
## ################################## 
## # Phillips-Perron Unit Root Test # 
## ################################## 
## 
## Test regression with intercept 
## 
## 
## Call:
## lm(formula = y ~ y.l1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.8370 -0.2851  0.0566  0.2213  0.7546 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   7.0629     1.8594   3.798  0.00072 ***
## y.l1          0.2587     0.1958   1.321  0.19726    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3826 on 28 degrees of freedom
## Multiple R-squared:  0.05865,    Adjusted R-squared:  0.02503 
## F-statistic: 1.745 on 1 and 28 DF,  p-value: 0.1973
## 
## 
## Value of test-statistic, type: Z-alpha  is: -23.5224 
## 
##          aux. Z statistics
## Z-tau-mu            3.8677

The ACF and PACF plots for the annual FFD temperature series are shown in Figures 7 and 8. They indicate the following:

  • The ACF plot of the FFD temperature series shows a wave-like pattern rather than a slow decay, indicating that there is no trend.
  • The ACF plot reveals no seasonality.
  • The PACF plot has no highly significant lags: all the partial autocorrelations lie within the significance bounds.
  • The augmented Dickey-Fuller test yields a p-value of 0.6953, which is greater than 0.05, so the null hypothesis of non-stationarity is not rejected. The ur.df() statistic (0.4507) also lies above the 5% critical value of -1.95, which leads to the same conclusion.
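
The `ur.df()` output above reports the regression `z.diff ~ z.lag.1 - 1 + z.diff.lag`. As a rough sketch of what that regression is, the code below builds the same three variables by hand for a simulated random walk and reads off the t value of `z.lag.1`, which is the statistic compared against the tau1 critical values. The series `x` and the helper name `df_regression` are assumptions for illustration; this is not the `urca` implementation.

```r
# Dickey-Fuller regression with one lagged difference and no deterministic term:
#   diff(x)_t = gamma * x_{t-1} + delta * diff(x)_{t-1} + error
df_regression <- function(x) {
  dx <- diff(x)
  n <- length(dx)
  z.diff     <- dx[2:n]          # diff(x)_t
  z.lag.1    <- x[2:n]           # x_{t-1}
  z.diff.lag <- dx[1:(n - 1)]    # diff(x)_{t-1}
  lm(z.diff ~ z.lag.1 - 1 + z.diff.lag)
}

set.seed(1)
x <- cumsum(rnorm(31))           # simulated random walk, 31 points like the annual data
fit <- df_regression(x)

tau <- coef(summary(fit))["z.lag.1", "t value"]
tau
df.residual(fit)                 # 27, matching the degrees of freedom reported above

# Reject the unit-root null at 5% only if tau is below -1.95 (tau1 critical value above).
tau < -1.95
```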
acf(ffd_rainfall.ts , lag.max = 48, main="Figure 9. Sample ACF for annual FFD rainfall series")

Pacf(ffd_rainfall.ts, lag.max = 48, main="Figure 10. Sample PACF for annual FFD rainfall series")

adf.test(ffd_rainfall.ts)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  ffd_rainfall.ts
## Dickey-Fuller = -2.3024, Lag order = 3, p-value = 0.4563
## alternative hypothesis: stationary
adf.ffd_rainfall = ur.df(ffd_rainfall.ts, type = "none", lags = 1, selectlags = "AIC")
summary(adf.ffd_rainfall)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.3753 -0.1936  0.1544  0.3340  0.8803 
## 
## Coefficients:
##            Estimate Std. Error t value Pr(>|t|)  
## z.lag.1    -0.01662    0.03684  -0.451   0.6555  
## z.diff.lag -0.36299    0.18035  -2.013   0.0542 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4736 on 27 degrees of freedom
## Multiple R-squared:  0.1428, Adjusted R-squared:  0.07929 
## F-statistic: 2.249 on 2 and 27 DF,  p-value: 0.1249
## 
## 
## Value of test-statistic is: -0.4511 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.62 -1.95 -1.61
pp.ffd_rainfall = ur.pp(ffd_rainfall.ts, type = "Z-alpha", lags = "short")
summary(pp.ffd_rainfall)
## 
## ################################## 
## # Phillips-Perron Unit Root Test # 
## ################################## 
## 
## Test regression with intercept 
## 
## 
## Call:
## lm(formula = y ~ y.l1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.0306 -0.1530  0.0554  0.2690  0.5381 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   2.0033     0.4491   4.461 0.000121 ***
## y.l1          0.1529     0.1868   0.818 0.420073    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3814 on 28 degrees of freedom
## Multiple R-squared:  0.02336,    Adjusted R-squared:  -0.01152 
## F-statistic: 0.6697 on 1 and 28 DF,  p-value: 0.4201
## 
## 
## Value of test-statistic, type: Z-alpha  is: -24.4646 
## 
##          aux. Z statistics
## Z-tau-mu            4.4335

The ACF and PACF plots for the annual FFD rainfall series are shown in Figures 9 and 10. They indicate the following:

  • The ACF plot of the annual FFD rainfall series indicates that there is no trend.
  • The ACF plot reveals no seasonality.
  • The PACF plot has no highly significant lags: all the partial autocorrelations lie within the significance bounds.
  • The augmented Dickey-Fuller test yields a p-value of 0.4563, which is greater than 0.05, so the null hypothesis of non-stationarity is not rejected. The ur.df() statistic (-0.4511) also lies above the 5% critical value of -1.95, which leads to the same conclusion.
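
The "significance line" referred to in these bullets is the approximate 95% bound of ±1.96/sqrt(n) drawn by `acf()`. A short sketch, assuming a simulated white-noise series of the same length as the annual data (31 observations; `wn` is a hypothetical name), shows the bound and how many lags `acf()` actually returns when `lag.max = 48` exceeds the series length.

```r
set.seed(42)
wn <- rnorm(31)                              # simulated white noise, n = 31

bound <- qnorm(0.975) / sqrt(length(wn))     # approximate 95% bound, about 0.352
round(bound, 3)

# acf() caps lag.max at n - 1, so only lags 0..30 are returned here.
a <- acf(wn, lag.max = 48, plot = FALSE)
length(a$lag)

# Lags (excluding lag 0) whose sample autocorrelation lies outside the bound.
which(abs(a$acf[-1]) > bound)
```

With only 31 observations the bound is wide, so moderate autocorrelations can go undetected. This is one reason the plots are read together with the unit root tests.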
acf(ffd_radiation.ts , lag.max = 48, main="Figure 11. Sample ACF for annual FFD radiation series")

Pacf(ffd_radiation.ts, lag.max = 48, main="Figure 12. Sample PACF for annual FFD radiation series")

adf.test(ffd_radiation.ts)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  ffd_radiation.ts
## Dickey-Fuller = -2.6949, Lag order = 3, p-value = 0.3052
## alternative hypothesis: stationary
adf.ffd_radiation = ur.df(ffd_radiation.ts, type = "none", lags = 1, selectlags = "AIC")
summary(adf.ffd_radiation)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.06546 -0.27928  0.06367  0.34134  0.64395 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## z.lag.1    -0.0007715  0.0054907  -0.141    0.889
## z.diff.lag -0.2325212  0.1870728  -1.243    0.225
## 
## Residual standard error: 0.4312 on 27 degrees of freedom
## Multiple R-squared:  0.05456,    Adjusted R-squared:  -0.01548 
## F-statistic: 0.779 on 2 and 27 DF,  p-value: 0.4689
## 
## 
## Value of test-statistic is: -0.1405 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.62 -1.95 -1.61
pp.ffd_radiation = ur.pp(ffd_radiation.ts, type = "Z-alpha", lags = "short")
summary(pp.ffd_radiation)
## 
## ################################## 
## # Phillips-Perron Unit Root Test # 
## ################################## 
## 
## Test regression with intercept 
## 
## 
## Call:
## lm(formula = y ~ y.l1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.10563 -0.09934  0.05929  0.19569  0.60439 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)   8.0639     2.4425   3.302  0.00263 **
## y.l1          0.4467     0.1673   2.670  0.01249 * 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3705 on 28 degrees of freedom
## Multiple R-squared:  0.2029, Adjusted R-squared:  0.1745 
## F-statistic: 7.128 on 1 and 28 DF,  p-value: 0.01249
## 
## 
## Value of test-statistic, type: Z-alpha  is: -15.7287 
## 
##          aux. Z statistics
## Z-tau-mu             3.244

The ACF and PACF plots for the annual FFD radiation series are shown in Figures 11 and 12. They indicate the following:

  • The ACF plot of the FFD radiation series shows a wave-like pattern rather than a slow decay, indicating that there is no trend.
  • The ACF plot reveals no seasonality.
  • The PACF plot starts at lag 1 rather than lag 0 and cuts off sharply after the fourth lag. Unlike the ACF plot, it shows no sinusoidal pattern.
  • The augmented Dickey-Fuller test yields a p-value of 0.3052, which is greater than 0.05, so the null hypothesis of non-stationarity is not rejected. The ur.df() statistic (-0.1405) also lies above the 5% critical value of -1.95, which leads to the same conclusion.
acf(ffd_humidity.ts , lag.max = 48, main="Figure 13. Sample ACF for annual FFD humidity series")

Pacf(ffd_humidity.ts, lag.max = 48, main="Figure 14. Sample PACF for annual FFD humidity series")

adf.test(ffd_humidity.ts)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  ffd_humidity.ts
## Dickey-Fuller = -2.7992, Lag order = 3, p-value = 0.2651
## alternative hypothesis: stationary
adf.ffd_humidity = ur.df(ffd_humidity.ts, type = "none", lags = 1, selectlags = "AIC")
summary(adf.ffd_humidity)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.7309 -0.7848  0.1133  0.6584  1.9218 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## z.lag.1    -0.0003151  0.0019849  -0.159    0.875
## z.diff.lag -0.2724937  0.1822047  -1.496    0.146
## 
## Residual standard error: 1.011 on 27 degrees of freedom
## Multiple R-squared:  0.07733,    Adjusted R-squared:  0.008987 
## F-statistic: 1.131 on 2 and 27 DF,  p-value: 0.3374
## 
## 
## Value of test-statistic is: -0.1587 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.62 -1.95 -1.61
pp.ffd_humidity = ur.pp(ffd_humidity.ts, type = "Z-alpha", lags = "short")
summary(pp.ffd_humidity)
## 
## ################################## 
## # Phillips-Perron Unit Root Test # 
## ################################## 
## 
## Test regression with intercept 
## 
## 
## Call:
## lm(formula = y ~ y.l1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.35103 -0.49219  0.02665  0.39864  1.65341 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  80.0768    17.5215   4.570 8.97e-05 ***
## y.l1          0.1532     0.1853   0.827    0.415    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7954 on 28 degrees of freedom
## Multiple R-squared:  0.02384,    Adjusted R-squared:  -0.01103 
## F-statistic: 0.6837 on 1 and 28 DF,  p-value: 0.4153
## 
## 
## Value of test-statistic, type: Z-alpha  is: -24.0677 
## 
##          aux. Z statistics
## Z-tau-mu            4.5352

The ACF and PACF plots for the annual FFD humidity series are shown in Figures 13 and 14. They indicate the following:

  • The ACF plot of the FFD humidity series shows a wave-like pattern rather than a slow decay, indicating that there is no trend.
  • The ACF plot reveals no seasonality.
  • In the PACF plot only one lag touches the significance bound.
  • The augmented Dickey-Fuller test yields a p-value of 0.2651, which is greater than 0.05, so the null hypothesis of non-stationarity is not rejected. The ur.df() statistic (-0.1587) also lies above the 5% critical value of -1.95, which leads to the same conclusion.
acf(ffd_FFD.ts, lag.max = 48, main="Figure 15. Sample ACF for FFD series")

Pacf(ffd_FFD.ts, lag.max = 48, main="Figure 16. Sample PACF for FFD series")

adf.test(ffd_FFD.ts)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  ffd_FFD.ts
## Dickey-Fuller = -2.3723, Lag order = 3, p-value = 0.4294
## alternative hypothesis: stationary
adf.ffd_FFD = ur.df(ffd_FFD.ts, type = "none", lags = 1, selectlags = "AIC")
summary(adf.ffd_FFD)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -57.274 -14.617   2.381  24.385  56.281 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## z.lag.1    -0.008629   0.025397  -0.340  0.73665   
## z.diff.lag -0.527429   0.160435  -3.288  0.00281 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 28.86 on 27 degrees of freedom
## Multiple R-squared:  0.2899, Adjusted R-squared:  0.2373 
## F-statistic: 5.512 on 2 and 27 DF,  p-value: 0.009831
## 
## 
## Value of test-statistic is: -0.3398 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.62 -1.95 -1.61
pp.ffd_FFD = ur.pp(ffd_FFD.ts, type = "Z-alpha", lags = "short")
summary(pp.ffd_FFD)
## 
## ################################## 
## # Phillips-Perron Unit Root Test # 
## ################################## 
## 
## Test regression with intercept 
## 
## 
## Call:
## lm(formula = y ~ y.l1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -55.416 -16.070   0.175  18.712  54.961 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 210.614583  40.412896   5.212 1.56e-05 ***
## y.l1         -0.006732   0.191168  -0.035    0.972    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 24.19 on 28 degrees of freedom
## Multiple R-squared:  4.429e-05,  Adjusted R-squared:  -0.03567 
## F-statistic: 0.00124 on 1 and 28 DF,  p-value: 0.9722
## 
## 
## Value of test-statistic, type: Z-alpha  is: -30.576 
## 
##          aux. Z statistics
## Z-tau-mu            5.2148

The ACF and PACF plots for the annual FFD series are shown in Figure 15 and Figure 16. Together with the unit root tests above, they tell us that:

  • The ACF plot shows a wave-like pattern, which suggests that there is no trend.
  • The ACF plot shows no sign of seasonality.
  • In the PACF plot, only one lag reaches the significance bound.
  • The augmented Dickey-Fuller test yields a p-value of 0.4294, which is greater than 0.05, so the null hypothesis of non-stationarity is not rejected. The ur.df statistic (-0.3398) lies above the 5% critical value (-1.95), which points the same way.
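
To make the `ur.df` output above easier to read, the following sketch rebuilds the "none" test regression (`z.diff ~ z.lag.1 - 1 + z.diff.lag`) in base R. The series `y` is simulated and only stands in for `ffd_FFD.ts`; the variable names mirror the printed formula, and -1.95 is the 5% tau1 critical value taken from the output. It is an illustration of the mechanics, not a replacement for `ur.df`.

```r
# Simulated annual series (an assumption; it stands in for ffd_FFD.ts)
set.seed(1)
y <- cumsum(rnorm(31))            # random walk, so a unit root is present by construction

z.diff     <- diff(y)             # first differences (30 values)
n          <- length(z.diff)
dz         <- z.diff[2:n]         # response: current difference
z.lag.1    <- y[2:n]              # level lagged one period, aligned with dz
z.diff.lag <- z.diff[1:(n - 1)]   # difference lagged one period

# Test regression without intercept or trend, as in type = "none"
adf_fit <- lm(dz ~ z.lag.1 - 1 + z.diff.lag)

# The test statistic is the t value of the lagged level
tau <- coef(summary(adf_fit))["z.lag.1", "t value"]

# Reject the unit root only if tau falls below the 5% critical value
reject_unit_root <- tau < -1.95
tau
reject_unit_root
```

With 31 observations and one lagged difference, the regression uses 29 rows and leaves 27 residual degrees of freedom, which matches the `ur.df` summary.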

Analysing the impact of the time series components on the given dataset.

  • The key components of a time series are seasonality, trend, and remainder.
  • It is important to break the time series down into these distinct components. Doing so helps us observe the individual effects, and how past behaviour shapes each component, and it lets us view and learn more about the components themselves.
  • Before decomposing the data, let us check the Box-Cox lambda values to evaluate whether a transformation is necessary, and then whether differencing is needed.
# Checking the lambda values for each series
ffd_temp_lamda = BoxCox.lambda(ffd_temp.ts, method = "loglik")
ffd_temp_lamda
## [1] 1.45
ffd_rainfall_lambda = BoxCox.lambda(ffd_rainfall.ts, method = "loglik")
ffd_rainfall_lambda
## [1] 2
ffd_radiation_lambda = BoxCox.lambda(ffd_radiation.ts, method = "loglik")
ffd_radiation_lambda
## [1] 2
ffd_humidity_lambda = BoxCox.lambda(ffd_humidity.ts, method = "loglik")
ffd_humidity_lambda
## [1] -1
ffd_FFD_lambda = BoxCox.lambda(ffd_FFD.ts, method = "loglik")
ffd_FFD_lambda
## [1] 1.2

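Rainfall and radiation both have lambda = 2, which corresponds to a square (Y^2) transformation, and temperature (lambda = 1.45) lies between no transformation and a square. FFD has lambda = 1.2, which is close to 1, so it needs little or no transformation. Humidity has lambda = -1, which corresponds to a reciprocal transformation rather than no transformation. The values 2 and -1 are also the default upper and lower search bounds of BoxCox.lambda, so these estimates sit on the boundary and should be read with caution. For completeness, we apply the transformation to all five series.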
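
As a reminder of what each lambda value means, here is a minimal hand-written version of the textbook Box-Cox formula for positive data. The function name `box_cox` and the toy vector are illustrative assumptions; the report itself uses `BoxCox()` from the forecast package.

```r
# Textbook Box-Cox transform for positive data:
#   (y^lambda - 1) / lambda   when lambda != 0
#   log(y)                    when lambda == 0
box_cox <- function(y, lambda) {
  stopifnot(all(y > 0))
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

y <- c(1, 2, 4)     # toy positive values, not the FFD data

box_cox(y, 2)       # quadratic scale: (y^2 - 1) / 2
box_cox(y, 1)       # y - 1: only a shift, the shape is unchanged
box_cox(y, 0)       # natural logarithm
box_cox(y, -1)      # 1 - 1/y: reciprocal scale
```

A lambda of 1 therefore leaves the shape of the series untouched, while lambda = 0 is the log transformation.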

  • Now let us apply the Box-Cox transformation to the temperature, rainfall, radiation, humidity and FFD series.
ffd_temp_lamda
## [1] 1.45
Bc.ffd_temp=BoxCox(ffd_temp.ts, lambda = ffd_temp_lamda)
plot(Bc.ffd_temp,ylab='Transformed temperature',xlab='Year',type='o', main="Figure 17. Box-Cox Transformed FFD temperature series")

ffd_rainfall_lambda
## [1] 2
Bc.ffd_rainfallp=BoxCox(ffd_rainfall.ts, lambda = ffd_rainfall_lambda)
plot(Bc.ffd_rainfallp,ylab='Transformed rainfall',xlab='Year',type='o', main="Figure 18. Box-Cox Transformed FFD rainfall series")

ffd_radiation_lambda
## [1] 2
Bc.ffd_radiation=BoxCox(ffd_radiation.ts, lambda = ffd_radiation_lambda)
plot(Bc.ffd_radiation,ylab='Transformed radiation',xlab='Year',type='o', main="Figure 19. Box-Cox Transformed FFD radiation series")

ffd_humidity_lambda
## [1] -1
Bc.ffd_humidity=BoxCox(ffd_humidity.ts, lambda = ffd_humidity_lambda)
plot(Bc.ffd_humidity,ylab='Transformed humidity',xlab='Year',type='o', main="Figure 20. Box-Cox Transformed FFD humidity series")

ffd_FFD_lambda
## [1] 1.2
Bc.ffd_FFD=BoxCox(ffd_FFD.ts, lambda = ffd_FFD_lambda)
plot(Bc.ffd_FFD,ylab='Transformed FFD',xlab='Year',type='o', main="Figure 21. Box-Cox Transformed FFD series")

The transformed series still appear non-stationary, so we next take the first difference of each original series.

# Temperature differencing

ffd_temp.diff = diff(ffd_temp.ts)
plot(ffd_temp.diff ,ylab='Differenced temperature',xlab='Year',main = "Figure 22. Time series plot showing the first difference of the temperature series")

# Rainfall differencing

ffd_rainfall.diff = diff(ffd_rainfall.ts)
plot(ffd_rainfall.diff ,ylab='Differenced rainfall',xlab='Year',main = "Figure 23. Time series plot showing the first difference of the rainfall series")

# Radiation differencing

ffd_radiation.diff = diff(ffd_radiation.ts)
plot(ffd_radiation.diff ,ylab='Differenced radiation',xlab='Year',main = "Figure 24. Time series plot showing the first difference of the radiation series")

# Humidity differencing

ffd_humidity.diff = diff(ffd_humidity.ts)
plot(ffd_humidity.diff ,ylab='Differenced humidity',xlab='Year',main = "Figure 25. Time series plot showing the first difference of the humidity series")

# FFD differencing

ffd_FFD.diff = diff(ffd_FFD.ts)
plot(ffd_FFD.diff ,ylab='Differenced FFD',xlab='Year',main = "Figure 26. Time series plot showing the first difference of the FFD series")

  • The differenced series look closer to stationary. To confirm this, we run an ADF test on each differenced series.
adf.test(ffd_temp.diff)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  ffd_temp.diff
## Dickey-Fuller = -2.9746, Lag order = 3, p-value = 0.1983
## alternative hypothesis: stationary
adf.test(ffd_rainfall.diff)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  ffd_rainfall.diff
## Dickey-Fuller = -3.7734, Lag order = 3, p-value = 0.03616
## alternative hypothesis: stationary
adf.test(ffd_radiation.diff)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  ffd_radiation.diff
## Dickey-Fuller = -2.6911, Lag order = 3, p-value = 0.3072
## alternative hypothesis: stationary
adf.test(ffd_humidity.diff)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  ffd_humidity.diff
## Dickey-Fuller = -3.1297, Lag order = 3, p-value = 0.1387
## alternative hypothesis: stationary
adf.test(ffd_FFD.diff)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  ffd_FFD.diff
## Dickey-Fuller = -4.2935, Lag order = 3, p-value = 0.01179
## alternative hypothesis: stationary

The differenced rainfall (p-value 0.03616) and FFD (p-value 0.01179) series are stationary at the 5% level, but the temperature, radiation and humidity series are still non-stationary (p-values 0.1983, 0.3072 and 0.1387). We therefore take the second difference of these three series.
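
As a quick reminder of what `diff(x, differences = 2)` does, the sketch below uses a short toy vector (not the report's data) to show that second differencing is just differencing applied twice, and that every round of differencing costs one observation. With short annual series such as these, that loss is worth keeping in mind.

```r
x <- c(3, 5, 4, 8, 7, 11)           # toy values, not taken from the report

d1 <- diff(x)                        # first difference: x[t] - x[t-1]
d2 <- diff(x, differences = 2)       # second difference

# Differencing twice is the same as differencing the first difference
identical(d2, diff(d1))

# Each round of differencing shortens the series by one observation
c(original = length(x), first = length(d1), second = length(d2))
```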

# Temperature second differencing
ffd_temp.diff_2 = diff(ffd_temp.ts,differences = 2)
plot(ffd_temp.diff_2 ,ylab='Second difference of temperature',xlab='Year',main = "Figure 27. Time series plot showing the second difference of the temperature series")

# Radiation second differencing

ffd_radiation.diff_2 = diff(ffd_radiation.ts,differences = 2)
plot(ffd_radiation.diff_2 ,ylab='Second difference of radiation',xlab='Year',main = "Figure 28. Time series plot showing the second difference of the radiation series")

# Humidity second differencing

ffd_humidity.diff_2 = diff(ffd_humidity.ts,differences = 2)
plot(ffd_humidity.diff_2 ,ylab='Second difference of humidity',xlab='Year',main = "Figure 29. Time series plot showing the second difference of the humidity series")

  • Let us validate the second differencing with an ADF test on each of the three series.
adf.test(ffd_temp.diff_2)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  ffd_temp.diff_2
## Dickey-Fuller = -3.7145, Lag order = 3, p-value = 0.04081
## alternative hypothesis: stationary
adf.test(ffd_radiation.diff_2)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  ffd_radiation.diff_2
## Dickey-Fuller = -3.0559, Lag order = 3, p-value = 0.1678
## alternative hypothesis: stationary
adf.test(ffd_humidity.diff_2)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  ffd_humidity.diff_2
## Dickey-Fuller = -3.9704, Lag order = 3, p-value = 0.02363
## alternative hypothesis: stationary
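  • The second-differenced temperature (p-value 0.04081) and humidity (p-value 0.02363) series are now stationary, but the radiation series is still non-stationary (p-value 0.1678). We therefore take the third difference of the radiation series.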
# Radiation Third differencing

ffd_radiation.diff_3 = diff(ffd_radiation.ts,differences = 3)
plot(ffd_radiation.diff_3 ,ylab='Third difference of radiation',xlab='Year',main = "Figure 30. Time series plot showing the third difference of the radiation series")

  • Let us validate the third differencing of the radiation series with an ADF test.
adf.test(ffd_radiation.diff_3)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  ffd_radiation.diff_3
## Dickey-Fuller = -4.1613, Lag order = 3, p-value = 0.01709
## alternative hypothesis: stationary
  • Since the p-value (0.01709) is less than 0.05, we infer that the third-differenced radiation series is stationary at the 5% level of significance.

The Correlation matrix

cor(ffd.ts)
##              Temperature   Rainfall   Radiation  RelHumidity         FFD
## Temperature  1.000000000  0.3933255 -0.24096625  0.009646021 -0.24793371
## Rainfall     0.393325545  1.0000000 -0.58131610  0.338461007  0.05069110
## Radiation   -0.240966245 -0.5813161  1.00000000 -0.055209652  0.04677758
## RelHumidity  0.009646021  0.3384610 -0.05520965  1.000000000 -0.12850244
## FFD         -0.247933708  0.0506911  0.04677758 -0.128502440  1.00000000

From the above correlation matrix, we can infer that temperature has a weak negative correlation of -0.24793371 with the annual FFD series, rainfall has a very weak positive correlation of 0.05069110, radiation has a very weak positive correlation of 0.04677758, and relative humidity has a weak negative correlation of -0.12850244.
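
To judge whether the largest of these correlations could be distinguished from zero, one can apply the usual t-test for a Pearson correlation. The sketch below does this by hand for temperature and FFD. The sample size `n = 31` is an assumption inferred from the ARDL output later in the report (which ends at t = 31); `cor.test()` on the actual series would be the direct route.

```r
r <- -0.24793371   # Temperature vs FFD, copied from the correlation matrix
n <- 31            # assumed number of annual observations

# t statistic for H0: true correlation is zero, with n - 2 degrees of freedom
t_stat  <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

t_stat
p_value
```

Under that assumption the p-value is well above 0.05, so even the strongest pairwise correlation with FFD is not significant at the 5% level.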

Because the annual FFD series is the dependent variable, it is modelled in turn against each of the other four variables.

Model Fitting - FFD vs Temperature

Fitting finite distributed lag models

To determine the model’s finite lag length, we build a loop that prints accuracy metrics (AIC, BIC and MASE) for models with varying lag lengths, and we then pick the lag length with the lowest values.

for ( i in 1:10){
  model11.1 = dlm(x = as.vector(ffd_temp.ts), y = as.vector(ffd_FFD.ts), q = i )
  cat("q = ", i, "AIC = ", AIC(model11.1$model), "BIC = ", BIC(model11.1$model), "MASE =", MASE(model11.1)$MASE, "\n")
}
## q =  1 AIC =  279.6699 BIC =  285.2747 MASE = 0.6644773 
## q =  2 AIC =  272.7512 BIC =  279.5877 MASE = 0.6788097 
## q =  3 AIC =  264.2921 BIC =  272.2853 MASE = 0.643964 
## q =  4 AIC =  258.0491 BIC =  267.12 MASE = 0.6406911 
## q =  5 AIC =  247.1156 BIC =  257.1804 MASE = 0.5959619 
## q =  6 AIC =  238.7187 BIC =  249.6886 MASE = 0.5794878 
## q =  7 AIC =  231.1518 BIC =  242.9323 MASE = 0.5525488 
## q =  8 AIC =  224.2942 BIC =  236.7846 MASE = 0.5377635 
## q =  9 AIC =  218.0096 BIC =  231.1021 MASE = 0.5596056 
## q =  10 AIC =  211.9345 BIC =  225.5133 MASE = 0.5755493

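According to the output, q = 8 gives the lowest MASE (0.5377635), with AIC = 224.2942 and BIC = 236.7846. AIC and BIC keep decreasing up to q = 10, but each extra lag also removes one observation from the estimation sample, so these criteria are not directly comparable across lag lengths. We therefore choose a lag length of q = 8 on the basis of MASE.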
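
The lag-length search above can be mimicked in base R to see what a finite DLM does internally. This is a sketch on simulated data: `x` and `y` only stand in for the temperature and FFD series, and the helper `fit_finite_dlm` is a hypothetical name, not part of dLagM. It also prints the number of observations used, which shrinks by one for each extra lag.

```r
set.seed(123)
n <- 31
x <- rnorm(n, mean = 15, sd = 1)     # simulated predictor (stand-in for temperature)
y <- 300 + rnorm(n, sd = 25)         # simulated response (stand-in for FFD)

# Regress y_t on x_t, x_{t-1}, ..., x_{t-q}
fit_finite_dlm <- function(x, y, q) {
  lagged <- embed(x, q + 1)          # columns: x_t, x_{t-1}, ..., x_{t-q}
  colnames(lagged) <- c("x.t", paste0("x.", seq_len(q)))
  dat <- data.frame(y = y[(q + 1):length(y)], lagged)
  lm(y ~ ., data = dat)
}

for (q in 1:10) {
  fit <- fit_finite_dlm(x, y, q)
  cat("q =", q, "n used =", nobs(fit),
      "AIC =", AIC(fit), "BIC =", BIC(fit), "\n")
}
```

Because the log-likelihood is a sum over observations, a model fitted to fewer rows tends to report a smaller AIC and BIC regardless of fit quality, which is why a scale-free measure such as MASE is the safer basis for comparing lag lengths here.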

  • Fitting a finite DLM with a lag of 8 and carrying out the diagnostic checking for Temperature with respect to the dependent variable FFD
finite_DLM_1 <- dlm(x = as.vector(ffd_temp.ts), y = as.vector(ffd_FFD.ts), q = 8)
summary(finite_DLM_1)
## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -37.275 -15.864  -1.115  17.803  35.433 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -234.5776   436.6363  -0.537    0.600
## x.t          -16.2546    16.9552  -0.959    0.355
## x.1           13.2675    19.2454   0.689    0.503
## x.2            6.9760    20.7470   0.336    0.742
## x.3            4.2206    21.1738   0.199    0.845
## x.4            0.3059    19.9592   0.015    0.988
## x.5           33.1797    19.4750   1.704    0.112
## x.6            9.0036    18.8281   0.478    0.640
## x.7          -17.7065    17.7672  -0.997    0.337
## x.8           13.9289    18.7092   0.744    0.470
## 
## Residual standard error: 26.15 on 13 degrees of freedom
## Multiple R-squared:  0.3739, Adjusted R-squared:  -0.05959 
## F-statistic: 0.8625 on 9 and 13 DF,  p-value: 0.578
## 
## AIC and BIC values for the model:
##        AIC      BIC
## 1 224.2942 236.7846

In the above finite distributed lag model with q=8, none of the lag weights of the predictor series is statistically significant at the 5% level. The adjusted R-squared is -0.05959; a negative value means that, after penalising for the number of predictors, the model explains essentially none of the variability in FFD. The overall F-test has a p-value of 0.578, which is greater than 0.05, so the model as a whole is not statistically significant.
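
The negative adjusted R-squared follows directly from the standard formula, as the short check below shows. The inputs are read off the summary above: multiple R-squared 0.3739, nine lag terms, and 13 residual degrees of freedom, which implies 23 observations. The function name `adj_r2` is just an illustrative helper.

```r
# Adjusted R-squared from R-squared, sample size n and number of predictors p
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Finite DLM with q = 8: p = 9 lag terms, n = 13 + 9 + 1 = 23 observations
adj_r2(r2 = 0.3739, n = 23, p = 9)
# close to the -0.05959 reported by summary(); the small gap comes from
# using the rounded R-squared
```

With nine predictors and only 23 observations, the penalty factor (n - 1) / (n - p - 1) = 22 / 13 is large enough to push a modest R-squared below zero.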

checkresiduals(finite_DLM_1$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 13
## 
## data:  Residuals
## LM test = 23, df = 13, p-value = 0.04168
shapiro.test(residuals(finite_DLM_1$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(finite_DLM_1$model)
## W = 0.96633, p-value = 0.602

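The residual graphs for the above model are shown in Figure 31:

  • The time series plot of the residuals shows a random pattern with no evident trend.
  • According to the ACF plot, the residuals do not have any significant serial correlation.
  • The Breusch-Godfrey p-value (0.04168) is less than 0.05, so the test points to serial correlation at the 5% level of significance. However, the LM statistic equals the number of observations (23), which is what one would expect when the auxiliary regression (10 model terms plus 13 lagged residuals) fits perfectly, so this result is unreliable and the ACF plot is the better guide here.
  • The histogram of the residuals shows no clear violation of the normality assumption.
  • The Shapiro-Wilk p-value (0.602) is greater than 0.05, so we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.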
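
The Breusch-Godfrey statistic reported by `checkresiduals()` can be understood through a small base-R sketch of the LM idea: regress the residuals on the original regressors plus their own lags, and use n times the R-squared of that auxiliary regression. The helper `bg_lm` and the simulated data are assumptions for illustration; details such as how the missing starting residuals are filled (zeros here) may differ from the lmtest implementation.

```r
bg_lm <- function(fit, order) {
  e <- residuals(fit)
  n <- length(e)
  X <- model.matrix(fit)
  # lagged residuals, with the unavailable starting values set to zero
  E <- sapply(seq_len(order), function(l) c(rep(0, l), e[seq_len(n - l)]))
  aux  <- lm.fit(cbind(X, E), e)                      # auxiliary regression
  stat <- n * (1 - sum(aux$residuals^2) / sum(e^2))   # n * R-squared
  c(statistic = stat, df = order,
    p.value = pchisq(stat, df = order, lower.tail = FALSE))
}

set.seed(7)
n <- 23
X <- matrix(rnorm(n * 9), n, 9)   # 9 simulated regressors, as in a q = 8 finite DLM
y <- rnorm(n)
fit <- lm(y ~ X)

bg_lm(fit, order = 2)             # a modest number of lags
bg_lm(fit, order = 13)            # 10 + 13 = 23 columns for 23 rows
```

In the second call the auxiliary regression has as many columns as rows, so it fits the residuals perfectly and the statistic collapses to n. A statistic exactly equal to the sample size is therefore a sign that the chosen order is too large for the data.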

Now checking the multicollinearity issue

vif_dlm_1 =vif(finite_DLM_1$model)
vif_dlm_1
##      x.t      x.1      x.2      x.3      x.4      x.5      x.6      x.7 
## 1.459125 1.579066 1.725793 1.728840 1.785482 1.668934 1.571271 1.347587 
##      x.8 
## 1.429501
vif_dlm_1 >10
##   x.t   x.1   x.2   x.3   x.4   x.5   x.6   x.7   x.8 
## FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  • According to the VIF values, the above model with q=8 does not have a multicollinearity problem.
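
For readers unfamiliar with the variance inflation factor, the sketch below computes it by hand: each predictor is regressed on all the others, and VIF = 1 / (1 - R-squared) of that regression. The helper `vif_base` and the simulated predictors are illustrative assumptions; the report uses `vif()` from the car package.

```r
vif_base <- function(X) {
  X <- as.matrix(X)
  out <- sapply(seq_len(ncol(X)), function(j) {
    r2 <- summary(lm(X[, j] ~ X[, -j]))$r.squared   # predictor j on the rest
    1 / (1 - r2)
  })
  names(out) <- colnames(X)
  out
}

set.seed(11)
a <- rnorm(30)
b <- rnorm(30)

vif_base(cbind(a = a, b = b))                          # unrelated predictors: close to 1
vif_base(cbind(a = a, b = a + rnorm(30, sd = 0.01)))   # near-duplicates: far above 10
```

With only two predictors both VIF values are identical, because each depends only on the squared correlation between the pair.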

Fitting polynomial distributed lag models

for(i in 1:10){
        for(j in 1:5){
                model_22.1 <- polyDlm(x = as.vector(ffd_temp.ts),y = as.vector(ffd_FFD.ts), q = i, k = j, show.beta = FALSE)
                cat("q:",i,"k:",j, "AIC:",AIC(model_22.1$model), "BIC:", BIC(model_22.1$model),"MASE =", MASE(model_22.1)$MASE, "\n")
        }
}
## q: 1 k: 1 AIC: 279.6699 BIC: 285.2747 MASE = 0.6644773 
## q: 1 k: 2 AIC: 279.6699 BIC: 285.2747 MASE = 0.6644773 
## q: 1 k: 3 AIC: 279.6699 BIC: 285.2747 MASE = 0.6644773 
## q: 1 k: 4 AIC: 279.6699 BIC: 285.2747 MASE = 0.6644773 
## q: 1 k: 5 AIC: 279.6699 BIC: 285.2747 MASE = 0.6644773 
## q: 2 k: 1 AIC: 271.8418 BIC: 277.311 MASE = 0.6799324 
## q: 2 k: 2 AIC: 272.7512 BIC: 279.5877 MASE = 0.6788097 
## q: 2 k: 3 AIC: 272.7512 BIC: 279.5877 MASE = 0.6788097 
## q: 2 k: 4 AIC: 272.7512 BIC: 279.5877 MASE = 0.6788097 
## q: 2 k: 5 AIC: 272.7512 BIC: 279.5877 MASE = 0.6788097 
## q: 3 k: 1 AIC: 260.724 BIC: 266.0528 MASE = 0.6452749 
## q: 3 k: 2 AIC: 262.6923 BIC: 269.3533 MASE = 0.6467993 
## q: 3 k: 3 AIC: 264.2921 BIC: 272.2853 MASE = 0.643964 
## q: 3 k: 4 AIC: 264.2921 BIC: 272.2853 MASE = 0.643964 
## q: 3 k: 5 AIC: 264.2921 BIC: 272.2853 MASE = 0.643964 
## q: 4 k: 1 AIC: 253.7315 BIC: 258.9149 MASE = 0.6696413 
## q: 4 k: 2 AIC: 254.835 BIC: 261.3141 MASE = 0.6493527 
## q: 4 k: 3 AIC: 256.7827 BIC: 264.5577 MASE = 0.6481165 
## q: 4 k: 4 AIC: 258.0491 BIC: 267.12 MASE = 0.6406911 
## q: 4 k: 5 AIC: 258.0491 BIC: 267.12 MASE = 0.6406911 
## q: 5 k: 1 AIC: 242.023 BIC: 247.0554 MASE = 0.6547974 
## q: 5 k: 2 AIC: 243.9959 BIC: 250.2863 MASE = 0.6548472 
## q: 5 k: 3 AIC: 243.8858 BIC: 251.4344 MASE = 0.610184 
## q: 5 k: 4 AIC: 245.7791 BIC: 254.5857 MASE = 0.6107593 
## q: 5 k: 5 AIC: 247.1156 BIC: 257.1804 MASE = 0.5959619 
## q: 6 k: 1 AIC: 232.3109 BIC: 237.1864 MASE = 0.6274164 
## q: 6 k: 2 AIC: 234.0051 BIC: 240.0995 MASE = 0.6326293 
## q: 6 k: 3 AIC: 235.9857 BIC: 243.299 MASE = 0.6310903 
## q: 6 k: 4 AIC: 236.6387 BIC: 245.1708 MASE = 0.6115644 
## q: 6 k: 5 AIC: 237.7545 BIC: 247.5055 MASE = 0.6003641 
## q: 7 k: 1 AIC: 227.1352 BIC: 231.8475 MASE = 0.6411696 
## q: 7 k: 2 AIC: 225.7006 BIC: 231.5909 MASE = 0.6177101 
## q: 7 k: 3 AIC: 227.167 BIC: 234.2354 MASE = 0.6113556 
## q: 7 k: 4 AIC: 228.0539 BIC: 236.3003 MASE = 0.5880155 
## q: 7 k: 5 AIC: 229.7052 BIC: 239.1296 MASE = 0.5830142 
## q: 8 k: 1 AIC: 218.2523 BIC: 222.7942 MASE = 0.60046 
## q: 8 k: 2 AIC: 218.7617 BIC: 224.4392 MASE = 0.6419454 
## q: 8 k: 3 AIC: 219.9289 BIC: 226.7418 MASE = 0.6138246 
## q: 8 k: 4 AIC: 221.346 BIC: 229.2944 MASE = 0.6090361 
## q: 8 k: 5 AIC: 220.3549 BIC: 229.4388 MASE = 0.5487196 
## q: 9 k: 1 AIC: 210.5338 BIC: 214.898 MASE = 0.6380994 
## q: 9 k: 2 AIC: 208.9273 BIC: 214.3825 MASE = 0.6378389 
## q: 9 k: 3 AIC: 210.789 BIC: 217.3352 MASE = 0.6435655 
## q: 9 k: 4 AIC: 212.543 BIC: 220.1803 MASE = 0.6230324 
## q: 9 k: 5 AIC: 214.3866 BIC: 223.1149 MASE = 0.6178045 
## q: 10 k: 1 AIC: 201.9543 BIC: 206.1323 MASE = 0.6365091 
## q: 10 k: 2 AIC: 201.1493 BIC: 206.3719 MASE = 0.6461061 
## q: 10 k: 3 AIC: 201.9501 BIC: 208.2173 MASE = 0.639002 
## q: 10 k: 4 AIC: 203.947 BIC: 211.2586 MASE = 0.6399682 
## q: 10 k: 5 AIC: 205.695 BIC: 214.0511 MASE = 0.6231184

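According to the output of the polynomial distributed lag models, q = 8 with k = 5 gives the lowest MASE (0.5487196), with AIC = 220.3549 and BIC = 229.4388. The lowest AIC (201.1493, at q = 10, k = 2) and BIC (206.1323, at q = 10, k = 1) occur at the longest lag, but models with different q are fitted to different numbers of observations, so AIC and BIC are not directly comparable across q. We therefore choose q = 8 and k = 5 on the basis of MASE.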

  • Fitting a polynomial DLM for Temperature with respect to the dependent variable FFD
poly_DLM_1 <- polyDlm(x = as.vector(ffd_temp.ts), y = as.vector(ffd_FFD.ts), q = 8, k = 5)
## Estimates and t-tests for beta coefficients:
##        Estimate Std. Error t value P(>|t|)
## beta.0  -14.100      15.80 -0.8930  0.3860
## beta.1   13.900      17.20  0.8100  0.4310
## beta.2    0.882      11.10  0.0795  0.9380
## beta.3    0.539      12.60  0.0429  0.9660
## beta.4   15.300       9.38  1.6400  0.1230
## beta.5   23.800      12.80  1.8700  0.0814
## beta.6    8.280      11.30  0.7340  0.4740
## beta.7  -18.000      16.10 -1.1200  0.2820
## beta.8   17.200      16.80  1.0300  0.3200
summary(poly_DLM_1)
## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -39.633 -14.469  -1.617  12.967  38.253 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -245.4675   411.3332  -0.597    0.559
## z.t0         -14.0807    15.7697  -0.893    0.385
## z.t1          84.8667    68.7487   1.234    0.235
## z.t2         -82.4781    64.7915  -1.273    0.221
## z.t3          29.8455    22.0487   1.354    0.195
## z.t4          -4.4351     3.1041  -1.429    0.172
## z.t5           0.2294     0.1539   1.491    0.155
## 
## Residual standard error: 24.66 on 16 degrees of freedom
## Multiple R-squared:  0.3152, Adjusted R-squared:  0.05839 
## F-statistic: 1.227 on 6 and 16 DF,  p-value: 0.3433

In the above polynomial distributed lag model with q=8 and k=5, none of the lag weights of the predictor series is statistically significant at the 5% level. The adjusted R-squared is 0.05839, so the model explains only about 5.8 percent of the variability in FFD. The overall F-test has a p-value of 0.3433, which is greater than 0.05, so the model as a whole is not statistically significant.

checkresiduals(poly_DLM_1$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 13.677, df = 10, p-value = 0.1882
shapiro.test(residuals(poly_DLM_1$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(poly_DLM_1$model)
## W = 0.96386, p-value = 0.5455

The residual graphs for the above model are shown in Figure 32:

  • The time series plot of the residuals shows a random pattern with no evident trend.
  • According to the ACF plot, the residuals do not have any significant serial correlation.
  • The histogram of the residuals shows no clear violation of the normality assumption.
  • The Breusch-Godfrey p-value (0.1882) is greater than 0.05, so the test finds no evidence of serial correlation at the 5% level of significance.
  • The Shapiro-Wilk p-value (0.5455) is greater than 0.05, so we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_poly_1 =vif(poly_DLM_1$model)
vif_poly_1
##         z.t0         z.t1         z.t2         z.t3         z.t4         z.t5 
##      12.0902    5747.7598  226953.9051 1316863.0441 1394103.0790  191117.9998
vif_poly_1 >10
## z.t0 z.t1 z.t2 z.t3 z.t4 z.t5 
## TRUE TRUE TRUE TRUE TRUE TRUE
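  • According to the VIF values, all of which are far above 10, the above model with q=8 and k=5 has a severe multicollinearity problem. This is common for polynomial DLMs, because the z terms are all built from the same lagged temperature values.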
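
To show where the `z.t0` to `z.t5` terms come from, here is a base-R sketch of the Almon (polynomial) lag construction on simulated data. It assumes the standard form in which lag weight beta_s is a degree-k polynomial in s, so that y is regressed on z_{t,j} = sum over s of s^j * x_{t-s}. The helper `almon_fit` is hypothetical and may differ in detail from `polyDlm()`.

```r
set.seed(42)
n <- 31
x <- rnorm(n, mean = 15, sd = 2)    # simulated predictor (stand-in for temperature)
y <- 300 + rnorm(n, sd = 25)        # simulated response (stand-in for FFD)

almon_fit <- function(x, y, q, k) {
  lags <- embed(x, q + 1)                      # x_t, x_{t-1}, ..., x_{t-q}
  P <- outer(0:q, 0:k, "^")                    # P[s + 1, j + 1] = s^j
  Z <- lags %*% P                              # z_{t,j} = sum_s s^j * x_{t-s}
  colnames(Z) <- paste0("z.t", 0:k)
  dat <- data.frame(y = y[(q + 1):length(y)], Z)
  fit <- lm(y ~ ., data = dat)
  beta <- as.vector(P %*% coef(fit)[-1])       # implied lag weights beta_0..beta_q
  list(model = fit, beta = beta, Z = Z)
}

res <- almon_fit(x, y, q = 4, k = 2)
res$beta                 # five lag weights recovered from three polynomial terms
round(cor(res$Z), 3)     # the z terms are strongly correlated with each other
```

The correlation matrix of the z terms makes the source of the large VIF values visible. When k equals q the polynomial restriction no longer binds, which is consistent with the identical rows for the finite and polynomial models in the search output (for example q = 3, k = 3).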

Fitting Koyck model

  • Fitting a Koyck model for Temperature with respect to the dependent variable FFD
Koyck_model_1 = koyckDlm(x = as.vector(ffd_temp.ts) , y = as.vector(ffd_FFD.ts))
summary(Koyck_model_1$model, diagnostics=TRUE)
## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -57.755 -12.972  -4.079  17.329  58.541 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -19.87216  608.22407  -0.033    0.974
## Y.1           0.03846    0.25384   0.152    0.881
## X.t          23.22042   61.08917   0.380    0.707
## 
## Diagnostic tests:
##                  df1 df2 statistic p-value
## Weak instruments   1  27     1.429   0.242
## Wu-Hausman         1  26     0.582   0.452
## Sargan             0  NA        NA      NA
## 
## Residual standard error: 28.38 on 27 degrees of freedom
## Multiple R-Squared: -0.3272, Adjusted R-squared: -0.4255 
## Wald test: 0.07269 on 2 and 27 DF,  p-value: 0.9301
  • In the above Koyck model, no term is significant at the 5% level. Both the multiple R-squared (-0.3272) and the adjusted R-squared (-0.4255) are negative, which means the model fits worse than simply using the mean of FFD. The Wald test for the whole model has a p-value of 0.9301, which is greater than 0.05, so the model is not statistically significant.
  • The weak instruments test (p-value 0.242) does not reject the null hypothesis of a weak instrument, so the instrumental-variable estimates should be treated with caution.
  • From the Wu-Hausman test (p-value 0.452, greater than 0.05), we conclude that there is no significant correlation between the explanatory variable and the error term at the 5% level.
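
In a Koyck model the effect of x at lag s is beta * phi^s, so the two slope estimates imply a whole sequence of geometrically declining lag weights. The sketch below computes them from the coefficients printed above. Since neither estimate is statistically significant, the numbers are purely illustrative; `koyck_weights` is a hypothetical helper name.

```r
phi  <- 0.03846    # coefficient on Y.1 in the output above
beta <- 23.22042   # coefficient on X.t in the output above

# Geometric lag weights: beta, beta * phi, beta * phi^2, ...
koyck_weights <- function(beta, phi, max_lag) beta * phi^(0:max_lag)

koyck_weights(beta, phi, max_lag = 4)   # implied effect of x at lags 0 to 4
beta / (1 - phi)                        # long-run multiplier, valid when |phi| < 1
```

Because phi is close to zero, almost all of the implied effect sits at lag 0 and the weights vanish quickly.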
checkresiduals(Koyck_model_1$model)

shapiro.test(residuals(Koyck_model_1$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(Koyck_model_1$model)
## W = 0.98351, p-value = 0.9093

The residual graphs for the above model are shown in Figure 33:

  • The time series plot of the residuals shows a random pattern with no evident trend.
  • The ACF plot has only one significant lag, indicating slight autocorrelation in the residuals. As the data are annual, this is not a sign of seasonality.
  • The histogram of the residuals shows no clear violation of the normality assumption.
  • The Shapiro-Wilk p-value (0.9093) is greater than 0.05, so we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_loyck_1=vif(Koyck_model_1$model)
vif_loyck_1
##      Y.1      X.t 
## 1.280988 1.280988
vif_loyck_1>10
##   Y.1   X.t 
## FALSE FALSE
  • According to the VIF values, the above model does not have a multicollinearity problem.

Fitting autoregressive distributed lag models

Autoregressive distributed lag (ARDL) models are the last model type we consider from the time series regression approach. To choose the orders of the ARDL(p,q) model, we build a loop that fits models for a range of lag lengths p of the predictor and autoregressive orders q, and prints accuracy metrics (AIC, BIC and MASE) for each.
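
As a sketch of what an ARDL(p,q) regression contains, the base-R code below builds the design by hand on simulated data: y_t is regressed on x_t and its first p lags plus the first q lags of y. The naming (p for lags of x, q for lags of y) follows the coefficient tables in this report; `ardl_design_fit` is a hypothetical helper, not the dLagM implementation.

```r
set.seed(99)
n <- 31
x <- rnorm(n, mean = 15, sd = 1)    # simulated predictor (stand-in for temperature)
y <- 300 + rnorm(n, sd = 25)        # simulated response (stand-in for FFD)

# Assumes p >= 0 and q >= 1
ardl_design_fit <- function(x, y, p, q) {
  m    <- max(p, q)                                    # observations lost at the start
  xlag <- embed(x, m + 1)[, 1:(p + 1), drop = FALSE]   # x_t, x_{t-1}, ..., x_{t-p}
  ymat <- embed(y, m + 1)                              # y_t, y_{t-1}, ..., y_{t-m}
  ylag <- ymat[, 2:(q + 1), drop = FALSE]              # y_{t-1}, ..., y_{t-q}
  colnames(xlag) <- c("X.t", paste0("X.", seq_len(p)))
  colnames(ylag) <- paste0("Y.", seq_len(q))
  dat <- data.frame(y = ymat[, 1], xlag, ylag)
  lm(y ~ ., data = dat)
}

fit55 <- ardl_design_fit(x, y, p = 5, q = 5)
nobs(fit55)            # 31 - max(p, q) rows are available
length(coef(fit55))    # intercept + (p + 1) x terms + q y terms
```

With 31 observations, ARDL(5,5) uses 26 rows for 12 coefficients, leaving few degrees of freedom, so large orders should be chosen with care on a series this short.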

for (i in 1:5){
  for(j in 1:5){
    model_3 = ardlDlm(x = as.vector(ffd_temp.ts), y = as.vector(ffd_FFD.ts), p = i , q = j)
    cat("p =", i, "q =", j, "AIC =", AIC(model_3$model), "BIC =", BIC(model_3$model), "MASE =", MASE(model_3)$MASE, "\n")
 }
}
## p = 1 q = 1 AIC = 281.6674 BIC = 288.6734 MASE = 0.6638257 
## p = 1 q = 2 AIC = 274.6571 BIC = 282.8609 MASE = 0.6701022 
## p = 1 q = 3 AIC = 267.8377 BIC = 277.1631 MASE = 0.6636023 
## p = 1 q = 4 AIC = 260.7609 BIC = 271.1276 MASE = 0.6312973 
## p = 1 q = 5 AIC = 253.8875 BIC = 265.2103 MASE = 0.6113428 
## p = 2 q = 1 AIC = 274.7477 BIC = 282.9515 MASE = 0.6779238 
## p = 2 q = 2 AIC = 276.5497 BIC = 286.1208 MASE = 0.6661485 
## p = 2 q = 3 AIC = 269.7687 BIC = 280.4263 MASE = 0.6590945 
## p = 2 q = 4 AIC = 262.731 BIC = 274.3935 MASE = 0.6290879 
## p = 2 q = 5 AIC = 255.6945 BIC = 268.2754 MASE = 0.6121448 
## p = 3 q = 1 AIC = 266.2915 BIC = 275.6169 MASE = 0.6440947 
## p = 3 q = 2 AIC = 268.2817 BIC = 278.9393 MASE = 0.6435364 
## p = 3 q = 3 AIC = 270.278 BIC = 282.2678 MASE = 0.6452747 
## p = 3 q = 4 AIC = 263.0578 BIC = 276.0161 MASE = 0.6250869 
## p = 3 q = 5 AIC = 256.2379 BIC = 270.077 MASE = 0.6133766 
## p = 4 q = 1 AIC = 260.0445 BIC = 270.4112 MASE = 0.6390572 
## p = 4 q = 2 AIC = 262.0215 BIC = 273.684 MASE = 0.6359683 
## p = 4 q = 3 AIC = 264.0057 BIC = 276.9641 MASE = 0.6402902 
## p = 4 q = 4 AIC = 264.9661 BIC = 279.2203 MASE = 0.6290295 
## p = 4 q = 5 AIC = 258.1505 BIC = 273.2476 MASE = 0.6145971 
## p = 5 q = 1 AIC = 249.017 BIC = 260.3399 MASE = 0.5906457 
## p = 5 q = 2 AIC = 250.5602 BIC = 263.1412 MASE = 0.5982639 
## p = 5 q = 3 AIC = 252.4929 BIC = 266.332 MASE = 0.5958209 
## p = 5 q = 4 AIC = 254.2659 BIC = 269.3631 MASE = 0.5977003 
## p = 5 q = 5 AIC = 254.762 BIC = 271.1173 MASE = 0.5872934

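According to the output of the autoregressive distributed lag models, p = 5 with q = 5 gives the lowest MASE (0.5872934), with AIC = 254.762 and BIC = 271.1173. The lowest AIC (249.017) and BIC (260.3399) belong to p = 5, q = 1, whose MASE (0.5906457) is only slightly higher. We choose p = 5 and q = 5 on the basis of MASE.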
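
Since MASE drives the model choice, it may help to see the usual in-sample definition written out: the mean absolute residual divided by the mean absolute error of a naive one-step forecast (the previous value). The function below is a minimal sketch with toy numbers; the `MASE()` function in dLagM may differ in detail, for example in which observations enter the scaling term.

```r
# Mean absolute scaled error: model MAE relative to the naive-forecast MAE
mase <- function(residuals, y) {
  mean(abs(residuals)) / mean(abs(diff(y)))
}

y_toy   <- c(1, 3, 5, 7)      # toy series: naive forecast errors are all 2
res_toy <- c(1, -1, 1, -1)    # toy model residuals: mean absolute value 1

mase(res_toy, y_toy)          # below 1 means the model beats the naive forecast
```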

  • Fitting an autoregressive distributed lag model for Temperature with respect to the dependent variable FFD.
ardldlm_t2_55 = ardlDlm(x = as.vector(ffd_temp.ts), y = as.vector(ffd_FFD.ts),p = 5, q =5)
summary(ardldlm_t2_55)
## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -34.737 -19.061   1.234  14.390  35.592 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -224.23454  313.40616  -0.715   0.4861  
## X.t           -9.16144   20.27475  -0.452   0.6583  
## X.1            9.94277   17.54350   0.567   0.5799  
## X.2           -3.59758   19.35814  -0.186   0.8552  
## X.3            3.85078   17.98292   0.214   0.8335  
## X.4            9.73284   18.58716   0.524   0.6087  
## X.5           36.11788   20.11520   1.796   0.0942 .
## Y.1           -0.10372    0.24476  -0.424   0.6782  
## Y.2           -0.16756    0.26201  -0.640   0.5328  
## Y.3           -0.06975    0.25112  -0.278   0.7853  
## Y.4            0.05407    0.25163   0.215   0.8330  
## Y.5            0.24253    0.26563   0.913   0.3767  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 26.84 on 14 degrees of freedom
## Multiple R-squared:  0.3314, Adjusted R-squared:  -0.194 
## F-statistic: 0.6308 on 11 and 14 DF,  p-value: 0.7761

In the above autoregressive distributed lag model with p=5 and q=5, no term is significant at the 5% level. The adjusted R-squared is -0.194; a negative value means that, after penalising for the number of predictors, the model explains essentially none of the variability in FFD. The overall F-test has a p-value of 0.7761, which is greater than 0.05, so the model as a whole is not statistically significant.

checkresiduals(ardldlm_t2_55)
## Time Series:
## Start = 6 
## End = 31 
## Frequency = 1 
##          6          7          8          9         10         11         12 
##  24.238693  14.855868  -1.648442 -20.012696  -6.630781  35.592493 -26.529697 
##         13         14         15         16         17         18         19 
##  15.913520  -3.118743  10.216350 -24.649492 -34.737323   6.496199 -23.082908 
##         20         21         22         23         24         25         26 
##  10.273333 -16.206704  -6.196080 -22.373239  33.972539   4.115993   6.904172 
##         27         28         29         30         31 
##  25.271067  21.483544  12.992000 -15.627340 -21.512324

shapiro.test(residuals(ardldlm_t2_55$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ardldlm_t2_55$model)
## W = 0.95923, p-value = 0.3767

The residual graphs for the above model are shown in Figure 34:

  • The time series plot of the residuals shows a random pattern with no evident trend.
  • According to the ACF plot, the residuals do not have any significant serial correlation.
  • The histogram of the residuals shows no clear violation of the normality assumption.
  • The Shapiro-Wilk p-value (0.3767) is greater than 0.05, so we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_ardldlm_t2_55=vif(ardldlm_t2_55$model)
vif_ardldlm_t2_55
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(X.t, 5) L(y.t, 1) L(y.t, 2) 
##  2.120183  1.559726  1.743178  1.477845  1.531142  1.735191  1.223215  1.288416 
## L(y.t, 3) L(y.t, 4) L(y.t, 5) 
##  1.183578  1.238850  1.347144
vif_ardldlm_t2_55>10
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(X.t, 5) L(y.t, 1) L(y.t, 2) 
##     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE 
## L(y.t, 3) L(y.t, 4) L(y.t, 5) 
##     FALSE     FALSE     FALSE

According to the VIF values, the above model does not have a multicollinearity problem.

Model Fitting - FFD vs Rainfall

Fitting finite distributed lag models

To determine the model’s finite lag length, we build a loop that prints accuracy metrics (AIC, BIC and MASE) for models with varying lag lengths, and we then pick the lag length with the lowest values.

for ( i in 1:5){
  model111.1 = dlm(x = as.vector(ffd_rainfall.ts), y = as.vector(ffd_FFD.ts), q = i )
  cat("q = ", i, "AIC = ", AIC(model111.1$model), "BIC = ", BIC(model111.1$model), "MASE =", MASE(model111.1)$MASE, "\n")
}
## q =  1 AIC =  281.9769 BIC =  287.5817 MASE = 0.6758928 
## q =  2 AIC =  273.8231 BIC =  280.6596 MASE = 0.6752737 
## q =  3 AIC =  265.1863 BIC =  273.1795 MASE = 0.6326564 
## q =  4 AIC =  257.086 BIC =  266.1568 MASE = 0.6309338 
## q =  5 AIC =  246.1244 BIC =  256.1891 MASE = 0.5808908

According to the output of the finite distributed lag models, q = 5 has the lowest MASE, AIC and BIC values among the lag lengths tried (MASE = 0.5808908, AIC = 246.1244, BIC = 256.1891). As a result, we choose a lag length of q = 5.

  • Fitting a finite DLM with a lag of 5 and carrying out the diagnostic checking for Rainfall with respect to the dependent variable FFD
finite_DLM_2 <- dlm(x = as.vector(ffd_rainfall.ts), y = as.vector(ffd_FFD.ts), q = 5)
summary(finite_DLM_2)
## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -36.580 -15.273   0.321  16.502  34.419 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept) 124.29819   61.42691   2.024   0.0573 .
## x.t          -1.59092   12.07498  -0.132   0.8966  
## x.1          10.07463   12.35892   0.815   0.4251  
## x.2          -0.05172   12.47752  -0.004   0.9967  
## x.3          21.36996   12.49193   1.711   0.1034  
## x.4         -19.37574   12.76458  -1.518   0.1455  
## x.5          25.48370   12.88078   1.978   0.0626 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 23.65 on 19 degrees of freedom
## Multiple R-squared:  0.2954, Adjusted R-squared:  0.0729 
## F-statistic: 1.328 on 6 and 19 DF,  p-value: 0.2931
## 
## AIC and BIC values for the model:
##        AIC      BIC
## 1 246.1244 256.1891

In the above finite distributed lag model with q=5, none of the lag weights of the predictor series is statistically significant at the 5% level. The adjusted R-squared is 0.0729, so the model explains only about 7.3 percent of the variability in FFD. The overall F-test has a p-value of 0.2931, which is greater than 0.05, so the model as a whole is not statistically significant.

checkresiduals(finite_DLM_2$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 19.533, df = 10, p-value = 0.03399
shapiro.test(residuals(finite_DLM_2$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(finite_DLM_2$model)
## W = 0.96557, p-value = 0.5126

The residual graphs for the above model are shown in Figure 35:

  • The time series plot of the residuals shows no obvious trend.
  • The ACF plot has only one significant lag, indicating slight autocorrelation in the residuals.
  • Since the p-value (0.03399) is less than 0.05, the Breusch-Godfrey test rejects the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals reveals no clear violation of the normality assumption.
  • Since the Shapiro-Wilk p-value (0.5126) is > 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.
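As a rough illustration of what the Breusch-Godfrey statistic measures, the sketch below computes the LM form by hand on simulated data: the residuals are regressed on the original regressors plus their own lags (padded with zeros), and n times the R-squared of that auxiliary regression is compared with a chi-squared distribution. The data, the order of 2 and all variable names are assumptions made for this example; it is not the code path used by `checkresiduals()`.

```r
set.seed(1)
n <- 60
x <- rnorm(n)
y <- 1 + 0.5 * x + rnorm(n)          # simulated data with independent errors

fit <- lm(y ~ x)
e <- residuals(fit)

bg_order <- 2                         # number of residual lags tested (assumed)
# Lagged residuals, with the missing starting values filled by zeros
E <- sapply(1:bg_order, function(k) c(rep(0, k), e[1:(n - k)]))

aux <- lm(e ~ x + E)                  # auxiliary regression
lm_stat <- n * summary(aux)$r.squared
p_value <- pchisq(lm_stat, df = bg_order, lower.tail = FALSE)

c(LM = lm_stat, p.value = p_value)
shapiro.test(e)$p.value               # normality check on the same residuals
```

A small p-value for the LM statistic points to serial correlation in the residuals, while a large Shapiro-Wilk p-value gives no evidence against normality.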

Now checking the multicollinearity issue

vif_dlm_2 =vif(finite_DLM_2$model)
vif_dlm_2
##      x.t      x.1      x.2      x.3      x.4      x.5 
## 1.079459 1.129580 1.143071 1.130249 1.099396 1.067556
vif_dlm_2 >10
##   x.t   x.1   x.2   x.3   x.4   x.5 
## FALSE FALSE FALSE FALSE FALSE FALSE
  • According to the VIF values, the above model with q=5 does not have a multicollinearity problem.
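The VIF of a regressor is 1 / (1 - R²), where R² comes from regressing that regressor on all the others. The sketch below reproduces this by hand for the lag columns of a finite DLM, using simulated data in place of the rainfall series; the names `lags` and `vif_manual` are illustrative.

```r
set.seed(42)
n <- 31
x <- rnorm(n)                         # stand-in for a predictor series
q <- 5

# embed() returns the columns x_t, x_{t-1}, ..., x_{t-q}
lags <- embed(x, q + 1)
colnames(lags) <- c("x.t", paste0("x.", 1:q))

# VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the rest
vif_manual <- sapply(seq_len(ncol(lags)), function(j) {
  r2 <- summary(lm(lags[, j] ~ lags[, -j]))$r.squared
  1 / (1 - r2)
})
names(vif_manual) <- colnames(lags)
round(vif_manual, 3)
```

Values close to 1 mean a lag carries information that the other lags do not; the common rule of thumb used in this report flags values above 10.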

Fitting polynomial distributed lag models

for(i in 1:10){
        for(j in 1:5){
                model_222.1 <- polyDlm(x = as.vector(ffd_rainfall.ts),y = as.vector(ffd_FFD.ts), q = i, k = j, show.beta = FALSE)
                cat("q:",i,"k:",j, "AIC:",AIC(model_222.1$model), "BIC:", BIC(model_222.1$model),"MASE =", MASE(model_222.1)$MASE, "\n")
        }
}
## q: 1 k: 1 AIC: 281.9769 BIC: 287.5817 MASE = 0.6758928 
## q: 1 k: 2 AIC: 281.9769 BIC: 287.5817 MASE = 0.6758928 
## q: 1 k: 3 AIC: 281.9769 BIC: 287.5817 MASE = 0.6758928 
## q: 1 k: 4 AIC: 281.9769 BIC: 287.5817 MASE = 0.6758928 
## q: 1 k: 5 AIC: 281.9769 BIC: 287.5817 MASE = 0.6758928 
## q: 2 k: 1 AIC: 271.8803 BIC: 277.3494 MASE = 0.6800138 
## q: 2 k: 2 AIC: 273.8231 BIC: 280.6596 MASE = 0.6752737 
## q: 2 k: 3 AIC: 273.8231 BIC: 280.6596 MASE = 0.6752737 
## q: 2 k: 4 AIC: 273.8231 BIC: 280.6596 MASE = 0.6752737 
## q: 2 k: 5 AIC: 273.8231 BIC: 280.6596 MASE = 0.6752737 
## q: 3 k: 1 AIC: 261.2148 BIC: 266.5436 MASE = 0.6328574 
## q: 3 k: 2 AIC: 263.1955 BIC: 269.8565 MASE = 0.633195 
## q: 3 k: 3 AIC: 265.1863 BIC: 273.1795 MASE = 0.6326564 
## q: 3 k: 4 AIC: 265.1863 BIC: 273.1795 MASE = 0.6326564 
## q: 3 k: 5 AIC: 265.1863 BIC: 273.1795 MASE = 0.6326564 
## q: 4 k: 1 AIC: 254.5701 BIC: 259.7534 MASE = 0.6503852 
## q: 4 k: 2 AIC: 254.6528 BIC: 261.132 MASE = 0.6352745 
## q: 4 k: 3 AIC: 255.6507 BIC: 263.4257 MASE = 0.6296653 
## q: 4 k: 4 AIC: 257.086 BIC: 266.1568 MASE = 0.6309338 
## q: 4 k: 5 AIC: 257.086 BIC: 266.1568 MASE = 0.6309338 
## q: 5 k: 1 AIC: 245.3577 BIC: 250.3901 MASE = 0.6415906 
## q: 5 k: 2 AIC: 247.2862 BIC: 253.5766 MASE = 0.6312437 
## q: 5 k: 3 AIC: 247.9819 BIC: 255.5305 MASE = 0.6449668 
## q: 5 k: 4 AIC: 247.2154 BIC: 256.0221 MASE = 0.6076301 
## q: 5 k: 5 AIC: 246.1244 BIC: 256.1891 MASE = 0.5808908 
## q: 6 k: 1 AIC: 236.3676 BIC: 241.2431 MASE = 0.6330879 
## q: 6 k: 2 AIC: 238.2623 BIC: 244.3567 MASE = 0.6392046 
## q: 6 k: 3 AIC: 240.1603 BIC: 247.4735 MASE = 0.6442302 
## q: 6 k: 4 AIC: 241.9518 BIC: 250.484 MASE = 0.6418412 
## q: 6 k: 5 AIC: 241.2959 BIC: 251.0469 MASE = 0.6106778 
## q: 7 k: 1 AIC: 228.2873 BIC: 232.9995 MASE = 0.628454 
## q: 7 k: 2 AIC: 230.1033 BIC: 235.9936 MASE = 0.6407467 
## q: 7 k: 3 AIC: 232.0614 BIC: 239.1297 MASE = 0.6394295 
## q: 7 k: 4 AIC: 233.8699 BIC: 242.1162 MASE = 0.6479944 
## q: 7 k: 5 AIC: 235.7375 BIC: 245.1619 MASE = 0.65258 
## q: 8 k: 1 AIC: 220.1186 BIC: 224.6605 MASE = 0.6347111 
## q: 8 k: 2 AIC: 221.8263 BIC: 227.5038 MASE = 0.653508 
## q: 8 k: 3 AIC: 223.8101 BIC: 230.623 MASE = 0.6540435 
## q: 8 k: 4 AIC: 225.6771 BIC: 233.6255 MASE = 0.6515374 
## q: 8 k: 5 AIC: 227.6729 BIC: 236.7569 MASE = 0.6530242 
## q: 9 k: 1 AIC: 210.5477 BIC: 214.9119 MASE = 0.6501095 
## q: 9 k: 2 AIC: 211.4687 BIC: 216.9239 MASE = 0.6498614 
## q: 9 k: 3 AIC: 213.3512 BIC: 219.8975 MASE = 0.6415101 
## q: 9 k: 4 AIC: 215.2773 BIC: 222.9146 MASE = 0.6401378 
## q: 9 k: 5 AIC: 217.2461 BIC: 225.9745 MASE = 0.6481429 
## q: 10 k: 1 AIC: 201.5409 BIC: 205.7189 MASE = 0.6564796 
## q: 10 k: 2 AIC: 201.0167 BIC: 206.2393 MASE = 0.6248857 
## q: 10 k: 3 AIC: 203.0026 BIC: 209.2697 MASE = 0.624006 
## q: 10 k: 4 AIC: 204.9837 BIC: 212.2954 MASE = 0.6226522 
## q: 10 k: 5 AIC: 206.2938 BIC: 214.6499 MASE = 0.6171977

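According to the output of the polynomial distributed lag models, q=5 and k=5 gives the lowest MASE (0.5808908), with AIC = 246.1244 and BIC = 256.1891. AIC and BIC keep decreasing as q grows, but each additional lag drops one observation from the fit, so these criteria are not directly comparable across different values of q; the choice is therefore based on MASE. As a result, we select q=5 and k=5.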

  • Fitting a polynomial DLM for Rainfall with respect to dependent variable FFD
poly_DLM_2 <- polyDlm(x = as.vector(ffd_rainfall.ts), y = as.vector(ffd_FFD.ts), q = 5, k = 5)
## Estimates and t-tests for beta coefficients:
##        Estimate Std. Error  t value P(>|t|)
## beta.0  -1.5900       12.1 -0.13200  0.8960
## beta.1  10.1000       12.4  0.81500  0.4240
## beta.2  -0.0517       12.5 -0.00415  0.9970
## beta.3  21.4000       12.5  1.71000  0.1020
## beta.4 -19.4000       12.8 -1.52000  0.1440
## beta.5  25.5000       12.9  1.98000  0.0611
summary(poly_DLM_2)
## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -36.580 -15.273   0.321  16.502  34.419 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  124.298     61.427   2.024   0.0573 .
## z.t0          -1.591     12.075  -0.132   0.8966  
## z.t1         154.814    130.699   1.185   0.2508  
## z.t2        -266.859    207.796  -1.284   0.2145  
## z.t3         158.979    114.957   1.383   0.1827  
## z.t4         -38.506     26.157  -1.472   0.1574  
## z.t5           3.238      2.091   1.549   0.1379  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 23.65 on 19 degrees of freedom
## Multiple R-squared:  0.2954, Adjusted R-squared:  0.0729 
## F-statistic: 1.328 on 6 and 19 DF,  p-value: 0.2931

In the polynomial distributed lag model above with q=5 and k=5, none of the lag weights of the predictor series is statistically significant at the 5% level. Because k equals q, the polynomial places no restriction on the lag weights, which is why the beta estimates, residuals and fit statistics match those of the finite DLM with q=5. The adjusted R-squared is 0.0729, so the model explains only about 7.29 percent of the variability in FFD. The overall F-test has a p-value of 0.2931, which is greater than 0.05, so the model as a whole is not statistically significant.
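The sketch below illustrates, on simulated data, how a polynomial (Almon) DLM can be built from the lag matrix and why a polynomial of order k = q reproduces the unrestricted finite DLM. The parametrisation beta_s = sum_j a_j s^j is an assumption about the usual Almon setup; all object names are illustrative and the code does not call `polyDlm()`.

```r
set.seed(7)
n <- 31
x <- rnorm(n)
y <- 300 + 10 * x + rnorm(n, sd = 20)   # simulated response
q <- 5
k <- 5

X <- embed(x, q + 1)                    # columns x_t, ..., x_{t-q}
y_t <- y[(q + 1):n]

# Almon transformation: z_{j,t} = sum_s s^j * x_{t-s}, for j = 0..k
S <- outer(0:q, 0:k, `^`)               # (q+1) x (k+1) matrix of s^j (0^0 is 1 in R)
Z <- X %*% S

fit_finite <- lm(y_t ~ X)               # unrestricted lag weights
fit_poly   <- lm(y_t ~ Z)               # polynomial coefficients a_0..a_k

# With k = q the two models span the same space, so the fits coincide
max_diff <- max(abs(fitted(fit_finite) - fitted(fit_poly)))

# Lag weights recovered from the polynomial coefficients
beta_from_poly <- as.vector(S %*% coef(fit_poly)[-1])
cbind(finite = coef(fit_finite)[-1], poly = beta_from_poly)
```

The columns of `Z` are powers of the same lag index applied to the same series, so they are strongly correlated with one another, which is consistent with very large VIF values for the z terms.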

checkresiduals(poly_DLM_2$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 19.533, df = 10, p-value = 0.03399
shapiro.test(residuals(poly_DLM_2$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(poly_DLM_2$model)
## W = 0.96557, p-value = 0.5126

The residual graphs for the above model are shown in Figure 36:

  • The time series plot of the residuals shows no obvious trend.
  • The ACF plot has only one significant lag, indicating slight autocorrelation in the residuals.
  • Since the p-value (0.03399) is less than 0.05, the Breusch-Godfrey test rejects the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals reveals no clear violation of the normality assumption.
  • Since the Shapiro-Wilk p-value (0.5126) is > 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_poly_2 =vif(poly_DLM_2$model)
vif_poly_2
##         z.t0         z.t1         z.t2         z.t3         z.t4         z.t5 
## 7.840291e+00 7.209277e+03 3.005203e+05 1.860099e+06 2.135919e+06 3.173162e+05
vif_poly_2 >10
##  z.t0  z.t1  z.t2  z.t3  z.t4  z.t5 
## FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
  • According to the VIF values, which exceed 10 for every polynomial term except z.t0 and reach the order of 10^6, the above model with q=5 and k=5 is severely affected by multicollinearity.

Fitting Koyck model

  • Fitting a Koyck model for Rainfall with respect to dependent variable FFD
Koyck_model_2 = koyckDlm(x = as.vector(ffd_rainfall.ts) , y = as.vector(ffd_FFD.ts))
summary(Koyck_model_2$model, diagnostics=TRUE)
## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -58.691 -21.222   2.697  14.856  68.192 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.266e+02  2.185e+02   0.579    0.567
## Y.1         4.591e-03  2.196e-01   0.021    0.983
## X.t         3.448e+01  8.772e+01   0.393    0.697
## 
## Diagnostic tests:
##                  df1 df2 statistic p-value
## Weak instruments   1  27     0.654   0.426
## Wu-Hausman         1  26     0.161   0.692
## Sargan             0  NA        NA      NA
## 
## Residual standard error: 27.55 on 27 degrees of freedom
## Multiple R-Squared: -0.2505, Adjusted R-squared: -0.3431 
## Wald test: 0.07773 on 2 and 27 DF,  p-value: 0.9254
  • In the above Koyck model, no terms are significant at the 5% level. The adjusted R-squared is -0.3431; a negative value means the model fits worse than simply using the mean of FFD, which can happen because the model is estimated with instrumental variables rather than ordinary least squares. The Wald test for the whole model has a p-value of 0.9254, which is greater than 0.05, so the model is not statistically significant.
  • From the Wu-Hausman test (p-value 0.692, greater than 0.05) we find no significant evidence of correlation between the explanatory variable and the error term at the 5% level.
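To see how an instrumental-variables fit can produce a negative R-squared, the sketch below runs a two-stage least squares estimate by hand on simulated data. It assumes the usual Koyck setup in which the lagged response Y.1 is instrumented by the lagged predictor X.1; that choice, the simulated series and all names are assumptions for illustration, not the internals of `koyckDlm()`.

```r
set.seed(3)
n <- 31
x <- rnorm(n)
y <- 300 + rnorm(n, sd = 25)           # response unrelated to x by construction

y_t <- y[2:n]; y_1 <- y[1:(n - 1)]     # Y.t and its first lag
x_t <- x[2:n]; x_1 <- x[1:(n - 1)]     # X.t and its first lag

# Stage 1: project the endogenous regressor Y.1 on the instruments
stage1 <- lm(y_1 ~ x_t + x_1)
y_1_hat <- fitted(stage1)

# Stage 2: replace Y.1 by its projection
stage2 <- lm(y_t ~ y_1_hat + x_t)
b <- unname(coef(stage2))

# IV residuals are computed with the observed Y.1, not the projection,
# so the residual sum of squares may exceed the total sum of squares
res_iv <- y_t - (b[1] + b[2] * y_1 + b[3] * x_t)
r2_iv <- 1 - sum(res_iv^2) / sum((y_t - mean(y_t))^2)
r2_iv
```

Unlike ordinary least squares, nothing forces `r2_iv` to lie between 0 and 1, so a negative value signals a poor fit rather than a computational error.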
checkresiduals(Koyck_model_2$model)

shapiro.test(residuals(Koyck_model_2$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(Koyck_model_2$model)
## W = 0.98582, p-value = 0.9504

The residual graphs for the above model are shown in Figure 37:

  • The time series plot of the residuals shows no obvious trend.
  • The ACF plot has only one significant lag, indicating slight autocorrelation in the residuals.
  • The histogram of the residuals reveals no clear violation of the normality assumption.
  • Since the Shapiro-Wilk p-value (0.9504) is > 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_koyck_2=vif(Koyck_model_2$model)
vif_koyck_2
##      Y.1      X.t 
## 1.017508 1.017508
vif_koyck_2>10
##   Y.1   X.t 
## FALSE FALSE
  • According to the VIF values, the above model does not have a multicollinearity problem.

Fitting autoregressive distributed lag models

Autoregressive distributed lag (ARDL) models are the last model type derived from the time series regression technique. To choose the orders of ARDL(p,q), we build a loop that fits ARDL models for a range of predictor lag lengths (p) and autoregressive orders (q) and calculates accuracy metrics such as AIC/BIC and MASE.

for (i in 1:5){
  for(j in 1:5){
    model_33 = ardlDlm(x = as.vector(ffd_rainfall.ts), y = as.vector(ffd_FFD.ts), p = i , q = j)
    cat("p =", i, "q =", j, "AIC =", AIC(model_33$model), "BIC =", BIC(model_33$model), "MASE =", MASE(model_33)$MASE, "\n")
 }
}
## p = 1 q = 1 AIC = 283.9744 BIC = 290.9804 MASE = 0.6760282 
## p = 1 q = 2 AIC = 276.7348 BIC = 284.9385 MASE = 0.6838177 
## p = 1 q = 3 AIC = 269.3361 BIC = 278.6615 MASE = 0.6676193 
## p = 1 q = 4 AIC = 262.6875 BIC = 273.0542 MASE = 0.6489023 
## p = 1 q = 5 AIC = 255.5705 BIC = 266.8934 MASE = 0.6396905 
## p = 2 q = 1 AIC = 275.8163 BIC = 284.0201 MASE = 0.6749962 
## p = 2 q = 2 AIC = 277.8084 BIC = 287.3795 MASE = 0.6762326 
## p = 2 q = 3 AIC = 270.6698 BIC = 281.3274 MASE = 0.6621822 
## p = 2 q = 4 AIC = 264.2889 BIC = 275.9514 MASE = 0.6506873 
## p = 2 q = 5 AIC = 257.2016 BIC = 269.7826 MASE = 0.641259 
## p = 3 q = 1 AIC = 267.1614 BIC = 276.4868 MASE = 0.6355497 
## p = 3 q = 2 AIC = 269.1262 BIC = 279.7838 MASE = 0.639182 
## p = 3 q = 3 AIC = 270.8152 BIC = 282.8051 MASE = 0.6371823 
## p = 3 q = 4 AIC = 264.4739 BIC = 277.4323 MASE = 0.6246495 
## p = 3 q = 5 AIC = 258.2483 BIC = 272.0874 MASE = 0.6369725 
## p = 4 q = 1 AIC = 259.0485 BIC = 269.4152 MASE = 0.6272275 
## p = 4 q = 2 AIC = 260.686 BIC = 272.3485 MASE = 0.6099768 
## p = 4 q = 3 AIC = 262.1327 BIC = 275.091 MASE = 0.601955 
## p = 4 q = 4 AIC = 263.7563 BIC = 278.0105 MASE = 0.6017331 
## p = 4 q = 5 AIC = 257.7752 BIC = 272.8724 MASE = 0.6175106 
## p = 5 q = 1 AIC = 247.6247 BIC = 258.9476 MASE = 0.5835814 
## p = 5 q = 2 AIC = 249.3037 BIC = 261.8846 MASE = 0.5755967 
## p = 5 q = 3 AIC = 247.4178 BIC = 261.2568 MASE = 0.5484 
## p = 5 q = 4 AIC = 248.4355 BIC = 263.5326 MASE = 0.5256786 
## p = 5 q = 5 AIC = 250.0315 BIC = 266.3868 MASE = 0.5306522

According to the output of the autoregressive distributed lag models, p=5 and q=4 gives the lowest MASE (0.5256786), with AIC = 248.4355 and BIC = 263.5326. It does not have the lowest AIC (247.4178 at p=5, q=3) or BIC (258.9476 at p=5, q=1), so the selection is based on MASE. As a result, we select p=5 and q=4.
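Because the three criteria need not agree, it can help to store the grid results in a data frame and look up the best row per criterion instead of reading the printed lines. The sketch below does this on simulated data with a hand-rolled ARDL fit; `fit_ardl` and `mase_in` are hypothetical helpers, and the in-sample MASE defined here may differ from the one returned by `MASE()` in dLagM.

```r
set.seed(11)
n <- 31
x <- rnorm(n)
y <- 300 + 8 * x + rnorm(n, sd = 20)   # simulated response

# Hypothetical helper: ARDL with x lags 0..p and y lags 1..q, fitted by lm()
fit_ardl <- function(x, y, p, q) {
  m <- max(p, q)
  idx <- (m + 1):length(y)
  X <- sapply(0:p, function(s) x[idx - s])
  Y <- sapply(1:q, function(s) y[idx - s])
  lm(y[idx] ~ X + Y)
}

# Hypothetical in-sample MASE: mean |residual| over mean |first difference of y|
mase_in <- function(fit, y) mean(abs(residuals(fit))) / mean(abs(diff(y)))

grid <- expand.grid(p = 1:3, q = 1:3)
grid$AIC <- NA_real_
grid$BIC <- NA_real_
grid$MASE <- NA_real_
for (r in seq_len(nrow(grid))) {
  fit <- fit_ardl(x, y, grid$p[r], grid$q[r])
  grid$AIC[r] <- AIC(fit)
  grid$BIC[r] <- BIC(fit)
  grid$MASE[r] <- mase_in(fit, y)
}

# Row index of the minimum for each criterion; they may point to different models
best <- sapply(grid[c("AIC", "BIC", "MASE")], which.min)
grid[best, ]
```

Fitting each model inside the loop and reading all three statistics from that same `fit` object also avoids mixing statistics from different models.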

  • Fitting an autoregressive distributed lag model for Rainfall with respect to dependent variable FFD.
ardldlm_t2_54 = ardlDlm(x = as.vector(ffd_rainfall.ts), y = as.vector(ffd_FFD.ts),p = 5, q =4)
summary(ardldlm_t2_54)
## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -27.12 -15.50  -0.65  13.13  35.40 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept) 120.7626    87.1334   1.386   0.1860  
## X.t           1.4092    15.0160   0.094   0.9265  
## X.1          24.4971    15.7233   1.558   0.1401  
## X.2         -13.2575    16.1170  -0.823   0.4236  
## X.3          27.1489    13.2278   2.052   0.0580 .
## X.4         -23.8080    15.4248  -1.543   0.1435  
## X.5          41.6349    16.3506   2.546   0.0224 *
## Y.1           0.2037     0.2245   0.907   0.3785  
## Y.2          -0.1554     0.3001  -0.518   0.6120  
## Y.3          -0.4561     0.2707  -1.685   0.1127  
## Y.4           0.1863     0.2452   0.760   0.4591  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 23.86 on 15 degrees of freedom
## Multiple R-squared:  0.4339, Adjusted R-squared:  0.05646 
## F-statistic:  1.15 on 10 and 15 DF,  p-value: 0.3911

In the autoregressive distributed lag model above with p=5 and q=4, only the X.5 coefficient is significant at the 5% level. The adjusted R-squared is 0.05646, so the model explains only about 5.646 percent of the variability in FFD. The overall F-test has a p-value of 0.3911, which is greater than 0.05, so the model as a whole is not statistically significant.

checkresiduals(ardldlm_t2_54$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 14
## 
## data:  Residuals
## LM test = 22.044, df = 14, p-value = 0.07772
shapiro.test(residuals(ardldlm_t2_54$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ardldlm_t2_54$model)
## W = 0.95497, p-value = 0.3022

The residual graphs for the above model are shown in Figure 38:

  • The time series plot of the residuals shows no obvious trend.
  • According to the ACF plot, the residuals do not have any significant serial correlation.
  • Since the p-value (0.07772) is greater than 0.05, the Breusch-Godfrey test does not reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals reveals no clear violation of the normality assumption.
  • Since the Shapiro-Wilk p-value (0.3022) is > 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_ardldlm_t2_54=vif(ardldlm_t2_54$model)
vif_ardldlm_t2_54
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(X.t, 5) L(y.t, 1) L(y.t, 2) 
##  1.640220  1.796402  1.873907  1.245234  1.577406  1.690184  1.302004  2.138465 
## L(y.t, 3) L(y.t, 4) 
##  1.740638  1.488317
vif_ardldlm_t2_54>10
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(X.t, 5) L(y.t, 1) L(y.t, 2) 
##     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE 
## L(y.t, 3) L(y.t, 4) 
##     FALSE     FALSE

According to the VIF values, the above model does not have a multicollinearity problem.

Model Fitting - FFD vs Radiation

Fitting finite distributed lag models

To determine the model’s finite lag length, we build a loop that calculates accuracy metrics such as AIC/BIC and MASE for models with varying lag lengths and selects the model with the lowest values.

for ( i in 1:5){
  model1111.11 = dlm(x = as.vector(ffd_radiation.ts), y = as.vector(ffd_FFD.ts), q = i )
  cat("q = ", i, "AIC = ", AIC(model1111.11$model), "BIC = ", BIC(model1111.11$model), "MASE =", MASE(model1111.11)$MASE, "\n")
}
## q =  1 AIC =  281.571 BIC =  287.1758 MASE = 0.6579604 
## q =  2 AIC =  273.2458 BIC =  280.0823 MASE = 0.6672639 
## q =  3 AIC =  264.666 BIC =  272.6592 MASE = 0.6164754 
## q =  4 AIC =  257.9491 BIC =  267.02 MASE = 0.5968786 
## q =  5 AIC =  249.3748 BIC =  259.4396 MASE = 0.5870893

According to the output of the finite distributed lag models, q=5 has the lowest MASE, AIC, and BIC values (MASE = 0.5870893, AIC = 249.3748, BIC = 259.4396). As a result, we select a lag length of q=5.
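One caveat when comparing AIC or BIC across lag lengths is that a model with q lags is fitted on n - q observations, so each model in the loop uses a different sample. The sketch below, on simulated data with illustrative names, contrasts the AIC of each finite DLM on its own sample with the AIC obtained when every model is fitted on the same last n - q_max observations.

```r
set.seed(5)
n <- 31
x <- rnorm(n)
y <- 300 + 8 * x + rnorm(n, sd = 20)   # simulated response
q_max <- 5

res <- data.frame(q = 1:q_max, n_own = NA_integer_,
                  AIC_own = NA_real_, AIC_common = NA_real_)

L <- embed(x, q_max + 1)               # lag matrix for the common sample
y_common <- y[(q_max + 1):n]

for (q in 1:q_max) {
  own <- embed(x, q + 1)               # model-specific sample of n - q rows
  fit_own <- lm(y[(q + 1):n] ~ own)
  fit_common <- lm(y_common ~ L[, 1:(q + 1), drop = FALSE])
  res$n_own[q] <- nobs(fit_own)
  res$AIC_own[q] <- AIC(fit_own)
  res$AIC_common[q] <- AIC(fit_common)
}
res
```

The `AIC_own` column tends to fall as q increases simply because fewer observations enter the likelihood, whereas `AIC_common` compares the models on equal footing; MASE is less affected because it is an average rather than a sum.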

  • Fitting a finite DLM with a lag of 5 and running the diagnostic checks for Radiation with respect to dependent variable FFD
finite_DLM_3 <- dlm(x = as.vector(ffd_radiation.ts), y = as.vector(ffd_FFD.ts), q = 5)
summary(finite_DLM_3)
## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -47.450 -11.001   0.942  12.898  42.567 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   28.090    308.725   0.091    0.928
## x.t            7.289     14.711   0.496    0.626
## x.1            4.250     17.044   0.249    0.806
## x.2          -28.378     16.633  -1.706    0.104
## x.3           11.347     16.478   0.689    0.499
## x.4           -3.407     17.532  -0.194    0.848
## x.5           21.205     15.250   1.390    0.180
## 
## Residual standard error: 25.18 on 19 degrees of freedom
## Multiple R-squared:  0.2016, Adjusted R-squared:  -0.05056 
## F-statistic: 0.7995 on 6 and 19 DF,  p-value: 0.5822
## 
## AIC and BIC values for the model:
##        AIC      BIC
## 1 249.3748 259.4396

In the finite distributed lag model above with q=5, none of the lag weights of the predictor series is statistically significant at the 5% level. The adjusted R-squared is -0.05056; a negative value means that, after adjusting for the number of lags, the model explains none of the variability in FFD. The overall F-test has a p-value of 0.5822, which is greater than 0.05, so the model as a whole is not statistically significant.

checkresiduals(finite_DLM_3$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 12.679, df = 10, p-value = 0.2422
shapiro.test(residuals(finite_DLM_3$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(finite_DLM_3$model)
## W = 0.9781, p-value = 0.8312

The residual graphs for the above model are shown in Figure 39:

  • The time series plot of the residuals shows no obvious trend.
  • According to the ACF plot, the residuals do not have any significant serial correlation.
  • The histogram of the residuals reveals no clear violation of the normality assumption.
  • Since the p-value (0.2422) is greater than 0.05, the Breusch-Godfrey test does not reject the null hypothesis of no serial correlation at the 5% level of significance.
  • Since the Shapiro-Wilk p-value (0.8312) is > 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_dlm_3 =vif(finite_DLM_3$model)
vif_dlm_3
##      x.t      x.1      x.2      x.3      x.4      x.5 
## 1.627610 2.198355 2.087422 2.036809 2.138937 1.599224
vif_dlm_3 >10
##   x.t   x.1   x.2   x.3   x.4   x.5 
## FALSE FALSE FALSE FALSE FALSE FALSE
  • According to the VIF values, the above model with q=5 does not have a multicollinearity problem.

Fitting polynomial distributed lag models

for(i in 1:10){
        for(j in 1:5){
                model_2222.1 <- polyDlm(x = as.vector(ffd_radiation.ts),y = as.vector(ffd_FFD.ts), q = i, k = j, show.beta = FALSE)
                cat("q:",i,"k:",j, "AIC:",AIC(model_2222.1$model), "BIC:", BIC(model_2222.1$model),"MASE =", MASE(model_2222.1)$MASE, "\n")
        }
}
## q: 1 k: 1 AIC: 281.571 BIC: 287.1758 MASE = 0.6579604 
## q: 1 k: 2 AIC: 281.571 BIC: 287.1758 MASE = 0.6579604 
## q: 1 k: 3 AIC: 281.571 BIC: 287.1758 MASE = 0.6579604 
## q: 1 k: 4 AIC: 281.571 BIC: 287.1758 MASE = 0.6579604 
## q: 1 k: 5 AIC: 281.571 BIC: 287.1758 MASE = 0.6579604 
## q: 2 k: 1 AIC: 271.2624 BIC: 276.7316 MASE = 0.6659282 
## q: 2 k: 2 AIC: 273.2458 BIC: 280.0823 MASE = 0.6672639 
## q: 2 k: 3 AIC: 273.2458 BIC: 280.0823 MASE = 0.6672639 
## q: 2 k: 4 AIC: 273.2458 BIC: 280.0823 MASE = 0.6672639 
## q: 2 k: 5 AIC: 273.2458 BIC: 280.0823 MASE = 0.6672639 
## q: 3 k: 1 AIC: 263.6777 BIC: 269.0065 MASE = 0.6659083 
## q: 3 k: 2 AIC: 263.5785 BIC: 270.2395 MASE = 0.622861 
## q: 3 k: 3 AIC: 264.666 BIC: 272.6592 MASE = 0.6164754 
## q: 3 k: 4 AIC: 264.666 BIC: 272.6592 MASE = 0.6164754 
## q: 3 k: 5 AIC: 264.666 BIC: 272.6592 MASE = 0.6164754 
## q: 4 k: 1 AIC: 255.2221 BIC: 260.4055 MASE = 0.6591452 
## q: 4 k: 2 AIC: 254.5356 BIC: 261.0148 MASE = 0.6027035 
## q: 4 k: 3 AIC: 256.3555 BIC: 264.1305 MASE = 0.600258 
## q: 4 k: 4 AIC: 257.9491 BIC: 267.02 MASE = 0.5968786 
## q: 4 k: 5 AIC: 257.9491 BIC: 267.02 MASE = 0.5968786 
## q: 5 k: 1 AIC: 246.5758 BIC: 251.6082 MASE = 0.6702734 
## q: 5 k: 2 AIC: 246.071 BIC: 252.3614 MASE = 0.6504138 
## q: 5 k: 3 AIC: 246.9317 BIC: 254.4803 MASE = 0.6224155 
## q: 5 k: 4 AIC: 248.9085 BIC: 257.7152 MASE = 0.6217626 
## q: 5 k: 5 AIC: 249.3748 BIC: 259.4396 MASE = 0.5870893 
## q: 6 k: 1 AIC: 237.3563 BIC: 242.2318 MASE = 0.6477746 
## q: 6 k: 2 AIC: 239.3461 BIC: 245.4405 MASE = 0.6488056 
## q: 6 k: 3 AIC: 234.1991 BIC: 241.5124 MASE = 0.5333237 
## q: 6 k: 4 AIC: 236.1536 BIC: 244.6857 MASE = 0.5326018 
## q: 6 k: 5 AIC: 238.0197 BIC: 247.7707 MASE = 0.5327327 
## q: 7 k: 1 AIC: 228.8368 BIC: 233.549 MASE = 0.6448329 
## q: 7 k: 2 AIC: 230.74 BIC: 236.6303 MASE = 0.6443999 
## q: 7 k: 3 AIC: 232.1815 BIC: 239.2498 MASE = 0.6441366 
## q: 7 k: 4 AIC: 230.2716 BIC: 238.518 MASE = 0.5494545 
## q: 7 k: 5 AIC: 229.972 BIC: 239.3964 MASE = 0.513034 
## q: 8 k: 1 AIC: 220.3855 BIC: 224.9275 MASE = 0.6620282 
## q: 8 k: 2 AIC: 222.2164 BIC: 227.8938 MASE = 0.6629663 
## q: 8 k: 3 AIC: 224.0832 BIC: 230.8962 MASE = 0.6607326 
## q: 8 k: 4 AIC: 222.9641 BIC: 230.9125 MASE = 0.5951398 
## q: 8 k: 5 AIC: 224.8775 BIC: 233.9614 MASE = 0.6014529 
## q: 9 k: 1 AIC: 210.7384 BIC: 215.1025 MASE = 0.668574 
## q: 9 k: 2 AIC: 212.7091 BIC: 218.1644 MASE = 0.6697067 
## q: 9 k: 3 AIC: 214.4969 BIC: 221.0432 MASE = 0.6577838 
## q: 9 k: 4 AIC: 216.095 BIC: 223.7323 MASE = 0.6489701 
## q: 9 k: 5 AIC: 216.6825 BIC: 225.4108 MASE = 0.6261777 
## q: 10 k: 1 AIC: 201.5335 BIC: 205.7116 MASE = 0.6539994 
## q: 10 k: 2 AIC: 203.4282 BIC: 208.6509 MASE = 0.6463697 
## q: 10 k: 3 AIC: 205.4051 BIC: 211.6722 MASE = 0.6398524 
## q: 10 k: 4 AIC: 207.3431 BIC: 214.6547 MASE = 0.6386645 
## q: 10 k: 5 AIC: 208.4757 BIC: 216.8318 MASE = 0.6288343

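According to the output of the polynomial distributed lag models, q=7 and k=5 gives the lowest MASE (0.513034), with AIC = 229.972 and BIC = 239.3964. Lower AIC and BIC values appear for q between 8 and 10, but each additional lag drops one observation from the fit, so these criteria are not directly comparable across different values of q; the choice is therefore based on MASE. As a result, we select q=7 and k=5.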

  • Fitting a polynomial DLM for Radiation with respect to dependent variable FFD
poly_DLM_3 <- polyDlm(x = as.vector(ffd_radiation.ts), y = as.vector(ffd_FFD.ts), q = 7, k = 5)
## Estimates and t-tests for beta coefficients:
##        Estimate Std. Error t value P(>|t|)
## beta.0    13.00      12.70   1.030  0.3190
## beta.1   -14.30      12.10  -1.180  0.2540
## beta.2   -17.60       9.51  -1.850  0.0811
## beta.3     2.96       9.08   0.326  0.7490
## beta.4    19.80       9.93   1.990  0.0630
## beta.5     8.41      10.10   0.833  0.4160
## beta.6   -20.90      13.30  -1.580  0.1330
## beta.7     7.82      14.10   0.553  0.5870
summary(poly_DLM_3)
## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -48.321  -7.795   1.186  11.529  41.543 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 217.9458   354.5240   0.615    0.547
## z.t0         13.0430    12.6987   1.027    0.319
## z.t1        -26.0358    53.1766  -0.490    0.631
## z.t2        -13.8248    55.4053  -0.250    0.806
## z.t3         16.0651    21.4663   0.748    0.464
## z.t4         -3.7592     3.4759  -1.082    0.295
## z.t5          0.2600     0.1989   1.307    0.208
## 
## Residual standard error: 24.81 on 17 degrees of freedom
## Multiple R-squared:  0.2634, Adjusted R-squared:  0.00336 
## F-statistic: 1.013 on 6 and 17 DF,  p-value: 0.4494

In the polynomial distributed lag model above with q=7 and k=5, no terms are significant at the 5% level. The adjusted R-squared is 0.00336, so the model explains only about 0.336 percent of the variability in FFD. The overall F-test has a p-value of 0.4494, which is greater than 0.05, so the model as a whole is not statistically significant.

checkresiduals(poly_DLM_3$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 17.887, df = 10, p-value = 0.0569
shapiro.test(residuals(poly_DLM_3$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(poly_DLM_3$model)
## W = 0.96705, p-value = 0.5949

The residual graphs for the above model are shown in Figure 40:

  • The time series plot of the residuals shows no obvious trend.
  • The ACF plot has only one significant lag, indicating slight autocorrelation in the residuals.
  • Since the p-value (0.0569) is slightly greater than 0.05, the Breusch-Godfrey test does not reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals reveals no clear violation of the normality assumption.
  • Since the Shapiro-Wilk p-value (0.5949) is > 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_poly_3 =vif(poly_DLM_3$model)
vif_poly_3
##         z.t0         z.t1         z.t2         z.t3         z.t4         z.t5 
## 1.979902e+01 5.890934e+03 1.922780e+05 1.029935e+06 1.055280e+06 1.430625e+05
vif_poly_3 >10
## z.t0 z.t1 z.t2 z.t3 z.t4 z.t5 
## TRUE TRUE TRUE TRUE TRUE TRUE
  • According to the VIF values, the above model with q=7 and k=5 has a multicollinearity problem.

Fitting Koyck model

  • Fitting a Koyck model for Radiation with respect to dependent variable FFD
Koyck_model_3 = koyckDlm(x = as.vector(ffd_radiation.ts) , y = as.vector(ffd_FFD.ts))
summary(Koyck_model_3$model, diagnostics=TRUE)
## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -55.229 -19.662   3.956  16.232  54.756 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 418.31843  384.12678   1.089    0.286
## Y.1          -0.03254    0.20729  -0.157    0.876
## X.t         -13.87153   25.49529  -0.544    0.591
## 
## Diagnostic tests:
##                  df1 df2 statistic p-value  
## Weak instruments   1  27     7.199  0.0123 *
## Wu-Hausman         1  26     0.538  0.4698  
## Sargan             0  NA        NA      NA  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 25.54 on 27 degrees of freedom
## Multiple R-Squared: -0.07436,    Adjusted R-squared: -0.1539 
## Wald test: 0.1486 on 2 and 27 DF,  p-value: 0.8626
  • In the above Koyck model, no terms are significant at the 5% level. The adjusted R-squared is -0.1539; a negative value means the model fits worse than simply using the mean of FFD, which can happen because the model is estimated with instrumental variables rather than ordinary least squares. The Wald test for the whole model has a p-value of 0.8626, which is greater than 0.05, so the model is not statistically significant.
  • From the Wu-Hausman test (p-value 0.4698, greater than 0.05) we find no significant evidence of correlation between the explanatory variable and the error term at the 5% level.
checkresiduals(Koyck_model_3$model)

shapiro.test(residuals(Koyck_model_3$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(Koyck_model_3$model)
## W = 0.98748, p-value = 0.9716

The residual graphs for the above model are shown in Figure 41:

  • The time series plot of the residuals shows no obvious trend.
  • The ACF plot has only one significant lag, indicating some autocorrelation in the residuals.
  • The histogram of the residuals reveals no clear violation of the normality assumption.
  • Since the Shapiro-Wilk p-value (0.9716) is > 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_koyck_3=vif(Koyck_model_3$model)
vif_koyck_3
##      Y.1      X.t 
## 1.055257 1.055257
vif_koyck_3>10
##   Y.1   X.t 
## FALSE FALSE
  • According to the VIF values, the above model does not have a multicollinearity problem.

Fitting autoregressive distributed lag models

Autoregressive distributed lag (ARDL) models are the last model type derived from the time series regression technique. To choose the orders of ARDL(p,q), we build a loop that fits ARDL models for a range of predictor lag lengths (p) and autoregressive orders (q) and calculates accuracy metrics such as AIC/BIC and MASE.

for (i in 1:5){
  for(j in 1:5){
    model_333 = ardlDlm(x = as.vector(ffd_radiation.ts), y = as.vector(ffd_FFD.ts), p = i , q = j)
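    # BIC is taken from model_333; the original call used model_33 (the rainfall fit), which is why the BIC column printed below is constant
    cat("p =", i, "q =", j, "AIC =", AIC(model_333$model), "BIC =", BIC(model_333$model), "MASE =", MASE(model_333)$MASE, "\n")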
 }
}
## p = 1 q = 1 AIC = 283.5658 BIC = 266.3868 MASE = 0.6599906 
## p = 1 q = 2 AIC = 276.5262 BIC = 266.3868 MASE = 0.6718305 
## p = 1 q = 3 AIC = 269.1591 BIC = 266.3868 MASE = 0.6531487 
## p = 1 q = 4 AIC = 262.6904 BIC = 266.3868 MASE = 0.6386406 
## p = 1 q = 5 AIC = 255.6012 BIC = 266.3868 MASE = 0.6196376 
## p = 2 q = 1 AIC = 275.24 BIC = 266.3868 MASE = 0.6688985 
## p = 2 q = 2 AIC = 277.1882 BIC = 266.3868 MASE = 0.6661641 
## p = 2 q = 3 AIC = 269.5463 BIC = 266.3868 MASE = 0.646406 
## p = 2 q = 4 AIC = 262.9001 BIC = 266.3868 MASE = 0.6221685 
## p = 2 q = 5 AIC = 256.4326 BIC = 266.3868 MASE = 0.6306393 
## p = 3 q = 1 AIC = 266.5357 BIC = 266.3868 MASE = 0.6144131 
## p = 3 q = 2 AIC = 268.3699 BIC = 266.3868 MASE = 0.6149087 
## p = 3 q = 3 AIC = 270.1803 BIC = 266.3868 MASE = 0.6220337 
## p = 3 q = 4 AIC = 263.6351 BIC = 266.3868 MASE = 0.6016041 
## p = 3 q = 5 AIC = 257.3509 BIC = 266.3868 MASE = 0.6164068 
## p = 4 q = 1 AIC = 259.9163 BIC = 266.3868 MASE = 0.5983539 
## p = 4 q = 2 AIC = 261.3427 BIC = 266.3868 MASE = 0.5933603 
## p = 4 q = 3 AIC = 263.3127 BIC = 266.3868 MASE = 0.5959315 
## p = 4 q = 4 AIC = 265.2261 BIC = 266.3868 MASE = 0.5907476 
## p = 4 q = 5 AIC = 258.7707 BIC = 266.3868 MASE = 0.6045879 
## p = 5 q = 1 AIC = 251.3746 BIC = 266.3868 MASE = 0.587024 
## p = 5 q = 2 AIC = 252.9746 BIC = 266.3868 MASE = 0.5756885 
## p = 5 q = 3 AIC = 254.7 BIC = 266.3868 MASE = 0.5539405 
## p = 5 q = 4 AIC = 255.7433 BIC = 266.3868 MASE = 0.5278081 
## p = 5 q = 5 AIC = 257.5483 BIC = 266.3868 MASE = 0.5314267

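According to the output of the autoregressive distributed lag models, p=5 and q=4 gives the lowest MASE (0.5278081), with AIC = 255.7433; the lowest AIC (251.3746) occurs at p=5, q=1. The BIC column in the printed output is constant at 266.3868, which is the BIC of the rainfall ARDL(5,5) fit (model_33) rather than the BIC of each radiation model, so it carries no information here. The selection is therefore based on MASE, and we select p=5 and q=4.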

  • Fitting an autoregressive distributed lag model for Radiation with respect to dependent variable FFD.
ardldlm_t2_54_radiation = ardlDlm(x = as.vector(ffd_radiation.ts), y = as.vector(ffd_FFD.ts),p = 5, q =4)
summary(ardldlm_t2_54_radiation)
## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -47.589  -7.045   0.804  12.449  32.637 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.903e+02  4.908e+02  -0.591    0.563
## X.t          1.353e+01  1.929e+01   0.702    0.494
## X.1          9.354e+00  2.091e+01   0.447    0.661
## X.2         -3.163e+01  1.882e+01  -1.681    0.114
## X.3          1.114e+01  1.984e+01   0.561    0.583
## X.4         -4.193e+00  2.204e+01  -0.190    0.852
## X.5          2.912e+01  2.014e+01   1.446    0.169
## Y.1         -8.686e-03  2.456e-01  -0.035    0.972
## Y.2          1.181e-01  2.663e-01   0.443    0.664
## Y.3          1.556e-01  3.111e-01   0.500    0.624
## Y.4          2.052e-01  2.737e-01   0.750    0.465
## 
## Residual standard error: 27.46 on 15 degrees of freedom
## Multiple R-squared:  0.2501, Adjusted R-squared:  -0.2498 
## F-statistic: 0.5004 on 10 and 15 DF,  p-value: 0.8644

In the fitted autoregressive distributed lag model with p = 5 and q = 4, none of the coefficients is statistically significant at the 5% level. The adjusted R-squared is -0.2498; a negative value means that, after penalising for the number of regressors, the model fits worse than an intercept-only model. The overall F-test has a p-value of 0.8644, which is greater than 0.05, so the model as a whole is not statistically significant.
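
The negative adjusted R-squared can be reproduced from the summary output. The sketch below assumes the usual formula, with n = 26 observations (Start = 6, End = 31), k = 10 regressors and the reported R-squared of 0.2501; the function name `adj_r2` is ours.

```r
# Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - k - 1)
adj_r2 <- function(r2, n, k) {
  1 - (1 - r2) * (n - 1) / (n - k - 1)
}

# Values read from the summary: R^2 = 0.2501, 26 observations, 10 regressors
adj <- adj_r2(r2 = 0.2501, n = 26, k = 10)
print(round(adj, 4))  # -0.2498, matching the summary output
```

With 10 regressors and only 26 observations, the penalty factor (n - 1) / (n - k - 1) = 25 / 15 is large enough to push a small R-squared below zero.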

checkresiduals(ardldlm_t2_54_radiation$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 14
## 
## data:  Residuals
## LM test = 25.334, df = 14, p-value = 0.03141
shapiro.test(residuals(ardldlm_t2_54_radiation$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ardldlm_t2_54_radiation$model)
## W = 0.92845, p-value = 0.07118

The residual graphs for the above model are shown in Figure 42:

  • The time series plot of the residuals shows no obvious trend; the residuals appear to fluctuate randomly.
  • According to the ACF plot, the residuals do not show significant serial correlation at individual lags.
  • However, since the p-value (0.03141) is less than 0.05, the Breusch-Godfrey test rejects the null hypothesis of no serial correlation at the 5% level of significance, so some serial correlation remains in the residuals.
  • The histogram of the residuals does not suggest a violation of the normality assumption.
  • Since the Shapiro-Wilk p-value (0.07118) is greater than 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.
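
To make the Breusch-Godfrey statistic concrete, the following is a minimal base R sketch of the LM form of the test on simulated data (not the FFD series). It assumes the common construction in which the residuals are regressed on the original regressors plus their own lags, with pre-sample lags set to zero, and the statistic is n times the R-squared of that auxiliary regression. The helper name `bg_lm` is hypothetical.

```r
set.seed(1)
n <- 60
x <- rnorm(n)
# Simulated AR(1) errors, for illustration only
e <- as.numeric(arima.sim(list(ar = 0.6), n = n))
y <- 1 + 2 * x + e

fit <- lm(y ~ x)
u <- residuals(fit)

bg_lm <- function(u, X, order) {
  n <- length(u)
  # Lagged residuals; the first j values of lag j are padded with zeros
  lags <- sapply(seq_len(order), function(j) c(rep(0, j), u[seq_len(n - j)]))
  aux <- lm(u ~ X + lags)                  # auxiliary regression
  stat <- n * summary(aux)$r.squared       # LM statistic
  c(statistic = stat, df = order,
    p.value = pchisq(stat, df = order, lower.tail = FALSE))
}

res <- bg_lm(u, X = x, order = 2)
print(res)
```

A small p-value here, as in the output above, is evidence against the null hypothesis of no serial correlation up to the chosen order.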

Now checking the multicollinearity issue

vif_ardldlm_t2_54_radiation=vif(ardldlm_t2_54_radiation$model)
vif_ardldlm_t2_54_radiation
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(X.t, 5) L(y.t, 1) L(y.t, 2) 
##  2.351222  2.781746  2.246503  2.482689  2.840642  2.344362  1.176489  1.271138 
## L(y.t, 3) L(y.t, 4) 
##  1.735659  1.399882
vif_ardldlm_t2_54_radiation>10
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(X.t, 5) L(y.t, 1) L(y.t, 2) 
##     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE 
## L(y.t, 3) L(y.t, 4) 
##     FALSE     FALSE

According to the VIF values, the above model does not have the multicollinearity problem.

Model Fitting - FFD vs Humidity

Fitting finite distributed lag models

To determine the finite lag length, we run a loop that computes accuracy measures (AIC, BIC and MASE) for models with lag lengths from 1 to 5, and then choose the lag length from these values.

for ( i in 1:5){
  model111.111 = dlm(x = as.vector(ffd_humidity.ts), y = as.vector(ffd_FFD.ts), q = i )
  cat("q = ", i, "AIC = ", AIC(model111.111$model), "BIC = ", BIC(model111.111$model), "MASE =", MASE(model111.111)$MASE, "\n")
}
## q =  1 AIC =  281.6547 BIC =  287.2595 MASE = 0.6748273 
## q =  2 AIC =  273.1762 BIC =  280.0127 MASE = 0.6669837 
## q =  3 AIC =  260.6544 BIC =  268.6476 MASE = 0.6124744 
## q =  4 AIC =  254.6378 BIC =  263.7087 MASE = 0.6133102 
## q =  5 AIC =  248.6573 BIC =  258.722 MASE = 0.624735

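According to the output of the finite distributed lag models, q = 3 has the lowest MASE (0.6124744), with AIC = 260.6544 and BIC = 268.6476. AIC and BIC keep decreasing up to q = 5, but each additional lag also removes one observation from the estimation sample, so these values are not computed on the same data. Selecting on MASE, we use a lag length of (q=3).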
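
Because the three criteria need not agree, it can help to store the loop results and locate the minimiser of each criterion separately. The sketch below does this in base R on simulated data. The `mase` helper is an assumed in-sample definition (mean absolute residual scaled by the mean absolute naive one-step error) and may differ in detail from the scaling used by dLagM.

```r
set.seed(42)
n <- 40
x <- rnorm(n)
y <- 5 + 0.8 * x + 0.5 * c(0, x[-n]) + rnorm(n)   # simulated, illustrative only

# Assumed in-sample MASE: MAE of residuals / MAE of naive one-step errors
mase <- function(fit, y_used) {
  mean(abs(residuals(fit))) / mean(abs(diff(y_used)))
}

scores <- do.call(rbind, lapply(1:5, function(q) {
  lagged <- embed(x, q + 1)        # columns: x_t, x_{t-1}, ..., x_{t-q}
  y_used <- y[(q + 1):n]           # response aligned with the lagged rows
  fit <- lm(y_used ~ lagged)
  data.frame(q = q, AIC = AIC(fit), BIC = BIC(fit), MASE = mase(fit, y_used))
}))
print(scores)

# Lag length preferred by each criterion (these may differ)
best <- c(AIC  = scores$q[which.min(scores$AIC)],
          BIC  = scores$q[which.min(scores$BIC)],
          MASE = scores$q[which.min(scores$MASE)])
print(best)
```

Reporting `best` alongside the table makes it explicit which criterion drives the final choice.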

  • Fitting a finite DLM with a lag length of 3 and carrying out the diagnostic checks for Humidity with respect to the dependent variable FFD
finite_DLM_4 <- dlm(x = as.vector(ffd_humidity.ts), y = as.vector(ffd_FFD.ts), q = 3)
summary(finite_DLM_4)
## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -51.92 -14.42   1.82  17.90  36.79 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept) 2664.399   1066.751   2.498   0.0201 *
## x.t           -4.550      5.707  -0.797   0.4334  
## x.1           -5.365      5.720  -0.938   0.3581  
## x.2           -3.104      5.808  -0.534   0.5981  
## x.3          -12.942      5.635  -2.297   0.0311 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 22.64 on 23 degrees of freedom
## Multiple R-squared:  0.2286, Adjusted R-squared:  0.09449 
## F-statistic: 1.704 on 4 and 23 DF,  p-value: 0.1834
## 
## AIC and BIC values for the model:
##        AIC      BIC
## 1 260.6544 268.6476

In the fitted finite distributed lag model with q = 3, only the lag-3 weight of the predictor series (x.3, p-value = 0.0311) is statistically significant at the 5% level; the other lag weights are not. The adjusted R-squared is 0.09449, indicating that the model explains only about 9.4 percent of the variability in FFD. The overall F-test has a p-value of 0.1834, which is greater than 0.05, so the model as a whole is not statistically significant.

checkresiduals(finite_DLM_4$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 8
## 
## data:  Residuals
## LM test = 9.6632, df = 8, p-value = 0.2895
shapiro.test(residuals(finite_DLM_4$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(finite_DLM_4$model)
## W = 0.97217, p-value = 0.6398

The residual graphs for the above model are shown in Figure 43:

  • The time series plot of the residuals shows no obvious trend; the residuals appear to fluctuate randomly.
  • According to the ACF plot, the residuals do not show significant serial correlation.
  • Since the p-value (0.2895) is greater than 0.05, the Breusch-Godfrey test does not reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals does not suggest a violation of the normality assumption.
  • Since the Shapiro-Wilk p-value (0.6398) is greater than 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_dlm_4 =vif(finite_DLM_4$model)
vif_dlm_4
##      x.t      x.1      x.2      x.3 
## 1.130529 1.141729 1.148168 1.101532
vif_dlm_4 >10
##   x.t   x.1   x.2   x.3 
## FALSE FALSE FALSE FALSE
  • According to the VIF values, the above model with q=3 does not have a multicollinearity problem.

Fitting polynomial distributed lag models

for(i in 1:5){
        for(j in 1:5){
                model_222.111 <- polyDlm(x = as.vector(ffd_humidity.ts),y = as.vector(ffd_FFD.ts), q = i, k = j, show.beta = FALSE)
                cat("q:",i,"k:",j, "AIC:",AIC(model_222.111$model), "BIC:", BIC(model_222.111$model),"MASE =", MASE(model_222.111)$MASE, "\n")
        }
}
## q: 1 k: 1 AIC: 281.6547 BIC: 287.2595 MASE = 0.6748273 
## q: 1 k: 2 AIC: 281.6547 BIC: 287.2595 MASE = 0.6748273 
## q: 1 k: 3 AIC: 281.6547 BIC: 287.2595 MASE = 0.6748273 
## q: 1 k: 4 AIC: 281.6547 BIC: 287.2595 MASE = 0.6748273 
## q: 1 k: 5 AIC: 281.6547 BIC: 287.2595 MASE = 0.6748273 
## q: 2 k: 1 AIC: 271.3876 BIC: 276.8567 MASE = 0.6732722 
## q: 2 k: 2 AIC: 273.1762 BIC: 280.0127 MASE = 0.6669837 
## q: 2 k: 3 AIC: 273.1762 BIC: 280.0127 MASE = 0.6669837 
## q: 2 k: 4 AIC: 273.1762 BIC: 280.0127 MASE = 0.6669837 
## q: 2 k: 5 AIC: 273.1762 BIC: 280.0127 MASE = 0.6669837 
## q: 3 k: 1 AIC: 257.8141 BIC: 263.1429 MASE = 0.6123806 
## q: 3 k: 2 AIC: 258.9412 BIC: 265.6022 MASE = 0.6054505 
## q: 3 k: 3 AIC: 260.6544 BIC: 268.6476 MASE = 0.6124744 
## q: 3 k: 4 AIC: 260.6544 BIC: 268.6476 MASE = 0.6124744 
## q: 3 k: 5 AIC: 260.6544 BIC: 268.6476 MASE = 0.6124744 
## q: 4 k: 1 AIC: 250.5898 BIC: 255.7732 MASE = 0.6305625 
## q: 4 k: 2 AIC: 252.2204 BIC: 258.6996 MASE = 0.6198356 
## q: 4 k: 3 AIC: 253.2751 BIC: 261.0501 MASE = 0.605609 
## q: 4 k: 4 AIC: 254.6378 BIC: 263.7087 MASE = 0.6133102 
## q: 4 k: 5 AIC: 254.6378 BIC: 263.7087 MASE = 0.6133102 
## q: 5 k: 1 AIC: 243.7885 BIC: 248.8209 MASE = 0.6625942 
## q: 5 k: 2 AIC: 244.2742 BIC: 250.5647 MASE = 0.6376046 
## q: 5 k: 3 AIC: 245.9031 BIC: 253.4517 MASE = 0.6348654 
## q: 5 k: 4 AIC: 247.6505 BIC: 256.4572 MASE = 0.6271509 
## q: 5 k: 5 AIC: 248.6573 BIC: 258.722 MASE = 0.624735

According to the output of the polynomial distributed lag models, the lowest MASE is 0.6054505 at q = 3, k = 2. The model with q = 4 and k = 3 is essentially tied on MASE (0.605609) and has a lower AIC (253.2751) and BIC (261.0501). We therefore use (q=4, k=3).
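
Several rows of the loop output coincide with the finite DLM results (for example q = 3 with k = 3, 4 and 5). This is consistent with the polynomial (Almon) restriction becoming non-binding once the polynomial degree reaches the lag length. The sketch below illustrates this in base R on simulated data, assuming the standard construction z_j = sum over s of s^j x_{t-s}; the helper `almon` is ours and is not the dLagM implementation.

```r
set.seed(7)
n <- 50
q <- 3
x <- rnorm(n)
L <- embed(x, q + 1)                      # columns: x_t, x_{t-1}, ..., x_{t-q}
y <- drop(3 + L %*% c(0.5, 0.4, 0.3, 0.2) + rnorm(nrow(L)))  # simulated response

# Almon variables z_j = sum_s s^j * x_{t-s}, j = 0..k
almon <- function(L, k) {
  s <- 0:(ncol(L) - 1)                    # lag index 0..q
  Z <- sapply(0:k, function(j) L %*% (s^j))
  colnames(Z) <- paste0("z.t", 0:k)
  Z
}

fit_unres <- lm(y ~ L)                    # unrestricted finite DLM
fit_k2    <- lm(y ~ almon(L, 2))          # degree 2: a genuine restriction
fit_kq    <- lm(y ~ almon(L, q))          # degree q: same column space as L

rss <- c(unrestricted = deviance(fit_unres),
         k2 = deviance(fit_k2),
         kq = deviance(fit_kq))
print(rss)
```

The residual sum of squares for degree q equals that of the unrestricted model, while a lower degree can only fit the same or worse.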

  • Fitting a polynomial DLM for Humidity with respect to the dependent variable FFD
poly_DLM_4 <- polyDlm(x = as.vector(ffd_humidity.ts), y = as.vector(ffd_FFD.ts), q = 4, k = 3)
## Estimates and t-tests for beta coefficients:
##        Estimate Std. Error t value P(>|t|)
## beta.0    -5.53       5.71  -0.969  0.3430
## beta.1    -2.68       4.68  -0.572  0.5730
## beta.2    -7.01       3.64  -1.930  0.0664
## beta.3    -9.87       4.77  -2.070  0.0500
## beta.4    -2.57       5.89  -0.436  0.6670
summary(poly_DLM_4)
## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -52.320 -14.791  -1.665  14.734  39.456 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept) 2824.335   1241.308   2.275    0.033 *
## z.t0          -5.529      5.707  -0.969    0.343  
## z.t1           9.337     15.452   0.604    0.552  
## z.t2          -7.931      9.857  -0.805    0.430  
## z.t3           1.445      1.633   0.885    0.386  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 23.37 on 22 degrees of freedom
## Multiple R-squared:  0.2048, Adjusted R-squared:  0.0602 
## F-statistic: 1.416 on 4 and 22 DF,  p-value: 0.2615

In the fitted polynomial distributed lag model with q = 4 and k = 3, none of the polynomial terms is statistically significant at the 5% level. The adjusted R-squared is 0.0602, indicating that the model explains only about 6 percent of the variability in FFD. The overall F-test has a p-value of 0.2615, which is greater than 0.05, so the model as a whole is not statistically significant.

checkresiduals(poly_DLM_4$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 8
## 
## data:  Residuals
## LM test = 10.578, df = 8, p-value = 0.2268
shapiro.test(residuals(poly_DLM_4$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(poly_DLM_4$model)
## W = 0.97495, p-value = 0.735

The residual graphs for the above model are shown in Figure 44:

  • The time series plot of the residuals shows no obvious trend; the residuals appear to fluctuate randomly.
  • According to the ACF plot, the residuals do not show significant serial correlation.
  • The histogram of the residuals does not suggest a violation of the normality assumption.
  • Since the p-value (0.2268) is greater than 0.05, the Breusch-Godfrey test does not reject the null hypothesis of no serial correlation at the 5% level of significance.
  • Since the Shapiro-Wilk p-value (0.735) is greater than 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_poly_4 =vif(poly_DLM_4$model)
vif_poly_4
##        z.t0        z.t1        z.t2        z.t3 
##    4.732047  226.968955 1122.890768  425.878581
vif_poly_4 >10
##  z.t0  z.t1  z.t2  z.t3 
## FALSE  TRUE  TRUE  TRUE
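  • According to the VIF values, the above model with q=4 and k=3 is strongly affected by multicollinearity: the VIFs of z.t1, z.t2 and z.t3 (about 227, 1123 and 426) are far above 10. This is expected to some extent, because the polynomial terms are all built from the same lagged values of the predictor.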
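
The size of these VIFs can be illustrated by computing VIF_j = 1 / (1 - R_j^2) by hand for polynomial-transformed regressors. The sketch below uses a simulated predictor and the same assumed construction z_j = sum over s of s^j x_{t-s}; it is not the humidity data, and `vif_by_hand` is a hypothetical helper.

```r
set.seed(3)
n <- 40
x <- 90 + rnorm(n, sd = 3)               # hypothetical humidity-like series
L <- embed(x, 5)                         # x_t, ..., x_{t-4}  (q = 4)
s <- 0:4
Z <- sapply(0:3, function(j) L %*% (s^j))  # z.t0 ... z.t3  (k = 3)
colnames(Z) <- paste0("z.t", 0:3)

# VIF of column j: regress it on the remaining columns
vif_by_hand <- function(X) {
  sapply(seq_len(ncol(X)), function(j) {
    r2 <- summary(lm(X[, j] ~ X[, -j]))$r.squared
    1 / (1 - r2)
  })
}

v <- setNames(vif_by_hand(Z), colnames(Z))
print(round(v, 1))
print(v > 10)
```

Since z.t1, z.t2 and z.t3 are weighted sums of the same lags with weights s, s^2 and s^3, each is almost a linear combination of the others, which is what drives the VIFs up.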

Fitting Koyck model

  • Fitting a Koyck model for Humidity with respect to the dependent variable FFD
Koyck_model_4 = koyckDlm(x = as.vector(ffd_humidity.ts) , y = as.vector(ffd_FFD.ts))
summary(Koyck_model_4$model, diagnostics=TRUE)
## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -46.787 -14.896  -3.024  15.673  55.019 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1898.18932 4211.29973   0.451    0.656
## Y.1           -0.05904    0.25016  -0.236    0.815
## X.t          -17.72952   44.24103  -0.401    0.692
## 
## Diagnostic tests:
##                  df1 df2 statistic p-value
## Weak instruments   1  27     0.570   0.457
## Wu-Hausman         1  26     0.121   0.730
## Sargan             0  NA        NA      NA
## 
## Residual standard error: 27 on 27 degrees of freedom
## Multiple R-Squared: -0.2016, Adjusted R-squared: -0.2906 
## Wald test: 0.0808 on 2 and 27 DF,  p-value: 0.9226
  • In the above Koyck model, none of the coefficients is statistically significant at the 5% level. The adjusted R-squared is -0.2906; a negative value means that the model fits worse than an intercept-only model. The Wald test for the whole model has a p-value of 0.9226, which is greater than 0.05, so the model is not statistically significant.
  • From the Wu-Hausman test (p-value = 0.730, greater than 0.05) we may conclude that there is no significant correlation between the explanatory variable and the error term at the 5% level. The weak-instruments test (p-value = 0.457) does not reject the null hypothesis of a weak instrument, so this conclusion should be treated with caution.
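
The diagnostics above come from an instrumental-variables fit. As a rough illustration, the sketch below estimates the Koyck-transformed equation y_t = delta + phi * y_{t-1} + beta * x_t + v_t by two-stage least squares in base R, using x_{t-1} as the instrument for y_{t-1}. The choice of instrument and all parameter values are assumptions for this example, and the data are simulated.

```r
set.seed(11)
n <- 200
phi <- 0.5; beta <- 2; delta <- 1        # hypothetical true values
x <- rnorm(n); eps <- rnorm(n)
y <- numeric(n)
y[1] <- delta + beta * x[1] + eps[1]
for (t in 2:n) {
  # Koyck form: the error eps_t - phi * eps_{t-1} is correlated with y_{t-1}
  y[t] <- delta + phi * y[t - 1] + beta * x[t] + eps[t] - phi * eps[t - 1]
}

y_t <- y[2:n]; y_l1 <- y[1:(n - 1)]
x_t <- x[2:n]; x_l1 <- x[1:(n - 1)]

# Stage 1: project the endogenous regressor on the exogenous variables
stage1 <- lm(y_l1 ~ x_t + x_l1)
y_l1_hat <- fitted(stage1)

# Stage 2: replace y_{t-1} by its fitted value
# (point estimates only; the standard errors from this lm are not valid)
stage2 <- lm(y_t ~ y_l1_hat + x_t)
est <- coef(stage2)
print(round(est, 3))
```

When x_{t-1} is only weakly related to y_{t-1}, as the weak-instruments test suggests for the humidity model, the second-stage estimates become unstable, which is consistent with the very large standard errors in the summary.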
checkresiduals(Koyck_model_4$model)

shapiro.test(residuals(Koyck_model_4$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(Koyck_model_4$model)
## W = 0.98176, p-value = 0.8702

The residual graphs for the above model are shown in Figure 45:

  • The time series plot of the residuals shows no obvious trend; the residuals appear to fluctuate randomly.
  • According to the ACF plot, the residuals do not show significant serial correlation.
  • The histogram of the residuals does not suggest a violation of the normality assumption.
  • Since the Shapiro-Wilk p-value (0.8702) is greater than 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_koyck_4=vif(Koyck_model_4$model)
vif_koyck_4
##      Y.1      X.t 
## 1.374145 1.374145
vif_koyck_4>10
##   Y.1   X.t 
## FALSE FALSE
  • According to the VIF values, the above model does not have a multicollinearity problem.

Fitting autoregressive distributed lag models

Autoregressive distributed lag (ARDL) models are the last model type from the time series regression approach. To choose the orders of ARDL(p,q), we run a loop that fits models over a range of lag lengths for the predictor (p) and for the autoregressive part (q) and computes AIC, BIC and MASE for each.

for (i in 1:5){
  for(j in 1:5){
    model_3333 = ardlDlm(x = as.vector(ffd_humidity.ts), y = as.vector(ffd_FFD.ts), p = i , q = j)
    cat("p =", i, "q =", j, "AIC =", AIC(model_3333$model), "BIC =", BIC(model_3333$model), "MASE =", MASE(model_3333)$MASE, "\n")
 }
}
## p = 1 q = 1 AIC = 283.6338 BIC = 290.6398 MASE = 0.6759237 
## p = 1 q = 2 AIC = 276.424 BIC = 284.6278 MASE = 0.685099 
## p = 1 q = 3 AIC = 269.2856 BIC = 278.611 MASE = 0.6723943 
## p = 1 q = 4 AIC = 262.6906 BIC = 273.0573 MASE = 0.6531694 
## p = 1 q = 5 AIC = 255.9194 BIC = 267.2423 MASE = 0.6497544 
## p = 2 q = 1 AIC = 275.1189 BIC = 283.3227 MASE = 0.659539 
## p = 2 q = 2 AIC = 277.1178 BIC = 286.6889 MASE = 0.6591294 
## p = 2 q = 3 AIC = 270.4134 BIC = 281.0711 MASE = 0.6507711 
## p = 2 q = 4 AIC = 263.6474 BIC = 275.3099 MASE = 0.6251524 
## p = 2 q = 5 AIC = 255.9861 BIC = 268.5671 MASE = 0.6008689 
## p = 3 q = 1 AIC = 262.2897 BIC = 271.6151 MASE = 0.60307 
## p = 3 q = 2 AIC = 264.1836 BIC = 274.8412 MASE = 0.6017215 
## p = 3 q = 3 AIC = 265.8767 BIC = 277.8665 MASE = 0.5897134 
## p = 3 q = 4 AIC = 259.8661 BIC = 272.8245 MASE = 0.5855867 
## p = 3 q = 5 AIC = 251.0833 BIC = 264.9224 MASE = 0.5175561 
## p = 4 q = 1 AIC = 255.9681 BIC = 266.3348 MASE = 0.593868 
## p = 4 q = 2 AIC = 257.8014 BIC = 269.4639 MASE = 0.5969128 
## p = 4 q = 3 AIC = 259.3579 BIC = 272.3162 MASE = 0.577192 
## p = 4 q = 4 AIC = 261.3174 BIC = 275.5716 MASE = 0.5752897 
## p = 4 q = 5 AIC = 252.7653 BIC = 267.8625 MASE = 0.5186368 
## p = 5 q = 1 AIC = 249.9992 BIC = 261.3221 MASE = 0.6035524 
## p = 5 q = 2 AIC = 251.7412 BIC = 264.3221 MASE = 0.6031264 
## p = 5 q = 3 AIC = 253.2512 BIC = 267.0902 MASE = 0.5799393 
## p = 5 q = 4 AIC = 255.2458 BIC = 270.343 MASE = 0.5797153 
## p = 5 q = 5 AIC = 254.7429 BIC = 271.0982 MASE = 0.5164104

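According to the output of the autoregressive distributed lag models, the lowest MASE (0.5164104) is obtained at p = 5, q = 5, with AIC = 254.7429 and BIC = 271.0982. Neither AIC nor BIC is lowest for this model (for example, p = 5, q = 1 has AIC = 249.9992 and BIC = 261.3221). Selecting on MASE, we use lag lengths of (p=5, q=5).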
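
When comparing these models it is worth keeping track of how many observations and degrees of freedom each one leaves. The sketch below builds an ARDL(p, q) design in base R with placeholder data of the same length as the series used here (31 observations, as implied by Start = 6, End = 31 for five lags). The helper `ardl_design` is hypothetical and follows the convention of the output above, where p is the number of lags of x and q the number of lags of y.

```r
ardl_design <- function(x, y, p, q) {
  m <- max(p, q)                                     # observations lost at the start
  X <- embed(x, m + 1)[, 1:(p + 1), drop = FALSE]    # x_t, ..., x_{t-p}
  Y <- embed(y, m + 1)                               # y_t, ..., y_{t-m}
  list(y = Y[, 1], X = X, Ylags = Y[, 2:(q + 1), drop = FALSE])
}

set.seed(5)
n <- 31
x <- rnorm(n); y <- rnorm(n)          # placeholder data, illustrative only

d55 <- ardl_design(x, y, p = 5, q = 5)
fit55 <- lm(d55$y ~ d55$X + d55$Ylags)
d54 <- ardl_design(x, y, p = 5, q = 4)
fit54 <- lm(d54$y ~ d54$X + d54$Ylags)

out <- c(n_used = length(d55$y),
         resid_df_55 = df.residual(fit55),
         resid_df_54 = df.residual(fit54))
print(out)
```

With 26 usable observations, ARDL(5,5) estimates 12 coefficients and leaves 14 residual degrees of freedom, and ARDL(5,4) leaves 15, in line with the summaries in this section. Models with fewer lags are fitted on more observations, so their AIC and BIC values are not strictly comparable with these.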

  • Fitting an autoregressive distributed lag model for Humidity with respect to the dependent variable FFD.
ardldlm_t2_55_humidity = ardlDlm(x = as.vector(ffd_humidity.ts), y = as.vector(ffd_FFD.ts),p = 5, q =5)
summary(ardldlm_t2_55_humidity)
## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -54.954 -10.973  -2.066  14.132  34.529 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  3.709e+03  2.245e+03   1.652   0.1207  
## X.t         -6.692e+00  8.149e+00  -0.821   0.4253  
## X.1         -2.487e+00  8.073e+00  -0.308   0.7626  
## X.2         -8.137e+00  7.976e+00  -1.020   0.3249  
## X.3         -1.503e+01  7.637e+00  -1.968   0.0692 .
## X.4         -3.406e+00  8.047e+00  -0.423   0.6785  
## X.5         -9.856e-01  8.969e+00  -0.110   0.9141  
## Y.1         -2.127e-01  2.603e-01  -0.817   0.4276  
## Y.2         -7.884e-02  2.623e-01  -0.301   0.7682  
## Y.3         -1.466e-01  2.560e-01  -0.572   0.5761  
## Y.4         -8.156e-03  2.647e-01  -0.031   0.9759  
## Y.5          3.208e-01  2.697e-01   1.189   0.2541  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 26.83 on 14 degrees of freedom
## Multiple R-squared:  0.3319, Adjusted R-squared:  -0.1931 
## F-statistic: 0.6322 on 11 and 14 DF,  p-value: 0.775

In the fitted autoregressive distributed lag model with p = 5 and q = 5, none of the coefficients is statistically significant at the 5% level (X.3 is significant only at the 10% level). The adjusted R-squared is -0.1931; a negative value means that, after penalising for the number of regressors, the model fits worse than an intercept-only model. The overall F-test has a p-value of 0.775, which is greater than 0.05, so the model as a whole is not statistically significant.

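# Note: unlike the other calls, this one passes the dLagM object itself rather
# than ardldlm_t2_55_humidity$model, so the output below is the residual series
# and no Breusch-Godfrey test is reported.
checkresiduals(ardldlm_t2_55_humidity)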
## Time Series:
## Start = 6 
## End = 31 
## Frequency = 1 
##          6          7          8          9         10         11         12 
##  32.444007  11.755080 -14.174331 -13.621852   1.744286  14.923747 -17.756397 
##         13         14         15         16         17         18         19 
##  17.398633  -4.152907  20.207322 -35.096865 -54.954298  -2.728946 -10.755611 
##         20         21         22         23         24         25         26 
##   3.225217  -6.555465 -14.995335 -11.045613  20.075741   2.313395  -5.522932 
##         27         28         29         30         31 
##  26.673747  10.201792  34.529359  -2.316220  -1.815554

shapiro.test(residuals(ardldlm_t2_55_humidity$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ardldlm_t2_55_humidity$model)
## W = 0.95953, p-value = 0.3823

The residual graphs for the above model are shown in Figure 46:

  • The time series plot of the residuals shows no obvious trend; the residuals appear to fluctuate randomly.
  • According to the ACF plot, the residuals do not show significant serial correlation.
  • The histogram of the residuals does not suggest a violation of the normality assumption.
  • Since the Shapiro-Wilk p-value (0.3823) is greater than 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_ardldlm_t2_55_humidity=vif(ardldlm_t2_55_humidity$model)
vif_ardldlm_t2_55_humidity
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(X.t, 5) L(y.t, 1) L(y.t, 2) 
##  1.617855  1.598036  1.509352  1.396057  1.445190  1.566825  1.384256  1.292572 
## L(y.t, 3) L(y.t, 4) L(y.t, 5) 
##  1.231256  1.371531  1.390214
vif_ardldlm_t2_55_humidity>10
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(X.t, 5) L(y.t, 1) L(y.t, 2) 
##     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE 
## L(y.t, 3) L(y.t, 4) L(y.t, 5) 
##     FALSE     FALSE     FALSE

According to the VIF values, the above model does not have a multicollinearity problem.

  • The accuracy measures (AIC, BIC and MASE) of the models fitted so far are collected in a data frame and sorted by MASE.
model_dlm_t2 <- data.frame(Model=character(),MASE=numeric(),
                           BIC= numeric(),AICC=numeric(),AIC=numeric())
model_dlm_t2 = rbind(model_dlm_t2,cbind(Model="Finite DLM_temperature",
                                               AIC = AIC(finite_DLM_1),
                                              BIC = BIC(finite_DLM_1),
                                              MASE= MASE(finite_DLM_1)
))
## [1] 224.2942
## [1] 236.7846
model_dlm_t2 = rbind(model_dlm_t2,cbind(Model="Polynomial DLM_temperature",
                                               BIC = BIC(poly_DLM_1),
                                               AIC = AIC(poly_DLM_1),
                                              MASE= MASE(poly_DLM_1)
                                              ))
## [1] 229.4388
## [1] 220.3549
model_dlm_t2 = rbind(model_dlm_t2,cbind(Model="Koyck Model temperature",
                                               AIC = AIC(Koyck_model_1),
                                      BIC = BIC(Koyck_model_1),
                                              MASE= MASE(Koyck_model_1)
                                              ))
## [1] 290.7199
## [1] 296.3247
model_dlm_t2 = rbind(model_dlm_t2,cbind(Model="autoregressive_dlm_t2_55_temperature",
                                               AIC = AIC(ardldlm_t2_55),
                                      BIC = BIC(ardldlm_t2_55),
                                              MASE= MASE(ardldlm_t2_55)
                                              ))
## [1] 254.762
## [1] 271.1173
model_dlm_t2 = rbind(model_dlm_t2,cbind(Model="Finite DLM_rainfall",
                                               AIC = AIC(finite_DLM_2),
                                              BIC = BIC(finite_DLM_2),
                                              MASE= MASE(finite_DLM_2)
))
## [1] 246.1244
## [1] 256.1891
model_dlm_t2 = rbind(model_dlm_t2,cbind(Model="Polynomial DLM_rainfall",
                                               BIC = BIC(poly_DLM_2),
                                               AIC = AIC(poly_DLM_2),
                                              MASE= MASE(poly_DLM_2)
                                              ))
## [1] 256.1891
## [1] 246.1244
model_dlm_t2 = rbind(model_dlm_t2,cbind(Model="Koyck Model rainfall",
                                               AIC = AIC(Koyck_model_2),
                                      BIC = BIC(Koyck_model_2),
                                              MASE= MASE(Koyck_model_2)
                                              ))
## [1] 288.9325
## [1] 294.5373
model_dlm_t2 = rbind(model_dlm_t2,cbind(Model="autoregressive_dlm_t2_54_rainfall",
                                               AIC = AIC(ardldlm_t2_54),
                                      BIC = BIC(ardldlm_t2_54),
                                              MASE= MASE(ardldlm_t2_54)
                                              ))
## [1] 248.4355
## [1] 263.5326
model_dlm_t2 = rbind(model_dlm_t2,cbind(Model="Finite DLM_radiation",
                                               AIC = AIC(finite_DLM_3),
                                              BIC = BIC(finite_DLM_3),
                                              MASE= MASE(finite_DLM_3)
))
## [1] 249.3748
## [1] 259.4396
model_dlm_t2 = rbind(model_dlm_t2,cbind(Model="Polynomial DLM_radiation",
                                               BIC = BIC(poly_DLM_3),
                                               AIC = AIC(poly_DLM_3),
                                              MASE= MASE(poly_DLM_3)
                                              ))
## [1] 239.3964
## [1] 229.972
model_dlm_t2 = rbind(model_dlm_t2,cbind(Model="Koyck Model radiation",
                                               AIC = AIC(Koyck_model_3),
                                      BIC = BIC(Koyck_model_3),
                                              MASE= MASE(Koyck_model_3)
                                              ))
## [1] 284.3792
## [1] 289.984
model_dlm_t2 = rbind(model_dlm_t2,cbind(Model="autoregressive_dlm_t2_54_radiation",
                                               AIC = AIC(ardldlm_t2_54_radiation),
                                      BIC = BIC(ardldlm_t2_54_radiation),
                                              MASE= MASE(ardldlm_t2_54_radiation)
                                              ))
## [1] 255.7433
## [1] 270.8405
model_dlm_t2 = rbind(model_dlm_t2,cbind(Model="Finite DLM_humidity",
                                               AIC = AIC(finite_DLM_4),
                                              BIC = BIC(finite_DLM_4),
                                              MASE= MASE(finite_DLM_4)
))
## [1] 260.6544
## [1] 268.6476
model_dlm_t2 = rbind(model_dlm_t2,cbind(Model="Polynomial DLM_humidity",
                                               BIC = BIC(poly_DLM_4),
                                               AIC = AIC(poly_DLM_4),
                                              MASE= MASE(poly_DLM_4)
                                              ))
## [1] 261.0501
## [1] 253.2751
model_dlm_t2 = rbind(model_dlm_t2,cbind(Model="Koyck Model humidity",
                                               AIC = AIC(Koyck_model_4),
                                      BIC = BIC(Koyck_model_4),
                                              MASE= MASE(Koyck_model_4)
                                              ))
## [1] 287.7361
## [1] 293.3409
model_dlm_t2 = rbind(model_dlm_t2,cbind(Model="autoregressive_dlm_t2_55_humidity",
                                               AIC = AIC(ardldlm_t2_55_humidity),
                                      BIC = BIC(ardldlm_t2_55_humidity),
                                              MASE= MASE(ardldlm_t2_55_humidity)
                                              ))
## [1] 254.7429
## [1] 271.0982
sortScore(model_dlm_t2,score = "mase")
##                                                        Model      AIC      BIC
## poly_DLM_3                          Polynomial DLM_radiation 229.9720 239.3964
## ardldlm_t2_55_humidity     autoregressive_dlm_t2_55_humidity 254.7429 271.0982
## ardldlm_t2_54              autoregressive_dlm_t2_54_rainfall 248.4355 263.5326
## ardldlm_t2_54_radiation   autoregressive_dlm_t2_54_radiation 255.7433 270.8405
## finite_DLM_1                          Finite DLM_temperature 224.2942 236.7846
## poly_DLM_1                        Polynomial DLM_temperature 220.3549 229.4388
## poly_DLM_2                           Polynomial DLM_rainfall 246.1244 256.1891
## finite_DLM_2                             Finite DLM_rainfall 246.1244 256.1891
## finite_DLM_3                            Finite DLM_radiation 249.3748 259.4396
## ardldlm_t2_55           autoregressive_dlm_t2_55_temperature 254.7620 271.1173
## poly_DLM_4                           Polynomial DLM_humidity 253.2751 261.0501
## finite_DLM_4                             Finite DLM_humidity 260.6544 268.6476
## Koyck_model_3                          Koyck Model radiation 284.3792 289.9840
## Koyck_model_2                           Koyck Model rainfall 288.9325 294.5373
## Koyck_model_4                           Koyck Model humidity 287.7361 293.3409
## Koyck_model_1                        Koyck Model temperature 290.7199 296.3247
##                              MASE
## poly_DLM_3              0.5130340
## ardldlm_t2_55_humidity  0.5164104
## ardldlm_t2_54           0.5256786
## ardldlm_t2_54_radiation 0.5278081
## finite_DLM_1            0.5377635
## poly_DLM_1              0.5487196
## poly_DLM_2              0.5808908
## finite_DLM_2            0.5808908
## finite_DLM_3            0.5870893
## ardldlm_t2_55           0.5872934
## poly_DLM_4              0.6056090
## finite_DLM_4            0.6124744
## Koyck_model_3           0.7136301
## Koyck_model_2           0.7267872
## Koyck_model_4           0.7335956
## Koyck_model_1           0.7927338

Exponential Smoothing

Exponential smoothing is the second forecasting approach we explore. We evaluate only a selection of the meaningful models.

FFD vs Temperature

Holt’s linear trend

ffd_temp_holt = holt(x=ffd_temp.ts,y= ffd_FFD.ts, initial = "simple", h=4)
summary(ffd_temp_holt)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = ffd_FFD.ts, h = 4, initial = "simple", x = ffd_temp.ts) 
## 
##   Smoothing parameters:
##     alpha = 0.488 
##     beta  = 0.2874 
## 
##   Initial states:
##     l = 9.3716 
##     b = 0.2846 
## 
##   sigma:  0.4112
## Error measures:
##                       ME      RMSE       MAE        MPE     MAPE      MASE
## Training set -0.03332159 0.4111897 0.3407567 -0.4568255 3.602771 0.9872715
##                    ACF1
## Training set -0.1067001
## 
## Forecasts:
##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2015       10.31437 9.787414 10.84134 9.508458 11.12029
## 2016       10.45405 9.787213 11.12089 9.434209 11.47390
## 2017       10.59373 9.722875 11.46459 9.261871 11.92559
## 2018       10.73341 9.608813 11.85801 9.013488 12.45333
 checkresiduals(ffd_temp_holt)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method
## Q* = 3.7312, df = 3, p-value = 0.292
## 
## Model df: 4.   Total lags used: 7
 shapiro.test(residuals(ffd_temp_holt$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ffd_temp_holt$model)
## W = 0.97626, p-value = 0.7028

The residual graphs for the above model are shown in Figure 47:

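  • The MASE of this model is 0.9872715. Note that holt() is a univariate method, and the fitted level (l = 9.3716) and the point forecasts (about 10.3 to 10.7) are not on the scale of FFD, whose regression residuals above span tens of units. This suggests that the series passed as x (temperature) is the one being smoothed, so this MASE may not be directly comparable with the MASE values of the distributed lag models.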
  • The time series plot of the residuals shows no obvious trend; the residuals appear to fluctuate randomly.
  • According to the ACF plot, the residuals do not show significant serial correlation.
  • Since the p-value (0.292) is greater than 0.05, the Ljung-Box test does not reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals does not suggest a violation of the normality assumption.
  • Since the Shapiro-Wilk p-value (0.7028) is greater than 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.
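
For reference, Holt's linear trend method can be written as two recursions, one for the level and one for the trend. The sketch below implements the textbook form in base R with a simple initialisation (first observation and first difference). Both the initialisation and the `beta_star` parameterisation are assumptions of this example and may differ from the conventions used inside forecast::holt.

```r
holt_manual <- function(y, alpha, beta_star, h) {
  n <- length(y)
  level <- y[1]                  # simple start: first observation
  trend <- y[2] - y[1]           # simple start: first difference
  for (t in 2:n) {
    prev_level <- level
    # Level: weighted average of the new observation and the one-step forecast
    level <- alpha * y[t] + (1 - alpha) * (prev_level + trend)
    # Trend: weighted average of the latest level change and the old trend
    trend <- beta_star * (level - prev_level) + (1 - beta_star) * trend
  }
  level + seq_len(h) * trend     # h-step-ahead point forecasts
}

# On an exactly linear series the forecasts continue the line
y_line <- 2 + 3 * (1:10)
fc <- holt_manual(y_line, alpha = 0.5, beta_star = 0.3, h = 4)
print(fc)   # 35 38 41 44
```

The point forecasts grow linearly in the horizon h, which matches the roughly constant increments (about 0.14 per year) in the forecast table above.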

Damped Holt’s Method

ffd_temp_holt_1 = holt(x=ffd_temp.ts,y= ffd_FFD.ts, damped = TRUE, initial = "simple", h=4)
summary(ffd_temp_holt_1)
## 
## Forecast method: Damped Holt's method
## 
## Model Information:
## Damped Holt's method 
## 
## Call:
##  holt(y = ffd_FFD.ts, h = 4, damped = TRUE, initial = "simple",  
## 
##  Call:
##      x = ffd_temp.ts) 
## 
##   Smoothing parameters:
##     alpha = 0.4118 
##     beta  = 0.0033 
##     phi   = 0.8 
## 
##   Initial states:
##     l = 9.3942 
##     b = 0.0719 
## 
##   sigma:  0.4026
## 
##      AIC     AICc      BIC 
## 56.59780 60.09780 65.20172 
## 
## Error measures:
##                     ME     RMSE       MAE       MPE     MAPE      MASE
## Training set 0.0269928 0.368736 0.3163056 0.1539601 3.329048 0.9164294
##                    ACF1
## Training set -0.1270235
## 
## Forecasts:
##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2015       10.02540 9.509402 10.54139 9.236251 10.81454
## 2016       10.02780 9.469236 10.58637 9.173549 10.88206
## 2017       10.02973 9.431218 10.62824 9.114387 10.94507
## 2018       10.03127 9.395021 10.66751 9.058214 11.00432
 checkresiduals(ffd_temp_holt_1)

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt's method
## Q* = 4.5635, df = 3, p-value = 0.2067
## 
## Model df: 5.   Total lags used: 8
 shapiro.test(residuals(ffd_temp_holt_1$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ffd_temp_holt_1$model)
## W = 0.98081, p-value = 0.8347

The residual graphs for the above model are shown in Figure 48:

  • MASE of this model is 0.9164294.

  • The time series plot of the residuals appears random, with no evident trend.

  • The ACF plot shows no significant serial correlation in the residuals.

  • Since the Ljung-Box p-value (0.2067) is greater than 0.05, we fail to reject the null hypothesis of no serial correlation at the 5% level of significance.

  • The histogram of the residuals shows no clear departure from normality.

  • Since the Shapiro-Wilk p-value (0.8347) is greater than 0.05, we do not have enough evidence to reject H0, so the normality assumption for the errors is not violated.
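
The effect of the damping parameter can be seen directly in the point forecasts. In the usual damped-trend formulation, the h-step forecast is the last level plus (phi + phi^2 + ... + phi^h) times the last trend, so each successive increment shrinks by a factor of phi. The sketch below (the helper `damped_forecast` and its example inputs are hypothetical) illustrates this and checks that the increments of the four forecasts printed above shrink by roughly the reported phi = 0.8.

```r
# Hypothetical helper: h-step point forecasts of a damped additive trend,
# forecast(h) = level + (phi + phi^2 + ... + phi^h) * trend.
damped_forecast <- function(level, trend, phi, h) {
  level + cumsum(phi^seq_len(h)) * trend
}

# With phi = 1 the damping disappears and the forecasts are linear in h.
linear_fc <- damped_forecast(level = 10, trend = 0.5, phi = 1, h = 4)

# With phi < 1 the increments shrink and the forecasts flatten out.
damped_fc <- damped_forecast(level = 10, trend = 0.5, phi = 0.8, h = 4)

# Point forecasts copied from the summary output above (2015-2018).
reported_fc <- c(10.02540, 10.02780, 10.02973, 10.03127)
increments  <- diff(reported_fc)
ratios      <- increments[-1] / increments[-length(increments)]

print(linear_fc)
print(damped_fc)
print(round(ratios, 3))  # ratios of successive increments, to compare with phi
```

Because the increments decay geometrically, the damped forecasts level off instead of continuing along a straight line, which is why the four forecasts above are almost constant.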

  • Holt’s method with Exponential Trend

ffd_temp_holt_2 = holt(x=ffd_temp.ts,y= ffd_FFD.ts, exponential =TRUE, initial = "simple", h=4)
summary(ffd_temp_holt_2)
## 
## Forecast method: Holt's method with exponential trend
## 
## Model Information:
## Holt's method with exponential trend 
## 
## Call:
##  holt(y = ffd_FFD.ts, h = 4, initial = "simple", exponential = TRUE,  
## 
##  Call:
##      x = ffd_temp.ts) 
## 
##   Smoothing parameters:
##     alpha = 0.4797 
##     beta  = 0.3063 
## 
##   Initial states:
##     l = 9.3716 
##     b = 1.0304 
## 
##   sigma:  0.0435
## Error measures:
##                       ME      RMSE       MAE        MPE     MAPE      MASE
## Training set -0.03827614 0.4112363 0.3409954 -0.5085983 3.606739 0.9879629
##                    ACF1
## Training set -0.1057739
## 
## Forecasts:
##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2015       10.33269 9.753887 10.89493 9.444919 11.20856
## 2016       10.48623 9.738940 11.23370 9.375204 11.64670
## 2017       10.64205 9.675409 11.67290 9.233348 12.23891
## 2018       10.80018 9.540170 12.14936 8.916412 12.89392
 checkresiduals(ffd_temp_holt_2)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method with exponential trend
## Q* = 3.7584, df = 3, p-value = 0.2888
## 
## Model df: 4.   Total lags used: 7
 shapiro.test(residuals(ffd_temp_holt_2$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ffd_temp_holt_2$model)
## W = 0.97627, p-value = 0.7032

The residual graphs for the above model are shown in Figure 49:

  • MASE of this model is 0.9879629.
  • The time series plot of the residuals appears random, with no evident trend.
  • The ACF plot shows no significant serial correlation in the residuals.
  • Since the Ljung-Box p-value (0.2888) is greater than 0.05, we fail to reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals shows no clear departure from normality.
  • Since the Shapiro-Wilk p-value (0.7032) is greater than 0.05, we do not have enough evidence to reject H0, so the normality assumption for the errors is not violated.
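
With an exponential (multiplicative) trend, the point forecasts grow by a constant ratio rather than by a constant amount. The sketch below is a minimal illustration; the helper `exp_trend_forecast` and its example inputs are hypothetical. It also checks that the four point forecasts printed above have an almost constant year-on-year ratio, about 1.5% growth per year.

```r
# Hypothetical helper: with a multiplicative (exponential) trend the h-step
# point forecast is level * growth^h, where growth is a ratio (1.02 = +2%).
exp_trend_forecast <- function(level, growth, h) {
  level * growth^seq_len(h)
}

example_fc <- exp_trend_forecast(level = 10, growth = 1.02, h = 4)

# Point forecasts copied from the summary output above (2015-2018).
reported_fc   <- c(10.33269, 10.48623, 10.64205, 10.80018)
growth_ratios <- reported_fc[-1] / reported_fc[-length(reported_fc)]

print(example_fc)
print(round(growth_ratios, 5))  # a constant ratio indicates geometric growth
```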

FFD vs Rainfall

  • Holt’s linear trend
ffd_rainfall_holt = holt(x=ffd_rainfall.ts,y= ffd_FFD.ts, initial = "simple", h=4)
summary(ffd_rainfall_holt)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = ffd_FFD.ts, h = 4, initial = "simple", x = ffd_rainfall.ts) 
## 
##   Smoothing parameters:
##     alpha = 0.0042 
##     beta  = 1 
## 
##   Initial states:
##     l = 2.4893 
##     b = -0.0135 
## 
##   sigma:  0.3715
## Error measures:
##                      ME      RMSE       MAE        MPE     MAPE      MASE
## Training set 0.05782536 0.3715493 0.2825729 -0.2889583 12.76096 0.7630564
##                   ACF1
## Training set 0.1463187
## 
## Forecasts:
##      Point Forecast     Lo 80    Hi 80      Lo 95    Hi 95
## 2015       2.154446 1.6782868 2.630606  1.4262231 2.882670
## 2016       2.148553 1.4737399 2.823366  1.1165156 3.180590
## 2017       2.142659 0.9738507 3.311467  0.3551209 3.930197
## 2018       2.136765 0.2894956 3.984035 -0.6883899 4.961921
 checkresiduals(ffd_rainfall_holt)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method
## Q* = 2.6432, df = 3, p-value = 0.45
## 
## Model df: 4.   Total lags used: 7
 shapiro.test(residuals(ffd_rainfall_holt$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ffd_rainfall_holt$model)
## W = 0.96181, p-value = 0.3255

The residual graphs for the above model are shown in Figure 50:

  • MASE of this model is 0.7630564.

  • The time series plot of the residuals appears random, with no evident trend.

  • The ACF plot shows no significant serial correlation in the residuals.

  • Since the Ljung-Box p-value (0.45) is greater than 0.05, we fail to reject the null hypothesis of no serial correlation at the 5% level of significance.

  • The histogram of the residuals shows no clear departure from normality.

  • Since the Shapiro-Wilk p-value (0.3255) is greater than 0.05, we do not have enough evidence to reject H0, so the normality assumption for the errors is not violated.
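
For reference, Holt's linear trend method updates a level and a trend at each observation and then extrapolates them in a straight line. The sketch below uses the textbook recursions with user-supplied starting values; the helper `holt_linear`, its arguments and the toy series are hypothetical, and the initialisation and parameterisation used internally by `holt()` may differ.

```r
# Hypothetical helper implementing the textbook Holt recursions:
#   level_t = alpha * y_t + (1 - alpha) * (level_{t-1} + trend_{t-1})
#   trend_t = beta * (level_t - level_{t-1}) + (1 - beta) * trend_{t-1}
holt_linear <- function(y, alpha, beta, level0, trend0, h) {
  level <- level0
  trend <- trend0
  for (obs in y) {
    prev_level <- level
    level <- alpha * obs + (1 - alpha) * (prev_level + trend)
    trend <- beta * (level - prev_level) + (1 - beta) * trend
  }
  # h-step forecasts extend the last level along the last trend.
  level + seq_len(h) * trend
}

# Toy series that is exactly linear, y_t = 2 + 3t (made up for illustration).
toy_y  <- 2 + 3 * (1:10)
toy_fc <- holt_linear(toy_y, alpha = 0.5, beta = 0.3,
                      level0 = 2, trend0 = 3, h = 4)
print(toy_fc)

# Point forecasts copied from the summary output above (2015-2018):
# the year-to-year step is essentially constant, as a linear trend implies.
reported_fc <- c(2.154446, 2.148553, 2.142659, 2.136765)
print(diff(reported_fc))
```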

  • Damped Holt’s Method

ffd_rainfall_holt_1 = holt(x=ffd_rainfall.ts,y= ffd_FFD.ts, damped = TRUE, initial = "simple", h=4)
summary(ffd_rainfall_holt_1)
## 
## Forecast method: Damped Holt's method
## 
## Model Information:
## Damped Holt's method 
## 
## Call:
##  holt(y = ffd_FFD.ts, h = 4, damped = TRUE, initial = "simple",  
## 
##  Call:
##      x = ffd_rainfall.ts) 
## 
##   Smoothing parameters:
##     alpha = 1e-04 
##     beta  = 1e-04 
##     phi   = 0.8488 
## 
##   Initial states:
##     l = 2.738 
##     b = -0.0791 
## 
##   sigma:  0.3939
## 
##      AIC     AICc      BIC 
## 55.23364 58.73364 63.83757 
## 
## Error measures:
##                        ME      RMSE       MAE       MPE     MAPE      MASE
## Training set -0.003434173 0.3607115 0.2891514 -2.861409 13.23434 0.7808209
##                   ACF1
## Training set 0.1187669
## 
## Forecasts:
##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2015       2.295986 1.791220 2.800753 1.524013 3.067960
## 2016       2.295663 1.790897 2.800429 1.523690 3.067637
## 2017       2.295389 1.790623 2.800155 1.523416 3.067362
## 2018       2.295156 1.790390 2.799922 1.523183 3.067129
 checkresiduals(ffd_temp_holt_1)

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt's method
## Q* = 4.5635, df = 3, p-value = 0.2067
## 
## Model df: 5.   Total lags used: 8
 shapiro.test(residuals(ffd_temp_holt_1$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ffd_temp_holt_1$model)
## W = 0.98081, p-value = 0.8347

The residual graphs for the above model are shown in Figure 51:

  • MASE of this model is 0.7808209.

  • The diagnostics above were produced by calling checkresiduals() and shapiro.test() on ffd_temp_holt_1, the damped temperature model, rather than on ffd_rainfall_holt_1. The residual plots, the Ljung-Box result (Q* = 4.5635, p-value = 0.2067) and the Shapiro-Wilk result (W = 0.98081, p-value = 0.8347) therefore repeat those of the damped FFD vs Temperature model.

  • For that model, the residual time series plot appears random, the ACF plot shows no significant serial correlation, and neither the null hypothesis of no serial correlation nor the null hypothesis of normality is rejected at the 5% level of significance.

  • Conclusions about the residuals of the damped rainfall model require rerunning both calls on ffd_rainfall_holt_1.

  • Holt’s method with Exponential Trend

ffd_rainfall_holt_2 = holt(x=ffd_rainfall.ts,y= ffd_FFD.ts, exponential =TRUE, initial = "simple", h=4)
summary(ffd_rainfall_holt_2)
## 
## Forecast method: Holt's method with exponential trend
## 
## Model Information:
## Holt's method with exponential trend 
## 
## Call:
##  holt(y = ffd_FFD.ts, h = 4, initial = "simple", exponential = TRUE,  
## 
##  Call:
##      x = ffd_rainfall.ts) 
## 
##   Smoothing parameters:
##     alpha = 0.0034 
##     beta  = 1 
## 
##   Initial states:
##     l = 2.4893 
##     b = 0.9946 
## 
##   sigma:  0.162
## Error measures:
##                     ME      RMSE       MAE        MPE     MAPE     MASE
## Training set 0.0563072 0.3700172 0.2813159 -0.3483481 12.71329 0.759662
##                  ACF1
## Training set 0.141713
## 
## Forecasts:
##      Point Forecast     Lo 80    Hi 80     Lo 95    Hi 95
## 2015       2.161561 1.7275026 2.612885 1.4764501 2.858969
## 2016       2.155610 1.5623469 2.800993 1.2613799 3.215230
## 2017       2.149675 1.2036740 3.446229 0.8267971 4.310111
## 2018       2.143757 0.8501008 4.392260 0.4789412 6.421013
 checkresiduals(ffd_rainfall_holt_2)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method with exponential trend
## Q* = 2.6025, df = 3, p-value = 0.4571
## 
## Model df: 4.   Total lags used: 7
 shapiro.test(residuals(ffd_rainfall_holt_2$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ffd_rainfall_holt_2$model)
## W = 0.96106, p-value = 0.3112

The residual graphs for the above model are shown in Figure 52:

  • MASE of this model is 0.759662.
  • The time series plot of the residuals appears random, with no evident trend.
  • The ACF plot shows no significant serial correlation in the residuals.
  • Since the Ljung-Box p-value (0.4571) is greater than 0.05, we fail to reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals shows no clear departure from normality.
  • Since the Shapiro-Wilk p-value (0.3112) is greater than 0.05, we do not have enough evidence to reject H0, so the normality assumption for the errors is not violated.

FFD vs Radiation

  • Holt’s linear trend
ffd_radiation_holt = holt(x=ffd_radiation.ts ,y= ffd_FFD.ts, initial = "simple", h=4)
summary(ffd_radiation_holt)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = ffd_FFD.ts, h = 4, initial = "simple", x = ffd_radiation.ts) 
## 
##   Smoothing parameters:
##     alpha = 0.6325 
##     beta  = 0.1083 
## 
##   Initial states:
##     l = 14.8716 
##     b = -0.1867 
## 
##   sigma:  0.418
## Error measures:
##                      ME      RMSE       MAE       MPE     MAPE     MASE
## Training set 0.07882334 0.4179609 0.3440272 0.4968725 2.378153 1.010459
##                    ACF1
## Training set 0.03120009
## 
## Forecasts:
##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2015       14.48264 13.94700 15.01828 13.66345 15.30183
## 2016       14.46331 13.79673 15.12990 13.44386 15.48277
## 2017       14.44399 13.63705 15.25092 13.20988 15.67809
## 2018       14.42466 13.46859 15.38073 12.96248 15.88684
 checkresiduals(ffd_radiation_holt)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method
## Q* = 10.881, df = 3, p-value = 0.01239
## 
## Model df: 4.   Total lags used: 7
 shapiro.test(residuals(ffd_radiation_holt$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ffd_radiation_holt$model)
## W = 0.93923, p-value = 0.07859

The residual graphs for the above model are shown in Figure 53:

  • MASE of this model is 1.010459.

  • The time series plot of the residuals appears random, with no evident trend.

  • The ACF plot suggests that the residuals may contain significant serial correlation.

  • Since the Ljung-Box p-value (0.01239) is less than 0.05, we reject the null hypothesis of no serial correlation at the 5% level of significance, so there is evidence that the residuals are serially correlated.

  • The histogram of the residuals shows no clear departure from normality.

  • Since the Shapiro-Wilk p-value (0.07859) is greater than 0.05, we do not have enough evidence to reject H0, so the normality assumption for the errors is not violated.
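
The Ljung-Box p-values quoted in this section follow from the Q* statistic and its degrees of freedom: under H0, Q* is approximately chi-squared distributed, and the p-value is the upper-tail probability. As a small check, the sketch below recomputes two of the p-values from the Q* and df values printed in the outputs above (the vector names are assumptions for illustration).

```r
# Q* statistics copied from the Ljung-Box output of two Holt linear trend
# models above (FFD vs Temperature and FFD vs Radiation); both have df = 3.
q_star <- c(temperature = 3.7312, radiation = 10.881)
df     <- 3

# Upper-tail chi-squared probability = p-value of the Ljung-Box test.
p_values <- pchisq(q_star, df = df, lower.tail = FALSE)
print(signif(p_values, 4))

# H0 (no serial correlation) is rejected at the 5% level when p < 0.05.
reject_h0 <- p_values < 0.05
print(reject_h0)
```

Only the radiation model falls below the 0.05 threshold, which matches the different conclusions drawn for the two models.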

  • Damped Holt’s Method

ffd_radiation_holt_1 = holt(x=ffd_radiation.ts,y= ffd_FFD.ts, damped = TRUE, initial = "simple", h=4)
summary(ffd_radiation_holt_1)
## 
## Forecast method: Damped Holt's method
## 
## Model Information:
## Damped Holt's method 
## 
## Call:
##  holt(y = ffd_FFD.ts, h = 4, damped = TRUE, initial = "simple",  
## 
##  Call:
##      x = ffd_radiation.ts) 
## 
##   Smoothing parameters:
##     alpha = 0.5669 
##     beta  = 1e-04 
##     phi   = 0.8155 
## 
##   Initial states:
##     l = 14.8923 
##     b = -0.1427 
## 
##   sigma:  0.4209
## 
##      AIC     AICc      BIC 
## 59.35475 62.85475 67.95867 
## 
## Error measures:
##                     ME      RMSE       MAE     MPE    MAPE      MASE       ACF1
## Training set 0.0141804 0.3855026 0.2967386 0.04604 2.05814 0.8715649 0.03494101
## 
## Forecasts:
##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2015       14.51202 13.97256 15.05148 13.68699 15.33705
## 2016       14.51184 13.89170 15.13198 13.56341 15.46027
## 2017       14.51169 13.82020 15.20319 13.45414 15.56924
## 2018       14.51157 13.75542 15.26772 13.35513 15.66801
 checkresiduals(ffd_radiation_holt_1)

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt's method
## Q* = 13.056, df = 3, p-value = 0.004516
## 
## Model df: 5.   Total lags used: 8
 shapiro.test(residuals(ffd_radiation_holt_1$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ffd_radiation_holt_1$model)
## W = 0.93462, p-value = 0.05864

The residual graphs for the above model are shown in Figure 54:

  • MASE of this model is 0.8715649.

  • The time series plot of the residuals appears random, with no evident trend.

  • The ACF plot suggests that the residuals may contain significant serial correlation.

  • Since the Ljung-Box p-value (0.004516) is less than 0.05, we reject the null hypothesis of no serial correlation at the 5% level of significance, so there is evidence that the residuals are serially correlated.

  • The histogram of the residuals shows no clear departure from normality.

  • Since the Shapiro-Wilk p-value (0.05864) is greater than 0.05, although only marginally, we do not have enough evidence to reject H0, so the normality assumption for the errors is not violated.

  • Holt’s method with Exponential Trend

ffd_radiation_holt_2 = holt(x=ffd_radiation.ts,y= ffd_FFD.ts, exponential =TRUE, initial = "simple", h=4)
summary(ffd_radiation_holt_2)
## 
## Forecast method: Holt's method with exponential trend
## 
## Model Information:
## Holt's method with exponential trend 
## 
## Call:
##  holt(y = ffd_FFD.ts, h = 4, initial = "simple", exponential = TRUE,  
## 
##  Call:
##      x = ffd_radiation.ts) 
## 
##   Smoothing parameters:
##     alpha = 0.6385 
##     beta  = 0.101 
## 
##   Initial states:
##     l = 14.8716 
##     b = 0.9874 
## 
##   sigma:  0.0289
## Error measures:
##                      ME      RMSE       MAE       MPE    MAPE     MASE     ACF1
## Training set 0.07818801 0.4159767 0.3424326 0.4923296 2.36704 1.005775 0.024796
## 
## Forecasts:
##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2015       14.48964 13.97345 15.01994 13.67250 15.26816
## 2016       14.47275 13.83005 15.12803 13.47445 15.48106
## 2017       14.45588 13.66756 15.26359 13.28585 15.68846
## 2018       14.43904 13.51661 15.38646 13.07255 15.93547
 checkresiduals(ffd_radiation_holt_2)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method with exponential trend
## Q* = 11.087, df = 3, p-value = 0.01127
## 
## Model df: 4.   Total lags used: 7
 shapiro.test(residuals(ffd_radiation_holt_2$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ffd_radiation_holt_2$model)
## W = 0.9393, p-value = 0.07894

The residual graphs for the above model are shown in Figure 55:

  • MASE of this model is 1.005775.
  • The time series plot of the residuals appears random, with no evident trend.
  • The ACF plot suggests that the residuals may contain significant serial correlation.
  • Since the Ljung-Box p-value (0.01127) is less than 0.05, we reject the null hypothesis of no serial correlation at the 5% level of significance, so there is evidence that the residuals are serially correlated.
  • The histogram of the residuals shows no clear departure from normality.
  • Since the Shapiro-Wilk p-value (0.07894) is greater than 0.05, we do not have enough evidence to reject H0, so the normality assumption for the errors is not violated.

FFD vs Humidity

  • Holt’s linear trend
ffd_humidity_holt = holt(x=ffd_humidity.ts ,y= ffd_FFD.ts, initial = "simple", h=4)
summary(ffd_humidity_holt)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = ffd_FFD.ts, h = 4, initial = "simple", x = ffd_humidity.ts) 
## 
##   Smoothing parameters:
##     alpha = 0.6314 
##     beta  = 0.2397 
## 
##   Initial states:
##     l = 93.9265 
##     b = 1.0094 
## 
##   sigma:  1.0799
## Error measures:
##                      ME     RMSE       MAE        MPE      MAPE     MASE
## Training set -0.2493176 1.079915 0.9054309 -0.2704184 0.9586888 1.029767
##                   ACF1
## Training set 0.0359162
## 
## Forecasts:
##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2015       94.06144 92.67747 95.44541 91.94485 96.17803
## 2016       93.90082 92.06533 95.73632 91.09368 96.70797
## 2017       93.74021 91.34587 96.13455 90.07838 97.40204
## 2018       93.57959 90.54199 96.61720 88.93398 98.22521
 checkresiduals(ffd_humidity_holt)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method
## Q* = 5.9265, df = 3, p-value = 0.1152
## 
## Model df: 4.   Total lags used: 7
 shapiro.test(residuals(ffd_humidity_holt$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ffd_humidity_holt$model)
## W = 0.97622, p-value = 0.7017

The residual graphs for the above model are shown in Figure 56:

  • MASE of this model is 1.029767.

  • The time series plot of the residuals appears random, with no evident trend.

  • The ACF plot shows no significant serial correlation in the residuals.

  • Since the Ljung-Box p-value (0.1152) is greater than 0.05, we fail to reject the null hypothesis of no serial correlation at the 5% level of significance.

  • The histogram of the residuals shows no clear departure from normality.

  • Since the Shapiro-Wilk p-value (0.7017) is greater than 0.05, we do not have enough evidence to reject H0, so the normality assumption for the errors is not violated.

  • Damped Holt’s Method

ffd_humidity_holt_1 = holt(x=ffd_humidity.ts,y= ffd_FFD.ts, damped = TRUE, initial = "simple", h=4)
summary(ffd_humidity_holt_1)
## 
## Forecast method: Damped Holt's method
## 
## Model Information:
## Damped Holt's method 
## 
## Call:
##  holt(y = ffd_FFD.ts, h = 4, damped = TRUE, initial = "simple",  
## 
##  Call:
##      x = ffd_humidity.ts) 
## 
##   Smoothing parameters:
##     alpha = 1e-04 
##     beta  = 1e-04 
##     phi   = 0.9615 
## 
##   Initial states:
##     l = 94.1975 
##     b = 0.0322 
## 
##   sigma:  0.8262
## 
##      AIC     AICc      BIC 
## 101.1672 104.6672 109.7711 
## 
## Error measures:
##                        ME      RMSE       MAE          MPE      MAPE     MASE
## Training set -0.001230171 0.7566817 0.5822329 -0.007682963 0.6147453 0.662187
##                   ACF1
## Training set 0.1179912
## 
## Forecasts:
##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2015       94.77188 93.71301 95.83075 93.15248 96.39129
## 2016       94.78068 93.72181 95.83955 93.16127 96.40008
## 2017       94.78914 93.73026 95.84801 93.16973 96.40854
## 2018       94.79727 93.73839 95.85614 93.17786 96.41667
 checkresiduals(ffd_humidity_holt_1)

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt's method
## Q* = 9.024, df = 3, p-value = 0.02897
## 
## Model df: 5.   Total lags used: 8
 shapiro.test(residuals(ffd_humidity_holt_1$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ffd_humidity_holt_1$model)
## W = 0.95475, p-value = 0.2108

The residual graphs for the above model are shown in Figure 57:

  • MASE of this model is 0.662187.

  • The time series plot of the residuals appears random, with no evident trend.

  • The ACF plot shows no individually significant serial correlation in the residuals.

  • However, since the Ljung-Box p-value (0.02897) is less than 0.05, we reject the null hypothesis of no serial correlation at the 5% level of significance, so the residuals show evidence of joint serial correlation over the tested lags.

  • The histogram of the residuals shows no clear departure from normality.

  • Since the Shapiro-Wilk p-value (0.2108) is greater than 0.05, we do not have enough evidence to reject H0, so the normality assumption for the errors is not violated.

  • Holt’s method with Exponential Trend

ffd_humidity_holt_2 = holt(x=ffd_humidity.ts,y= ffd_FFD.ts, exponential =TRUE, initial = "simple", h=4)
summary(ffd_humidity_holt_2)
## 
## Forecast method: Holt's method with exponential trend
## 
## Model Information:
## Holt's method with exponential trend 
## 
## Call:
##  holt(y = ffd_FFD.ts, h = 4, initial = "simple", exponential = TRUE,  
## 
##  Call:
##      x = ffd_humidity.ts) 
## 
##   Smoothing parameters:
##     alpha = 0.6314 
##     beta  = 0.2428 
## 
##   Initial states:
##     l = 93.9265 
##     b = 1.0107 
## 
##   sigma:  0.0114
## Error measures:
##                      ME     RMSE       MAE        MPE      MAPE     MASE
## Training set -0.2516407 1.082106 0.9076911 -0.2728742 0.9610782 1.032338
##                    ACF1
## Training set 0.03593745
## 
## Forecasts:
##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2015       94.06189 92.69148 95.43643 91.88429 96.15345
## 2016       93.90133 92.07156 95.77224 91.15742 96.70926
## 2017       93.74105 91.36050 96.15738 90.15466 97.40245
## 2018       93.58104 90.55006 96.71406 89.14798 98.31646
 checkresiduals(ffd_humidity_holt_2)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method with exponential trend
## Q* = 5.9265, df = 3, p-value = 0.1152
## 
## Model df: 4.   Total lags used: 7
 shapiro.test(residuals(ffd_humidity_holt_2$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ffd_humidity_holt_2$model)
## W = 0.97614, p-value = 0.6993

The residual graphs for the above model are shown in Figure 58:

  • MASE of this model is 1.032338.
  • The time series plot of the residuals appears random, with no evident trend.
  • The ACF plot shows no significant serial correlation in the residuals.
  • Since the Ljung-Box p-value (0.1152) is greater than 0.05, we fail to reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals shows no clear departure from normality.
  • Since the Shapiro-Wilk p-value (0.6993) is greater than 0.05, we do not have enough evidence to reject H0, so the normality assumption for the errors is not violated.

The data frame below collects the fit measures of the exponential smoothing models fitted above. The code requests MASE, AIC and BIC, but only MASE appears in the printed table, and the damped models are commented out, so they are not included.

model_expo_t2 <- data.frame(Model=character() , MASE=numeric() ,
                           BIC= numeric() , AICC=numeric() , AIC=numeric())

model_expo_t2 = rbind(model_expo_t2,cbind(Model="Temperature_Holt's linear trend",MASE= accuracy(ffd_temp_holt)[6],
                                              AIC = ffd_temp_holt$model$aic,
                                              BIC = ffd_temp_holt$model$bic
                                      ))
#model_expo_t2 = rbind(model_expo_t2,cbind(Model="Temperature_Damped Holt's trend",MASE= accuracy(ffd_temp_holt_2)[7],
                                            #  AIC = ffd_temp_holt_2$model$aic,
                                             # BIC = ffd_temp_holt_2$model$bic
                                    #  ))


model_expo_t2 = rbind(model_expo_t2,cbind(Model="Temperature_Holt's Exponential Trend",MASE= accuracy(ffd_temp_holt_2)[6],
                                              AIC = ffd_temp_holt_2$model$aic,
                                              BIC = ffd_temp_holt_2$model$bic
                                      ))


model_expo_t2 = rbind(model_expo_t2,cbind(Model="Rainfall_Holt's linear trend",MASE= accuracy(ffd_rainfall_holt)[6],
                                               AIC = ffd_rainfall_holt$model$aic,
                                              BIC = ffd_rainfall_holt$model$bic
                                      ))
#model_expo_t2 = rbind(model_expo_t2,cbind(Model="Rainfall_Damped Holt's trend",MASE= accuracy(ffd_rainfall_holt_1)[7],
                                             # AIC = ffd_rainfall_holt_1$model$aic,
                                              #BIC = ffd_rainfall_holt_1$model$bic
                                     # ))


model_expo_t2 = rbind(model_expo_t2,cbind(Model="Rainfall_Holt's Exponential Trend",MASE= accuracy(ffd_rainfall_holt_2)[6],
                                              AIC = ffd_rainfall_holt_2$model$aic,
                                              BIC = ffd_rainfall_holt_2$model$bic
                                      ))

model_expo_t2 = rbind(model_expo_t2,cbind(Model="Radiation_Holt's linear trend",MASE= accuracy(ffd_radiation_holt)[6],
                                               AIC = ffd_radiation_holt$model$aic,
                                              BIC = ffd_radiation_holt$model$bic
                                      ))
#model_expo_t2 = rbind(model_expo_t2,cbind(Model="Radiation_Damped Holt's trend",MASE= accuracy(ffd_radiation_holt_1)[7],
                                             # AIC = ffd_radiation_holt_1$model$aic,
                                              #BIC = ffd_radiation_holt_1$model$bic
                                     # ))


model_expo_t2 = rbind(model_expo_t2,cbind(Model="Radiation_Holt's Exponential Trend",MASE= accuracy(ffd_radiation_holt_2)[6],
                                              AIC = ffd_radiation_holt_2$model$aic,
                                              BIC = ffd_radiation_holt_2$model$bic
                                      ))

model_expo_t2 = rbind(model_expo_t2,cbind(Model="Humidity_Holt's linear trend",MASE= accuracy(ffd_humidity_holt)[6],
                                               AIC = ffd_humidity_holt$model$aic,
                                              BIC = ffd_humidity_holt$model$bic
                                      ))
#model_expo_t2 = rbind(model_expo_t2,cbind(Model="Humidity_Damped Holt's trend",MASE= accuracy(ffd_humidity_holt_1)[7],
                                             # AIC = ffd_humidity_holt_1$model$aic,
                                              #BIC = ffd_humidity_holt_1$model$bic
                                     # ))


model_expo_t2 = rbind(model_expo_t2,cbind(Model="Humidity_Holt's Exponential Trend",MASE= accuracy(ffd_humidity_holt_2)[6],
                                              AIC = ffd_humidity_holt_2$model$aic,
                                              BIC = ffd_humidity_holt_2$model$bic
                                      ))


model_expo_t2
##                                  Model              MASE
## 1      Temperature_Holt's linear trend 0.987271485585963
## 2 Temperature_Holt's Exponential Trend 0.987962903349266
## 3         Rainfall_Holt's linear trend 0.763056433064963
## 4    Rainfall_Holt's Exponential Trend 0.759662016582427
## 5        Radiation_Holt's linear trend  1.01045854338075
## 6   Radiation_Holt's Exponential Trend   1.0057749005218
## 7         Humidity_Holt's linear trend  1.02976749539997
## 8    Humidity_Holt's Exponential Trend  1.03233808496807
sortScore(model_expo_t2,score = "mase")
##                                  Model              MASE
## 4    Rainfall_Holt's Exponential Trend 0.759662016582427
## 3         Rainfall_Holt's linear trend 0.763056433064963
## 1      Temperature_Holt's linear trend 0.987271485585963
## 2 Temperature_Holt's Exponential Trend 0.987962903349266
## 6   Radiation_Holt's Exponential Trend   1.0057749005218
## 5        Radiation_Holt's linear trend  1.01045854338075
## 7         Humidity_Holt's linear trend  1.02976749539997
## 8    Humidity_Holt's Exponential Trend  1.03233808496807
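
The damped models are commented out of the table above. As a rough cross-check, the sketch below ranks all twelve Holt variants using the training-set MASE values printed in the summaries earlier in this section. It uses only base R, and the data frame name `mase_all` is an assumption for illustration.

```r
# Training-set MASE values copied from the summary() outputs above.
mase_all <- data.frame(
  Series = rep(c("Temperature", "Rainfall", "Radiation", "Humidity"), each = 3),
  Method = rep(c("Holt's linear trend", "Damped Holt's trend",
                 "Holt's Exponential Trend"), times = 4),
  MASE   = c(0.9872715, 0.9164294, 0.9879629,   # temperature
             0.7630564, 0.7808209, 0.7596620,   # rainfall
             1.0104590, 0.8715649, 1.0057750,   # radiation
             1.0297670, 0.6621870, 1.0323380),  # humidity
  stringsAsFactors = FALSE
)

# Keeping MASE numeric (rather than character, as cbind() produces)
# lets order() sort by value.
ranked <- mase_all[order(mase_all$MASE), ]
rownames(ranked) <- NULL
print(ranked)
```

On these figures the damped humidity model has the lowest MASE, but it is also one of the models whose residuals failed the Ljung-Box test at the 5% level, so a low MASE alone does not settle the choice.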

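The MASE values of the eight models in the table range from 0.760 to 1.032. Values close to or above 1 mean that a model is barely more accurate, on the training data, than a naive forecast that repeats the previous observation, so these models fall short of our goal of choosing the forecast model with the lowest MASE.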
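
To make the scale of MASE concrete: for a non-seasonal series it is the model's mean absolute error divided by the in-sample mean absolute error of the naive forecast that repeats the previous observation. The sketch below is a minimal base-R version on made-up numbers; the helper `mase` and the vectors `actual` and `fitted_vals` are hypothetical, and the `accuracy()` function used above may handle details such as missing fitted values differently.

```r
# Hypothetical helper: non-seasonal MASE =
#   mean(|actual - fitted|) / mean(|y_t - y_{t-1}|)
mase <- function(actual, fitted_vals) {
  model_mae <- mean(abs(actual - fitted_vals))
  naive_mae <- mean(abs(diff(actual)))  # in-sample naive one-step errors
  model_mae / naive_mae
}

# Made-up values for illustration only.
actual      <- c(1, 2, 4, 7)
fitted_vals <- c(1, 2, 3, 5)

example_mase <- mase(actual, fitted_vals)
print(example_mase)
```

A value below 1 means the model's average error is smaller than that of the naive forecast, and the further below 1 the better.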

State-space models

There are two state-space models for each exponential smoothing approach (with additive or multiplicative errors). (Note: some combinations are excluded because of stability problems.) In this section, the automatic ETS procedure is applied to see which model the software recommends.

FFD vs Temperature

The auto ETS model is applied to FFD vs Temperature to see which model the software automatically recommends.

auto_fit_temp <- ets(ffd_temp.ts)
summary(auto_fit_temp)
## ETS(A,N,N) 
## 
## Call:
##  ets(y = ffd_temp.ts) 
## 
##   Smoothing parameters:
##     alpha = 0.3849 
## 
##   Initial states:
##     l = 9.4791 
## 
##   sigma:  0.3816
## 
##      AIC     AICc      BIC 
## 50.65338 51.54227 54.95535 
## 
## Training set error measures:
##                      ME      RMSE       MAE       MPE     MAPE      MASE
## Training set 0.04346056 0.3690668 0.3135555 0.3242531 3.295761 0.9084618
##                    ACF1
## Training set -0.1044899

ETS(A,N,N) is the automatically selected model. It has additive errors, no trend, and no seasonality.
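
ETS(A,N,N) corresponds to simple exponential smoothing: a single level is corrected by a fraction alpha of each one-step forecast error, and all future forecasts equal the last level. The sketch below shows that recursion in base R; the helper `ses_filter`, the starting level and the toy data are hypothetical and are not the estimates reported above.

```r
# Hypothetical helper: simple exponential smoothing in error-correction form,
# the recursion behind ETS(A,N,N): level_t = level_{t-1} + alpha * error_t.
ses_filter <- function(y, alpha, level0) {
  level  <- level0
  errors <- numeric(length(y))
  for (t in seq_along(y)) {
    errors[t] <- y[t] - level             # one-step-ahead forecast error
    level     <- level + alpha * errors[t]
  }
  list(level = level, errors = errors)
}

toy_y <- c(9.5, 9.8, 9.2, 10.1, 9.9)      # made-up values for illustration
fit   <- ses_filter(toy_y, alpha = 0.4, level0 = 9.5)

# No trend and no seasonality: every h-step forecast equals the last level.
flat_fc <- rep(fit$level, 4)
print(flat_fc)
```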

checkresiduals(auto_fit_temp)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,N,N)
## Q* = 4.6895, df = 4, p-value = 0.3207
## 
## Model df: 2.   Total lags used: 6

The residual graphs for the above model are shown in Figure 59:

  • MASE of this model is 0.9084618.
  • The time series plot of the residuals appears random, with no evident trend.
  • The ACF plot shows no significant serial correlation in the residuals.
  • Since the Ljung-Box p-value (0.3207) is greater than 0.05, we fail to reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals shows no clear departure from normality.
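
The three information criteria printed in the ETS summary are closely related. Assuming the usual definitions, AICc = AIC + 2k(k+1)/(n-k-1) and BIC = AIC + k(log(n) - 2), the sketch below recovers the printed AICc and BIC from the printed AIC. The values n = 31 observations and k = 3 estimated quantities (alpha, the initial level and the error variance) are assumptions: they are not stated in this section and were chosen because they reproduce the printed figures.

```r
# AIC copied from the ETS(A,N,N) summary for the temperature series above.
aic <- 50.65338

# ASSUMPTIONS (not stated in the report): sample size and parameter count.
n_obs <- 31   # number of observations
n_par <- 3    # alpha, initial level, error variance

# Small-sample correction and BIC expressed relative to AIC.
aicc <- aic + 2 * n_par * (n_par + 1) / (n_obs - n_par - 1)
bic  <- aic + n_par * (log(n_obs) - 2)

print(round(c(AIC = aic, AICc = aicc, BIC = bic), 5))
```

Because AICc and BIC both penalise extra parameters more heavily than AIC on a short series, the automatic selection tends to favour simple models such as ETS(A,N,N) here.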

FFD vs Rainfall

The auto ETS model is applied to FFD vs Rainfall to see which model the software automatically recommends.

auto_fit_rainfall <- ets(ffd_rainfall.ts)
summary(auto_fit_rainfall)
## ETS(M,N,N) 
## 
## Call:
##  ets(y = ffd_rainfall.ts) 
## 
##   Smoothing parameters:
##     alpha = 1e-04 
## 
##   Initial states:
##     l = 2.3704 
## 
##   sigma:  0.1602
## 
##      AIC     AICc      BIC 
## 50.37363 51.26252 54.67559 
## 
## Training set error measures:
##                         ME      RMSE    MAE       MPE     MAPE      MASE
## Training set -0.0001317816 0.3674116 0.2939 -2.898761 13.64392 0.7936439
##                   ACF1
## Training set 0.1521194

ETS(M,N,N) is the automatically selected model. It has multiplicative errors, no trend, and no seasonality.
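
In ETS(M,N,N) the error enters in relative rather than absolute terms: the observation is the previous level times (1 + error), and the level is updated by a fraction alpha of that relative error. The sketch below illustrates the recursion; the helper `mnn_filter`, the starting level and the toy data are hypothetical. With alpha as small as the 1e-04 reported above, the level barely moves from its initial value, so the forecasts are essentially constant.

```r
# Hypothetical helper for the ETS(M,N,N) recursion:
#   rel_error_t = (y_t - level_{t-1}) / level_{t-1}
#   level_t     = level_{t-1} * (1 + alpha * rel_error_t)
mnn_filter <- function(y, alpha, level0) {
  level      <- level0
  rel_errors <- numeric(length(y))
  for (t in seq_along(y)) {
    rel_errors[t] <- (y[t] - level) / level
    level         <- level * (1 + alpha * rel_errors[t])
  }
  list(level = level, rel_errors = rel_errors)
}

toy_y <- c(2.2, 2.6)                       # made-up values for illustration
fit   <- mnn_filter(toy_y, alpha = 0.5, level0 = 2)
print(fit$level)
print(fit$rel_errors)

# A near-zero alpha leaves the level almost unchanged.
nearly_fixed <- mnn_filter(toy_y, alpha = 1e-04, level0 = 2)$level
print(nearly_fixed)
```

Algebraically the level update equals level + alpha * (y - level), the same point update as in the additive case; the two error types differ in how the error variance scales with the level, which affects the likelihood and the prediction intervals.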

checkresiduals(auto_fit_rainfall)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(M,N,N)
## Q* = 1.8722, df = 4, p-value = 0.7592
## 
## Model df: 2.   Total lags used: 6

The residual graphs for the above model are shown in Figure 60:

  • MASE of this model is 0.7936439.
  • The time series plot of the residuals appears random, with no evident trend.
  • The ACF plot shows no significant serial correlation in the residuals.
  • Since the Ljung-Box p-value (0.7592) is greater than 0.05, we fail to reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals shows no clear departure from normality.

FFD vs Radiation

The auto ETS model is applied to FFD vs Radiation to see which model the software automatically recommends.

auto_fit_radiation <- ets(ffd_radiation.ts)
summary(auto_fit_radiation)
## ETS(M,N,N) 
## 
## Call:
##  ets(y = ffd_radiation.ts) 
## 
##   Smoothing parameters:
##     alpha = 0.4751 
## 
##   Initial states:
##     l = 14.7359 
## 
##   sigma:  0.0274
## 
##      AIC     AICc      BIC 
## 53.55555 54.44444 57.85751 
## 
## Training set error measures:
##                      ME      RMSE       MAE        MPE     MAPE      MASE
## Training set -0.0160236 0.3871487 0.2887058 -0.1651105 2.003834 0.8479715
##                   ACF1
## Training set 0.1105218

ETS(M,N,N) is the automatically selected model. It has multiplicative errors, no trend, and no seasonality.

checkresiduals(auto_fit_radiation)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(M,N,N)
## Q* = 8.9602, df = 4, p-value = 0.0621
## 
## Model df: 2.   Total lags used: 6

The residual graphs for the above model are shown in Figure 61:

  • The MASE of this model is 0.8479715.
  • The time series plot of the residuals shows no systematic trend; the residuals appear random.
  • According to the ACF plot, the residuals may have some significant serial correlation.
  • Since the p-value (0.0621) is greater than 0.05, the Ljung-Box test does not reject the null hypothesis of no serial correlation at the 5% level of significance, although only narrowly.
  • The histogram of the residuals suggests a violation of the normality assumption.

FFD vs Humidity

The auto ETS model is applied to FFD vs Humidity to see which model the software recommends automatically.

auto_fit_humidity <- ets(ffd_humidity.ts)
summary(auto_fit_humidity)
## ETS(A,N,N) 
## 
## Call:
##  ets(y = ffd_humidity.ts) 
## 
##   Smoothing parameters:
##     alpha = 1e-04 
## 
##   Initial states:
##     l = 94.5428 
## 
##   sigma:  0.7996
## 
##       AIC      AICc       BIC 
##  96.52188  97.41077 100.82384 
## 
## Training set error measures:
##                       ME      RMSE      MAE          MPE      MAPE      MASE
## Training set 0.001186639 0.7733966 0.598649 -0.005409276 0.6321325 0.6808573
##                   ACF1
## Training set 0.1526309

ETS(A,N,N) is the automatically selected model. It has additive errors, no trend, and no seasonality.

checkresiduals(auto_fit_humidity)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,N,N)
## Q* = 5.0113, df = 4, p-value = 0.2861
## 
## Model df: 2.   Total lags used: 6

The residual graphs for the above model are shown in Figure 62:

  • The MASE of this model is 0.6808573.
  • The time series plot of the residuals shows no systematic trend; the residuals appear random.
  • According to the ACF plot, the residuals do not have any significant serial correlation.
  • Since the p-value (0.2861) is greater than 0.05, the Ljung-Box test does not reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals shows no obvious violation of the normality assumption.

A data frame is constructed to collect the AIC, BIC, and MASE values of the state-space models fitted above.

model_SSM_t2 <- data.frame(Model=character() , MASE=numeric() ,
                           BIC= numeric() , AICC=numeric() , AIC=numeric())

model_SSM_t2 = rbind(model_SSM_t2,cbind(Model="temperature_ANN", MASE= accuracy(auto_fit_temp)[6],
                                               AIC = auto_fit_temp$aic,
                                               BIC = auto_fit_temp$bic))

model_SSM_t2 = rbind(model_SSM_t2,cbind(Model="rainfall_MNN", MASE= accuracy(auto_fit_rainfall)[6],
                                               AIC = auto_fit_rainfall$aic,
                                               BIC = auto_fit_rainfall$bic))

model_SSM_t2 = rbind(model_SSM_t2,cbind(Model="radiation_MNN", MASE= accuracy(auto_fit_radiation)[6],
                                               AIC = auto_fit_radiation$aic,
                                               BIC = auto_fit_radiation$bic))

model_SSM_t2 = rbind(model_SSM_t2,cbind(Model="humidity_ANN", MASE= accuracy(auto_fit_humidity)[6],
                                               AIC = auto_fit_humidity$aic,
                                               BIC = auto_fit_humidity$bic))


model_SSM_t2
##             Model              MASE              AIC              BIC
## 1 temperature_ANN 0.908461778812865   50.65338416811 54.9553457815655
## 2    rainfall_MNN 0.793643867400069  50.373629102617 54.6755907160725
## 3   radiation_MNN 0.847971458223711 53.5555492134549 57.8575108269103
## 4    humidity_ANN 0.680857293381209 96.5218778990795 100.823839512535
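The long, unrounded values in the table above suggest that the score columns are stored as text: cbind() applied to a model name and numeric scores returns a character matrix. A minimal alternative sketch keeps the columns numeric so that they can be sorted and compared directly. It assumes the four scores printed above are typed in by hand (rounded) rather than extracted from the fitted objects, and the names ssm_scores and ssm_sorted are hypothetical.

```r
# cbind() with a character value coerces every entry to character.
is.character(cbind(Model = "x", MASE = 0.9))

# Building the table with data.frame() keeps each score column numeric.
# The values are the (rounded) scores printed in the table above.
ssm_scores <- data.frame(
  Model = c("temperature_ANN", "rainfall_MNN", "radiation_MNN", "humidity_ANN"),
  MASE  = c(0.9084618, 0.7936439, 0.8479715, 0.6808573),
  AIC   = c(50.65338, 50.37363, 53.55555, 96.52188),
  BIC   = c(54.95535, 54.67559, 57.85751, 100.82384)
)

# Sort by ascending MASE using ordinary numeric ordering.
ssm_sorted <- ssm_scores[order(ssm_scores$MASE), ]
ssm_sorted
```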

A second data frame combines the AIC, BIC, and MASE values of all models fitted so far, sorted by ascending MASE. This table makes it clear which models have the lowest MASE.

best_overall_model_1 <- rbind(model_dlm_t2, model_SSM_t2)

sortScore(best_overall_model_1,score = "mase")
##                                                        Model              AIC
## poly_DLM_3                          Polynomial DLM_radiation 229.971978818336
## ardldlm_t2_55_humidity     autoregressive_dlm_t2_55_humidity 254.742930720082
## ardldlm_t2_54              autoregressive_dlm_t2_54_rainfall 248.435463161008
## ardldlm_t2_54_radiation   autoregressive_dlm_t2_54_radiation 255.743311740945
## finite_DLM_1                          Finite DLM_temperature 224.294212558929
## poly_DLM_1                        Polynomial DLM_temperature 220.354857903056
## poly_DLM_2                           Polynomial DLM_rainfall 246.124366615203
## finite_DLM_2                             Finite DLM_rainfall  246.12436661522
## finite_DLM_3                            Finite DLM_radiation 249.374802053031
## ardldlm_t2_55           autoregressive_dlm_t2_55_temperature 254.762035220218
## poly_DLM_4                           Polynomial DLM_humidity 253.275056993383
## finite_DLM_4                             Finite DLM_humidity 260.654409130322
## 4                                               humidity_ANN 96.5218778990795
## Koyck_model_3                          Koyck Model radiation 284.379218330674
## Koyck_model_2                           Koyck Model rainfall 288.932534852619
## Koyck_model_4                           Koyck Model humidity 287.736063024393
## Koyck_model_1                        Koyck Model temperature 290.719929660595
## 2                                               rainfall_MNN  50.373629102617
## 3                                              radiation_MNN 53.5555492134549
## 1                                            temperature_ANN   50.65338416811
##                                      BIC              MASE
## poly_DLM_3               239.39640946112 0.513033974107104
## ardldlm_t2_55_humidity  271.098185714361 0.516410415509345
## ardldlm_t2_54           263.532621617266 0.525678575983519
## ardldlm_t2_54_radiation 270.840470197203 0.527808063908983
## finite_DLM_1             236.78464893415 0.537763527585094
## poly_DLM_1              229.438811630489 0.548719565834438
## poly_DLM_2              256.189138919375 0.580890832050104
## finite_DLM_2            256.189138919392 0.580890832050315
## finite_DLM_3            259.439574357203 0.587089330263442
## ardldlm_t2_55           271.117290214497 0.587293350086806
## poly_DLM_4              261.050078189409 0.605608986366122
## finite_DLM_4            268.647636191373 0.612474384015626
## 4                       100.823839512535 0.680857293381209
## Koyck_model_3           289.984007857323 0.713630121013886
## Koyck_model_2           294.537324379268 0.726787211094777
## Koyck_model_4           293.340852551042 0.733595563885038
## Koyck_model_1           296.324719187244 0.792733774583207
## 2                       54.6755907160725 0.793643867400069
## 3                       57.8575108269103 0.847971458223711
## 1                       54.9553457815655 0.908461778812865

The overall table is used to compare, in terms of MASE, all the approaches tried during the modeling step. The model with the lowest MASE (0.513033974107104) is poly_DLM_3, a polynomial distributed lag model with radiation as the predictor, so it is the best model overall by this criterion.
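For reference, MASE scales the mean absolute error of a model by the mean absolute error of the naive one-step forecast (each value predicted by the previous one) on the same series, so values below 1 indicate a fit better than the naive method. The helper mase_sketch() below is a hypothetical, simplified implementation of that definition for non-seasonal data; the MASE() and accuracy() functions used in this report may differ in details such as which observations enter the scaling term.

```r
# Hypothetical helper: in-sample MASE for a non-seasonal series.
mase_sketch <- function(actual, fitted) {
  stopifnot(length(actual) == length(fitted), length(actual) > 1)
  model_mae <- mean(abs(actual - fitted))   # mean absolute error of the model
  naive_mae <- mean(abs(diff(actual)))      # MAE of the "previous value" forecast
  model_mae / naive_mae
}

# Toy numbers chosen so the arithmetic is easy to follow:
# model MAE = (0 + 1 + 0 + 1) / 4 = 0.5, naive MAE = (1 + 2 + 3) / 3 = 2.
toy_actual <- c(1, 2, 4, 7)
toy_fitted <- c(1, 3, 4, 6)
mase_sketch(toy_actual, toy_fitted)  # 0.5 / 2 = 0.25
```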

Forecasting

The model with the lowest MASE among those fitted above is poly_DLM_3, a polynomial distributed lag model. This model will be used to produce forecasts for the next four years.


Task 3 Part (a) - The task is to analyse the RBO series using univariate climate regressors, find the best model within each of the methods used, and provide the best 3-year-ahead forecasts of the RBO series.

The objective of this task is to produce the best 3-year-ahead forecasts for the RBO series. The data, covering 1983 to 2014, describe the impact of long-term climatic changes in Victoria on the relative flowering order similarity of 81 plant species. The species were ranked each year by the time it took them to flower (FFD), and changes in flowering order were measured by computing the similarity between each year's flowering order and the flowering order of 1983 using the Rank-based Order similarity metric (RBO).

RBO_1 <- read.csv("/Users/zuaibshaikh/Desktop/SEM 4/Forecasting/Final Project/RBO  .csv")
RBO = RBO_1[,2:6]
head(RBO)
##         RBO Temperature Rainfall Radiation RelHumidity
## 1 0.7550088    9.371585 2.489344  14.87158    93.92650
## 2 0.7407520    9.656164 2.475890  14.68493    94.93589
## 3 0.8423860    9.273973 2.421370  14.51507    94.09507
## 4 0.7484425    9.219178 2.319726  14.67397    94.49699
## 5 0.7984084   10.202186 2.465301  14.74863    94.08142
## 6 0.7938803    9.441096 2.735890  14.78356    96.08685
class(RBO$RBO)
## [1] "numeric"
class(RBO$Temperature)
## [1] "numeric"
class(RBO$Rainfall)
## [1] "numeric"
class(RBO$Radiation)
## [1] "numeric"
class(RBO$RelHumidity)
## [1] "numeric"
rbo_RBO.ts = ts(RBO$RBO, start = 1984,frequency = 1)
head(rbo_RBO.ts)
## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1] 0.7550088 0.7407520 0.8423860 0.7484425 0.7984084 0.7938803
rbo_temp.ts=ts(RBO$Temperature,start = 1984, frequency = 1)
head(rbo_temp.ts)
## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1]  9.371585  9.656164  9.273973  9.219178 10.202186  9.441096
rbo_rainfall.ts <- ts(RBO$Rainfall,start = 1984,frequency = 1)
head(rbo_rainfall.ts)
## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1] 2.489344 2.475890 2.421370 2.319726 2.465301 2.735890
rbo_radiation.ts= ts(RBO$Radiation, start = 1984,frequency = 1)
head(rbo_radiation.ts)
## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1] 14.87158 14.68493 14.51507 14.67397 14.74863 14.78356
rbo_humidity.ts = ts(RBO$RelHumidity, start = 1984,frequency = 1)
head(rbo_humidity.ts)
## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
## [1] 93.92650 94.93589 94.09507 94.49699 94.08142 96.08685
rbo.ts= ts(RBO,start = 1984,frequency = 1)
head(rbo.ts)
## Time Series:
## Start = 1984 
## End = 1989 
## Frequency = 1 
##            RBO Temperature Rainfall Radiation RelHumidity
## 1984 0.7550088    9.371585 2.489344  14.87158    93.92650
## 1985 0.7407520    9.656164 2.475890  14.68493    94.93589
## 1986 0.8423860    9.273973 2.421370  14.51507    94.09507
## 1987 0.7484425    9.219178 2.319726  14.67397    94.49699
## 1988 0.7984084   10.202186 2.465301  14.74863    94.08142
## 1989 0.7938803    9.441096 2.735890  14.78356    96.08685

Checking for non-stationarity in the dataset

The aim here is to check whether each time series is stationary or non-stationary. A first indication comes from the ACF and PACF plots; the result is then confirmed with unit root tests. The two tests used are the augmented Dickey-Fuller (ADF) test and the Phillips-Perron (PP) test.
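The checks described above can be sketched as follows. This is an illustration on a simulated annual series of 31 values (an assumption standing in for the real data), using the ACF, PACF, and Phillips-Perron test from base R; the ADF test is run only if the tseries package is installed. For both tests the null hypothesis is that the series has a unit root, so a small p-value points towards stationarity.

```r
# Illustrative only: a simulated stationary series replaces the real data.
set.seed(42)
sim_series <- ts(rnorm(31, mean = 0.75, sd = 0.05), start = 1984)

# Sample autocorrelations and partial autocorrelations
# (plot = FALSE returns the values instead of drawing the plots).
sim_acf  <- acf(sim_series, lag.max = 10, plot = FALSE)
sim_pacf <- pacf(sim_series, lag.max = 10, plot = FALSE)

# Phillips-Perron unit root test from the stats package.
pp_result <- PP.test(sim_series)
pp_result$p.value

# Augmented Dickey-Fuller test, only if the tseries package is available.
if (requireNamespace("tseries", quietly = TRUE)) {
  adf_result <- suppressWarnings(tseries::adf.test(sim_series))
  print(adf_result$p.value)
}
```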

The Descriptive Analysis

A time series plot is examined for five main characteristics:

  • Trend.
  • Seasonality.
  • Changing Variation.
  • Behaviour.
  • Change Point.

We now plot each of the converted time series and examine how it behaves with respect to the five characteristics listed above.

plot(rbo_temp.ts, xlab='Year', main = " Figure 1. Time series plot of annual RBO temperature series")

From Figure 1 of time series plot for annual RBO temperature series, we can interpret as follows:

  1. Trend - The plot shows no trend.
  2. Seasonality - No seasonality is noticeable.
  3. Changing Variation - The size of the fluctuations does not visibly grow or shrink over time, so no change in variance is found.
  4. Behaviour - The series shows moving average behaviour (successive values bounce up and down).
  5. Change Point - Two interventions appear to occur, in 1988 and 2006.
plot(rbo_rainfall.ts, xlab='Year', main = " Figure 2. Time series plot of annual RBO rainfall series")

From Figure 2 of time series plot for annual RBO rainfall series, we can interpret as follows:

  1. Trend - The plot shows no trend.
  2. Seasonality - No seasonality is noticeable.
  3. Changing Variation - The size of the fluctuations does not visibly grow or shrink over time, so no change in variance is found.
  4. Behaviour - The series shows moving average behaviour (successive values bounce up and down).
  5. Change Point - An intervention appears to take place in 1997.
plot(rbo_radiation.ts, xlab='Year', main = " Figure 3. Time series plot of annual RBO radiation series ")

From Figure 3 of time series plot for annual RBO radiation series, we can interpret as follows:

  1. Trend - The plot shows a slight upward trend.
  2. Seasonality - No seasonality is noticeable.
  3. Changing Variation - The size of the fluctuations does not visibly grow or shrink over time, so no change in variance is found.
  4. Behaviour - The series shows moving average behaviour (successive values bounce up and down).
  5. Change Point - An intervention appears to take place in 1992.
plot(rbo_humidity.ts, xlab='Year', main = " Figure 4. Time series plot of annual RBO humidity series")

From Figure 4 of time series plot for annual RBO humidity series, we can interpret as follows:

  1. Trend - The plot shows no trend.
  2. Seasonality - No seasonality is noticeable.
  3. Changing Variation - The size of the fluctuations does not visibly grow or shrink over time, so no change in variance is found.
  4. Behaviour - The series shows moving average behaviour (successive values bounce up and down).
  5. Change Point - Three interventions appear to occur, in 1990, 2000, and 2010.
plot(rbo_RBO.ts, xlab='Year', main = " Figure 5. Time series plot of annual RBO series")

From Figure 5 of time series plot for annual RBO series, we can interpret as follows:

  1. Trend - The plot shows a weak, irregular downward trend.
  2. Seasonality - No seasonality is noticeable.
  3. Changing Variation - The size of the fluctuations does not visibly grow or shrink over time, so no change in variance is found.
  4. Behaviour - The series shows moving average behaviour (successive values bounce up and down).
  5. Change Point - An intervention appears to occur in 1996.
  • To display the dependent RBO series alongside the four explanatory climate series in the same figure, we standardize the data. The code below produces a combined time series plot for examining the relationship between the series.
rbo.scaled = scale(rbo.ts)
plot(rbo.scaled, plot.type="s",col = c("black", "red", "blue", "green","brown"), main = "Figure 6. annual RBO data series")
legend("topleft",lty=1, text.width =5, col=c("black", "red", "blue", "green","brown"), c("RBO","Temperature", "Rainfall", "Radiation", "Humidity"))

Figure 6 shows all five time series plotted together after centering and scaling.
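As a quick illustration of what scale() does to each column, the sketch below standardizes a small simulated multivariate series (the values are made up and only mimic the magnitudes of the real columns). After scaling, every column has mean 0 and standard deviation 1, which is what allows series measured in very different units to share one axis.

```r
# Illustrative only: simulated columns with very different magnitudes.
set.seed(7)
toy <- ts(cbind(RBO         = rnorm(31, mean = 0.75, sd = 0.05),
                Temperature = rnorm(31, mean = 9.5,  sd = 0.4),
                RelHumidity = rnorm(31, mean = 94,   sd = 1)),
          start = 1984)

# scale() subtracts each column's mean and divides by its standard deviation.
toy_scaled <- scale(toy)

round(colMeans(toy_scaled), 10)  # column means after scaling
apply(toy_scaled, 2, sd)         # column standard deviations after scaling
```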

Model Fitting - RBO vs Temperature

Fitting finite distributed lag models

To determine the model’s finite lag length, we build a loop that calculates accuracy metrics such as AIC/BIC and MASE for models with varying lag lengths and selects the model with the lowest values.

for ( i in 1:5){
  model4 = dlm(x = as.vector(rbo_temp.ts), y = as.vector(rbo_RBO.ts), q = i )
  cat("q = ", i, "AIC = ", AIC(model4$model), "BIC = ", BIC(model4$model), "MASE =", MASE(model4)$MASE, "\n")
}
## q =  1 AIC =  -101.8617 BIC =  -96.2569 MASE = 0.9239038 
## q =  2 AIC =  -95.49894 BIC =  -88.66246 MASE = 1.032564 
## q =  3 AIC =  -96.76727 BIC =  -88.77404 MASE = 1.033663 
## q =  4 AIC =  -92.75653 BIC =  -83.68567 MASE = 1.005373 
## q =  5 AIC =  -91.46337 BIC =  -81.3986 MASE = 0.8594175

According to the output of the finite distributed lag models, q = 5 has the lowest MASE (0.8594175), with AIC = -91.46337 and BIC = -81.3986. AIC and BIC themselves are lowest at q = 1 (AIC = -101.8617, BIC = -96.2569). Since the models are compared on MASE, we choose a lag length of q = 5.
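To make explicit what a finite distributed lag model of length q regresses on, the sketch below builds the lagged design matrix by hand with embed() and fits it with lm(). It uses simulated data of the same length as the RBO series (31 annual values), and the names x_sim, y_sim, and finite_dlm_sketch are hypothetical; it is not the dlm() implementation itself. Each additional lag costs one observation, so models with different q are fitted on different samples, which is worth keeping in mind when comparing their AIC and BIC values.

```r
# Illustrative only: simulated predictor and response.
set.seed(2024)
n <- 31
q <- 5
x_sim <- rnorm(n, mean = 9.5, sd = 0.4)
y_sim <- 0.5 + 0.02 * x_sim + rnorm(n, sd = 0.03)

# embed() puts x_t in column 1, x_{t-1} in column 2, ..., x_{t-q} in column q + 1.
lagged_x <- embed(x_sim, q + 1)
colnames(lagged_x) <- c("x.t", paste0("x.", 1:q))

# The first q observations of y have no complete set of lags and are dropped.
design <- data.frame(y.t = y_sim[(q + 1):n], lagged_x)
finite_dlm_sketch <- lm(y.t ~ ., data = design)

nrow(design)                    # n - q observations remain
df.residual(finite_dlm_sketch)  # (n - q) - (q + 2) residual degrees of freedom
```

With n = 31 and q = 5 this leaves 26 observations and 19 residual degrees of freedom.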

  • Fitting a finite DLM with a lag of 5 and performing diagnostic checking for Temperature with respect to the dependent variable RBO
finite_DLM_11 <- dlm(x = as.vector(rbo_temp.ts), y = as.vector(rbo_RBO.ts), q = 5)
summary(finite_DLM_11)
## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.077397 -0.017434  0.001208  0.015598  0.072589 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -0.448644   0.388152  -1.156   0.2621  
## x.t          0.017165   0.023126   0.742   0.4670  
## x.1          0.062904   0.022246   2.828   0.0108 *
## x.2          0.008775   0.023849   0.368   0.7170  
## x.3         -0.013616   0.022763  -0.598   0.5568  
## x.4          0.024775   0.022193   1.116   0.2782  
## x.5          0.024342   0.022814   1.067   0.2994  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.03584 on 19 degrees of freedom
## Multiple R-squared:  0.4353, Adjusted R-squared:  0.257 
## F-statistic: 2.441 on 6 and 19 DF,  p-value: 0.06401
## 
## AIC and BIC values for the model:
##         AIC      BIC
## 1 -91.46337 -81.3986

In the finite distributed lag model with q = 5, only the lag-1 weight (x.1, p = 0.0108) is statistically significant at the 5% level. The adjusted R-squared is 0.257, so the model explains only 25.7 percent of the variability in RBO. The overall F-test has a p-value of 0.06401, which is greater than 0.05, so the model as a whole is not statistically significant at the 5% level.

checkresiduals(finite_DLM_11$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 9.2105, df = 10, p-value = 0.5122
shapiro.test(residuals(finite_DLM_11$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(finite_DLM_11$model)
## W = 0.97249, p-value = 0.6884

The residual graphs for the above model are shown in Figure 7:

  • The time series plot of the residuals shows no systematic trend; the residuals appear random.
  • According to the ACF plot, the residuals may have some significant serial correlation.
  • Since the p-value (0.5122) is greater than 0.05, the Breusch-Godfrey test does not reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals shows no obvious violation of the normality assumption.
  • Since the Shapiro-Wilk p-value (0.6884) is greater than 0.05, we do not have enough evidence to reject H0 (normally distributed residuals). The normality assumption is therefore not violated.

Now checking the multicollinearity issue

vif_dlm_11 =vif(finite_DLM_11$model)
vif_dlm_11
##      x.t      x.1      x.2      x.3      x.4      x.5 
## 1.547003 1.406418 1.483764 1.327923 1.224099 1.251783
vif_dlm_11 >10
##   x.t   x.1   x.2   x.3   x.4   x.5 
## FALSE FALSE FALSE FALSE FALSE FALSE
  • According to the VIF values, the above model with q=5 does not have a multicollinearity problem.
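The variance inflation factor of a regressor is 1 / (1 - R²), where R² comes from regressing that regressor on all the others; values above 10 are the threshold used in this report. The helper vif_sketch() below is a hypothetical base R implementation of that formula for models with an intercept and purely numeric regressors, shown on simulated data in which x3 is deliberately almost a copy of x1. It is meant as an illustration of the calculation, not as a replacement for vif().

```r
# Hypothetical helper: VIF_j = 1 / (1 - R^2_j), with R^2_j from regressing
# regressor j on all other regressors. Assumes the model has an intercept.
vif_sketch <- function(model) {
  X <- model.matrix(model)[, -1, drop = FALSE]  # drop the intercept column
  vifs <- sapply(seq_len(ncol(X)), function(j) {
    aux <- lm(X[, j] ~ X[, -j, drop = FALSE])
    1 / (1 - summary(aux)$r.squared)
  })
  names(vifs) <- colnames(X)
  vifs
}

# Illustrative only: x3 is nearly identical to x1, x2 is unrelated.
set.seed(11)
toy_df <- data.frame(x1 = rnorm(40), x2 = rnorm(40))
toy_df$x3 <- toy_df$x1 + rnorm(40, sd = 0.1)
toy_df$y  <- 1 + toy_df$x1 + toy_df$x2 + rnorm(40)

toy_fit  <- lm(y ~ x1 + x2 + x3, data = toy_df)
toy_vifs <- vif_sketch(toy_fit)
toy_vifs
toy_vifs > 10
```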

Fitting polynomial distributed lag models

for(i in 1:5){
        for(j in 1:5){
                model_4.1 <- polyDlm(x = as.vector(rbo_temp.ts),y = as.vector(rbo_RBO.ts), q = i, k = j, show.beta = FALSE)
                cat("q:",i,"k:",j, "AIC:",AIC(model_4.1$model), "BIC:", BIC(model_4.1$model),"MASE =", MASE(model_4.1)$MASE, "\n")
        }
}
## q: 1 k: 1 AIC: -101.8617 BIC: -96.2569 MASE = 0.9239038 
## q: 1 k: 2 AIC: -101.8617 BIC: -96.2569 MASE = 0.9239038 
## q: 1 k: 3 AIC: -101.8617 BIC: -96.2569 MASE = 0.9239038 
## q: 1 k: 4 AIC: -101.8617 BIC: -96.2569 MASE = 0.9239038 
## q: 1 k: 5 AIC: -101.8617 BIC: -96.2569 MASE = 0.9239038 
## q: 2 k: 1 AIC: -95.42191 BIC: -89.95273 MASE = 1.110161 
## q: 2 k: 2 AIC: -95.49894 BIC: -88.66246 MASE = 1.032564 
## q: 2 k: 3 AIC: -95.49894 BIC: -88.66246 MASE = 1.032564 
## q: 2 k: 4 AIC: -95.49894 BIC: -88.66246 MASE = 1.032564 
## q: 2 k: 5 AIC: -95.49894 BIC: -88.66246 MASE = 1.032564 
## q: 3 k: 1 AIC: -99.49445 BIC: -94.16563 MASE = 1.100549 
## q: 3 k: 2 AIC: -97.93803 BIC: -91.277 MASE = 1.0725 
## q: 3 k: 3 AIC: -96.76727 BIC: -88.77404 MASE = 1.033663 
## q: 3 k: 4 AIC: -96.76727 BIC: -88.77404 MASE = 1.033663 
## q: 3 k: 5 AIC: -96.76727 BIC: -88.77404 MASE = 1.033663 
## q: 4 k: 1 AIC: -96.16093 BIC: -90.97758 MASE = 1.091081 
## q: 4 k: 2 AIC: -94.65533 BIC: -88.17614 MASE = 1.089145 
## q: 4 k: 3 AIC: -94.02474 BIC: -86.24972 MASE = 1.041394 
## q: 4 k: 4 AIC: -92.75653 BIC: -83.68567 MASE = 1.005373 
## q: 4 k: 5 AIC: -92.75653 BIC: -83.68567 MASE = 1.005373 
## q: 5 k: 1 AIC: -94.07911 BIC: -89.04672 MASE = 1.014876 
## q: 5 k: 2 AIC: -92.07915 BIC: -85.78867 MASE = 1.014553 
## q: 5 k: 3 AIC: -90.95199 BIC: -83.40341 MASE = 0.9579783 
## q: 5 k: 4 AIC: -93.45358 BIC: -84.6469 MASE = 0.8594518 
## q: 5 k: 5 AIC: -91.46337 BIC: -81.3986 MASE = 0.8594175

According to the output of the polynomial distributed lag models, q = 5 with k = 5 has the lowest MASE (0.8594175), with AIC = -91.46337 and BIC = -81.3986. AIC and BIC themselves are lowest for the q = 1 models (AIC = -101.8617, BIC = -96.2569). Since the models are compared on MASE, we choose q = 5 and k = 5.
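Rows of the output in which k is at least q repeat the scores of the finite DLM with the same q. This is expected: a polynomial of degree q can pass through any q + 1 lag weights, so the polynomial restriction no longer constrains the model. The sketch below illustrates this with a hand-written Almon transformation on simulated data, using q = k = 3 to keep the example small. The variable names are hypothetical and the code is not the polyDlm() implementation.

```r
# Illustrative only: simulated predictor and response.
set.seed(99)
n <- 31
q <- 3   # lag length
k <- 3   # polynomial degree, set equal to q on purpose
x_sim <- rnorm(n, mean = 9.5, sd = 0.4)
y_sim <- 0.5 + 0.02 * x_sim + rnorm(n, sd = 0.03)

lagged_x <- embed(x_sim, q + 1)   # columns: x_t, x_{t-1}, ..., x_{t-q}
y_used   <- y_sim[(q + 1):n]

# Unrestricted finite DLM: one free coefficient per lag.
fit_finite <- lm(y_used ~ lagged_x)

# Almon transformation: z_j = sum over s of s^j * x_{t-s}, for j = 0, ..., k.
powers <- outer(0:q, 0:k, "^")    # (q + 1) x (k + 1) matrix; R evaluates 0^0 as 1
z <- lagged_x %*% powers
fit_poly <- lm(y_used ~ z)

# Implied lag weights: beta_s = sum over j of gamma_j * s^j.
gamma_hat <- coef(fit_poly)[-1]
beta_hat  <- as.vector(powers %*% gamma_hat)

# Compare the two fits and the two sets of lag weights.
all.equal(as.vector(fitted(fit_finite)), as.vector(fitted(fit_poly)))
cbind(finite = unname(coef(fit_finite)[-1]), polynomial = beta_hat)
```

Choosing k smaller than q is what actually imposes a smooth shape on the lag weights and saves parameters.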

  • Fitting a polynomial DLM for Temperature with respect to the dependent variable RBO
poly_DLM_11 <- polyDlm(x = as.vector(rbo_temp.ts), y = as.vector(rbo_RBO.ts), q = 5, k = 5)
## Estimates and t-tests for beta coefficients:
##        Estimate Std. Error t value P(>|t|)
## beta.0  0.01720     0.0231   0.742  0.4660
## beta.1  0.06290     0.0222   2.830  0.0101
## beta.2  0.00877     0.0239   0.368  0.7170
## beta.3 -0.01360     0.0228  -0.598  0.5560
## beta.4  0.02480     0.0222   1.120  0.2770
## beta.5  0.02430     0.0228   1.070  0.2980
summary(poly_DLM_11)
## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.077397 -0.017434  0.001208  0.015598  0.072589 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.4486437  0.3881522  -1.156    0.262
## z.t0         0.0171654  0.0231264   0.742    0.467
## z.t1         0.1599656  0.1576411   1.015    0.323
## z.t2        -0.1518767  0.2532764  -0.600    0.556
## z.t3         0.0399670  0.1412423   0.283    0.780
## z.t4        -0.0020997  0.0321902  -0.065    0.949
## z.t5        -0.0002174  0.0025687  -0.085    0.933
## 
## Residual standard error: 0.03584 on 19 degrees of freedom
## Multiple R-squared:  0.4353, Adjusted R-squared:  0.257 
## F-statistic: 2.441 on 6 and 19 DF,  p-value: 0.06401

In the polynomial distributed lag model with q = 5 and k = 5, none of the transformed terms (z.t0 to z.t5) is significant at the 5% level, although the implied lag weight beta.1 is (p = 0.0101). The adjusted R-squared is 0.257, so the model explains only 25.7 percent of the variability in RBO. The overall F-test has a p-value of 0.06401, which is greater than 0.05, so the model as a whole is not statistically significant at the 5% level.

checkresiduals(poly_DLM_11$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 9.2105, df = 10, p-value = 0.5122
shapiro.test(residuals(poly_DLM_11$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(poly_DLM_11$model)
## W = 0.97249, p-value = 0.6884

The residual graphs for the above model are shown in Figure 8:

  • The time series plot of the residuals shows no systematic trend; the residuals appear random.
  • According to the ACF plot, the residuals may have some significant serial correlation.
  • Since the p-value (0.5122) is greater than 0.05, the Breusch-Godfrey test does not reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals shows no obvious violation of the normality assumption.
  • Since the Shapiro-Wilk p-value (0.6884) is greater than 0.05, we do not have enough evidence to reject H0 (normally distributed residuals). The normality assumption is therefore not violated.

Now checking the multicollinearity issue

vif_poly_11 =vif(poly_DLM_11$model)
vif_poly_11
##         z.t0         z.t1         z.t2         z.t3         z.t4         z.t5 
## 1.332356e+01 4.529873e+03 1.873279e+05 1.136592e+06 1.269904e+06 1.839998e+05
vif_poly_11 >10
## z.t0 z.t1 z.t2 z.t3 z.t4 z.t5 
## TRUE TRUE TRUE TRUE TRUE TRUE
  • According to the VIF values, the above model with q=5 and k=5 has a multicollinearity problem.

Fitting Koyck model

  • Fitting a Koyck model for Temperature with respect to the dependent variable RBO
Koyck_model_11 = koyckDlm(x = as.vector(rbo_temp.ts) , y = as.vector(rbo_RBO.ts))
summary(Koyck_model_11$model, diagnostics=TRUE)
## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.15981 -0.04678 -0.01440  0.04750  0.14952 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -1.2032     1.6184  -0.743    0.464
## Y.1           0.2469     0.4609   0.536    0.597
## X.t           0.1847     0.1947   0.949    0.351
## 
## Diagnostic tests:
##                  df1 df2 statistic p-value  
## Weak instruments   1  27     1.011  0.3236  
## Wu-Hausman         1  26     3.247  0.0832 .
## Sargan             0  NA        NA      NA  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.07557 on 27 degrees of freedom
## Multiple R-Squared: -1.597,  Adjusted R-squared: -1.789 
## Wald test: 2.119 on 2 and 27 DF,  p-value: 0.1397
  • In the Koyck model above, no term is significant at the 5% level. The adjusted R-squared is -1.789. Because the model is estimated with instrumental variables, R-squared can be negative and cannot be read as a proportion of explained variability; a negative value indicates that the model fits worse than the mean of RBO alone. The Wald test for the whole model has a p-value of 0.1397, which is greater than 0.05, so the model is not statistically significant.
  • From the Wu-Hausman test (p-value 0.0832, greater than 0.05) we conclude that there is no significant correlation between the explanatory variable and the error term at the 5% level.
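In a Koyck model of the form Y_t = δ + φ Y_{t-1} + β X_t + error, the implied weight on X_{t-s} is β φ^s, and for |φ| < 1 the weights sum to the long-run multiplier β / (1 - φ). The sketch below plugs in the point estimates printed above (Y.1 = 0.2469, X.t = 0.1847) purely as an illustration of this arithmetic. Since neither estimate is significant, the resulting weights should not be over-interpreted, and the variable names are hypothetical.

```r
# Point estimates copied from the Koyck model summary above.
phi_hat  <- 0.2469   # coefficient on Y.1
beta_hat <- 0.1847   # coefficient on X.t

# Implied geometric lag weights for lags 0 to 5: beta * phi^s.
lags <- 0:5
koyck_weights <- beta_hat * phi_hat^lags
names(koyck_weights) <- paste0("lag_", lags)
round(koyck_weights, 5)

# Long-run multiplier: sum of the infinite geometric series of weights.
long_run <- beta_hat / (1 - phi_hat)
long_run
```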
checkresiduals(Koyck_model_11$model)

shapiro.test(residuals(Koyck_model_11$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(Koyck_model_11$model)
## W = 0.983, p-value = 0.8984

The residual graphs for the above model are shown in Figure 9:

  • The time series plot of the residuals shows no systematic trend; the residuals appear random.
  • According to the ACF plot, the residuals do not have any significant serial correlation.
  • The histogram of the residuals shows no obvious violation of the normality assumption.
  • Since the Shapiro-Wilk p-value (0.8984) is greater than 0.05, we do not have enough evidence to reject H0 (normally distributed residuals). The normality assumption is therefore not violated.

Now checking the multicollinearity issue

vif_loyck_11=vif(Koyck_model_11$model)
vif_loyck_11
##      Y.1      X.t 
## 2.188316 2.188316
vif_loyck_11>10
##   Y.1   X.t 
## FALSE FALSE
  • According to the VIF values, the above model does not have a multicollinearity problem.

Fitting autoregressive distributed lag models

Autoregressive distributed lag models are the last model type derived from the time series regression technique. To choose the orders of ARDL(p,q), we build a loop that fits autoregressive distributed lag models for a range of lag lengths and AR orders and calculates accuracy metrics such as AIC/BIC and MASE.

for (i in 1:5){
  for(j in 1:5){
    model_4.2 = ardlDlm(x = as.vector(rbo_temp.ts), y = as.vector(rbo_RBO.ts), p = i , q = j)
    cat("p =", i, "q =", j, "AIC =", AIC(model_4.2$model), "BIC =", BIC(model_4.2$model), "MASE =", MASE(model_4.2)$MASE, "\n")
 }
}
## p = 1 q = 1 AIC = -107.8419 BIC = -100.8359 MASE = 0.8074316 
## p = 1 q = 2 AIC = -105.4398 BIC = -97.236 MASE = 0.8117033 
## p = 1 q = 3 AIC = -109.6659 BIC = -100.3404 MASE = 0.8057573 
## p = 1 q = 4 AIC = -102.3384 BIC = -91.97173 MASE = 0.8610456 
## p = 1 q = 5 AIC = -97.42825 BIC = -86.10538 MASE = 0.7735245 
## p = 2 q = 1 AIC = -102.1496 BIC = -93.94579 MASE = 0.9024954 
## p = 2 q = 2 AIC = -104.8752 BIC = -95.30414 MASE = 0.7913477 
## p = 2 q = 3 AIC = -109.639 BIC = -98.98132 MASE = 0.790971 
## p = 2 q = 4 AIC = -102.2956 BIC = -90.63304 MASE = 0.8324633 
## p = 2 q = 5 AIC = -96.20616 BIC = -83.62519 MASE = 0.7544973 
## p = 3 q = 1 AIC = -106.2352 BIC = -96.90972 MASE = 0.8876139 
## p = 3 q = 2 AIC = -112.3458 BIC = -101.6882 MASE = 0.7726235 
## p = 3 q = 3 AIC = -111.2734 BIC = -99.28358 MASE = 0.7779966 
## p = 3 q = 4 AIC = -103.7311 BIC = -90.7727 MASE = 0.8253913 
## p = 3 q = 5 AIC = -97.11952 BIC = -83.28046 MASE = 0.7661286 
## p = 4 q = 1 AIC = -102.0941 BIC = -91.72737 MASE = 0.8648001 
## p = 4 q = 2 AIC = -105.9038 BIC = -94.24126 MASE = 0.8119505 
## p = 4 q = 3 AIC = -104.0851 BIC = -91.12676 MASE = 0.8210751 
## p = 4 q = 4 AIC = -102.0982 BIC = -87.844 MASE = 0.8188817 
## p = 4 q = 5 AIC = -95.40963 BIC = -80.31247 MASE = 0.7643666 
## p = 5 q = 1 AIC = -97.85381 BIC = -86.53094 MASE = 0.7652401 
## p = 5 q = 2 AIC = -99.0233 BIC = -86.44233 MASE = 0.7549862 
## p = 5 q = 3 AIC = -97.26093 BIC = -83.42187 MASE = 0.7593949 
## p = 5 q = 4 AIC = -95.45616 BIC = -80.359 MASE = 0.7544104 
## p = 5 q = 5 AIC = -93.50136 BIC = -77.1461 MASE = 0.7617211

According to the output of the autoregressive distributed lag models, p = 5 with q = 4 has the lowest MASE (0.7544104), with AIC = -95.45616 and BIC = -80.359. AIC and BIC themselves are lowest for p = 3, q = 2 (AIC = -112.3458, BIC = -101.6882). Since the models are compared on MASE, we choose p = 5 and q = 4.
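Because the criteria do not always agree, it can help to store the grid search results in a data frame and look up the minimum of each criterion separately, instead of reading them off printed lines. The sketch below does this on simulated data with a hand-built ARDL fit based on lm(). The helper fit_ardl_sketch() and the in-sample MASE formula are assumptions made for illustration and are not the ardlDlm() or MASE() implementations.

```r
# Illustrative only: simulated predictor and response of length 31.
set.seed(5)
n <- 31
x_sim <- rnorm(n, mean = 9.5, sd = 0.4)
y_sim <- 0.6 + 0.015 * x_sim + rnorm(n, sd = 0.03)

# Hypothetical helper: ARDL with p lags of x and q lags of y, fitted by lm().
fit_ardl_sketch <- function(y, x, p, q) {
  m  <- max(p, q)                                    # observations lost at the start
  xl <- embed(x, m + 1)[, 1:(p + 1), drop = FALSE]   # x_t, ..., x_{t-p}
  yl <- embed(y, m + 1)                              # y_t, ..., y_{t-m}
  colnames(xl) <- c("X.t", paste0("X.", seq_len(p)))
  ylags <- yl[, 2:(q + 1), drop = FALSE]             # y_{t-1}, ..., y_{t-q}
  colnames(ylags) <- paste0("Y.", seq_len(q))
  lm(y.t ~ ., data = data.frame(y.t = yl[, 1], xl, ylags))
}

# Fit every (p, q) combination and collect the scores.
grid <- expand.grid(p = 1:5, q = 1:5)
grid$AIC  <- NA_real_
grid$BIC  <- NA_real_
grid$MASE <- NA_real_
for (r in seq_len(nrow(grid))) {
  fit <- fit_ardl_sketch(y_sim, x_sim, grid$p[r], grid$q[r])
  grid$AIC[r]  <- AIC(fit)
  grid$BIC[r]  <- BIC(fit)
  # In-sample MASE: model MAE relative to the naive one-step MAE.
  grid$MASE[r] <- mean(abs(residuals(fit))) / mean(abs(diff(y_sim)))
}

# Best row under each criterion; the three need not coincide.
grid[c(which.min(grid$AIC), which.min(grid$BIC), which.min(grid$MASE)), ]
```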

  • Fitting an autoregressive distributed lag model for Temperature with respect to the dependent variable RBO.
ardldlm_t3_54 = ardlDlm(x = as.vector(rbo_temp.ts), y = as.vector(rbo_RBO.ts),p = 5, q =4)
summary(ardldlm_t3_54)
## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.054630 -0.008789  0.003029  0.020337  0.043122 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.109571   0.398960   0.275    0.787
## X.t          0.016450   0.022235   0.740    0.471
## X.1          0.037057   0.027124   1.366    0.192
## X.2         -0.025487   0.025187  -1.012    0.328
## X.3         -0.027172   0.024313  -1.118    0.281
## X.4          0.009329   0.024298   0.384    0.706
## X.5         -0.005712   0.025206  -0.227    0.824
## Y.1          0.347933   0.278641   1.249    0.231
## Y.2          0.273353   0.283548   0.964    0.350
## Y.3          0.073556   0.265753   0.277    0.786
## Y.4          0.088971   0.264610   0.336    0.741
## 
## Residual standard error: 0.03203 on 15 degrees of freedom
## Multiple R-squared:  0.644,  Adjusted R-squared:  0.4066 
## F-statistic: 2.713 on 10 and 15 DF,  p-value: 0.03963

In the autoregressive distributed lag model with p = 5 and q = 4, none of the individual coefficients is significant at the 5% level. The adjusted R-squared is 0.4066, so the model explains 40.66 percent of the variability in RBO. The overall F-test has a p-value of 0.03963, which is less than 0.05, so the model as a whole is statistically significant at the 5% level.

checkresiduals(ardldlm_t3_54$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 14
## 
## data:  Residuals
## LM test = 24.688, df = 14, p-value = 0.03777
shapiro.test(residuals(ardldlm_t3_54$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ardldlm_t3_54$model)
## W = 0.94679, p-value = 0.1949

The residual graphs for the above model are shown in Figure 10:

  • The time series plot of the residuals shows no systematic trend; the residuals appear random.
  • According to the ACF plot, the residuals do not have any significant serial correlation.
  • Since the p-value (0.03777) is less than 0.05, the Breusch-Godfrey test rejects the null hypothesis of no serial correlation at the 5% level of significance, so there is some evidence of serial correlation that the ACF plot does not show.
  • The histogram of the residuals shows no obvious violation of the normality assumption.
  • Since the Shapiro-Wilk p-value (0.1949) is greater than 0.05, we do not have enough evidence to reject H0 (normally distributed residuals). The normality assumption is therefore not violated.

Now checking the multicollinearity issue

vif_ardldlm_t3_54=vif(ardldlm_t3_54$model)
vif_ardldlm_t3_54
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(X.t, 5) L(y.t, 1) L(y.t, 2) 
##  1.790677  2.618062  2.072295  1.896919  1.837423  1.913348  3.564094  3.700611 
## L(y.t, 3) L(y.t, 4) 
##  4.018736  3.957886
vif_ardldlm_t3_54>10
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(X.t, 5) L(y.t, 1) L(y.t, 2) 
##     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE 
## L(y.t, 3) L(y.t, 4) 
##     FALSE     FALSE

According to the VIF values, the above model does not have the multicollinearity problem.
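
For readers unfamiliar with the statistic, the sketch below computes a variance inflation factor by hand as 1 / (1 - R²), where R² comes from regressing one predictor on the others. The data are simulated and the variable names are made up for illustration; this is not the analysis data, and it is not how `vif()` is implemented internally.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.9 * x1 + rnorm(n, sd = 0.3)  # deliberately correlated with x1
x3 <- rnorm(n)                        # unrelated predictor
X  <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing x_j on the other predictors
manual_vif <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  1 / (1 - r2)
})
print(round(manual_vif, 2))
```

The correlated pair (x1, x2) gets a large VIF while x3 stays close to 1, which is the pattern the 10 threshold above is meant to flag.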

Model Fitting - RBO vs Rainfall

Fitting finite distributed lag models

To determine the model’s finite lag length, we build a loop that calculates accuracy metrics such as AIC/BIC and MASE for models with varying lag lengths and selects the model with the lowest values.

for ( i in 1:5){
  model5 = dlm(x = as.vector(rbo_rainfall.ts), y = as.vector(rbo_RBO.ts), q = i )
  cat("q = ", i, "AIC = ", AIC(model5$model), "BIC = ", BIC(model5$model), "MASE =", MASE(model5)$MASE, "\n")
}
## q =  1 AIC =  -100.898 BIC =  -95.29319 MASE = 0.9417954 
## q =  2 AIC =  -96.70956 BIC =  -89.87308 MASE = 0.9993747 
## q =  3 AIC =  -97.19966 BIC =  -89.20643 MASE = 0.9796852 
## q =  4 AIC =  -90.46187 BIC =  -81.39101 MASE = 1.038827 
## q =  5 AIC =  -87.24242 BIC =  -77.17765 MASE = 0.925677

According to the finite distributed lag output, q = 5 has the lowest MASE (0.925677), with AIC = -87.24242 and BIC = -77.17765. AIC and BIC are instead lowest at q = 1 (AIC = -100.898, BIC = -95.29319). Selecting on MASE, we use a lag length of q = 5.
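
Because the three criteria do not always agree, it can help to tabulate the loop output and pick the best lag per criterion programmatically. The sketch below hard-codes the five rows printed above; the object names `fdl` and `best` are illustrative only.

```r
# Values copied from the finite DLM loop output above (RBO vs Rainfall)
fdl <- data.frame(
  q    = 1:5,
  AIC  = c(-100.898, -96.70956, -97.19966, -90.46187, -87.24242),
  BIC  = c(-95.29319, -89.87308, -89.20643, -81.39101, -77.17765),
  MASE = c(0.9417954, 0.9993747, 0.9796852, 1.038827, 0.925677)
)

# For each criterion, report the lag length with the smallest value
best <- sapply(fdl[c("AIC", "BIC", "MASE")], function(col) fdl$q[which.min(col)])
print(best)
```

The same pattern applies to the two-parameter grids later in this section by storing both orders in the data frame.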

  • Fitting a finite DLM with a lag of 5 and doing the diagnostic checking for Rainfall with respect to dependent variable RBO
finite_DLM_22 <- dlm(x = as.vector(rbo_rainfall.ts), y = as.vector(rbo_RBO.ts), q = 5)
summary(finite_DLM_22)
## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.091403 -0.021583 -0.001857  0.009887  0.060819 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.489483   0.100958   4.848 0.000112 ***
## x.t          0.041478   0.019846   2.090 0.050300 .  
## x.1          0.025103   0.020312   1.236 0.231581    
## x.2          0.023798   0.020507   1.160 0.260232    
## x.3          0.001443   0.020531   0.070 0.944703    
## x.4         -0.001944   0.020979  -0.093 0.927131    
## x.5          0.012340   0.021170   0.583 0.566816    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.03887 on 19 degrees of freedom
## Multiple R-squared:  0.3358, Adjusted R-squared:  0.126 
## F-statistic: 1.601 on 6 and 19 DF,  p-value: 0.2013
## 
## AIC and BIC values for the model:
##         AIC       BIC
## 1 -87.24242 -77.17765

The above finite distributed lag model has q=5, and none of the lag weights of the predictor series is statistically significant at the 5% level, although x.t is borderline (p = 0.0503). The adjusted R-squared is 0.126, so the model explains only 12.6 percent of the variability in RBO. The overall F-test has a p-value of 0.2013, which is greater than 0.05, so the model as a whole is not statistically significant.

checkresiduals(finite_DLM_22$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 11.098, df = 10, p-value = 0.35
shapiro.test(residuals(finite_DLM_22$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(finite_DLM_22$model)
## W = 0.90869, p-value = 0.0246

The residual graphs for the above model are shown in Figure 11:

  • The time series plot of the residuals shows no obvious trend; the residuals appear to fluctuate randomly.
  • The ACF plot suggests that some serial correlation may remain in the residuals.
  • Since the p-value is greater than 0.05, the Breusch-Godfrey test does not reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals suggests a violation of the normality assumption.
  • Since the Shapiro-Wilk p-value is < 0.05, we reject the null hypothesis (H0). This implies that the errors are not normally distributed, so the normality assumption is violated.

Now checking the multicollinearity issue

vif_dlm_22 =vif(finite_DLM_22$model)
vif_dlm_22
##      x.t      x.1      x.2      x.3      x.4      x.5 
## 1.079459 1.129580 1.143071 1.130249 1.099396 1.067556
vif_dlm_22 >10
##   x.t   x.1   x.2   x.3   x.4   x.5 
## FALSE FALSE FALSE FALSE FALSE FALSE
  • According to the VIF values, the above model with q=5 does not have a multicollinearity problem.

Fitting polynomial distributed lag models

for(i in 1:5){
        for(j in 1:5){
                model5.1 <- polyDlm(x = as.vector(rbo_rainfall.ts),y = as.vector(rbo_RBO.ts), q = i, k = j, show.beta = FALSE)
                cat("q:",i,"k:",j, "AIC:",AIC(model5.1$model), "BIC:", BIC(model5.1$model),"MASE =", MASE(model5.1)$MASE, "\n")
        }
}
## q: 1 k: 1 AIC: -100.898 BIC: -95.29319 MASE = 0.9417954 
## q: 1 k: 2 AIC: -100.898 BIC: -95.29319 MASE = 0.9417954 
## q: 1 k: 3 AIC: -100.898 BIC: -95.29319 MASE = 0.9417954 
## q: 1 k: 4 AIC: -100.898 BIC: -95.29319 MASE = 0.9417954 
## q: 1 k: 5 AIC: -100.898 BIC: -95.29319 MASE = 0.9417954 
## q: 2 k: 1 AIC: -98.48624 BIC: -93.01706 MASE = 0.9848848 
## q: 2 k: 2 AIC: -96.70956 BIC: -89.87308 MASE = 0.9993747 
## q: 2 k: 3 AIC: -96.70956 BIC: -89.87308 MASE = 0.9993747 
## q: 2 k: 4 AIC: -96.70956 BIC: -89.87308 MASE = 0.9993747 
## q: 2 k: 5 AIC: -96.70956 BIC: -89.87308 MASE = 0.9993747 
## q: 3 k: 1 AIC: -100.8065 BIC: -95.47764 MASE = 0.9539922 
## q: 3 k: 2 AIC: -98.84343 BIC: -92.18241 MASE = 0.9491588 
## q: 3 k: 3 AIC: -97.19966 BIC: -89.20643 MASE = 0.9796852 
## q: 3 k: 4 AIC: -97.19966 BIC: -89.20643 MASE = 0.9796852 
## q: 3 k: 5 AIC: -97.19966 BIC: -89.20643 MASE = 0.9796852 
## q: 4 k: 1 AIC: -96.04237 BIC: -90.85903 MASE = 1.006033 
## q: 4 k: 2 AIC: -94.04271 BIC: -87.56352 MASE = 1.006192 
## q: 4 k: 3 AIC: -92.04625 BIC: -84.27122 MASE = 1.003342 
## q: 4 k: 4 AIC: -90.46187 BIC: -81.39101 MASE = 1.038827 
## q: 4 k: 5 AIC: -90.46187 BIC: -81.39101 MASE = 1.038827 
## q: 5 k: 1 AIC: -93.67105 BIC: -88.63866 MASE = 0.9590236 
## q: 5 k: 2 AIC: -92.46231 BIC: -86.17182 MASE = 0.9264235 
## q: 5 k: 3 AIC: -90.96436 BIC: -83.41578 MASE = 0.9083469 
## q: 5 k: 4 AIC: -89.13162 BIC: -80.32495 MASE = 0.911311 
## q: 5 k: 5 AIC: -87.24242 BIC: -77.17765 MASE = 0.925677

According to the polynomial distributed lag output, q = 5 and k = 3 has the lowest MASE (0.9083469), with AIC = -90.96436 and BIC = -83.41578. AIC is instead lowest at q = 1 (-100.898) and BIC at q = 3, k = 1 (-95.47764). Selecting on MASE, we use q = 5 and k = 3.

  • Fitting a polynomial DLM for Rainfall with respect to dependent variable RBO
poly_DLM_22 <- polyDlm(x = as.vector(rbo_rainfall.ts), y = as.vector(rbo_RBO.ts), q = 5, k = 3)
## Estimates and t-tests for beta coefficients:
##        Estimate Std. Error t value P(>|t|)
## beta.0  0.03820     0.0176  2.1700  0.0416
## beta.1  0.03170     0.0133  2.3800  0.0269
## beta.2  0.01720     0.0118  1.4600  0.1600
## beta.3  0.00324     0.0121  0.2680  0.7910
## beta.4 -0.00126     0.0142 -0.0893  0.9300
## beta.5  0.01240     0.0189  0.6590  0.5170
summary(poly_DLM_22)
## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.096909 -0.022011 -0.000940  0.008207  0.063816 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.4910770  0.0964585   5.091 4.84e-05 ***
## z.t0         0.0382111  0.0176055   2.170   0.0416 *  
## z.t1         0.0005182  0.0350342   0.015   0.9883    
## z.t2        -0.0084447  0.0174155  -0.485   0.6328    
## z.t3         0.0014620  0.0022848   0.640   0.5292    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.03717 on 21 degrees of freedom
## Multiple R-squared:  0.3286, Adjusted R-squared:  0.2007 
## F-statistic:  2.57 on 4 and 21 DF,  p-value: 0.06786

The above polynomial distributed lag model has q=5 and k=3. Among the polynomial terms only z.t0 is significant at the 5% level (p = 0.0416); among the implied lag weights, beta.0 and beta.1 are significant (p = 0.0416 and 0.0269). The adjusted R-squared is 0.2007, so the model explains only 20.07 percent of the variability in RBO. The overall F-test has a p-value of 0.06786, which is greater than 0.05, so the model as a whole is not statistically significant at the 5% level.
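
The two coefficient tables above are linked: under the usual polynomial (Almon) parametrisation, each lag weight is a polynomial in the lag index, beta_s = gamma_0 + gamma_1 s + ... + gamma_k s^k, where the gammas are the z.t coefficients. The sketch below assumes that parametrisation and rebuilds the beta estimates from the printed z.t estimates; the variable names are illustrative.

```r
# z.t0 ... z.t3 estimates copied from the summary above
gamma <- c(0.0382111, 0.0005182, -0.0084447, 0.0014620)
q <- 5                      # lag length
k <- length(gamma) - 1      # polynomial order
lags <- 0:q

# Matrix of lag indices raised to powers 0..k (R evaluates 0^0 as 1)
powers <- outer(lags, 0:k, `^`)

# beta_s = sum_j gamma_j * s^j
beta <- as.vector(powers %*% gamma)
names(beta) <- paste0("beta.", lags)
print(signif(beta, 3))
```

The reconstructed values agree with the printed beta.0 to beta.5 estimates up to rounding, which supports reading the z.t terms as polynomial coefficients rather than as individual lag effects.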

checkresiduals(poly_DLM_22$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 8
## 
## data:  Residuals
## LM test = 8.9817, df = 8, p-value = 0.3438
shapiro.test(residuals(poly_DLM_22$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(poly_DLM_22$model)
## W = 0.91071, p-value = 0.02737

The residual graphs for the above model are shown in Figure 12:

  • The time series plot of the residuals shows no obvious trend; the residuals appear to fluctuate randomly.
  • The ACF plot suggests that some serial correlation may remain in the residuals.
  • Since the p-value is greater than 0.05, the Breusch-Godfrey test does not reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals suggests a violation of the normality assumption.
  • Since the Shapiro-Wilk p-value is < 0.05, we reject the null hypothesis (H0). This implies that the errors are not normally distributed, so the normality assumption is violated.

Now checking the multicollinearity issue

vif_poly_22 =vif(poly_DLM_22$model)
vif_poly_22
##       z.t0       z.t1       z.t2       z.t3 
##   6.747024 209.695463 854.528635 297.455092
vif_poly_22 >10
##  z.t0  z.t1  z.t2  z.t3 
## FALSE  TRUE  TRUE  TRUE
  • According to the VIF values, the above model with q=5 and k=3 is strongly affected by multicollinearity: three of the four polynomial terms have VIFs far above 10.

Fitting Koyck model

  • Fitting a Koyck model for Rainfall with respect to dependent variable RBO
Koyck_model_22 = koyckDlm(x = as.vector(rbo_rainfall.ts) , y = as.vector(rbo_RBO.ts))
summary(Koyck_model_22$model, diagnostics=TRUE)
## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.3665 -0.4155 -0.1142  0.3241  1.6012 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   0.3207     2.4302   0.132    0.896
## Y.1          -6.5147   243.8216  -0.027    0.979
## X.t           2.2101    76.0635   0.029    0.977
## 
## Diagnostic tests:
##                  df1 df2 statistic p-value
## Weak instruments   1  27     0.001   0.977
## Wu-Hausman         1  26     0.360   0.554
## Sargan             0  NA        NA      NA
## 
## Residual standard error: 0.7951 on 27 degrees of freedom
## Multiple R-Squared: -286.5,  Adjusted R-squared: -307.8 
## Wald test: 0.01549 on 2 and 27 DF,  p-value: 0.9846
  • The above Koyck model has no significant terms at the 5% level. The adjusted R-squared is -307.8; a negative value means the model fits worse than a constant-mean model, which can happen because the Koyck model is estimated by instrumental variables rather than ordinary least squares. The Wald test for the whole model has a p-value of 0.9846, which is greater than 0.05, so the model is not statistically significant.
  • The Wu-Hausman test (p-value greater than 0.05) gives no evidence of correlation between the explanatory variable and the error term at the 5% level. The weak instruments test (p-value = 0.977) does not reject the null hypothesis of a weak instrument, so these estimates should be treated with caution.
checkresiduals(Koyck_model_22$model)

shapiro.test(residuals(Koyck_model_22$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(Koyck_model_22$model)
## W = 0.96568, p-value = 0.4287

The residual graphs for the above model are shown in Figure 13:

  • The time series plot of the residuals shows no obvious trend; the residuals appear to fluctuate randomly.
  • The ACF plot shows no significant serial correlation in the residuals.
  • The histogram of the residuals shows no clear violation of the normality assumption.
  • Since the Shapiro-Wilk p-value is > 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_koyck_22=vif(Koyck_model_22$model)
vif_koyck_22
##      Y.1      X.t 
## 5531.807 5531.807
vif_koyck_22>10
##  Y.1  X.t 
## TRUE TRUE
  • According to the VIF values, the above model has a multicollinearity problem.

Fitting autoregressive distributed lag models

Autoregressive distributed lag models are the last model type derived from the time series regression technique. To choose the orders of ARDL(p,q), we build a loop that fits autoregressive distributed lag models for a range of lag lengths (p) and AR orders (q) and calculates accuracy metrics such as AIC/BIC and MASE.

for (i in 1:5){
  for(j in 1:5){
    model_5.2 = ardlDlm(x = as.vector(rbo_rainfall.ts), y = as.vector(rbo_RBO.ts), p = i , q = j)
    cat("p =", i, "q =", j, "AIC =", AIC(model_5.2$model), "BIC =", BIC(model_5.2$model), "MASE =", MASE(model_5.2)$MASE, "\n")
 }
}
## p = 1 q = 1 AIC = -105.2619 BIC = -98.25588 MASE = 0.828275 
## p = 1 q = 2 AIC = -103.3681 BIC = -95.16429 MASE = 0.8543791 
## p = 1 q = 3 AIC = -106.9248 BIC = -97.59935 MASE = 0.8322089 
## p = 1 q = 4 AIC = -102.0678 BIC = -91.70114 MASE = 0.8714349 
## p = 1 q = 5 AIC = -95.80256 BIC = -84.47969 MASE = 0.8152025 
## p = 2 q = 1 AIC = -99.21505 BIC = -91.01127 MASE = 0.9189202 
## p = 2 q = 2 AIC = -101.4552 BIC = -91.88416 MASE = 0.8562257 
## p = 2 q = 3 AIC = -104.9284 BIC = -94.27076 MASE = 0.8328582 
## p = 2 q = 4 AIC = -100.0892 BIC = -88.4267 MASE = 0.8706236 
## p = 2 q = 5 AIC = -93.80342 BIC = -81.22246 MASE = 0.81573 
## p = 3 q = 1 AIC = -102.0287 BIC = -92.70325 MASE = 0.8852654 
## p = 3 q = 2 AIC = -106.4754 BIC = -95.8178 MASE = 0.8316226 
## p = 3 q = 3 AIC = -105.1996 BIC = -93.2098 MASE = 0.8307901 
## p = 3 q = 4 AIC = -99.66585 BIC = -86.70748 MASE = 0.860244 
## p = 3 q = 5 AIC = -93.30294 BIC = -79.46387 MASE = 0.8095313 
## p = 4 q = 1 AIC = -96.40802 BIC = -86.04133 MASE = 0.8956382 
## p = 4 q = 2 AIC = -100.4881 BIC = -88.82555 MASE = 0.8337316 
## p = 4 q = 3 AIC = -100.0049 BIC = -87.04657 MASE = 0.7754451 
## p = 4 q = 4 AIC = -98.96532 BIC = -84.71111 MASE = 0.7942758 
## p = 4 q = 5 AIC = -92.62017 BIC = -77.52301 MASE = 0.7390848 
## p = 5 q = 1 AIC = -93.55318 BIC = -82.23031 MASE = 0.7936346 
## p = 5 q = 2 AIC = -94.0473 BIC = -81.46633 MASE = 0.7842158 
## p = 5 q = 3 AIC = -93.91526 BIC = -80.0762 MASE = 0.7237105 
## p = 5 q = 4 AIC = -92.68035 BIC = -77.58319 MASE = 0.7490465 
## p = 5 q = 5 AIC = -90.68282 BIC = -74.32757 MASE = 0.7469974

According to the autoregressive distributed lag output, p = 5 and q = 3 has the lowest MASE (0.7237105), with AIC = -93.91526 and BIC = -80.0762. AIC is instead lowest at p = 1, q = 3 (-106.9248) and BIC at p = 1, q = 1 (-98.25588). Selecting on MASE, we use (p=5, q=3).

  • Fitting an autoregressive distributed lag model for Rainfall with respect to dependent variable RBO.
ardldlm_t3_53 = ardlDlm(x = as.vector(rbo_rainfall.ts), y = as.vector(rbo_RBO.ts),p = 5, q =3)
summary(ardldlm_t3_53)
## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.072284 -0.008546  0.000512  0.019051  0.039909 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.184797   0.130413   1.417    0.176
## X.t          0.015255   0.019378   0.787    0.443
## X.1          0.002519   0.019466   0.129    0.899
## X.2         -0.005928   0.020127  -0.295    0.772
## X.3         -0.022491   0.019505  -1.153    0.266
## X.4         -0.021936   0.020202  -1.086    0.294
## X.5         -0.001398   0.019784  -0.071    0.945
## Y.1          0.319059   0.246953   1.292    0.215
## Y.2          0.275252   0.244529   1.126    0.277
## Y.3          0.255317   0.233870   1.092    0.291
## 
## Residual standard error: 0.0332 on 16 degrees of freedom
## Multiple R-squared:  0.592,  Adjusted R-squared:  0.3625 
## F-statistic:  2.58 on 9 and 16 DF,  p-value: 0.04715

The above autoregressive distributed lag model has p=5 and q=3, and none of its coefficients is statistically significant at the 5% level. The adjusted R-squared is 0.3625, so the model explains only 36.25 percent of the variability in RBO. The overall F-test has a p-value of 0.04715, which is less than 0.05, so the model as a whole is statistically significant.

checkresiduals(ardldlm_t3_53$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 13
## 
## data:  Residuals
## LM test = 21.011, df = 13, p-value = 0.07272
shapiro.test(residuals(ardldlm_t3_53$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ardldlm_t3_53$model)
## W = 0.93614, p-value = 0.1086

The residual graphs for the above model are shown in Figure 14:

  • The time series plot of the residuals shows no obvious trend; the residuals appear to fluctuate randomly.
  • The ACF plot shows no significant serial correlation in the residuals.
  • Since the p-value is greater than 0.05, the Breusch-Godfrey test does not reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals shows no clear violation of the normality assumption.
  • Since the Shapiro-Wilk p-value is > 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_ardldlm_t3_53=vif(ardldlm_t3_53$model)
vif_ardldlm_t3_53
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(X.t, 5) L(y.t, 1) L(y.t, 2) 
##  1.410980  1.422348  1.509603  1.398588  1.397659  1.278252  2.605984  2.561898 
## L(y.t, 3) 
##  2.897114
vif_ardldlm_t3_53>10
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(X.t, 5) L(y.t, 1) L(y.t, 2) 
##     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE 
## L(y.t, 3) 
##     FALSE

According to the VIF values, the above model does not have a multicollinearity problem.

Model Fitting - RBO vs Radiation

Fitting finite distributed lag models

To determine the model’s finite lag length, we build a loop that calculates accuracy metrics such as AIC/BIC and MASE for models with varying lag lengths and selects the model with the lowest values.

for ( i in 1:5){
  model_6 = dlm(x = as.vector(rbo_radiation.ts), y = as.vector(rbo_RBO.ts), q = i )
  cat("q = ", i, "AIC = ", AIC(model_6$model), "BIC = ", BIC(model_6$model), "MASE =", MASE(model_6)$MASE, "\n")
}
## q =  1 AIC =  -97.71113 BIC =  -92.10634 MASE = 1.063029 
## q =  2 AIC =  -91.50708 BIC =  -84.67061 MASE = 1.174709 
## q =  3 AIC =  -92.38658 BIC =  -84.39335 MASE = 1.128463 
## q =  4 AIC =  -86.43017 BIC =  -77.35931 MASE = 1.199809 
## q =  5 AIC =  -83.32223 BIC =  -73.25746 MASE = 1.071856

According to the finite distributed lag output, lag 1 has the lowest MASE, AIC, and BIC values, which are MASE = 1.063029, AIC = -97.71113, and BIC = -92.10634. As a result, we use a lag length of q = 1.
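
Since MASE drives most of the model choices here, the sketch below shows the standard in-sample definition: the mean absolute error of the model divided by the mean absolute error of the one-step naive forecast. The `mase()` helper and the toy series are illustrative assumptions; the `MASE()` function used in the loops may differ in implementation details.

```r
# Hypothetical helper following the standard MASE definition
mase <- function(actual, fitted) {
  mae_model <- mean(abs(actual - fitted))
  mae_naive <- mean(abs(diff(actual)))   # naive forecast: previous observation
  mae_model / mae_naive
}

# Toy series (not the RBO data), with a constant-mean "model" as the fit
y <- c(10, 12, 11, 13, 12, 14)
mase_mean_model <- mase(y, rep(mean(y), length(y)))
print(mase_mean_model)
```

A value above 1, as for every finite DLM in the output above, means the fitted model has a larger average absolute error than the naive forecast.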

  • Fitting a finite DLM with a lag of 1 and doing the diagnostic checking for Radiation with respect to dependent variable RBO
finite_DLM_33 <- dlm(x = as.vector(rbo_radiation.ts), y = as.vector(rbo_RBO.ts), q = 1)
summary(finite_DLM_33)
## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.072417 -0.030826 -0.001718  0.016681  0.100514 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)  1.13698    0.34038   3.340  0.00246 **
## x.t         -0.04431    0.02234  -1.983  0.05764 . 
## x.1          0.01689    0.02216   0.762  0.45261   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0438 on 27 degrees of freedom
## Multiple R-squared:  0.1277, Adjusted R-squared:  0.06311 
## F-statistic: 1.977 on 2 and 27 DF,  p-value: 0.1581
## 
## AIC and BIC values for the model:
##         AIC       BIC
## 1 -97.71113 -92.10634

The above finite distributed lag model has q=1, and none of the lag weights of the predictor series is statistically significant at the 5% level. The adjusted R-squared is 0.06311, so the model explains only 6.311 percent of the variability in RBO. The overall F-test has a p-value of 0.1581, which is greater than 0.05, so the model as a whole is not statistically significant.

checkresiduals(finite_DLM_33$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 6
## 
## data:  Residuals
## LM test = 13.502, df = 6, p-value = 0.03573
shapiro.test(residuals(finite_DLM_33$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(finite_DLM_33$model)
## W = 0.96446, p-value = 0.4006

The residual graphs for the above model are shown in Figure 15:

  • The time series plot of the residuals shows no obvious trend; the residuals appear to fluctuate randomly.
  • The ACF plot indicates that significant serial correlation remains in the residuals.
  • Since the p-value is less than 0.05, the Breusch-Godfrey test rejects the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals shows no clear violation of the normality assumption.
  • Since the Shapiro-Wilk p-value is > 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_dlm_33 =vif(finite_DLM_33$model)
vif_dlm_33
##      x.t      x.1 
## 1.254578 1.254578
vif_dlm_33 >10
##   x.t   x.1 
## FALSE FALSE
  • According to the VIF values, the above model with q=1 does not have a multicollinearity problem.

Fitting polynomial distributed lag models

for(i in 1:10){
        for(j in 1:5){
                model_6.1 <- polyDlm(x = as.vector(rbo_radiation.ts),y = as.vector(rbo_RBO.ts), q = i, k = j, show.beta = FALSE)
                cat("q:",i,"k:",j, "AIC:",AIC(model_6.1$model), "BIC:", BIC(model_6.1$model),"MASE =", MASE(model_6.1)$MASE, "\n")
        }
}
## q: 1 k: 1 AIC: -97.71113 BIC: -92.10634 MASE = 1.063029 
## q: 1 k: 2 AIC: -97.71113 BIC: -92.10634 MASE = 1.063029 
## q: 1 k: 3 AIC: -97.71113 BIC: -92.10634 MASE = 1.063029 
## q: 1 k: 4 AIC: -97.71113 BIC: -92.10634 MASE = 1.063029 
## q: 1 k: 5 AIC: -97.71113 BIC: -92.10634 MASE = 1.063029 
## q: 2 k: 1 AIC: -92.9251 BIC: -87.45592 MASE = 1.152551 
## q: 2 k: 2 AIC: -91.50708 BIC: -84.67061 MASE = 1.174709 
## q: 2 k: 3 AIC: -91.50708 BIC: -84.67061 MASE = 1.174709 
## q: 2 k: 4 AIC: -91.50708 BIC: -84.67061 MASE = 1.174709 
## q: 2 k: 5 AIC: -91.50708 BIC: -84.67061 MASE = 1.174709 
## q: 3 k: 1 AIC: -94.67865 BIC: -89.34983 MASE = 1.135538 
## q: 3 k: 2 AIC: -93.41298 BIC: -86.75196 MASE = 1.140154 
## q: 3 k: 3 AIC: -92.38658 BIC: -84.39335 MASE = 1.128463 
## q: 3 k: 4 AIC: -92.38658 BIC: -84.39335 MASE = 1.128463 
## q: 3 k: 5 AIC: -92.38658 BIC: -84.39335 MASE = 1.128463 
## q: 4 k: 1 AIC: -91.39597 BIC: -86.21262 MASE = 1.218236 
## q: 4 k: 2 AIC: -89.71544 BIC: -83.23625 MASE = 1.203645 
## q: 4 k: 3 AIC: -88.32214 BIC: -80.54712 MASE = 1.208186 
## q: 4 k: 4 AIC: -86.43017 BIC: -77.35931 MASE = 1.199809 
## q: 4 k: 5 AIC: -86.43017 BIC: -77.35931 MASE = 1.199809 
## q: 5 k: 1 AIC: -89.32415 BIC: -84.29176 MASE = 1.095777 
## q: 5 k: 2 AIC: -88.46836 BIC: -82.17788 MASE = 1.085079 
## q: 5 k: 3 AIC: -86.73669 BIC: -79.18811 MASE = 1.080463 
## q: 5 k: 4 AIC: -85.1591 BIC: -76.35242 MASE = 1.090996 
## q: 5 k: 5 AIC: -83.32223 BIC: -73.25746 MASE = 1.071856 
## q: 6 k: 1 AIC: -87.29653 BIC: -82.42103 MASE = 0.9929254 
## q: 6 k: 2 AIC: -87.48527 BIC: -81.39089 MASE = 0.9572973 
## q: 6 k: 3 AIC: -86.68188 BIC: -79.36862 MASE = 0.9534462 
## q: 6 k: 4 AIC: -84.71998 BIC: -76.18785 MASE = 0.949158 
## q: 6 k: 5 AIC: -83.85092 BIC: -74.09991 MASE = 0.9501105 
## q: 7 k: 1 AIC: -85.26721 BIC: -80.55499 MASE = 0.9474525 
## q: 7 k: 2 AIC: -86.62398 BIC: -80.73372 MASE = 0.8494332 
## q: 7 k: 3 AIC: -86.02833 BIC: -78.96001 MASE = 0.8732318 
## q: 7 k: 4 AIC: -84.07231 BIC: -75.82593 MASE = 0.8682161 
## q: 7 k: 5 AIC: -82.13156 BIC: -72.70713 MASE = 0.8714299 
## q: 8 k: 1 AIC: -86.99572 BIC: -82.45375 MASE = 0.8335042 
## q: 8 k: 2 AIC: -89.0863 BIC: -83.40883 MASE = 0.7259512 
## q: 8 k: 3 AIC: -88.53663 BIC: -81.72366 MASE = 0.723367 
## q: 8 k: 4 AIC: -86.53817 BIC: -78.58971 MASE = 0.7230053 
## q: 8 k: 5 AIC: -84.77659 BIC: -75.69263 MASE = 0.7167434 
## q: 9 k: 1 AIC: -90.03375 BIC: -85.66958 MASE = 0.7506991 
## q: 9 k: 2 AIC: -89.93849 BIC: -84.48328 MASE = 0.6974314 
## q: 9 k: 3 AIC: -88.29421 BIC: -81.74795 MASE = 0.6854902 
## q: 9 k: 4 AIC: -86.6218 BIC: -78.98451 MASE = 0.6712658 
## q: 9 k: 5 AIC: -84.83764 BIC: -76.1093 MASE = 0.6685971 
## q: 10 k: 1 AIC: -89.46004 BIC: -85.28196 MASE = 0.7081733 
## q: 10 k: 2 AIC: -88.15084 BIC: -82.92823 MASE = 0.6711964 
## q: 10 k: 3 AIC: -86.83238 BIC: -80.56525 MASE = 0.6623566 
## q: 10 k: 4 AIC: -85.0392 BIC: -77.72754 MASE = 0.6549231 
## q: 10 k: 5 AIC: -83.62597 BIC: -75.26979 MASE = 0.6654978

According to the polynomial distributed lag output, q = 10 and k = 4 has the lowest MASE (0.6549231), with AIC = -85.0392 and BIC = -77.72754. AIC and BIC are instead lowest at q = 1 (AIC = -97.71113, BIC = -92.10634). Selecting on MASE, we use q = 10 and k = 4.

  • Fitting a polynomial DLM for Radiation with respect to dependent variable RBO
poly_DLM_33 <- polyDlm(x = as.vector(rbo_radiation.ts), y = as.vector(rbo_RBO.ts), q = 10, k = 4)
## Estimates and t-tests for beta coefficients:
##          Estimate Std. Error t value P(>|t|)
## beta.0  -0.008810    0.01820 -0.4850   0.637
## beta.1  -0.005100    0.00804 -0.6340   0.539
## beta.2   0.000407    0.00923  0.0441   0.966
## beta.3   0.005250    0.00622  0.8440   0.417
## beta.4   0.007910    0.00598  1.3200   0.212
## beta.5   0.007820    0.00740  1.0600   0.313
## beta.6   0.005360    0.00650  0.8250   0.427
## beta.7   0.001850    0.00629  0.2940   0.774
## beta.8  -0.000437    0.00948 -0.0461   0.964
## beta.9   0.001730    0.01020  0.1700   0.868
## beta.10  0.012500    0.01670  0.7480   0.470
summary(poly_DLM_33)
## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.044041 -0.009137 -0.001172  0.023157  0.033911 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  2.999e-01  5.086e-01   0.590    0.564
## z.t0        -8.813e-03  1.816e-02  -0.485    0.634
## z.t1         1.754e-03  3.199e-02   0.055    0.957
## z.t2         2.565e-03  1.382e-02   0.186    0.855
## z.t3        -6.475e-04  2.094e-03  -0.309    0.761
## z.t4         3.947e-05  1.024e-04   0.385    0.705
## 
## Residual standard error: 0.02708 on 15 degrees of freedom
## Multiple R-squared:  0.1888, Adjusted R-squared:  -0.08155 
## F-statistic: 0.6984 on 5 and 15 DF,  p-value: 0.633

The above polynomial distributed lag model has q=10 and k=4, and none of its terms is statistically significant at the 5% level. The adjusted R-squared is -0.08155; a negative value means that, after penalising for the number of terms, the model explains essentially none of the variability in RBO. The overall F-test has a p-value of 0.633, which is greater than 0.05, so the model as a whole is not statistically significant.
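
A negative adjusted R-squared follows directly from the standard adjustment formula when R-squared is small relative to the number of estimated terms. The sketch below applies that formula to the figures printed above; `adj_r2()` is an illustrative helper, and n = 21 is inferred from the 15 residual degrees of freedom plus 6 estimated coefficients.

```r
# Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1),
# where n is the number of observations and k the number of slope terms
adj_r2 <- function(r2, n, k) {
  1 - (1 - r2) * (n - 1) / (n - k - 1)
}

# Polynomial DLM above: R^2 = 0.1888, n = 21, k = 5 (z.t0 ... z.t4)
adj_poly <- adj_r2(0.1888, n = 21, k = 5)
print(round(adj_poly, 4))
```

With only 21 usable observations after dropping 10 lags, the penalty factor (n - 1) / (n - k - 1) = 20/15 is large enough to push the adjusted value below zero.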

checkresiduals(poly_DLM_33$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 9
## 
## data:  Residuals
## LM test = 15.867, df = 9, p-value = 0.06972
shapiro.test(residuals(poly_DLM_33$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(poly_DLM_33$model)
## W = 0.93935, p-value = 0.2116

The residual graphs for the above model are shown in Figure 16:

  • The time series plot of the residuals shows no obvious trend; the residuals appear to fluctuate randomly.
  • The ACF plot shows no significant serial correlation in the residuals.
  • Since the p-value is greater than 0.05, the Breusch-Godfrey test does not reject the null hypothesis of no serial correlation at the 5% level of significance.
  • The histogram of the residuals shows no clear violation of the normality assumption.
  • Since the Shapiro-Wilk p-value is > 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_poly_33 =vif(poly_DLM_33$model)
vif_poly_33
##        z.t0        z.t1        z.t2        z.t3        z.t4 
##    40.28816  4121.60363 43446.90176 67250.31476 12038.80012
vif_poly_33 >10
## z.t0 z.t1 z.t2 z.t3 z.t4 
## TRUE TRUE TRUE TRUE TRUE
  • According to the VIF values, the above model with q=10 and k=4 has a multicollinearity problem.

Fitting Koyck model

  • Fitting a Koyck model for Radiation with respect to dependent variable RBO
Koyck_model_33 = koyckDlm(x = as.vector(rbo_radiation.ts) , y = as.vector(rbo_RBO.ts))
summary(Koyck_model_33$model, diagnostics=TRUE)
## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.082255 -0.017008 -0.001036  0.021424  0.106984 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept) -0.48011    0.94819  -0.506   0.6167   
## Y.1          0.69801    0.24502   2.849   0.0083 **
## X.t          0.04812    0.05661   0.850   0.4028   
## 
## Diagnostic tests:
##                  df1 df2 statistic p-value  
## Weak instruments   1  27     4.942  0.0348 *
## Wu-Hausman         1  26     2.765  0.1084  
## Sargan             0  NA        NA      NA  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0467 on 27 degrees of freedom
## Multiple R-Squared: 0.008467,    Adjusted R-squared: -0.06498 
## Wald test: 4.731 on 2 and 27 DF,  p-value: 0.01732
  • In the above Koyck model, the lagged response Y.1 is significant at the 5% level (p = 0.0083), while X.t is not (p = 0.4028). The adjusted R-squared is -0.06498; a negative value means the model fits no better than a constant-mean model, which can happen because the Koyck model is estimated by instrumental variables. The Wald test for the whole model has a p-value of 0.01732, which is less than 0.05, so the model as a whole is statistically significant.
  • The Wu-Hausman test (p-value greater than 0.05) gives no evidence of correlation between the explanatory variable and the error term at the 5% level. The weak instruments test (p-value = 0.0348) rejects the null hypothesis of a weak instrument at the 5% level.
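
To read the Koyck coefficients, recall that the model assumes geometrically declining lag weights, beta_s = beta * phi^s, where phi is the coefficient on Y.1 and beta the coefficient on X.t. The sketch below plugs in the estimates printed above; it is an illustration only, and since X.t is not significant here the implied weights are very imprecise.

```r
# Estimates copied from the Koyck summary above (RBO vs Radiation)
phi  <- 0.69801   # coefficient on Y.1
beta <- 0.04812   # coefficient on X.t

# Implied weights on X at lags 0..5: beta * phi^s
s <- 0:5
lag_weights <- beta * phi^s
names(lag_weights) <- paste0("lag.", s)
print(round(lag_weights, 4))

# Long-run multiplier: sum of all weights, beta / (1 - phi), valid for |phi| < 1
long_run <- beta / (1 - phi)
print(round(long_run, 4))
```

The same formulas make no sense for the Rainfall Koyck fit earlier, where the estimated phi is -6.5147 and lies far outside the (-1, 1) range.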
checkresiduals(Koyck_model_33$model)

shapiro.test(residuals(Koyck_model_33$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(Koyck_model_33$model)
## W = 0.96083, p-value = 0.3253

The residual graphs for the above model are shown in Figure 17:

  • The time series plot of the residuals shows no obvious trend; the residuals appear to fluctuate randomly.
  • The ACF plot shows no significant serial correlation in the residuals.
  • The histogram of the residuals shows no clear violation of the normality assumption.
  • Since the Shapiro-Wilk p-value is > 0.05, we do not have enough evidence to reject H0. This implies that the normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_koyck_33=vif(Koyck_model_33$model)
vif_koyck_33
##      Y.1      X.t 
## 1.619594 1.619594
vif_koyck_33>10
##   Y.1   X.t 
## FALSE FALSE
  • According to the VIF values, the above model does not have a multicollinearity problem.

Fitting autoregressive distributed lag models

Autoregressive distributed lag models are the last model type derived from the time series regression technique. To choose the orders of ARDL(p,q), we build a loop that fits autoregressive distributed lag models for a range of lag lengths (p) and AR orders (q) and calculates accuracy metrics such as AIC/BIC and MASE.

for (i in 1:5){
  for(j in 1:5){
    model_6.2 = ardlDlm(x = as.vector(rbo_radiation.ts), y = as.vector(rbo_RBO.ts), p = i , q = j)
    cat("p =", i, "q =", j, "AIC =", AIC(model_6.2$model), "BIC =", BIC(model_6.2$model), "MASE =", MASE(model_6.2)$MASE, "\n")
 }
}
## p = 1 q = 1 AIC = -107.5622 BIC = -100.5562 MASE = 0.8390648 
## p = 1 q = 2 AIC = -106.904 BIC = -98.70018 MASE = 0.8249931 
## p = 1 q = 3 AIC = -110.6338 BIC = -101.3084 MASE = 0.8202653 
## p = 1 q = 4 AIC = -105.7923 BIC = -95.42561 MASE = 0.8192118 
## p = 1 q = 5 AIC = -100.2033 BIC = -88.88038 MASE = 0.7627672 
## p = 2 q = 1 AIC = -100.8085 BIC = -92.60473 MASE = 0.9205412 
## p = 2 q = 2 AIC = -106.3664 BIC = -96.79529 MASE = 0.765347 
## p = 2 q = 3 AIC = -109.7302 BIC = -99.07252 MASE = 0.7689908 
## p = 2 q = 4 AIC = -104.7047 BIC = -93.04219 MASE = 0.7822872 
## p = 2 q = 5 AIC = -98.98255 BIC = -86.40158 MASE = 0.7268185 
## p = 3 q = 1 AIC = -104.0823 BIC = -94.75685 MASE = 0.8867923 
## p = 3 q = 2 AIC = -109.5259 BIC = -98.8683 MASE = 0.756036 
## p = 3 q = 3 AIC = -108.3733 BIC = -96.38349 MASE = 0.7495842 
## p = 3 q = 4 AIC = -102.9405 BIC = -89.98212 MASE = 0.7620013 
## p = 3 q = 5 AIC = -97.36077 BIC = -83.5217 MASE = 0.7009101 
## p = 4 q = 1 AIC = -99.86384 BIC = -89.49714 MASE = 0.8800549 
## p = 4 q = 2 AIC = -103.3431 BIC = -91.6806 MASE = 0.7906658 
## p = 4 q = 3 AIC = -101.9438 BIC = -88.98546 MASE = 0.7820422 
## p = 4 q = 4 AIC = -100.979 BIC = -86.72481 MASE = 0.7706287 
## p = 4 q = 5 AIC = -95.51106 BIC = -80.4139 MASE = 0.7052516 
## p = 5 q = 1 AIC = -96.72745 BIC = -85.40458 MASE = 0.7913984 
## p = 5 q = 2 AIC = -96.91278 BIC = -84.33182 MASE = 0.7549241 
## p = 5 q = 3 AIC = -95.91334 BIC = -82.07428 MASE = 0.7375755 
## p = 5 q = 4 AIC = -94.72831 BIC = -79.63116 MASE = 0.7302406 
## p = 5 q = 5 AIC = -93.6481 BIC = -77.29285 MASE = 0.7145698

According to the output of the autoregressive distributed lag models, the lowest MASE (0.7009101) is obtained at p = 3, q = 5, where AIC = -97.36077 and BIC = -83.5217. AIC and BIC on their own are lowest at p = 1, q = 3, but since the models are compared on MASE we choose the lag orders (p=3,q=5).
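The grid printed above can be tabulated to check which (p, q) pair minimises each criterion. This is a small base R sketch; the numbers are copied from the loop output for the radiation models, and the object names (`grid`, `best_mase`, and so on) are chosen here for illustration.

```r
# ARDL grid for RBO vs radiation, values copied from the loop output above.
grid <- data.frame(
  p = rep(1:5, each = 5),
  q = rep(1:5, times = 5),
  AIC = c(-107.5622, -106.904, -110.6338, -105.7923, -100.2033,
          -100.8085, -106.3664, -109.7302, -104.7047, -98.98255,
          -104.0823, -109.5259, -108.3733, -102.9405, -97.36077,
          -99.86384, -103.3431, -101.9438, -100.979, -95.51106,
          -96.72745, -96.91278, -95.91334, -94.72831, -93.6481),
  BIC = c(-100.5562, -98.70018, -101.3084, -95.42561, -88.88038,
          -92.60473, -96.79529, -99.07252, -93.04219, -86.40158,
          -94.75685, -98.8683, -96.38349, -89.98212, -83.5217,
          -89.49714, -91.6806, -88.98546, -86.72481, -80.4139,
          -85.40458, -84.33182, -82.07428, -79.63116, -77.29285),
  MASE = c(0.8390648, 0.8249931, 0.8202653, 0.8192118, 0.7627672,
           0.9205412, 0.765347, 0.7689908, 0.7822872, 0.7268185,
           0.8867923, 0.756036, 0.7495842, 0.7620013, 0.7009101,
           0.8800549, 0.7906658, 0.7820422, 0.7706287, 0.7052516,
           0.7913984, 0.7549241, 0.7375755, 0.7302406, 0.7145698)
)

# Row that minimises each criterion.
best_mase <- grid[which.min(grid$MASE), ]
best_aic  <- grid[which.min(grid$AIC), ]
best_bic  <- grid[which.min(grid$BIC), ]
print(rbind(MASE = best_mase, AIC = best_aic, BIC = best_bic))
```

The criteria do not agree here: MASE points to (p = 3, q = 5), while AIC and BIC both point to (p = 1, q = 3), so the choice of criterion should be stated explicitly.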

  • Fitting an autoregressive distributed lag model for Radiation with RBO as the dependent variable.
ardldlm_t3_35_radiation = ardlDlm(x = as.vector(rbo_radiation.ts), y = as.vector(rbo_RBO.ts),p = 3, q =5)
summary(ardldlm_t3_35_radiation)
## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.061056 -0.008873  0.004947  0.012063  0.040060 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -0.072293   0.454349  -0.159   0.8756  
## X.t         -0.028858   0.017592  -1.640   0.1204  
## X.1          0.027037   0.019567   1.382   0.1860  
## X.2          0.008858   0.020033   0.442   0.6643  
## X.3          0.009100   0.018794   0.484   0.6348  
## Y.1          0.460864   0.243665   1.891   0.0768 .
## Y.2          0.254400   0.246138   1.034   0.3167  
## Y.3          0.125966   0.221058   0.570   0.5767  
## Y.4         -0.193158   0.198465  -0.973   0.3449  
## Y.5          0.123607   0.195183   0.633   0.5355  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.03107 on 16 degrees of freedom
## Multiple R-squared:  0.6427, Adjusted R-squared:  0.4416 
## F-statistic: 3.197 on 9 and 16 DF,  p-value: 0.02061

The autoregressive distributed lag model above has p=3 and q=5, and none of its coefficients is significant at the 5% level. The adjusted R-squared is 0.4416, so the model explains only about 44.16 percent of the variability in RBO. The overall F-test has a p-value of 0.02061, which is below 0.05, so the model as a whole is statistically significant.

checkresiduals(ardldlm_t3_35_radiation$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 13
## 
## data:  Residuals
## LM test = 23.082, df = 13, p-value = 0.0407
shapiro.test(residuals(ardldlm_t3_35_radiation$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ardldlm_t3_35_radiation$model)
## W = 0.94737, p-value = 0.2011

The residual graphs for the above model are shown in Figure 18:

  • The time series plot of the residuals shows no clear trend; the residuals appear to fluctuate randomly.
  • The ACF plot shows no significant serial correlation in the residuals.
  • However, the Breusch-Godfrey p-value (0.0407) is less than 0.05, so the test indicates serial correlation in the residuals at the 5% level of significance.
  • The histogram of the residuals shows no clear departure from normality.
  • The Shapiro-Wilk p-value (0.2011) is greater than 0.05, so we do not have enough evidence to reject H0. The normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_ardldlm_t3_35_radiation=vif(ardldlm_t3_35_radiation$model)
vif_ardldlm_t3_35_radiation
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(y.t, 1) L(y.t, 2) L(y.t, 3) L(y.t, 4) 
##  1.528249  1.902602  1.988364  1.739890  2.896543  2.963557  2.955170  2.366200 
## L(y.t, 5) 
##  2.244788
vif_ardldlm_t3_35_radiation>10
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(y.t, 1) L(y.t, 2) L(y.t, 3) L(y.t, 4) 
##     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE 
## L(y.t, 5) 
##     FALSE

According to the VIF values (all below 10), the above model does not have a multicollinearity problem.

Model Fitting - RBO vs Humidity

Fitting finite distributed lag models

To determine the model’s finite lag length, we build a loop that reports accuracy measures such as AIC, BIC, and MASE for models with varying lag lengths, so that the model with the lowest values can be chosen.

for ( i in 1:5){
  model_7 = dlm(x = as.vector(rbo_humidity.ts), y = as.vector(rbo_RBO.ts), q = i )
  cat("q = ", i, "AIC = ", AIC(model_7$model), "BIC = ", BIC(model_7$model), "MASE =", MASE(model_7)$MASE, "\n")
}
## q =  1 AIC =  -94.56619 BIC =  -88.9614 MASE = 1.101099 
## q =  2 AIC =  -88.3792 BIC =  -81.54272 MASE = 1.235884 
## q =  3 AIC =  -89.1954 BIC =  -81.20218 MASE = 1.171117 
## q =  4 AIC =  -82.80086 BIC =  -73.73 MASE = 1.262425 
## q =  5 AIC =  -79.4041 BIC =  -69.33932 MASE = 1.149637

According to the output of the finite distributed lag models, lag 1 has the lowest MASE, AIC, and BIC (MASE = 1.101099, AIC = -94.56619, BIC = -88.9614). As a result, we choose a lag length of q=1.
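As a quick consistency check on the criteria, AIC and BIC differ only in their penalty term, so one can be recovered from the other. The sketch below assumes 30 usable observations for the q = 1 fit (31 annual values minus one lost to the lag) and counts four estimated parameters (intercept, two lag weights, and the error variance); both counts are assumptions made for this illustration.

```r
# AIC = -2*logLik + 2*k and BIC = -2*logLik + k*log(n),
# hence BIC = AIC + k * (log(n) - 2).
n   <- 30          # assumed usable observations for q = 1
k   <- 4           # assumed parameter count: 3 coefficients + error variance
aic <- -94.56619   # AIC reported for q = 1

bic_from_aic <- aic + k * (log(n) - 2)
print(bic_from_aic)   # should be close to the reported BIC of -88.9614
```

Because log(n) exceeds 2 for any n of at least 8, BIC penalises each extra lag more heavily than AIC does.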

  • Fitting a finite DLM with a lag of 1 and carrying out the diagnostic checks for Humidity with RBO as the dependent variable
finite_DLM_44 <- dlm(x = as.vector(rbo_humidity.ts), y = as.vector(rbo_RBO.ts), q = 1)
summary(finite_DLM_44)
## 
## Call:
## lm(formula = model.formula, data = design)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.083453 -0.029388 -0.006143  0.029204  0.101854 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  1.887511   1.343538   1.405    0.171
## x.t         -0.009241   0.010967  -0.843    0.407
## x.1         -0.002923   0.010884  -0.269    0.790
## 
## Residual standard error: 0.04616 on 27 degrees of freedom
## Multiple R-squared:  0.03131,    Adjusted R-squared:  -0.04044 
## F-statistic: 0.4364 on 2 and 27 DF,  p-value: 0.6509
## 
## AIC and BIC values for the model:
##         AIC      BIC
## 1 -94.56619 -88.9614

The finite distributed lag model above has q=1, and neither lag weight of the predictor series is statistically significant at the 5% level. The adjusted R-squared is -0.04044; a negative value means the model explains essentially none of the variability in RBO. The overall F-test has a p-value of 0.6509, which is greater than 0.05, so the model as a whole is not statistically significant.
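A negative adjusted R-squared follows directly from the degrees-of-freedom correction. The sketch below applies the standard formula to the figures in the summary above (multiple R-squared 0.03131, two predictors, 27 residual degrees of freedom, hence 30 observations); `adj_r2` is a helper name introduced here for illustration.

```r
# Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - k - 1),
# where n is the number of observations and k the number of predictors.
adj_r2 <- function(r2, n, k) 1 - (1 - r2) * (n - 1) / (n - k - 1)

value <- adj_r2(r2 = 0.03131, n = 30, k = 2)
print(value)   # about -0.0404, in line with the reported -0.04044
```

Whenever the multiple R-squared is smaller than k / (n - 1), the penalty outweighs the fit and the adjusted value drops below zero.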

checkresiduals(finite_DLM_44$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 6
## 
## data:  Residuals
## LM test = 13.056, df = 6, p-value = 0.04215
shapiro.test(residuals(finite_DLM_44$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(finite_DLM_44$model)
## W = 0.96919, p-value = 0.5173

The residual graphs for the above model are shown in Figure 19:

  • The time series plot of the residuals shows no clear trend; the residuals appear to fluctuate randomly.
  • The ACF plot has one significant lag, indicating some remaining autocorrelation in the residuals.
  • The Breusch-Godfrey p-value (0.04215) is less than 0.05, so the test indicates serial correlation in the residuals at the 5% level of significance.
  • The histogram of the residuals shows no clear departure from normality.
  • The Shapiro-Wilk p-value (0.5173) is greater than 0.05, so we do not have enough evidence to reject H0. The normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_dlm_44 =vif(finite_DLM_44$model)
vif_dlm_44
##      x.t      x.1 
## 1.024419 1.024419
vif_dlm_44 >10
##   x.t   x.1 
## FALSE FALSE
  • According to the VIF values, the above model with q=1 does not have a multicollinearity problem.

Fitting polynomial distributed lag models

for(i in 1:5){
        for(j in 1:5){
                model_7.1 <- polyDlm(x = as.vector(rbo_humidity.ts),y = as.vector(rbo_RBO.ts), q = i, k = j, show.beta = FALSE)
                cat("q:",i,"k:",j, "AIC:",AIC(model_7.1$model), "BIC:", BIC(model_7.1$model),"MASE =", MASE(model_7.1)$MASE, "\n")
        }
}
## q: 1 k: 1 AIC: -94.56619 BIC: -88.9614 MASE = 1.101099 
## q: 1 k: 2 AIC: -94.56619 BIC: -88.9614 MASE = 1.101099 
## q: 1 k: 3 AIC: -94.56619 BIC: -88.9614 MASE = 1.101099 
## q: 1 k: 4 AIC: -94.56619 BIC: -88.9614 MASE = 1.101099 
## q: 1 k: 5 AIC: -94.56619 BIC: -88.9614 MASE = 1.101099 
## q: 2 k: 1 AIC: -90.23028 BIC: -84.7611 MASE = 1.228448 
## q: 2 k: 2 AIC: -88.3792 BIC: -81.54272 MASE = 1.235884 
## q: 2 k: 3 AIC: -88.3792 BIC: -81.54272 MASE = 1.235884 
## q: 2 k: 4 AIC: -88.3792 BIC: -81.54272 MASE = 1.235884 
## q: 2 k: 5 AIC: -88.3792 BIC: -81.54272 MASE = 1.235884 
## q: 3 k: 1 AIC: -92.19434 BIC: -86.86552 MASE = 1.23778 
## q: 3 k: 2 AIC: -90.71527 BIC: -84.05425 MASE = 1.205438 
## q: 3 k: 3 AIC: -89.1954 BIC: -81.20218 MASE = 1.171117 
## q: 3 k: 4 AIC: -89.1954 BIC: -81.20218 MASE = 1.171117 
## q: 3 k: 5 AIC: -89.1954 BIC: -81.20218 MASE = 1.171117 
## q: 4 k: 1 AIC: -88.09829 BIC: -82.91494 MASE = 1.288443 
## q: 4 k: 2 AIC: -86.21246 BIC: -79.73328 MASE = 1.265024 
## q: 4 k: 3 AIC: -84.26135 BIC: -76.48633 MASE = 1.26654 
## q: 4 k: 4 AIC: -82.80086 BIC: -73.73 MASE = 1.262425 
## q: 4 k: 5 AIC: -82.80086 BIC: -73.73 MASE = 1.262425 
## q: 5 k: 1 AIC: -85.442 BIC: -80.40961 MASE = 1.202637 
## q: 5 k: 2 AIC: -83.611 BIC: -77.32052 MASE = 1.218437 
## q: 5 k: 3 AIC: -82.05417 BIC: -74.50559 MASE = 1.197596 
## q: 5 k: 4 AIC: -80.22181 BIC: -71.41514 MASE = 1.188349 
## q: 5 k: 5 AIC: -79.4041 BIC: -69.33932 MASE = 1.149637

According to the output of the polynomial distributed lag models, q=1 gives the lowest MASE, AIC, and BIC (MASE = 1.101099, AIC = -94.56619, BIC = -88.9614). With q=1 the results are identical for every k, so we choose the simplest specification (q=1, k=1).

  • Fitting a polynomial DLM for Humidity with RBO as the dependent variable
poly_DLM_44 <- polyDlm(x = as.vector(rbo_humidity.ts), y = as.vector(rbo_RBO.ts), q = 1, k = 1)
## Estimates and t-tests for beta coefficients:
##        Estimate Std. Error t value P(>|t|)
## beta.0 -0.00924     0.0110  -0.843   0.406
## beta.1 -0.00292     0.0109  -0.269   0.790
summary(poly_DLM_44)
## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.083453 -0.029388 -0.006143  0.029204  0.101854 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  1.887511   1.343538   1.405    0.171
## z.t0        -0.009241   0.010967  -0.843    0.407
## z.t1         0.006318   0.016601   0.381    0.707
## 
## Residual standard error: 0.04616 on 27 degrees of freedom
## Multiple R-squared:  0.03131,    Adjusted R-squared:  -0.04044 
## F-statistic: 0.4364 on 2 and 27 DF,  p-value: 0.6509

The polynomial distributed lag model above has q=1 and k=1, and no term is significant at the 5% level. The adjusted R-squared is -0.04044; a negative value means the model explains essentially none of the variability in RBO. The overall F-test has a p-value of 0.6509, which is greater than 0.05, so the model as a whole is not statistically significant.

checkresiduals(poly_DLM_44$model)

## 
##  Breusch-Godfrey test for serial correlation of order up to 6
## 
## data:  Residuals
## LM test = 13.056, df = 6, p-value = 0.04215
shapiro.test(residuals(poly_DLM_44$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(poly_DLM_44$model)
## W = 0.96919, p-value = 0.5173

The residual graphs for the above model are shown in Figure 20:

  • The time series plot of the residuals shows no clear trend; the residuals appear to fluctuate randomly.
  • The ACF plot has one significant lag, indicating some remaining autocorrelation in the residuals.
  • The Breusch-Godfrey p-value (0.04215) is less than 0.05, so the test indicates serial correlation in the residuals at the 5% level of significance.
  • The histogram of the residuals shows no clear departure from normality.
  • The Shapiro-Wilk p-value (0.5173) is greater than 0.05, so we do not have enough evidence to reject H0. The normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_poly_44 =vif(poly_DLM_44$model)
vif_poly_44
##    z.t0    z.t1 
## 2.38332 2.38332
vif_poly_44 >10
##  z.t0  z.t1 
## FALSE FALSE
  • According to the VIF values, the above model with q=1 and k=1 does not have a multicollinearity problem.

Fitting Koyck model

  • Fitting a Koyck model for Humidity with RBO as the dependent variable
Koyck_model_44 = koyckDlm(x = as.vector(rbo_humidity.ts) , y = as.vector(rbo_RBO.ts))
summary(Koyck_model_44$model, diagnostics=TRUE)
## 
## Call:
## "Y ~ (Intercept) + Y.1 + X.t"
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.080897 -0.021103 -0.004676  0.022673  0.111041 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -1.16679    8.04941  -0.145   0.8858  
## Y.1          0.62503    0.34753   1.798   0.0833 .
## X.t          0.01525    0.08274   0.184   0.8551  
## 
## Diagnostic tests:
##                  df1 df2 statistic p-value
## Weak instruments   1  27     0.393   0.536
## Wu-Hausman         1  26     0.055   0.816
## Sargan             0  NA        NA      NA
## 
## Residual standard error: 0.04127 on 27 degrees of freedom
## Multiple R-Squared: 0.2256,  Adjusted R-squared: 0.1682 
## Wald test: 5.612 on 2 and 27 DF,  p-value: 0.009161
  • In the Koyck model above, no term is significant at the 5% level. The adjusted R-squared is 0.1682, so the model explains only about 16.82 percent of the variability in RBO. The Wald test for the overall model has a p-value of 0.009161, which is below 0.05, so the model as a whole is statistically significant.
  • The Wu-Hausman test (p-value = 0.816, greater than 0.05) gives no evidence of correlation between the explanatory variable and the error term at the 5% level.
checkresiduals(Koyck_model_44$model)

shapiro.test(residuals(Koyck_model_44$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(Koyck_model_44$model)
## W = 0.97232, p-value = 0.6044

The residual graphs for the above model are shown in Figure 21:

  • The time series plot of the residuals shows no clear trend; the residuals appear to fluctuate randomly.
  • The ACF plot shows no significant serial correlation in the residuals.
  • The histogram of the residuals shows no clear departure from normality.
  • The Shapiro-Wilk p-value (0.6044) is greater than 0.05, so we do not have enough evidence to reject H0. The normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_koyck_44=vif(Koyck_model_44$model)
vif_koyck_44
##      Y.1      X.t 
## 4.171591 4.171591
vif_koyck_44>10
##   Y.1   X.t 
## FALSE FALSE
  • According to the VIF values, the above model does not have a multicollinearity problem.

Fitting autoregressive distributed lag models

Autoregressive distributed lag (ARDL) models are the last model type considered within the time series regression approach. To choose the orders of ARDL(p,q), we build a loop that fits autoregressive distributed lag models for a range of lag lengths and AR orders and reports accuracy measures such as AIC, BIC, and MASE.

for (i in 1:5){
  for(j in 1:5){
    model_7.2 = ardlDlm(x = as.vector(rbo_humidity.ts), y = as.vector(rbo_RBO.ts), p = i , q = j)
    cat("p =", i, "q =", j, "AIC =", AIC(model_7.2$model), "BIC =", BIC(model_7.2$model), "MASE =", MASE(model_7.2)$MASE, "\n")
 }
}
## p = 1 q = 1 AIC = -103.4088 BIC = -96.4028 MASE = 0.848564 
## p = 1 q = 2 AIC = -102.1288 BIC = -93.92498 MASE = 0.8996905 
## p = 1 q = 3 AIC = -106.3498 BIC = -97.02434 MASE = 0.8643252 
## p = 1 q = 4 AIC = -101.431 BIC = -91.06435 MASE = 0.8808303 
## p = 1 q = 5 AIC = -95.07433 BIC = -83.75146 MASE = 0.8156142 
## p = 2 q = 1 AIC = -96.65849 BIC = -88.45472 MASE = 0.9530697 
## p = 2 q = 2 AIC = -100.6798 BIC = -91.10872 MASE = 0.8960873 
## p = 2 q = 3 AIC = -107.6486 BIC = -96.99096 MASE = 0.8184751 
## p = 2 q = 4 AIC = -101.3934 BIC = -89.73091 MASE = 0.8358784 
## p = 2 q = 5 AIC = -95.4408 BIC = -82.85984 MASE = 0.7729034 
## p = 3 q = 1 AIC = -100.5465 BIC = -91.22108 MASE = 0.8806533 
## p = 3 q = 2 AIC = -107.1778 BIC = -96.5202 MASE = 0.8458201 
## p = 3 q = 3 AIC = -105.8657 BIC = -93.87582 MASE = 0.8220682 
## p = 3 q = 4 AIC = -100.3794 BIC = -87.42103 MASE = 0.8423436 
## p = 3 q = 5 AIC = -95.36449 BIC = -81.52543 MASE = 0.7767472 
## p = 4 q = 1 AIC = -97.72156 BIC = -87.35487 MASE = 0.8479099 
## p = 4 q = 2 AIC = -101.6889 BIC = -90.0264 MASE = 0.852727 
## p = 4 q = 3 AIC = -99.83121 BIC = -86.87284 MASE = 0.841287 
## p = 4 q = 4 AIC = -98.43992 BIC = -84.18572 MASE = 0.8426095 
## p = 4 q = 5 AIC = -93.41579 BIC = -78.31863 MASE = 0.7728053 
## p = 5 q = 1 AIC = -99.2928 BIC = -87.96993 MASE = 0.742288 
## p = 5 q = 2 AIC = -102.2847 BIC = -89.70372 MASE = 0.715471 
## p = 5 q = 3 AIC = -100.4847 BIC = -86.64562 MASE = 0.7154768 
## p = 5 q = 4 AIC = -98.61453 BIC = -83.51737 MASE = 0.7149312 
## p = 5 q = 5 AIC = -96.6322 BIC = -80.27694 MASE = 0.7128379

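According to the output of the autoregressive distributed lag models, the lowest MASE (0.7128379) is obtained at p = 5, q = 5, where AIC = -96.6322 and BIC = -80.27694. AIC and BIC on their own favour smaller models (AIC is lowest at p = 2, q = 3 and BIC at p = 1, q = 3), but since the models are compared on MASE we choose the lag orders (p=5,q=5).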

  • Fitting an autoregressive distributed lag model for Humidity with RBO as the dependent variable.
ardldlm_t3_55_humidity = ardlDlm(x = as.vector(rbo_humidity.ts), y = as.vector(rbo_RBO.ts),p = 5, q =5)
summary(ardldlm_t3_55_humidity)
## 
## Time series regression with "ts" data:
## Start = 6, End = 31
## 
## Call:
## dynlm(formula = as.formula(model.text), data = data, start = 1)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.039514 -0.012819 -0.002017  0.014921  0.042583 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -5.189518   3.619113  -1.434   0.1736  
## X.t          0.020748   0.011586   1.791   0.0950 .
## X.1         -0.005077   0.010123  -0.501   0.6238  
## X.2          0.021221   0.011187   1.897   0.0787 .
## X.3         -0.002882   0.011434  -0.252   0.8047  
## X.4          0.001211   0.009630   0.126   0.9017  
## X.5          0.019552   0.011086   1.764   0.0996 .
## Y.1          0.449210   0.238192   1.886   0.0802 .
## Y.2          0.418296   0.283249   1.477   0.1619  
## Y.3          0.059211   0.233143   0.254   0.8032  
## Y.4          0.059871   0.237302   0.252   0.8045  
## Y.5          0.021591   0.221332   0.098   0.9237  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.03119 on 14 degrees of freedom
## Multiple R-squared:  0.6849, Adjusted R-squared:  0.4373 
## F-statistic: 2.766 on 11 and 14 DF,  p-value: 0.03819

The autoregressive distributed lag model above has p=5 and q=5, and none of its coefficients is significant at the 5% level. The adjusted R-squared is 0.4373, so the model explains only about 43.73 percent of the variability in RBO. The overall F-test has a p-value of 0.03819, which is below 0.05, so the model as a whole is statistically significant.

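# Note: unlike the earlier sections, this call passes the dLagM object itself rather than
# its $model component, so the output below is the residual series, not a Breusch-Godfrey test.
checkresiduals(ardldlm_t3_55_humidity)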
## Time Series:
## Start = 6 
## End = 31 
## Frequency = 1 
##             6             7             8             9            10 
## -0.0003838352  0.0285424921  0.0262204043  0.0425833153 -0.0071460445 
##            11            12            13            14            15 
## -0.0395135878 -0.0097268174 -0.0324312255  0.0075500336  0.0073027412 
##            16            17            18            19            20 
##  0.0105340034  0.0022344615 -0.0062936506 -0.0310337954  0.0145701357 
##            21            22            23            24            25 
## -0.0053089033  0.0150385661 -0.0036507865  0.0177660474  0.0309689598 
##            26            27            28            29            30 
## -0.0346315729 -0.0313195739  0.0413921833 -0.0136687067 -0.0102703033 
##            31 
## -0.0193245409

shapiro.test(residuals(ardldlm_t3_55_humidity$model))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(ardldlm_t3_55_humidity$model)
## W = 0.96831, p-value = 0.5801

The residual graphs for the above model are shown in Figure 22:

  • The time series plot of the residuals shows no clear trend; the residuals appear to fluctuate randomly.
  • The ACF plot shows no significant serial correlation in the residuals.
  • The histogram of the residuals shows no clear departure from normality.
  • The Shapiro-Wilk p-value (0.5801) is greater than 0.05, so we do not have enough evidence to reject H0. The normality assumption for the errors is not violated.

Now checking the multicollinearity issue

vif_ardldlm_t3_55_humidity=vif(ardldlm_t3_55_humidity$model)
vif_ardldlm_t3_55_humidity
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(X.t, 5) L(y.t, 1) L(y.t, 2) 
##  2.419770  1.859241  2.197233  2.315907  1.531656  1.771241  2.746629  3.894421 
## L(y.t, 3) L(y.t, 4) L(y.t, 5) 
##  3.261884  3.356919  2.864389
vif_ardldlm_t3_55_humidity>10
##       X.t L(X.t, 1) L(X.t, 2) L(X.t, 3) L(X.t, 4) L(X.t, 5) L(y.t, 1) L(y.t, 2) 
##     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE     FALSE 
## L(y.t, 3) L(y.t, 4) L(y.t, 5) 
##     FALSE     FALSE     FALSE

According to the VIF values (all below 10), the above model does not have a multicollinearity problem.

  • A data frame is constructed to collect the accuracy measures (AIC, BIC, and MASE) of the models fitted so far.
model_dlm_t3 <- data.frame(Model=character(),MASE=numeric(),
                           BIC= numeric(),AICC=numeric(),AIC=numeric())

model_dlm_t3 = rbind(model_dlm_t3,cbind(Model="Finite DLM_temperature",
                                               AIC = AIC(finite_DLM_11),
                                              BIC = BIC(finite_DLM_11),
                                              MASE= MASE(finite_DLM_11)
))
## [1] -91.46337
## [1] -81.3986
model_dlm_t3 = rbind(model_dlm_t3,cbind(Model="Polynomial DLM_temperature",
                                               BIC = BIC(poly_DLM_11),
                                               AIC = AIC(poly_DLM_11),
                                              MASE= MASE(poly_DLM_11)
                                              ))
## [1] -81.3986
## [1] -91.46337
model_dlm_t3 = rbind(model_dlm_t3,cbind(Model="Task3_Koyck Model temperature",
                                               AIC = AIC(Koyck_model_11),
                                      BIC = BIC(Koyck_model_11),
                                              MASE= MASE(Koyck_model_11)
                                              ))
## [1] -64.98345
## [1] -59.37866
model_dlm_t3 = rbind(model_dlm_t3,cbind(Model="autoregressive_dlm_t3_54_temperature",
                                               AIC = AIC(ardldlm_t3_54),
                                      BIC = BIC(ardldlm_t3_54),
                                              MASE= MASE(ardldlm_t3_54)
                                              ))
## [1] -95.45616
## [1] -80.359
model_dlm_t3 = rbind(model_dlm_t3,cbind(Model="Finite DLM_rainfall",
                                               AIC = AIC(finite_DLM_22),
                                              BIC = BIC(finite_DLM_22),
                                              MASE= MASE(finite_DLM_22)
))
## [1] -87.24242
## [1] -77.17765
model_dlm_t3 = rbind(model_dlm_t3,cbind(Model="Polynomial DLM_rainfall",
                                               BIC = BIC(poly_DLM_22),
                                               AIC = AIC(poly_DLM_22),
                                              MASE= MASE(poly_DLM_22)
                                              ))
## [1] -83.41578
## [1] -90.96436
model_dlm_t3 = rbind(model_dlm_t3,cbind(Model="Koyck Model rainfall",
                                               AIC = AIC(Koyck_model_22),
                                      BIC = BIC(Koyck_model_22),
                                              MASE= MASE(Koyck_model_22)
                                              ))
## [1] 76.22068
## [1] 81.82547
model_dlm_t3 = rbind(model_dlm_t3,cbind(Model="autoregressive_dlm_t3_53_rainfall",
                                               AIC = AIC(ardldlm_t3_53),
                                      BIC = BIC(ardldlm_t3_53),
                                              MASE= MASE(ardldlm_t3_53)
                                              ))
## [1] -93.91526
## [1] -80.0762
model_dlm_t3 = rbind(model_dlm_t3,cbind(Model="Finite DLM_radiation",
                                               AIC = AIC(finite_DLM_33),
                                              BIC = BIC(finite_DLM_33),
                                              MASE= MASE(finite_DLM_33)
))
## [1] -97.71113
## [1] -92.10634
model_dlm_t3 = rbind(model_dlm_t3,cbind(Model="Polynomial DLM_radiation",
                                               BIC = BIC(poly_DLM_33),
                                               AIC = AIC(poly_DLM_33),
                                              MASE= MASE(poly_DLM_33)
                                              ))
## [1] -77.72754
## [1] -85.0392
model_dlm_t3 = rbind(model_dlm_t3,cbind(Model="Koyck Model radiation",
                                               AIC = AIC(Koyck_model_33),
                                      BIC = BIC(Koyck_model_33),
                                              MASE= MASE(Koyck_model_33)
                                              ))
## [1] -93.8669
## [1] -88.26211
model_dlm_t3 = rbind(model_dlm_t3,cbind(Model="autoregressive_dlm_t3_35_radiation",
                                               AIC = AIC(ardldlm_t3_35_radiation),
                                      BIC = BIC(ardldlm_t3_35_radiation),
                                              MASE= MASE(ardldlm_t3_35_radiation)
                                              ))
## [1] -97.36077
## [1] -83.5217
model_dlm_t3 = rbind(model_dlm_t3,cbind(Model="Finite DLM_humidity",
                                               AIC = AIC(finite_DLM_44),
                                              BIC = BIC(finite_DLM_44),
                                              MASE= MASE(finite_DLM_44)
))
## [1] -94.56619
## [1] -88.9614
model_dlm_t3 = rbind(model_dlm_t3,cbind(Model="Polynomial DLM_humidity",
                                               BIC = BIC(poly_DLM_44),
                                               AIC = AIC(poly_DLM_44),
                                              MASE= MASE(poly_DLM_44)
                                              ))
## [1] -88.9614
## [1] -94.56619
model_dlm_t3 = rbind(model_dlm_t3,cbind(Model="Koyck Model humidity",
                                               AIC = AIC(Koyck_model_44),
                                      BIC = BIC(Koyck_model_44),
                                              MASE= MASE(Koyck_model_44)
                                              ))
## [1] -101.2805
## [1] -95.67571
model_dlm_t3 = rbind(model_dlm_t3,cbind(Model="autoregressive_dlm_t3_55_humidity",
                                               AIC = AIC(ardldlm_t3_55_humidity),
                                      BIC = BIC(ardldlm_t3_55_humidity),
                                              MASE= MASE(ardldlm_t3_55_humidity)
                                              ))
## [1] -96.6322
## [1] -80.27694
sortScore(model_dlm_t3,score = "mase")
##                                                        Model        AIC
## poly_DLM_33                         Polynomial DLM_radiation  -85.03920
## ardldlm_t3_35_radiation   autoregressive_dlm_t3_35_radiation  -97.36077
## ardldlm_t3_55_humidity     autoregressive_dlm_t3_55_humidity  -96.63220
## ardldlm_t3_53              autoregressive_dlm_t3_53_rainfall  -93.91526
## ardldlm_t3_54           autoregressive_dlm_t3_54_temperature  -95.45616
## finite_DLM_11                         Finite DLM_temperature  -91.46337
## poly_DLM_11                       Polynomial DLM_temperature  -91.46337
## poly_DLM_22                          Polynomial DLM_rainfall  -90.96436
## finite_DLM_22                            Finite DLM_rainfall  -87.24242
## Koyck_model_44                          Koyck Model humidity -101.28049
## Koyck_model_33                         Koyck Model radiation  -93.86690
## finite_DLM_33                           Finite DLM_radiation  -97.71113
## poly_DLM_44                          Polynomial DLM_humidity  -94.56619
## finite_DLM_44                            Finite DLM_humidity  -94.56619
## Koyck_model_11                 Task3_Koyck Model temperature  -64.98345
## Koyck_model_22                          Koyck Model rainfall   76.22068
##                               BIC       MASE
## poly_DLM_33             -77.72754  0.6549231
## ardldlm_t3_35_radiation -83.52170  0.7009101
## ardldlm_t3_55_humidity  -80.27694  0.7128379
## ardldlm_t3_53           -80.07620  0.7237105
## ardldlm_t3_54           -80.35900  0.7544104
## finite_DLM_11           -81.39860  0.8594175
## poly_DLM_11             -81.39860  0.8594175
## poly_DLM_22             -83.41578  0.9083469
## finite_DLM_22           -77.17765  0.9256770
## Koyck_model_44          -95.67571  0.9559618
## Koyck_model_33          -88.26211  1.0314227
## finite_DLM_33           -92.10634  1.0630293
## poly_DLM_44             -88.96140  1.1010995
## finite_DLM_44           -88.96140  1.1010995
## Koyck_model_11          -59.37866  1.9150155
## Koyck_model_22           81.82547 19.1057647
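The table is sorted by MASE, so it helps to recall what that measure compares. The sketch below uses a common in-sample definition: the mean absolute residual divided by the mean absolute error of the naive one-step (previous value) forecast. It is an illustration on made-up numbers; `mase_manual` is a hypothetical helper, and the exact computation inside `dLagM::MASE` may differ in detail.

```r
# In-sample MASE: mean |residual| scaled by the mean |error| of the naive forecast,
# which predicts each value by the one before it.
mase_manual <- function(actual, fitted) {
  mean(abs(actual - fitted)) / mean(abs(diff(actual)))
}

y     <- c(1, 3, 2, 5, 4)             # made-up series
y_hat <- c(1.5, 2.5, 2.5, 4.5, 4.5)   # made-up fitted values
mase  <- mase_manual(y, y_hat)
print(mase)
```

A value below 1 means the model's in-sample errors are, on average, smaller than those of the naive forecast; a value above 1, as for several of the models in the table, means they are larger.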

Forecasting

Covariate_x_values_for_Task_3_ <- read_csv("/Users/zuaibshaikh/Desktop/SEM 4/Forecasting/Final Project/Covariate x-values for Task 3  .csv")
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   Year = col_double(),
##   Temperature = col_double(),
##   Rainfall = col_double(),
##   Radiation = col_double(),
##   RelHumidity = col_double()
## )
head(Covariate_x_values_for_Task_3_)
## # A tibble: 6 x 5
##    Year Temperature Rainfall Radiation RelHumidity
##   <dbl>       <dbl>    <dbl>     <dbl>       <dbl>
## 1  2015       10.2      2.27      14.6        94.4
## 2  2016       10.1      2.38      14.6        94.0
## 3  2017        9.53     2.26      14.8        95.0
## 4  2018        9.54     2.27      14.8        95.1
## 5    NA       NA       NA         NA          NA  
## 6    NA       NA       NA         NA          NA
forecasts.dlm = dLagM::forecast(model= poly_DLM_33, x= Covariate_x_values_for_Task_3_$Radiation, h=3)$forecasts

plot(ts(c(as.vector(rbo_RBO.ts),forecasts.dlm),start=1984),  col="blue", type="o", main="Rank-based order (RBO) series with three-year forecasts using radiation as the predictor", xlab= "Year", ylab="RBO") 
lines(ts(as.vector(rbo_RBO.ts),start=1984), col="black",type="o")

Task 3 Part (b) - Perform the appropriate analysis and obtain the 3-year-ahead forecasts

The objective of this task is to obtain the best 3-year-ahead forecasts of the RBO series using the dynlm package. The data, covering 1983 to 2014, describe the impact of long-term climatic changes in Victoria on the relative flowering order similarity of 81 plant species. The species were ranked each year by their first flowering day (FFD), and changes in flowering order were measured by computing the similarity between each year's flowering order and the flowering order of 1983 using the Rank-based Order similarity metric (RBO).

Dynamic Lag Models

Intervention results in an immediate and permanent shift in the mean function

rbo_RBO.ts.tr_1 = log(rbo_RBO.ts)
plot(rbo_RBO.ts.tr_1,ylab='Log of RBO in metric',xlab='Year', main = "Time series plot of the logarithm of RBO metric.")

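  • The log transformation makes the series more stable. Note that the models below are fitted to the untransformed series (Y.t = rbo_RBO.ts).
Y.t = rbo_RBO.ts            # response: untransformed RBO series
T = 13                      # intervention point: 13th observation (1996, given the series starts in 1984); note that this masks R's T shorthand for TRUE
S.t = 1*(seq (Y.t) >= T)    # step dummy: 0 before the intervention, 1 from the intervention onwards
S.t.1 = lag (S.t,+1)        # intended one-period lag of the step dummy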

MModel1 =dynlm(Y.t ~  L(Y.t , k = 1 ) + S.t + trend(Y.t))
summary(MModel1)
## 
## Time series regression with "ts" data:
## Start = 1985, End = 2014
## 
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 1) + S.t + trend(Y.t))
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.049309 -0.017320  0.000662  0.019222  0.057536 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    0.927274   0.144362   6.423 8.34e-07 ***
## L(Y.t, k = 1) -0.199722   0.186129  -1.073 0.293115    
## S.t           -0.114898   0.025816  -4.451 0.000143 ***
## trend(Y.t)     0.001840   0.001093   1.683 0.104318    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02754 on 26 degrees of freedom
## Multiple R-squared:  0.668,  Adjusted R-squared:  0.6297 
## F-statistic: 17.44 on 3 and 26 DF,  p-value: 2.073e-06
checkresiduals(MModel1)

## 
##  Breusch-Godfrey test for serial correlation of order up to 7
## 
## data:  Residuals
## LM test = 6.1617, df = 7, p-value = 0.521
MModel2 =dynlm(Y.t ~  L(Y.t , k = 2 ) + S.t + trend(Y.t))
summary(MModel2)
## 
## Time series regression with "ts" data:
## Start = 1986, End = 2014
## 
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 2) + S.t + trend(Y.t))
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.045517 -0.011302  0.001648  0.019562  0.055581 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    0.908321   0.143497   6.330 1.26e-06 ***
## L(Y.t, k = 2) -0.166685   0.183717  -0.907 0.372915    
## S.t           -0.110586   0.025209  -4.387 0.000183 ***
## trend(Y.t)     0.001444   0.001090   1.325 0.197263    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02725 on 25 degrees of freedom
## Multiple R-squared:  0.6874, Adjusted R-squared:  0.6498 
## F-statistic: 18.32 on 3 and 25 DF,  p-value: 1.688e-06
checkresiduals(MModel2)

## 
##  Breusch-Godfrey test for serial correlation of order up to 7
## 
## data:  Residuals
## LM test = 3.1577, df = 7, p-value = 0.87
MModel3 =dynlm(Y.t ~ L(Y.t , k = 3 ) + S.t + trend(Y.t))
summary(MModel3)
## 
## Time series regression with "ts" data:
## Start = 1987, End = 2014
## 
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 3) + S.t + trend(Y.t))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.04466 -0.01591  0.00312  0.01453  0.04124 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    0.8761866  0.1204342   7.275 1.62e-07 ***
## L(Y.t, k = 3) -0.1372424  0.1522003  -0.902   0.3762    
## S.t           -0.1056195  0.0204379  -5.168 2.72e-05 ***
## trend(Y.t)     0.0017373  0.0009804   1.772   0.0891 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0246 on 24 degrees of freedom
## Multiple R-squared:  0.6968, Adjusted R-squared:  0.659 
## F-statistic: 18.39 on 3 and 24 DF,  p-value: 2.06e-06
checkresiduals(MModel3)

## 
##  Breusch-Godfrey test for serial correlation of order up to 7
## 
## data:  Residuals
## LM test = 13.187, df = 7, p-value = 0.06769
MModel4 =dynlm(Y.t ~ L(Y.t , k = 4 ) +S.t + trend(Y.t))
summary(MModel4)
## 
## Time series regression with "ts" data:
## Start = 1988, End = 2014
## 
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 4) + S.t + trend(Y.t))
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.048805 -0.011124 -0.001643  0.015559  0.037209 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    0.9948007  0.1056565   9.415 2.35e-09 ***
## L(Y.t, k = 4) -0.2757240  0.1306248  -2.111   0.0459 *  
## S.t           -0.1062264  0.0164129  -6.472 1.33e-06 ***
## trend(Y.t)     0.0009180  0.0009629   0.953   0.3503    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02274 on 23 degrees of freedom
## Multiple R-squared:  0.7504, Adjusted R-squared:  0.7179 
## F-statistic: 23.05 on 3 and 23 DF,  p-value: 4.05e-07
checkresiduals(MModel4)

## 
##  Breusch-Godfrey test for serial correlation of order up to 7
## 
## data:  Residuals
## LM test = 9.4131, df = 7, p-value = 0.2243
MModel7 =dynlm(Y.t ~ L(Y.t , k = 4 ) + S.t+S.t.1 + trend(Y.t) + rbo_temp.ts)
summary(MModel7)
## 
## Time series regression with "ts" data:
## Start = 1988, End = 2014
## 
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 4) + S.t + S.t.1 + trend(Y.t) + 
##     rbo_temp.ts)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.048916 -0.012225 -0.000212  0.014734  0.035748 
## 
## Coefficients: (1 not defined because of singularities)
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    0.8736753  0.1700045   5.139 3.76e-05 ***
## L(Y.t, k = 4) -0.2637547  0.1317632  -2.002   0.0578 .  
## S.t           -0.0972916  0.0191689  -5.076 4.39e-05 ***
## S.t.1                 NA         NA      NA       NA    
## trend(Y.t)     0.0004943  0.0010724   0.461   0.6494    
## rbo_temp.ts    0.0119200  0.0130763   0.912   0.3719    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02283 on 22 degrees of freedom
## Multiple R-squared:  0.7595, Adjusted R-squared:  0.7158 
## F-statistic: 17.37 on 4 and 22 DF,  p-value: 1.455e-06
checkresiduals(MModel7)

## 
##  Breusch-Godfrey test for serial correlation of order up to 9
## 
## data:  Residuals
## LM test = 10.907, df = 9, p-value = 0.2821
MModel5 =dynlm(Y.t  ~ L(Y.t , k = 4 ) + S.t + S.t.1 + trend(Y.t) + rbo_rainfall.ts)
summary(MModel5)
## 
## Time series regression with "ts" data:
## Start = 1988, End = 2014
## 
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 4) + S.t + S.t.1 + trend(Y.t) + 
##     rbo_rainfall.ts)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.042920 -0.012360 -0.003215  0.016749  0.034510 
## 
## Coefficients: (1 not defined because of singularities)
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      0.9752441  0.1070133   9.113 6.36e-09 ***
## L(Y.t, k = 4)   -0.2912771  0.1311341  -2.221   0.0369 *  
## S.t             -0.0994456  0.0175882  -5.654 1.10e-05 ***
## S.t.1                   NA         NA      NA       NA    
## trend(Y.t)       0.0006554  0.0009922   0.660   0.5158    
## rbo_rainfall.ts  0.0131256  0.0124378   1.055   0.3027    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02269 on 22 degrees of freedom
## Multiple R-squared:  0.7625, Adjusted R-squared:  0.7193 
## F-statistic: 17.65 on 4 and 22 DF,  p-value: 1.275e-06
checkresiduals(MModel5)

## 
##  Breusch-Godfrey test for serial correlation of order up to 9
## 
## data:  Residuals
## LM test = 9.1703, df = 9, p-value = 0.4217
MModel6 =dynlm(Y.t ~ L(Y.t , k = 4 ) +S.t +S.t.1 + trend(Y.t) + rbo_radiation.ts)
summary(MModel6)
## 
## Time series regression with "ts" data:
## Start = 1988, End = 2014
## 
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 4) + S.t + S.t.1 + trend(Y.t) + 
##     rbo_radiation.ts)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.05060 -0.01225  0.00177  0.01373  0.02990 
## 
## Coefficients: (1 not defined because of singularities)
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       1.2083678  0.2282580   5.294 2.59e-05 ***
## L(Y.t, k = 4)    -0.3185748  0.1364915  -2.334   0.0291 *  
## S.t              -0.1018248  0.0168962  -6.026 4.58e-06 ***
## S.t.1                    NA         NA      NA       NA    
## trend(Y.t)        0.0006928  0.0009840   0.704   0.4888    
## rbo_radiation.ts -0.0124131  0.0117679  -1.055   0.3030    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02269 on 22 degrees of freedom
## Multiple R-squared:  0.7625, Adjusted R-squared:  0.7193 
## F-statistic: 17.65 on 4 and 22 DF,  p-value: 1.276e-06
checkresiduals(MModel6)

## 
##  Breusch-Godfrey test for serial correlation of order up to 9
## 
## data:  Residuals
## LM test = 10.487, df = 9, p-value = 0.3125
MModel8 =dynlm(Y.t ~ L(Y.t , k = 4 ) +S.t +S.t.1+ trend(Y.t) + rbo_humidity.ts)
summary(MModel8)
## 
## Time series regression with "ts" data:
## Start = 1988, End = 2014
## 
## Call:
## dynlm(formula = Y.t ~ L(Y.t, k = 4) + S.t + S.t.1 + trend(Y.t) + 
##     rbo_humidity.ts)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.044132 -0.011735 -0.001107  0.018010  0.033447 
## 
## Coefficients: (1 not defined because of singularities)
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      0.6371754  0.5638866   1.130   0.2707    
## L(Y.t, k = 4)   -0.2647730  0.1333934  -1.985   0.0598 .  
## S.t             -0.1079329  0.0168334  -6.412 1.88e-06 ***
## S.t.1                   NA         NA      NA       NA    
## trend(Y.t)       0.0009663  0.0009782   0.988   0.3340    
## rbo_humidity.ts  0.0036995  0.0057272   0.646   0.5250    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02304 on 22 degrees of freedom
## Multiple R-squared:  0.7551, Adjusted R-squared:  0.7106 
## F-statistic: 16.96 on 4 and 22 DF,  p-value: 1.77e-06
checkresiduals(MModel8)

## 
##  Breusch-Godfrey test for serial correlation of order up to 9
## 
## data:  Residuals
## LM test = 10.105, df = 9, p-value = 0.3421
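
In MModel5 to MModel8 the coefficient of S.t.1 is reported as NA ("1 not defined because of singularities"). One plausible explanation, which we have not verified inside dynlm itself, is that stats::lag() applied to a plain numeric vector only shifts the time attributes and leaves the values unchanged, so S.t.1 carries exactly the same values as S.t. The sketch below illustrates this with base R only. The series length of 31 and the names S, S_shifted_attr and S_lag1 are assumptions made for illustration.

```r
# Step dummy on a plain numeric vector, built like S.t above
# (31 annual observations assumed, intervention at observation 13)
S <- 1 * (seq_len(31) >= 13)

# stats::lag() leaves the values untouched and only attaches shifted time attributes
S_shifted_attr <- stats::lag(S, 1)
identical(as.numeric(S_shifted_attr), S)   # TRUE: same values as S

# With two identical columns the design matrix is rank deficient
X_bad <- cbind(1, S, as.numeric(S_shifted_attr))
qr(X_bad)$rank                             # 2, not 3

# Hypothetical alternative: shift the values by hand,
# so that the lagged step switches on one period later
S_lag1 <- c(0, head(S, -1))
X_ok <- cbind(1, S, S_lag1)
qr(X_ok)$rank                              # 3
```

If a lagged step is really wanted in the model, a manually shifted dummy such as S_lag1 would give a separate, estimable coefficient. Otherwise S.t.1 can simply be dropped, since it currently contributes nothing to the fit.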

Dynamic Model Comparison

Here we compare the dynamic models by their MASE, computed with the accuracy() function, and look for the model with the lowest value.

library(GGally)
## Registered S3 method overwritten by 'GGally':
##   method from   
##   +.gg   ggplot2
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:xts':
## 
##     first, last
## The following objects are masked from 'package:Hmisc':
## 
##     src, summarize
## The following object is masked from 'package:car':
## 
##     recode
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
# names of the fitted dynamic models to be compared
Model_comparison <- c("MModel1","MModel2","MModel3","MModel4" , "MModel5","MModel6", "MModel7","MModel8")

# MASE of each model: the 6th element returned by accuracy() is the MASE

MASE_t3 <- c(accuracy(MModel1)[6],accuracy(MModel2)[6],accuracy(MModel3)[6],accuracy(MModel4)[6],
             accuracy(MModel5)[6],accuracy(MModel6)[6],accuracy(MModel7)[6],accuracy(MModel8)[6])


model_dylm_task3 <- data.frame(Model_comparison,MASE_t3)
model_dylm_task3
##   Model_comparison   MASE_t3
## 1          MModel1 0.6768533
## 2          MModel2 0.7130389
## 3          MModel3 0.7171283
## 4          MModel4 0.6515020
## 5          MModel5 0.6826307
## 6          MModel6 0.6388298
## 7          MModel7 0.6476601
## 8          MModel8 0.6727085
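
To make the comparison criterion concrete, the following is a minimal sketch of how an in-sample MASE can be computed by hand and how the best model can be picked from a table of MASE values. The helper mase_in_sample() and the toy vectors are hypothetical, and the scaling used internally by accuracy() may differ in detail. The three MASE values in mase_tbl are copied from the table above.

```r
# Hypothetical helper: mean absolute error of the model, scaled by the
# mean absolute error of the one-step naive forecast on the same series
mase_in_sample <- function(actual, fitted) {
  mae_model <- mean(abs(actual - fitted))
  mae_naive <- mean(abs(diff(actual)))
  mae_model / mae_naive
}

# Toy data (made up for illustration, not the RBO series)
actual <- c(0.75, 0.74, 0.76, 0.73, 0.65, 0.66, 0.64, 0.67)
fitted <- c(0.74, 0.75, 0.75, 0.74, 0.66, 0.65, 0.65, 0.66)
toy_mase <- mase_in_sample(actual, fitted)
toy_mase   # below 1: the model beats the naive forecast in sample

# Sorting a comparison table in ascending order of MASE
mase_tbl <- data.frame(
  model = c("MModel4", "MModel6", "MModel7"),
  MASE  = c(0.6515020, 0.6388298, 0.6476601)
)
mase_sorted <- mase_tbl[order(mase_tbl$MASE), ]
best_model <- mase_sorted$model[1]
best_model
```

A MASE below 1 means that the fitted values are, on average, closer to the data than the naive "last value" forecast, which is why the model with the smallest MASE is preferred here.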

From the table above, MModel6 has the lowest MASE (0.6388298), so it is the best of the dynamic models for forecasting.

par(mfrow=c(1,1))
plot(Y.t,ylab='RBO',xlab='Year',main = "Time series plot of RBO with fitted values from MModel6.")
lines(MModel6$fitted.values,col="red")

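q = 3                                    # forecast horizon: 3 years ahead (annual data)
n = nrow(MModel6$model)                  # number of observations used in the fit (1988-2014)
rbo.frc = array(NA , (n + q))            # observed values followed by q forecast slots
rbo.frc[1:n] = Y.t[5:length(Y.t)]        # the first 4 observations are lost to the lag of order 4

trend = array(NA,q)
trend.start = MModel6$model[n,"trend(Y.t)"]       # last in-sample value of the trend regressor
trend = seq(trend.start , trend.start + q, 1)     # trend from the last fitted year through the q forecast years (step 1 for annual data)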
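
The forecasting set-up above stops before any forecasts are computed. The following is a minimal sketch of how the recursion could be completed for a model of the same form as MModel6 (lag 4 of the response, a step dummy, a linear trend and one covariate). It is not the dynlm-based procedure itself: it uses base R lm() on simulated data so that it runs on its own, and every name in it (y, step, x, fit, x_future, frc) is hypothetical. The future covariate values are the rounded radiation values for 2015 to 2017 shown in the covariate file output earlier.

```r
set.seed(1)
n_obs <- 31                                   # annual observations, 1984-2014 assumed
step  <- as.numeric(seq_len(n_obs) >= 13)     # step intervention from observation 13
x     <- rnorm(n_obs, mean = 14.5, sd = 0.3)  # simulated covariate (stand-in for radiation)
y     <- 0.8 - 0.1 * step + rnorm(n_obs, sd = 0.02)  # simulated response (stand-in for RBO)

# Build the regression data by hand: y at time t against y at time t - 4
k <- 4
d <- data.frame(
  y     = y[(k + 1):n_obs],
  y_lag = y[1:(n_obs - k)],
  step  = step[(k + 1):n_obs],
  trend = (k + 1):n_obs,
  x     = x[(k + 1):n_obs]
)
fit <- lm(y ~ y_lag + step + trend + x, data = d)

# Recursive forecasts: each new year uses the value from 4 years earlier,
# the step dummy stays at 1, and the trend increases by 1 per year
h <- 3
x_future <- c(14.6, 14.6, 14.8)               # covariate values for 2015-2017
y_ext <- c(y, rep(NA, h))
for (i in seq_len(h)) {
  t_new <- n_obs + i
  new_row <- data.frame(y_lag = y_ext[t_new - k], step = 1,
                        trend = t_new, x = x_future[i])
  y_ext[t_new] <- predict(fit, newdata = new_row)
}
frc <- ts(y_ext[(n_obs + 1):(n_obs + h)], start = 2015)
frc
```

Because the lag order (4) exceeds the horizon (3), every forecast here depends only on observed values of the response. For a longer horizon the loop would start feeding earlier forecasts back in through y_ext.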

Summary

In Task 1, time series analysis, time series regression methods, and forecasting approaches were applied to a weekly mortality series covering 2010 to 2020, in order to draw statistical conclusions about the possible effects of climate change and pollution on disease-specific mortality. The objective was to analyse the series, choose the best model, and forecast from it. The data were first imported, and each column was converted into a time series using the ts function. Each variable was then plotted, and the data were scaled so that all series could be displayed together. Stationarity tests were performed on both series, and both were found to be stationary at the 5% level of significance. Next, the effect of the time series components was investigated using STL decomposition. Time series regression was then applied in a multivariate setting, and all relevant techniques, such as finite DLMs, dynamic lag models, exponential smoothing, and state-space models, were fitted. Finally, all fitted models were collected in a table sorted in ascending order of MASE, so that the model with the lowest MASE appears first (the lowest value is reported in the Conclusion), and the four-week-ahead forecasts were displayed.

In Task 2, time series analysis, time series regression methods, and forecasting approaches were applied to a first flowering day (FFD) series together with climate parameters such as rainfall (rain), temperature (temp), radiation level (rad), and relative humidity (RH). The data were first imported, and each column was converted into a time series using the ts function. Each variable was then plotted, and the data were scaled so that all series could be displayed together. Stationarity tests were performed on both series, and both were found to be non-stationary at the 5% level of significance, so second and third differencing was applied to make the data stationary at the 5% level. Time series regression was then applied with a univariate approach, and all relevant techniques, such as finite DLMs, dynamic lag models, exponential smoothing, and state-space models, were fitted. Finally, all fitted models were collected in a table sorted in ascending order of MASE, so that the model with the lowest MASE appears first (the lowest value is reported in the Conclusion), and the three-year-ahead forecasts were displayed.

In Task 3, time series analysis, time series regression methods, and forecasting approaches were applied to a dataset that describes the impact of long-term climatic changes in Victoria on the relative flowering order similarity of 81 plant species. The data were first imported, and each column was converted into a time series using the ts function. Each variable was then plotted, and the data were scaled so that all series could be displayed together. Time series regression was then applied with a univariate approach, and the relevant techniques, such as finite DLMs and dynamic lag models, were fitted. Finally, all fitted models were collected in a table sorted in ascending order of MASE, so that the model with the lowest MASE appears first (the lowest value is reported in the Conclusion), and the three-year-ahead forecasts were displayed.

Conclusion

In Task 1, the primary conclusion from model fitting is that the time series regression with a multivariate approach, certain exponential smoothing models, and the manually proposed state-space model failed to capture the autocorrelation and seasonality in the mortality data. The residuals of these models also raised concerns about the standard model assumptions. The Damped additive method and the Multiplicative seasonal damped method were the most successful at capturing seasonality and serial correlation. These models were also the most effective at reducing MASE, with MASE=0.687106984627893 for the Damped additive method and MASE=0.687106984627893 for the Multiplicative seasonal damped method. Based on these results, we picked the Damped additive method for forecasting.

In Task 2, the primary conclusion from model fitting is that the time series regression with a univariate approach, certain exponential smoothing models, and the manually proposed state-space model failed to capture the autocorrelation and seasonality in the FFD data. The polynomial DLM with radiation as the predictor was the most effective at reducing MASE, with MASE=0.513033974107104. Based on these results, we picked the polynomial DLM for forecasting.

In Task 3, time series regression with a univariate approach was first applied to the RBO data. The same data were then analysed with dynamic models (dynlm), and the model with the lowest MASE in the model comparison was selected for the 3-year-ahead forecasts. MModel6, which includes radiation as a covariate, was the most effective at reducing MASE, with MASE=0.6388298 in the model comparison table. Based on these results, we picked MModel6 for forecasting.

Reference

The dataset and code were sourced from MATH1307, CLOs (weeks 3-7) and Assignment 2 on Canvas, by Senior Professor Irene Hudson, RMIT University.

https://www.frontiersin.org/articles/10.3389/fpls.2020.594538/full

https://ourworldindata.org/air-pollution

https://www.hindawi.com/journals/amete/2017/2954010/