2020-03-17 Research Assignment

Overview

You are tasked to develop an electricity demand forecasting model for ABC. The objective of the forecasting model is to better inform ABC and the industry so as to ensure that sufficient infrastructure is to be put in place to cater for increasing electricity demand.

We used the following abbreviations to denote the Different Sectors and Types of Household in Energy Consumption:

MFG = Manufacturing
CON = Construction
UTI = Utilities
OIR = Other Industrial-related
WRT = Wholesale and Retail Trade
AFS = Accommodation and Food Services
ICM = Information and Communications
FIA = Financial and Insurance Activities
REA = Real Estate Activities
PST = Professional, Scientific & Technical, Administration & Support Activities
OCS = Other Commerce and Services-related
TRP = Transport-related
OTH = Others
X1R2R = 1-Room / 2-Room HDB
X3R = 3-Room HDB
X4R = 4-Room HDB
X5REX = 5-Room and Executive HDB
PAC = Private Apartments and Condominiums
LPP = Landed Properties

Loading Libraries

# Loading Libraries
library(datetime)

## Warning: package 'datetime' was built under R version 3.6.3

library(Hmisc)

## Warning: package 'Hmisc' was built under R version 3.6.3

## Loading required package: lattice

## Loading required package: survival

## Loading required package: Formula

## Loading required package: ggplot2

## 
## Attaching package: 'Hmisc'

## The following objects are masked from 'package:base':
## 
##     format.pval, units

library(corrplot)

## corrplot 0.84 loaded

library(caret)

## 
## Attaching package: 'caret'

## The following object is masked from 'package:survival':
## 
##     cluster

library(vars)

## Warning: package 'vars' was built under R version 3.6.3

## Loading required package: MASS

## Warning: package 'MASS' was built under R version 3.6.3

## Loading required package: strucchange

## Warning: package 'strucchange' was built under R version 3.6.3

## Loading required package: zoo

## Warning: package 'zoo' was built under R version 3.6.3

## 
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

## Loading required package: sandwich

## Warning: package 'sandwich' was built under R version 3.6.3

## Loading required package: urca

## Warning: package 'urca' was built under R version 3.6.3

## Loading required package: lmtest

## Warning: package 'lmtest' was built under R version 3.6.3

library(forecast)

## Warning: package 'forecast' was built under R version 3.6.3

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

Loading and Cleaning Data on Energy Consumption

# Loading Data on Energy Consumption
ecData <- read.csv('./EC_05_18.csv', header = TRUE)
ecData$Year <- as.year(ecData$Year)

# Cleaning Data on Energy Consumption
ecData <- ecData[, colSums(is.na(ecData)) == 0]
summary(ecData)

##       Year           MFG             CON             UTI        
##  Min.   :2005   Min.   :13836   Min.   :234.8   Min.   : 411.0  
##  1st Qu.:2008   1st Qu.:14966   1st Qu.:302.7   1st Qu.: 674.2  
##  Median :2012   Median :16932   Median :397.6   Median : 953.7  
##  Mean   :2012   Mean   :16712   Mean   :395.4   Mean   : 958.2  
##  3rd Qu.:2015   3rd Qu.:18205   3rd Qu.:489.0   3rd Qu.:1227.4  
##  Max.   :2018   Max.   :19582   Max.   :551.6   Max.   :1518.7  
##       OIR             WRT            AFS            ICM              FIA      
##  Min.   :25.88   Min.   :1821   Min.   :1031   Min.   : 667.2   Min.   :1292  
##  1st Qu.:29.34   1st Qu.:1913   1st Qu.:1170   1st Qu.: 849.9   1st Qu.:1946  
##  Median :35.58   Median :1972   Median :1263   Median :1036.5   Median :2110  
##  Mean   :36.26   Mean   :1994   Mean   :1265   Mean   :1190.6   Mean   :2114  
##  3rd Qu.:41.66   3rd Qu.:2060   3rd Qu.:1346   3rd Qu.:1393.7   3rd Qu.:2355  
##  Max.   :48.26   Max.   :2293   Max.   :1464   Max.   :2230.5   Max.   :2718  
##       REA            PST             OCS            TRP            OTH       
##  Min.   :3712   Min.   :650.2   Min.   :3613   Min.   :1200   Min.   :237.7  
##  1st Qu.:3961   1st Qu.:706.3   1st Qu.:4244   1st Qu.:1452   1st Qu.:277.1  
##  Median :4455   Median :759.4   Median :4393   Median :2276   Median :325.2  
##  Mean   :4280   Mean   :787.9   Mean   :4384   Mean   :2071   Mean   :369.6  
##  3rd Qu.:4516   3rd Qu.:874.8   3rd Qu.:4697   3rd Qu.:2444   3rd Qu.:446.7  
##  Max.   :4701   Max.   :934.1   Max.   :4860   Max.   :2944   Max.   :612.1  
##      X1R2R             X3R             X4R           X5REX           PAC      
##  Min.   : 70.90   Min.   :690.1   Min.   :1475   Min.   :1486   Min.   :1349  
##  1st Qu.: 77.05   1st Qu.:708.4   1st Qu.:1511   1st Qu.:1517   1st Qu.:1444  
##  Median : 87.75   Median :731.5   Median :1597   Median :1564   Median :1605  
##  Mean   : 96.62   Mean   :733.2   Mean   :1620   Mean   :1568   Mean   :1647  
##  3rd Qu.:117.60   3rd Qu.:747.9   3rd Qu.:1731   3rd Qu.:1592   3rd Qu.:1831  
##  Max.   :135.63   Max.   :801.8   Max.   :1839   Max.   :1708   Max.   :2001  
##       LPP        
##  Min.   : 953.6  
##  1st Qu.: 970.2  
##  Median :1016.9  
##  Mean   :1012.0  
##  3rd Qu.:1037.8  
##  Max.   :1098.0

Plotting a Correlation Matrix for Data on Energy Consumption

# Plotting a Correlation Matrix for Data on Energy Consumption
ecMatrix <- rcorr(as.matrix(ecData))
flattenCorrMatrix <- function(cormat, pmat){
  ut <- upper.tri(cormat)
  data.frame(row = rownames(cormat)[row(cormat)[ut]],
             column = rownames(cormat)[col(cormat)[ut]],
             cor = (cormat)[ut],
             p = pmat[ut])}
flattenecMatrix <- flattenCorrMatrix(ecMatrix$r, ecMatrix$P)
summary(flattenecMatrix)

##       row         column        cor                p            
##  Year   :19   LPP    :19   Min.   :-0.9540   Min.   :0.0000000  
##  MFG    :18   PAC    :18   1st Qu.: 0.5072   1st Qu.:0.0000009  
##  CON    :17   X5REX  :17   Median : 0.8389   Median :0.0000789  
##  UTI    :16   X4R    :16   Mean   : 0.5731   Mean   :0.0631629  
##  OIR    :15   X3R    :15   3rd Qu.: 0.9313   3rd Qu.:0.0016110  
##  WRT    :14   X1R2R  :14   Max.   : 0.9951   Max.   :0.8441051  
##  (Other):91   (Other):91

Plotting a Correlation Plot for Data on Energy Consumption

# Plotting a Correlation Plot for Data on Energy Consumption
corrplot(ecMatrix$r, type = "upper", order = "FPC", method = "color",
         p.mat = ecMatrix$P, sig.level = 0.01, insig = "pch",
         tl.cex = 0.8, tl.col = "black", tl.srt = 45)

In the Correlation Plot shown above, the variables that are highly correlated are highlighted at the dark blue intersections. We used a level of significance of 0.01 to determine correlations that are statistically significant. Correlations with a p-value > 0.01 are considered statistically insignificant and marked with a cross.

Finding Highly Correlated Variables in Data on Energy Consumption

# Finding Highly Correlated Variables in Data on Energy Consumption
highlyCorrelated <- findCorrelation(ecMatrix$r, cutoff = 0.98)
names(ecData)[highlyCorrelated]

## [1] "PAC"  "UTI"  "Year" "X3R"

We used a pair-wise absolute correlation cutoff of 0.98 and found 3 out of 19 variables that are highly correlated with one another in our Data on Energy Consumption. We will use these variables to build our Forecasting Model. Therefore, we will only be using the variables, “UTI”, “X3R”, and “PAC” to build our Forecasting Model.

Building our Forecasting Model

For this Research Assignment, we will use a Vector Autoregression (VAR) Model as our Forecasting Model. It is a causal / Econometric Forecasting Method used to identify underlying factors that might influence the variable that is being forecasted.

Converting Data on Energy Consumption to a Time Series Object

# Define Manufacturing (MFG) as a Time Series Object
MFG <- ts(ecData$MFG, start = c(2005), end = c(2018), frequency = 1)

# Define Construction (CON) as a Time Series Object
CON <- ts(ecData$CON, start = c(2005), end = c(2018), frequency = 1)

# Define Utilities (UTI) as a Time Series Object
UTI <- ts(ecData$UTI, start = c(2005), end = c(2018), frequency = 1)

# Define Other Industrial-related (OIR) as a Time Series Object
OIR <- ts(ecData$OIR, start = c(2005), end = c(2018), frequency = 1)

# Define Wholesale and Retail Trade (WRT) as a Time Series Object
WRT <- ts(ecData$WRT, start = c(2005), end = c(2018), frequency = 1)

# Define Accommodation and Food Services (AFS) as a Time Series Object
AFS <- ts(ecData$AFS, start = c(2005), end = c(2018), frequency = 1)

# Define Information and Communications (ICM) as a Time Series Object
ICM <- ts(ecData$ICM, start = c(2005), end = c(2018), frequency = 1)

# Define Financial and Insurance Activities (FIA) as a Time Series Object
FIA <- ts(ecData$FIA, start = c(2005), end = c(2018), frequency = 1)

# Define Real Estate Activities (REA) as a Time Series Object
REA <- ts(ecData$REA, start = c(2005), end = c(2018), frequency = 1)

# Define Professional, Scientific & Technical, Administration
# & Support Activities (PST) as a Time Series Object
PST <- ts(ecData$PST, start = c(2005), end = c(2018), frequency = 1)

# Define Other Commerce and Services-related (OCS) as a Time Series Object
OCS <- ts(ecData$OCS, start = c(2005), end = c(2018), frequency = 1)

# Define Transport-related (TRP) as a Time Series Object
TRP <- ts(ecData$TRP, start = c(2005), end = c(2018), frequency = 1)

# Define Others (OTH) as a Time Series Object
OTH <- ts(ecData$OTH, start = c(2005), end = c(2018), frequency = 1)

# Define 1-Room / 2-Room HDB (X1R2R) as a Time Series Object
X1R2R <- ts(ecData$X1R2R, start = c(2005), end = c(2018), frequency = 1)

# Define 3-Room HDB (X3R) as a Time Series Object
X3R <- ts(ecData$X3R, start = c(2005), end = c(2018), frequency = 1)

# Define 4-Room HDB (X4R) as a Time Series Object
X4R <- ts(ecData$X4R, start = c(2005), end = c(2018), frequency = 1)

# Define 5-Room and Executive HDB (X5REX) as a Time Series Object
X5REX <- ts(ecData$X5REX, start = c(2005), end = c(2018), frequency = 1)

# Define Private Apartments and Condominiums (PAC) as a Time Series Object
PAC <- ts(ecData$PAC, start = c(2005), end = c(2018), frequency = 1)

# Define Landed Properties (LPP) as a Time Series Object
LPP <- ts(ecData$LPP, start = c(2005), end = c(2018), frequency = 1)

Plotting Data on Energy Consumption by Different Sectors and Types of Household

# Plotting Data on Energy Consumption by Different Sectors and Types of Household
plot(cbind(MFG, CON, UTI, OIR, WRT, AFS, ICM, FIA, REA, PST),
     main = "Energy Consumption by Different Sectors and Households")

plot(cbind(OCS, TRP, OTH, X1R2R, X3R, X4R, X5REX, PAC, LPP),
     main = "Energy Consumption by Different Sectors and Households")

Model Selection and Estimation

# Model Selection and Estimation
VAR_Data <- cbind(UTI, X3R, PAC)
colnames(VAR_Data) <- c("UTI", "X3R", "PAC")
VAR_Data_info <- VARselect(VAR_Data, lag.max = 10, type = "const")
VAR_Data_info

## $selection
## AIC(n)  HQ(n)  SC(n) FPE(n) 
##      1      1      1      2 
## 
## $criteria
##           1    2    3    4    5    6    7    8    9   10
## AIC(n) -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf
## HQ(n)  -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf
## SC(n)  -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf
## FPE(n)  NaN    0    0    0    0    0    0    0    0    0

VAR_Est <- VAR(VAR_Data, p = 2, type = "const", season = NULL, exogen = NULL)
summary(VAR_Est)

## 
## VAR Estimation Results:
## ========================= 
## Endogenous variables: UTI, X3R, PAC 
## Deterministic variables: const 
## Sample size: 12 
## Log Likelihood: -134.887 
## Roots of the characteristic polynomial:
## 0.9851 0.8169 0.8169 0.7236 0.529 0.529
## Call:
## VAR(y = VAR_Data, p = 2, type = "const", exogen = NULL)
## 
## 
## Estimation results for equation UTI: 
## ==================================== 
## UTI = UTI.l1 + X3R.l1 + PAC.l1 + UTI.l2 + X3R.l2 + PAC.l2 + const 
## 
##          Estimate Std. Error t value Pr(>|t|)  
## UTI.l1  5.893e-03  4.080e-01   0.014   0.9890  
## X3R.l1  1.219e+00  3.122e+00   0.390   0.7124  
## PAC.l1  1.863e-01  8.720e-01   0.214   0.8393  
## UTI.l2  9.199e-01  3.416e-01   2.693   0.0431 *
## X3R.l2 -1.315e+00  2.596e+00  -0.507   0.6339  
## PAC.l2 -1.101e-01  9.752e-01  -0.113   0.9145  
## const   1.705e+02  1.371e+03   0.124   0.9059  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 55.56 on 5 degrees of freedom
## Multiple R-Squared: 0.9863,  Adjusted R-squared: 0.9699 
## F-statistic: 60.14 on 6 and 5 DF,  p-value: 0.0001687 
## 
## 
## Estimation results for equation X3R: 
## ==================================== 
## X3R = UTI.l1 + X3R.l1 + PAC.l1 + UTI.l2 + X3R.l2 + PAC.l2 + const 
## 
##         Estimate Std. Error t value Pr(>|t|)  
## UTI.l1  -0.05192    0.11311  -0.459   0.6655  
## X3R.l1   0.80565    0.86568   0.931   0.3947  
## PAC.l1  -0.09847    0.24177  -0.407   0.7006  
## UTI.l2   0.21943    0.09471   2.317   0.0683 .
## X3R.l2  -0.81485    0.71961  -1.132   0.3088  
## PAC.l2  -0.05202    0.27038  -0.192   0.8550  
## const  844.83181  380.04930   2.223   0.0768 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 15.4 on 5 degrees of freedom
## Multiple R-Squared: 0.8863,  Adjusted R-squared: 0.7498 
## F-statistic: 6.493 on 6 and 5 DF,  p-value: 0.02903 
## 
## 
## Estimation results for equation PAC: 
## ==================================== 
## PAC = UTI.l1 + X3R.l1 + PAC.l1 + UTI.l2 + X3R.l2 + PAC.l2 + const 
## 
##          Estimate Std. Error t value Pr(>|t|)
## UTI.l1   -0.02237    0.37071  -0.060    0.954
## X3R.l1   -0.36804    2.83712  -0.130    0.902
## PAC.l1    0.65771    0.79236   0.830    0.444
## UTI.l2    0.61302    0.31040   1.975    0.105
## X3R.l2   -1.15912    2.35841  -0.491    0.644
## PAC.l2   -0.37658    0.88614  -0.425    0.689
## const  1813.12232 1245.55295   1.456    0.205
## 
## 
## Residual standard error: 50.48 on 5 degrees of freedom
## Multiple R-Squared: 0.9746,  Adjusted R-squared: 0.944 
## F-statistic: 31.91 on 6 and 5 DF,  p-value: 0.0007844 
## 
## 
## 
## Covariance matrix of residuals:
##        UTI   X3R    PAC
## UTI 3086.6 257.6  507.8
## X3R  257.6 237.3  768.1
## PAC  507.8 768.1 2548.4
## 
## Correlation matrix of residuals:
##        UTI    X3R    PAC
## UTI 1.0000 0.3010 0.1811
## X3R 0.3010 1.0000 0.9878
## PAC 0.1811 0.9878 1.0000

While a R-Squared value does not provide a formal hypothesis test for goodness-of-fit, it provides an estimate for the strength of the relationship between our variables. We determined the Simple Linear Regression Models for each of our Highly Correlated Variables, “UTI”, “X3R”, and “PAC”. They gave Adjusted R-Squared values of 0.97, 0.75, and 0.94, respectively. This suggests that these variables are highly correlated with one another. The models also gave p-values of 0.0001687, 0.02903, and 0.0007844, respectively. This suggests that the correlations between our variables are statistically significant. Therefore, we are confident that our Forecasting Model is reliable and robust.

Diagnostic Tests on Residuals of our Model

To consider the Model Fit, we performed some Diagnostic Tests on Residuals of our Model.

Portmanteau Test

To test for Serial Correlation, we performed a Portmanteau Test. Serial Correlation is the relationship between a given variable and a lagged version of itself over various time intervals. A variable that is serially correlated has a pattern and is not random.

# Portmanteau Test
VAR_Est_serial <- serial.test(VAR_Est, lags.pt = 10, type = "PT.asymptotic")
VAR_Est_serial

## 
##  Portmanteau Test (asymptotic)
## 
## data:  Residuals of VAR object VAR_Est
## Chi-squared = 58.643, df = 72, p-value = 0.8716

# Plotting Serial Correlations of Highly Correlated Variables
plot(VAR_Est_serial, names = "UTI")

plot(VAR_Est_serial, names = "X3R")

plot(VAR_Est_serial, names = "PAC")

From the Portmanteau Test shown above, we obtained a p-value of 0.8716 which does not reject the null hypothesis at the level of significance of 0.05. This suggests that there is an absence of Serial Correlation for our Highly Correlated variables, “UTI”, “X3R”, and “PAC”. There is no apparent relationship between these variables and the lagged versions of themselves over various time intervals. The variables are not serially correlated and their predictions are random and do not display a pattern.

Engle’s Autoregressive Conditional Heteroscedasticity (ARCH) Lagrange Multiplier (LM) Test

To test for Heteroscedasticity, we performed an Engle’s Autoregressive Conditional Heteroscedasticity (ARCH) Lagrange Multiplier (LM) Test. An uncorrelated time series can still be serially dependent due to a dynamic conditional variance process. A time series exhibiting Conditional Heteroscedasticity is said to have ARCH effects. Engle’s ARCH LM Test assesses the significance of these effects.

# Engle's ARCH LM Test
VAR_Est_ARCH <- arch.test(VAR_Est, lags.multi = 1, multivariate.only = TRUE)
VAR_Est_ARCH

## 
##  ARCH (multivariate)
## 
## data:  Residuals of VAR object VAR_Est
## Chi-squared = 38.872, df = 36, p-value = 0.3416

From the Engle’s ARCH LM Test shown above, we obtained a p-value of 0.3416 which does not reject the null hypothesis at the level of significance of 0.05. This suggests that there is an absence of ARCH effects for our Highly Correlated Variables, “UTI”, “X3R”, and “PAC”. There is no apparent Heteroscedasticity for these variables.

Normality Test

To consider the distribution of residuals in our model, we performed a Normality Test. It is used to determine if a dataset is well modelled by a normal distribution and to compute how likely it is for a random variable underlying the dataset to be normally distributed.

# Normality Test
VAR_Est_norm <- normality.test(VAR_Est, multivariate.only = TRUE)
VAR_Est_norm

## $JB
## 
##  JB-Test (multivariate)
## 
## data:  Residuals of VAR object VAR_Est
## Chi-squared = 2.3127, df = 6, p-value = 0.8888
## 
## 
## $Skewness
## 
##  Skewness only (multivariate)
## 
## data:  Residuals of VAR object VAR_Est
## Chi-squared = 0.38742, df = 3, p-value = 0.9428
## 
## 
## $Kurtosis
## 
##  Kurtosis only (multivariate)
## 
## data:  Residuals of VAR object VAR_Est
## Chi-squared = 1.9252, df = 3, p-value = 0.5881

From the Normality Test shown above, we obtained p-values of 0.8888 for the Jarque-Bera (JB) Test, 0.9428 for Skewness, and 0.5881 for Kurtosis. They do not reject the null hypothesis at the level of significance of 0.05. This suggests that the distribution of residuals in our model is a fairly normal distribution.

Cumulative Sum Control Chart (CUSUM) Test

To test for Structural Breaks in the residuals of our model, we performed a Cumulative Sum Control Chart (CUSUM) Test. It is a sequential analysis technique used to monitor change detection for statistical quality control. In econometrics and statistics, a Structural Break is an unexpected change over time in the parameters of regression models, which can lead to huge forecasting errors and reduced reliability of the model. Structural Stability is the time-invariance of regression coefficients and a central issue in all applications of linear regression models. The lack of Structural Stability of coefficients frequently caused forecast failure and should be tested for routinely.

# Cumulative Sum Control Chart (CUSUM) Test
VAR_Est_CUSUM <- stability(VAR_Est, type = "OLS-CUSUM")
plot(VAR_Est_CUSUM)

From the CUSUM Test shown above, there is no apparent Structural Break in the respective Confidence Intervals of our Highly Correlated Variables, “UTI”, “X3R”, and “PAC”. Therefore, we are confident that our Forecasting Model is reliable and robust.

Granger Causality Test

The Granger Causality Test is a statistical hypothesis test for determining whether one time series is useful in forecasting another. Ordinarily, regressions reflect correlations but Clive Granger argued that causality in economics could be tested by measuring the ability to predict the future values of a time series using prior values of another time series. Since the question of “True Causality” is deeply philosophical, and because of the post hoc ergo propter hoc fallacy of assuming that one thing preceding another can be used as a proof of causation, econometricians assert that the Granger Test finds only “Predictive Causality”.

# Testing if Utilities (UTI) has Power in Explaining Other Variables
VAR_Est_UTI <- causality(VAR_Est, cause = "UTI")
VAR_Est_UTI

## $Granger
## 
##  Granger causality H0: UTI do not Granger-cause X3R PAC
## 
## data:  VAR object VAR_Est
## F-Test = 3.6468, df1 = 4, df2 = 15, p-value = 0.02879
## 
## 
## $Instant
## 
##  H0: No instantaneous causality between: UTI and X3R PAC
## 
## data:  VAR object VAR_Est
## Chi-squared = 4.7083, df = 2, p-value = 0.09498

From the Granger Causality Test shown above, we obtained a p-value of 0.02879 which rejects the null hypothesis of no Granger causality at the level of significance of 0.05. This suggests that the variable, “UTI” may provide statistically significant information about future values of the variables, “X3R” and “PAC”. The variable, “UTI” has enough power in explaining the variables, “X3R” and “PAC”.

# Testing if 3-Room HDB (X3R) has Power in Explaining Other Variables
VAR_Est_X3R <- causality(VAR_Est, cause = "X3R")
VAR_Est_X3R

## $Granger
## 
##  Granger causality H0: X3R do not Granger-cause UTI PAC
## 
## data:  VAR object VAR_Est
## F-Test = 0.18257, df1 = 4, df2 = 15, p-value = 0.9439
## 
## 
## $Instant
## 
##  H0: No instantaneous causality between: X3R and UTI PAC
## 
## data:  VAR object VAR_Est
## Chi-squared = 5.9731, df = 2, p-value = 0.05046

From the Granger Causality Test shown above, we obtained a p-value of 0.9439 which does not reject the null hypothesis of no Granger causality at the level of significance of 0.05. This suggests that the variable, “X3R” may not provide statistically significant information about future values of the variables, “UTI” and “PAC”. The variable, “X3R” does not have enough power in explaining the variables, “UTI” and “PAC”.

# Testing if Private Apartments and Condominiums
# (PAC) has Power in Explaining Other Variables
VAR_Est_PAC <- causality(VAR_Est, cause = "PAC")
VAR_Est_PAC

## $Granger
## 
##  Granger causality H0: PAC do not Granger-cause UTI X3R
## 
## data:  VAR object VAR_Est
## F-Test = 0.2545, df1 = 4, df2 = 15, p-value = 0.9024
## 
## 
## $Instant
## 
##  H0: No instantaneous causality between: PAC and UTI X3R
## 
## data:  VAR object VAR_Est
## Chi-squared = 5.9714, df = 2, p-value = 0.0505

From the Granger Causality Test shown above, we obtained a p-value of 0.9024 which does not reject the null hypothesis of no Granger causality at the level of significance of 0.05. This suggests that the variable, “PAC” may not provide statistically significant information about future values of the variables, “UTI” and “X3R”. The variable, “PAC” does not have enough power in explaining the variables, “UTI” and “X3R”.

Plotting Impulse Response Plots for Highly Correlated Variables

Impulse Response of a dynamic system is its output when presented with a brief input signal called an impulse. Generally, an Impulse Response is the reaction of any dynamic system in response to some external change. It is the reaction of the system as a function of time or other independent variables that parameterize the dynamic behaviour of the system.

# Plotting Impulse Response Plots for Utilities (UTI)
IRF_UTI_UTI <- irf(VAR_Est, impulse = "UTI", response = "UTI", n.ahead = 10, boot = TRUE)
plot(IRF_UTI_UTI, ylab = "UTI (Gwh)", main = "Shock from UTI")

IRF_UTI_X3R <- irf(VAR_Est, impulse = "X3R", response = "UTI", n.ahead = 10, boot = TRUE)
plot(IRF_UTI_X3R, ylab = "UTI (Gwh)", main = "Shock from X3R")

IRF_UTI_PAC <- irf(VAR_Est, impulse = "PAC", response = "UTI", n.ahead = 10, boot = TRUE)
plot(IRF_UTI_PAC, ylab = "UTI (Gwh)", main = "Shock from PAC")

From the Impulse Response Plots shown above, a positive shock to Utilities (UTI) has a positive impact on Energy Consumption for Utilities (UTI). This is to be expected. Similarly, a positive shock to 3-Room HDB (X3R) has a positive impact on Energy Consumption for Utilities (UTI). This suggests that an increase in Energy Consumption for 3-Room HDB (X3R) may cause an increase in Energy Consumption for Utilities (UTI). Similarly, a positive shock to Private Apartments and Condominiums (PAC) has a positive impact on Energy Consumption for Utilities (UTI). This suggests that an increase in Energy Consumption for Private Apartments and Condominiums (PAC) may cause an increase in Energy Consumption for Utilities (UTI).

# Plotting Impulse Response Plots for 3-Room HDB (X3R)
IRF_X3R_X3R <- irf(VAR_Est, impulse = "X3R", response = "X3R", n.ahead = 10, boot = TRUE)
plot(IRF_X3R_X3R, ylab = "X3R (Gwh)", main = "Shock from X3R")

IRF_X3R_UTI <- irf(VAR_Est, impulse = "UTI", response = "X3R", n.ahead = 10, boot = TRUE)
plot(IRF_X3R_UTI, ylab = "X3R (Gwh)", main = "Shock from UTI")

IRF_X3R_PAC <- irf(VAR_Est, impulse = "PAC", response = "X3R", n.ahead = 10, boot = TRUE)
plot(IRF_X3R_PAC, ylab = "X3R (Gwh)", main = "Shock from PAC")

From the Impulse Response Plots shown above, a positive shock to 3-Room HDB (X3R) has a positive impact on Energy Consumption for 3-Room HDB (X3R). This is to be expected. Similarly, a positive shock to Utilities (UTI) has a positive impact on Energy Consumption for 3-Room HDB (X3R). This suggests that an increase in Energy Consumption for Utilities (UTI) may cause an increase in Energy Consumption for 3-Room HDB (X3R). Similarly, a positive shock to Private Apartments and Condominiums (PAC) has a positive impact on Energy Consumption for 3-Room HDB (X3R). This suggests that an increase in Energy Consumption for Private Apartments and Condominiums (PAC) may cause an increase in Energy Consumption for 3-Room HDB (X3R).

# Plotting Impulse Response Plots for Private Apartments and Condominiums (PAC)
IRF_PAC_PAC <- irf(VAR_Est, impulse = "PAC", response = "PAC", n.ahead = 10, boot = TRUE)
plot(IRF_PAC_PAC, ylab = "PAC (Gwh)", main = "Shock from PAC")

IRF_PAC_UTI <- irf(VAR_Est, impulse = "UTI", response = "PAC", n.ahead = 10, boot = TRUE)
plot(IRF_PAC_UTI, ylab = "PAC (Gwh)", main = "Shock from UTI")

IRF_PAC_X3R <- irf(VAR_Est, impulse = "X3R", response = "PAC", n.ahead = 10, boot = TRUE)
plot(IRF_PAC_X3R, ylab = "PAC (Gwh)", main = "Shock from X3R")

From the Impulse Response Plots shown above, a positive shock to Private Apartments and Condominiums (PAC) has a positive impact on Energy Consumption for Private Apartments and Condominiums (PAC). This is to be expected. Similarly, a positive shock to Utilities (UTI) has a positive impact on Energy Consumption for Private Apartments and Condominiums (PAC). This suggests that an increase in Energy Consumption for Utilities (UTI) may cause an increase in Energy Consumption for Private Apartments and Condominiums (PAC). Similarly, a positive shock to 3-Room HDB (X3R) has a positive impact on Energy Consumption for Private Apartments and Condominiums (PAC). This suggests that an increase in Energy Consumption for 3-Room HDB (X3R) may cause an increase in Energy Consumption for Private Apartments and Condominiums (PAC).

Plotting Forecast Error Variance Decomposition (FEVD) Plots for Highly Correlated Variables

A Forecast Error Variance Decomposition (FEVD) is used to interpret a Vector Autoregression (VAR) Model once it has been fitted. It indicates the amount of information each variable contributes to the other variables in the autoregression. It determines how much of the Forecast Error Variance for each variable can be explained by exogenous shocks to the other variables.

# Plotting Forecast Error Variance Decomposition
# (FEVD) Plots for Highly Correlated Variables
VAR_Est_FEVD <- fevd(VAR_Est, n.ahead = 10)
plot(VAR_Est_FEVD, col = 1:3)

From the FEVD Plots shown above, our results suggest that Utilities (UTI) is largely determined by exogenous shocks to itself, while 3-Room HDB (X3R) is also largely determined by exogenous shocks to itself. Interestingly, the Forecast Error Variance for Private Apartments and Condominiums (PAC) is largely determined by exogenous shocks to the other two variables, Utilities (UTI) and 3-Room HDB (X3R).

Forecasting Energy Consumption in the Next 10 Years

# Forecasting Energy Consumption in the Next 10 Years
predictions <- predict(VAR_Est, n.ahead = 10, ci = 0.95)
predictions

## $UTI
##           fcst    lower    upper       CI
##  [1,] 1530.177 1421.286 1639.067 108.8906
##  [2,] 1681.132 1559.031 1803.233 122.1013
##  [3,] 1740.143 1588.460 1891.826 151.6828
##  [4,] 1822.183 1664.476 1979.890 157.7071
##  [5,] 1863.727 1691.683 2035.770 172.0432
##  [6,] 1952.785 1774.568 2131.003 178.2174
##  [7,] 2026.837 1839.105 2214.570 187.7323
##  [8,] 2103.592 1909.741 2297.444 193.8515
##  [9,] 2160.355 1958.341 2362.370 202.0144
## [10,] 2228.652 2021.220 2436.084 207.4325
## 
## $X3R
##           fcst    lower    upper       CI
##  [1,] 758.4311 728.2410 788.6211 30.19004
##  [2,] 797.0923 763.9420 830.2426 33.15030
##  [3,] 798.1474 753.9772 842.3176 44.17018
##  [4,] 788.1114 740.3335 835.8892 47.77784
##  [5,] 784.1940 735.8442 832.5439 48.34984
##  [6,] 800.1906 749.3373 851.0440 50.85336
##  [7,] 810.8743 759.8132 861.9353 51.06105
##  [8,] 812.0478 759.9829 864.1126 52.06486
##  [9,] 809.4837 757.1584 861.8090 52.32529
## [10,] 814.8867 762.2048 867.5685 52.68183
## 
## $PAC
##           fcst    lower    upper       CI
##  [1,] 2041.940 1942.997 2140.884  98.9432
##  [2,] 2164.129 2051.503 2276.754 112.6253
##  [3,] 2195.475 2064.331 2326.619 131.1445
##  [4,] 2216.095 2078.856 2353.334 137.2389
##  [5,] 2254.663 2110.187 2399.139 144.4760
##  [6,] 2334.701 2183.594 2485.808 151.1067
##  [7,] 2394.947 2237.849 2552.045 157.0977
##  [8,] 2434.895 2271.745 2598.044 163.1498
##  [9,] 2469.344 2301.847 2636.841 167.4969
## [10,] 2522.323 2351.145 2693.502 171.1788

From the Matrices shown above, there is an upward trend in Energy Consumption for the next 10 years. For the sector of Industrial-Related Utilities (UTI), the 10-Year Forecast for Energy Consumption is an increase from 1,530Gwh in 2019 to 2,229Gwh in 2028. For 3-Room HDB Flats (X3R), the 10-Year Forecast for Energy Consumption is an increase from 758Gwh in 2019 to 815Gwh in 2028. For Private Apartments and Condominiums (PAC), the 10-Year Forecast for Energy Consumption is an increase from 2,042Gwh in 2019 to 2,522Gwh in 2028. The 95% Confidence Intervals for these forecasts are also provided.

Plotting Energy Consumption Forecasts in the Next 10 Years

# Plotting Energy Consumption Forecast for Utilities (UTI) in the Next 10 Years
plot(predictions, names = "UTI",
     main = "Forecast of Time Series for Industrial-Related Utilities",
     ylab = "Energy Consumption (Gwh)", xlab = "Years Since 2005")

fanchart(predictions, names = "UTI",
         main = "Forecast of Time Series for Industrial-Related Utilities",
         ylab = "Energy Consumption (Gwh)", xlab = "Years Since 2005")

# Plotting Energy Consumption Forecast for 3-Room HDB Flats (X3R) in the Next 10 Years
plot(predictions, names = "X3R",
     main = "Forecast of Time Series for 3-Room HDB Flats",
     ylab = "Energy Consumption (Gwh)", xlab = "Years Since 2005")

fanchart(predictions, names = "X3R",
         main = "Forecast of Time Series for 3-Room HDB Flats",
         ylab = "Energy Consumption (Gwh)", xlab = "Years Since 2005")

# Plotting Energy Consumption Forecast for Private
# Apartments and Condominiums (PAC) in the Next 10 Years
plot(predictions, names = "PAC",
     main = "Forecast of Time Series for Private Apartments and Condos",
     ylab = "Energy Consumption (Gwh)", xlab = "Years Since 2005")

fanchart(predictions, names = "PAC",
         main = "Forecast of Time Series for Private Apartments and Condos",
         ylab = "Energy Consumption (Gwh)", xlab = "Years Since 2005")