a. Visualization
- Selected stock price: IBDRY_Adj_Close. Iberdrola SA is one of the
biggest renewable energy companies in the world; it was founded in Spain
and has 170 years of history. It is the world’s largest wind power
producer and one of the world’s largest electric utilities by market
capitalization. It focuses on the energy transition to reduce the
negative effects of climate change and to meet the need for a clean,
reliable and smart business model (Iberdrola SA, 2023).
sum(is.na(VARts))
## [1] 0
- Plot the selected stock price over the time period
VARts$Date <- as.Date(VARts$Date,"%m/%d/%Y")
IBxts<-xts(VARts$IBDRY_Adj_Close,order.by=VARts$Date)
dygraph(IBxts, main = "Iberdrola SA Stock Price") %>%
dyOptions(colors = RColorBrewer::brewer.pal(4, "Dark2"))
In this graph, we can observe a positive trend over time, with a
notable increase between October 2018 and December 2020. There are also
some sharp decreases around March 2020, September 2021 and
September 2022. After the decrease of September 2022, the price
resumes its upward trend.
This positive trend could be related to the significant increase in
greenhouse gas emissions and the consequent global warming in recent
years, which may have generated a greater demand for the services
offered by Iberdrola.
- Decompose the stock price and describe the trend and seasonal
components of the time series data.
IBDRYts<-ts(VARts$IBDRY_Adj_Close,start=c(2015,1),end=c(2022,12),frequency=12)
IBdec<-decompose(IBDRYts)
plot(IBdec)

The decomposition of the time series gives a better understanding of
its behavior. The observed component shows the same behavior described
for the time series plot above. The trend component shows a clearer
positive trend that starts to increase in 2018; from this, it can be
inferred that the stock price generally maintains a favorable positive
trend, even though it has periodic decreases. The seasonal component
shows a pattern that repeats every year and identifies the months in
which the stock price tends to sit above or below its trend. In the
random component, similar behavior appears to recur roughly every two
years.
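As a rough illustration of what decompose() does in its default additive mode, the sketch below builds a simulated monthly series (not the Iberdrola data; the trend, seasonal shape and noise level are assumptions chosen for the example) and checks that the three components add back up to the observed values.

```r
# Minimal sketch on simulated data (hypothetical values, not the Iberdrola series).
set.seed(1)
n_months <- 96                                    # 8 years of monthly data
trend    <- seq(20, 45, length.out = n_months)    # assumed upward trend
seasonal <- rep(sin(2 * pi * (1:12) / 12), times = n_months / 12)
noise    <- rnorm(n_months, sd = 0.5)
sim_ts   <- ts(trend + seasonal + noise, start = c(2015, 1), frequency = 12)

dec <- decompose(sim_ts)   # additive decomposition by default

# Trend: centred 12-month moving average (NA for the first and last 6 months).
# Seasonal: one value per calendar month, repeated every year, centred on zero.
# Random: what remains after removing trend and seasonal from the observed series.
rebuilt <- dec$trend + dec$seasonal + dec$random
ok      <- !is.na(rebuilt)
max_gap <- max(abs(rebuilt[ok] - sim_ts[ok]))
max_gap                    # numerically zero
```

Because the trend is a centred moving average, the first and last six months have no trend or random value, which is why the ends of those panels are empty in the decomposition plot.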
- Does the selected stock price display a non-stationary or a
stationary series?
adf.test(VARts$IBDRY_Adj_Close)
##
## Augmented Dickey-Fuller Test
##
## data: VARts$IBDRY_Adj_Close
## Dickey-Fuller = -1.6998, Lag order = 4, p-value = 0.7005
## alternative hypothesis: stationary
Since the p-value (0.7005) is greater than 0.05, we fail to reject the
null hypothesis of a unit root, so the series is treated as
non-stationary.
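To make the test less of a black box, the following sketch reproduces the regression that an Augmented Dickey-Fuller test with constant and trend is based on, using only base R. The helper name adf_stat and the simulated series are assumptions for illustration; the p-value lookup that adf.test() performs against tabulated critical values is not reproduced here.

```r
# Hypothetical helper: ADF test statistic with constant and linear trend.
# Regression: diff(y)_t = a + b*t + g*y_{t-1} + sum_i d_i*diff(y)_{t-i} + e_t
# The statistic is the t-ratio of g.
adf_stat <- function(x, k = trunc((length(x) - 1)^(1/3))) {
  dx   <- diff(x)
  n    <- length(dx)
  z    <- embed(dx, k + 1)     # column 1: current difference, others: lagged differences
  dy   <- z[, 1]
  ylag <- x[(k + 1):n]         # lagged level y_{t-1}
  tt   <- (k + 1):n            # linear time trend
  fit  <- if (k > 0) lm(dy ~ ylag + tt + z[, -1]) else lm(dy ~ ylag + tt)
  coefs <- summary(fit)$coefficients
  coefs["ylag", "Estimate"] / coefs["ylag", "Std. Error"]
}

# Simulated examples (not the series used in this report).
set.seed(42)
random_walk <- cumsum(rnorm(300))   # contains a unit root
white_noise <- rnorm(300)           # stationary

stat_rw <- adf_stat(random_walk, k = 4)
stat_wn <- adf_stat(white_noise, k = 4)
c(random_walk = stat_rw, white_noise = stat_wn)
```

A strongly negative statistic is evidence against a unit root, so the stationary series is expected to produce a much more negative value than the random walk.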
b. Describing Dynamic Interactions
Selecting explanatory variables:
The explanatory variables selected to try to explain Iberdrola’s
stock price are the following:
Non-Store Retailing: it measures the performance of online and
non-traditional retail businesses.
Unemployment: unemployment rate
Consumer Confidence: it measures how optimistic / pessimistic
consumers are regarding the state of the economy.
Hour_Wage Sentiment: evaluates the sentiment related to hourly
wages.
par(mfrow=c(2,3))
plot(VARts$Date,VARts$NonStore_Retailing,type="l",col="blue",lwd=2,xlab="Date",ylab="NonStore_Retailing (USD)",main="NonStore_Retailing")
plot(VARts$Date,VARts$US_Unemployment,type="l",col="blue",lwd=2,xlab="Date",ylab="US_Unemployment",main="US_Unemployment")
plot(VARts$Date,VARts$US_Consumer_Confidence,type="l",col="blue",lwd=2,xlab="Date",ylab="US_Consumer_Confidence",main="US_Consumer_Confidence")
plot(VARts$Date,VARts$US_Min_Hour_Wage,type="l",col="blue",lwd=2,xlab="Date",ylab="US_Min_Hour_Wage Sentiment (USD)",main="US_Min_Hour_Wage")
plot(VARts$Date,VARts$IBDRY_Adj_Close,type="l",col="blue",lwd=2,xlab="Date",ylab="IBDRY_Adj_Close",main="IBDRY's Stock Price (USD)")

The graphs show the following:
Non Store Retailing: it shows a growth trend over the years, probably
because e-commerce is becoming more popular for its accessibility and
convenience. This trend intensified after the pandemic, because people
found a more practical way to get products without being in contact
with others. The stock price also shows a growing trend, so there may
be a positive relationship between these two variables.
Unemployment: it started with a downward trend, rose to its highest
point in 2020, and decreased again over the next few years. Because it
tends to move in the opposite direction to the stock price, the two
variables appear to have a negative relationship, probably reflecting
the economic growth in the country.
Consumer Confidence: at first it seems to have a positive trend with
a pattern of ups and downs, with values over 85. However, in the first
months of 2020, with the pandemic, it started to decrease because of
the uncertainty. At around the same time, the stock price of the
company also began to decrease, so they seem to have a positive
relationship.
Min_Hour Wage: it has the same value in all periods, so it cannot
help explain the other variables.
IBDRY’s Stock Price: in general it has a growing trend; it decreased
in 2020 because of the pandemic and then started to increase again
with ups and downs.
c. Estimation & Model Selection
To choose an appropriate model specification, we first test each
variable for stationarity, since the model should be estimated on
stationary time series data.
# Running these tests on the ts versions of the variables gives the same result.
adf.test(VARts$IBDRY_Adj_Close)
##
## Augmented Dickey-Fuller Test
##
## data: VARts$IBDRY_Adj_Close
## Dickey-Fuller = -1.6998, Lag order = 4, p-value = 0.7005
## alternative hypothesis: stationary
adf.test(VARts$NonStore_Retailing)
##
## Augmented Dickey-Fuller Test
##
## data: VARts$NonStore_Retailing
## Dickey-Fuller = -1.8269, Lag order = 4, p-value = 0.648
## alternative hypothesis: stationary
adf.test(VARts$US_Unemployment)
##
## Augmented Dickey-Fuller Test
##
## data: VARts$US_Unemployment
## Dickey-Fuller = -2.5755, Lag order = 4, p-value = 0.3386
## alternative hypothesis: stationary
adf.test(VARts$US_Consumer_Confidence)
##
## Augmented Dickey-Fuller Test
##
## data: VARts$US_Consumer_Confidence
## Dickey-Fuller = -1.7896, Lag order = 4, p-value = 0.6634
## alternative hypothesis: stationary
tsplot(VARts$IBDRY_Adj_Close)
The Augmented Dickey-Fuller test results for the variables we intend
to use in our model show that all of them are non-stationary, since
every p-value is greater than 0.05.
# Applying a log transformation to try to obtain a stationary series.
adf.test(log(VARts$IBDRY_Adj_Close))
##
## Augmented Dickey-Fuller Test
##
## data: log(VARts$IBDRY_Adj_Close)
## Dickey-Fuller = -1.6406, Lag order = 4, p-value = 0.7249
## alternative hypothesis: stationary
tsplot(log(VARts$IBDRY_Adj_Close))
Although the p-value changes, it is still greater than 0.05, so it is
also necessary to difference the variables.
# Running these tests on the ts versions of the variables gives the same result.
adf.test(diff(log(VARts$IBDRY_Adj_Close)))
##
## Augmented Dickey-Fuller Test
##
## data: diff(log(VARts$IBDRY_Adj_Close))
## Dickey-Fuller = -4.0157, Lag order = 4, p-value = 0.01179
## alternative hypothesis: stationary
adf.test(diff(log(VARts$NonStore_Retailing)))
##
## Augmented Dickey-Fuller Test
##
## data: diff(log(VARts$NonStore_Retailing))
## Dickey-Fuller = -3.965, Lag order = 4, p-value = 0.0142
## alternative hypothesis: stationary
adf.test(diff(log(VARts$US_Unemployment)))
## Warning in adf.test(diff(log(VARts$US_Unemployment))): p-value smaller than
## printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: diff(log(VARts$US_Unemployment))
## Dickey-Fuller = -4.6314, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary
adf.test(diff(log(VARts$US_Consumer_Confidence)))
## Warning in adf.test(diff(log(VARts$US_Consumer_Confidence))): p-value smaller
## than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: diff(log(VARts$US_Consumer_Confidence))
## Dickey-Fuller = -5.6905, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary
tsplot(diff(log(VARts$IBDRY_Adj_Close)))

Using log differences, it is possible to obtain stationary time
series data (p-value < 0.05). This is the transformation we are
going to use to estimate the model, in order to obtain more reliable
predictions.
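For intuition about what the transformed series measures, the short sketch below compares diff(log(x)) with the simple month-over-month percentage change. The prices are hypothetical values chosen for the example, not observations from the data set.

```r
# Hypothetical monthly prices, for illustration only.
price <- c(40, 42, 41, 44)

log_ret <- diff(log(price))                 # log(P_t / P_{t-1})
pct_chg <- diff(price) / head(price, -1)    # (P_t - P_{t-1}) / P_{t-1}

# For small monthly changes the two measures are close, so each transformed
# series can be read as an approximate monthly growth rate.
round(cbind(log_ret, pct_chg), 4)
```

Under this reading, the model estimated in the next step relates the monthly growth rate of the stock price to the lagged growth rates of the explanatory variables rather than to their levels.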
i. Estimating a VAR Model
#Converting to time series format
nsr<-ts(VARts$NonStore_Retailing,start=c(2015,1),end=c(2022,12),frequency=12)
unemployment<-ts(VARts$US_Unemployment,start=c(2015,1),end=c(2022,12),frequency=12)
cc<-ts(VARts$US_Consumer_Confidence,start=c(2015,1),end=c(2022,12),frequency=12)
stock<-ts(VARts$IBDRY_Adj_Close,start=c(2015,1),end=c(2022,12),frequency=12)
#Transforming into diff and log values.
dstock<-diff(log(stock))
dnsr<-diff(log(nsr))
dunemployment<-diff(log(unemployment))
dcc<-diff(log(cc))
VAR_ts<-cbind(dstock, dnsr, dunemployment,dcc)
colnames(VAR_ts)<-c("IBDRY_Stock", "NonStore_Retailing", "US_Unemployment","US_Consumer_Confidence")
lag_selection<-VARselect(VAR_ts,lag.max=5,type="const", season=12)
lag_selection$selection
## AIC(n) HQ(n) SC(n) FPE(n)
## 1 1 1 1
lag_selection$criteria
## 1 2 3 4 5
## AIC(n) -2.261967e+01 -2.256884e+01 -2.248295e+01 -2.231861e+01 -2.223682e+01
## HQ(n) -2.190282e+01 -2.167278e+01 -2.140767e+01 -2.106412e+01 -2.080312e+01
## SC(n) -2.084202e+01 -2.034678e+01 -1.981648e+01 -1.920773e+01 -1.868153e+01
## FPE(n) 1.524182e-10 1.627713e-10 1.814383e-10 2.209031e-10 2.506538e-10
In the above results we can observe:
The AIC is lowest for a lag order of 1 (AIC = -22.620),
suggesting that a VAR model with a lag order of 1 is the best
choice based on AIC.
The HQ criterion is lowest for a lag order of 1 (HQ = -21.903),
aligning with the AIC results.
The SC is lowest for a lag order of 1 (SC = -20.842), again
suggesting that a lag order of 1 is the optimal choice based on
SC.
The FPE criterion is also lowest for a lag order of 1
(FPE = 1.524e-10) and increases steadily as the lag order increases,
reaching 2.507e-10 at lag 5.
In summary, based on the criteria used for evaluation (AIC, HQ, SC,
and FPE), Lag Order 1 appears to be the preferred choice as it
consistently minimizes these criterion values. This suggests that a
model with a lag order of 1 might provide a better fit to the data
compared to other lag orders.
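The selection can be double-checked directly from the criteria table. The sketch below copies the values printed by lag_selection$criteria into a matrix and finds the minimizing lag for each criterion; only the object names criteria and best_lag are introduced here.

```r
# Values copied from the lag_selection$criteria output (rows: criterion, columns: lag).
criteria <- rbind(
  AIC = c(-2.261967e+01, -2.256884e+01, -2.248295e+01, -2.231861e+01, -2.223682e+01),
  HQ  = c(-2.190282e+01, -2.167278e+01, -2.140767e+01, -2.106412e+01, -2.080312e+01),
  SC  = c(-2.084202e+01, -2.034678e+01, -1.981648e+01, -1.920773e+01, -1.868153e+01),
  FPE = c( 1.524182e-10,  1.627713e-10,  1.814383e-10,  2.209031e-10,  2.506538e-10)
)
colnames(criteria) <- 1:5

# Every criterion is "smaller is better", so the preferred lag is the column
# holding the minimum of each row.
best_lag <- apply(criteria, 1, which.min)
best_lag
```

All four criteria point to the same lag here, which matches lag_selection$selection.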
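# We estimate the VAR model. The p option refers to the number of lags used.
# Note: season=4 adds three seasonal dummies (sd1-sd3), whereas the lag selection above used season=12.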
VAR_model1<-VAR(VAR_ts,p=1,type="const",season=4)
summary(VAR_model1)
##
## VAR Estimation Results:
## =========================
## Endogenous variables: IBDRY_Stock, NonStore_Retailing, US_Unemployment, US_Consumer_Confidence
## Deterministic variables: const
## Sample size: 94
## Log Likelihood: 583.72
## Roots of the characteristic polynomial:
## 0.3648 0.3648 0.04547 0.03011
## Call:
## VAR(y = VAR_ts, p = 1, type = "const", season = 4L)
##
##
## Estimation results for equation IBDRY_Stock:
## ============================================
## IBDRY_Stock = IBDRY_Stock.l1 + NonStore_Retailing.l1 + US_Unemployment.l1 + US_Consumer_Confidence.l1 + const + sd1 + sd2 + sd3
##
## Estimate Std. Error t value Pr(>|t|)
## IBDRY_Stock.l1 -0.264127 0.100834 -2.619 0.0104 *
## NonStore_Retailing.l1 -0.120433 0.274262 -0.439 0.6617
## US_Unemployment.l1 0.100186 0.057924 1.730 0.0873 .
## US_Consumer_Confidence.l1 0.276492 0.132985 2.079 0.0406 *
## const 0.014895 0.007115 2.093 0.0393 *
## sd1 0.011868 0.019467 0.610 0.5437
## sd2 0.014232 0.018498 0.769 0.4438
## sd3 0.045424 0.018707 2.428 0.0173 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.06156 on 86 degrees of freedom
## Multiple R-Squared: 0.182, Adjusted R-squared: 0.1154
## F-statistic: 2.733 on 7 and 86 DF, p-value: 0.01308
##
##
## Estimation results for equation NonStore_Retailing:
## ===================================================
## NonStore_Retailing = IBDRY_Stock.l1 + NonStore_Retailing.l1 + US_Unemployment.l1 + US_Consumer_Confidence.l1 + const + sd1 + sd2 + sd3
##
## Estimate Std. Error t value Pr(>|t|)
## IBDRY_Stock.l1 -0.022922 0.038503 -0.595 0.553184
## NonStore_Retailing.l1 -0.410938 0.104727 -3.924 0.000175 ***
## US_Unemployment.l1 0.080025 0.022118 3.618 0.000500 ***
## US_Consumer_Confidence.l1 -0.046327 0.050780 -0.912 0.364156
## const 0.015285 0.002717 5.626 2.26e-07 ***
## sd1 -0.003438 0.007433 -0.463 0.644870
## sd2 -0.001921 0.007063 -0.272 0.786286
## sd3 -0.002405 0.007143 -0.337 0.737179
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.02351 on 86 degrees of freedom
## Multiple R-Squared: 0.2457, Adjusted R-squared: 0.1843
## F-statistic: 4.002 on 7 and 86 DF, p-value: 0.00078
##
##
## Estimation results for equation US_Unemployment:
## ================================================
## US_Unemployment = IBDRY_Stock.l1 + NonStore_Retailing.l1 + US_Unemployment.l1 + US_Consumer_Confidence.l1 + const + sd1 + sd2 + sd3
##
## Estimate Std. Error t value Pr(>|t|)
## IBDRY_Stock.l1 -0.577862 0.215458 -2.682 0.00877 **
## NonStore_Retailing.l1 0.270087 0.586036 0.461 0.64605
## US_Unemployment.l1 0.059154 0.123770 0.478 0.63391
## US_Consumer_Confidence.l1 -0.230387 0.284159 -0.811 0.41974
## const -0.003666 0.015203 -0.241 0.81004
## sd1 -0.016030 0.041596 -0.385 0.70091
## sd2 0.002128 0.039526 0.054 0.95719
## sd3 0.047920 0.039973 1.199 0.23389
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.1315 on 86 degrees of freedom
## Multiple R-Squared: 0.134, Adjusted R-squared: 0.06354
## F-statistic: 1.901 on 7 and 86 DF, p-value: 0.07903
##
##
## Estimation results for equation US_Consumer_Confidence:
## =======================================================
## US_Consumer_Confidence = IBDRY_Stock.l1 + NonStore_Retailing.l1 + US_Unemployment.l1 + US_Consumer_Confidence.l1 + const + sd1 + sd2 + sd3
##
## Estimate Std. Error t value Pr(>|t|)
## IBDRY_Stock.l1 0.047726 0.090697 0.526 0.600
## NonStore_Retailing.l1 0.233262 0.246690 0.946 0.347
## US_Unemployment.l1 -0.060748 0.052101 -1.166 0.247
## US_Consumer_Confidence.l1 -0.057425 0.119616 -0.480 0.632
## const -0.008496 0.006400 -1.328 0.188
## sd1 0.002086 0.017510 0.119 0.905
## sd2 -0.009595 0.016638 -0.577 0.566
## sd3 0.003451 0.016827 0.205 0.838
##
##
## Residual standard error: 0.05537 on 86 degrees of freedom
## Multiple R-Squared: 0.0353, Adjusted R-squared: -0.04323
## F-statistic: 0.4495 on 7 and 86 DF, p-value: 0.868
##
##
##
## Covariance matrix of residuals:
## IBDRY_Stock NonStore_Retailing US_Unemployment
## IBDRY_Stock 3.790e-03 -5.975e-05 -0.001276
## NonStore_Retailing -5.975e-05 5.526e-04 0.001398
## US_Unemployment -1.276e-03 1.398e-03 0.017305
## US_Consumer_Confidence 2.290e-04 -2.325e-04 -0.003357
## US_Consumer_Confidence
## IBDRY_Stock 0.0002290
## NonStore_Retailing -0.0002325
## US_Unemployment -0.0033574
## US_Consumer_Confidence 0.0030663
##
## Correlation matrix of residuals:
## IBDRY_Stock NonStore_Retailing US_Unemployment
## IBDRY_Stock 1.00000 -0.04128 -0.1575
## NonStore_Retailing -0.04128 1.00000 0.4520
## US_Unemployment -0.15750 0.45196 1.0000
## US_Consumer_Confidence 0.06718 -0.17858 -0.4609
## US_Consumer_Confidence
## IBDRY_Stock 0.06718
## NonStore_Retailing -0.17858
## US_Unemployment -0.46091
## US_Consumer_Confidence 1.00000
VAR_model1_residuals<-data.frame(residuals(VAR_model1))
adf.test(VAR_model1_residuals$IBDRY_Stock)
##
## Augmented Dickey-Fuller Test
##
## data: VAR_model1_residuals$IBDRY_Stock
## Dickey-Fuller = -3.508, Lag order = 4, p-value = 0.04549
## alternative hypothesis: stationary
# The p-value (0.04549) is less than 0.05, so the null hypothesis of a unit root is rejected: the residuals are stationary.
Box.test(VAR_model1_residuals$IBDRY_Stock,lag=1,type="Ljung-Box")
##
## Box-Ljung test
##
## data: VAR_model1_residuals$IBDRY_Stock
## X-squared = 0.064892, df = 1, p-value = 0.7989
# The p-value is greater than 0.05, so there is not enough evidence to conclude that there is autocorrelation in the residuals.
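As a check on what the Ljung-Box test measures, the sketch below computes the statistic by hand, Q = n(n+2) * sum(r_k^2 / (n-k)), and compares it with Box.test(). The residuals are simulated white noise (an assumption for illustration, not the residuals of VAR_model1).

```r
# Simulated residuals, for illustration only; 94 matches the sample size of the model.
set.seed(123)
res <- rnorm(94)
n   <- length(res)
h   <- 1                                   # number of lags tested

# Sample autocorrelations at lags 1..h (the first element of $acf is lag 0).
r <- acf(res, lag.max = h, plot = FALSE)$acf[-1]

Q_manual <- n * (n + 2) * sum(r^2 / (n - seq_len(h)))
p_manual <- pchisq(Q_manual, df = h, lower.tail = FALSE)

bt <- Box.test(res, lag = h, type = "Ljung-Box")
all.equal(Q_manual, unname(bt$statistic))  # TRUE
```

A large Q (small p-value) would indicate autocorrelation left in the residuals; testing only lag 1, as done above, would not detect autocorrelation that appears at longer lags.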
ii. Model Selection
Briefly interpret the regression results. That is, is there a
statistically significant relationship between the explanatory variables
and the main dependent variable?
The provided VAR estimation results describe the relationships
between several economic variables: “IBDRY_Stock,” “NonStore_Retailing,”
“US_Unemployment,” and “US_Consumer_Confidence.” Let’s interpret these
results and provide some insights:
- Model Summary:
- The model includes four endogenous variables: “IBDRY_Stock,”
“NonStore_Retailing,” “US_Unemployment,” and
“US_Consumer_Confidence.”
- A constant term is included in the model.
- The sample size used for estimation consists of 94
observations.
- The log likelihood of the model is 583.72.
- Characteristic Polynomial Roots:
- The characteristic polynomial roots are given as 0.3648, 0.3648,
0.04547, and 0.03011. These roots are essential for determining the
stability of the VAR model. Having roots with magnitudes less than one
indicates stability.
- Equation-Specific Results:
- The output provides results for each endogenous variable equation in
the VAR model. We are going to interpret the statistically significant
variables for each equation.
IBDRY_Stock Equation: (Main dependent Variable)
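- “IBDRY_Stock” has a negative and significant coefficient for its own
lagged value (“IBDRY_Stock.l1” = -0.264, p = 0.0104). Because the
series are log differences, this suggests that an increase in the
previous month’s price growth is followed by lower growth in the
current month.
- “US_Consumer_Confidence.l1” has a positive and significant
coefficient (0.276, p = 0.0406), indicating that an increase in
consumer confidence in the previous period is associated with higher
stock price growth in the current period.
- “US_Unemployment.l1” is significant only at the 10% level
(p = 0.0873) and “NonStore_Retailing.l1” is not significant
(p = 0.6617).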
NonStore_Retailing Equation:
- “NonStore_Retailing” is negatively influenced by its own lagged
value.
- “US_Unemployment.l1” has a positive effect on “NonStore_Retailing,”
suggesting that an increase in unemployment in the previous period is
associated with higher non-store retailing in the current period.
- The constant term has a significant positive impact on
“NonStore_Retailing.”
US_Unemployment Equation:
- “IBDRY_Stock.l1” has a negative coefficient, indicating that an
increase in the previous period’s IBDRY stock price is associated with a
decrease in unemployment in the current period.
- The other variables, including “NonStore_Retailing.l1” and
“US_Consumer_Confidence.l1,” do not have significant effects on
“US_Unemployment.”
US_Consumer_Confidence Equation:
- “IBDRY_Stock.l1” has a small positive effect on
“US_Consumer_Confidence,” though it is not statistically
significant.
- None of the other variables, including lagged values and the
constant term, have significant effects on
“US_Consumer_Confidence.”
- Residuals and Model Fit:
- Each equation’s residuals have residual standard errors, multiple
R-squared, adjusted R-squared, and F-statistics.
- These statistics provide information about the goodness of fit for
each equation.
- Covariance Matrix and Correlation Matrix of
Residuals:
- These matrices show the relationships between the residuals of
different equations.
- For example, the correlation matrix indicates how correlated the
residuals of the different variables are. Negative values suggest
inverse correlations, while positive values suggest direct
correlations.
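The stability condition mentioned above can be checked by hand for a VAR(1): the model is stable when every eigenvalue of the lag-1 coefficient matrix has modulus below one. The sketch below rebuilds that matrix from the coefficients printed in the summary (rounded to six decimals, so small differences from the reported roots are possible); the object names are introduced only for this example.

```r
# Lag-1 coefficients copied from the summary output.
# Rows: equations; columns: lagged variables, in the same order.
var_names <- c("IBDRY_Stock", "NonStore_Retailing",
               "US_Unemployment", "US_Consumer_Confidence")
A1 <- matrix(c(
  -0.264127, -0.120433,  0.100186,  0.276492,
  -0.022922, -0.410938,  0.080025, -0.046327,
  -0.577862,  0.270087,  0.059154, -0.230387,
   0.047726,  0.233262, -0.060748, -0.057425
), nrow = 4, byrow = TRUE,
dimnames = list(var_names, paste0(var_names, ".l1")))

# Moduli of the eigenvalues; these can be compared with the
# "Roots of the characteristic polynomial" line in the summary.
moduli <- Mod(eigen(A1)$values)
sort(moduli, decreasing = TRUE)
all(moduli < 1)   # stability condition
```

The constant and the seasonal dummies do not enter this matrix, since deterministic terms do not affect stability.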
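Do the selected explanatory variables have an influence on
the stock price? Partly. Lagged Consumer Confidence is significant at
the 5% level (p = 0.0406) and lagged Unemployment only at the 10%
level (p = 0.0873), while lagged Non-Store Retailing is not
significant (p = 0.6617).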
Is there an instantaneous causality between the stock price
and the selected explanatory variables? Estimate a Granger Causality
Test to either reject or fail to reject the hypothesis of instantaneous
causality.
granger_IBDRY<-causality(VAR_model1,cause="IBDRY_Stock")
granger_IBDRY
## $Granger
##
## Granger causality H0: IBDRY_Stock do not Granger-cause
## NonStore_Retailing US_Unemployment US_Consumer_Confidence
##
## data: VAR object VAR_model1
## F-Test = 2.7848, df1 = 3, df2 = 344, p-value = 0.04081
##
##
## $Instant
##
## H0: No instantaneous causality between: IBDRY_Stock and
## NonStore_Retailing US_Unemployment US_Consumer_Confidence
##
## data: VAR object VAR_model1
## Chi-squared = 2.3807, df = 3, p-value = 0.4972
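With a p-value of 0.04081 (less than 0.05) in the Granger test, we
reject H0: IBDRY_Stock Granger-causes at least one of the other
variables. Note that, as specified (cause="IBDRY_Stock"), this test
evaluates whether the stock price helps predict the explanatory
variables, not the reverse. For the instantaneous causality test, the
p-value of 0.4972 is greater than 0.05, so we fail to reject H0 of no
instantaneous causality between IBDRY_Stock and the selected
explanatory variables.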
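The idea behind a Granger causality test can be sketched with two nested regressions and an F-test. The example below is a simplified bivariate, single-equation version on simulated data (the series x and y and their coefficients are assumptions); causality() from the vars package tests the restriction jointly across the equations of the fitted VAR, so this is an illustration of the logic rather than a reproduction of that function.

```r
# Simulated data in which x leads y by one period (by construction).
set.seed(7)
n <- 200
x <- rnorm(n)
y <- numeric(n)
for (i in 2:n) y[i] <- 0.3 * y[i - 1] + 0.8 * x[i - 1] + rnorm(1)

d <- data.frame(y = y[-1], y_lag = y[-n], x_lag = x[-n])

restricted   <- lm(y ~ y_lag, data = d)           # H0: lagged x adds nothing
unrestricted <- lm(y ~ y_lag + x_lag, data = d)   # lagged x included

# F-test on the extra regressor; a small p-value rejects
# "x does not Granger-cause y".
granger_f <- anova(restricted, unrestricted)
p_value   <- granger_f$`Pr(>F)`[2]
p_value
```

To test whether an explanatory variable helps predict the stock price, the same comparison would be run with the stock series as the dependent variable.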
d. Forecasting
Based on the selected VAR_Model, forecast the stock price for
the next 5 months. Display the forecast in a time series
plot.
forecast <- predict(VAR_model1, n.ahead = 5, ci = 0.95)
# Revert the log and diff transformations
forecast$fcst$IBDRY_Stock <- exp(forecast$fcst$IBDRY_Stock) # Reverting log
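# Reverting differences for forecasts.
# Caution: after exp(), each value is a growth factor (P_t / P_{t-1}), so the exact inverse would multiply
# the previous level by it; the additions below instead raise the level by roughly 1 per step.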
forecast$fcst$IBDRY_Stock[1]<-forecast$fcst$IBDRY_Stock[1] + tail(VARts$IBDRY_Adj_Close, 1)
forecast$fcst$IBDRY_Stock[2]<-forecast$fcst$IBDRY_Stock[2] + forecast$fcst$IBDRY_Stock[1]
forecast$fcst$IBDRY_Stock[3]<-forecast$fcst$IBDRY_Stock[3] + forecast$fcst$IBDRY_Stock[2]
forecast$fcst$IBDRY_Stock[4]<-forecast$fcst$IBDRY_Stock[4] + forecast$fcst$IBDRY_Stock[3]
forecast$fcst$IBDRY_Stock[5]<-forecast$fcst$IBDRY_Stock[5] + forecast$fcst$IBDRY_Stock[4]
fanchart(forecast,names="IBDRY_Stock",main="IBDRY Stock Price",xlab="Time Period",ylab="Stock Price")

tsplot(forecast)


We transformed the forecast for “IBDRY_Stock” back toward its
original scale, but only for the point forecast (the fcst column). The
lower and upper bounds were exponentiated but not converted to price
levels, so in the output below they remain growth factors close to 1
rather than prices.
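An exact inverse of diff(log(x)) cumulates the forecast log differences and multiplies the last observed level by their exponential. The sketch below shows this with hypothetical inputs (last_price, fc_log_diff and half_width are assumed values for illustration, not output from VAR_model1).

```r
# Hypothetical inputs, for illustration only (not the model's output).
last_price  <- 44.70                                   # last observed price level
fc_log_diff <- c(0.010, -0.005, 0.008, 0.002, 0.004)   # forecast log differences
half_width  <- c(0.12, 0.13, 0.13, 0.13, 0.13)         # CI half-widths, log-difference scale

# Inverse of diff(log(x)): cumulate the log differences, then scale the last level.
fc_price <- last_price * exp(cumsum(fc_log_diff))
round(fc_price, 2)

# The one-step-ahead bounds map directly to the price scale. Bounds for later
# steps would need the variance of the cumulated forecasts, which this sketch
# does not compute.
lower_1 <- last_price * exp(fc_log_diff[1] - half_width[1])
upper_1 <- last_price * exp(fc_log_diff[1] + half_width[1])
c(lower = lower_1, point = fc_price[1], upper = upper_1)
```

With this approach each forecast is the previous level times a growth factor, so the path can move down as well as up depending on the sign of the forecast log differences.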
#Forecast
forecast$fcst$IBDRY_Stock
## fcst lower upper CI
## [1,] 45.72577 0.8825874 1.123476 1.128244
## [2,] 46.73091 0.8839539 1.142930 1.137090
## [3,] 47.73630 0.8835755 1.144010 1.137871
## [4,] 48.77215 0.9102341 1.178781 1.137994
## [5,] 49.76233 0.8700857 1.126850 1.138026
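When generating a forecast with this model, we can obtain an estimate
of what the stock price for the next 5 periods could be. The point
forecasts are listed below. Because the differences were reverted by
addition and the bounds were not converted to price levels, these
values are rough approximations and no 95% interval in USD is
reported: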
Period 1: Price close to $45.73
Period 2: Price close to $46.73
Period 3: Price close to $47.74
Period 4: Price close to $48.77
Period 5: Price close to $49.76
References:
Iberdrola SA (2023). Iberdrola. https://www.iberdrola.com/conocenos/nuestra-empresa
