CD2010 – Introduction to Econometrics

August – December 2023

Exam Part II

Loading Libraries

# loading required libraries 
library(xts)
## Warning: package 'xts' was built under R version 4.2.3
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 4.2.3
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
library(dplyr)
## Warning: package 'dplyr' was built under R version 4.2.3
## 
## ######################### Warning from 'xts' package ##########################
## #                                                                             #
## # The dplyr lag() function breaks how base R's lag() function is supposed to  #
## # work, which breaks lag(my_xts). Calls to lag(my_xts) that you type or       #
## # source() into this session won't work correctly.                            #
## #                                                                             #
## # Use stats::lag() to make sure you're not using dplyr::lag(), or you can add #
## # conflictRules('dplyr', exclude = 'lag') to your .Rprofile to stop           #
## # dplyr from breaking base R's lag() function.                                #
## #                                                                             #
## # Code in packages is not affected. It's protected by R's namespace mechanism #
## # Set `options(xts.warn_dplyr_breaks_lag = FALSE)` to suppress this warning.  #
## #                                                                             #
## ###############################################################################
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:xts':
## 
##     first, last
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(zoo)
library(tseries)
## Warning: package 'tseries' was built under R version 4.2.3
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
library(stats)
library(forecast)
## Warning: package 'forecast' was built under R version 4.2.3
library(astsa)
## 
## Attaching package: 'astsa'
## The following object is masked from 'package:forecast':
## 
##     gas
library(corrplot)
## Warning: package 'corrplot' was built under R version 4.2.3
## corrplot 0.92 loaded
library(AER)
## Warning: package 'AER' was built under R version 4.2.3
## Loading required package: car
## Warning: package 'car' was built under R version 4.2.3
## Loading required package: carData
## Warning: package 'carData' was built under R version 4.2.3
## 
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
## 
##     recode
## Loading required package: lmtest
## Warning: package 'lmtest' was built under R version 4.2.3
## Loading required package: sandwich
## Warning: package 'sandwich' was built under R version 4.2.3
## Loading required package: survival
library(vars)
## Warning: package 'vars' was built under R version 4.2.3
## Loading required package: MASS
## 
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
## 
##     select
## Loading required package: strucchange
## Warning: package 'strucchange' was built under R version 4.2.3
## Loading required package: urca
## Warning: package 'urca' was built under R version 4.2.3
library(dynlm)
## Warning: package 'dynlm' was built under R version 4.2.3
library(vars)
library(TSstudio)
## Warning: package 'TSstudio' was built under R version 4.2.3
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2
## ──
## ✔ ggplot2 3.4.3     ✔ purrr   1.0.1
## ✔ tibble  3.2.1     ✔ stringr 1.5.0
## ✔ tidyr   1.3.0     ✔ forcats 1.0.0
## ✔ readr   2.1.4
## Warning: package 'ggplot2' was built under R version 4.2.3
## Warning: package 'tibble' was built under R version 4.2.3
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ stringr::boundary() masks strucchange::boundary()
## ✖ dplyr::filter()     masks stats::filter()
## ✖ dplyr::first()      masks xts::first()
## ✖ dplyr::lag()        masks stats::lag()
## ✖ dplyr::last()       masks xts::last()
## ✖ car::recode()       masks dplyr::recode()
## ✖ MASS::select()      masks dplyr::select()
## ✖ purrr::some()       masks car::some()
library(sarima)
## Warning: package 'sarima' was built under R version 4.2.3
## Loading required package: stats4
## 
## Attaching package: 'sarima'
## 
## The following object is masked from 'package:astsa':
## 
##     sarima
## 
## The following object is masked from 'package:stats':
## 
##     spectrum
library(dygraphs)
## Warning: package 'dygraphs' was built under R version 4.2.3

A) VISUALIZATION (25 pts)

Importing the Dataset

library(readr)
df <- read_csv("D:/JahirBotello/Documents/Carrera LIT/5to Semestre/1er Periodo/Economics/entretainment_stocks.csv")
## Rows: 192 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Date
## dbl (6): Disney_Adj_Close, Netflix_Adj_Close, Nintendo_Adj_Close, WBD_Adj_Cl...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
View(df)

Standarize Variable Names

## Convert all column names to lowercase
names(df) <- tolower(names(df))

## Replace spaces with underscores
names(df) <- gsub(" ", "_", names(df))

Defining Time Series Format

df$date<-as.Date(df$date,"%m/%d/%Y") 
summary(df$disney_adj_close)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   14.44   31.55   85.92   78.71  108.26  189.04

Converting to time series format (Period = Weekly)

disney<-ts(df$disney_adj_close,start=c(2007,1),end=c(2022,24),frequency=24)

Plot the stock price / variable using a time series format.

par(mfrow = c(2,3))
plot(df$date, df$disney_adj_close, type = "l", col = "blue", lwd = 2, xlab = "Date", ylab = "Adj Close", main = "Disney Adj Close")

## Decompose the time series data in observed, trend, seasonality, and random.

decomp_disney<-decompose(disney)
plot(decomp_disney)

Briefly comment on the following components:

# i. Do the time series data show a trend?

## The decomposed data of disney adj close time series shows a positive trend up until 2015, where all the values have a massive drop. Right after the drop the trend continued with a positive trend up until recent times.



# ii. Do the time series data show seasonality? How is the change of the seasonal component over time?

## With the ADF test we can not only determine the stationarity but we can also determine that there is indeed seasonality in the time series.

## The seasonality that the time series shows starts with a huge increase in the stock price in the starting months of the year followed by a dramatic decrease in stock value.

adf.test(df$disney_adj_close)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  df$disney_adj_close
## Dickey-Fuller = -2.363, Lag order = 5, p-value = 0.4243
## alternative hypothesis: stationary

B) ESTIMATION (25 pts)

Detect if the time series data is stationary.

# By usind the previous ADF test we did to determine the seasonality of the time series we can see that we do not reject the null hypothesis so it indicates that there is indeed stationarity in the time series.
adf.test(df$disney_adj_close)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  df$disney_adj_close
## Dickey-Fuller = -2.363, Lag order = 5, p-value = 0.4243
## alternative hypothesis: stationary

Detect if the time series data shows serial autocorrelation.

acf(df$disney_adj_close, main="Serial Autocorrelation")

## The graph shows that there is indeed autocorrelation in the time series.

Estimate 2 different time series regression models. You might want to consider ARMA (p,q) and / or ARIMA (p,d,q).

Model 1 ARMA

summary(disney_ARMA<-arma(log(df$disney_adj_close),order=c(1,1)))
## 
## Call:
## arma(x = log(df$disney_adj_close), order = c(1, 1))
## 
## Model:
## ARMA(1,1)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.2224570 -0.0384418  0.0002564  0.0444826  0.2104628 
## 
## Coefficient(s):
##            Estimate  Std. Error  t value Pr(>|t|)    
## ar1         0.99033     0.00831  119.176   <2e-16 ***
## ma1         0.05651     0.07998    0.706     0.48    
## intercept   0.04591     0.03502    1.311     0.19    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Fit:
## sigma^2 estimated as 0.005385,  Conditional Sum-of-Squares = 1.02,  AIC = -452.16
plot(disney_ARMA)

disney_estimated_stock_price<-exp(disney_ARMA$fitted.values)
plot(disney_estimated_stock_price)

disney_ARMA_residuals<-disney_ARMA$residuals
Box.test(disney_ARMA_residuals,lag=5,type="Ljung-Box")
## 
##  Box-Ljung test
## 
## data:  disney_ARMA_residuals
## X-squared = 4.0425, df = 5, p-value = 0.5433
## Due to the p-value being > 0.05 we do not reject the Null Hypothesis so the model indicates no residual serial auto correlation.

2nd Model ARIMA

plot(df$date,df$disney_adj_close, type="l",col="blue", lwd=2, xlab ="Date",ylab ="Stock Price", main = "Disney Adj Stock Price")

plot(df$date,log(df$disney_adj_close), type="l",col="blue", lwd=2, xlab ="Date",ylab ="Stock Price", main = "Disney Adj Stock Price")

adf.test(log(df$disney_adj_close))
## 
##  Augmented Dickey-Fuller Test
## 
## data:  log(df$disney_adj_close)
## Dickey-Fuller = -1.48, Lag order = 5, p-value = 0.7937
## alternative hypothesis: stationary
plot(diff(log(df$disney_adj_close)),type="l",ylab="Diff(Log(Disney Adj Close))",main = "Difference in Disney Adj Stock Price")

adf.test(diff(log(df$disney_adj_close)))
## Warning in adf.test(diff(log(df$disney_adj_close))): p-value smaller than
## printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  diff(log(df$disney_adj_close))
## Dickey-Fuller = -5.6716, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
Disney_ARIMA <- Arima(df$disney_adj_close,order=c(1,1,1))
summary(Disney_ARIMA)
## Series: df$disney_adj_close 
## ARIMA(1,1,1) 
## 
## Coefficients:
##           ar1     ma1
##       -0.2016  0.1505
## s.e.   0.6994  0.7028
## 
## sigma^2 = 52.72:  log likelihood = -648.66
## AIC=1303.33   AICc=1303.46   BIC=1313.09
## 
## Training set error measures:
##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 0.3165121 7.203687 4.392935 0.3129258 5.493583 0.9925129
##                       ACF1
## Training set -0.0005660867
print(Disney_ARIMA)
## Series: df$disney_adj_close 
## ARIMA(1,1,1) 
## 
## Coefficients:
##           ar1     ma1
##       -0.2016  0.1505
## s.e.   0.6994  0.7028
## 
## sigma^2 = 52.72:  log likelihood = -648.66
## AIC=1303.33   AICc=1303.46   BIC=1313.09
plot(Disney_ARIMA$residuals,main="ARIMA(1,1,1) - Disney Adj Stock Price")

acf(Disney_ARIMA$residuals,main="ACF - ARIMA (1,1,1)")    

Box.test(Disney_ARIMA$residuals,lag=1,type="Ljung-Box") # No serial autocorrelation due to p-value > 0.05
## 
##  Box-Ljung test
## 
## data:  Disney_ARIMA$residuals
## X-squared = 6.2494e-05, df = 1, p-value = 0.9937
adf.test(Disney_ARIMA$residuals) # Stationary due to p-value < 0.05
## Warning in adf.test(Disney_ARIMA$residuals): p-value smaller than printed
## p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  Disney_ARIMA$residuals
## Dickey-Fuller = -4.8989, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary

3rd Model ARIMA

Disney_ARIMA2 <- Arima(df$disney_adj_close,order=c(0,1,2))
summary(Disney_ARIMA2)
## Series: df$disney_adj_close 
## ARIMA(0,1,2) 
## 
## Coefficients:
##           ma1     ma2
##       -0.0522  0.0379
## s.e.   0.0731  0.0740
## 
## sigma^2 = 52.66:  log likelihood = -648.57
## AIC=1303.14   AICc=1303.27   BIC=1312.9
## 
## Training set error measures:
##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 0.3047561 7.200114 4.391059 0.3010553 5.497399 0.9920892
##                       ACF1
## Training set -0.0005904886
print(Disney_ARIMA2)
## Series: df$disney_adj_close 
## ARIMA(0,1,2) 
## 
## Coefficients:
##           ma1     ma2
##       -0.0522  0.0379
## s.e.   0.0731  0.0740
## 
## sigma^2 = 52.66:  log likelihood = -648.57
## AIC=1303.14   AICc=1303.27   BIC=1312.9
plot(Disney_ARIMA2$residuals,main="ARIMA(0,1,2) - Disney Adj Stock Price")

acf(Disney_ARIMA2$residuals,main="ACF - ARIMA (0,1,2)")    

Box.test(Disney_ARIMA2$residuals,lag=1,type="Ljung-Box") # No serial autocorrelation due to p-value > 0.05
## 
##  Box-Ljung test
## 
## data:  Disney_ARIMA2$residuals
## X-squared = 6.7997e-05, df = 1, p-value = 0.9934
adf.test(Disney_ARIMA2$residuals) # Stationary due to p-value < 0.05
## Warning in adf.test(Disney_ARIMA2$residuals): p-value smaller than printed
## p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  Disney_ARIMA2$residuals
## Dickey-Fuller = -4.9743, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary

C) EVALUATION (25 pts)

Based on diagnostic tests, compare the 3 estimated time series regression models, and select the results that you consider might generate the best forecast.

## De los 3 modelos que se evaluaron los 2 mejores tuvieron resultados muy similares siendo el modelo ARIMA (1,1,1) y el modelo ARIMA (0.1.2), sin embargo el mejor modelo se puede decir que es el 2 al tener valores un poco mas bajos que el modelo ARIMA(1,1,1)

## Para determinar cual fue el mejor modelo me base en la siguiente informacion 
## - El AIC statistic y el BIC statistic son ligeramente mejores en el modelo ARIMA (0,1,2)
## - Ademas del los estadisticos anteriores tambien me fije en el RMSE y el ACF1 en los cuales dieron resultados ligeramente mejores en el modelo ARIMA (0,1,2)
## - Por ultimo ambos muestran valores sin correlacion serial y datos residuales estacionarios.

D) FORECAST (25 pts)

By using the selected model, make a forecast for the next 5 periods. In doing so, include a time series plot showing your forecast.

Forecast_Disney <- forecast(Disney_ARIMA2, h = 5) # Forecasting the next 5 periods
plot(Forecast_Disney, main= "Forecast for next 5 periods", xlab= "Time",
ylab="Stock price")