The data for the following analysis were retrieved from Yahoo Finance (by scraping) and consist of the share price of McDonald's.
#install.packages('quantmod') # library used to download (scrape) the data
#install.packages('tseries')
#install.packages('timeSeries')
#install.packages('forecast')
#install.packages('xts')
#install.packages('ggplot2')
library(quantmod)
## Loading required package: xts
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Loading required package: TTR
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
library(tseries)
library(timeSeries)
## Loading required package: timeDate
##
## Attaching package: 'timeSeries'
## The following object is masked from 'package:zoo':
##
## time<-
library(forecast)
library(xts)
library(ggplot2)
MCD <- getSymbols('MCD', src='yahoo', from = as.Date("2020-06-20"),to=as.Date("2021-07-16"), auto.assign = FALSE)
Plotting the series (last six months):
chartSeries(MCD, name="MCD", subset="last 6 months", theme=chartTheme("white"))
Data from Yahoo Finance:
data_MCD <- data.frame(MCD, tiempo = as.Date(rownames(data.frame(MCD))))
head(data_MCD)
## MCD.Open MCD.High MCD.Low MCD.Close MCD.Volume MCD.Adjusted
## 2020-06-22 186.00 187.77 184.89 187.46 3223200 175.2069
## 2020-06-23 189.55 189.73 186.32 186.62 2951900 174.4218
## 2020-06-24 184.96 185.71 181.34 184.29 4146600 172.2441
## 2020-06-25 183.51 184.08 180.33 182.76 3145200 170.8141
## 2020-06-26 182.37 182.81 178.88 179.74 5107400 167.9915
## 2020-06-29 180.57 182.83 179.17 182.80 2622200 170.8515
## tiempo
## 2020-06-22 2020-06-22
## 2020-06-23 2020-06-23
## 2020-06-24 2020-06-24
## 2020-06-25 2020-06-25
## 2020-06-26 2020-06-26
## 2020-06-29 2020-06-29
Extracting the closing-price series:
base1 <- data.frame(tiempo = data_MCD$tiempo, MCD = data_MCD$MCD.Close) # explicit columns, no attach() needed
base1 <- na.omit(base1) # drop rows with missing values (NA)
#base1
head(base1, n = 10)
## tiempo MCD
## 1 2020-06-22 187.46
## 2 2020-06-23 186.62
## 3 2020-06-24 184.29
## 4 2020-06-25 182.76
## 5 2020-06-26 179.74
## 6 2020-06-29 182.80
## 7 2020-06-30 184.47
## 8 2020-07-01 184.66
## 9 2020-07-02 183.52
## 10 2020-07-06 188.50
Plotting the closing price with a smoothed trend:
ggplot(base1, aes(x = tiempo, y = MCD)) + geom_line() + geom_smooth(se = FALSE) + labs(title = "McDonald's share price", x = "Date", y = "Price per share")
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
MCD_ma = ts(na.omit(base1$MCD), frequency=30)
decomp = stl(MCD_ma, s.window="periodic")
deseasonal_base1 <- seasadj(decomp)
plot(decomp)
adf.test(MCD_ma, alternative = "stationary")
##
## Augmented Dickey-Fuller Test
##
## data: MCD_ma
## Dickey-Fuller = -2.3564, Lag order = 6, p-value = 0.4259
## alternative hypothesis: stationary
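Hypothesis test: the null hypothesis is that the series has a unit root (is non-stationary) and the alternative is that it is stationary. With a p-value of 0.4259, the null cannot be rejected at the 5% level, so the series is treated as non-stationary.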
pp.test(MCD_ma, alternative = "stationary")
##
## Phillips-Perron Unit Root Test
##
## data: MCD_ma
## Dickey-Fuller Z(alpha) = -8.802, Truncation lag parameter = 5, p-value
## = 0.6165
## alternative hypothesis: stationary
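Hypothesis test: the Phillips-Perron test also takes a unit root as the null hypothesis and stationarity as the alternative. With a p-value of 0.6165, the null cannot be rejected at the 5% level.
The autocorrelation function (ACF) describes how strongly an observation is related to the ones that follow it; the partial autocorrelation function (PACF) does the same after removing the effect of the intermediate lags.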
Acf(MCD_ma, main='')
Pacf(MCD_ma, main='')
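To make the ACF concrete, here is a minimal sketch that computes the lag-1 autocorrelation by hand and compares it with `stats::acf()`, the base R function that `forecast::Acf()` builds on. The data are a simulated random walk used only for illustration, not the MCD series, and the variable names are hypothetical.

```r
# Illustrative data only: a simulated random walk, not the MCD series.
set.seed(1)
x <- cumsum(rnorm(200))
n <- length(x)
m <- mean(x)

# Lag-1 autocorrelation: lagged cross-products over the total sum of squares.
r1_manual <- sum((x[-1] - m) * (x[-n] - m)) / sum((x - m)^2)

# Same quantity from stats::acf(); element 1 is lag 0, element 2 is lag 1.
r1_acf <- acf(x, lag.max = 1, plot = FALSE)$acf[2]

print(c(manual = r1_manual, acf = r1_acf))
```

A value close to 1 at lag 1, decaying slowly over later lags, is the typical signature of a non-stationary series.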
To fit an ARIMA model, the time series must be stationary. To achieve stationarity, we difference it.
MCD_d1 = diff(deseasonal_base1, differences = 1)
plot(MCD_d1)
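As a rough illustration of why differencing helps, the sketch below builds a random walk from simulated white noise and shows that a first difference recovers the original shocks. The data are simulated and the names are hypothetical; this is not the MCD series.

```r
# Illustrative data only: simulated shocks, not the MCD series.
set.seed(42)
e <- rnorm(300)                # white-noise shocks
x <- cumsum(e)                 # random walk: x[t] = x[t-1] + e[t]

# First difference: d[t] = x[t+1] - x[t], which equals the shock e[t+1].
d <- diff(x, differences = 1)
stopifnot(isTRUE(all.equal(d, e[-1])))

# The level wanders, so its variance is far larger than that of the differences.
print(c(var_level = var(x), var_diff = var(d)))
```

The differenced series is one observation shorter than the original, which is why the comparison drops the first shock.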
To check that the differenced series is indeed stationary, we run the augmented Dickey-Fuller test again.
adf.test(MCD_d1, alternative = "stationary")
## Warning in adf.test(MCD_d1, alternative = "stationary"): p-value smaller than
## printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: MCD_d1
## Dickey-Fuller = -6.6258, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary
modeloarima<-auto.arima(MCD_ma, seasonal=FALSE)
modeloarima
## Series: MCD_ma
## ARIMA(0,1,1) with drift
##
## Coefficients:
## ma1 drift
## -0.1113 0.1851
## s.e. 0.0591 0.1192
##
## sigma^2 = 4.851: log likelihood = -590.88
## AIC=1187.77 AICc=1187.86 BIC=1198.54
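Written out, the fitted ARIMA(0,1,1) model with drift corresponds to the equation below. This assumes the sign convention used by R's `arima()`, in which moving-average terms enter with a plus sign; the symbols $c$, $\theta_1$ and $\varepsilon_t$ are notation introduced here, and the numbers are the rounded estimates from the output above.

```latex
y_t - y_{t-1} = c + \varepsilon_t + \theta_1 \varepsilon_{t-1},
\qquad c \approx 0.1851,\quad \theta_1 \approx -0.1113,\quad
\operatorname{Var}(\varepsilon_t) = \sigma^2 \approx 4.851
```

Under this reading, the daily price change is a constant drift of about 0.19 plus a shock and a small negative carry-over from the previous day's shock.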
tsdisplay(residuals(modeloarima), lag.max=10, main='ARIMA(0,1,1) Model Residuals')
prediccion <- forecast(modeloarima, h=30)
plot(prediccion)
tail(prediccion$mean,30)
## Time Series:
## Start = c(9, 30)
## End = c(10, 29)
## Frequency = 30
## [1] 237.1034 237.2885 237.4735 237.6586 237.8437 238.0288 238.2139 238.3989
## [9] 238.5840 238.7691 238.9542 239.1393 239.3243 239.5094 239.6945 239.8796
## [17] 240.0647 240.2498 240.4348 240.6199 240.8050 240.9901 241.1752 241.3602
## [25] 241.5453 241.7304 241.9155 242.1006 242.2856 242.4707
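For an ARIMA(0,1,1) model with drift, the point forecasts after the first step simply rise by the drift at every step, which is why the values above form a straight line. The sketch below approximates that path from the rounded numbers printed in the output; the variable names are hypothetical, and small discrepancies in the last decimals come from the rounding of the drift.

```r
# Values copied from the printed output (both are rounded).
first_forecast <- 237.1034   # first value of prediccion$mean
drift <- 0.1851              # drift coefficient reported by auto.arima
h <- 30                      # forecast horizon used in forecast()

# Each further step adds one drift increment to the previous forecast.
approx_path <- first_forecast + drift * (0:(h - 1))

print(round(head(approx_path, 3), 4))
print(round(tail(approx_path, 1), 4))
```

The last value of this approximation agrees with the final forecast printed above to within about 0.001.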