Kali ini saya akan melakukan pemodelan pada data time series saham PT Bank Central Asia Tbk (BBCA.JK). adapun data yang akan saya gunakan merupakan data yang saya peroleh dari Website Yahoo Finance. Untuk mengakses data ini, saya akan menggunakan library R yaitu quantmod. Periode data yang akan saya gunakan untuk melakukan pemodelan dimulai dari tanggal 1 Januari 2016 hingga 31 Desember 2020.
library(quantmod)## Loading required package: xts
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Loading required package: TTR
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
library(dplyr) # data wrangling##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:xts':
##
## first, last
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(lubridate) # date manipulation##
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
##
## date, intersect, setdiff, union
library(forecast) # time series library
library(TTR) # for Simple moving average function
library(MLmetrics) # calculate error##
## Attaching package: 'MLmetrics'
## The following object is masked from 'package:base':
##
## Recall
library(tseries) # adf.test
library(fpp) # usconsumtion## Loading required package: fma
## Loading required package: expsmooth
## Loading required package: lmtest
library(TSstudio) # mempercantik visualisasi timeseries
library(ggplot2)
library(tidyr)
library(padr) # complete data frame
library(zoo) # Missing value imputation
library(data.table)##
## Attaching package: 'data.table'
## The following objects are masked from 'package:lubridate':
##
## hour, isoweek, mday, minute, month, quarter, second, wday, week,
## yday, year
## The following objects are masked from 'package:dplyr':
##
## between, first, last
## The following objects are masked from 'package:xts':
##
## first, last
bbca <- getSymbols("BBCA.JK", auto.assign = FALSE, from=as.Date("2016-01-01"), to =as.Date("2020-12-31"))
bbca <- as.data.frame(bbca)
bbca <- setDT(bbca, keep.rownames = TRUE)[]
head(bbca)glimpse(bbca)## Rows: 1,261
## Columns: 7
## $ rn <chr> "2016-01-04", "2016-01-05", "2016-01-06", "2016-01-07…
## $ BBCA.JK.Open <dbl> 2635, 2640, 2665, 2620, 2590, 2590, 2580, 2645, 2600,…
## $ BBCA.JK.High <dbl> 2685, 2710, 2700, 2630, 2620, 2595, 2645, 2660, 2630,…
## $ BBCA.JK.Low <dbl> 2630, 2640, 2640, 2600, 2590, 2550, 2580, 2630, 2595,…
## $ BBCA.JK.Close <dbl> 2645, 2675, 2640, 2600, 2600, 2550, 2620, 2635, 2605,…
## $ BBCA.JK.Volume <dbl> 77479500, 96609000, 105370000, 146746500, 125820000, …
## $ BBCA.JK.Adjusted <dbl> 2393.224, 2420.369, 2388.700, 2352.508, 2352.508, 230…
Pada data diatas dapat dilihat bahwa tipe data tanggal pada kolom rn perlu kita ubah terlebih dahulu menjadi tipe data yang tepat yakni Date
bbca <- bbca %>%
mutate(rn = ymd(rn))Pada kasus kali ini saya akan melakukan forecasting pada pada closing price, hal ini dikarenakan closing price merupakan hal yang paling penting untuk menimbang untuk menanam saham bagi investor. Alasan lain karena harga closing price merupakan acuan harga opening price, pada hari kerja selanjutnya.
close <- bbca %>%
select(c(rn,BBCA.JK.Close))head(close)Untuk kasus time series, disyaratkan tidak ada data yang kosong agar tahapan forecasting dapat dilakukan. Dan data yang digunakan haruslah data yang terurut berdasarkan waktunya.
# padding data
c_pad <- pad(
close,
interval = NULL,
start_val = ymd("2016-01-01"),
end_val = ymd("2020-12-31"),
by = NULL,
group = NULL,
break_above = 1
) ## pad applied on the interval: day
# mengisi data NA
close_pad<- c_pad %>%
mutate(BBCA.JK.Close = na.fill(BBCA.JK.Close, fill = "extend"))head(close_pad)close_ts <- ts(data = close_pad$BBCA.JK.Close,
start = range(close_pad$rn)[[1]],
frequency = 7*4*3) # yearly seasonality
# melihat plot time series
close_ts %>%
decompose()%>%
autoplot()Karena data trend masih berpola yang mengindikasikan masih ada trend yang belum tertangkap, maka kita buat multiseasonal time-series
# monthly and yearly multiseasonal time series
close_msts <- close_pad$BBCA.JK.Close %>%
msts(seasonal.periods = c(7*4,7*4*12))
# plot decompose msts
close_msts %>%
mstl() %>%
autoplot()Dari plot decompose msts, sudah tidak lagi terlihat data pada trend
test <- tail(close_msts, 7*4*6)
train <- head(close_msts, length(close_msts)-length(test))# ets
close_ets <- stlm(train, method = "ets",allow.multiplicative.trend = T)
# SARIMA
close_arima <- stlm(train, method = "arima",allow.multiplicative.trend = T)
# Holt-Winters
model_holtwin <- HoltWinters(x = train, seasonal = "multiplicative")# forecasting model ets
close_ets_f <- forecast(close_ets, h = 7*4*6, allow.multiplicative.trend = T)
# forecasting model arima
close_arima_f <- forecast(close_arima, h = 7*4*6, allow.multiplicative.trend = T)
# forecasting model holtwinters
close_holtwin_f <- forecast(model_holtwin, h = 7*4*6)train %>%
autoplot()+
autolayer(test, series = "test")+
autolayer(close_ets$fitted, series = "fitted_ets")+
autolayer(close_ets_f$mean, series = "forecast_ets")train %>%
autoplot()+
autolayer(test, series = "test")+
autolayer(close_arima$fitted, series = "fitted_arima")+
autolayer(close_arima_f$mean, series = "forecast_arima")train %>%
autoplot()+
autolayer(test, series = "test")+
autolayer(close_ets$fitted, series = "fitted_hw")+
autolayer(close_holtwin_f$mean, series = "forecast_hw")MAE(close_ets_f$mean, test) ## [1] 288.3213
MAE(close_arima_f$mean, test)## [1] 287.9982
MAE(close_holtwin_f$mean, test)## [1] 305.3012
Model terbaik dengan nilai MAE terkecil adalah model ARIMA
Box.test(close_arima_f$residuals, type = "Ljung-Box")##
## Box-Ljung test
##
## data: close_arima_f$residuals
## X-squared = 0.013357, df = 1, p-value = 0.908
P-value > 0.05 hal ini mengindikasikan bahwa residual autokorelasi
hist(close_arima_f$residuals)shapiro.test(close_arima_f$residuals)##
## Shapiro-Wilk normality test
##
## data: close_arima_f$residuals
## W = 0.88312, p-value < 2.2e-16
P-value < 0.05 artinya residual tidak menyebar normal.
Model terbaik dengan MAE terkecil adalah model arima dengan MAE 287.9982, meskipun pada uji asumsi tidak memenuhi, hal mungkin saja karena adanya event yang unpredictable.