Autoregressive integrated moving average

Forecasting has always been one of the hardest topics in econometrics, and time series forecasting especially so, because we have to make statements about an outcome we cannot know in advance while the context and the driving factors keep changing. This article covers forecasting in general and the ARIMA method in particular, applied to forecasting stock returns in R.

I. What is time series forecasting?

A forecasting method that uses the historical values of a variable, such as past prices and volumes, inflation, or past returns, to forecast its current or future value, or the change in that value, is called time series forecasting. Forecasting methods fall into two main groups: qualitative and quantitative. Time series models are quantitative, because the model's output is a numeric value rather than a class label. For that reason they are widely used to study economic series such as GDP, inflation, and growth, as well as market prices. In finance, time series forecasting is commonly used to estimate stock prices and to measure how the current value of a stock depends on its past values. Some basic model families that are often applied are AR, MA, seasonal regression, and distributed lag models:

Autoregressive Models (AR): The historical values of the forecast variable are often strongly correlated with its future values, so an AR model uses the variable's own past values as inputs to a regression for its future value. Put simply, an autoregressive model regresses a series on itself.
Moving Average Models (MA): When data are volatile, stock prices in particular, a moving average smooths the series while preserving its trend. The average can be computed with explicit weights on the historical values (weighted moving average) or with exponentially decaying weights (exponential moving average). Note that the MA term inside ARIMA is a related but different object: it models the current value as a linear combination of past white-noise errors, as in the ARIMA equation in Section II.
Seasonal Regression Models: A time series is usually described by four components: cycle, trend, seasonality, and random error. Seasonal regression models add the seasonal effect to the forecasting model.
Distributed Lag Models: The effect of a lagged variable usually dies out after some period. These models study how the effect is distributed across lags to find where it stops, and then build a forecasting model with the corresponding number of lags.
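
To make the first two items concrete, here is a minimal sketch in base R. The simulated series, the seed, and the coefficient 0.7 are assumptions chosen for illustration; they are not taken from the stock data used later in the article.

```r
# Sketch: AR(1) as a regression on the series' own lag, and a simple moving average.
set.seed(1)

# Simulate an AR(1) process y_t = 0.7 * y_{t-1} + e_t (hypothetical coefficient)
n <- 1000
y <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))

# "Autoregression" literally means regressing y_t on y_{t-1}
ar_fit <- lm(y[-1] ~ y[-n])
phi_hat <- unname(coef(ar_fit)[2])   # estimate of the AR coefficient

# A 3-period simple moving average: each value is the mean of the current
# and the two previous observations, so the first two values are NA
x <- c(1, 2, 3, 4, 5)
sma3 <- as.numeric(stats::filter(x, rep(1/3, 3), sides = 1))
```

With a long enough sample, `phi_hat` should land close to the coefficient used in the simulation, while `sma3` shows how smoothing replaces each point by a local average.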

II. What is Autoregressive Integrated Moving Average (ARIMA)?

ARIMA stands for Autoregressive Integrated Moving Average. It builds on the approach of Box and Jenkins, who observed that a non-stationary series can usually be turned into a stationary one by differencing it. The general ARIMA model is written as:

\[ \Delta Y_{t} = \phi_{1} \Delta Y_{t-1}+\phi_{2} \Delta Y_{t-2}+...+\phi_{p}\Delta Y_{t-p}+ \theta_{1}\epsilon_{t-1}+\theta_{2}\epsilon_{t-2}+...+\theta_{q}\epsilon_{t-q}+\epsilon_{t} \] where \(\Delta Y_{t-i}\) are the differenced values (written here for first differences, d = 1) and \(\epsilon_{t-i}\) are white-noise errors.

ARIMA combines three components:

AutoRegression (AR): Regresses a value of the time series on its own lagged values. The largest lag used (the autoregressive order) is p.
Integrated (I): The order of differencing needed to make the series stationary. For example, if the first difference \(\Delta Y_{t}\) is stationary, the order of integration is d = 1.
Moving Average (MA): The order of the moving average part, that is, the number of lagged error terms. It is denoted q.

An ARIMA(p,d,q) process is therefore autoregressive of order p, integrated of order d, and moving average of order q.
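
As a worked instance of the general form, consider ARIMA(1,1,1). The expansion below is a sketch that assumes a zero constant term and includes the current shock \(\epsilon_t\); substituting \(\Delta Y_t = Y_t - Y_{t-1}\) shows how a model in differences becomes a model in levels.

```latex
\begin{aligned}
\Delta Y_t &= \phi_1 \Delta Y_{t-1} + \theta_1 \epsilon_{t-1} + \epsilon_t \\
Y_t - Y_{t-1} &= \phi_1 (Y_{t-1} - Y_{t-2}) + \theta_1 \epsilon_{t-1} + \epsilon_t \\
Y_t &= (1 + \phi_1) Y_{t-1} - \phi_1 Y_{t-2} + \theta_1 \epsilon_{t-1} + \epsilon_t
\end{aligned}
```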

III. Steps to build an ARIMA model

Step 1: Check the stationarity of the time series.

The time series must be stationary. Here that means it has no upward or downward trend over time, and its mean and variance are assumed constant. In the Box-Jenkins view, this is what makes the series easier to forecast.

To check stationarity we use the Augmented Dickey-Fuller (ADF) unit root test. If the ADF p-value is below 0.05, we reject the null hypothesis of a unit root and treat the series as stationary. If the p-value is above 0.05, we cannot reject a unit root, so the series is treated as non-stationary. In essence, ADF tests the coefficient on the first lag when a variable is regressed on its own first lag: if that coefficient is less than 1 in absolute value the series is stationary, and if it equals 1 (or is larger) the series is not stationary.
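
The regression behind the test can be sketched in base R. This is a simplified Dickey-Fuller regression with a constant and no extra lags, not the full ADF procedure, and the simulated series and seed are assumptions for illustration. The resulting t statistic has to be compared with Dickey-Fuller critical values (roughly -2.86 at the 5% level for this specification in large samples), not with the usual t table.

```r
# Sketch: Dickey-Fuller regression  diff(y)_t = alpha + gamma * y_{t-1} + e_t,
# where gamma = rho - 1. A unit root corresponds to gamma = 0.
set.seed(42)
n <- 500
white_noise <- rnorm(n)          # stationary by construction
random_walk <- cumsum(rnorm(n))  # has a unit root by construction

df_regression <- function(y) {
  dy    <- diff(y)               # y_t - y_{t-1}
  y_lag <- y[-length(y)]         # y_{t-1}
  fit   <- lm(dy ~ y_lag)
  # Return the estimate of gamma and its t statistic
  coef(summary(fit))["y_lag", c("Estimate", "t value")]
}

wn_result <- df_regression(white_noise)
rw_result <- df_regression(random_walk)
```

For the white-noise series the estimate of gamma should sit near -1 with a very negative t statistic, while for the random walk it should sit near 0, which is the pattern the ADF test formalises.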

Step 2: Take differences.

The goal of this step is to turn a non-stationary series into a stationary one. The differenced series can then be checked for autocorrelation, stationarity, and normality.

Usually a first or second difference is enough. Once the appropriate order of differencing d is known, we move on to step 3.
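
A small sketch of first and second differences with base R's `diff()`. The input vector is a made-up example (the squares 1, 4, 9, ...), chosen because its second difference is constant.

```r
# Sketch: first and second differences of a toy series
y  <- c(1, 4, 9, 16, 25, 36)

d1 <- diff(y)                   # first difference:  y_t - y_{t-1}
d2 <- diff(y, differences = 2)  # second difference: difference of the first difference

print(d1)  # 3 5 7 9 11
print(d2)  # 2 2 2 2
```

Each round of differencing shortens the series by one observation, which is why the code later in the article drops the leading NA after taking log differences.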

Step 3: Determine the orders p and q

In this step we determine suitable orders for the AR and MA parts. To do so we use the autocorrelation function (ACF) plot and the partial autocorrelation function (PACF) plot. Autocorrelation and partial autocorrelation themselves will be covered in a separate article.

Determining the order of the AR model.

The PACF plot is used to determine the order p of the AR process. Look at the spikes of the PACF at consecutive lags. The largest lag whose spike falls outside the 95% confidence band (5% significance level) is taken as the order of the AR part.

Determining the order of the MA model.

The ACF plot is used to determine the order q of the MA process. The rule is the same as for AR: take the largest lag whose spike falls outside the 95% confidence band.
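
The rule above can also be applied numerically instead of by eye. The sketch below simulates an AR(2) series (the coefficients 0.5 and 0.3 and the seed are assumptions for illustration) and lists the lags whose ACF or PACF value lies outside the band that R's `acf()` and `pacf()` plots draw by default, `qnorm(0.975) / sqrt(n)`.

```r
# Sketch: find the lags whose ACF / PACF spikes are outside the 95% band
set.seed(123)
n <- 2000
x <- arima.sim(model = list(ar = c(0.5, 0.3)), n = n)

bound <- qnorm(0.975) / sqrt(n)   # approximate 95% band under white noise

pacf_vals <- as.numeric(pacf(x, lag.max = 20, plot = FALSE)$acf)
acf_vals  <- as.numeric(acf(x, lag.max = 20, plot = FALSE)$acf)[-1]  # drop lag 0

sig_pacf <- which(abs(pacf_vals) > bound)  # candidate lags for p
sig_acf  <- which(abs(acf_vals) > bound)   # candidate lags for q
```

For an AR(2) process the PACF should be clearly significant at lags 1 and 2. Isolated spikes at higher lags can still appear by chance, since about 1 lag in 20 crosses a 95% band even for pure noise, so the largest significant lag is a starting point rather than a final answer.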

Step 4: Build the forecasting model.

With p, d, and q fixed we have an ARIMA(p,d,q) process, which we fit on the training dataset. We then apply the fitted model to the test dataset and cross-check whether the forecasts agree with the actual values.
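
One simple way to do that cross-check is sketched below. The article does not prescribe these particular measures; RMSE and the share of correctly predicted signs are assumptions chosen for illustration, and the numbers are made up.

```r
# Sketch: compare forecasts with actual values on a hold-out set
actual     <- c(0.010, -0.020,  0.005, -0.001)  # hypothetical actual returns
forecasted <- c(0.004, -0.010, -0.002, -0.003)  # hypothetical forecasts

# Root mean squared error: typical size of the forecast error
rmse <- sqrt(mean((actual - forecasted)^2))

# Hit rate: share of periods where the forecast has the correct sign
hit_rate <- mean(sign(actual) == sign(forecasted))

print(hit_rate)  # 0.75
```

For return forecasts the hit rate is often the more informative of the two, because the direction of the move matters more than its exact size.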

IV. Building the ARIMA model

Many R packages can be used to build a time series model. The following are among the most widely used and reliable: quantmod (downloads stock data from Yahoo Finance and draws price and volume charts as bar charts and candlestick charts), tseries (time series analysis, including stationarity tests such as ADF), timeSeries (classes and tools for financial time series), forecast (produces forecasts from ARIMA models, plots them, and computes prediction intervals), xts (creates time series objects, used here to store the forecast values). Hopefully a package for loading VnIndex data will appear in the near future, so that using R for stock analysis in Vietnam becomes more common and convenient.

library(quantmod)
#Download the data from Yahoo Finance
#Check the network connection first so that the download can succeed.
getSymbols('TECHM.NS', from = '2012-01-01', to ='2015-01-01')

## [1] "TECHM.NS"

#Select the closing price (column 4)
stock_prices <- TECHM.NS[,4]
tail(stock_prices)

##            TECHM.NS.Close
## 2014-12-25       1258.800
## 2014-12-26       1283.350
## 2014-12-29       1294.876
## 2014-12-30       1305.824
## 2014-12-31       1296.776
## 2015-01-01       1297.276

Rather than computing simple returns and fitting ARIMA directly on them, we take the first difference of the log price (the log return) and use it as the input to the ARIMA model. This is just a data transformation whose purpose is to produce a new, stationary series.

stock = diff(log(stock_prices), lag = 1)
stock = stock[!is.na(stock)]
plot(stock,main = "log return plot")
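
To see why the log difference can stand in for the return, the sketch below compares the two on a made-up price vector (the prices are hypothetical, not TECHM.NS data). The log return equals log(1 + simple return), so for small daily moves the two are nearly identical.

```r
# Sketch: log returns versus simple returns on toy prices
prices <- c(100, 102, 99.96)

log_ret    <- diff(log(prices))                        # log(P_t) - log(P_{t-1})
simple_ret <- prices[-1] / prices[-length(prices)] - 1 # P_t / P_{t-1} - 1

# Largest absolute gap between the two definitions
max_gap <- max(abs(log_ret - simple_ret))
```

Here the simple returns are +2% and -2%, and the gap to the log returns is on the order of 0.0002.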

Run the ADF test to check the stationarity of the series.

library("tseries")
print(adf.test(stock))

## 
##  Augmented Dickey-Fuller Test
## 
## data:  stock
## Dickey-Fuller = -9.5228, Lag order = 9, p-value = 0.01
## alternative hypothesis: stationary

The p-value is below 0.05, so the series is stationary.

Split the dataset into train and test sets. The train set takes the earliest 29/30 of the observations and the test set takes the most recent 1/30.

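##breakpoint marks the row where the split happens
breakpoint = floor(nrow(stock)*(2.9/3))
##train dataset: the earliest observations
train = stock[1:breakpoint,]
##test dataset: the remaining, most recent observations
test = stock[(breakpoint + 1):nrow(stock),]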

Compute the ACF and PACF:

par(mfrow = c(1,1))
acf.train = acf(train,main = 'ACF Plot', lag.max = 100)

pacf.train = pacf(train,main = "PACF plot", lag.max = 100)

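In the ACF plot the significant spikes reach lag 2, and the same holds for the PACF, so p = 2 and q = 2. The ADF test showed that the first difference of the log stock price is stationary, so d = 1 and the model we choose is ARIMA(2,1,2) on the log price. Because the series stock is already differenced, this is fitted below as order = c(2,0,2) on the log returns.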

Our goal is to forecast the whole return series of the test dataset, so we write a for loop in R. Each iteration refits the model on all observations up to that point and computes one one-step-ahead forecast in the test dataset.

#Initialise an xts object for the actual log returns
nrow(train)

## [1] 756

a = 756
stock[a,]

##            TECHM.NS.Close
## 2014-11-25    -0.01548775

#Use the date just before the observation at index a as a placeholder first row
Actual_series = xts(0,as.Date("2014-11-24","%Y-%m-%d"))
Actual_series

##            [,1]
## 2014-11-24    0

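#Initialise the data frame that collects the forecasts (STT = row index of the forecast)
library(forecast)
forecasted_series = data.frame(STT = integer(),Forecasted = numeric(),Upper_Forecasted = numeric(),Lower_Forecasted = numeric())
#Note: a:nrow(stock)-1 is parsed as (a:nrow(stock))-1, so b runs from a-1 to nrow(stock)-1.
#The bounds are written out explicitly here; the behaviour is unchanged.
for(b in (a-1):(nrow(stock)-1)){
  #Refit on all observations up to row b
  stock_train = stock[1:b,]
  #ARIMA(2,0,2) on the log returns, i.e. ARIMA(2,1,2) on the log prices
  fit = arima(stock_train,order = c(2,0,2),include.mean = FALSE)
  summary(fit)
  #Residual ACF plot (optional)
  #acf(fit$residuals,main="Residuals plot")
  #One-step-ahead forecast of the log return with a 95% interval.
  #forecast.Arima() is not exported by recent forecast releases; call the forecast() generic instead.
  arima.forecast = forecast(fit,h=1,level = 95)
  summary(arima.forecast)
  #Append the forecast to forecasted_series, keeping the column names intact
  forecasted_series = rbind(forecasted_series,data.frame(STT = b+1,Forecasted = as.numeric(arima.forecast$mean[1]),Upper_Forecasted = as.numeric(arima.forecast$upper[1]),Lower_Forecasted = as.numeric(arima.forecast$lower[1])))
  #Plot the forecast (optional)
  #par(mfrow=c(1,1))
  #plot(arima.forecast,main = "ARIMA forecast")
  #Append the actual return of the forecast period
  Actual_return = stock[(b+1),]
  Actual_series = c(Actual_series,xts(Actual_return))
  rm(Actual_return)
}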

## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2     ma1     ma2
##       -0.053  -0.2503  0.1608  0.3227
## s.e.     NaN   0.0856     NaN  0.1416
## 
## sigma^2 estimated as 0.0002952:  log likelihood = 1996.94,  aic = -3983.88
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001786199 0.01718176 0.01262324 NaN  Inf 0.7074692
##                    ACF1
## Training set -0.0177668
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 756   -0.002612282 -0.03628791 0.03106335
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.5994  -0.8681  -0.5335  0.8863
## s.e.  0.0540   0.0814   0.0556  0.0774
## 
## sigma^2 estimated as 0.0002932:  log likelihood = 2002.08,  aic = -3994.15
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001886453 0.01712328 0.01263224 NaN  Inf 0.7083944
##                    ACF1
## Training set 0.01926763
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.5994  -0.8681  -0.5335  0.8863
## s.e.  0.0540   0.0814   0.0556  0.0774
## 
## sigma^2 estimated as 0.0002932:  log likelihood = 2002.08,  aic = -3994.15
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001886453 0.01712328 0.01263224 NaN  Inf 0.7083944
##                    ACF1
## Training set 0.01926763
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 757   -0.004308985 -0.03786999 0.02925202
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.2736  -0.2127  0.3788  0.3105
## s.e.   0.4721   0.2313  0.4639  0.2120
## 
## sigma^2 estimated as 0.0002948:  log likelihood = 2002.8,  aic = -3995.59
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001782112 0.01716888 0.01261935 NaN  Inf 0.7072624
##                    ACF1
## Training set -0.0147404
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.2736  -0.2127  0.3788  0.3105
## s.e.   0.4721   0.2313  0.4639  0.2120
## 
## sigma^2 estimated as 0.0002948:  log likelihood = 2002.8,  aic = -3995.59
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001782112 0.01716888 0.01261935 NaN  Inf 0.7072624
##                    ACF1
## Training set -0.0147404
## 
## Forecasts:
##     Point Forecast      Lo 95      Hi 95
## 758   0.0007807852 -0.0328696 0.03443117
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.3011  -0.1931  0.4063  0.2943
## s.e.   0.4292   0.2510  0.4219  0.2294
## 
## sigma^2 estimated as 0.0002944:  log likelihood = 2005.94,  aic = -4001.88
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001775923 0.01715753 0.01260308 NaN  Inf 0.7067369
##                    ACF1
## Training set -0.0146981
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.3011  -0.1931  0.4063  0.2943
## s.e.   0.4292   0.2510  0.4219  0.2294
## 
## sigma^2 estimated as 0.0002944:  log likelihood = 2005.94,  aic = -4001.88
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001775923 0.01715753 0.01260308 NaN  Inf 0.7067369
##                    ACF1
## Training set -0.0146981
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 759    0.001015061 -0.03261309 0.03464321
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.2779  -0.2222  0.3828  0.3196
## s.e.   0.4650   0.2281  0.4564  0.2116
## 
## sigma^2 estimated as 0.000294:  log likelihood = 2009.07,  aic = -4008.13
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001774314 0.01714674 0.01259158 NaN  Inf 0.7069191
##                     ACF1
## Training set -0.01446794
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.2779  -0.2222  0.3828  0.3196
## s.e.   0.4650   0.2281  0.4564  0.2116
## 
## sigma^2 estimated as 0.000294:  log likelihood = 2009.07,  aic = -4008.13
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001774314 0.01714674 0.01259158 NaN  Inf 0.7069191
##                     ACF1
## Training set -0.01446794
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 760  -0.0009070753 -0.03451407 0.03269992
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.3047  -0.1956  0.4093  0.2968
## s.e.   0.4302   0.2533  0.4230  0.2314
## 
## sigma^2 estimated as 0.0002937:  log likelihood = 2012.16,  aic = -4014.32
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001775939 0.01713666 0.01258146 NaN  Inf 0.7069025
##                    ACF1
## Training set -0.0141358
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.3047  -0.1956  0.4093  0.2968
## s.e.   0.4302   0.2533  0.4230  0.2314
## 
## sigma^2 estimated as 0.0002937:  log likelihood = 2012.16,  aic = -4014.32
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001775939 0.01713666 0.01258146 NaN  Inf 0.7069025
##                    ACF1
## Training set -0.0141358
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 761   0.0003435073 -0.03324373 0.03393074
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.2713  -0.2121  0.3762  0.3094
## s.e.   0.4785   0.2312  0.4704  0.2116
## 
## sigma^2 estimated as 0.0002933:  log likelihood = 2015.3,  aic = -4020.6
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001776555 0.01712554 0.01256791 NaN  Inf 0.7069006
##                     ACF1
## Training set -0.01465206
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.2713  -0.2121  0.3762  0.3094
## s.e.   0.4785   0.2312  0.4704  0.2116
## 
## sigma^2 estimated as 0.0002933:  log likelihood = 2015.3,  aic = -4020.6
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001776555 0.01712554 0.01256791 NaN  Inf 0.7069006
##                     ACF1
## Training set -0.01465206
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 762   0.0008073216 -0.03275812 0.03437276
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.3010  -0.1962  0.4061  0.2974
## s.e.   0.4207   0.2471  0.4133  0.2270
## 
## sigma^2 estimated as 0.0002931:  log likelihood = 2018.19,  aic = -4026.39
## 
## Training set error measures:
##                     ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.0017882 0.01712008 0.01256704 NaN  Inf 0.7071783
##                     ACF1
## Training set -0.01468472
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.3010  -0.1962  0.4061  0.2974
## s.e.   0.4207   0.2471  0.4133  0.2270
## 
## sigma^2 estimated as 0.0002931:  log likelihood = 2018.19,  aic = -4026.39
## 
## Error measures:
##                     ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.0017882 0.01712008 0.01256704 NaN  Inf 0.7071783
##                     ACF1
## Training set -0.01468472
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 763     0.00112281 -0.03243193 0.03467755
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.3010  -0.2014  0.4057  0.3023
## s.e.   0.4194   0.2445  0.4119  0.2254
## 
## sigma^2 estimated as 0.0002928:  log likelihood = 2021.28,  aic = -4032.56
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001779428 0.01711026 0.01255833 NaN  Inf 0.7066697
##                     ACF1
## Training set -0.01464812
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.3010  -0.2014  0.4057  0.3023
## s.e.   0.4194   0.2445  0.4119  0.2254
## 
## sigma^2 estimated as 0.0002928:  log likelihood = 2021.28,  aic = -4032.56
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001779428 0.01711026 0.01255833 NaN  Inf 0.7066697
##                     ACF1
## Training set -0.01464812
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 764   0.0001129289 -0.03342257 0.03364842
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2    ma1     ma2
##       -0.2861  -0.2260  0.391  0.3236
## s.e.   0.4405   0.2296  0.432  0.2144
## 
## sigma^2 estimated as 0.0002931:  log likelihood = 2023.51,  aic = -4037.02
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001751357 0.01711963 0.01257273 NaN  Inf 0.7074629
##                     ACF1
## Training set -0.01417224
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2    ma1     ma2
##       -0.2861  -0.2260  0.391  0.3236
## s.e.   0.4405   0.2296  0.432  0.2144
## 
## sigma^2 estimated as 0.0002931:  log likelihood = 2023.51,  aic = -4037.02
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001751357 0.01711963 0.01257273 NaN  Inf 0.7074629
##                     ACF1
## Training set -0.01417224
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 765   -0.003290276 -0.03684413 0.03026358
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.2608  -0.2141  0.3662  0.3100
## s.e.   0.4911   0.2294  0.4829  0.2088
## 
## sigma^2 estimated as 0.0002927:  log likelihood = 2026.65,  aic = -4043.31
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001746574 0.01710856 0.01255852 NaN  Inf 0.7064939
##                     ACF1
## Training set -0.01458612
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.2608  -0.2141  0.3662  0.3100
## s.e.   0.4911   0.2294  0.4829  0.2088
## 
## sigma^2 estimated as 0.0002927:  log likelihood = 2026.65,  aic = -4043.31
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001746574 0.01710856 0.01255852 NaN  Inf 0.7064939
##                     ACF1
## Training set -0.01458612
## 
## Forecasts:
##     Point Forecast      Lo 95      Hi 95
## 766   -0.001243542 -0.0347757 0.03228861
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.2575  -0.2143  0.3629  0.3099
## s.e.   0.5033   0.2298  0.4951  0.2080
## 
## sigma^2 estimated as 0.0002923:  log likelihood = 2029.8,  aic = -4049.6
## 
## Training set error measures:
##                      ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.00174556 0.01709744 0.01254344 NaN  Inf 0.7064761
##                     ACF1
## Training set -0.01453639
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.2575  -0.2143  0.3629  0.3099
## s.e.   0.5033   0.2298  0.4951  0.2080
## 
## sigma^2 estimated as 0.0002923:  log likelihood = 2029.8,  aic = -4049.6
## 
## Error measures:
##                      ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.00174556 0.01709744 0.01254344 NaN  Inf 0.7064761
##                     ACF1
## Training set -0.01453639
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 767    0.001249875 -0.03226049 0.03476024
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.0581  -0.2560  0.1656  0.3279
## s.e.      NaN   0.0922     NaN  0.1455
## 
## sigma^2 estimated as 0.0002924:  log likelihood = 2032.4,  aic = -4054.79
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE     MASE
## Training set 0.001721481 0.01709866 0.01255188 NaN  Inf 0.707072
##                     ACF1
## Training set -0.01723078
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##           ar1      ar2     ma1     ma2
##       -0.0581  -0.2560  0.1656  0.3279
## s.e.      NaN   0.0922     NaN  0.1455
## 
## sigma^2 estimated as 0.0002924:  log likelihood = 2032.4,  aic = -4054.79
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE     MASE
## Training set 0.001721481 0.01709866 0.01255188 NaN  Inf 0.707072
##                     ACF1
## Training set -0.01723078
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 768   -0.001458663 -0.03497142 0.03205409
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6018  -0.8753  -0.5396  0.8915
## s.e.  0.0540   0.0831   0.0574  0.0782
## 
## sigma^2 estimated as 0.0002909:  log likelihood = 2036.87,  aic = -4063.75
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001867401 0.01705622 0.01257633 NaN  Inf 0.7076102
##                    ACF1
## Training set 0.02166148
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6018  -0.8753  -0.5396  0.8915
## s.e.  0.0540   0.0831   0.0574  0.0782
## 
## sigma^2 estimated as 0.0002909:  log likelihood = 2036.87,  aic = -4063.75
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001867401 0.01705622 0.01257633 NaN  Inf 0.7076102
##                    ACF1
## Training set 0.02166148
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 769    0.002917051 -0.03051253 0.03634664
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6017  -0.8740  -0.5394  0.8896
## s.e.  0.0550   0.0886   0.0597  0.0839
## 
## sigma^2 estimated as 0.0002908:  log likelihood = 2039.66,  aic = -4069.33
## 
## Training set error measures:
##                       ME       RMSE       MAE MPE MAPE     MASE       ACF1
## Training set 0.001846629 0.01705322 0.0125767 NaN  Inf 0.707008 0.02044433
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6017  -0.8740  -0.5394  0.8896
## s.e.  0.0550   0.0886   0.0597  0.0839
## 
## sigma^2 estimated as 0.0002908:  log likelihood = 2039.66,  aic = -4069.33
## 
## Error measures:
##                       ME       RMSE       MAE MPE MAPE     MASE       ACF1
## Training set 0.001846629 0.01705322 0.0125767 NaN  Inf 0.707008 0.02044433
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 770  -0.0008387719 -0.03426246 0.03258492
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6005  -0.8704  -0.5358  0.8849
## s.e.  0.0567   0.0879   0.0609  0.0835
## 
## sigma^2 estimated as 0.0002921:  log likelihood = 2040.64,  aic = -4071.28
## 
## Training set error measures:
##                       ME      RMSE        MAE MPE MAPE      MASE
## Training set 0.001796063 0.0170904 0.01260607 NaN  Inf 0.7082903
##                    ACF1
## Training set 0.02074664
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6005  -0.8704  -0.5358  0.8849
## s.e.  0.0567   0.0879   0.0609  0.0835
## 
## sigma^2 estimated as 0.0002921:  log likelihood = 2040.64,  aic = -4071.28
## 
## Error measures:
##                       ME      RMSE        MAE MPE MAPE      MASE
## Training set 0.001796063 0.0170904 0.01260607 NaN  Inf 0.7082903
##                    ACF1
## Training set 0.02074664
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 771   -0.005550561 -0.03904713 0.02794601
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.5991  -0.8689  -0.5356  0.8832
## s.e.  0.0593   0.0987   0.0656  0.0940
## 
## sigma^2 estimated as 0.0002923:  log likelihood = 2043.04,  aic = -4076.09
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001822858 0.01709602 0.01261228 NaN  Inf 0.7068783
##                    ACF1
## Training set 0.01874599
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.5991  -0.8689  -0.5356  0.8832
## s.e.  0.0593   0.0987   0.0656  0.0940
## 
## sigma^2 estimated as 0.0002923:  log likelihood = 2043.04,  aic = -4076.09
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001822858 0.01709602 0.01261228 NaN  Inf 0.7068783
##                    ACF1
## Training set 0.01874599
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 772    -0.00159957 -0.03510716 0.03190802
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.5997  -0.8732  -0.5378  0.8886
## s.e.  0.0576   0.0904   0.0628  0.0847
## 
## sigma^2 estimated as 0.0002921:  log likelihood = 2045.87,  aic = -4081.74
## 
## Training set error measures:
##                       ME     RMSE        MAE MPE MAPE      MASE       ACF1
## Training set 0.001803524 0.017092 0.01261628 NaN  Inf 0.7064342 0.01891033
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.5997  -0.8732  -0.5378  0.8886
## s.e.  0.0576   0.0904   0.0628  0.0847
## 
## sigma^2 estimated as 0.0002921:  log likelihood = 2045.87,  aic = -4081.74
## 
## Error measures:
##                       ME     RMSE        MAE MPE MAPE      MASE       ACF1
## Training set 0.001803524 0.017092 0.01261628 NaN  Inf 0.7064342 0.01891033
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 773    0.003147138 -0.03035256 0.03664684
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.5987  -0.8716  -0.5355  0.8872
## s.e.  0.0576   0.0863   0.0615  0.0807
## 
## sigma^2 estimated as 0.0002922:  log likelihood = 2048.46,  aic = -4086.92
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001822574 0.01709329 0.01262531 NaN  Inf 0.7059753
##                    ACF1
## Training set 0.01644947
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.5987  -0.8716  -0.5355  0.8872
## s.e.  0.0576   0.0863   0.0615  0.0807
## 
## sigma^2 estimated as 0.0002922:  log likelihood = 2048.46,  aic = -4086.92
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001822574 0.01709329 0.01262531 NaN  Inf 0.7059753
##                    ACF1
## Training set 0.01644947
## 
## Forecasts:
##     Point Forecast       Lo 95     Hi 95
## 774    0.004357063 -0.02914518 0.0378593
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.5993  -0.8729  -0.5361  0.8887
## s.e.  0.0567   0.0835   0.0601  0.0778
## 
## sigma^2 estimated as 0.0002919:  log likelihood = 2051.52,  aic = -4093.04
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE     MASE
## Training set 0.001829485 0.01708422 0.01261987 NaN  Inf 0.706096
##                    ACF1
## Training set 0.01690044
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.5993  -0.8729  -0.5361  0.8887
## s.e.  0.0567   0.0835   0.0601  0.0778
## 
## sigma^2 estimated as 0.0002919:  log likelihood = 2051.52,  aic = -4093.04
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE     MASE
## Training set 0.001829485 0.01708422 0.01261987 NaN  Inf 0.706096
##                    ACF1
## Training set 0.01690044
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 775   0.0005522973 -0.03293215 0.03403675
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6001  -0.8742  -0.5374  0.8901
## s.e.  0.0561   0.0828   0.0597  0.0770
## 
## sigma^2 estimated as 0.0002915:  log likelihood = 2054.62,  aic = -4099.25
## 
## Training set error measures:
##                       ME      RMSE        MAE MPE MAPE      MASE
## Training set 0.001834416 0.0170742 0.01261067 NaN  Inf 0.7061934
##                    ACF1
## Training set 0.01739024
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6001  -0.8742  -0.5374  0.8901
## s.e.  0.0561   0.0828   0.0597  0.0770
## 
## sigma^2 estimated as 0.0002915:  log likelihood = 2054.62,  aic = -4099.25
## 
## Error measures:
##                       ME      RMSE        MAE MPE MAPE      MASE
## Training set 0.001834416 0.0170742 0.01261067 NaN  Inf 0.7061934
##                    ACF1
## Training set 0.01739024
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 776    -0.00307604 -0.03654087 0.03038879
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6005  -0.8756  -0.5378  0.8914
## s.e.  0.0555   0.0799   0.0587  0.0740
## 
## sigma^2 estimated as 0.0002912:  log likelihood = 2057.73,  aic = -4105.46
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001825437 0.01706422 0.01260192 NaN  Inf 0.7058894
##                    ACF1
## Training set 0.01734871
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6005  -0.8756  -0.5378  0.8914
## s.e.  0.0555   0.0799   0.0587  0.0740
## 
## sigma^2 estimated as 0.0002912:  log likelihood = 2057.73,  aic = -4105.46
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001825437 0.01706422 0.01260192 NaN  Inf 0.7058894
##                    ACF1
## Training set 0.01734871
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 777   -0.002611608 -0.03605687 0.03083365
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6023  -0.8777  -0.5400  0.8942
## s.e.  0.0534   0.0760   0.0559  0.0702
## 
## sigma^2 estimated as 0.0002913:  log likelihood = 2060.26,  aic = -4110.52
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001798272 0.01706679 0.01261265 NaN  Inf 0.7067238
##                    ACF1
## Training set 0.01829934
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6023  -0.8777  -0.5400  0.8942
## s.e.  0.0534   0.0760   0.0559  0.0702
## 
## sigma^2 estimated as 0.0002913:  log likelihood = 2060.26,  aic = -4110.52
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001798272 0.01706679 0.01261265 NaN  Inf 0.7067238
##                    ACF1
## Training set 0.01829934
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 778  -0.0001232298 -0.03357352 0.03332706
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6028  -0.8774  -0.5403  0.8935
## s.e.  0.0534   0.0779   0.0564  0.0722
## 
## sigma^2 estimated as 0.0002909:  log likelihood = 2063.41,  aic = -4116.82
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001796272 0.01705583 0.01259605 NaN  Inf 0.7055991
##                    ACF1
## Training set 0.01826409
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6028  -0.8774  -0.5403  0.8935
## s.e.  0.0534   0.0779   0.0564  0.0722
## 
## sigma^2 estimated as 0.0002909:  log likelihood = 2063.41,  aic = -4116.82
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001796272 0.01705583 0.01259605 NaN  Inf 0.7055991
##                    ACF1
## Training set 0.01826409
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 779    0.001985118 -0.03144369 0.03541393
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6054  -0.8823  -0.5435  0.8979
## s.e.  0.0513   0.0719   0.0535  0.0664
## 
## sigma^2 estimated as 0.0002909:  log likelihood = 2066.05,  aic = -4122.1
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001817735 0.01705596 0.01260287 NaN  Inf 0.7059066
##                   ACF1
## Training set 0.0187026
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6054  -0.8823  -0.5435  0.8979
## s.e.  0.0513   0.0719   0.0535  0.0664
## 
## sigma^2 estimated as 0.0002909:  log likelihood = 2066.05,  aic = -4122.1
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001817735 0.01705596 0.01260287 NaN  Inf 0.7059066
##                   ACF1
## Training set 0.0187026
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 780    0.002405877 -0.03102319 0.03583494
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6074  -0.8851  -0.5458  0.9003
## s.e.  0.0500   0.0697   0.0524  0.0640
## 
## sigma^2 estimated as 0.0002906:  log likelihood = 2069.13,  aic = -4128.26
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001824805 0.01704655 0.01259535 NaN  Inf 0.7058652
##                    ACF1
## Training set 0.01941105
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6074  -0.8851  -0.5458  0.9003
## s.e.  0.0500   0.0697   0.0524  0.0640
## 
## sigma^2 estimated as 0.0002906:  log likelihood = 2069.13,  aic = -4128.26
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001824805 0.01704655 0.01259535 NaN  Inf 0.7058652
##                    ACF1
## Training set 0.01941105
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 781   0.0002254591 -0.03318516 0.03363608
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6069  -0.8803  -0.5441  0.8957
## s.e.  0.0508   0.0755   0.0537  0.0705
## 
## sigma^2 estimated as 0.0002903:  log likelihood = 2072.17,  aic = -4134.34
## 
## Training set error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001830595 0.01703822 0.01258974 NaN  Inf 0.7064299
##                    ACF1
## Training set 0.01829992
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6069  -0.8803  -0.5441  0.8957
## s.e.  0.0508   0.0755   0.0537  0.0705
## 
## sigma^2 estimated as 0.0002903:  log likelihood = 2072.17,  aic = -4134.34
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set 0.001830595 0.01703822 0.01258974 NaN  Inf 0.7064299
##                    ACF1
## Training set 0.01829992
## 
## Forecasts:
##     Point Forecast      Lo 95      Hi 95
## 782   -0.001356514 -0.0347508 0.03203777
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6072  -0.8818  -0.5447  0.8972
## s.e.  0.0504   0.0738   0.0533  0.0687
## 
## sigma^2 estimated as 0.00029:  log likelihood = 2075.27,  aic = -4140.54
## 
## Training set error measures:
##                       ME       RMSE       MAE MPE MAPE     MASE       ACF1
## Training set 0.001821793 0.01702845 0.0125809 NaN  Inf 0.706058 0.01844063
## 
## Forecast method: ARIMA(2,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(2, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       0.6072  -0.8818  -0.5447  0.8972
## s.e.  0.0504   0.0738   0.0533  0.0687
## 
## sigma^2 estimated as 0.00029:  log likelihood = 2075.27,  aic = -4140.54
## 
## Error measures:
##                       ME       RMSE       MAE MPE MAPE     MASE       ACF1
## Training set 0.001821793 0.01702845 0.0125809 NaN  Inf 0.706058 0.01844063
## 
## Forecasts:
##     Point Forecast       Lo 95      Hi 95
## 783   -0.001326816 -0.03470197 0.03204834

colnames(forecasted_series)=c("STT","Forecasted","Upper_Forecasted","Lower_Forecasted")
tail(forecasted_series)

##    STT    Forecasted Upper_Forecasted Lower_Forecasted
## 23 778 -0.0001232298       0.03332706      -0.03357352
## 24 779  0.0019851178       0.03541393      -0.03144369
## 25 780  0.0024058768       0.03583494      -0.03102319
## 26 781  0.0002254591       0.03363608      -0.03318516
## 27 782 -0.0013565141       0.03203777      -0.03475080
## 28 783 -0.0013268160       0.03204834      -0.03470197

From the coefficients of the last fitted model, we obtain the equation for the differenced log series (the log return):

\[ Y_{t} = 0.6072\,Y_{t-1} - 0.8818\,Y_{t-2} + \epsilon_{t} - 0.5447\,\epsilon_{t-1} + 0.8972\,\epsilon_{t-2} \]
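As a rough check of how this equation produces a forecast, the sketch below simulates a zero-mean ARMA(2,2) series with the coefficients above, holds those coefficients fixed in `arima()` so that no estimation takes place, and compares a one-step forecast computed by hand with the value returned by `predict()`. The simulated series, the seed and the noise level `sd = 0.017` are assumptions made for illustration only; this is not the article's stock data.

```r
# Illustrative only: simulated data, not the stock series used in the article.
set.seed(1)
phi   <- c(0.6072, -0.8818)  # ar1, ar2 taken from the last fit above
theta <- c(-0.5447, 0.8972)  # ma1, ma2 taken from the last fit above

# Simulate a zero-mean ARMA(2,2) series with these coefficients
y <- arima.sim(model = list(ar = phi, ma = theta), n = 500, sd = 0.017)

# Fix every coefficient so arima() only filters the series (nothing is estimated)
fit <- arima(y, order = c(2, 0, 2), include.mean = FALSE,
             fixed = c(phi, theta), transform.pars = FALSE)

yy <- as.numeric(y)
e  <- as.numeric(residuals(fit))  # one-step-ahead innovations
n  <- length(yy)

# Point forecast from the equation: the future shock has mean zero and drops out
manual  <- phi[1] * yy[n] + phi[2] * yy[n - 1] +
           theta[1] * e[n] + theta[2] * e[n - 1]
builtin <- as.numeric(predict(fit, n.ahead = 1)$pred)

print(c(manual = manual, builtin = builtin))
```

On a long series the two numbers should agree closely, because the innovations reported by `arima()` settle down to the shocks that appear in the equation.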

Prefer a model whose coefficient estimates have standard errors small enough for the coefficients to be statistically significant. The Akaike information criterion (AIC) is used to compare candidate models: the model with the lower AIC offers the better trade-off between fit and complexity. We can also inspect the autocorrelation (ACF) plot of the residuals: in a good ARIMA model the residual autocorrelations stay within the significance bounds. The forecast returned, -0.001327, is the last forecast value in the output.
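The two checks described above can be sketched in a few lines of base R. The example uses a simulated AR(1) series, and the list of candidate orders is an arbitrary choice for illustration; neither comes from the article's data.

```r
# Illustrative only: simulated AR(1) data and an arbitrary set of candidate orders.
set.seed(42)
y <- arima.sim(model = list(ar = 0.5), n = 400)

candidates <- list(c(1, 0, 0), c(0, 0, 1), c(1, 0, 1), c(2, 0, 2))

# AIC of each candidate; a fit that fails is recorded as NA instead of stopping
aic_table <- data.frame(
  order = sapply(candidates, paste, collapse = ","),
  aic   = sapply(candidates, function(ord) {
    tryCatch(AIC(arima(y, order = ord, include.mean = FALSE)),
             error = function(e) NA_real_)
  })
)
print(aic_table[order(aic_table$aic), ])

# Residual diagnostics for one candidate
best     <- arima(y, order = c(1, 0, 0), include.mean = FALSE)
acf_vals <- acf(residuals(best), plot = FALSE)
bound    <- qnorm(0.975) / sqrt(length(y))  # approximate 95% white-noise bounds
n_outside <- sum(abs(acf_vals$acf[-1]) > bound)  # lag 0 is excluded
print(n_outside)

# Ljung-Box test: a large p-value gives no evidence of leftover autocorrelation
print(Box.test(residuals(best), lag = 10, type = "Ljung-Box", fitdf = 1))
```

AIC values are only comparable between models fitted to the same series, so the table ranks candidates rather than measuring accuracy in absolute terms.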

V. Checking the accuracy of the ARIMA model

Comparing the forecast values with the actual log returns shows how accurate the ARIMA forecasts are.

#Drop the placeholder first element so the actual series matches the forecast length
Actual_series = Actual_series[-1]
#Convert the forecast data frame to an xts object indexed by the actual dates
forecasted_series = xts(forecasted_series,index(Actual_series))

#Compare the forecast values with the actual values
plot(Actual_series,ylim=c(-0.2,0.2),main="Actual return vs Forecasted return")
lines(forecasted_series$Forecasted,lwd=1.5,col = "blue")
lines(forecasted_series$Upper_Forecasted,type = "l",lty = 1,lwd=0.5,col = "red")
lines(forecasted_series$Lower_Forecasted,type = "l",lty = 1,lwd=0.5,col = "red")
legend('bottomright',c('Actual','Forecasted',"Upper forecasted","Lower forecasted"),lty = c(1,1,1,1), lwd = c(1.5,1.5,0.5,0.5),col = c('black','blue','red','red'))

#Build a table of actual and forecast values; Accuracy is 1 when their signs agree
comparision = merge(Actual_series,forecasted_series$Forecasted)
comparision$Accuracy = sign(comparision$Actual_series) == sign(comparision$Forecasted)
print(comparision)

##            Actual_series    Forecasted Accuracy
## 2014-11-25 -0.0154877473 -0.0026122819        1
## 2014-11-26  0.0101377233 -0.0043089852        0
## 2014-11-27 -0.0003219135  0.0007807852        0
## 2014-11-28 -0.0023441737  0.0010150613        0
## 2014-12-01  0.0049120615 -0.0009070753        0
## 2014-12-02  0.0016375258  0.0003435073        1
## 2014-12-03  0.0131590684  0.0008073216        1
## 2014-12-04 -0.0049711114  0.0011228101        0
## 2014-12-05 -0.0230689605  0.0001129289        0
## 2014-12-08 -0.0020447544 -0.0032902765        1
## 2014-12-09 -0.0002495719 -0.0012435420        1
## 2014-12-10 -0.0156834851  0.0012498747        0
## 2014-12-11  0.0182084822 -0.0014586629        0
## 2014-12-12 -0.0115992224  0.0029170506        0
## 2014-12-15 -0.0365085594 -0.0008387719        1
## 2014-12-16  0.0154268090 -0.0055505608        0
## 2014-12-17 -0.0154268090 -0.0015995700        1
## 2014-12-18  0.0212679508  0.0031471377        1
## 2014-12-19  0.0116987764  0.0043570629        1
## 2014-12-22  0.0058169635  0.0005522973        1
## 2014-12-23 -0.0083976341 -0.0030760402        1
## 2014-12-24 -0.0216700763 -0.0026116082        1
## 2014-12-25  0.0000000000 -0.0001232298        0
## 2014-12-26  0.0193149604  0.0019851178        1
## 2014-12-29  0.0089410911  0.0024058768        1
## 2014-12-30  0.0084193215  0.0002254591        1
## 2014-12-31 -0.0069530750 -0.0013565141        1
## 2015-01-01  0.0003854973 -0.0013268160        0

#Compute the percentage of days on which the forecast sign was correct
Accuracy_percentage = sum(comparision$Accuracy ==1)*100/length(comparision$Accuracy)
print(Accuracy_percentage)

## [1] 57.14286

The ARIMA model therefore predicted the direction (sign) of the return correctly in 57.14% of cases, that is, on 16 of the 28 forecast days. The actual return line tends to stay within the 95% forecast interval.
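To put the 57.14% figure in context, the sketch below recomputes it from the `Accuracy` column printed above and runs an exact binomial test against a 50% coin flip. The `hits` vector is copied by hand from that table; using `binom.test()` here is an addition for illustration, not part of the original workflow.

```r
# Accuracy column of the ARIMA(2,0,2) comparison table, copied by hand
hits <- c(1,0,0,0,0,1,1,0,0,1,1,0,0,0,1,0,1,1,1,1,1,1,0,1,1,1,1,0)

# Share of days on which the forecast sign matched the actual sign
accuracy <- 100 * mean(hits)
print(accuracy)

# Exact binomial test of the hit rate against pure chance (p = 0.5)
bt <- binom.test(sum(hits), length(hits), p = 0.5)
print(bt$p.value)
print(bt$conf.int)
```

With only 28 forecast days the confidence interval for the hit rate is wide, so a figure like this is better read as an indication than as proof of predictive power.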

Understanding the model well enough to identify the correct order (p,d,q) is essential for building an ARIMA forecast that is accurate and trustworthy. Choosing the wrong ARIMA(p,d,q) can produce large forecast errors, and acting on such forecasts can lead to poor decisions about when to buy or sell. For example, for the same stock, choosing ARIMA(1,0,2) gives a directional accuracy of only 39.29%, which has no practical value because it is lower than random guessing. The results of the ARIMA(1,0,2) model follow.

#Load the forecast package once, before the loop
library(forecast)
#Initialise an xts object for the actual log returns
nrow(train)
a = 756
stock[a,]
#Placeholder dated one trading day before observation a; it is dropped after the loop
Actual_series = xts(0,as.Date("2014-11-24","%Y-%m-%d"))
Actual_series
#Initialise the data frame that collects the forecasts
forecasted_series = data.frame(STT = integer(),Forecasted = numeric(),Upper_Forecasted = numeric(),Lower_Forecasted = numeric())
#b runs from a-1 to nrow(stock)-1, so the first forecast is for observation a
for(b in (a-1):(nrow(stock)-1)){
  #Expanding window: train on observations 1..b, forecast observation b+1
  stock_train = stock[1:b,]
  stock_test = stock[-(1:b),]
  #Fit and summarise the ARIMA(1,0,2) model
  fit = arima(stock_train,order = c(1,0,2),include.mean = FALSE)
  summary(fit)
  #Residual ACF plot
  #acf(fit$residuals,main="Residuals plot")
  #One-step-ahead forecast of the log return with a 95% interval
  #forecast() dispatches to the Arima method (forecast.Arima is not exported in recent versions of the forecast package)
  arima.forecast = forecast(fit,h=1,level = 95)
  summary(arima.forecast)
  #Append the forecast to forecasted_series
  forecasted_series = rbind(forecasted_series,c(STT = b+1,Forecasted = arima.forecast$mean[1],arima.forecast$upper,arima.forecast$lower))
  #Plot the forecast
  #par(mfrow=c(1,1))
  #plot(arima.forecast,main = "ARIMA forecast")
  #Append the actual return of the forecast day
  Actual_return = stock[(b+1),]
  Actual_series = c(Actual_series,xts(Actual_return))
  rm(Actual_return)
}
colnames(forecasted_series)=c("STT","Forecasted","Upper_Forecasted","Lower_Forecasted")
tail(forecasted_series)

#Drop the placeholder first element so the actual series matches the forecast length
Actual_series = Actual_series[-1]
#Convert the forecast data frame to an xts object indexed by the actual dates
forecasted_series = xts(forecasted_series,index(Actual_series))

#Compare the forecast values with the actual values
plot(Actual_series,ylim=c(-0.2,0.2),main="Actual return vs Forecasted return")
lines(forecasted_series$Forecasted,lwd=1.5,col = "blue")
lines(forecasted_series$Upper_Forecasted,type = "l",lty = 1,lwd=0.5,col = "red")
lines(forecasted_series$Lower_Forecasted,type = "l",lty = 1,lwd=0.5,col = "red")
legend('bottomright',c('Actual','Forecasted',"Upper forecasted","Lower forecasted"),lty = c(1,1,1,1), lwd = c(1.5,1.5,0.5,0.5),col = c('black','blue','red','red'))

#Build a table of actual and forecast values; Accuracy is 1 when their signs agree
comparision = merge(Actual_series,forecasted_series$Forecasted)
comparision$Accuracy = sign(comparision$Actual_series) == sign(comparision$Forecasted)
print(comparision)

##            Actual_series    Forecasted Accuracy
## 2014-11-25 -0.0154877473 -5.427549e-04        1
## 2014-11-26  0.0101377233 -1.440300e-03        0
## 2014-11-27 -0.0003219135  7.433579e-04        0
## 2014-11-28 -0.0023441737  1.109022e-03        0
## 2014-12-01  0.0049120615 -9.602641e-04        0
## 2014-12-02  0.0016375258  1.608764e-03        1
## 2014-12-03  0.0131590684  6.646666e-05        1
## 2014-12-04 -0.0049711114  1.911487e-03        0
## 2014-12-05 -0.0230689605 -6.816333e-04        1
## 2014-12-08 -0.0020447544 -2.374005e-03        1
## 2014-12-09 -0.0002495719 -4.081253e-04        1
## 2014-12-10 -0.0156834851  4.383304e-04        0
## 2014-12-11  0.0182084822 -2.095485e-03        0
## 2014-12-12 -0.0115992224  1.852577e-03        0
## 2014-12-15 -0.0365085594 -9.105770e-04        1
## 2014-12-16  0.0154268090 -4.262190e-03        0
## 2014-12-17 -0.0154268090  5.346352e-04        0
## 2014-12-18  0.0212679508 -6.512345e-04        0
## 2014-12-19  0.0116987764 -1.613320e-04        0
## 2014-12-22  0.0058169635  3.588032e-03        1
## 2014-12-23 -0.0083976341 -2.351856e-04        1
## 2014-12-24 -0.0216700763  4.357746e-04        0
## 2014-12-25  0.0000000000 -2.809399e-03        0
## 2014-12-26  0.0193149604 -8.534465e-04        0
## 2014-12-29  0.0089410911  2.667234e-03        1
## 2014-12-30  0.0084193215  1.530330e-03        1
## 2014-12-31 -0.0069530750  6.058686e-04        0
## 2015-01-01  0.0003854973 -2.680053e-04        0

#Compute the percentage of days on which the forecast sign was correct
Accuracy_percentage = sum(comparision$Accuracy ==1)*100/length(comparision$Accuracy)
print(Accuracy_percentage)

## [1] 39.28571
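The walk-forward comparison used above for the two orders can be wrapped in a small helper so that several orders are evaluated in the same way. The sketch below is one possible arrangement, not the article's code: `rolling_sign_accuracy` is a hypothetical helper name, it uses base R's `predict()` rather than the forecast package, and it runs on a simulated AR(1) series instead of the stock data.

```r
# Hypothetical helper: expanding-window, one-step-ahead sign accuracy (in percent)
rolling_sign_accuracy <- function(y, order, n_test) {
  y <- as.numeric(y)
  n <- length(y)
  hits <- logical(0)
  for (b in (n - n_test):(n - 1)) {
    # Train on observations 1..b; a failed fit is skipped rather than aborting
    fit <- tryCatch(arima(y[1:b], order = order, include.mean = FALSE),
                    error = function(e) NULL)
    if (is.null(fit)) next
    pred <- as.numeric(predict(fit, n.ahead = 1)$pred)
    # A hit means the forecast and the actual value have the same sign
    hits <- c(hits, sign(pred) == sign(y[b + 1]))
  }
  100 * mean(hits)  # NaN if every fit failed
}

# Illustrative only: simulated data, not the article's stock returns
set.seed(123)
y <- arima.sim(model = list(ar = 0.6), n = 300)

acc <- c(
  "ARIMA(2,0,2)" = rolling_sign_accuracy(y, c(2, 0, 2), n_test = 20),
  "ARIMA(1,0,2)" = rolling_sign_accuracy(y, c(1, 0, 2), n_test = 20)
)
print(acc)
```

Evaluating every candidate over the same forecast window keeps the comparison fair. With a window as short as the one here, differences of a few percentage points between orders may simply be noise.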