I. Data

The data are the adjusted closing prices of the GAS ticker from 16/09/2016 to 21/04/2017, collected from the Vndirect homepage:

Link GAS price

The dataset is split into a train set and a test set at a 9:1 ratio. The train set holds the first 90% of the daily observations and the test set holds the final 10%. The train set is used to fit the model and the test set is used to check the model's accuracy.
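
A minimal sketch of such a chronological 9:1 split, written in Python with the standard library only. The function name `chronological_split` and the placeholder series are assumptions made for illustration; they are not the code that produced the output below.

```python
def chronological_split(series, train_parts=9, total_parts=10):
    """Split a time-ordered list into (train, test) without shuffling.

    Integer arithmetic avoids floating-point surprises when computing
    the cut point.
    """
    n_train = len(series) * train_parts // total_parts
    return series[:n_train], series[n_train:]


# Placeholder data: 150 integers standing in for 150 daily observations.
observations = list(range(150))
train, test = chronological_split(observations)

print(len(train), len(test))
```

With 150 observations this gives 135 training points and 15 test points, matching the sample sizes reported in the output. The split keeps time order, so every test observation comes after every training observation.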

## 'data.frame':    150 obs. of  7 variables:
##  $ DATE  : Factor w/ 325 levels "01/02/2016","01/03/2016",..: 256 221 209 197 187 177 142 130 118 109 ...
##  $ CLOSE : num  54.1 53.6 54 55 55.1 54 53.9 54.7 55.4 56.8 ...
##  $ TICKER: Factor w/ 1 level "GAS": 1 1 1 1 1 1 1 1 1 1 ...
##  $ OPEN  : num  53.6 53.9 54.4 55.2 54 54.2 54.9 56.4 57 57 ...
##  $ HIGH  : num  54.5 54.4 54.4 55.4 55.2 55.1 55 56.4 57 57.5 ...
##  $ LOW   : num  53.4 53.6 54 54.7 54 53.7 53.5 54.3 55.4 56.3 ...
##  $ VOLUME: num  361260 256750 405730 247430 457560 ...
##            PRICE
## 2016-09-16  63.2
## 2016-09-19  64.0
## 2016-09-20  66.0
## 2016-09-21  67.6
## 2016-09-22  67.8
## 2016-09-23  68.6
## [1] "Last 6 observations of test data"
##            PRICE
## 2017-04-17  54.0
## 2017-04-18  55.1
## 2017-04-19  55.0
## 2017-04-20  54.0
## 2017-04-21  53.6
## 2017-04-24  54.1
## [1] "Sample size of test data"
## [1] 15
## [1] "Last 6 observations of train data"
##            PRICE
## 2017-03-24  54.5
## 2017-03-27  54.0
## 2017-03-28  53.6
## 2017-03-29  53.8
## 2017-03-30  54.4
## 2017-03-31  55.0
## [1] "Sample size of train data"
## [1] 135
## [1] "Actual daily return series"
##                   PRICE
## 2017-04-17  0.001853569
## 2017-04-18  0.020165670
## 2017-04-19 -0.001816531
## 2017-04-20 -0.018349139
## 2017-04-21 -0.007434978
## 2017-04-24  0.009285118
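
The return values printed above are consistent with daily log returns, r_t = ln(P_t / P_{t-1}), applied to the prices listed for 2017-04-17 through 2017-04-24. The following Python sketch recomputes them from those six prices; the helper name `log_returns` is hypothetical, and the formula is inferred from the numbers rather than taken from the author's code.

```python
import math


def log_returns(prices):
    """Return ln(P_t / P_{t-1}) for each consecutive pair of prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


# Prices for 2017-04-17 .. 2017-04-24, copied from the output above.
prices = [54.0, 55.1, 55.0, 54.0, 53.6, 54.1]
returns = log_returns(prices)

for r in returns:
    print(f"{r:.9f}")
```

Six prices yield five returns, which correspond to the rows dated 2017-04-18 through 2017-04-24.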

## 
##  Augmented Dickey-Fuller Test
## 
## data:  data.xts
## Dickey-Fuller = -5.3932, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary

## [1] 135
##                 PRICE
## 2017-03-31 0.01096903
## 
## Forecast method: ARIMA(1,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(1, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ma1     ma2
##       0.8481  -0.9424  0.0738
## s.e.     NaN      NaN     NaN
## 
## sigma^2 estimated as 0.0002981:  log likelihood = 353.78,  aic = -699.55
## 
## Error measures:
##                        ME       RMSE        MAE MPE MAPE      MASE
## Training set -0.001212412 0.01726464 0.01229523 NaN  Inf 0.6681695
##                      ACF1
## Training set -0.004937111
## 
## Forecasts:
##     Point Forecast       Lo 75      Hi 75
## 135   -0.001085834 -0.02094621 0.01877454
## 
## Forecast method: ARIMA(1,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(1, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ma1     ma2
##       0.5508  -0.6466  0.0288
## s.e.  0.8995   0.9030  0.1444
## 
## sigma^2 estimated as 0.0002958:  log likelihood = 356.94,  aic = -705.88
## 
## Error measures:
##                        ME       RMSE        MAE MPE MAPE      MASE
## Training set -0.001262739 0.01719789 0.01227886 NaN  Inf 0.6678114
##                      ACF1
## Training set -0.005516154
## 
## Forecasts:
##     Point Forecast       Lo 75    Hi 75
## 136  -3.658368e-05 -0.01982017 0.019747
## 
## Forecast method: ARIMA(1,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(1, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ma1     ma2
##       0.5397  -0.6349  0.0247
## s.e.  0.8474   0.8505  0.1390
## 
## sigma^2 estimated as 0.000294:  log likelihood = 359.99,  aic = -711.99
## 
## Error measures:
##                        ME       RMSE        MAE MPE MAPE      MASE
## Training set -0.001313255 0.01714582 0.01224962 NaN  Inf 0.6706839
##                      ACF1
## Training set -0.005878081
## 
## Forecasts:
##     Point Forecast       Lo 75      Hi 75
## 137   0.0005437096 -0.01917997 0.02026739
## 
## Forecast method: ARIMA(1,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(1, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ma1     ma2
##       0.5214  -0.6181  0.0243
## s.e.  0.9140   0.9174  0.1443
## 
## sigma^2 estimated as 0.0002924:  log likelihood = 363.02,  aic = -718.03
## 
## Error measures:
##                       ME       RMSE        MAE MPE MAPE      MASE
## Training set -0.00123776 0.01709888 0.01221873 NaN  Inf 0.6694666
##                      ACF1
## Training set -0.005750467
## 
## Forecasts:
##     Point Forecast       Lo 75      Hi 75
## 138  -0.0007114348 -0.02038112 0.01895825
## 
## Forecast method: ARIMA(1,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(1, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ma1     ma2
##       0.8501  -0.9379  0.0660
## s.e.     NaN      NaN  0.0733
## 
## sigma^2 estimated as 0.0003043:  log likelihood = 362.91,  aic = -717.82
## 
## Error measures:
##                         ME       RMSE        MAE MPE MAPE     MASE
## Training set -0.0009030651 0.01744384 0.01241228 NaN  Inf 0.675896
##                      ACF1
## Training set -0.002380377
## 
## Forecasts:
##     Point Forecast       Lo 75      Hi 75
## 139   -0.003844981 -0.02391149 0.01622153
## 
## Forecast method: ARIMA(1,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(1, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##        ar1      ma1     ma2
##       0.85  -0.9393  0.0674
## s.e.   NaN      NaN  0.0689
## 
## sigma^2 estimated as 0.0003021:  log likelihood = 366.04,  aic = -724.08
## 
## Error measures:
##                         ME       RMSE        MAE MPE MAPE      MASE
## Training set -0.0009078543 0.01738136 0.01233241 NaN  Inf 0.6637528
##                      ACF1
## Training set -0.001409079
## 
## Forecasts:
##     Point Forecast       Lo 75      Hi 75
## 140  -0.0002707824 -0.02026542 0.01972385
## 
## Forecast method: ARIMA(1,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(1, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1     ma1     ma2
##       0.8496  -0.939  0.0669
## s.e.     NaN     NaN  0.0678
## 
## sigma^2 estimated as 3e-04:  log likelihood = 369.17,  aic = -730.34
## 
## Error measures:
##                         ME       RMSE        MAE MPE MAPE      MASE
## Training set -0.0009162767 0.01731961 0.01225645 NaN  Inf 0.6635388
##                      ACF1
## Training set -0.001384932
## 
## Forecasts:
##     Point Forecast       Lo 75      Hi 75
## 141  -0.0002065136 -0.02013011 0.01971709
## 
## Forecast method: ARIMA(1,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(1, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ma1     ma2
##       0.4996  -0.5888  0.0041
## s.e.  0.7122   0.7133  0.1226
## 
## sigma^2 estimated as 0.0003017:  log likelihood = 371.41,  aic = -734.82
## 
## Error measures:
##                        ME       RMSE        MAE MPE MAPE      MASE
## Training set -0.001101812 0.01736854 0.01239993 NaN  Inf 0.6700822
##                      ACF1
## Training set -0.003034996
## 
## Forecasts:
##     Point Forecast       Lo 75      Hi 75
## 142    0.001696384 -0.01828351 0.02167628
## 
## Forecast method: ARIMA(1,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(1, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ma1     ma2
##       0.5340  -0.6153  0.0005
## s.e.  0.6508   0.6522  0.1150
## 
## sigma^2 estimated as 0.000301:  log likelihood = 374.2,  aic = -740.41
## 
## Error measures:
##                        ME       RMSE        MAE MPE MAPE      MASE
## Training set -0.001200022 0.01734874 0.01242842 NaN  Inf 0.6732381
##                      ACF1
## Training set -0.003979315
## 
## Forecasts:
##     Point Forecast       Lo 75      Hi 75
## 143     0.00188112 -0.01807599 0.02183823
## 
## Forecast method: ARIMA(1,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(1, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ma1     ma2
##       0.5621  -0.6373  0.0035
## s.e.  0.6721   0.6740  0.1091
## 
## sigma^2 estimated as 0.0003008:  log likelihood = 376.89,  aic = -745.78
## 
## Error measures:
##                        ME       RMSE        MAE MPE MAPE      MASE
## Training set -0.001294965 0.01734312 0.01245064 NaN  Inf 0.6786989
##                      ACF1
## Training set -0.005211916
## 
## Forecasts:
##     Point Forecast       Lo 75     Hi 75
## 144    0.002130153 -0.01782049 0.0220808
## 
## Forecast method: ARIMA(1,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(1, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ma1     ma2
##       0.5692  -0.6443  0.0043
## s.e.  0.6645   0.6668  0.1091
## 
## sigma^2 estimated as 0.0002987:  log likelihood = 380.03,  aic = -752.05
## 
## Error measures:
##                        ME       RMSE        MAE MPE MAPE      MASE
## Training set -0.001288557 0.01728277 0.01236611 NaN  Inf 0.6745429
##                      ACF1
## Training set -0.005613988
## 
## Forecasts:
##     Point Forecast       Lo 75      Hi 75
## 145    0.001153111 -0.01872811 0.02103433
## 
## Forecast method: ARIMA(1,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(1, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ma1      ma2
##       0.5617  -0.6376  -0.0042
## s.e.  0.5548   0.5567   0.1055
## 
## sigma^2 estimated as 0.0002991:  log likelihood = 382.57,  aic = -757.13
## 
## Error measures:
##                        ME       RMSE        MAE MPE MAPE      MASE
## Training set -0.001175771 0.01729415 0.01243551 NaN  Inf 0.6783339
##                      ACF1
## Training set -0.004775439
## 
## Forecasts:
##     Point Forecast       Lo 75      Hi 75
## 146  -0.0006327303 -0.02052704 0.01926158
## 
## Forecast method: ARIMA(1,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(1, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ma1      ma2
##       0.5614  -0.6379  -0.0036
## s.e.  0.5532   0.5554   0.1052
## 
## sigma^2 estimated as 0.000297:  log likelihood = 385.7,  aic = -763.41
## 
## Error measures:
##                        ME      RMSE        MAE MPE MAPE      MASE
## Training set -0.001175675 0.0172351 0.01235746 NaN  Inf 0.6731522
##                      ACF1
## Training set -0.004116331
## 
## Forecasts:
##     Point Forecast       Lo 75      Hi 75
## 147  -0.0003423334 -0.02016871 0.01948405
## 
## Forecast method: ARIMA(1,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(1, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ma1      ma2
##       0.5272  -0.6037  -0.0096
## s.e.  0.5792   0.5811   0.1077
## 
## sigma^2 estimated as 0.0002972:  log likelihood = 388.3,  aic = -768.61
## 
## Error measures:
##                        ME       RMSE        MAE MPE MAPE      MASE
## Training set -0.001289639 0.01724008 0.01239958 NaN  Inf 0.6759069
##                      ACF1
## Training set -0.004333645
## 
## Forecasts:
##     Point Forecast       Lo 75     Hi 75
## 148    0.001155483 -0.01867664 0.0209876
## 
## Forecast method: ARIMA(1,0,2) with zero mean
## 
## Model Information:
## 
## Call:
## arima(x = stock_train, order = c(1, 0, 2), include.mean = FALSE)
## 
## Coefficients:
##          ar1      ma1      ma2
##       0.5243  -0.5974  -0.0123
## s.e.  0.5709   0.5721   0.1052
## 
## sigma^2 estimated as 0.0002957:  log likelihood = 391.32,  aic = -774.65
## 
## Error measures:
##                        ME       RMSE        MAE MPE MAPE      MASE
## Training set -0.001334693 0.01719615 0.01237496 NaN  Inf 0.6764288
##                      ACF1
## Training set -0.005046805
## 
## Forecasts:
##     Point Forecast       Lo 75      Hi 75
## 149     0.00141473 -0.01836685 0.02119631
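
The `Lo 75` and `Hi 75` columns above appear to be normal-theory 75% prediction intervals: the point forecast plus or minus z times the one-step standard error, where z is the standard normal quantile at 0.875 and the standard error is the square root of the reported `sigma^2`. The Python sketch below checks this reading against the last forecast (row 149). The function name `normal_interval` is an assumption for illustration, and the inputs are the rounded figures from the printed output, so agreement is only approximate.

```python
import math
from statistics import NormalDist


def normal_interval(point, sigma2, level=0.75):
    """Two-sided normal interval around a one-step-ahead point forecast."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.1503 for level=0.75
    half_width = z * math.sqrt(sigma2)
    return point - half_width, point + half_width


# Point forecast and sigma^2 copied from the final model summary above.
lo, hi = normal_interval(point=0.00141473, sigma2=0.0002957)
print(f"{lo:.5f} {hi:.5f}")
```

Every interval spans zero, so none of the point forecasts is distinguishable from a zero return at this level.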

##            Actual_series      FORECAST Accuracy
## 2017-04-03  -0.005469476 -1.085834e-03        1
## 2017-04-04  -0.007339482 -3.658368e-05        1
## 2017-04-05   0.009165967  5.437096e-04        1
## 2017-04-07   0.042863704 -7.114348e-04        0
## 2017-04-10  -0.005258557 -3.844981e-03        1
## 2017-04-11  -0.001759015 -2.707824e-04        1
## 2017-04-12  -0.024956732 -2.065136e-04        1
## 2017-04-13  -0.012715884  1.696384e-03        0
## 2017-04-14  -0.014733232  1.881120e-03        0
## 2017-04-17   0.001853569  2.130153e-03        1
## 2017-04-18   0.020165670  1.153111e-03        1
## 2017-04-19  -0.001816531 -6.327303e-04        1
## 2017-04-20  -0.018349139 -3.423334e-04        1
## 2017-04-21  -0.007434978  1.155483e-03        0
## 2017-04-24   0.009285118  1.414730e-03        1
## [1] 73.33333

The model therefore predicted the direction of the GAS return (up or down) correctly on 11 of the 15 test days, or 73.3% of cases.
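
The accuracy figure can be reproduced from the table above. This Python sketch assumes a forecast counts as correct when it has the same sign as the actual return; that rule matches every row of the `Accuracy` column, but the author's exact rule (for instance, how a zero return would be scored) is not shown. The name `directional_accuracy` is hypothetical.

```python
# Values copied from the Actual_series and FORECAST columns above.
actual = [-0.005469476, -0.007339482, 0.009165967, 0.042863704, -0.005258557,
          -0.001759015, -0.024956732, -0.012715884, -0.014733232, 0.001853569,
          0.020165670, -0.001816531, -0.018349139, -0.007434978, 0.009285118]
forecast = [-1.085834e-03, -3.658368e-05, 5.437096e-04, -7.114348e-04, -3.844981e-03,
            -2.707824e-04, -2.065136e-04, 1.696384e-03, 1.881120e-03, 2.130153e-03,
            1.153111e-03, -6.327303e-04, -3.423334e-04, 1.155483e-03, 1.414730e-03]


def directional_accuracy(actual, forecast):
    """Return per-day hit flags and the percentage of same-sign forecasts."""
    hits = [1 if a * f > 0 else 0 for a, f in zip(actual, forecast)]
    return hits, 100.0 * sum(hits) / len(hits)


hits, pct = directional_accuracy(actual, forecast)
print(hits)
print(round(pct, 5))
```

The hit rate rests on only 15 observations, and the forecasts themselves are very close to zero, so the 73.3% figure should be read with that small sample in mind.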