Individuals can grow their assets in several ways, for example through stocks, deposits, or bonds. Bonds are hard for individuals to access, because the many bond types require specialized information and a large amount of capital. Deposit rates have fallen since the 2008 financial crisis, as central banks pursued aggressive demand-stimulus policies and kept interest rates low, so funds accumulate much more slowly in deposits than in other investments.
In this situation, individuals who want to grow their assets are pushed toward risky assets such as stocks. Stock investing strategies are either passive or active. A passive strategy holds a portfolio that closely tracks an index and targets the market return; an active strategy holds a portfolio that deviates from the market portfolio in pursuit of higher returns.
An active strategy organizes the portfolio to generate profits that exceed market returns. However, because individuals lack information and often miscalculate returns and risks, it is not easy for them to increase profits with an active strategy, so in practice they have been limited to passive portfolios. With the advent of ETFs, however, individuals can build an active portfolio by trading the market index itself.
We therefore decided to build a time-series model that predicts the trend of the KOSPI index from macroeconomic indicators that are easy to obtain on the internet. The model provides a criterion for predicting the direction of ETF prices, so that individuals can make more reasoned investment decisions rather than rely on uncertain intuition or experience.
Many studies have examined correlations between the KOSPI index and other economic variables. In ‘An Empirical Analysis on the Relationship among S&P500 Index, KOSPI and ₩/$ Exchange Rate’, Kim (2000) analyzed the interrelationship among financial market price variables using the ₩/$ exchange rate, the Korea Composite Stock Price Index, and the S&P500 index. Interest in this topic has grown because a currency crisis may occur again in the future. He concluded that the interrelation of financial markets had increased since September 1997 because of financial market disturbances. Regarding causal relations, he found bilateral causality between the exchange rate and the KOSPI, with the KOSPI mainly Granger-causing the exchange rate. The S&P500 index also appeared to be a leading variable that could explain the KOSPI since April 1998.
In ‘On the relation between oil price and the domestic stock price’, Jung (2012) inferred that the positive relation between the oil price and the domestic stock price is strengthening because of oil demand from emerging countries. Fluctuations in the oil price have had an enormous influence on the world economy. In Korea, the influence of the oil price has grown sharply, especially since the early 2000s, when financial regulation was relaxed. The paper used a unit root test, correlation analysis, an error correction model, and the Granger causality test to show the relation between the price of West Texas Intermediate (WTI) and the Korea Composite Stock Price Index (KOSPI).
In ‘Volatility Spillover Effect from the Shanghai Stock Market to the Korean Stock Market’, Jung and Ryu (2013) reported that a volatility spillover effect between the Shanghai and Korean stock markets is clearly observed under the ARMA-GARCH model. The data are daily returns of the Dow Jones index, the Shanghai SE A index, and the KOSPI index. In particular, since the 2008 financial crisis the Korean stock market has had a causal relationship with the Chinese stock market, and both the shock response and the explanatory power have risen.
In ‘Prediction of the industrial stock price index using domestic and foreign economic indices’, Choi (2012) predicted the rise or fall of eleven major industrial stock price indices. Each industrial stock price index matters because it helps explain the movement of the stock market as a whole, and the explanatory variables used there could also be factors of the KOSPI index. The input variables include not only domestic economic indices but also foreign economic indices from the U.S.A., Japan, China, and Europe, which affect the Korean stock market.
The literature review showed that domestic macro indicators and overseas stock market data should be used to explain the KOSPI index. Because the KOSPI index is heavily affected by the export market, indicators of the U.S. domestic economy, which is an important driver of the global economy, should also be used as variables. Using U.S. domestic macro indices also lets us represent the supply chain behind the KOSPI as explanatory variables: changes in U.S. retail lead to changes in wholesale, which in turn lead to changes in manufacturing.
We build a predictive model that takes into account the movement of the KOSPI index time series. We first use a seasonal decomposition model and an ARIMA model to analyze the KOSPI index and forecast the next five months. The external variables identified in the preceding studies are then examined with the Granger causality test to find out whether they precede the KOSPI index.
Variables found to precede the KOSPI in the Granger causality test are used as exogenous variables in an ARIMAX model to produce more accurate forecasts. Our goal is a model with high predictive accuracy, so we evaluate each model by comparing its five-month forecast for January to May 2018 with the actual data and prefer the model with the smallest error.
Data from the Federal Reserve Bank of St. Louis, the Bank of Korea, and the National Statistical Office are used to construct the predictive models. Because macroeconomic conditions and financial markets differ before and after the 2008 financial crisis, the in-sample period of the KOSPI index starts in 2011. Accordingly, the macroeconomic indicators used as preceding variables cover 2010 to 2017. All data are monthly.
KOSPI stands for Korea Composite Stock Price Index. The index tracks the companies listed on the stock market of the Korea Exchange by comparing their total market capitalization at the time of calculation with their total market capitalization at the base date. The base, used as the denominator, is the total market capitalization on January 4, 1980; the market capitalization at the time of calculation is the numerator.
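Written as a formula, the definition reads as follows. The scaling by 100 is the index's standard base value and is an addition here, since the text above does not state it:

```latex
\mathrm{KOSPI}_t \;=\; \frac{\text{total market capitalization at time } t}{\text{total market capitalization on January 4, 1980}} \times 100
```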
The S&P 500 is an index of 500 large companies, most of them U.S. companies. It is used as an external variable to reflect the characteristics of international financial markets.
The New York Stock Exchange is a stock exchange in New York City in the United States, where stocks of the world’s major companies are listed. The NYSE Stock Index is an index that reflects the value of all stock transactions on the exchange. It was used as an external variable to reflect the characteristics of international financial markets.
West Texas Intermediate (WTI) is crude oil produced in western Texas in the United States. It has the greatest influence on international oil prices.
Dubai crude is oil extracted in Dubai, United Arab Emirates. It is used as a price benchmark because it is one of the few crude oils produced in the Persian Gulf that is available for immediate trade.
The industrial production index measures how much is produced relative to a base year. As Korea exports many medium and heavy goods to the United States, a high U.S. industrial production index is expected to have a positive impact on the KOSPI with a time lag.
The unemployment rate is the proportion of the economically active population that is without a job, where the economically active population is the sum of those currently employed and those actively seeking employment. High unemployment soon lowers purchasing power, which in turn lowers industrial production because inventories build up, and this is expected to affect the KOSPI negatively through the supply chain.
The Business Survey Index (BSI) is compiled by surveying changes in entrepreneurs' judgments, forecasts, and plans regarding business trends. Because it captures the expectations of market participants, it is expected to precede the KOSPI.
The exchange rate is the rate at which one currency is exchanged for another. The won-dollar exchange rate is the price of one dollar in won. Fluctuations in the exchange rate could be used to predict changes in the KOSPI through changes in exports.
The Shanghai Stock Exchange index is the weighted average of all A and B share prices on the Shanghai Stock Exchange. It is used as an external variable to reflect the characteristics of international financial markets.
The manufacturing industry’s average operation rate is the ratio of actual production to production capacity. It can be used to judge the state of the economy.
This index quantifies consumers' expectations for the future economy through a survey. It is expected to act as a positive shock, because stock prices are determined on the basis of expectations.
Inventories can be used to predict future economic conditions, because high inventories can lead companies to lower future production. This in turn helps predict the KOSPI.
Construction orders are the value of construction contracts signed by contractors. Because construction is costly and depends heavily on the expected future economic situation, fluctuations in orders can serve as a forecast of economic fluctuations.
Personal consumption expenditures (PCE), or the PCE Index, measures price changes in consumer goods and services. Expenditures included in the index are actual U.S. household expenditures. Data that pertains to services, durables and non-durables are measured by the index.
Personal Consumption Expenditures: Durable Goods (PCEDG) measures price changes in durable goods. Unlike PCE, it measures price changes only related to durable goods.
Business inventories is an economic figure that tracks the dollar amount of inventories held by retailers, wholesalers and manufacturers across the nation. Business inventories is the short version term for “Manufacturing and Trade Inventories and Sales,” a monthly report released by the U.S. Department of Commerce.
Manufacturer inventories are the amount of unsold inventory still in the possession of manufacturers. If manufacturer inventories become elevated, that typically means that wholesalers are purchasing less because demand for the products has decreased. Manufacturer inventory figures therefore show how the consumer market is performing.
Retailer inventories are the amount of unsold inventory still in the possession of retailers. If retailer inventories become elevated, that typically means that consumers are purchasing less because demand for the products has decreased. Retailer inventory figures therefore show how the consumer market is performing.
Wholesale inventories are the amount of unsold inventory still in the possession of wholesalers. If wholesale inventories become elevated, that typically means that retailers are purchasing less because demand for the products has decreased. Wholesale inventory figures therefore show how the consumer market is performing.
To remove the seasonal effect from the time series, we fit a regression model (additive decomposition model). An additive decomposition is appropriate because the variation around the trend does not change with the level of the series. It also gives a cleaner view of the trend.
anova(additive)
## Analysis of Variance Table
##
## Response: kospi_ts_in[, 21]
## Df Sum Sq Mean Sq F value Pr(>F)
## t 1 551794 551794 30.6733 4.853e-07 ***
## dummy 11 48650 4423 0.2459 0.9929
## Residuals 71 1277248 17989
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(additive)
##
## Call:
## tslm(formula = kospi_ts_in[, 21] ~ t + dummy)
##
## Residuals:
## Min 1Q Median 3Q Max
## -159.73 -86.24 -28.18 48.97 348.63
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1867.97107 58.53678 31.911 < 2e-16 ***
## t 3.36066 0.60976 5.511 5.4e-07 ***
## dummyJan -12.67272 72.00569 -0.176 0.861
## dummyFeb 0.05662 71.95146 0.001 0.999
## dummyMar 45.01310 71.90235 0.626 0.533
## dummyApr 58.58101 71.85838 0.815 0.418
## dummyMay 55.31749 71.81956 0.770 0.444
## dummyJun 27.53397 71.78591 0.384 0.702
## dummyJul 52.43617 71.75741 0.731 0.467
## dummyAug 1.22551 71.73409 0.017 0.986
## dummySep 7.05913 71.71595 0.098 0.922
## dummyOct 31.25275 71.70299 0.436 0.664
## dummyNov 10.91781 71.69521 0.152 0.879
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 134.1 on 71 degrees of freedom
## Multiple R-squared: 0.3198, Adjusted R-squared: 0.2048
## F-statistic: 2.781 on 12 and 71 DF, p-value: 0.003687
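Since the p-value of the overall F test (0.003687) is less than 0.05, we reject H0 that all coefficients are zero, so the model is valid and can be used. Note, however, that only the trend term is individually significant; the monthly dummies are not (p-value 0.9929 in the ANOVA table).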
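A regression of this kind can be sketched in base R as follows. This is an illustration on simulated data, not the code behind the output above: it uses `lm` rather than `tslm`, and the names `y`, `t`, and `month` as well as the simulated trend and seasonal pattern are assumptions.

```r
# Sketch on simulated monthly data; all names and numbers are hypothetical.
set.seed(42)
n <- 84                                   # seven years of monthly observations
t <- seq_len(n)                           # linear time trend
month <- factor(month.abb[(t - 1) %% 12 + 1], levels = month.abb)
y <- 1900 + 3 * t + 30 * sin(2 * pi * t / 12) + rnorm(n, sd = 50)

# Additive model: level = trend + monthly effect + noise
fit <- lm(y ~ t + month)

anova(fit)                                # sequential F tests: trend, then month dummies
fstat <- summary(fit)$fstatistic          # overall F statistic and its degrees of freedom
p_overall <- pf(fstat[1], fstat[2], fstat[3], lower.tail = FALSE)

# Seasonally adjusted series: subtract only the estimated monthly effects
X <- model.matrix(fit)
month_cols <- grep("^month", colnames(X))
adjusted <- y - drop(X[, month_cols] %*% coef(fit)[month_cols])
```

With 84 observations, one trend term, and 11 monthly dummies, the model has 71 residual degrees of freedom, the same as in the output above.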
For a quick visual overview, the figure shows the STL decomposition (Seasonal-Trend decomposition procedure based on Loess). STL is an iterative, non-parametric smoothing algorithm that estimates the trend and the seasonal effect simultaneously.
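An STL decomposition can be produced with the `stl` function from base R's `stats` package. The sketch below runs on a simulated monthly series; the series and the choice `s.window = "periodic"` are assumptions, since the settings used for the figure are not given.

```r
# Sketch on a simulated monthly series; data and settings are hypothetical.
set.seed(7)
n <- 84
t <- seq_len(n)
y <- ts(1900 + 3 * t + 30 * sin(2 * pi * t / 12) + rnorm(n, sd = 40),
        start = c(2011, 1), frequency = 12)

# s.window = "periodic" forces the same seasonal pattern in every year
dec <- stl(y, s.window = "periodic")
components <- dec$time.series          # columns: seasonal, trend, remainder

# The three components add back up to the original series
recon <- rowSums(components)
max_gap <- max(abs(recon - as.numeric(y)))

# plot(dec) would draw the data, seasonal, trend, and remainder panels
```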
Before building the model, it is important to check whether the variables are stationary. We use the Augmented Dickey-Fuller (ADF) test for this, applied to each series in levels, after one difference, and, where needed, after a second difference.
adf_call
## $`no difference`
## MNFCTRIMSA RETAILMSA WHLSLRMSA BUSINV PCE PCEDG
## 0.7818732 0.9419898 0.6805399 0.6767256 0.7210449 0.2109779
## S&P500 NYSE WTI DUBAI USAIPI USAUR
## 0.6187214 0.6055879 0.5987504 0.7332286 0.6309507 0.9681455
## BSIKOR EXPKOR ER SHAINDEX KORIUI KORCSI
## 0.0100000 0.0100000 0.5815716 0.3127968 0.0100000 0.2913156
## KORICI KORCO KOSPI
## 0.1796491 0.6795038 0.7482306
##
## $`1st difference`
## MNFCTRIMSA RETAILMSA WHLSLRMSA BUSINV PCE PCEDG
## 0.45917274 0.01000000 0.01000000 0.01049194 0.06406458 0.01000000
## S&P500 NYSE WTI DUBAI USAIPI USAUR
## 0.07085825 0.05077259 0.01000000 0.01000000 0.28312456 0.01000000
## ER SHAINDEX KORCSI KORICI KORCO KOSPI
## 0.01000000 0.05689227 0.02129313 0.19564610 0.02481853 0.01000000
##
## $`2nd difference`
## MNFCTRIMSA PCE S&P500 NYSE USAIPI SHAINDEX
## 0.01 0.01 0.01 0.01 0.01 0.01
## KORICI
## 0.01
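The object `adf_call` is not defined in the text. The idea behind the three blocks above (test, difference, test again) can be sketched in base R as below. This is not the authors' code: the helper `adf_stat` is hypothetical, it returns the test statistic rather than a p-value, and the critical value of about -3.45 (5% level, constant and trend, roughly 100 observations) is an approximation taken from standard Dickey-Fuller tables.

```r
# Dickey-Fuller-type regression with constant, trend, and k >= 1 lagged differences:
#   diff(y)_t = a + b*t + g*y_{t-1} + sum_i d_i*diff(y)_{t-i} + e_t
# The t statistic on g is the test statistic; strongly negative values
# speak against a unit root.
adf_stat <- function(y, k = 1) {
  dy <- diff(y)
  E <- embed(dy, k + 1)                 # column 1: diff(y)_t, other columns: its lags
  y_lag <- y[(k + 1):(length(y) - 1)]   # level y_{t-1}, aligned with column 1
  trend <- seq_along(y_lag)
  lags <- E[, -1, drop = FALSE]
  fit <- lm(E[, 1] ~ trend + y_lag + lags)
  unname(coef(summary(fit))["y_lag", "t value"])
}

set.seed(123)
level <- cumsum(rnorm(200))             # simulated random walk (has a unit root)
crit_5pct <- -3.45                      # approximate critical value (assumption)

stat_level <- adf_stat(level)
stat_diff  <- adf_stat(diff(level))

# Difference until the statistic falls below the critical value (at most twice)
d <- 0
z <- level
while (adf_stat(z) > crit_5pct && d < 2) {
  z <- diff(z)
  d <- d + 1
}
d                                        # number of differences applied
```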
The partial autocorrelation function (PACF) is also used to select the autoregressive lag of the dependent variable (the KOSPI index).
pacf(kospi_ts_in[, 21], lag.max = 84) # AR lag term: 1
The PACF plot of the KOSPI index implies that a lag of 1 should be used for the autoregressive part (AR = 1).
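Reading the lag off a PACF plot amounts to checking which partial autocorrelations fall outside the approximate 95% band of ±1.96/√n. A minimal sketch on a simulated AR(1) series (the data are an assumption, not the KOSPI series):

```r
# Sketch on simulated data: which PACF lags exceed the approximate 95% band?
set.seed(11)
x <- arima.sim(model = list(ar = 0.7), n = 300)   # simulated AR(1) series

pa <- pacf(x, lag.max = 24, plot = FALSE)
bound <- qnorm(0.975) / sqrt(length(x))           # approximate 95% band
sig_lags <- which(abs(as.numeric(pa$acf)) > bound) # PACF starts at lag 1, so index = lag
sig_lags
```

For an AR(1) process, lag 1 stands out clearly; occasional isolated higher lags can cross the band by chance.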
Lastly, we consider the moving-average term by applying the PACF to the residuals.
pacf(AR$residuals, lag.max = 84)
For the fitted model, the PACF plot of the residuals suggests a residual lag of 3. However, we could not include only the third MA lag, because the arima function as we used it does not support this. Rather than MA lag 3, we therefore choose an MA order of 1 (MA = 1).
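A residual check of this kind can be sketched as follows, on simulated data. Two points are assumptions rather than the procedure used above: the sketch inspects the residual ACF (the MA order is often read from the ACF rather than the PACF), and it adds a Ljung-Box test via `Box.test` as a summary of remaining autocorrelation.

```r
# Sketch on simulated data: residual diagnostics for an AR-only model.
set.seed(21)
y <- cumsum(as.numeric(arima.sim(model = list(ar = 0.5, ma = 0.6), n = 200)))

ar_fit <- arima(y, order = c(1, 1, 0), method = "ML")   # AR(1) on the differenced series
res <- ar_fit$residuals

ra <- acf(res, lag.max = 24, plot = FALSE)
bound <- qnorm(0.975) / sqrt(length(res))
sig_lags <- which(abs(ra$acf[-1]) > bound)   # drop lag 0, which is always 1

# Ljung-Box test; fitdf = 1 accounts for the one estimated AR coefficient
lb <- Box.test(res, lag = 12, type = "Ljung-Box", fitdf = 1)

# Adding one MA term gives the ARIMA(1,1,1) form used in the text
arma_fit <- arima(y, order = c(1, 1, 1), method = "ML")
```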
Based on these results, we forecast five months ahead. The summary of the ARIMA model and the forecast plot are as follows:
summary(arma)
##
## Call:
## arima(x = kospi_ts_in[, 21], order = c(1, 1, 1))
##
## Coefficients:
## ar1 ma1
## -0.6537 0.8118
## s.e. 0.1746 0.1242
##
## sigma^2 estimated as 4446: log likelihood = -466.42, aic = 938.85
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 4.404228 66.28118 50.13328 0.1449537 2.488951 0.9619796
## ACF1
## Training set -0.1076692
plot(pred_3)
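The objects `arma` and `pred_3` are created outside the text shown here. A minimal base-R sketch of fitting an ARIMA(1,1,1) model and forecasting five steps ahead, on a simulated series with a five-observation hold-out (all data and names are assumptions), could look like this:

```r
# Sketch on simulated data: fit ARIMA(1,1,1), forecast 5 steps, compare with hold-out.
set.seed(5)
sim <- 2000 + as.numeric(arima.sim(list(order = c(1, 1, 1), ar = 0.5, ma = 0.4),
                                   n = 88, sd = 20))
h <- 5
n <- length(sim)
train  <- sim[1:(n - h)]
actual <- sim[(n - h + 1):n]

fit <- arima(train, order = c(1, 1, 1), method = "ML")
fc <- predict(fit, n.ahead = h)            # list with $pred and $se

lower <- fc$pred - 1.96 * fc$se            # approximate 95% interval
upper <- fc$pred + 1.96 * fc$se
rmse <- sqrt(mean((actual - fc$pred)^2))   # out-of-sample error over the hold-out
```

The standard errors in `fc$se` grow with the horizon, which is why the interval widens over the five months.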
The Granger causality test can be used to determine whether one series helps predict another, that is, whether a predictive (Granger-causal) relationship exists between two phenomena. A prerequisite of the test is that the variables are stationary. Each variable is therefore differenced until the ADF test indicates stationarity before it enters the test.
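The test compares a regression of a series on its own lags with one that also includes lags of the other series. The sketch below implements this comparison in base R on simulated, stationary data; the helper `granger_p` is hypothetical and is not the function that produced the tables in this section.

```r
# F test of H0: lags 1..p of x do not help predict y beyond y's own lags.
granger_p <- function(y, x, p = 1) {
  Y <- embed(y, p + 1)                       # column 1: y_t, other columns: lags of y
  x_lags <- embed(x, p + 1)[, -1, drop = FALSE]   # lags 1..p of x
  y_t <- Y[, 1]
  y_lags <- Y[, -1, drop = FALSE]
  restricted   <- lm(y_t ~ y_lags)
  unrestricted <- lm(y_t ~ y_lags + x_lags)
  anova(restricted, unrestricted)$"Pr(>F)"[2]
}

# Simulated example in which x leads y by one period
set.seed(99)
n <- 200
x <- rnorm(n)
y <- numeric(n)
for (t in 2:n) y[t] <- 0.3 * y[t - 1] + 0.8 * x[t - 1] + rnorm(1)

p_x_to_y <- sapply(1:5, function(p) granger_p(y, x, p))   # does x precede y?
p_y_to_x <- sapply(1:5, function(p) granger_p(x, y, p))   # reverse direction
```

Running the test in both directions for lags 1 to 5 gives two vectors of p-values, one per direction, analogous to one column of each table.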
# Granger causality result.1 (KOSPI index – variables)
gr_kospi_left
## BSIKOR EXPKOR KORIUI RETAILMSA WHLSLRMSA BUSINV
## lag 1 0.159015189 0.053498656 0.3704726 0.9585171 0.9312462 0.9062418
## lag 2 0.194896837 0.019643539 0.2473957 0.8977835 0.9797929 0.9739991
## lag 3 0.106303117 0.026935854 0.2455580 0.6780033 0.3923153 0.1636165
## lag 4 0.088651951 0.014779661 0.1078193 0.7036034 0.7335225 0.4321975
## lag 5 0.004965171 0.003985682 0.1070820 0.3394824 0.8898129 0.2570179
## PCEDG WTI DUBAI USAUR ER KORCSI
## lag 1 0.3436165 0.5161380 0.3093892 0.8959201 0.6276243 0.2337846
## lag 2 0.2230996 0.6292124 0.6744112 0.6290912 0.8624447 0.8631304
## lag 3 0.1441445 0.8263080 0.7579560 0.5204957 0.7557536 0.5453294
## lag 4 0.3125886 0.8325365 0.4860917 0.6102082 0.7304536 0.6135958
## lag 5 0.3698513 0.8165156 0.8334667 0.7863940 0.9444382 0.6907208
## KORCO MNFCTRIMSA PCE S.P500 NYSE USAIPI
## lag 1 0.130707009 0.90674713 0.3896571 0.63930156 0.7719341 0.1081059
## lag 2 0.144643460 0.55744305 0.5235727 0.82589876 0.8906685 0.1112847
## lag 3 0.002477298 0.07721048 0.7921056 0.87791257 0.9846999 0.1646119
## lag 4 0.010379965 0.31398910 0.8155505 0.22494148 0.5422895 0.6425782
## lag 5 0.006071828 0.45198436 0.8750960 0.09564338 0.1860921 0.7122792
## SHAINDEX KORICI
## lag 1 0.4436125 0.2265445
## lag 2 0.5536846 0.2922458
## lag 3 0.6180987 0.2676203
## lag 4 0.5147226 0.2765705
## lag 5 0.5831960 0.2126882
# Granger causality result.2 (variables - KOSPI index)
gr_kospi_right
## BSIKOR EXPKOR KORIUI RETAILMSA WHLSLRMSA BUSINV
## lag 1 0.9486639 0.6350734 0.4959502 0.8588528 0.1558999 0.2748382
## lag 2 0.4370073 0.7315992 0.6477178 0.8278895 0.1778580 0.1327741
## lag 3 0.7417922 0.8929694 0.2834482 0.7229133 0.3036500 0.1865662
## lag 4 0.5962533 0.9989209 0.1106116 0.7696953 0.5524945 0.5169282
## lag 5 0.9439050 0.9968824 0.2091842 0.5356595 0.7318937 0.5000529
## PCEDG WTI DUBAI USAUR ER KORCSI
## lag 1 0.5605221 0.47924607 0.0122208721 0.5381077 0.001741563 0.001732533
## lag 2 0.6157217 0.01632931 0.0003712187 0.6373134 0.015670965 0.060347966
## lag 3 0.7836913 0.06485276 0.0028370723 0.9728495 0.115356184 0.073452157
## lag 4 0.8991695 0.18998841 0.0123928039 0.8444085 0.212369573 0.324388998
## lag 5 0.9520011 0.43030504 0.0264124165 0.3928655 0.337539230 0.303330424
## KORCO MNFCTRIMSA PCE S&P500 NYSE USAIPI
## lag 1 0.5435515 0.8472353 0.1759410 0.517123992 0.37744246 0.67664882
## lag 2 0.5981868 0.4290516 0.2443390 0.182505533 0.01927136 0.38658458
## lag 3 0.9062149 0.7134574 0.8247302 0.269839505 0.03582117 0.01637553
## lag 4 0.9841358 0.9489551 0.8746126 0.126824101 0.03654447 0.07036547
## lag 5 0.9232008 0.8665618 0.5060785 0.006200091 0.00108437 0.23282111
## SHAINDEX KORICI
## lag 1 0.2658097 0.08952397
## lag 2 0.1295735 0.54714399
## lag 3 0.3236628 0.08547361
## lag 4 0.5406372 0.15244660
## lag 5 0.4300136 0.11751120
vari_select
## [1] "BSIKOR , lag 4" "BSIKOR , lag 5" "EXPKOR , lag 1"
## [4] "EXPKOR , lag 2" "EXPKOR , lag 3" "EXPKOR , lag 4"
## [7] "EXPKOR , lag 5" "KORCO , lag 3" "KORCO , lag 4"
## [10] "KORCO , lag 5" "MNFCTRIMSA , lag 3"
The Granger causality results let us speak of a directional (Granger-causal) relationship rather than mere correlation, because no variable is significant in both the left-hand test and the right-hand test. We do not use every significant lag of a selected variable; for each variable we keep only the lag with the lowest p-value. In addition, second-differenced variables lose too much data, so we do not use variables that require a second difference (MNFCTRIMSA).
gr_kospi_left <- data.frame(gr_kospi_left)
# For each candidate variable, print the lag with the smallest p-value
for (i in c("BSIKOR", "EXPKOR", "KORCO", "PCEDG")) {
  cat(i, '\n', which.min(gr_kospi_left[, i]), '\n')
}
## BSIKOR
## 5
## EXPKOR
## 5
## KORCO
## 3
## PCEDG
## 3
We additionally consider PCEDG. Its p-values are relatively low (minimum 0.144 at lag 3, although this is above the 0.1 level), and it requires only one difference. Accordingly, the four variables BSIKOR, EXPKOR, PCEDG, and KORCO are used as exogenous variables in the ARIMAX model (significance level 0.1).
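The selection rule (keep variable and lag pairs with p-value below 0.1, then keep the lag with the lowest p-value per variable) can be sketched on a small matrix. The values are three columns of the first Granger table, rounded to four decimals; the object names are hypothetical.

```r
# Rounded p-values from the first Granger table (rows: lag 1..5)
p <- cbind(BSIKOR = c(0.1590, 0.1949, 0.1063, 0.0887, 0.0050),
           EXPKOR = c(0.0535, 0.0196, 0.0269, 0.0148, 0.0040),
           PCEDG  = c(0.3436, 0.2231, 0.1441, 0.3126, 0.3699))
rownames(p) <- paste("lag", 1:5)

# All variable/lag pairs below the 0.1 significance level
hits <- which(p < 0.1, arr.ind = TRUE)
selected <- paste(colnames(p)[hits[, "col"]], ",", rownames(p)[hits[, "row"]])

# For each variable, the lag with the lowest p-value
best_lag <- apply(p, 2, which.min)
best_lag
```

For these three columns the best lags are 5, 5, and 3, matching the loop output above.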
ARIMAX models were fitted for every combination of the above exogenous variables. The RMSE values of the five-month forecasts are as follows:
unlist(rmse_out)
## bsikor                     2728.364
## korco                      3145.227
## pcedg                      3058.163
## expkor                     2626.654
## bsikor korco               2986.081
## bsikor pcedg               2747.911
## bsikor expkor              2774.780
## korco pcedg                3249.536
## korco expkor               2772.228
## pcedg expkor               2669.840
## bsikor korco pcedg         3328.223
## bsikor korco expkor        3132.538
## bsikor pcedg expkor        2788.025
## korco pcedg expkor         2878.551
## bsikor korco pcedg expkor  3211.851
These are the RMSE values of the ARIMAX models. The model with EXPKOR, the consumer expectation index, shows the lowest RMSE (2626.654). With the leading variable added, the RMSE is significantly lower than that of the ARIMA model. In other words, adding an explanatory leading variable improves the prediction of the KOSPI compared with a model that considers only the KOSPI's own movement.
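How `rmse_out` was built is not shown. One way to fit an ARIMAX model for every combination of four exogenous variables and collect the forecast RMSE is sketched below with base R's `arima` and its `xreg` argument. The data are simulated, the regressors `x1` to `x4` are hypothetical stand-ins for the lagged indicators, and the sketch assumes the regressor values for the forecast months are already known (as they are when the regressors enter with a sufficient lag).

```r
# Sketch on simulated data: ARIMAX(1,1,1) for all 15 regressor combinations.
set.seed(1)
h <- 5
base <- as.numeric(arima.sim(list(order = c(1, 1, 1), ar = 0.5, ma = 0.4),
                             n = 95, sd = 20))
n <- length(base)
X <- matrix(rnorm(n * 4), ncol = 4,
            dimnames = list(NULL, c("x1", "x2", "x3", "x4")))
y <- 2000 + base + 15 * X[, "x1"]        # only x1 truly affects the target

train <- 1:(n - h)
test  <- (n - h + 1):n
rmse <- function(actual, pred) sqrt(mean((actual - pred)^2))

# All non-empty subsets of the four candidate regressors
combos <- unlist(lapply(1:4, function(k) combn(colnames(X), k, simplify = FALSE)),
                 recursive = FALSE)

rmse_out <- sapply(combos, function(vars) {
  fit <- arima(y[train], order = c(1, 1, 1),
               xreg = X[train, vars, drop = FALSE], method = "ML")
  fc <- predict(fit, n.ahead = h, newxreg = X[test, vars, drop = FALSE])
  rmse(y[test], fc$pred)
})
names(rmse_out) <- sapply(combos, paste, collapse = " ")

names(which.min(rmse_out))               # combination with the lowest hold-out RMSE
```

With only five hold-out points, the ranking of combinations can be sensitive to noise, so small RMSE differences should be read with caution.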
As expected, the ARIMAX model turned out to be the most descriptive model. The model that added the consumer expectation index as an external variable had the lowest RMSE. This is consistent with the KOSPI being a stock price: individuals’ expectations appear to be reflected in it. The three-month lag, however, suggests that expectations take time to be realized in the price.
Contrary to our initial goal, including many foreign variables did not improve the predictive accuracy of the model. This seems to be due to market efficiency: macro indicators are public information and are already largely reflected in prices. In addition, the supply-chain variables did not yield meaningful results in the Granger test, contrary to expectations. The consumer purchasing index used as a proxy variable also failed to work as an explanatory variable in the long term.
Nevertheless, in the comparison with actual data from January to May 2018, the forecast differed by only one point in May and was relatively close to the actual price in the other months. This indicates that, using the ARIMA-based model and the consumer expectation index, individuals can invest on the basis of reasonable criteria.
One limitation is that we failed to find and apply global financial market data and U.S. supply-chain variables as leading variables of the KOSPI. Indicators such as U.S. consumption and inventories in the supply chain are non-stationary time series, and too much data was lost in the differencing process.
Therefore, better processing of the data or finding other relevant data could lead to more accurate predictions. Another limitation is that the model cannot predict the impact of sudden price changes such as a global financial crisis, because it is based primarily on ARIMA models.
Thus, the model is most effective when systemic risk in the economy is very low in the short to medium term and stock prices move within a relatively small range, so it should be used with caution in unstable periods.
Kim Seonghwan, ‘An Empirical Analysis on the Relationship among S&P500 Index, KOSPI and ₩/$ Exchange Rate’ (2000) pp. 1-48
Lim Sunghoon, ‘A study on the international transmission of stock market returns: Case of Korea, China, and the U.S’ (2012) pp. 1-37
Yu Siyong, Kim Donghwi, ‘A study on Dynamic Conditional Correlation between Chinese and International Stock Markets Through DCC-MGARCH Model’ (2011) pp. 25-48
Cape Quant, ‘Market Through “The Cycle”: Stock Market Seen Through Economic Fluctuations’ (2017)
Jung Jeho, ‘On the relation between oil price and the domestic stock price’ (2012) pp. 1-37
Lee Giyul, Cho Minsu, Pyo Sujin, Jung Muyung, ‘Relationship analysis between oil price and equity returns of domestic industry using the regression model’ (2012)
Jun Jihong, Lee Changmin, Lee Sanglim, ‘Impact of oil price fluctuations on stock price, focusing on industry-specific differences’ (2016) pp. 5-18
Choi Iksun, Kang Dongsik, Lee Jungho, Kang Minwoo, Song Dayung, Shin Seohee, Son Yongsook, ‘Prediction of the industrial stock price index using domestic and foreign economic indices’ (2012) pp. 271-283
Gam Hyunggyu, Shin Yongjae, ‘The Impact of Macroeconomic Variables on Stock Returns in Korea’ (2017) pp. 33-52
Lee Hanjae, ‘The Structural Change between KOSPI Stock Price Index and Won/Dollar Exchange Rates’ (2012) pp. 927-947
Lee Yoonbok, Baek Jaeseung, ‘The Interconnectedness between Foreign Exchange Rate on Stock Price and Macroeconomic Variables in Korea and the U.S.’ (2016) pp. 1459-1480
Jung Daejin, Ryu Doojin, ‘Volatility Spillover Effect from the Shanghai Stock Market to the Korean Stock Market’ (2013) pp. 221-253
Baek Yongju, Kang Sanghoon, ‘Analysis Relationship between KOSPI and KRW Markets By Using Markov-Switching Vector Auto-Regression Model’ (2016) pp. 519-540
Park Sungho, Lee Jihye, Kim Pansu, ‘The relationship between firm characteristic and inventory asset turnover change ratio according to type of business’ (2016) pp. 3645-3652