By:
Wendy Chong Pooi Mun & Varian Soong
ABSTRACT
This study develops macro-driven machine learning models to forecast the U.S. bond yield curve and compares their performance against traditional statistical approaches using Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). Five machine learning models are built: Support Vector Machine (SVM), Random Forest (RF), k-Nearest Neighbors (kNN), Artificial Neural Network (ANN) and AdaBoost. They are compared against Linear Regression (LR), Autoregressive Integrated Moving Average (ARIMA) and Vector Autoregression (VAR). Modelling and feature selection are carried out in Python 3.9.6. The results show that the machine learning models have high explanatory power (high R²) compared to the traditional statistical models. Feature selection with the RReliefF algorithm further improved the models’ predictive power, except for linear regression, whose predictive power dropped. The feature selection also produced a consistent set of top five features across all targets, which may reflect underlying economic links to the bond yield curve: the European 10 Year bond yield, the Fed Funds Rate, average hourly earnings of production and non-supervisory employees, the Fed's total investment holdings, and total commercial and industrial loans. SVM outperformed the other models in forecasting the 20 Year and 30 Year bond yields, while ARIMA performed best at the shorter end of the yield curve. The ARIMA model uses only its own lagged values and lagged forecast errors, in line with the spanning hypothesis that the bond yield curve incorporates all relevant information, including macroeconomic variables and central bank policy. However, the performance of ARIMA dropped significantly as the number of forecast steps increased. This indicates that the machine learning approach is generally preferable for long-term yield forecasting because of its higher accuracy.
1. INTRODUCTION
The bond market is larger than the stock market. Globally, the bond market stood at USD128.3 trillion as at end-August 2020 (ICMA, 2020), compared to a total stock market capitalisation of USD109.2 trillion in 2020 (Statista, 2021). A healthy bond market reflects healthy lending and borrowing activity between issuers (borrowers) and investors (lenders). The bond market also serves as an important secondary market in which investors buy and sell already issued bonds to manage their liquidity and investment goals. Two of the biggest investor groups in the bond market are pension funds and insurance companies, which together hold 50% of the market (Nunes et al., 2018). Forecasting future bond yields is therefore crucial to investors in their portfolio management process. Further, the relationship between the yield curve and economic activity has long been researched and is of interest to investors. The slope of the yield curve indicates a country's future economic activity, with a steeper slope indicating higher GDP growth and inflation, and vice versa (Bordo & Haubrich, 2021). A search of Web of Science and Google Scholar shows that the literature on machine learning methods for yield curve prediction is scarce, while studies of traditional forecasting techniques (e.g. vector regression model, ARIMA and VAR) are plentiful. According to an article by the CFA Institute in 2019, only 10% of CFA certified analysts and portfolio managers use machine learning in their investment strategy. However, more analysts and portfolio managers have undertaken machine learning training, which indicates the industry's growing adoption of machine learning from its current nascent stage (CFA, 2019).
Traditional forecasting models are losing predictive power due to structural changes caused by shifts in monetary policy (Morell, 2018). To address this issue, a few studies have incorporated machine learning techniques, either standalone or as part of a hybrid model. In both cases, machine learning techniques have proved promising in improving forecasts (Smyl, 2020; Nunes et al., 2018). These studies found that the most relevant features for forecasting bond yields were other types of interest rates (futures, foreign bonds or short-term interest rates), even though other studies have shown that central bank policy and macroeconomic variables have predictive power for bond returns (Indriawan et al., 2021; Hou et al., 2021). Further, the studies do not reach a comprehensive conclusion on the relative prediction accuracy of machine learning models and statistical (traditional) models. Other studies use machine learning only as a feature selection tool (Fong & Wu, 2020; Bianchi et al., 2021), and few studies have been conducted on the bond market (Misuk, 2021). The majority of studies are conducted on the stock and commodities markets using artificial neural networks.
2. DATA
2.1 Targets
The focus of this study is the monthly U.S. Treasury yield curve data, obtained from the official website of the U.S. Department of the Treasury and covering the period from January 2017 to April 2021. This period witnessed several bull and bear markets, for example the boom and bust of the cryptocurrency market in 2017-2018, the start of the US-China trade war in 2018, the Covid-19 pandemic in 2020 and several phases of monetary easing by the U.S. central bank. The Treasury yields used in this study consist of bond securities with 8 maturity terms: 1 Year, 2 Year, 3 Year, 5 Year, 7 Year, 10 Year, 20 Year, and 30 Year. These yields are computed from composites of indicative bid-side market quotations obtained by the Federal Reserve Bank of New York. U.S. Treasury yields are selected as the target data because of their higher liquidity and larger market capitalisation compared to other types of bond securities. In addition, they are known to be a good proxy indicator of mortgage rates and of investors’ sentiment about the global economy. A rising yield indicates higher inflation and GDP growth. Improving economic indicators reduce demand for Treasury bonds as investors’ preferences shift towards higher-risk investments with higher potential returns.
2.2 Features
Because numerous interconnected macroeconomic factors influence the movement of the U.S. Treasury yield curve, this research considers a wide range of features from the bond market, related asset classes, currencies, crude oil prices and other leading economic indices. A complete list of 29 features is collected at monthly frequency, matching the frequency of the target U.S. bond yield data set.
3. METHODOLOGY
3.1 Train-test split and normalization
The data is divided into two sets using an 80%/20% split for training and testing respectively. Next, the variables are normalized to a notionally common scale by centering each on its mean and scaling its standard deviation to 1. Normalization is necessary because the many features considered in this research have different ranges of scale.
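The split and normalization steps can be sketched as follows. This is a minimal illustration, not the study's code: it assumes the 80%/20% split is chronological (no shuffling) and that the scaling statistics are estimated on the training set only, neither of which is stated in the text. The function names and toy data are hypothetical.

```python
from statistics import mean, pstdev


def chronological_split(rows, train_fraction=0.8):
    """Split time-ordered rows into train and test without shuffling."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def fit_scaler(train_rows):
    """Return per-column (mean, std) computed from the training rows only."""
    columns = list(zip(*train_rows))
    # Guard against constant columns, whose standard deviation is zero.
    return [(mean(col), pstdev(col) or 1.0) for col in columns]


def transform(rows, scaler):
    """Centre each column on its training mean and scale by its training std."""
    return [[(x - m) / s for x, (m, s) in zip(row, scaler)] for row in rows]


# Hypothetical toy data: 10 monthly observations of 2 features.
data = [[float(i), 100.0 + 10.0 * i] for i in range(10)]
train, test = chronological_split(data)
scaler = fit_scaler(train)
train_z = transform(train, scaler)
test_z = transform(test, scaler)
```

Applying the training-set statistics to the test set keeps information from the test period out of the fitted models, which is the usual reason for this design choice in time-series work.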
3.2 Feature selection
Feature selection identifies the relevant features and removes noisy or redundant data (if any), reducing distortion in the pattern recognition process. This study adopts the RReliefF regression method, which selects the significant features by measuring the relative distance between the predicted values of two instances. The Relief estimate W[A] of a feature A is an approximation of the difference in probabilities [Eq. 1], computed using the k-nearest instances of hits and misses. Features are then ranked by their estimates: the higher the Relief estimate, the higher the ranking. In this study, the relevant features are determined separately for each target variable.
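The ranking idea can be illustrated with a simplified RReliefF sketch. It is not the implementation used in the study: as labeled assumptions, it visits every instance once rather than sampling, gives the k neighbours equal weight instead of rank-based weights, and uses Manhattan distance on range-scaled features. The data and names are hypothetical.

```python
import random


def rrelieff(X, y, k=10):
    """Simplified RReliefF: returns one relevance weight per feature.

    Simplifications (assumptions of this sketch): every instance is visited
    once instead of being sampled, and the k neighbours get equal weight.
    """
    n, n_feat = len(X), len(X[0])
    # Range of each feature and of the target, used to scale diffs into [0, 1].
    f_range = [(max(c) - min(c)) or 1.0 for c in zip(*X)]
    y_range = (max(y) - min(y)) or 1.0

    def diff(a, i, j):
        return abs(X[i][a] - X[j][a]) / f_range[a]

    n_dc = 0.0               # accumulates "different prediction"
    n_da = [0.0] * n_feat    # accumulates "different feature value"
    n_dcda = [0.0] * n_feat  # accumulates both at once
    for i in range(n):
        # k nearest neighbours of instance i by Manhattan distance.
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: sum(diff(a, i, j) for a in range(n_feat)))
        for j in others[:k]:
            d_y = abs(y[i] - y[j]) / y_range
            n_dc += d_y / k
            for a in range(n_feat):
                d_a = diff(a, i, j)
                n_da[a] += d_a / k
                n_dcda[a] += d_y * d_a / k
    # W[A] = N_dC&dA / N_dC - (N_dA - N_dC&dA) / (m - N_dC), with m = n here.
    return [n_dcda[a] / n_dc - (n_da[a] - n_dcda[a]) / (n - n_dc)
            for a in range(n_feat)]


# Hypothetical data: feature 0 drives the target, feature 1 is pure noise.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [row[0] for row in X]
weights = rrelieff(X, y)
ranking = sorted(range(len(weights)), key=lambda a: weights[a], reverse=True)
```

In this toy setting the informative feature should receive the larger weight and therefore rank first, which mirrors how the top-ranked features are chosen for each target.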
3.3 Models
This research focuses on time-series modelling to predict the movement of U.S. Treasury yields. Eight main types of models are adopted, of which three are statistical approaches and the remaining five are machine learning techniques. With the exception of the ARIMA model, each model is tuned on two different data sets: the first with the complete set of macroeconomic features, and the second with only the top-ranked features identified by the feature selection process. Hence, there are altogether 15 model variants in this research (seven models fitted on two feature sets each, plus ARIMA). The ARIMA model depends only on its own past values and lagged forecast errors to predict future values, so it does not incorporate any macroeconomic factor as a parameter.
The five machine learning models are adaptive boosting (AdaBoost), k-nearest neighbours (kNN), artificial neural network (ANN), random forest (RF) and support vector machine (SVM). The remaining three models adopt the statistical approach: multivariate linear regression (LR), vector autoregression (VAR) and autoregressive integrated moving average (ARIMA).
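The count of 15 variants can be checked by enumerating the combinations. The snippet below is only bookkeeping: the model names follow the text, while the feature-set labels are illustrative.

```python
from itertools import product

# Model names follow the text; the feature-set labels are illustrative.
MACRO_MODELS = ["LR", "VAR", "AdaBoost", "kNN", "ANN", "RF", "SVM"]
FEATURE_SETS = ["all_features", "top_ranked_features"]

# Seven macro-driven models are each fitted on both feature sets ...
variants = [f"{m} ({fs})" for m, fs in product(MACRO_MODELS, FEATURE_SETS)]
# ... while ARIMA uses only the yield's own history, so it appears once.
variants.append("ARIMA (own lags only)")
```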
3.4 Model Evaluation Metrics
Root mean square error (RMSE)
The RMSE is used as the main metric to measure the deviation between the actual values and the fitted forecast values over a given time period:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(Y_t - \hat{Y}_t\right)^2}$$
where $Y_t$ is the actual value at time period $t$, $\hat{Y}_t$ is the forecast value for the same time period $t$ and $n$ is the number of fitted points.
Coefficient of determination (R²)
The R² measures the proportion of variability in the dependent variable that can be explained by its least-squares regression line on the independent variables:
$$R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}}$$
where R² is the coefficient of determination, RSS is the sum of squares of residuals and TSS is the total sum of squares.
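The metrics can be computed directly from their definitions. The sketch below uses the standard formulas for RMSE and R² as defined in this section, plus the standard definition of MAE (the mean of the absolute forecast errors), which the abstract names but this section does not define. The yield values are hypothetical and for illustration only.

```python
from math import sqrt


def rmse(actual, forecast):
    """Root of the mean squared difference between actual and forecast."""
    n = len(actual)
    return sqrt(sum((y - f) ** 2 for y, f in zip(actual, forecast)) / n)


def mae(actual, forecast):
    """Mean of the absolute differences between actual and forecast."""
    return sum(abs(y - f) for y, f in zip(actual, forecast)) / len(actual)


def r_squared(actual, forecast):
    """1 - RSS / TSS, with TSS measured around the mean of the actual values."""
    mean_y = sum(actual) / len(actual)
    rss = sum((y - f) ** 2 for y, f in zip(actual, forecast))
    tss = sum((y - mean_y) ** 2 for y in actual)
    return 1.0 - rss / tss


# Hypothetical yields (in percent), for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
forecast = [1.0, 2.0, 3.0, 6.0]
```

Because RMSE squares the errors before averaging, the single large miss in this toy example weighs more heavily on RMSE than on MAE.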
4. RESULTS & DISCUSSION
First, an exploratory data analysis is performed to compare the change in the yield curve during conventional and non-conventional monetary phases. Then, we present the top five ranked features of each target variable from the feature selection process. Lastly, we compare the accuracy of all the models considered in this research.
4.1 Exploratory Analysis
Figure 1 presents the U.S. Treasury yields from April 2017 until April 2021 at different terms to maturity. Generally, the yield curve slopes upwards to the right, indicating that the longer an investor lends money, the higher the yield or return. The yields at different maturity terms move in the same direction and are highly correlated with one another. However, the slope of the yield curve differs across calendar periods, as seen from the width of the gap between the chart lines.
The steepening of the yield curve reflects a widening spread between long- and short-term interest rates. It generally signals positive sentiment about macroeconomic conditions and rising inflation rates due to the higher interest rates.
It is observed that the yield curve is inverted with a drastically steepened pattern in April 2020. A yield curve inversion happens when the yields on short-term bonds are higher than those on long-term bonds. It signals a loss of investor confidence in current economic conditions. This observation coincides with the panic in the markets during the initial phase of the Covid-19 pandemic and the gradual lockdown announcements in many countries at that time. However, the steepening of the yield curves also indicates investors’ expectation of a fast recovery in the global economy. This is subsequently observed in the quick flattening of the yield curves.
Figure 1. The yield curve of the U.S. Treasury bond yields from April 2017 until April 2021 for different terms to maturity (i.e. 1Yr, 2Yr, 3Yr, 5Yr, 7Yr, 10Yr, 20Yr, 30Yr)
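The spread and inversion concepts described above reduce to simple arithmetic on two points of the curve. The following sketch assumes the spread is measured as the long-term yield minus the short-term yield; the function names and the yield values are hypothetical and are not taken from the study's data set.

```python
def term_spread(long_yield, short_yield):
    """Long-term minus short-term yield, in percentage points."""
    return long_yield - short_yield


def is_inverted(long_yield, short_yield):
    """A curve is inverted when the short end yields more than the long end."""
    return term_spread(long_yield, short_yield) < 0


# Hypothetical yields in percent, not taken from the study's data set.
steep = term_spread(long_yield=3.0, short_yield=1.0)
inverted = is_inverted(long_yield=1.5, short_yield=2.0)
```

A widening positive spread corresponds to a steepening curve, while a negative spread corresponds to an inversion.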
4.2 Comparison of Models
The RMSE and MAE scores of all models (with complete set of features) for bond yields with different terms to maturity (i.e. 1Yr, 2Yr, 3Yr, 5Yr, 7Yr, 10Yr, 20Yr, 30Yr):
The RMSE and MAE scores of all models (with relevant features, i.e. top 5 ranked features) for bond yields with different terms to maturity (i.e. 1Yr, 2Yr, 3Yr, 5Yr, 7Yr, 10Yr, 20Yr, 30Yr):
For the 5Y Treasury yield, the ARIMA model has the lowest RMSE of all the models. However, for the longer-term 20Y and 30Y yields, the RMSE of the ARIMA model increases, which indicates that ARIMA is not recommended for long-term yield forecasting. The ARIMA algorithm learns only from the historical values of the target variable to forecast future values, whereas the other models incorporate numerous macroeconomic features as parameters. Based on the RMSE results, the ARIMA model performs well in forecasting the one-step-ahead values of the target variable, but its RMSE increases as the number of forward steps increases.
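One way to see why a model based only on its own lags can weaken over longer horizons is that multi-step forecasts are built recursively on earlier forecasts. The sketch below uses a plain AR(1) model as a stand-in; it is not the ARIMA specification fitted in the study, and the series, coefficients and function names are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of y_t = c + phi * y_{t-1}."""
    x, y = series[:-1], series[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    phi = sxy / sxx
    return my - phi * mx, phi


def forecast_ar1(last_value, c, phi, steps):
    """Recursive forecast: each step is fed the previous forecast."""
    out, value = [], last_value
    for _ in range(steps):
        value = c + phi * value
        out.append(value)
    return out


# Hypothetical noise-free series generated with c = 0.4 and phi = 0.8.
series = [1.0]
for _ in range(30):
    series.append(0.4 + 0.8 * series[-1])

c, phi = fit_ar1(series)
path = forecast_ar1(series[-1], c, phi, steps=12)
```

With |phi| < 1 the recursive path drifts towards the long-run level c / (1 - phi), so the influence of the last observed value fades with each additional step and no new information enters the forecast.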
For a longer forecasting horizon, the models are compared on the 20Y and 30Y yields. Generally, the machine learning models (AdaBoost, kNN, ANN, RF and SVM) performed better than the statistical models. Taken individually, though, the multivariate linear regression (LR) model with the complete set of features has a comparatively lower RMSE. It is worth noting that the RMSE of the LR model increases rather substantially if only the five top-ranked features are used as its parameters. This means the linear regression model requires a significant tradeoff: it achieves higher prediction accuracy only at the cost of the data collection effort needed to incorporate the full set of macroeconomic features. Among models using only the relevant features, the SVM model records the lowest RMSE. It is also observed that the feature selection process generally improves the performance of the machine learning models more than that of the statistical models.
Across the 5Y, 20Y and 30Y yields, the machine learning models (AdaBoost, kNN, ANN, RF and SVM) generally have higher R² than the statistical models ARIMA and VAR. Individually, the LR model has only a marginally higher R² than the machine learning models. The SVM model (with relevant features) displays an impressive R² even though only the top five ranked features are used as parameters. This indicates the importance of the feature selection process in removing irrelevant features that might distort the model’s forecasting ability.
5. CONCLUSION
The first perspective proposes incorporating macroeconomic variables as features in the models to predict the U.S. Treasury yield curve. The results of this study indicate that the forecasting performance of the models is significantly improved over a long-term forecasting horizon. This finding runs counter to the spanning hypothesis, under which the bond yield curve already reflects investors’ sentiment on the overall economy, so that no variable other than the yield curve itself is required to forecast bond yields. Although the yield curve alone is sufficient for short-term forecasting, the addition of macroeconomic variables is crucial to improving the accuracy of the yield forecasts over longer horizons. This is further illustrated by the performance of the ARIMA model, which gradually deteriorates as the number of forecast steps increases. Further, the study incorporates a feature selection process using the RReliefF algorithm. This process identifies the relevant, top-ranked features to be used as refined parameters in the models, removing noisy and redundant data to reduce distortion in the pattern recognition process. The results reveal that the feature selection process generally enhanced the performance of the machine learning models more than that of the statistical models.
The second perspective proposes a machine learning approach to forecasting bond yields as an alternative to statistical models. For long-term forecasting, the results support that the machine learning models generally achieved higher accuracy than the statistical approach. The ARIMA model, which depends solely on the historical values of the yield curve, is ideal only for short-term forecasting.
REFERENCES
Bekaert, G., Engstrom, E., & Ermolov, A. (2021). Macro risks and the term structure of interest rates. Journal of Financial Economics. https://doi.org/10.1016/j.jfineco.2021.03.011
Bianchi, D., Büchner, M., Hoogteijling, T., & Tamoni, A. (2021). Corrigendum: Bond Risk Premiums with Machine Learning. Review of Financial Studies, 34(2), 1090–1103. https://doi.org/10.1093/rfs/hhaa098
Bordo, M. D., & Haubrich, J. G. (2021). Some international evidence on the causal impact of the yield curve. Finance Research Letters, March, 102116. https://doi.org/10.1016/j.frl.2021.102116
Box, G. E., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2015). Time series analysis: Forecasting and control, 5th edition. Wiley Series in Probability and Statistics. Wiley.
Caldeira, J. F., Moura, G. V., & Santos, A. A. P. (2016). Predicting the yield curve using forecast combinations. Computational Statistics and Data Analysis, 100, 79–98. https://doi.org/10.1016/j.csda.2014.05.008