library(tidyverse)
library(lubridate)
library(kableExtra)
library(knitr)
In this section, we evaluate how our sentiment index compares to a broad US equity index (the Russell 1000 Index). We examine the fluctuations of sentiment relative to the equity market in two ways: a visual analysis of the normalized levels of both variables, and a linear regression on the time series data. To accomplish this, we first merge 3 data sets aligned by the 102 FOMC meeting dates. We then normalize both variables by calculating their Z-scores over the sample period. Lastly, we perform both analyses using the Z-score data.
# First load all 3 files into data frames.
# ------------------------------------------------------
mgData<-readRDS(file = "fomc_merged_data_v2.rds")
sData <- readRDS( file = "../DATA/SentimentDF.rds")
file_fred_ru1000tr = "../DATA/FRED_RU1000TR.csv"
ru1000tr = read_csv(file_fred_ru1000tr,
col_types = cols(DATE=col_character(),
RU1000TR = col_double() ) )
# Generate a lubridate date column to join with the FOMC data.
# -----------------------------------------------------------------
ru1000tr %>% mutate( date_mdy = lubridate::ymd( DATE ) )-> ruData
#z_ru_daily = (RU1000TR - mean(RU1000TR, na.rm=TRUE))/sd(RU1000TR, na.rm = TRUE )
# Second, join the data:
# Since this is a 2-way inner join, we start with the FOMC statement data
# and join it to the sentiment data by date string (yyyymmdd)
# -------------------------------------------------------------------------
mgData %>% inner_join(sData, by = c( "statement.dates" = "FOMC_Date")) -> msData
# Join the sentiment-FOMC data to the Russell 1000 Index data from FRED
# Make sure to add a Z-score for each of the time series: sentiment and Russell index values
# Save the raw data and normalized data by FOMC date.
# ----------------------------------------------------------------------------------
msEQdata = msData %>% left_join(ruData, by = c("date_mdy" = "date_mdy") ) %>%
select( date_mdy, Sentiment_Score, RU1000TR ) %>%
mutate( z_ru_fomc = (RU1000TR - mean(RU1000TR, na.rm = TRUE) ) / sd( RU1000TR, na.rm=TRUE ) ,
z_sentiment = ( Sentiment_Score - mean( Sentiment_Score, na.rm = TRUE) ) /
sd( Sentiment_Score, na.rm=TRUE) )
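Because the equity data is attached with a left join, any FOMC date without a matching equity observation would silently receive an `NA` index level. The following is a minimal sketch of a check for such dates, using small made-up data frames (`fomc`, `equity`, and their values are hypothetical stand-ins, not the project data):

```r
# Hypothetical stand-ins for the joined FOMC/sentiment data and the equity data.
# The dates and index levels below are made up for illustration only.
fomc <- data.frame(
  date_mdy = as.Date(c("2007-01-31", "2007-03-21", "2007-05-09"))
)
equity <- data.frame(
  date_mdy = as.Date(c("2007-01-31", "2007-03-20", "2007-05-09")),
  RU1000TR = c(100, 101, 102)
)

# Meeting dates with no matching equity observation would become NA
# in RU1000TR after a left join on date_mdy.
unmatched <- fomc$date_mdy[!(fomc$date_mdy %in% equity$date_mdy)]
unmatched
```

An empty result means every meeting date found a matching equity observation.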
Let’s inspect the data for accuracy and scaling issues. Exploratory data analysis shows 3 issues.
Normalization to Z-score format is needed to ensure that scale is not a problem. The Russell Index levels are expressed in the thousands, while the sentiment scores are on the order of 0.01, so rescaling along the y-dimension is essential. To solve the scale problem, we convert the entire sample to Z-score equivalents, which gives both time series a mean of zero and a standard deviation of one.
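For reference, the Z-score used throughout this section can be written as follows, where $x_t$ is the value of a series on FOMC date $t$ and $n$ is the number of non-missing observations (the symbols are our notation; R's `sd()` uses the $n-1$ denominator):

$$
z_t = \frac{x_t - \bar{x}}{s_x}, \qquad \bar{x} = \frac{1}{n}\sum_{t=1}^{n} x_t, \qquad s_x = \sqrt{\frac{1}{n-1}\sum_{t=1}^{n}\left(x_t - \bar{x}\right)^2}
$$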
There is also a need to normalize in the frequency domain. FOMC meetings occur 8 times per year, so their sentiment levels and changes reflect nearly 2 months of news. Russell equity index levels are collected on a daily basis to ensure completeness of the data collection. The volatility of lower-frequency data is much greater in absolute terms than the volatility of higher-frequency (daily) data. To address this, we calculate Z-scores of the Russell equity index using only the levels observed on FOMC dates.
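To illustrate why the choice of standardization sample matters, here is a minimal sketch on a made-up daily series. The series, the every-30th-day "meeting" schedule, and all variable names are assumptions for illustration, not the FOMC calendar or the Russell data:

```r
# Hypothetical daily index levels: a deterministic upward drift plus a cycle.
daily <- 3000 + 2 * (1:500) + 100 * sin((1:500) / 20)

# Hypothetical "meeting" days: every 30th day (not the real FOMC calendar).
meeting_idx <- seq(30, 500, by = 30)
at_meetings <- daily[meeting_idx]

# Z-scores standardized with the meeting-date subsample only
# (the approach described in the text).
z_meeting <- (at_meetings - mean(at_meetings)) / sd(at_meetings)

# Z-scores for the same days, standardized with the full daily sample.
z_daily <- ((daily - mean(daily)) / sd(daily))[meeting_idx]

# Only the subsample-based scores have mean 0 and sd 1 by construction.
round(c(mean = mean(z_meeting), sd = sd(z_meeting)), 6)
round(c(mean = mean(z_daily), sd = sd(z_daily)), 6)
```

Standardizing on the meeting-date subsample keeps the equity Z-scores on the same footing as the sentiment Z-scores, which only exist on those dates.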
Lastly, Russell Index levels increase at a roughly geometric rate. Thus, values at the start of the sample period are smaller than values at the end of the period. The residuals in a regression on such data show a significant increase in volatility over the sample period. We solve this by applying a logarithmic transformation to the Russell Index levels. This change fixes the non-constant residual volatility and also improves the model fit from roughly 36 to 39 percent adjusted R-squared.
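The effect of the log transformation can be sketched on simulated data. The example below assumes a hypothetical index that grows geometrically with multiplicative noise (the drift, noise level, and seed are arbitrary choices, not estimates from the Russell data). It compares how the size of period-to-period changes evolves in levels versus logs:

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical index: constant 1% drift per period with multiplicative noise.
n <- 200
log_returns <- rnorm(n, mean = 0.01, sd = 0.03)
level <- 3000 * exp(cumsum(log_returns))

# Period-to-period changes in levels versus changes in logs.
d_level <- diff(level)
d_log <- diff(log(level))

# Ratio of late-sample to early-sample standard deviation of the changes.
half <- seq_len((n - 1) %/% 2)
ratio_level <- sd(d_level[-half]) / sd(d_level[half])
ratio_log <- sd(d_log[-half]) / sd(d_log[half])

c(levels = ratio_level, logs = ratio_log)
```

In levels, the changes scale with the size of the index, so their spread grows over the sample. In logs, the changes are just the simulated log returns, whose spread is constant by construction.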
The following code produces the log-transformed z-scores of FOMC periodic equity values.
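# na.rm = TRUE matches the earlier Z-score calculations and guards against unmatched dates.
msEQdata %>% mutate( logEquity = log(RU1000TR) ) %>%
mutate( z_logEquity = ( logEquity - mean(logEquity, na.rm = TRUE) )/ sd( logEquity, na.rm = TRUE ) ) -> msEQdata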
msEQdata %>% kable() %>% scroll_box(width="100%", height="200px")
| date_mdy | Sentiment_Score | RU1000TR | z_ru_fomc | z_sentiment | logEquity | z_logEquity |
|---|---|---|---|---|---|---|
| 2007-01-31 | 0.0263158 | 3497.78 | -0.7129900 | 2.8720683 | 8.159884 | -0.6445636 |
| 2007-03-21 | -0.0285714 | 3505.86 | -0.7089570 | -0.7514790 | 8.162191 | -0.6388734 |
| 2007-05-09 | -0.0280374 | 3698.71 | -0.6127007 | -0.7162224 | 8.215739 | -0.5068180 |
| 2007-06-28 | -0.0090090 | 3682.96 | -0.6205620 | 0.5399938 | 8.211472 | -0.5173417 |
| 2007-08-07 | -0.0307692 | 3610.26 | -0.6568484 | -0.8965736 | 8.191535 | -0.5665083 |
| 2007-08-17 | -0.0126582 | 3531.38 | -0.6962194 | 0.2990795 | 8.169444 | -0.6209871 |
| 2007-09-18 | -0.0250000 | 3726.95 | -0.5986054 | -0.5157003 | 8.223346 | -0.4880606 |
| 2007-10-31 | -0.0370370 | 3816.32 | -0.5539986 | -1.3103620 | 8.247042 | -0.4296229 |
| 2007-12-11 | -0.0169492 | 3650.86 | -0.6365839 | 0.0158010 | 8.202718 | -0.5389300 |
| 2008-01-22 | -0.0522876 | 3234.17 | -0.8445644 | -2.3171733 | 8.081528 | -0.8377977 |
| 2008-01-30 | -0.0259740 | 3353.44 | -0.7850337 | -0.5800036 | 8.117742 | -0.7484895 |
| 2008-03-18 | -0.0451977 | 3299.30 | -0.8120564 | -1.8491157 | 8.101466 | -0.7886286 |
| 2008-04-30 | -0.0310881 | 3452.16 | -0.7357601 | -0.9176236 | 8.146755 | -0.6769394 |
| 2008-06-25 | -0.0379747 | 3328.66 | -0.7974020 | -1.3722636 | 8.110325 | -0.7667802 |
| 2008-08-05 | -0.0454545 | 3219.69 | -0.8517917 | -1.8660695 | 8.077040 | -0.8488637 |
| 2008-09-16 | -0.0647482 | 3047.69 | -0.9376413 | -3.1397991 | 8.022139 | -0.9842554 |
| 2008-10-08 | -0.0490566 | 2454.63 | -1.2336525 | -2.1038704 | 7.805731 | -1.5179389 |
| 2008-10-29 | -0.0478723 | 2309.41 | -1.3061355 | -2.0256876 | 7.744747 | -1.6683314 |
| 2008-12-16 | -0.0466926 | 2277.87 | -1.3218779 | -1.9478039 | 7.730996 | -1.7022435 |
| 2009-01-28 | -0.0136519 | 2201.85 | -1.3598214 | 0.2334807 | 7.697053 | -1.7859500 |
| 2009-03-18 | -0.0270270 | 2016.35 | -1.4524092 | -0.6495206 | 7.609044 | -2.0029889 |
| 2009-04-29 | -0.0447761 | 2234.29 | -1.3436298 | -1.8212812 | 7.711679 | -1.7498818 |
| 2009-06-24 | -0.0430622 | 2311.77 | -1.3049576 | -1.7081316 | 7.745769 | -1.6658126 |
| 2009-08-12 | -0.0270270 | 2597.64 | -1.1622726 | -0.6495206 | 7.862359 | -1.3782903 |
| 2009-09-23 | -0.0377358 | 2751.63 | -1.0854123 | -1.3564962 | 7.919949 | -1.2362673 |
| 2009-11-04 | -0.0307167 | 2711.23 | -1.1055770 | -0.8931072 | 7.905158 | -1.2727435 |
| 2009-12-16 | -0.0224719 | 2889.67 | -1.0165131 | -0.3488007 | 7.968898 | -1.1155546 |
| 2010-01-27 | -0.0146628 | 2868.68 | -1.0269897 | 0.1667444 | 7.961607 | -1.1335332 |
| 2010-03-16 | -0.0071942 | 3049.00 | -0.9369874 | 0.6598011 | 8.022569 | -0.9831956 |
| 2010-04-28 | -0.0154440 | 3143.24 | -0.8899499 | 0.1151672 | 8.053009 | -0.9081264 |
| 2010-06-23 | -0.0084746 | 2886.76 | -1.0179655 | 0.5752760 | 7.967890 | -1.1180393 |
| 2010-08-10 | -0.0273973 | 2963.05 | -0.9798873 | -0.6739627 | 7.993974 | -1.0537126 |
| 2010-09-21 | -0.0185185 | 3026.69 | -0.9481229 | -0.0878055 | 8.015225 | -1.0013068 |
| 2010-11-03 | -0.0193548 | 3195.65 | -0.8637907 | -0.1430177 | 8.069546 | -0.8673460 |
| 2010-12-14 | -0.0263158 | 3329.18 | -0.7971425 | -0.6025661 | 8.110481 | -0.7663950 |
| 2011-01-26 | -0.0144928 | 3485.32 | -0.7192091 | 0.1779677 | 8.156315 | -0.6533642 |
| 2011-03-15 | 0.0034965 | 3460.65 | -0.7315225 | 1.3655834 | 8.149212 | -0.6708819 |
| 2011-04-27 | 0.0034843 | 3674.91 | -0.6245799 | 1.3647791 | 8.209284 | -0.5227379 |
| 2011-06-22 | -0.0350877 | 3499.60 | -0.7120816 | -1.1816718 | 8.160404 | -0.6432807 |
| 2011-08-09 | -0.0280374 | 3173.99 | -0.8746018 | -0.7162224 | 8.062745 | -0.8841181 |
| 2011-09-21 | -0.0298913 | 3167.68 | -0.8777512 | -0.8386146 | 8.060755 | -0.8890257 |
| 2011-11-02 | -0.0132013 | 3366.74 | -0.7783954 | 0.2632256 | 8.121700 | -0.7387281 |
| 2011-12-13 | -0.0110294 | 3337.07 | -0.7932044 | 0.4066108 | 8.112849 | -0.7605573 |
| 2012-01-25 | -0.0114504 | 3630.15 | -0.6469208 | 0.3788192 | 8.197029 | -0.5529591 |
| 2012-03-13 | -0.0185185 | 3838.77 | -0.5427932 | -0.0878055 | 8.252907 | -0.4151583 |
| 2012-04-25 | 0.0000000 | 3827.04 | -0.5486480 | 1.1347511 | 8.249847 | -0.4227054 |
| 2012-06-20 | -0.0198675 | 3731.04 | -0.5965640 | -0.1768659 | 8.224442 | -0.4853558 |
| 2012-08-01 | -0.0140351 | 3781.35 | -0.5714530 | 0.2081819 | 8.237836 | -0.4523247 |
| 2012-09-13 | -0.0028736 | 4040.51 | -0.4420997 | 0.9450440 | 8.304126 | -0.2888473 |
| 2012-10-24 | -0.0086207 | 3908.01 | -0.5082338 | 0.5656299 | 8.270784 | -0.3710736 |
| 2012-12-12 | -0.0160183 | 3988.68 | -0.4679694 | 0.0772537 | 8.291216 | -0.3206861 |
| 2013-01-30 | -0.0218978 | 4215.41 | -0.3548027 | -0.3108997 | 8.346502 | -0.1843440 |
| 2013-03-20 | -0.0169903 | 4397.11 | -0.2641116 | 0.0130851 | 8.388703 | -0.0802730 |
| 2013-05-01 | -0.0141844 | 4464.70 | -0.2303757 | 0.1983248 | 8.403957 | -0.0426539 |
| 2013-06-19 | -0.0160920 | 4604.29 | -0.1607028 | 0.0723916 | 8.434744 | 0.0332686 |
| 2013-07-31 | -0.0227790 | 4789.46 | -0.0682798 | -0.3690770 | 8.474173 | 0.1305049 |
| 2013-09-18 | -0.0181087 | 4931.06 | 0.0023963 | -0.0607469 | 8.503309 | 0.2023580 |
| 2013-10-30 | -0.0204918 | 5048.76 | 0.0611434 | -0.2180779 | 8.526898 | 0.2605301 |
| 2013-12-18 | -0.0255941 | 5198.37 | 0.1358175 | -0.5549249 | 8.556100 | 0.3325462 |
| 2014-01-29 | -0.0227704 | 5114.60 | 0.0940058 | -0.3685063 | 8.539855 | 0.2924821 |
| 2014-03-19 | -0.0232975 | 5393.54 | 0.2332318 | -0.4033039 | 8.592957 | 0.4234388 |
| 2014-04-30 | -0.0211946 | 5446.93 | 0.2598801 | -0.2644755 | 8.602807 | 0.4477304 |
| 2014-06-18 | -0.0116279 | 5680.51 | 0.3764658 | 0.3670993 | 8.644796 | 0.5512792 |
| 2014-07-30 | -0.0092937 | 5719.90 | 0.3961263 | 0.5212004 | 8.651707 | 0.5683207 |
| 2014-09-17 | -0.0069808 | 5831.98 | 0.4520683 | 0.6738921 | 8.671112 | 0.6161759 |
| 2014-10-29 | -0.0043290 | 5772.43 | 0.4223454 | 0.8489586 | 8.660848 | 0.5908653 |
| 2014-12-17 | -0.0064795 | 5873.73 | 0.4729068 | 0.7069883 | 8.678245 | 0.6337674 |
| 2015-01-28 | -0.0053619 | 5872.69 | 0.4723877 | 0.7807669 | 8.678068 | 0.6333307 |
| 2015-03-18 | -0.0106383 | 6193.92 | 0.6327217 | 0.4324314 | 8.731323 | 0.7646637 |
| 2015-04-29 | -0.0214477 | 6219.01 | 0.6452448 | -0.2811857 | 8.735366 | 0.7746331 |
| 2015-06-17 | -0.0056180 | 6225.98 | 0.6487237 | 0.7638631 | 8.736486 | 0.7773955 |
| 2015-07-29 | -0.0084746 | 6244.98 | 0.6582071 | 0.5752760 | 8.739533 | 0.7849099 |
| 2015-09-17 | -0.0105541 | 5927.08 | 0.4995351 | 0.4379906 | 8.687287 | 0.6560654 |
| 2015-10-28 | -0.0134771 | 6193.52 | 0.6325221 | 0.2450199 | 8.731259 | 0.7645045 |
| 2015-12-16 | -0.0107527 | 6147.98 | 0.6097919 | 0.4248795 | 8.723879 | 0.7463046 |
| 2016-01-27 | -0.0226629 | 5577.69 | 0.3251457 | -0.3614088 | 8.626530 | 0.5062326 |
| 2016-03-16 | 0.0000000 | 6035.22 | 0.5535105 | 1.1347511 | 8.705368 | 0.7006540 |
| 2016-04-27 | -0.0027397 | 6265.80 | 0.6685988 | 0.9538797 | 8.742862 | 0.7931179 |
| 2016-06-15 | -0.0205882 | 6215.64 | 0.6435627 | -0.2244441 | 8.734824 | 0.7732964 |
| 2016-07-27 | -0.0084507 | 6512.26 | 0.7916133 | 0.5768520 | 8.781442 | 0.8882607 |
| 2016-09-21 | -0.0053763 | 6531.17 | 0.8010517 | 0.7798153 | 8.784341 | 0.8954113 |
| 2016-11-02 | -0.0054348 | 6335.69 | 0.7034827 | 0.7759573 | 8.753954 | 0.8204730 |
| 2016-12-14 | -0.0117302 | 6840.08 | 0.9552365 | 0.3603458 | 8.830555 | 1.0093779 |
| 2017-02-01 | -0.0123457 | 6941.77 | 1.0059926 | 0.3197134 | 8.845312 | 1.0457711 |
| 2017-03-15 | -0.0117302 | 7274.32 | 1.1719767 | 0.3603458 | 8.892106 | 1.1611686 |
| 2017-05-03 | -0.0235294 | 7298.39 | 1.1839906 | -0.4186149 | 8.895409 | 1.1693152 |
| 2017-06-14 | -0.0085470 | 7472.19 | 1.2707386 | 0.5704942 | 8.918943 | 1.2273533 |
| 2017-07-26 | -0.0092593 | 7607.26 | 1.3381555 | 0.5234728 | 8.936858 | 1.2715333 |
| 2017-09-20 | -0.0260870 | 7727.34 | 1.3980904 | -0.5874590 | 8.952520 | 1.3101565 |
| 2017-11-01 | -0.0212766 | 7954.93 | 1.5116864 | -0.2698884 | 8.981547 | 1.3817404 |
| 2017-12-13 | -0.0129870 | 8231.14 | 1.6495498 | 0.2773738 | 9.015680 | 1.4659149 |
| 2018-01-31 | -0.0037037 | 8732.78 | 1.8999310 | 0.8902398 | 9.074839 | 1.6118075 |
| 2018-03-21 | 0.0069686 | 8440.61 | 1.7541016 | 1.5948072 | 9.040810 | 1.5278882 |
| 2018-05-02 | 0.0000000 | 8213.04 | 1.6405156 | 1.1347511 | 9.013478 | 1.4604861 |
| 2018-06-13 | 0.0000000 | 8687.40 | 1.8772807 | 1.1347511 | 9.069629 | 1.5989590 |
| 2018-08-01 | 0.0147059 | 8800.09 | 1.9335271 | 2.1056048 | 9.082517 | 1.6307427 |
| 2018-09-26 | 0.0104712 | 9118.25 | 2.0923288 | 1.8260396 | 9.118033 | 1.7183286 |
| 2018-11-08 | 0.0050761 | 8798.28 | 1.9326237 | 1.4698681 | 9.082311 | 1.6302354 |
| 2018-12-19 | 0.0000000 | 7877.65 | 1.4731140 | 1.1347511 | 8.971785 | 1.3576657 |
| 2019-01-30 | 0.0046729 | 8468.86 | 1.7682018 | 1.4432466 | 9.044151 | 1.5361282 |
| 2019-03-20 | -0.0177778 | 8948.39 | 2.0075474 | -0.0389032 | 9.099229 | 1.6719554 |
| 2019-05-01 | -0.0144231 | 9277.94 | 2.1720341 | 0.1825676 | 9.135395 | 1.7611441 |
In this section, we show 3 time series charts that illustrate the data-preparation choices behind the regression model.
The first chart below shows the raw sentiment compared to raw Russell equity levels. Scale issues are obvious: the sentiment values are compressed into what looks like a slightly fuzzy flat line, which shows that scaling is essential.
ggplot() +
geom_line(data=msEQdata, aes(x=date_mdy, y=Sentiment_Score) , color = "red" ) +
geom_line(data=msEQdata, aes(x=date_mdy, y=RU1000TR), color="green") +
ggtitle("Sentiment vs. Russell 1000 Equity Level", subtitle="Not usable without fixes")
The second chart shows scaled sentiment versus scaled Russell equity levels. Scale issues remain because the right-hand side (the more recent years) shows higher variation than the left-hand side (the earliest years).
ggplot() +
geom_line(data=msEQdata, aes(x=date_mdy, y=z_sentiment) , color = "red" ) +
geom_line(data=msEQdata, aes(x=date_mdy, y=z_ru_fomc), color="green") +
ggtitle("Scaled Sentiment vs. Scaled Equity Index", subtitle = "Nearly There...")
Finally, the third chart shows the variables we will use in the regression analysis.
ggplot() +
geom_line(data=msEQdata, aes(x=date_mdy, y=z_sentiment) , color = "red" ) +
geom_line(data=msEQdata, aes(x=date_mdy, y=z_logEquity), color="green") +
ggtitle("Scaled-Sentiment vs. Scaled Log Equity Price", subtitle="What we will use")
The final regression model we present uses the scaled, log-transformed data with an influential outlier removed (observation 1, Jan 2007). For a reason yet to be determined, Jan 2007 generates the highest sentiment of the entire observation period. This is arguably wrong, as the Sept 2018 period was possibly the most euphoric in recent memory. The model is fitted in the code chunk below.
mod1 = lm( z_logEquity ~ z_sentiment, data=msEQdata[2:102,])
summary(mod1)
##
## Call:
## lm(formula = z_logEquity ~ z_sentiment, data = msEQdata[2:102,
## ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.96078 -0.59194 0.09201 0.58226 1.67231
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.02467 0.07894 0.313 0.755
## z_sentiment 0.64312 0.08237 7.807 6.19e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.793 on 99 degrees of freedom
## Multiple R-squared: 0.3811, Adjusted R-squared: 0.3748
## F-statistic: 60.96 on 1 and 99 DF, p-value: 6.192e-12
The slope coefficient of mod1 is clearly statistically significant, with a p-value of 6.19e-12. The adjusted R-squared of 37 percent suggests the model has some explanatory power.
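As a cross-check on these figures, recall the standard identities for a linear regression with a single predictor, where $r$ is the sample correlation and $s_x$, $s_y$ are the sample standard deviations of the predictor and the response over the fitted observations (our notation):

$$
\hat{\beta}_1 = r\,\frac{s_y}{s_x}, \qquad R^2 = r^2
$$

With the reported multiple R-squared of 0.3811 and a positive slope, $r = \sqrt{0.3811} \approx 0.617$. The slope estimate (0.643) differs slightly from $r$ because the Z-scores were computed over the full sample while the model is fitted with observation 1 removed, so $s_x$ and $s_y$ are not exactly 1 in the fitted sample.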
Examining the diagnostic plots below shows:

* The Q-Q plot and histogram of residuals show a reasonable approximation to normality.
* The residuals have relatively homogeneous variance across the range of observations.
* The residuals show little trend relative to the fitted values.
* The leverage plot shows that the most influential outlier (observation 1) has been controlled for.
par(mfrow=c(3,2))
plot(mod1)
hist(mod1$residuals )
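Influential observations can also be flagged numerically. The sketch below uses Cook's distance on simulated data with one deliberately extreme first point. It is one common approach, not necessarily how observation 1 was identified in this analysis, and the data, seed, and variable names are hypothetical:

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# Hypothetical data: a linear relationship plus one deliberately extreme first point.
x <- c(6, rnorm(50))
y <- c(-6, 0.6 * x[-1] + rnorm(50, sd = 0.5))

fit <- lm(y ~ x)

# Cook's distance summarizes how much the fit changes when each point is dropped.
cd <- cooks.distance(fit)
worst <- unname(which.max(cd))  # index of the most influential observation
worst

# Refit without that observation, analogous to the msEQdata[2:102, ] subset above.
keep <- seq_along(x) != worst
fit_trimmed <- lm(y ~ x, subset = keep)
coef(fit_trimmed)
```

Comparing `coef(fit)` with `coef(fit_trimmed)` shows how strongly a single point can pull the estimated slope.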
Finally, we present a scatterplot of the regression data overlaid with the fitted regression line to study the model fit.
ggplot(data=msEQdata[2:102,], aes(x=z_sentiment, y=z_logEquity) ) +
geom_point() +
geom_smooth(method=lm) +
ggtitle("ScatterPlot of Fitted Regression Model", subtitle="X=Z-Sentiment, Y=Z-LogRussell 1000 (2007-2019)")
There are two comments related to the time series and regression we should make.
First, the time series of sentiment clearly shows a pattern characteristic of other financial variables through the 2007-2019 period. During Q4 2008, at the depths of the financial crisis, sentiment appears to be at a low. During H2 2009, when the financial markets had miraculously recovered, sentiment spikes upward. Other signs that sentiment is effective include the 2018 euphoria, when equity markets reached daily highs during the summer and fall. Moreover, sentiment in Q4 2018 and Q1 2019 declined in concert with the observed selloff of risk assets in the same period.
However, the sentiment index is imperfect. The 2013 taper tantrum is not reflected correctly from a bond investor's point of view. As we recall, on May 22, 2013, bond markets panicked when Bernanke told Congress that quantitative easing would likely be terminated at a future date. More investigation is needed to understand the market and FOMC dynamics around that historical episode, and we regard this as future work.
Second, the regression suggests that sentiment is positively associated with equity levels: positive sentiment is associated with higher Russell 1000 Index levels. We think this makes sense. Whether sentiment causes equity markets to move or vice versa is too complex to answer with the crude econometric analysis we have conducted. However, the trend and regression results suggest that a more detailed regression analysis of sentiment differences vs. equity returns (instead of levels), either contemporaneous or lagged, would promise some predictive value from sentiment analysis. The project timeline did not allow for this more extensive regression analysis, but we view it as fertile ground for future research.
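As a starting point for that future work, the following sketch shows one way a differences-versus-returns regression might be set up, both contemporaneous and lagged by one meeting. It runs on simulated series with no built-in relationship, so its output says nothing about the real data. All names, parameters, and the seed are hypothetical:

```r
set.seed(7)  # arbitrary seed, for reproducibility only

# Hypothetical meeting-date series (not the project's data).
n <- 60
sentiment <- cumsum(rnorm(n, sd = 0.01))
equity <- 3000 * exp(cumsum(rnorm(n, mean = 0.01, sd = 0.04)))

d_sent <- diff(sentiment)  # change in sentiment between meetings
ret <- diff(log(equity))   # log return of the index between meetings

# Contemporaneous: return over a period vs. sentiment change over the same period.
mod_contemp <- lm(ret ~ d_sent)

# Lagged: return over a period vs. sentiment change over the previous period.
k <- length(ret)
ret_next <- ret[2:k]
d_sent_prev <- d_sent[1:(k - 1)]
mod_lagged <- lm(ret_next ~ d_sent_prev)

summary(mod_contemp)$r.squared
summary(mod_lagged)$r.squared
```

Working with differences and log returns rather than levels avoids regressing one trending series on another. The lagged version is the one that would speak to predictive value.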