Introduction

Gross domestic product (GDP) is the most important economic indicator of the general economic situation of a country. In particular, the underlying dynamics of quarterly GDP have long been a subject of research, because knowing its likely future value is essential for assessing or steering the economy in the short term. In recent years, estimating the current value of quarterly GDP with nowcasting techniques has become more desirable than forecasting its future values, since quarterly GDP is published with a relatively long delay (\(t+45\ days\) on average) after the reference quarter (\(t\)).

There is a vast literature on nowcasting GDP, and the main purpose of these studies is generally to find the best combination of method and variables for nowcasting quarterly GDP. The literature shows that the variables used in a nowcasting model are mainly chosen from real, financial or (household/business) survey indicators, i.e. structured data. On the other hand, the use of unstructured data (simply, text data) in a nowcasting model is a relatively new topic in the literature.

In addition to sentiments or emotions extracted from microblogs and the internet, the speeches of monetary authorities (i.e. central banks) may be effective economic indicators, since they may affect the behavior of economic agents, such as their expectations, saving-consumption decisions, and investment risk appetite. Although at first glance one might focus on the impact of central banks’ speeches on price stability (inflation), the question of how strongly the speeches affect a real indicator remains important.

This paper investigates the impact of ECB speeches on nowcasting Euro Area quarterly GDP. I apply emotion analysis to the ECB’s speeches and construct several emotion indicators to be included in a multivariate nowcasting model. Then, I examine the effect of the emotion indicators on Euro Area quarterly GDP along three dimensions. First, I analyze how much the emotion indicators reduce the estimation error of the nowcasting model for quarterly GDP. Second, I measure the contribution of the emotion indicators to explaining the variance of quarterly GDP. Third, I measure the impact of the emotion indicators on the revisions of the GDP nowcasts.

The rest of this paper is organized as follows. Section @ref(sec:1) presents a summary of related studies in the literature. Section @ref(sec:2) defines the unstructured (ECB speeches) and the structured (economic and survey indicators) data for nowcasting quarterly Euro Area GDP. Section @ref(sec:3) explains the method used to obtain emotion indicators from the unstructured data, the nowcasting methodology, and several approaches to measuring the impact of the emotions of speeches on nowcasting quarterly Euro Area GDP. The complete findings of the paper are presented in Section @ref(sec:4). Final remarks are provided in the conclusion.

Literature

A summary of related literature is given in Table @ref(tab:tablolit). The studies of Varian and Choi (2009), McLaren and Shanbhogue (2011), Fondeur and Karamé (2013), Bortoli and Combes (2015), Baker et al. (2016), and D’Amuri and Marcucci (2017) try to improve nowcasting/forecasting models by including unstructured data available on the internet. Specifically, the characteristics of word searches on search platforms are often treated as unstructured data. It has also become popular in the literature to analyze emotions and associate them with economic indicators.

However, very few studies analyze nowcasting models that include sentiments or emotions. Combes et al. (2018) is a visionary paper that differs from the literature in terms of the data source used in the nowcasting model. The authors extracted several sentiment indicators from newspapers for use in a nowcasting model for French GDP, and they investigated the effect of the sentiment indicators on the nowcasting model.

The paper of Kaminski and Gloor (2014) is another visionary paper in that it analyzes emotions rather than sentiments. The authors examined micro-blog data to extract several emotion indicators, and they analyzed the effect of these indicators on cryptocurrencies. The studies of Bollen et al. (2010), Zhang et al. (2011), and Si et al. (2013) are further examples of associating economic analysis with emotion indicators obtained from micro-blog platforms.

Several studies in the literature examine the relationship between the communication of central banks and economic indicators. The studies of Lucca and Trebbi (2009), Hansen and McMahon (2015) and Eskici and Koçak (2018) are good examples. However, there seems to be a scarcity of literature in which the emotions extracted from central banks’ speeches and quarterly GDP are considered together in a nowcasting model.

Data

Unstructured data

ECB (2019) provides a speech dataset, with metadata, containing the content of all speeches made by the ECB to assist researchers in the field of central bank communication. The data are currently updated every two months and presented in comma-separated-value format. All related information can be found in ECB (2019).

The data, which I downloaded on 7 July 2020, consist of 2383 speech records. I combine the speeches delivered on the same date and thus obtain a total of 2328 speeches on unique dates. The data contain speeches in several languages, but 92.7% of the speeches are in English. There are also speeches in German, Spanish, French, Italian, Catalan and Dutch. I only take the English speeches into consideration. In detail, the data include 5 variables: \(date,\ speakers,\ title,\ subtitle,\) and \(contents\). The \(date\) variable extends from 1997-02-07 to 2020-01-27 in daily format, and I restrict the time span to the end of 2019 for this paper. The \(speakers\) variable includes 23 speakers with their names and surnames. In addition to those of the ECB presidents, the presentations and speeches of other ECB officials are included in the data. The speeches made by all speakers are taken into consideration because vice presidents, chief economists and other officials deliver a significant number of speeches. The \(subtitle\) variable is excluded because it is unnecessary for the analysis. The \(contents\) variable provides the textual information of the speeches.

Structured data

I use mixed-type data for nowcasting GDP. A summary of the data is given in Table @ref(tab:ektablo), and the data are accessible at https://tinyurl.com/yyvxo2tp.

The data can be divided into three groups: real, survey, and emotion indicators. The real indicators group includes four variables. The first is Euro Area quarterly GDP, which is the target variable of the nowcasting model. Flash estimates of quarterly GDP are usually published with a delay of 45 days after the reference period. The second variable is the final estimate of quarterly GDP, which is generally finalized two years after the reference period. The third variable is the Euro Area monthly industrial production index. The last variable in this group is the Euro Area total turnover index. The latter two variables are published with a minimum delay of 30 days after the reference period. All real indicators cover the period between 1995 and 2019, and they are obtained from the ECB.

There are two variables in the survey indicators group. The first is the Purchasing Managers’ Index™ (PMI™) for the manufacturing sector, published at monthly frequency by IHS Markit. The PMI™ variable covers the period between February 2008 and the end of 2019. The second is the Eurozone Business and Consumer Survey (BCS) sentiment index, which is published at monthly frequency by the European Commission and released 7 days before the end of the reference period. The BCS index covers the period between January 1995 and the end of 2019.

I calculate five emotion indicators (structured) from the unstructured ECB speech data, as described in Section @ref(sec:31). The emotion indicators are at monthly frequency and cover the period between February 1997 and December 2019.

In summary, the mixed-type data include 11 variables at monthly or quarterly frequency and cover the period between January 1995 and December 2019, with missing observations. The data consist entirely of the first estimates of the variables; I ignore subsequent revisions of the indicators.

Method

Text analysis and construction of emotion indicators

This section explains how the unstructured ECB speech data (defined in Section @ref(sec:21)) are analyzed to obtain emotion indicators that represent the emotions of ECB speeches. The approach can be summarized as follows:

  • Pre-processing
  • n-gram analysis (descriptive)
  • Extraction of emotions from the data
  • Time aggregation

Pre-processing aims to prepare the text data for text analysis. It comprises two stages: tokenization and cleaning of the text data. Tokenization is the process of breaking down a text document into tokens, i.e. words (Welbers et al., 2017). The second stage of pre-processing is the determination of the words (stop-words) to be excluded from the text data. This is a recursive process (Loughran and McDonald, 2016). In addition to a long user-defined stop-word list, this paper uses the stop-word lists already available in Benoit et al. (2018), Rinker (2018), Rinker (2020), Benoit et al. (2019), Silge and Robinson (2016) and Feinerer et al. (2008).
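
The two pre-processing stages can be illustrated with a minimal Python sketch. It is not the implementation used in the paper: the regular-expression tokenizer, the function names and the tiny `STOP_WORDS` set are assumptions made purely for illustration, whereas the paper combines several published stop-word lists with a long user-defined one.

```python
import re

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "of", "and", "is", "to", "in", "a"}


def tokenize(text):
    """Lower-case the text and split it into word tokens (unigrams)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop every token that appears in the stop-word list."""
    return [token for token in tokens if token not in stop_words]


sentence = "The stability of prices is essential to growth in the euro area."
tokens = tokenize(sentence)
cleaned = remove_stop_words(tokens)
print(cleaned)
```

Because stop-word selection is recursive, a list such as `STOP_WORDS` would in practice be extended and the cleaning step repeated after inspecting the most frequent remaining tokens.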

The n-gram analysis, which examines how words are used together and how these uses change over time, can provide important information about text data. Although it does not contribute to the process of obtaining emotion indicators, it provides important descriptive information for the ECB speech data. Groups of two or three words (bigrams and trigrams, respectively) as well as single words (unigrams) are examined by n-gram analysis. An \(n-gram\) is a sequence of \(n\) adjacent elements from a string of words (Jurafsky and Martin, 2008). A bigram is an \(n-gram\) for \(n=2\) and a trigram is an \(n-gram\) for \(n=3\). For simplicity, the conditional probability relationship is given in Eq.@ref(eq:bigram) only for a bigram; it provides the conditional probability of a word given the preceding word.

\[\begin{equation} P(W_{n}|W_{n-1}) = \frac{P(W_{n-1},W_{n})}{P(W_{n-1})} (\#eq:bigram) \end{equation}\]

That is, the probability \(P()\) of a word \(W_{n}\) given the preceding word \(W_{n-1}\) is equal to the probability of their bigram, i.e. the co-occurrence of the two words \(P(W_{n-1},W_{n})\), divided by the probability of the preceding word.
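
As an illustration, the bigram conditional probability can be estimated from relative frequencies, in which case the ratio of probabilities reduces to a ratio of counts. The following Python sketch assumes this count-based estimator; the function name and the toy token sequence are hypothetical and are not taken from the ECB speech data.

```python
from collections import Counter


def bigram_conditional_probability(tokens, previous, current):
    """Estimate P(current | previous) as count(previous, current) / count(previous)."""
    # Only words that are followed by another word can start a bigram.
    start_counts = Counter(tokens[:-1])
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    if start_counts[previous] == 0:
        return 0.0
    return bigram_counts[(previous, current)] / start_counts[previous]


tokens = (
    "price stability supports growth and price stability anchors price expectations"
).split()
p = bigram_conditional_probability(tokens, "price", "stability")
print(p)
```

In this toy sequence "price" starts three bigrams and is followed by "stability" in two of them, so the estimate is two thirds.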

To extract emotions from the ECB speech data, I perform emotion analysis, which categorizes the words in the text data according to several pre-defined emotions. This approach differs from sentiment analysis, which categorizes words into symmetric measures such as “positive,” “neutral,” and “negative.” In general, a single pre-defined dictionary is used in emotion analysis; in this paper, however, a new comprehensive dictionary, which is a harmonization of mainly financial-purpose lexicons, is used. The lexicons are as follows:

  • bing dictionary (Hu and Liu, 2004),
  • National Research Council of Canada (NRC) dictionary (Mohammad and Turney, 2013),
  • Lexicoder Sentiment Dictionary (Young and Soroka, 2012),
  • Harvard-IV dictionary as used in the General Inquirer software (Harvard, 2000),
  • Henry’s Financial dictionary (Henry, 2008),
  • Loughran-McDonald Financial dictionary (Loughran and McDonald, 2011),
  • AFINN dictionary (Nielsen, 2011)

All of the lexicons mentioned are based on unigrams. They contain many English words, and the words are assigned to emotions. The Loughran and McDonald (2011) and Mohammad and Turney (2013) lexicons categorize words into emotions, whereas the remaining lexicons are for sentiments. The AFINN lexicon only assigns words a score between -5 and 5, with negative scores indicating negative sentiment and positive scores indicating positive sentiment. Due to this scale problem, I exclude the AFINN dictionary from the analysis.

After harmonization of all available lexicons, there are 17695 words (15186 of them unique) in the new dictionary, and the total number of emotions is 7. These emotions can be listed as positive, negative, uncertainty, anticipation, and constraining. The distribution of words across emotions is given in Table @ref(tab:tablo4). According to this distribution, most words are flagged as either negative or positive.

I extract emotions from the ECB speech data using the methods explained in Plutchik (1962), Plutchik (2001) and Rinker (2019), together with the dictionary explained above. The emotion analysis is carried out at the sentence level, so each emotion’s score is calculated per sentence. The scores lie between 0 (no emotion in the sentence) and 1 (all words used in the sentence represent emotions). It should be noted that emotion words prefixed with ‘un-’ are treated as negations. For example, “unhappy” would be treated as “not happy.”
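
A sentence-level score of this kind can be sketched in Python as the share of words in a sentence that belong to each emotion. The sketch rests on several assumptions: the four-word `LEXICON` is hypothetical, the sentence splitter is a simple regular expression, and a word negated by the ‘un-’ prefix is counted under a separate `<emotion>_negated` key, which is one possible way to handle negation and not necessarily the one used in the paper.

```python
import re
from collections import defaultdict

# Hypothetical mini-lexicon mapping words to emotions (illustration only).
LEXICON = {
    "growth": "positive",
    "happy": "positive",
    "crisis": "negative",
    "risk": "uncertainty",
}


def split_sentences(text):
    """Split a text into sentences after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def emotion_scores(sentence, lexicon=LEXICON):
    """Return, per emotion, the share of words in the sentence carrying it."""
    words = re.findall(r"[a-z]+", sentence.lower())
    counts = defaultdict(int)
    for word in words:
        if word in lexicon:
            counts[lexicon[word]] += 1
        elif word.startswith("un") and word[2:] in lexicon:
            # "unhappy" is read as "not happy": count it as a negated emotion.
            counts[lexicon[word[2:]] + "_negated"] += 1
    if not words:
        return {}
    return {emotion: count / len(words) for emotion, count in counts.items()}


text = "Growth reduces risk. Markets are unhappy."
for sentence in split_sentences(text):
    print(sentence, emotion_scores(sentence))
```

Each score lies between 0 and 1 by construction, since the number of emotion words in a sentence cannot exceed the number of words.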

I obtain the emotion scores on a daily basis since the ECB speech data are at daily frequency. The daily scores must be aggregated to monthly frequency since monthly emotion indicators are used in the nowcasting model defined in Section @ref(sec:32). Various approaches can be used in the aggregation process. For instance, the daily speech texts in a given month can be evaluated as a single text if a speech is made once a month. However, this approach has the disadvantage that the score of a speech made at the end of a month is reflected only in that month. For this reason, I assume that an emotion’s score follows an exponential decay process running from the date of a speech until the next one. Afterward, I perform the aggregation by taking monthly averages of each emotion. Finally, I rescale the monthly emotion indicators to the range from 0 to 100.
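
The time aggregation can be sketched as follows. The decay rate (`decay_rate=0.1` per day), the min-max rescaling and the example scores are assumptions chosen for illustration, since the text specifies neither the decay parameter nor the rescaling formula; the function names are likewise hypothetical.

```python
import math
from collections import defaultdict
from datetime import date, timedelta


def decayed_daily_scores(speech_scores, end, decay_rate=0.1):
    """Carry each speech score forward, with exponential decay, until the next speech."""
    speech_days = sorted(speech_scores)
    daily = {}
    for k, start in enumerate(speech_days):
        # The last speech is carried forward up to and including `end`.
        stop = speech_days[k + 1] if k + 1 < len(speech_days) else end + timedelta(days=1)
        day = start
        while day < stop:
            elapsed = (day - start).days
            daily[day] = speech_scores[start] * math.exp(-decay_rate * elapsed)
            day += timedelta(days=1)
    return daily


def monthly_average(daily):
    """Average the daily scores within each (year, month)."""
    buckets = defaultdict(list)
    for day, value in daily.items():
        buckets[(day.year, day.month)].append(value)
    return {month: sum(values) / len(values) for month, values in sorted(buckets.items())}


def rescale_0_100(series):
    """Min-max rescale a series to the range from 0 to 100."""
    lowest, highest = min(series.values()), max(series.values())
    if highest == lowest:
        return {key: 0.0 for key in series}
    return {key: 100.0 * (value - lowest) / (highest - lowest) for key, value in series.items()}


# Hypothetical scores of one emotion on three speech dates.
speech_scores = {date(2019, 1, 10): 0.30, date(2019, 1, 25): 0.10, date(2019, 2, 20): 0.20}
daily = decayed_daily_scores(speech_scores, end=date(2019, 3, 31))
monthly = monthly_average(daily)
index = rescale_0_100(monthly)
print(index)
```

With the decay in place, March 2019 still receives a (small) value from the speech of 20 February even though no speech is given in March, which is the property motivating the decay assumption.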

Nowcasting model and estimation

I use the dynamic factor model (DFM) specification to build the nowcasting model (Stock and Watson, 2005). In a DFM, a large number of time series are represented by a smaller number of factors, which both carry the features of the time series and reduce their dimension. A DFM can be represented as given in Eq.@ref(eq:main)-@ref(eq:resid):

\[\begin{equation} x_{i,t} = \Lambda_{i}f_{t}+\epsilon_{i,t};\qquad i = 1,\ldots,n, (\#eq:main) \end{equation}\]

\[\begin{equation} f_{t} = \varphi(L)f_{t-1}+\eta_{t};\qquad \eta_{t}\sim i.i.d.\mathcal{N}(0,\sigma^2_{i}) (\#eq:factor) \end{equation}\]

\[\begin{equation} \epsilon_{i,t} = \alpha_{i}\epsilon_{i,t-1}+\upsilon_{i,t};\qquad \upsilon_{i,t}\sim i.i.d.\mathcal{N}(0,Q) (\#eq:resid) \end{equation}\]

Here \(x\) represents the variables standardized in terms of mean and variance, and \(\Lambda\) is the \(n \times r\) matrix of loadings, which links the \(x\)’s to the unobserved factors (\(f\)). There are three factors, namely the real variables factor, the survey variables factor and the emotion variables factor. These factors are assumed to follow a VAR process with \(p\) lags. \(\epsilon_{t}\) denotes the idiosyncratic residuals, which are assumed to follow an AR(1) process.
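
To make the structure of the three equations concrete, the following Python sketch simulates data from a heavily simplified DFM with a single factor and one lag, instead of the three factors and \(p\) lags used in the paper. All parameter values, the innovation standard deviations and the function name are assumptions for illustration; the sketch generates data and does not estimate the model.

```python
import random


def simulate_dfm(loadings, phi, alphas, n_periods, seed=0):
    """Simulate x_{i,t} = lambda_i * f_t + eps_{i,t} with AR(1) factor and residuals."""
    rng = random.Random(seed)
    n = len(loadings)
    f_prev = 0.0
    eps_prev = [0.0] * n
    factors, observations = [], []
    for _ in range(n_periods):
        # Factor equation: f_t = phi * f_{t-1} + eta_t (unit innovation variance assumed).
        f = phi * f_prev + rng.gauss(0.0, 1.0)
        # Idiosyncratic equation: eps_{i,t} = alpha_i * eps_{i,t-1} + upsilon_{i,t}.
        eps = [alphas[i] * eps_prev[i] + rng.gauss(0.0, 0.5) for i in range(n)]
        # Measurement equation: each series loads on the common factor.
        x = [loadings[i] * f + eps[i] for i in range(n)]
        factors.append(f)
        observations.append(x)
        f_prev, eps_prev = f, eps
    return factors, observations


factors, observations = simulate_dfm(
    loadings=[1.0, 0.8, 0.5], phi=0.7, alphas=[0.3, 0.2, 0.1], n_periods=200
)
print(len(factors), len(observations[0]))
```

The three simulated series share one common component, which is the sense in which a small number of factors summarizes a larger panel.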

Mixed-frequency data can be used in a DFM by following the method explained in Camacho and Perez-Quiros (2010). The DFM also allows the use of unbalanced data, which contain missing observations, following the method of Bańbura and Modugno (2014). These authors suggest a solution to the problem of estimating missing observations in system estimation in the case of mixed-frequency data. The transformations of the variables in the DFM are given in the relevant column of Table @ref(tab:ektablo).

The expectation-maximization (EM) algorithm as defined by Bańbura and Modugno (2010) is used to estimate the DFM. I set the lag length of the factors to three according to the Akaike information criterion. As proposed by Doz et al. (2012), I restrict the loadings of the variables on the factors in a subjective manner, following the real/survey/emotion separation explained in Section @ref(sec:22).

Impulse-response functions and variance decomposition analysis

I examine impulse-response functions to present the evolution of GDP in reaction to shocks in the three factors estimated in the Emotion-Included model. They measure the changes in the future responses of GDP in the Emotion-Included model when the factors are shocked by an impulse of one standard deviation. I obtain the impulse-response functions by Cholesky factorization of \(Q\), i.e. \(Q = AA'\), where \(Q\) is the covariance matrix of the innovations of Eq.@ref(eq:main). First, the moving average (MA) representation of Eq.@ref(eq:main) is obtained. Then, a new error vector \(\tilde{\epsilon}_{t} = A^{-1}\epsilon_{t}\) is defined as a linear transformation of the old error vector \(\epsilon_{t}\). The coefficients in the MA representation measure the impulse responses, as defined in Eq.@ref(eq:imp1)-@ref(eq:imp3).

\[\begin{equation} x_{i,t} = \epsilon_{i,t} + \phi \epsilon_{i,t-1} + {\phi}^2 \epsilon_{i,t-2} + \dots + {\phi}^j \epsilon_{i,t-j} (\#eq:imp1) \end{equation}\]

\[\begin{equation} x_{i,t} = AA^{-1}\epsilon_{i,t} + \phi AA^{-1} \epsilon_{i,t-1} + {\phi}^2 AA^{-1} \epsilon_{i,t-2} + \dots + {\phi}^j AA^{-1} \epsilon_{i,t-j} (\#eq:imp2) \end{equation}\]

\[\begin{equation} x_{i,t} = A \tilde{\epsilon}_{i,t} + \phi A\tilde{\epsilon}_{i,t-1} + {\phi}^2 A\tilde{\epsilon}_{i,t-2} + \dots + {\phi}^j A\tilde{\epsilon}_{i,t-j} (\#eq:imp3) \end{equation}\]

Eq.@ref(eq:imp1)-@ref(eq:imp3) imply that the impulse response to the orthogonal error \(\tilde{\epsilon}_{t}\) after \(j\) periods is the \(j^{th}\) orthogonal impulse response, which equals \({\phi}^j A\), where \(A\) satisfies \(Q = AA'\).
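
The computation of \({\phi}^j A\) can be sketched in plain Python for a small system. The example treats \(\phi\) as the coefficient matrix of a two-variable VAR(1); the matrices `phi` and `q` are hypothetical values chosen so that the Cholesky factor is easy to verify by hand, and the helper functions are written out only to keep the sketch self-contained.

```python
import math


def cholesky(q):
    """Return the lower-triangular A with Q = A A' for a positive definite Q."""
    n = len(q)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(a[i][k] * a[j][k] for k in range(j))
            if i == j:
                a[i][j] = math.sqrt(q[i][i] - partial)
            else:
                a[i][j] = (q[i][j] - partial) / a[j][j]
    return a


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def orthogonal_irf(phi, q, horizon):
    """Return the list [A, phi A, phi^2 A, ...] up to the given horizon."""
    a = cholesky(q)
    n = len(phi)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    responses = []
    for _ in range(horizon + 1):
        responses.append(matmul(power, a))
        power = matmul(power, phi)
    return responses


phi = [[0.5, 0.1], [0.0, 0.3]]  # hypothetical VAR(1) coefficients
q = [[4.0, 2.0], [2.0, 2.0]]    # hypothetical innovation covariance
irf = orthogonal_irf(phi, q, horizon=2)
print(irf[1])
```

Element \((i,k)\) of the \(j^{th}\) matrix is the response of variable \(i\), \(j\) periods ahead, to a one-standard-deviation orthogonal shock in variable \(k\). Because the Cholesky factor is lower triangular, the results depend on the ordering of the variables.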

The variance decomposition analysis measures the amount of information each factor contributes to GDP in the Emotion-Included model. It determines how much of the forecast error variance of GDP can be explained by exogenous shocks to the three factors. The variance decomposition of the forecast errors is calculated as follows. The amount of forecast error variance of factor \(i\) accounted for by exogenous shocks to \(GDP\) is measured by \(\omega_{i,GDP,h}\) for the \(h\)-step forecast, as shown in Eq.@ref(eq:vardecomp).

\[\begin{equation} \omega_{i,GDP,h} = \sum_{j=0}^{h-1}(\phi_{i} A\tilde{\epsilon}_{i}\phi_{GDP})^2/MSE[x_{i,t}(h)] (\#eq:vardecomp) \end{equation}\]

where the mean squared error of the h-step forecast of variable \(i\) is given in Eq.@ref(eq:vardecomp2).

\[\begin{equation} MSE[x_{i,t}(h)] = \sum_{j=0}^{h-1}\phi_{i}Q\phi_{i}' (\#eq:vardecomp2) \end{equation}\]
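
For illustration, the decomposition can be computed from orthogonal impulse-response matrices using the standard textbook formulation, in which the share of variable \(i\)'s \(h\)-step forecast error variance due to shock \(k\) is the sum of squared responses to that shock divided by the sum of squared responses to all shocks. This is a sketch under that assumption and may differ in notation from the equations above; the response matrices are hypothetical values for a two-variable system.

```python
def fevd(responses, horizon):
    """Return shares[i][k]: part of variable i's h-step forecast error variance due to shock k."""
    n = len(responses[0])
    shares = []
    for i in range(n):
        # Sum of squared orthogonal responses of variable i to shock k over j = 0..h-1.
        contributions = [
            sum(responses[j][i][k] ** 2 for j in range(horizon)) for k in range(n)
        ]
        mse = sum(contributions)  # h-step forecast MSE of variable i
        shares.append([c / mse for c in contributions])
    return shares


# Hypothetical orthogonal impulse responses at horizons 0 and 1.
responses = [
    [[2.0, 0.0], [1.0, 1.0]],
    [[1.1, 0.1], [0.3, 0.3]],
]
shares = fevd(responses, horizon=2)
print(shares)
```

Each row of `shares` sums to one, so the entries can be read directly as percentages of the forecast error variance attributed to each shock.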

Revision analysis

The revision analysis is important for showing how the nowcast/forecast values change in response to the information arriving in new data releases. I measure the effect of the emotion indicators group on the revisions of the GDP nowcasts. I follow the approach suggested by Basselier et al. (2017) to extract model-based revisions in the nowcasting framework. In this case, \(y_{t}^{Q}\) is quarterly GDP growth at time \(t\), and \(\Omega_{v}\) is the data set at time \(v\). Hence, the quarterly GDP growth nowcast is the expected value of \(y_{t}^{Q}\) given the available information, \(E[y_{t}^{Q}|\Omega_{v}]\). The new nowcast value can be decomposed as:

\[\begin{equation} \underbrace{E[y_{t}^{Q}|\Omega_{v+1}]}_\text{New nowcast} =\underbrace{E[y_{t}^{Q}|\Omega_{v}]}_\text{Old nowcast} +\underbrace{E[y_{t}^{Q}|I_{v+1}]}_\text{Revision} (\#eq:news1) \end{equation}\]

where \(I_{v+1}\) is the new information and it is orthogonal to \(\Omega_{v}\). Therefore, the revision can be explained as a weighted sum of revisions from the updated variables at time \(v+1\).

\[\begin{equation} E[y_{t}^{Q}|I_{v+1}] = \sum_{j \in J_{v+1}}b_{j,v+1}(x_{i_{j},t_{j}}-E[x_{i_{j},t_{j}}|\Omega_{v}]) (\#eq:news2) \end{equation}\]

where \(b_{j,v+1}\) represents the weights, which measure the marginal contribution of each indicator release to the new value of the nowcast. Using the Emotion-Included model structure, the loadings of the model are re-estimated by adding new data for each month from January 2019 to December 2019.
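
The arithmetic of the decomposition can be illustrated with a short Python sketch. The weights, the released and expected values, and the old nowcast are all hypothetical numbers; in the model the weights \(b_{j,v+1}\) and the expectations come from the estimated DFM, which is not reproduced here.

```python
def nowcast_revision(old_nowcast, releases):
    """Split a nowcast revision into contributions by indicator group."""
    contributions = {}
    for release in releases:
        # News: released value minus the value the model expected beforehand.
        news = release["actual"] - release["expected"]
        group = release["group"]
        contributions[group] = contributions.get(group, 0.0) + release["weight"] * news
    revision = sum(contributions.values())
    return old_nowcast + revision, revision, contributions


# Hypothetical releases arriving between two nowcast rounds.
releases = [
    {"group": "real", "weight": 0.20, "actual": 1.0, "expected": 0.5},
    {"group": "survey", "weight": 0.05, "actual": 51.0, "expected": 52.0},
    {"group": "emotion", "weight": 0.01, "actual": 40.0, "expected": 39.0},
]
new_nowcast, revision, contributions = nowcast_revision(0.30, releases)
print(new_nowcast, revision, contributions)
```

A release that matches the model's expectation carries no news and therefore leaves the nowcast unchanged, whatever its weight.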

Findings

Descriptive statistics for the ECB’s speech data are presented in Table @ref(tab:tablo1). The number of speeches has increased over the years. On the other hand, the annual mean number of words used per speech has not changed much over the years.

Table @ref(tab:tablo3) presents the unigrams, bigrams and trigrams together over the years. When the n-grams are analyzed by year, it is easy to understand which economic topics were discussed in the relevant year. According to Table @ref(tab:tablo3), the terms “stability” and “foreign exchange” were the most frequently used during the 1997 Asian financial crisis. The phrases “stability,” “foreign exchange markets,” and “cross border payments” were underlined during and after the 1998 Russian financial crisis. “Growth,” “securities,” and “structural reforms” were emphasized between 2000 and 2003. From 2003 to 2005, Trichet used the terms “real GDP growth” and “integration,” but a noteworthy point is the emphasis on “labor productivity growth” and “cross border” in the years before the 2009 global economic crisis (2005-2008).

Then, between 2009 and 2011, the global economic crisis and the measures taken were frequently mentioned, and the terms “systemic risk,” “macroprudential supervision” and “sovereign debt crisis” were used. Between 2012 and 2016, “fiscal” issues were emphasized, and the speeches dealt with “macro prudential,” “banking union,” and “liquidity.” The most frequent n-grams are “asset purchase programmes” and “capital markets union.”

I estimate two DFMs to measure the marginal contribution of the emotion indicators group to the nowcasting model for Euro Area GDP. The “Benchmark” model is the base model, in which the emotion indicators group is not included; that is, only the real and survey groups of indicators are represented, as two factors, in the model. The second is the “Emotion-Included” model, in which the emotion indicators group is included as a single factor in addition to the two factors of the Benchmark model. The two models are estimated over the full time span, from January 1995 to December 2019. Table @ref(tab:tablo6) presents the RMSE values for the two DFMs and the gain of the Emotion-Included model with respect to the Benchmark model. The RMSE is 0.0019 in the Benchmark model and 0.00171 in the Emotion-Included model. Thus, the Emotion-Included model achieves a gain of 9.7% compared with the Benchmark model. This result indicates that the error of the nowcasting model can be expected to decrease by approximately 10% when the emotion indicators obtained from ECB speeches are included in the model.
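
The RMSE and the relative gain can be computed as in the following Python sketch, where the gain is assumed to be the percentage reduction in RMSE relative to the Benchmark model. The function names are hypothetical, and the inputs to `relative_gain` are the rounded RMSE values quoted in the text.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_gain(rmse_benchmark, rmse_alternative):
    """Percentage reduction in RMSE relative to the benchmark."""
    return 100.0 * (rmse_benchmark - rmse_alternative) / rmse_benchmark


gain = relative_gain(0.0019, 0.00171)
print(round(gain, 1))
```

With these rounded inputs the gain comes out at about 10%, close to the reported 9.7%; the small difference is presumably due to rounding of the published RMSE values.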

Furthermore, I estimate these two DFMs recursively, increasing the number of observations by one period at a time from January 2019 to December 2019, to directly observe the contribution of the emotion indicators to the nowcast figures. Table @ref(tab:tablo7) presents the quarterly GDP growth rates and the nowcast figures produced by the Benchmark model and the Emotion-Included model for 2019 by recursive estimation. The nowcast values of the Emotion-Included model appear to be closer to the published quarterly GDP growth rates than those of the Benchmark model. I calculate from Table @ref(tab:tablo7) that the RMSE of the Emotion-Included model (0.00249) is lower than that of the Benchmark model (0.00269).

Evaluating Tables @ref(tab:tablo6) and @ref(tab:tablo7) together, it can be claimed that including the emotion indicators group as a single factor in the Benchmark model creates a noticeable gain in RMSE and in the nowcasting performance for GDP. From the perspective of policy makers, the use of emotion indicators implied by ECB speeches makes a tangible contribution to the process of nowcasting Euro Area GDP growth by reducing the RMSE. As expected, the contribution of the emotion indicators group on top of the real and survey groups is not very high in terms of RMSE, but it has a positive effect in terms of proximity to the flash estimates.

I examine the impulse-response functions and the forecast error variance decomposition using the estimation results of the Emotion-Included model. Figure @ref(fig:impresp) presents the results of the impulse-response analysis. Each graph in Figure @ref(fig:impresp) shows the response of the quarterly GDP growth rate to a one-standard-deviation innovation in the relevant factor, i.e. the real variables factor, the survey variables factor or the emotion variables factor. The two graphs in the upper part of Figure @ref(fig:impresp) show that the responses of the quarterly GDP growth rate to the real and survey variables factors are positive and significant in the short term. In contrast, an impulse in the emotion variables factor causes a significant response in the quarterly GDP growth rate at longer horizons. The sign of this response is not one-directional because the emotion variables factor includes opposite emotions, such as positive and negative.

Orthogonal Impulse-responses from Factors to GDP

The forecast error variance decomposition of the quarterly GDP growth rate is presented in Figure @ref(fig:vardecomp), using the estimates of the Emotion-Included model. The forecast error variance of the quarterly GDP growth rate is mainly explained, in descending order, by GDP itself, the real variables, the survey variables and the emotion variables. The share of the variance explained by the emotion variables factor is minimal but noticeable, and it increases at longer forecast horizons.

Orthogonal Forecast Error Variance Decomposition of GDP

Finally, the results of the revision analysis based on the Emotion-Included model estimates are given in Table @ref(tab:tablo8). The first column in Table @ref(tab:tablo8) shows the month in which the Emotion-Included model is run using the dataset available until that month. The second column gives the quarter for which the nowcast of quarterly GDP growth is produced. The third column shows the nowcast value produced in the previous month. Columns 4 to 7 give the total revision in the nowcast of quarterly GDP growth and the sources of the revision by factor, due to the new information available in the variables. The last column is the final value of the nowcast for the relevant quarter in that month.

Table @ref(tab:tablo8) shows that the effects of the emotion indicators are negligible in the revision of the quarterly GDP growth rate nowcasts. The main source of the revisions is the real variables group. Therefore, it can be suggested that adding emotion indicators to the nowcasting model does not increase the variance of the nowcast values.

Conclusion

Recently, the inclusion of information extracted from microblogs and internet platforms in forecast or nowcast models for economic variables has been discussed in the literature. On the other hand, evaluating the impact of speech information on economic models appears to be a new subject.

In this paper, I investigate the effect of speeches made by ECB officials on nowcasting Euro Area quarterly GDP. The descriptive analysis of the speech data shows that the number of speeches increased between 1997 and 2019, and it can be claimed that the words and word groups frequently used in the speeches may reflect the economic conditions of those periods. Then, I calculate five emotion indicators (indices between 0 and 100) by applying the emotion analysis method to the speech data.

I observe that the emotion indicators make an effective marginal contribution by reducing the RMSE of the nowcasting model, specified as a dynamic factor model, for Euro Area quarterly GDP. The emotion indicators also provide a noticeable improvement in the nowcasts. Impulse-response functions and forecast error variance decomposition analysis show that the impact of the emotion indicators on the nowcasts is noticeable in the long term, not in the short term. Finally, the revision analysis indicates that the emotion indicators have a negligible effect on the revisions of the nowcasts.

References

Baker, S.R., Bloom, N., Davis, S.J., 2016. Measuring Economic Policy Uncertainty. The Quarterly Journal of Economics 131, 1593–1636.
Bańbura, M., Modugno, M., 2014. Maximum likelihood estimation of factor models on datasets with arbitrary pattern of missing data. Journal of Applied Econometrics 29, 133–160.
Basselier, R., de Antonio Liedo, D., Langenus, G., 2017. Nowcasting real economic activity in the euro area: Assessing the impact of qualitative surveys (Working Paper Research No. 331). National Bank of Belgium.
Benoit, K., Muhr, D., Watanabe, K., 2019. Stopwords: Multilingual stopword lists.
Benoit, K., Watanabe, K., Wang, H., Nulty, P., Obeng, A., Müller, S., Matsuo, A., 2018. Quanteda: An r package for the quantitative analysis of textual data. Journal of Open Source Software 3, 774.
Bollen, J., Mao, H., Zeng, X.-J., 2010. Twitter mood predicts the stock market. CoRR abs/1010.3003.
Bortoli, C., Combes, S., 2015. Contribution from google trends for forecasting the short-term economic outlook in france: Limited avenues. Institut National de la Statistique et des Études Économiques. Available online: https://www.insee.fr/en/statistiques/1408911.
Camacho, M., Perez-Quiros, G., 2010. Introducing the euro-sting: Short-term indicator of euro area growth. Journal of Applied Econometrics 25, 663–694.
Combes, S., Renault, T., Bortoli, C., 2018. Nowcasting GDP growth by reading newspapers.
Doz, C., Giannone, D., Reichlin, L., 2012. A quasi–maximum likelihood approach for large, approximate dynamic factor models. Review of economics and statistics 94, 1014–1024.
ECB, 2019. Speeches dataset.
Eskici, H.B., Koçak, N.A., 2018. A text mining application on monthly price developments reports. Central Bank Review 18, 51–60.
Feinerer, I., Hornik, K., Meyer, D., 2008. Text mining infrastructure in r. Journal of Statistical Software 25, 1–54.
Fondeur, Y., Karamé, F., 2013. Can google data help predict french youth unemployment? Economic Modelling 30, 117–125.
D’Amuri, F., Marcucci, J., 2017. The predictive power of google searches in forecasting US unemployment. International Journal of Forecasting 33, 801–816.
Hansen, S., McMahon, M., 2015. Shocking Language: Understanding the macroeconomic effects of central bank communication (Discussion Papers No. 1537). Centre for Macroeconomics (CFM).
Harvard, U., 2000. Harvard IV-4 dictionary.
Henry, E., 2008. Are investors influenced by how earnings press releases are written? The Journal of Business Communication (1973) 45, 363–407.
Hu, M., Liu, B., 2004. Mining and summarizing customer reviews, in: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’04. ACM, New York, NY, USA, pp. 168–177.
Jurafsky, D., Martin, J.H., 2008. Speech and language processing: An introduction to speech recognition, computational linguistics and natural language processing. Upper Saddle River, NJ: Prentice Hall.
Kaminski, J., Gloor, P.A., 2014. Nowcasting the bitcoin market with twitter signals. CoRR abs/1406.7577.
Loughran, T., McDonald, B., 2011. When is a liability not a liability? Textual analysis, dictionaries, and 10-ks. The Journal of Finance 66, 35–65.
Loughran, T., McDonald, B., 2016. Textual analysis in accounting and finance: A survey. Journal of Accounting Research 54, 1187–1230.
Lucca, D.O., Trebbi, F., 2009. Measuring Central Bank Communication: An Automated Approach with Application to FOMC Statements (NBER Working Papers No. 15367). National Bureau of Economic Research, Inc.
Bańbura, M., Modugno, M., 2010. Maximum likelihood estimation of factor models on data sets with arbitrary pattern of missing data. European Central Bank Working Paper Series.
McLaren, N., Shanbhogue, R., 2011. Using internet search data as economic indicators. Bank of England Quarterly Bulletin 51, 134–140.
Mohammad, S.M., Turney, P.D., 2013. Crowdsourcing a word–emotion association lexicon. Computational Intelligence 29, 436–465.
Nielsen, F.Å., 2011. A new ANEW: Evaluation of a word list for sentiment analysis in microblogs. CoRR abs/1103.2903.
Plutchik, R., 1962. The emotions: Facts, theories and a new model. New york, NY, US. Crown Publishing Group/Random House.
Plutchik, R., 2001. The nature of emotions: Human emotions have deep evolutionary roots, a fact that may explain their complexity and provide tools for clinical practice. American scientist 89, 344–350.
Rinker, T.W., 2018. lexicon: Lexicon data. Buffalo, New York.
Rinker, T.W., 2019. sentimentr: Calculate text polarity sentiment. Buffalo, New York.
Rinker, T.W., 2020. qdap: Quantitative discourse analysis package. Buffalo, New York.
Si, J., Mukherjee, A., Liu, B., Li, Q., Li, H., Deng, X., 2013. Exploiting topic based twitter sentiment for stock prediction, in: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, Sofia, Bulgaria, pp. 24–29.
Silge, J., Robinson, D., 2016. Tidytext: Text mining and analysis using tidy data principles in r. JOSS 1.
Stock, J.H., Watson, M.W., 2005. Implications of dynamic factor models for VAR analysis. National Bureau of Economic Research.
Varian, H., Choi, H., 2009. Predicting the present with google trends. Economic Record 88.
Welbers, K., Atteveldt, W.V., Benoit, K., 2017. Text analysis in r. Communication Methods and Measures 11, 245–265.
Young, L., Soroka, S., 2012. Affective news: The automated coding of sentiment in political texts. Political Communication 29, 205–231.
Zhang, X., Fuehres, H., Gloor, P.A., 2011. Predicting stock market indicators through twitter “i hope it is not as bad as i fear.” Procedia - Social and Behavioral Sciences 26, 55–62.