Introduction and data import

This R file generates the daily and weekly speech databases.

Speech database

Sentence-level database

Data import

We use the database 2b_cbdc_sentence_preds_with_author_desc.csv downloaded from the Kaggle notebook. We also use the countries2.csv file, which maps the authors of speeches to their Central Bank and Country. Some speeches have multiple authors. Fortunately, these cases are rare, and the co-authors always belong to the same Central Bank and the same Country. We retrieve their information manually and add it directly.
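The author-to-Central-Bank mapping described above can be sketched as a left join. This is a minimal illustration on toy data, not the authors' script: the column names (author, central_bank, country) and the intermediate object sent_pred are assumptions, and in the real workflow the two data frames would come from reading the two .csv files.

```r
# Toy stand-ins for the two input files (column names are assumptions).
sent_pred <- data.frame(
  author   = c("Author A", "Author B", "Author A"),
  sentence = c("s1", "s2", "s3"),
  stringsAsFactors = FALSE
)
countries <- data.frame(
  author       = c("Author A", "Author B"),
  central_bank = c("Bank X", "Bank Y"),
  country      = c("Country X", "Country Y"),
  stringsAsFactors = FALSE
)

# Left join: keep every sentence, even when its author has no match yet.
sent_pred2 <- merge(sent_pred, countries, by = "author", all.x = TRUE)

# Authors left without a match (e.g. multi-author speeches) are the ones
# that would need to be completed by hand.
unmatched <- unique(sent_pred2$author[is.na(sent_pred2$central_bank)])
print(unmatched)
```

Listing the unmatched authors first makes the manual step explicit and easy to audit.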

We save the complete list of speeches in a .csv file (completelist.csv). Note that the database uses loose keyword conditions in order to capture all speeches related to CBDC. As a result, some sentences come from titles and/or references, or relate exclusively to crypto-assets (see for example the first observations above).

write.table(sent_pred2,
            file = "C:/Users/fkraus/Desktop/Articles/effects of CB speeches on stablecoins/2025/completelist.csv",
            sep = ";", dec = ".",
            row.names = FALSE, col.names = TRUE,
            fileEncoding = "UTF-8")

We manually annotate each title. The dataset list.csv is the same as completelist.csv, with an additional column cbdc_sentence that equals FALSE when a sentence matches the keywords only in a title or in the references. We then remove these “wrong” sentences and manually add the corresponding currency for each Central Bank.
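As a minimal sketch of this filtering step, assuming cbdc_sentence is read as a logical column (the toy sentences and the object name list_df are illustrative only):

```r
# Toy version of list.csv: cbdc_sentence is FALSE for title/reference hits.
list_df <- data.frame(
  sentence      = c("title mention", "cbdc could improve payments", "reference entry"),
  cbdc_sentence = c(FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Keep only genuine CBDC sentences; %in% TRUE also drops any NA flags.
clean <- list_df[list_df$cbdc_sentence %in% TRUE, , drop = FALSE]
print(nrow(clean))
```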

This filtering yields a total of 5,397 CBDC-related sentences (compared with 5,376 sentences in Zafar (2025), which applies the MLM to speeches up to 2024). These sentences appear in 652 speeches authored by 166 central bankers from 59 countries and 49 central banks.

Statistics on sentence-level data

We now have a clean sentence-level database that contains, for each sentence, the type_label (retail, wholesale, general/unspecified), the sentiment_label (positive, negative, neutral), the stance_label (Pro-CBDC, Anti-CBDC, Wait-and-See) and the discourse_label (Risk-Benefit, Feature, Process), as predicted by BERT.

Summary statistics of sentence-level database

Dimension   Category              Number of sentences   Share within dimension
Discourse   Process               2,381                 44.1%
            Feature               1,543                 28.6%
            Risk-Benefit          1,473                 27.3%
Sentiment   neutral               3,029                 56.1%
            positive              1,937                 35.9%
            negative                431                  8.0%
Stance      Pro-CBDC              2,813                 52.1%
            Wait-and-See          2,051                 38.0%
            Anti-CBDC               533                  9.9%
Type        Retail CBDC           2,854                 52.9%
            General/Unspecified   2,114                 39.2%
            Wholesale CBDC          429                  7.9%

In total, there are 5,397 unique sentences, mostly about Retail CBDC (53% of sentences), while unspecified (39%) and wholesale (8%) CBDC are mentioned less often. Overall, sentiment is mostly neutral (56%), while stance is mostly Pro-CBDC (52%). In addition, we can see that there are fewer sentences about wholesale CBDC, but they tend to be more positive/Pro-CBDC:
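The shares within a dimension can be recomputed from the counts. The sketch below uses the Discourse counts reported in the summary table; with the full database, the counts would instead come from something like table() applied to the discourse_label column.

```r
# Discourse counts taken from the summary table above.
counts <- c(Process = 2381, Feature = 1543, "Risk-Benefit" = 1473)

# Share of each category within the dimension, in percent.
shares <- round(100 * prop.table(counts), 1)
print(shares)
```

The three counts sum to 5,397, the total number of sentences, so each dimension partitions the whole database.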

In addition, we can observe that some Central Banks, notably the ECB, tend to communicate more than others:

Day/Week-level database

We have the sentiment, stance, discourse and type of CBDC discussed at the sentence level. We convert the sentiments (positive, neutral and negative) and stances (Pro-CBDC, Wait-and-See and Anti-CBDC) into numeric values (1, 0, -1) at the sentence level.
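One simple way to express this recoding is a named lookup vector. The label columns follow the names used earlier (sentiment_label, stance_label); the output column names sent_score and stance_score are assumptions made for illustration.

```r
# Lookup tables from label to numeric score.
sent_map   <- c(positive = 1, neutral = 0, negative = -1)
stance_map <- c("Pro-CBDC" = 1, "Wait-and-See" = 0, "Anti-CBDC" = -1)

df <- data.frame(
  sentiment_label = c("positive", "neutral", "negative"),
  stance_label    = c("Pro-CBDC", "Anti-CBDC", "Wait-and-See"),
  stringsAsFactors = FALSE
)

# Index the lookup vectors by label; unname() drops the label names.
df$sent_score   <- unname(sent_map[df$sentiment_label])
df$stance_score <- unname(stance_map[df$stance_label])
print(df)
```

An unexpected label would map to NA rather than silently to a score, which makes labelling errors easy to spot.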

As this table shows, the sentence “the bank of japan has no specific plan at present to issue digital currencies as a substitute for banknotes.” (second line) is associated with a negative stance towards CBDC.

There are some discrepancies between sentiment and stance, yet the two scores are relatively highly correlated at the sentence level:
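A hedged sketch of how such a comparison might be computed: a cross-tabulation shows where the two scores disagree, and a Pearson correlation summarizes their co-movement. The toy scores and column names are illustrative, and the choice of Pearson correlation is an assumption.

```r
# Toy sentence-level scores (1, 0, -1); the last sentence is a discrepancy.
scores <- data.frame(
  sent_score   = c(1, 0, -1, 1, 0),
  stance_score = c(1, 0, -1, 1, 1)
)

# Off-diagonal cells count sentences where sentiment and stance differ.
agreement <- table(sentiment = scores$sent_score, stance = scores$stance_score)
print(agreement)

# Pearson correlation between the two numeric scores.
rho <- cor(scores$sent_score, scores$stance_score)
print(rho)
```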

All speeches

Then, for each speech, we compute sentiment and stance as the average of the sentence-level scores across all sentences in that speech.

Our results show a total of 652 speeches, consistent with our previous observations. We then aggregate sentiment and stance to the daily level as their average across all speeches given that day. We find a total of 528 days with at least one CBDC-related speech.
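The two aggregation steps (sentences to speeches, then speeches to days) can be sketched with aggregate(). Column names such as speech_id, sent_score and stance_score are assumptions used for illustration.

```r
# Toy sentence-level data: speeches A and B on the same day, C two days later.
sentences <- data.frame(
  speech_id    = c("A", "A", "B", "C", "C"),
  date         = as.Date(c("2020-01-06", "2020-01-06", "2020-01-06",
                           "2020-01-08", "2020-01-08")),
  sent_score   = c(1, 0, 1, -1, 0),
  stance_score = c(1, 1, 0, 0, -1)
)

# Step 1: average the sentence scores within each speech.
speech <- aggregate(cbind(sent_score, stance_score) ~ speech_id + date,
                    data = sentences, FUN = mean)

# Step 2: average the speech-level scores within each day.
daily <- aggregate(cbind(sent_score, stance_score) ~ date,
                   data = speech, FUN = mean)
print(daily)
```

Averaging speeches rather than pooling sentences gives each speech equal weight within a day, regardless of how many CBDC sentences it contains.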

Here, we plot the 15-day moving average of stance in red and of sentiment in blue. The two series have similar dynamics overall. We also plot the sentiment from Auer et al. (2023) in orange.
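The exact window definition is not stated here. The sketch below assumes a trailing window of 15 calendar days, averaged over the speech days that fall inside it; a centered window or a window of 15 observations would be equally plausible readings. The function name moving_avg is hypothetical.

```r
# Trailing moving average over the last `window` calendar days (assumption).
moving_avg <- function(dates, values, window = 15) {
  vapply(seq_along(dates), function(i) {
    in_window <- dates > dates[i] - window & dates <= dates[i]
    mean(values[in_window], na.rm = TRUE)
  }, numeric(1))
}

dates  <- as.Date(c("2020-01-01", "2020-01-10", "2020-01-20"))
values <- c(1, 0, -1)

ma <- moving_avg(dates, values)
print(ma)
```

With this definition, the value for 2020-01-20 averages only the observations of 10 and 20 January, because 1 January lies outside the 15-day window.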

Compared with expert scores, the ML framework has several advantages: more observations, more consistent scoring, finer granularity (sentence level rather than speech level), and the ability to add new observations easily.

We also decompose by type of CBDC discussed within each day:

Here, we plot the 15-day moving average of sentiment (the stance has similar dynamics) decomposed by type of CBDC discussed. While speeches about wholesale CBDC start relatively late compared with those about unspecified and retail CBDCs, their sentiment is also higher.
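A decomposition by type can be sketched as a date-by-type matrix of average scores. For brevity this toy example averages sentences directly within each day and type, which is a simplifying assumption; the column names are illustrative.

```r
# Toy sentence-level data with a CBDC type per sentence.
sentences <- data.frame(
  date       = as.Date(c("2020-01-06", "2020-01-06", "2020-01-06", "2020-01-08")),
  type_label = c("Retail CBDC", "Retail CBDC", "Wholesale CBDC",
                 "General/Unspecified"),
  sent_score = c(1, 0, 1, 0),
  stringsAsFactors = FALSE
)

# One row per day, one column per type; empty day/type cells are NA.
by_type <- tapply(sentences$sent_score,
                  list(sentences$date, sentences$type_label),
                  mean)
print(by_type)
```

In this sketch, a day on which a given type is never discussed shows up as NA rather than 0, so the two cases stay distinguishable.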

We also decompose by the discourse associated with the speech:

We also combine type and discourse for both sentiment and stance:

We also decompose by Central Bank. Because of data constraints, we group Central Banks in several ways. First, we distinguish between Developed Markets Central Banks and Emerging Markets Central Banks:

Here, the 15-day moving average of sentiment shows some heterogeneity between developed and emerging markets: emerging markets tend to communicate more positively about CBDC.

We also group three major Central Banks (European Central Bank, Fed and Bank of Japan) and compare them with the others:

We also compare the ECB against the rest:

In a significant number of speeches, CBDC is only one topic among others and, even though the sentiment or stance appears positive in a given speech, investors may not perceive this as an important signal. We therefore also construct a database restricted to speeches that explicitly mention CBDC in their title, which indicates that the subject is central to the speech. This has been done by another paper (find reference).
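If the title flag were built automatically rather than by hand, a case-insensitive pattern match would be one option. The keyword pattern, the toy titles and the column name cbdctitle below are assumptions for illustration only.

```r
# Toy speech titles.
speeches <- data.frame(
  title = c("The digital euro and CBDC design",
            "Monetary policy outlook",
            "Central bank digital currencies: next steps"),
  stringsAsFactors = FALSE
)

# Hypothetical keyword pattern; a real list would likely be longer.
pattern <- "cbdc|central bank digital currenc"

# TRUE when the title explicitly mentions CBDC.
speeches$cbdctitle <- grepl(pattern, speeches$title, ignore.case = TRUE)
title_speeches <- speeches[speeches$cbdctitle, , drop = FALSE]
print(title_speeches$title)
```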

Finally, we merge everything into a single database:

Here, days without a speech (speechday == 0) are treated as NAs. However, when combining speech data with stablecoin supply, we replace NAs with zeros.
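The treatment of days without a speech can be sketched by joining the daily index onto a full calendar. The names index_sent and speechday follow the text; index_sent_zero is a hypothetical name for the zero-filled version used when merging with stablecoin supply.

```r
# Toy daily index with a gap on 2020-01-07.
daily <- data.frame(
  date       = as.Date(c("2020-01-06", "2020-01-08")),
  index_sent = c(0.75, -0.5)
)

# Full daily calendar between the first and last speech day.
calendar <- data.frame(date = seq(min(daily$date), max(daily$date), by = "day"))

# Left join: days without a speech get NA.
full <- merge(calendar, daily, by = "date", all.x = TRUE)
full$speechday <- as.integer(!is.na(full$index_sent))

# Zero-filled copy for use alongside other daily series.
full$index_sent_zero <- ifelse(is.na(full$index_sent), 0, full$index_sent)
print(full)
```

Keeping both columns preserves the distinction between "no speech" (NA) and "neutral speech" (0), which the zero-filled version alone would blur.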

Statistics of daily speeches

Characteristic N = 3,398
Date (min, max) 2016-03-02, 2025-06-20
index_sent
    Mean (SD) 0.35 (0.42)
    Unknown 2,870
index_stance
    Mean (SD) 0.51 (0.43)
    Unknown 2,870
stance_auer
    -1 30 (9.9%)
    -0.5 1 (0.3%)
    0 96 (32%)
    0.333333333333333 1 (0.3%)
    0.5 13 (4.3%)
    0.666666666666667 5 (1.7%)
    1 156 (52%)
    Unknown 3,096
index_sent_type_general_unspecified
    Mean (SD) 0.19 (0.39)
    Unknown 2,870
index_sent_type_retail_cbdc
    Mean (SD) 0.20 (0.39)
    Unknown 2,870
index_sent_type_wholesale_cbdc
    Mean (SD) 0.23 (0.41)
    Unknown 2,870
index_stance_type_general_unspecified
    Mean (SD) 0.36 (0.44)
    Unknown 2,870
index_stance_type_retail_cbdc
    Mean (SD) 0.29 (0.46)
    Unknown 2,870
index_stance_type_wholesale_cbdc
    Mean (SD) 0.27 (0.43)
    Unknown 2,870
index_sent_disc_feature
    Mean (SD) 0.28 (0.39)
    Unknown 2,870
index_sent_disc_process
    Mean (SD) 0.24 (0.36)
    Unknown 2,870
index_sent_disc_risk_benefit
    Mean (SD) 0.16 (0.48)
    Unknown 2,870
index_stance_disc_feature
    Mean (SD) 0.40 (0.44)
    Unknown 2,870
index_stance_disc_process
    Mean (SD) 0.41 (0.41)
    Unknown 2,870
index_stance_disc_risk_benefit
    Mean (SD) 0.20 (0.50)
    Unknown 2,870
index_sent_type_disc_general_unspecified_feature
    Mean (SD) 0.09 (0.27)
    Unknown 2,870
index_sent_type_disc_general_unspecified_process
    Mean (SD) 0.16 (0.33)
    Unknown 2,870
index_sent_type_disc_general_unspecified_risk_benefit
    Mean (SD) 0.08 (0.39)
    Unknown 2,870
index_sent_type_disc_retail_cbdc_feature
    Mean (SD) 0.19 (0.34)
    Unknown 2,870
index_sent_type_disc_retail_cbdc_risk_benefit
    Mean (SD) 0.10 (0.45)
    Unknown 2,870
index_sent_type_disc_wholesale_cbdc_feature
    0 462 (88%)
    0.25 1 (0.2%)
    0.333333333333333 2 (0.4%)
    0.4 2 (0.4%)
    0.5 6 (1.1%)
    0.6 1 (0.2%)
    0.666666666666667 3 (0.6%)
    0.75 1 (0.2%)
    1 50 (9.5%)
    Unknown 2,870
index_sent_type_disc_retail_cbdc_process
    Mean (SD) 0.11 (0.27)
    Unknown 2,870
index_sent_type_disc_wholesale_cbdc_risk_benefit
    -1 2 (0.4%)
    0 487 (92%)
    0.333333333333333 3 (0.6%)
    0.5 1 (0.2%)
    0.666666666666667 2 (0.4%)
    0.75 1 (0.2%)
    1 32 (6.1%)
    Unknown 2,870
index_sent_type_disc_wholesale_cbdc_process
    0 440 (83%)
    0.0909090909090909 1 (0.2%)
    0.333333333333333 1 (0.2%)
    0.4 1 (0.2%)
    0.428571428571429 1 (0.2%)
    0.5 7 (1.3%)
    0.6 1 (0.2%)
    0.666666666666667 2 (0.4%)
    1 74 (14%)
    Unknown 2,870
index_stance_type_disc_general_unspecified_feature
    Mean (SD) 0.15 (0.34)
    Unknown 2,870
index_stance_type_disc_general_unspecified_process
    Mean (SD) 0.31 (0.41)
    Unknown 2,870
index_stance_type_disc_general_unspecified_risk_benefit
    Mean (SD) 0.11 (0.44)
    Unknown 2,870
index_stance_type_disc_retail_cbdc_feature
    Mean (SD) 0.27 (0.42)
    Unknown 2,870
index_stance_type_disc_retail_cbdc_risk_benefit
    Mean (SD) 0.12 (0.47)
    Unknown 2,870
index_stance_type_disc_wholesale_cbdc_feature
    0 444 (84%)
    0.333333333333333 4 (0.8%)
    0.5 2 (0.4%)
    0.666666666666667 4 (0.8%)
    0.7 1 (0.2%)
    0.75 4 (0.8%)
    1 69 (13%)
    Unknown 2,870
index_stance_type_disc_retail_cbdc_process
    Mean (SD) 0.19 (0.37)
    Unknown 2,870
index_stance_type_disc_wholesale_cbdc_risk_benefit
    -1 4 (0.8%)
    0 487 (92%)
    0.333333333333333 3 (0.6%)
    0.75 1 (0.2%)
    1 33 (6.3%)
    Unknown 2,870
index_stance_type_disc_wholesale_cbdc_process
    Mean (SD) 0.17 (0.37)
    Unknown 2,870
index_sent_deveme_developed
    Mean (SD) 0.22 (0.37)
    Unknown 2,870
index_sent_deveme_emerging
    Mean (SD) 0.16 (0.35)
    Unknown 2,870
index_stance_deveme_developed
    Mean (SD) 0.34 (0.41)
    Unknown 2,870
index_stance_deveme_emerging
    Mean (SD) 0.21 (0.39)
    Unknown 2,870
index_sent_cb_others
    Mean (SD) 0.19 (0.37)
    Unknown 2,870
index_sent_cb_ecb_fed_boj
    Mean (SD) 0.19 (0.35)
    Unknown 2,870
index_stance_cb_others
    Mean (SD) 0.27 (0.42)
    Unknown 2,870
index_stance_cb_ecb_fed_boj
    Mean (SD) 0.29 (0.40)
    Unknown 2,870
index_sent_cb_other
    Mean (SD) 0.19 (0.38)
    Unknown 2,870
index_sent_cb_ecb
    Mean (SD) 0.19 (0.35)
    Unknown 2,870
index_stance_cb_other
    Mean (SD) 0.29 (0.42)
    Unknown 2,870
index_stance_cb_ecb
    Mean (SD) 0.27 (0.39)
    Unknown 2,870
index_sent_cbdctitle
    Mean (SD) 0.25 (0.27)
    Unknown 3,312
index_stance_cbdctitle
    Mean (SD) 0.38 (0.31)
    Unknown 3,312
index_sent_type_cbdctitle_General/Unspecified
    Mean (SD) 0.12 (0.27)
    Unknown 3,312
index_sent_type_cbdctitle_Retail CBDC
    Mean (SD) 0.22 (0.34)
    Unknown 3,312
index_sent_type_cbdctitle_Wholesale CBDC
    Mean (SD) 0.25 (0.39)
    Unknown 3,312
index_stance_type_cbdctitle_General/Unspecified
    Mean (SD) 0.26 (0.37)
    Unknown 3,312
index_stance_type_cbdctitle_Retail CBDC
    Mean (SD) 0.37 (0.42)
    Unknown 3,312
index_stance_type_cbdctitle_Wholesale CBDC
    Mean (SD) 0.30 (0.42)
    Unknown 3,312
index_sent_discourse_cbdctitle_Feature
    Mean (SD) 0.39 (0.31)
    Unknown 3,312
index_sent_discourse_cbdctitle_Process
    Mean (SD) 0.15 (0.18)
    Unknown 3,312
index_sent_discourse_cbdctitle_Risk-Benefit
    Mean (SD) 0.21 (0.43)
    Unknown 3,312
index_stance_discourse_cbdctitle_Feature
    Mean (SD) 0.60 (0.32)
    Unknown 3,312
index_stance_discourse_cbdctitle_Process
    Mean (SD) 0.32 (0.27)
    Unknown 3,312
index_stance_discourse_cbdctitle_Risk-Benefit
    Mean (SD) 0.23 (0.48)
    Unknown 3,312
index_sent_disc_type_cbdctitle_Feature_General/Unspecified
    Mean (SD) 0.14 (0.31)
    Unknown 3,312
index_sent_disc_type_cbdctitle_Feature_Retail CBDC
    Mean (SD) 0.36 (0.35)
    Unknown 3,312
index_sent_disc_type_cbdctitle_Feature_Wholesale CBDC
    0 72 (84%)
    0.25 1 (1.2%)
    0.4 2 (2.3%)
    0.5 1 (1.2%)
    0.666666666666667 1 (1.2%)
    0.75 1 (1.2%)
    1 8 (9.3%)
    Unknown 3,312
index_sent_disc_type_cbdctitle_Process_General/Unspecified
    Mean (SD) 0.08 (0.17)
    Unknown 3,312
index_sent_disc_type_cbdctitle_Risk-Benefit_General/Unspecified
    Mean (SD) 0.14 (0.45)
    Unknown 3,312
index_sent_disc_type_cbdctitle_Risk-Benefit_Retail CBDC
    Mean (SD) 0.15 (0.55)
    Unknown 3,312
index_sent_disc_type_cbdctitle_Process_Retail CBDC
    Mean (SD) 0.13 (0.20)
    Unknown 3,312
index_sent_disc_type_cbdctitle_Process_Wholesale CBDC
    0 66 (77%)
    0.0909090909090909 1 (1.2%)
    0.333333333333333 1 (1.2%)
    0.4 1 (1.2%)
    0.428571428571429 1 (1.2%)
    0.5 3 (3.5%)
    0.666666666666667 1 (1.2%)
    1 12 (14%)
    Unknown 3,312
index_sent_disc_type_cbdctitle_Risk-Benefit_Wholesale CBDC
    0 74 (86%)
    0.333333333333333 2 (2.3%)
    0.5 1 (1.2%)
    1 9 (10%)
    Unknown 3,312
index_stance_disc_type_cbdctitle_Feature_General/Unspecified
    Mean (SD) 0.21 (0.38)
    Unknown 3,312
index_stance_disc_type_cbdctitle_Feature_Retail CBDC
    Mean (SD) 0.59 (0.37)
    Unknown 3,312
index_stance_disc_type_cbdctitle_Feature_Wholesale CBDC
    0 67 (78%)
    0.333333333333333 1 (1.2%)
    0.5 1 (1.2%)
    0.666666666666667 2 (2.3%)
    0.75 1 (1.2%)
    1 14 (16%)
    Unknown 3,312
index_stance_disc_type_cbdctitle_Process_General/Unspecified
    Mean (SD) 0.29 (0.36)
    Unknown 3,312
index_stance_disc_type_cbdctitle_Risk-Benefit_General/Unspecified
    Mean (SD) 0.12 (0.51)
    Unknown 3,312
index_stance_disc_type_cbdctitle_Risk-Benefit_Retail CBDC
    Mean (SD) 0.20 (0.59)
    Unknown 3,312
index_stance_disc_type_cbdctitle_Process_Retail CBDC
    Mean (SD) 0.27 (0.29)
    Unknown 3,312
index_stance_disc_type_cbdctitle_Process_Wholesale CBDC
    0 62 (72%)
    0.0909090909090909 1 (1.2%)
    0.428571428571429 1 (1.2%)
    0.5 3 (3.5%)
    0.666666666666667 2 (2.3%)
    1 17 (20%)
    Unknown 3,312
index_stance_disc_type_cbdctitle_Risk-Benefit_Wholesale CBDC
    -1 2 (2.3%)
    0 72 (84%)
    0.333333333333333 2 (2.3%)
    1 10 (12%)
    Unknown 3,312
index_sent_cb_cbdctitle_Other
    Mean (SD) 0.03 (0.14)
    Unknown 3,312
index_sent_cb_cbdctitle_ECB
    Mean (SD) 0.22 (0.25)
    Unknown 3,312
index_stance_cb_cbdctitle_Other
    Mean (SD) 0.08 (0.23)
    Unknown 3,312
index_stance_cb_cbdctitle_ECB
    Mean (SD) 0.30 (0.31)
    Unknown 3,312
