Load required packages

library(rlang)
library(tidyverse)
## -- Attaching packages ----------------------------------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.1.1       v purrr   0.3.2  
## v tibble  2.1.1       v dplyr   0.8.0.1
## v tidyr   0.8.3       v stringr 1.4.0  
## v readr   1.3.1       v forcats 0.4.0
## -- Conflicts -------------------------------------------------------------------------------- tidyverse_conflicts() --
## x purrr::%@%()         masks rlang::%@%()
## x purrr::as_function() masks rlang::as_function()
## x dplyr::filter()      masks stats::filter()
## x purrr::flatten()     masks rlang::flatten()
## x purrr::flatten_chr() masks rlang::flatten_chr()
## x purrr::flatten_dbl() masks rlang::flatten_dbl()
## x purrr::flatten_int() masks rlang::flatten_int()
## x purrr::flatten_lgl() masks rlang::flatten_lgl()
## x purrr::flatten_raw() masks rlang::flatten_raw()
## x purrr::invoke()      masks rlang::invoke()
## x dplyr::lag()         masks stats::lag()
## x purrr::list_along()  masks rlang::list_along()
## x purrr::modify()      masks rlang::modify()
## x purrr::splice()      masks rlang::splice()
library(stringr)
library(tidytext)
library(tidyr)
library(xlsx)
library(RCurl)
## Loading required package: bitops
## 
## Attaching package: 'RCurl'
## The following object is masked from 'package:tidyr':
## 
##     complete
library(XML)
library(kableExtra)
## 
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
## 
##     group_rows
library(tm)
## Loading required package: NLP
## 
## Attaching package: 'NLP'
## The following object is masked from 'package:ggplot2':
## 
##     annotate
library(ngram)
library(SentimentAnalysis)
## 
## Attaching package: 'SentimentAnalysis'
## The following object is masked from 'package:base':
## 
##     write
library(wordcloud)
## Loading required package: RColorBrewer
library(ggridges)
## 
## Attaching package: 'ggridges'
## The following object is masked from 'package:ggplot2':
## 
##     scale_discrete_manual
library(ggplot2)

Section 2.1

Background

The Federal Reserve System - commonly called “the Fed” - serves as the central bank of the United States. Congress passed the Federal Reserve Act in 1913, which President Woodrow Wilson supported and signed into law on December 23, 1913. Congress structured the Fed as a distinctly American version of a central bank: a “decentralized” central bank, with Reserve Banks and Branches in 12 Districts spread across the country and coordinated by a Board of Governors in Washington, D.C. Congress also gave the Fed System a mixture of public and private characteristics. The 12 Reserve Banks share many features with private-sector corporations, including boards of directors and stockholders (the member banks within their Districts). The Board of Governors, though, is an independent government agency, with oversight responsibilities for the Reserve Banks.

The Fed conducts monetary policy, supervises and regulates banking, serves as lender of last resort, maintains an effective and efficient payments system, and serves as banker for banks and the U.S. government. Conducting the nation’s monetary policy is one of the most important - and often the most visible - functions of the Fed.

Monetary Policy

So, what is monetary policy? Simply put, it refers to the actions taken by the Fed to influence the supply of money and credit in order to foster price stability (i.e. control inflation) and maintain maximum sustainable employment. These two objectives are called the “dual mandate”. This distinguishes the Fed from other central banks which typically have a single mandate to control inflation.

The Fed’s instrument for implementing monetary policy is the FOMC’s target for the federal funds rate - the interest rate at which banks lend to each other overnight. By buying and selling U.S. government securities in the open market, the Fed influences the interest rate that banks charge each other. Movements in this rate and expectations about those changes influence all other interest rates and asset prices in the economy.

The Federal Reserve also issues the nation’s currency (Federal Reserve notes) and manages the amount of funds the banking system holds as reserves. Currency and reserves make up what is called the monetary base. However, because the vast majority of money in the US economy is in intangible form rather than physical notes, monetary policy focuses on interest rates instead of currency supply.

In the early days of the FOMC, controversy swirled around how to structure the vote. Should monetary policy be set by the 12 Reserve Banks or the Board of Governors? Or both? In 1935 Congress decided that the seven Governors would vote along with only five of the 12 presidents. The president of the New York Fed always votes - since the Open Market Trading Desk operates in that District - along with four presidents who rotate from among four groups of the remaining Reserve Banks. In that way, voting members always come from different parts of the country.

Section 2.2

FOMC Introduction

As long as the U.S. economy is growing steadily and inflation is low, few people give much thought to the Federal Open Market Committee (FOMC), the group within the Federal Reserve System charged with setting monetary policy. Yet, when economic volatility makes the evening news, this Committee and its activities become much more prominent. Investors and workers, shoppers and savers all pay more attention to the FOMC’s decisions and the wording of its announcements at the end of each meeting.

Why? Because the decisions made by the FOMC have a ripple effect throughout the economy. The FOMC is a key part of the Federal Reserve System, which serves as the central bank of the United States. Among the Fed’s duties are managing the growth of the money supply, providing liquidity in times of crisis, and ensuring the integrity of the financial system. The FOMC’s decisions to change the growth of the nation’s money supply affect the availability of credit and the level of interest rates that businesses and consumers pay. Those changes in money supply and interest rates, in turn, influence the nation’s economic growth and employment in the short run and the general level of prices in the long run.

FOMC Meetings

The FOMC meets regularly - typically every six to eight weeks - in Washington, D.C., although the Committee can and does meet more often by phone or videoconference if needed. The meetings are generally one-day or two-day events, with the two-day meetings providing more time to discuss a special topic. Around the table in the Federal Reserve Board’s headquarters sit all 19 FOMC participants (seven Governors and 12 Reserve Bank presidents) as well as select staff and economists from the Board and the Reserve Banks. Because of the nature of the discussions, attendance is restricted. A Reserve Bank president, for instance, typically brings along only one staff member, usually his or her director of research.

The objective at each meeting is to set the Committee’s target for the federal funds rate - the interest rate at which banks lend to each other overnight - at a level that will support the two key objectives of U.S. monetary policy: price stability and maximum sustainable economic growth. The meeting’s agenda follows a structured and logical process that results in well-informed and thoroughly deliberated decisions on the future course of monetary policy.

Structure of a Typical Meeting

The meeting begins with a report from the manager of the System Open Market Account (SOMA) at the Federal Reserve Bank of New York, who is responsible for keeping the federal funds rate close to the target level set by the FOMC. The manager explains how well the Open Market Trading Desk has done in hitting the target level since the last FOMC meeting and discusses recent developments in the financial and foreign exchange markets. Up next is the Federal Reserve Board’s director of the Division of Research and Statistics, along with the director of the Division of International Finance. They review the Board staff’s outlook for the U.S. economy and foreign economies. This detailed forecast is circulated the week before the meeting to FOMC members in what is called the “Greenbook” - named for its green cover in the days when it was a printed document.

Then the meeting progresses to the first of two “go-rounds,” which are the core of FOMC meetings. During the first go-round, all of the Fed Governors and Reserve Bank presidents discuss how they see economic and financial conditions. The Reserve Bank presidents speak about conditions in their Districts, as well as offering their views on national economic conditions. The data and information discussed vary by region and therefore spotlight a wide range of industries. For example, one would expect the review of regional conditions in the San Francisco District to lend insight into the tech sector of Silicon Valley.

The policymakers have prepared for this go-round through weeks of information gathering. Before the FOMC meeting, each Reserve Bank prepares a “Summary of Commentary on Current Economic Conditions,” which is published two weeks before each meeting in what most people call the “Beige Book,” for the color of its cover when originally printed. One Federal Reserve Bank, designated on a rotating basis, publishes the overall summary of the 12 District reports. The Reserve Bank presidents have also gathered information by talking with executives in a variety of business sectors and through meetings with the Banks’ boards of directors and advisory councils.

This first go-round covers valuable information about economic activity throughout the country, measured in hard data and recent anecdotal information, as well as the analysis and interpretation conveyed by the policymakers sitting around the table. This is a key way in which each region of the U.S. has input into the making of national monetary policy. This portion of the meeting concludes with the FOMC Chair summarizing the discussion and providing the Chair’s own view of the economy. At this point, the policy discussion begins with the Federal Reserve Board’s director of the Division of Monetary Affairs, who outlines the Committee’s various policy options.

The policy options could include no change, an increase, or a decrease in the federal funds rate target. Each option is described, along with a clear rationale, the pros and cons, and some alternatives for how the Committee could explain its decision in a public statement to be released that afternoon. Then, there is a second go-round. The Reserve Bank presidents and Governors each make the best case for the policy alternative they prefer, given current economic conditions and their personal outlook for the economy. They also comment on how they think the statement explaining the decision should be worded. One of the most important aspects of an FOMC meeting is that all voices matter. The analysis and viewpoints of each committee participant - whether a voting member or not - play an instrumental role in the FOMC’s policy decisions.

At the end of this policy go-round, the Chair summarizes a proposal for action based on the Committee’s discussion, as well as a proposed statement to explain the policy decision. The Fed Governors and presidents then get a chance to question or comment on the Chair’s proposed approach. Once a motion for a decision is on the table, the Committee tries to come to a consensus through its deliberations. Although the final decision is most often one that all can support, there are times when some differences of opinion may remain, and voting members may dissent. At the end of the policy discussion, all seven of the Fed Governors and the five voting Reserve Bank presidents cast a formal vote on the proposed decision and the wording of the statement.

Announcing the Policy Decision

After the vote has been taken, the FOMC publicly announces its policy decision at 2:15 p.m. The announcement includes the federal funds rate target, the statement explaining its actions, and the vote tally, including the names of the voters and the preferred action of those who dissented.

In addition, the FOMC releases its official minutes three weeks after each meeting. The minutes include a more complete explanation of the views expressed, which allows the public to get a better sense of the range of views within the FOMC and promotes awareness and understanding of how monetary policy is made. In recent years, the FOMC has improved communications with the public. What’s more, the FOMC now releases Committee participants’ projections for the economy and inflation four times a year, which provides added insight into the policymakers’ perspectives.

Implementing Policy

Once the FOMC establishes a target for the federal funds rate, the Open Market Trading Desk at the Federal Reserve Bank of New York conducts daily open market operations - buying or selling U.S. government securities on the open market - as necessary to achieve the federal funds rate target. Open market operations affect the amount of money and credit available in the banking system, thereby affecting interest rates, which in turn affect the spending decisions of households and businesses and ultimately the overall performance of the U.S. economy.

Connecting To Our Project

This detailed description of the FOMC serves two purposes: (a) to describe the monetary policymaking activities of the FOMC, and (b) to identify the dataset that we will analyze. We focus exclusively on the FOMC policy statements released at 2:15 p.m. ET. Anecdotally, these policy statements have the greatest short-term impact on financial markets and the greatest potential for surprise. In the next section, we identify past research that examines the FOMC policy statements from a data science perspective.

Section 3.1

Data Staging - prepare metadata for data extraction and create a dataframe

# Extract the statement's release date (the first run of digits in each link) and its year of publication,
# then create a data frame with year, date and URL.

statement.dates <- str_extract(links, "[[:digit:]]+")
year <- substr(statement.dates, 1, 4)

reports <- data.frame(year, statement.dates, links)

# Convert factors to characters and sort by statement date

reports <- reports %>% mutate_if(is.factor, as.character) %>% arrange(statement.dates)
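
The date extraction can be checked on a small vector before it is run on the full list. This is a minimal sketch only: the two URLs below are hypothetical placeholders in the same shape as the statement links (a single run of eight digits in the file name), and the `example.*` names are introduced here for illustration.

```r
library(stringr)

# Hypothetical links; the real `links` vector is built elsewhere in the project
example.links <- c(
  "https://example.org/pressreleases/monetary20140319a.htm",
  "https://example.org/pressreleases/monetary20070618a.htm"
)

# str_extract() is vectorised and returns the first digit run in each element
example.dates <- str_extract(example.links, "[[:digit:]]+")
example.year  <- substr(example.dates, 1, 4)

example.reports <- data.frame(year = example.year,
                              statement.dates = example.dates,
                              links = example.links,
                              stringsAsFactors = FALSE)
print(example.reports)
```

Because only the first digit run is taken, a link with digits earlier in its path would yield the wrong value, so a quick check of `nchar(example.dates) == 8` is a reasonable safeguard.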

Data Extraction via web-scraping

# Loop through the statement links and scrape the content from the Federal Reserve website.
# Discard irrelevant portions of the extracted content, i.e. the preliminary paragraphs and the last paragraph.

reports$statement.content <- NA_character_
reports$statement.length <- NA_integer_

for (i in seq_along(reports$links)) {
  stm.url <- getURL(reports$links[i])
  stm.tree <- htmlTreeParse(stm.url, useInternalNodes = TRUE)
  stm.tree.parse <- unlist(xpathApply(stm.tree, path = "//p", fun = xmlValue))

  # Start one paragraph after the first paragraph that mentions "release"; stop before the last paragraph
  n <- which(str_detect(stm.tree.parse, "release"))[1] + 1
  l <- length(stm.tree.parse) - 1

  # Condense separate paragraphs into one element per statement date
  reports$statement.content[i] <- paste(stm.tree.parse[n:l], collapse = "")

  # Remove line breaks
  reports$statement.content[i] <- gsub("\r?\n|\r", " ", reports$statement.content[i])

  # Count the number of characters per statement
  reports$statement.length[i] <- nchar(reports$statement.content[i])
}

# Save the scraped data as an R data object

saveRDS(reports, file = "fomc_data.rds")
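
The paragraph-trimming step inside the loop can be tried without any network access. The sketch below is an illustration under assumptions: the `paragraphs` vector is a made-up stand-in for the `<p>` nodes of one statement page, and `first.body` and `last.body` are names introduced here for the `n` and `l` indices.

```r
library(stringr)

# Hypothetical paragraphs standing in for the <p> nodes of one statement page
paragraphs <- c("Navigation text",
                "For immediate release",
                "Information received since the Committee met in January indicates slower growth.",
                "The Committee decided to maintain the target range.",
                "Last update: March 19, 2014")

# Index of the first paragraph mentioning "release", plus one
first.body <- which(str_detect(paragraphs, "release"))[1] + 1
# Drop the final paragraph (the page footer)
last.body <- length(paragraphs) - 1

# A space is used as separator here because the toy paragraphs carry no surrounding whitespace
body.text <- paste(paragraphs[first.body:last.body], collapse = " ")
body.text <- gsub("\r?\n|\r", " ", body.text)
nchar(body.text)
```

If no paragraph contains "release", `first.body` is `NA` and the subsetting fails, so a page with a different layout would need its own handling.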

Data cleansing - correct a statement date

# Correct the date for one statement, because the URL is not in sync with the actual date inside the statement content

reports$statement.dates[match(c("20070618"),reports$statement.dates)]<-"20070628"

Section 4.1

Analyse FOMC statement word lengths and word frequency

# Compute total statement length (in characters) per year by aggregating across individual statements

yearly.length<-reports%>% group_by(year) %>% summarize(words.per.year=sum(statement.length))
yearly.length
## # A tibble: 13 x 2
##    year  words.per.year
##    <chr>          <int>
##  1 2007           12361
##  2 2008           19660
##  3 2009           23410
##  4 2010           24857
##  5 2011           26634
##  6 2012           27816
##  7 2013           40310
##  8 2014           46081
##  9 2015           32005
## 10 2016           30787
## 11 2017           28423
## 12 2018           19457
## 13 2019            6845

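As can be seen, the total statement length was highest in 2014. Note that statement.length is a character count (computed with nchar), so the words.per.year column holds characters rather than words. As expected, the total for 2019 is low because the year is still in progress and there have been only three meetings so far.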
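
To make the difference between the two length measures concrete, here is a small sketch on a single made-up sentence. The whitespace-based word count is an assumption about how words might be counted, not the method used in this analysis.

```r
library(stringr)

toy.statement <- "The Committee seeks to foster maximum employment and price stability."

n.chars <- nchar(toy.statement)              # characters, as stored in statement.length
n.words <- str_count(toy.statement, "\\S+")  # whitespace-delimited words

c(characters = n.chars, words = n.words)
```

Either measure should show the same year-to-year trend, but the axis labels and column names should say which one is being plotted.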

# Graph the total statement length per year

ggplot(yearly.length, aes(x=year, y=words.per.year)) + geom_col(fill="darkblue", colour="black") + coord_flip() + xlab("Year") + ylab("Statement Length")

# Verify word count for a sample word in a sample statement

sample<-reports%>%filter(statement.dates=="20140319")
sample[,4]
## [1] "        Information received since the Federal Open Market Committee met in January indicates that growth in economic activity slowed during the winter months, in part reflecting adverse weather conditions. Labor market indicators were mixed but on balance showed further improvement. The unemployment rate, however, remains elevated. Household spending and business fixed investment continued to advance, while the recovery in the housing sector remained slow. Fiscal policy is restraining economic growth, although the extent of restraint is diminishing. Inflation has been running below the Committee's longer-run objective, but longer-term inflation expectations have remained stable.             Consistent with its statutory mandate, the Committee seeks to foster maximum employment and price stability. The Committee expects that, with appropriate policy accommodation, economic activity will expand at a moderate pace and labor market conditions will continue to improve gradually, moving toward those the Committee judges consistent with its dual mandate. The Committee sees the risks to the outlook for the economy and the labor market as nearly balanced. The Committee recognizes that inflation persistently below its 2 percent objective could pose risks to economic performance, and it is monitoring inflation developments carefully for evidence that inflation will move back toward its objective over the medium term.             The Committee currently judges that there is sufficient underlying strength in the broader economy to support ongoing improvement in labor market conditions. In light of the cumulative progress toward maximum employment and the improvement in the outlook for labor market conditions since the inception of the current asset purchase program, the Committee decided to make a further measured reduction in the pace of its asset purchases. 
Beginning in April, the Committee will add to its holdings of agency mortgage-backed securities at a pace of $25 billion per month rather than $30 billion per month, and will add to its holdings of longer-term Treasury securities at a pace of $30 billion per month rather than $35 billion per month. The Committee is maintaining its existing policy of reinvesting principal payments from its holdings of agency debt and agency mortgage-backed securities in agency mortgage-backed securities and of rolling over maturing Treasury securities at auction. The Committee's sizable and still-increasing holdings of longer-term securities should maintain downward pressure on longer-term interest rates, support mortgage markets, and help to make broader financial conditions more accommodative, which in turn should promote a stronger economic recovery and help to ensure that inflation, over time, is at the rate most consistent with the Committee's dual mandate.             The Committee will closely monitor incoming information on economic and financial developments in coming months and will continue its purchases of Treasury and agency mortgage-backed securities, and employ its other policy tools as appropriate, until the outlook for the labor market has improved substantially in a context of price stability. If incoming information broadly supports the Committee's expectation of ongoing improvement in labor market conditions and inflation moving back toward its longer-run objective, the Committee will likely reduce the pace of asset purchases in further measured steps at future meetings. However, asset purchases are not on a preset course, and the Committee's decisions about their pace will remain contingent on the Committee's outlook for the labor market and inflation as well as its assessment of the likely efficacy and costs of such purchases.             
To support continued progress toward maximum employment and price stability, the Committee today reaffirmed its view that a highly accommodative stance of monetary policy remains appropriate. In determining how long to maintain the current 0 to 1/4 percent target range for the federal funds rate, the Committee will assess progress--both realized and expected--toward its objectives of maximum employment and 2 percent inflation. This assessment will take into account a wide range of information, including measures of labor market conditions, indicators of inflation pressures and inflation expectations, and readings on financial developments. The Committee continues to anticipate, based on its assessment of these factors, that it likely will be appropriate to maintain the current target range for the federal funds rate for a considerable time after the asset purchase program ends, especially if projected inflation continues to run below the Committee's 2 percent longer-run goal, and provided that longer-term inflation expectations remain well anchored.             When the Committee decides to begin to remove policy accommodation, it will take a balanced approach consistent with its longer-run goals of maximum employment and inflation of 2 percent. The Committee currently anticipates that, even after employment and inflation are near mandate-consistent levels, economic conditions may, for some time, warrant keeping the target federal funds rate below levels the Committee views as normal in the longer run.             With the unemployment rate nearing 6-1/2 percent, the Committee has updated its forward guidance. The change in the Committee's guidance does not indicate any change in the Committee's policy intentions as set forth in its recent statements.             Voting for the FOMC monetary policy action were: Janet L. Yellen, Chair; William C. Dudley, Vice Chairman; Richard W. Fisher; Sandra Pianalto; Charles I. Plosser; Jerome H. Powell; Jeremy C. 
Stein; and Daniel K. Tarullo.             Voting against the action was Narayana Kocherlakota, who supported the sixth paragraph, but believed the fifth paragraph weakens the credibility of the Committee's commitment to return inflation to the 2 percent target from below and fosters policy uncertainty that hinders economic activity.             Statement Regarding Purchases of Treasury Securities and Agency Mortgage-Backed Securities      Board of Governors of the Federal Reserve System"
str_count(sample$statement.content, pattern="inflation")
## [1] 15
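
One caveat when comparing this count with token-based counts: str_count matches substrings and is case-sensitive, whereas a tokenizer counts whole, lower-cased words. The sketch below shows the difference on a made-up sentence; `toy.text` and the word-boundary pattern are illustrative assumptions.

```r
library(stringr)

toy.text <- "Inflation is low, inflation expectations are stable, and disinflationary and inflationary pressures offset."

# Substring match: counts "inflation", "disinflationary" and "inflationary",
# but misses the capitalised "Inflation"
substring.hits <- str_count(toy.text, "inflation")

# Whole-word, case-insensitive match, closer to what a tokenizer would count
word.hits <- str_count(toy.text, regex("\\binflation\\b", ignore_case = TRUE))

c(substring = substring.hits, whole.word = word.hits)
```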

Trend in Statement Length by year and Fed Chair

It seems that the FOMC statements became progressively more verbose under Chairman Bernanke, reaching a peak in 2014, when Janet Yellen took over as Fed Chair. This can be attributed to the fact that during 2014 there was a lot of discussion about when the Fed would end the quantitative easing measures that it had put in place to combat the recession that followed the financial crisis. There were two schools of thought: one felt that the time was right for the Fed to start trimming its large balance sheet, and the other wanted to wait a bit longer for more definite signs of growth before starting to reverse the quantitative easing measures. So the Fed tried to provide more transparency into its thinking, which resulted in longer FOMC statements.

Since 2014, the statements have gotten shorter. The current chairman Jerome Powell took over in February 2018.

# Graph the annual trend in statement length, annotated by Fed Chair

p <- ggplot(reports, aes(x=year, y=statement.length)) +
  geom_point(aes(colour=year)) +
  theme(legend.position="right") +
  xlab("Year") + ylab("Length of Statement")

p +
  ggplot2::annotate("text", x=4, y=5000, label="Bernanke", family="serif", fontface="bold", colour="blue", size=4) +
  ggplot2::annotate("text", x=10, y=5500, label="Yellen", family="serif", fontface="bold", colour="darkred", size=4) +
  ggplot2::annotate("text", x=13, y=3600, label="Powell", family="serif", fontface="bold", colour="black", size=4) +
  ggplot2::annotate("segment", x=0, xend=8.1, y=2700, yend=6500, colour="blue", size=1, arrow=arrow(ends="both")) +
  ggplot2::annotate("segment", x=8.1, xend=12.1, y=6500, yend=3200, colour="darkred", size=1, arrow=arrow(ends="both")) +
  ggplot2::annotate("segment", x=12.1, xend=14, y=3200, yend=3200, colour="black", size=1, arrow=arrow(ends="both"))
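
Instead of placing the Chair labels by hand, each statement could be tagged with the Chair in office on its date. The following is a sketch under assumptions: the cut-off strings reflect Yellen taking office in early February 2014 and Powell in early February 2018, and the `toy` data frame and `chair` column are introduced here for illustration.

```r
library(dplyr)

# Toy statement dates in the same "YYYYMMDD" character format as statement.dates
toy <- data.frame(statement.dates = c("20131218", "20140319", "20180131", "20180321"),
                  stringsAsFactors = FALSE)

# Fixed-width digit strings compare correctly as text, so no date conversion is needed.
# The cut-offs are assumptions based on when each Chair took office.
toy <- toy %>%
  mutate(chair = case_when(
    statement.dates < "20140201" ~ "Bernanke",
    statement.dates < "20180201" ~ "Yellen",
    TRUE                         ~ "Powell"
  ))
toy
```

With such a column in `reports`, the points could be coloured with `aes(colour = chair)` and the legend would replace the manual annotations.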

Adding custom words and names to the list of stop words

Remove proper nouns and irrelevant words from further analysis by adding them as custom words to the stop words lexicon

# Add custom words to the stop words list to exclude proper nouns/names and words such as "committee", which would provide no meaningful insight into the statement's sentiment analysis

words<-c("committee", "ben", "geithner", "bernanke", "timothy", "hoenig", "thomas", "donald", "kevin", "mishkin", "kroszner", "kohn", "charles", "frederic")
my.stop_words<-data.frame(word=words, lexicon="Custom", stringsAsFactors=FALSE)
new.stop_words <- rbind(my.stop_words, stop_words)
head(new.stop_words)
##        word lexicon
## 1 committee  Custom
## 2       ben  Custom
## 3  geithner  Custom
## 4  bernanke  Custom
## 5   timothy  Custom
## 6    hoenig  Custom

Cleanse data - remove irrelevant characters and calculate the frequency of the main words per statement date

# Strip out punctuation, white space and custom stop words, and calculate the word frequency by statement date

report.words <- reports %>%
  mutate(date = statement.dates, text = statement.content) %>%
  unnest_tokens(word, text) %>%
  mutate(word = stripWhitespace(gsub("[^A-Za-z ]", " ", word))) %>%
  filter(word != "", word != " ") %>%
  anti_join(new.stop_words, by = "word") %>%
  count(date, year, word, sort = TRUE) %>%
  rename(frequency = n)
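
The tokenise, filter and count steps can be seen end to end on a toy input. This is a minimal sketch, assuming two made-up statements and a single custom stop word; `toy.reports`, `toy.stop_words` and `toy.words` are names introduced here.

```r
library(dplyr)
library(tidytext)

# Two made-up statements, one row per statement date
toy.reports <- data.frame(
  date = c("20070131", "20070321"),
  text = c("The Committee judges that inflation risks remain.",
           "Inflation pressures eased; the Committee sees moderate growth."),
  stringsAsFactors = FALSE
)

# One custom stop word added on top of tidytext's built-in stop_words table
toy.stop_words <- bind_rows(
  data.frame(word = "committee", lexicon = "Custom", stringsAsFactors = FALSE),
  stop_words
)

toy.words <- toy.reports %>%
  unnest_tokens(word, text) %>%               # lower-cases and strips punctuation
  anti_join(toy.stop_words, by = "word") %>%  # drop built-in and custom stop words
  count(date, word, sort = TRUE) %>%
  rename(frequency = n)

toy.words
```

Because unnest_tokens lower-cases its output, "Inflation" and "inflation" are counted as the same word, and "committee" disappears through the custom entry.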

Verify if the count is correct for a given combination of sample word and statement

# Verify the count for the word "inflation" during the statements published in 2007 

report.words%>%filter(year=='2007', word=='inflation')
## # A tibble: 8 x 4
##   date     year  word      frequency
##   <chr>    <chr> <chr>         <int>
## 1 20070131 2007  inflation         5
## 2 20071031 2007  inflation         5
## 3 20071211 2007  inflation         5
## 4 20070321 2007  inflation         4
## 5 20070509 2007  inflation         4
## 6 20070628 2007  inflation         4
## 7 20070807 2007  inflation         4
## 8 20070918 2007  inflation         3
# Rank most frequent words by year

f_text<-report.words%>% group_by(year,word) %>% summarize(total=sum(frequency))%>%arrange(year,desc(total),word)%>% mutate(rank=row_number())%>%ungroup() %>% arrange(rank,year)

# Select the top 10 ranked words per year

topWords <- f_text %>% filter(rank<11)%>%arrange(year,rank)
print(topWords)
## # A tibble: 130 x 4
##    year  word      total  rank
##    <chr> <chr>     <int> <int>
##  1 2007  inflation    34     1
##  2 2007  federal      31     2
##  3 2007  growth       23     3
##  4 2007  economic     20     4
##  5 2007  action       19     5
##  6 2007  moderate     19     6
##  7 2007  policy       19     7
##  8 2007  chairman     18     8
##  9 2007  rate         13     9
## 10 2007  governors    12    10
## # ... with 120 more rows
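
The ranking relies on the fact that summarize drops only the last grouping variable, so the result is still grouped by year and row_number restarts for each year. A small sketch on made-up frequencies (the `toy.freq` values are illustrative, not taken from the statements):

```r
library(dplyr)

toy.freq <- data.frame(
  year = c("2007", "2007", "2007", "2008", "2008"),
  word = c("inflation", "growth", "policy", "credit", "inflation"),
  frequency = c(5, 3, 3, 7, 2),
  stringsAsFactors = FALSE
)

toy.rank <- toy.freq %>%
  group_by(year, word) %>%
  summarize(total = sum(frequency)) %>%  # result is still grouped by year
  arrange(year, desc(total), word) %>%   # ties in total are broken alphabetically
  mutate(rank = row_number()) %>%        # rank restarts within each year
  ungroup()

filter(toy.rank, rank <= 2)
```

Using row_number means tied words get distinct ranks, so a cut-off such as rank < 11 always returns exactly ten words per year; min_rank would be the alternative if ties should share a rank.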

Graph the most frequent words per year

# Graph top 10 most frequent words by year

gg <- ggplot(head(topWords, 130), aes(y=total,x=reorder(word,rank))) + geom_col(fill="#27408b") +
  facet_wrap(~year,scales="free", ncol=3)+ coord_flip()+theme_ridges(font_size=11) + 
  labs(x="",y="",title="Most Frequent Words in FOMC Statements grouped by years (2007 - 2019)")

gg

Conclusion

As can be seen from the above analysis, the words that show up in the top 10 list are largely the same from year to year. This is because in almost all cases, the FOMC statements start by referring to the previous statement and to the common economic parameters that the Committee tracks. So there is a large amount of consistency in how the statements are worded and in the terms they employ. There is no surprise in the most frequently used words in these statements. In fact, one could argue that it is the differential, i.e. the new words, which are likely to be among the least frequently used words in the statements, that provides the real information needed for sentiment analysis.

For this reason, we do not pursue word frequency further and turn to other approaches for our analysis.
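
Although we do not pursue it here, the differential idea mentioned above could be sketched as a set difference between the words of consecutive statements. The token vectors below are hypothetical and already lower-cased; nothing in this sketch is taken from the actual statements.

```r
# Toy tokens for two consecutive statements (hypothetical)
previous.words <- c("inflation", "growth", "moderate", "policy", "rate")
current.words  <- c("inflation", "growth", "policy", "rate", "patient", "tariffs")

# Words that appear now but did not appear in the previous statement
new.words <- setdiff(current.words, previous.words)

# Words that were dropped since the previous statement
dropped.words <- setdiff(previous.words, current.words)

list(new = new.words, dropped = dropped.words)
```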