Sentiment Profiling of CT Mirror Using the Loughran-McDonald Financial Dictionary
Author
Saurabh C Srivastava
Published
May 1, 2025
Objective of the Analysis
The objective of this analysis is to scrape the latest content from the CT Mirror news website and measure its sentiment orientation using the Loughran-McDonald dictionary, which is widely used in financial and risk communication contexts.
Unlike general-purpose sentiment dictionaries, it classifies words into categories such as Uncertainty, Litigious, Negative, and Positive, offering insight into how media language may reflect risk, doubt, or optimism in ongoing news cycles.
Because the script scrapes the live site each time it runs, it can be re-run on different days to reflect the changing nature of media coverage.
Practical Implementation
This method can be used in areas such as:
Financial journalism analysis: Assessing how often uncertain or risk-related language is used in economic reporting.
Policy communication audits: Evaluating sentiment in government or institutional statements.
Investor behavior modeling: Linking tone in financial news to stock market sentiment.
Media bias or agenda studies: Observing dominant tones across news outlets over time.
Crisis communication: Rapidly assessing emotional signals in real time during emergencies.
Brief Overview of Code
1. Libraries loaded:
For web scraping, sentiment analysis, visualization, and text processing.
library(lingmatch) # For downloading and using financial sentiment dictionary
library(dplyr) # Data manipulation
library(tidytext) # Text tokenization
library(tibble) # Working with tibbles
library(stringr) # String operations
library(rvest) # Web scraping
library(ggplot2) # Data visualization
library(SnowballC) # Word stemming
2. Dictionary download:
The Loughran-McDonald dictionary is loaded using lingmatch.
lingmatch::download.dict("loughranmcdonald", dir = tempdir())
lm_dict <- read.dic(file.path(tempdir(), "loughranmcdonald.dic"))
# Convert dictionary to tidy format with stemming
dict_df <- tibble::enframe(lm_dict, name = "sentiment", value = "word") %>%
  tidyr::unnest(word) %>%
  mutate(word = wordStem(word))
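One side effect of stemming the dictionary is that several inflected entries in the same category can collapse to a single stem, leaving repeated rows that would each match a token in a later join. The sketch below uses a tiny made-up word list (`toy_dict`, a stand-in for the list that the `enframe()` call above expects, not the real dictionary) to show the effect and one possible way to remove it with `distinct()`. De-duplicating is a design choice and is not part of the original script.

```r
library(dplyr)
library(tibble)
library(tidyr)
library(SnowballC)

# Toy stand-in for the dictionary list; the entries are illustrative assumptions
toy_dict <- list(
  negative    = c("loss", "losses", "losing"),
  uncertainty = c("risk", "risks", "risky")
)

# Same tidy-and-stem steps as applied to the real dictionary
toy_df <- enframe(toy_dict, name = "sentiment", value = "word") %>%
  unnest(word) %>%
  mutate(word = wordStem(word))

# "loss"/"losses" and "risk"/"risks" now share a stem, so rows repeat;
# distinct() keeps one row per (sentiment, stem) pair
toy_dedup <- distinct(toy_df, sentiment, word)

print(toy_df)
print(toy_dedup)
```

Without de-duplication, a single occurrence of a stem in the text is counted once for every dictionary entry that collapsed to it, which inflates that category's frequency.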
3. Data collection:
All <p> elements from https://ctmirror.org/ are scraped as text.
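A minimal sketch of how such a step could produce the `ct_df` table used in the join further down. The helper name `paragraph_words`, the stop-word removal, and the stemming of tokens are assumptions made for illustration (stemming is assumed because the dictionary itself is stemmed); the original scraping code may differ. The demonstration runs on a small inline page so that it works offline, with the live call left as a comment.

```r
library(rvest)
library(dplyr)
library(tibble)
library(tidytext)
library(SnowballC)

# Hypothetical helper: turn a parsed HTML page into one stemmed word per row
paragraph_words <- function(page) {
  paragraphs <- page %>%
    html_elements("p") %>%   # keep only <p> elements
    html_text2()             # extract their visible text

  tibble(text = paragraphs) %>%
    unnest_tokens(word, text) %>%            # lowercase, one token per row
    anti_join(stop_words, by = "word") %>%   # drop common English stop words
    mutate(word = wordStem(word))            # stem to match the stemmed dictionary
}

# Offline demonstration on made-up markup
demo_page <- minimal_html(
  "<p>Lawmakers fear budget losses.</p><p>Uncertainty persists.</p><div>ignored</div>"
)
demo_words <- paragraph_words(demo_page)
print(demo_words)

# Live usage (requires network access):
# ct_df <- paragraph_words(read_html("https://ctmirror.org/"))
```

Only text inside `<p>` elements is kept, so navigation labels or other markup outside paragraphs do not enter the word table.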
The processed words are matched to sentiment categories using the Loughran-McDonald dictionary. A horizontal bar chart then shows how many matched terms fall into each category, giving a snapshot of the tone of the scraped content.
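Because a token can occur many times in the scraped text and a stem can belong to more than one dictionary category, the match is a many-to-many join, which dplyr 1.1.0 and later flag with a warning unless `relationship = "many-to-many"` is set. The sketch below uses made-up tokens and categories (`toy_tokens` and `toy_dict` are illustrative, not real data) to show how each occurrence is counted once per matching category.

```r
library(dplyr)
library(tibble)

# Illustrative tokens and dictionary rows (assumptions, not real output)
toy_tokens <- tibble(word = c("risk", "loss", "risk", "budget"))
toy_dict <- tibble(
  sentiment = c("uncertainty", "negative", "negative"),
  word      = c("risk", "risk", "loss")
)

toy_counts <- toy_tokens %>%
  # declare the many-to-many match as intended (dplyr >= 1.1.0)
  inner_join(toy_dict, by = "word", relationship = "many-to-many") %>%
  count(sentiment, sort = TRUE)

print(toy_counts)
```

Here "risk" appears twice in the text and sits in two categories, so it contributes two counts to each, while "budget" matches nothing and drops out of the result.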
ct_df %>%
  inner_join(dict_df, by = "word") %>%
  dplyr::count(sentiment, sort = TRUE) %>%
  ggplot(aes(x = reorder(sentiment, n), y = n, fill = sentiment)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "CT Mirror Sentiment - Using Loughran-McDonald Dictionary",
    x = "Sentiment Category", y = "Frequency",
    subtitle = paste("CT Mirror Analysis |", format(Sys.Date(), "%B %d, %Y")),
    caption = "Prepared by Saurabh Srivastava"
  ) +
  theme(
    legend.position = "none",
    plot.title = element_text(size = 14, face = "bold", hjust = 0.5)
  )
Warning in inner_join(., dict_df, by = "word"): Detected an unexpected many-to-many relationship between `x` and `y`.
ℹ Row 18 of `x` matches multiple rows in `y`.
ℹ Row 3575 of `y` matches multiple rows in `x`.
ℹ If a many-to-many relationship is expected, set `relationship =
"many-to-many"` to silence this warning.
Conclusion
Applying the Loughran-McDonald dictionary to CT Mirror's live news content gives a quick, structured view of how current coverage frames uncertainty, risk, or positivity. The approach can be adapted to other sources to monitor tone in areas such as finance, governance, or social policy.
It also shows how scraping combined with dictionary-based sentiment analysis can produce up-to-date summaries of tone with no manual tagging.