Background : Can bias in news reporting be measured with basic text analytics?
Goal : We are going to explore daily news feeds and see if we can detect differences in how various news sources report, or don't report, stories.
Can text analytics make sense of this?
Note : This article, like most of mine on R-Pubs, is both instructional and investigative. You can re-create any of these charts and tables from the embedded R code. Click the "Code" button to view.
QuietLoad <- function(library) {
suppressWarnings(suppressPackageStartupMessages(
library(library, character.only=TRUE)))
}
# Load libraries
QuietLoad('tidyverse')
QuietLoad('odbc')
QuietLoad('DBI')
QuietLoad('clipr')
QuietLoad('tidyRSS')
QuietLoad('rvest')
QuietLoad('tidytext')
QuietLoad('topicmodels')
QuietLoad('kableExtra')
QuietLoad('httr')
QuietLoad('xml2')
QuietLoad('XML')
QuietLoad('jsonlite')
QuietLoad('knitr')
QuietLoad('bigrquery')
First we will gather news articles from six sources: two that are thought to have a left-leaning bias, two from the center, and two that are thought to have a right-leaning bias.
Left : HUF (Huffington Post) and CBS (CBS News)
Center : BBC (BBC America) and USA (USA Today)
Right : FOX (Fox News) and WTM (Washington Times)
To keep the topics focused, the US headline news feeds were used.
The bias assumption was drawn from my own experience plus the categorization provided by AllSides.
There will be three elements to our bias analysis.
Exclusionary Bias : Topics and the news change daily but which topics are reported more, less or not at all by each side?
Sentiment Bias : Is each side reporting certain topics in a more positive or more negative tone than the other side?
Keyword Bias : Are articles that mention certain keywords reported more, less or not at all by each side, and what is the sentiment of the articles containing those keywords?
# Define our news source RSS links along with meta data
# about how we will filter and retrieve the data
FeedURLDF <- data.frame(
FeedURL = c("http://feeds.bbci.co.uk/news/world/us_and_canada/rss.xml",
"https://www.cbsnews.com/latest/rss/us",
"http://feeds.foxnews.com/foxnews/national",
"https://www.huffpost.com/section/front-page/feed",
"http://rssfeeds.usatoday.com/UsatodaycomNation-TopStories",
"https://www.washingtontimes.com/rss/headlines/news/national/"),
FeedName = c("BBC",
"CBS",
"FOX",
"HUF",
"USA",
"WTM"),
FeedType = factor(c("Center",
"Left",
"Right",
"Left",
"Center",
"Right"), levels = c("Left", "Center", "Right")),
FeedFilter = c("av|podcast|sport",
"video|podcast",
"podcast",
"podcast",
"videos|picture-gallery",
NA),
NodeType = c("xpath",
"css",
"css",
"css",
"css",
"css"),
FeedNode = c('//*[@id="main-content"]',
".content__body",
"div.article-body",
NA, #".entry__text",
NA, #"div.gnt_ar_b",
"div.storyareawrapper"),
XCludeNode = c('//*[@data-component="topStories"]|//*[@data-component="features"]|
//*[@data-component="mostRead"]|//*[@data-component="see-alsos"]',
NA,
"div.info",
NA,
NA,
"div.article-toplinks"),
ReplaceExp = c("\\.css.*?;\\}|\\@media.*?\\}",
"BulaBulla",
"BulaBulla",
"BulaBulla",
"BulaBulla",
"BulaBulla"),
stringsAsFactors = FALSE)
# Build Article Link Data Frame
# - Titles and reference links without the full article
for (i in 1:nrow(FeedURLDF)) {
TempDF <- tidyfeed(FeedURLDF$FeedURL[i]) %>%
dplyr::select(feed_title, item_title, item_link, item_description,
item_pub_date, item_guid) %>%
mutate(item_articletext = NA,
item_NodeType = FeedURLDF$NodeType[i],
item_FeedNode = FeedURLDF$FeedNode[i],
item_XCludeNode = FeedURLDF$XCludeNode[i],
item_ReplaceExp = FeedURLDF$ReplaceExp[i],
feed_type = FeedURLDF$FeedType[i],
feed_title = FeedURLDF$FeedName[i]) %>%
# General filter for blank article descriptions (CNN local news links)
filter(!is.na(item_description) & str_length(item_description) > 1) %>%
# Filter for Recent articles
filter(as.Date(item_pub_date) >= Sys.Date() - 5) %>%
# Specific filters for link URLs (excludes video for CBS, for example)
filter(is.na(FeedURLDF$FeedFilter[i]) |
!str_detect(item_link, FeedURLDF$FeedFilter[i])) %>%
filter(is.na(FeedURLDF$FeedFilter[i]) |
!str_detect(item_guid, FeedURLDF$FeedFilter[i]))
if (i == 1) {
ArticleDF <- TempDF
} else {
ArticleDF <- bind_rows(ArticleDF, TempDF)
}
}
# News sources have different article counts. These next steps will
# normalize the article count per source in 4 steps
# Step 1 : Number and Index by Feed Source
ArticleDF <- ArticleDF %>%
group_by(feed_title) %>%
mutate(item_index = row_number(),
feed_count = n()) %>%
ungroup()
# Step 2 : Summarize the Article DF with counts of articles by source
SelectorDF <- ArticleDF %>%
group_by(feed_title) %>%
summarise(feed_count = min(feed_count), .groups = "drop")
MinCount <- min(SelectorDF$feed_count)
# Step 3 : Using the Selector DF, create a random Selector Index vector
for (i in 1:nrow(SelectorDF)) {
if (i==1) {
FinalSelectorDF <- tibble(feed_title = rep(SelectorDF$feed_title[i], SelectorDF$feed_count[i]),
SelectorIndex = sample(1:SelectorDF$feed_count[i],
size = SelectorDF$feed_count[i]))
} else {
FinalSelectorDF <- bind_rows(FinalSelectorDF,
tibble(feed_title = rep(SelectorDF$feed_title[i], SelectorDF$feed_count[i]),
SelectorIndex = sample(1:SelectorDF$feed_count[i],
size = SelectorDF$feed_count[i])))
}
}
FinalSelectorDF <- FinalSelectorDF %>%
filter(SelectorIndex <= MinCount)
# Step 4 : Filter each source down to the minimum source count using the
# selector index
ArticleDF <- FinalSelectorDF %>%
inner_join(ArticleDF, by = c("feed_title" = "feed_title", "SelectorIndex" = "item_index"))
# Retrieve the full article detail from each link
# - Uses the meta data to determine how to retrieve and filter
for (i in 1:nrow(ArticleDF)) {
tryCatch(
if (ArticleDF$item_NodeType[i]=="css") {
TempText <- ArticleDF$item_link[i] %>%
read_html()
# If item_FeedNode is NA there is no need to filter nodes; we'll just take
# all <p> later
if (!is.na(ArticleDF$item_FeedNode[i])) {
TempText <- TempText %>%
html_nodes(css = ArticleDF$item_FeedNode[i])
}
if (!is.na(ArticleDF$item_XCludeNode[i])) {
TempXcludeText <- TempText %>%
html_nodes(css = ArticleDF$item_XCludeNode[i])
xml_remove(TempXcludeText)
}
TempText <- TempText %>%
html_nodes("p") %>%
html_text() %>%
toString() %>%
str_replace_all(ArticleDF$item_ReplaceExp[i], "")
} else {
TempText <- ArticleDF$item_link[i] %>%
read_html() %>%
html_nodes(xpath = ArticleDF$item_FeedNode[i])
if (!is.na(ArticleDF$item_XCludeNode[i])) {
TempXcludeText <- TempText %>%
html_nodes(xpath = ArticleDF$item_XCludeNode[i])
xml_remove(TempXcludeText)
}
TempText <- TempText %>%
html_nodes("p") %>%
html_text() %>%
toString() %>%
str_replace_all(ArticleDF$item_ReplaceExp[i], "")
}
, error = function(cond) {
print(cond)
# Use <<- so a failed retrieval clears TempText in the loop's environment
TempText <<- ""
}
)
ArticleDF$item_articletext[i] <- TempText
# print(paste0("Processing : ", i, " of ", nrow(ArticleDF), " ",
# ArticleDF$feed_title[i], " length : ", str_length(TempText)))
}
#
custom_stop_words = bind_rows(
tibble(
word = c("ap", "news", "week", "fox", "cnn", "cbs",
"bbc", "huffington", "huff", "post", "washington",
"colstrip")),
stop_words %>%
filter(lexicon != "NO SUCH LEXICON") %>%
dplyr::select(word)) %>%
mutate(word = SnowballC::wordStem(word))
ArticleTokenDF <- ArticleDF %>%
dplyr::select(feed_title, feed_type, item_link, item_title,
item_description, item_articletext) %>%
unnest_tokens(word, item_articletext) %>%
mutate(word = SnowballC::wordStem(word)) %>%
filter(!str_detect(word, "\\d|\\,")) %>%
anti_join(custom_stop_words, by = "word") %>%
group_by(item_link, word) %>%
summarise(count = n(), .groups = "drop")
ArticleDTM <- ArticleTokenDF %>%
cast_dtm(item_link, word, count)
TargetK <- min(round(MinCount * .8, 0), 10)
#ArticleLDA <- LDA(ArticleDTM, k = TargetK, control = list(seed = 1234))
ArticleLDA <- LDA(ArticleDTM, k = TargetK)
ArticleTopics_Word <- tidy(ArticleLDA, matrix = "beta")
ArticleTopTerms <- ArticleTopics_Word %>%
group_by(topic) %>%
top_n(5, beta) %>%
ungroup() %>%
arrange(topic, -beta)
ArticleTopTerms_Name <- ArticleTopTerms %>%
arrange(topic, desc(beta)) %>%
group_by(topic) %>%
mutate(RowID = row_number(),
topic_beta = sum(beta)) %>%
ungroup() %>%
pivot_wider(id_cols = c(topic, topic_beta),
names_from = RowID, values_from = term) %>%
mutate(topic_name = paste(`1`, `2`, `3`, sep = " ")) %>%
dplyr::select(topic, topic_beta, topic_name)
# Re-categorize a topic as miscellaneous
#ArticleTopTerms_Name <- ArticleTopTerms_Name%>%
# mutate(topic_name = ifelse(topic_beta == min(topic_beta),
# "MISC", topic_name))
ArticleTopics_Doc <- tidy(ArticleLDA, matrix = "gamma") %>%
group_by(document) %>%
mutate(Maxgamma = max(gamma)) %>%
ungroup() %>%
filter(gamma == Maxgamma) %>%
dplyr::select(document, topic) %>%
left_join(ArticleTopTerms_Name, by = c("topic"))
We start by pulling the RSS feeds for the six news sources. Then we follow the article links and clean up the article text to prepare it for further analysis.
One challenge is that not all six sources publish an equal number of articles. Therefore we determine the source with the fewest articles and randomly select that same number of articles from each of the other sources. Some articles may contain no significant text (only video or external links), so those empty articles are ignored in the subsequent analysis.
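As an aside, the four normalization steps above could be collapsed into a single grouped sample with dplyr's slice_sample. This is only a sketch of an equivalent approach (it uses ArticleDF and MinCount as defined above and is not the code that produced the results below):
# Sketch only: randomly keep MinCount articles per source in one step
ArticleSampleDF <- ArticleDF %>%
group_by(feed_title) %>%
slice_sample(n = MinCount) %>%
ungroup()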
# Article Summary
TimeStamp = Sys.time()
attr(TimeStamp, "tzone") <- "US/Eastern"
tibble(
Description = c("Total articles",
paste0("Source with least articles : ",
filter(SelectorDF, feed_count == min(feed_count))$feed_title)[1],
"Revised total articles after random selection",
"Empty Articles",
"Total word count (net of stop words)",
"Unique word count"),
Value = c(sum(SelectorDF$feed_count),
min(SelectorDF$feed_count),
nrow(ArticleDF),
nrow(ArticleDF %>%
filter(str_length(item_articletext) <= 10)),
nrow(ArticleTokenDF),
length(unique(ArticleTokenDF$word)))
) %>%
kable(caption = paste0('<p style="color:black; font-size:18px"> Article Summary - ',
format(TimeStamp, "%m/%d/%Y %H:%M"),
'</p>')) %>%
kable_styling("striped") %>%
row_spec(0, color = "white", background = "black")
| Description | Value |
|---|---|
| Total articles | 129 |
| Source with least articles : USA | 9 |
| Revised total articles after random selection | 54 |
| Empty Articles | 0 |
| Total word count (net of stop words) | 9178 |
| Unique word count | 3724 |
Topics and the news change daily but which topics are reported more, less or not at all by each side?
Trying to determine topics using static methods is difficult. Here are a few methods and their challenges.
Keyword Topics : The news changes daily and a static list of keywords will never keep up with what might be hot news. Imagine that there is a hamster-worshiping cult in Laramie, Wyoming that suddenly makes the news for something tragic or wonderful. Chances are that neither "Laramie" nor "hamster" would be in your list of hot keywords.
Source-Labeled Topics : First, the sources may label the same topic in slightly different ways, making it difficult to tie topics together across sources. Second, we want to remove any meta-bias in how a source might label its own stories. From our previous example, perhaps one source's "cult" is another source's "charity".
Therefore we will be using a machine learning algorithm known as Latent Dirichlet Allocation or LDA to determine the topics.
LDA is a popular method for sorting documents into topics. It treats each document as a mixture of topics, and each topic as a mixture of words. This allows documents to "overlap" each other in terms of content, rather than being separated into discrete groups, in a way that mirrors typical use of natural language. For our purposes we will only be classifying each article to its most probable topic.
If you want to understand LDA in more depth, there is the Wikipedia entry as well as an easier-to-read Medium article with pictures.
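To make the "mixture of topics" idea concrete, you can inspect the fitted model's per-document topic probabilities (gamma) for a single article. This is just an illustrative sketch using the ArticleLDA object fitted above; the output changes with every run:
# Sketch: per-topic probabilities (gamma) for the first article in the model
tidy(ArticleLDA, matrix = "gamma") %>%
filter(document == first(document)) %>%
arrange(desc(gamma))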
ArticleTopTerms %>%
mutate(term = reorder_within(term, beta, topic)) %>%
ggplot(aes(beta, term, fill = factor(topic))) +
geom_col(show.legend = FALSE) +
facet_wrap(~ topic, scales = "free") +
scale_y_reordered() +
theme_classic() +
labs(title = "Article Topic Model Groups",
subtitle = "By Most Probable Terms within Topic",
x = "Probability", y = "Term")
Because the topics are mathematically generated they have no real name. The number of topics is a parameter we can define for LDA and I currently have it set to 80% of the per-source article count after random selection, capped at 10. If there were 6 sources of 10 articles each, LDA will split that into 8 topics for us.
For labeling purposes we will be identifying each topic by its three most probable keywords. In late 2020, you might have ended up with something like "trump election biden" as a topic. That doesn't mean that every article in that group contains even one of those three words, or even sounds like it could be related to them. Typically the combined probability of the top 5 words in a given topic is less than 0.10, meaning that well over 90% of the words in those articles are something other than the top terms.
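You can check that claim directly from the model output. Here is a quick look at the probability mass carried by each topic's top 5 terms, using the ArticleTopTerms data frame built above:
# How much of each topic's word probability do the top 5 terms carry?
ArticleTopTerms %>%
group_by(topic) %>%
summarise(top5_beta = sum(beta), .groups = "drop")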
Also note that all words have been "stemmed" to reduce related forms to a common root. Words like "placing", "placed" and "places" will be stemmed to "place". This is great for text analytics but sometimes creates odd results like "suprem" instead of "supreme".
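For a quick feel for what the stemmer does, you can call SnowballC::wordStem directly (the same function used in the tokenizing chunk above):
# Illustration of Porter stemming on the examples mentioned above
SnowballC::wordStem(c("placing", "placed", "places", "supreme"))
# Should return: "place" "place" "place" "suprem"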
# Add the article topics to the main DF
ArticleDF <- ArticleDF %>%
left_join(ArticleTopics_Doc, by = c("item_link" = "document")) %>%
mutate(item_length = str_length(item_articletext),
item_empty = ifelse(item_length <= 10, "Y", "N"))
Now that the topics have been labeled, it's time to see which source types are reporting or not reporting each topic.
ArticleDF %>%
filter(str_length(item_articletext) > 10) %>%
group_by(feed_type, topic, topic_name) %>%
summarise(topic_count = n(), .groups = "drop") %>%
ggplot(aes(feed_type, topic_name, fill = topic_count)) +
geom_tile() +
# geom_label(aes(feed_type, topic_name, label = topic_count), fill = "white") +
scale_fill_gradient(low = "light grey", high = "dark red") +
theme_classic() +
labs(title = "Article Count by Topic and Source Type",
x = "Type", y = "Topic", fill = "Topic Count")
Let's break that down further by looking at the topics reported by source.
ArticleDF %>%
filter(str_length(item_articletext) > 10) %>%
mutate(feed_title = factor(feed_title, levels = c("HUF", "CBS",
"BBC", "USA",
"FOX", "WTM"))) %>%
group_by(feed_title, topic, topic_name) %>%
summarise(topic_count = n(), .groups = "drop") %>%
ggplot(aes(feed_title, topic_name, fill = topic_count)) +
geom_tile() +
scale_fill_gradient(low = "light grey", high = "dark red") +
theme_classic() +
labs(title = "Article Count by Topic and Source Name",
x = "Type", y = "Topic", fill = "Topic Count")
To examine this in more detail, here are lists of article titles and links for any topics that one side or the other did not report.
ArticleTopic_Counts <- ArticleDF %>%
group_by(feed_type, topic_name) %>%
summarise(topic_count = n(), .groups = "drop") %>%
filter(!is.na(topic_name) & topic_name != "NA") %>%
pivot_wider(names_from = feed_type, values_from = topic_count) %>%
rename(TopicRptLeft = Left, TopicRptCenter = Center, TopicRptRight = Right)
ArticleDF <- ArticleDF %>%
mutate(item_display_link = cell_spec(item_title, "html",
link = item_link)) %>%
left_join(ArticleTopic_Counts, by = c("topic_name"))
# Filter to articles not reported by right
if (nrow(ArticleDF %>% filter(is.na(TopicRptRight))) < 1) {
cat("There are no articles with topics not reported by the right")
} else {
ArticleDF %>%
filter(is.na(TopicRptRight)) %>%
dplyr::select(feed_title, feed_type, item_display_link, topic_name) %>%
rename(Feed = feed_title,
Bias = feed_type,
Link = item_display_link,
Topic = topic_name) %>%
kable("html", escape = FALSE,
caption = '<p style="color:blue; font-size:18px">Topics Not Reported by Right</p>') %>%
kable_styling("striped",
bootstrap_options = c("hover", "condensed")) %>%
row_spec(0, color = "white", background = "black") %>%
scroll_box(height = "500px")
}
# Filter to articles not reported by left
if (nrow(ArticleDF %>% filter(is.na(TopicRptLeft))) < 1) {
cat("There are no articles with topics not reported by the left")
} else {
ArticleDF %>%
filter(is.na(TopicRptLeft)) %>%
dplyr::select(feed_title, feed_type, item_display_link, topic_name) %>%
rename(Feed = feed_title,
Bias = feed_type,
Link = item_display_link,
Topic = topic_name) %>%
kable("html", escape = FALSE,
caption = '<p style="color:red; font-size:18px">Topics Not Reported by Left</p>') %>%
kable_styling("striped",
bootstrap_options = c("hover", "condensed")) %>%
row_spec(0, color = "white", background = "black") %>%
scroll_box(height = "500px")
}
| Feed | Bias | Link | Topic |
|---|---|---|---|
| BBC | Center | 'I built a career from true crime and make-up' | polic video offic |
| BBC | Center | Thomas Lane: Ex-policeman jailed for role in George Floyd killing | polic video offic |
| FOX | Right | DC hotel shooting leaves one person injured after 'barricade' situation: Police | polic video offic |
| FOX | Right | North Carolina police officers shoot suspect while responding to kidnapping call, caught on doorbell video | polic video offic |
| FOX | Right | California mom on bus platform assaulted by man trying to take her toddler | polic video offic |
| USA | Center | 'One date ... and now I'm being sued': Woman sues man for $10K for standing her up on a date | polic video offic |
| USA | Center | Great white shark washes up on Long Island beach in New York, police say | polic video offic |
# Before we spend $ and time getting sentiments, we will see if these have been previously reported. If so we will flag them so that they can be filtered from the API calls
OldArticleDF <- readRDS(file = "NewsMachine_ArticleDF.RDS") %>%
dplyr::select(item_link, sentiment, positive, neutral, negative) %>%
mutate(reported = "Y")
ArticleDF <- ArticleDF %>%
left_join(OldArticleDF %>%
dplyr::select(item_link, reported), by = "item_link")
## GET ARTICLE SENTIMENTS
# We will use the Microsoft cognitive services API
tak <- rawToChar(readRDS(paste0("/home/woolyadmin/R/tak.rds"))) # My Microsoft API key
#MSURL31 <- "https://eastus.api.cognitive.microsoft.com/text/analytics/v3.1-preview.3/sentiment?showStats=TRUE"
MSURL31 <- "https://eastus.api.cognitive.microsoft.com/text/analytics/v3.1/sentiment?showStats=TRUE"
WipDF <- ArticleDF %>%
filter(str_length(item_articletext) > 10 &
is.na(reported))
if (nrow(WipDF) > 0) {
for (i in 1:nrow(WipDF)) {
# print(paste0(i, " ", WipDF$item_title[i]))
DocDF <- tibble(
language = "en",
id = WipDF$item_link[i],
text = WipDF$item_articletext[i]) %>%
mutate(text = str_sub(text, 1, 5001))
DocJSON <- DocDF
DocJSON <- list(documents = DocJSON) %>% toJSON()
SentimentResult <- POST(
url = MSURL31,
body = DocJSON,
config=add_headers(
c('Content-Type'='application/json',
'Ocp-Apim-Subscription-Key' = tak)))
TempArticleDF <- fromJSON(
content(SentimentResult, as="text"))$documents %>%
as_tibble()
TempSentenceDF <-
fromJSON(
content(SentimentResult, as="text"))$documents$sentences[[1]] %>%
as_tibble() %>%
mutate(id = DocDF$id[1])
if (i == 1) {
WipArticleSentimentDF <- TempArticleDF
WipSentenceSentimentDF <- TempSentenceDF
} else {
WipArticleSentimentDF <- bind_rows(WipArticleSentimentDF,
TempArticleDF)
WipSentenceSentimentDF <- bind_rows(WipSentenceSentimentDF,
TempSentenceDF)
}
}
# Resolve the embedded data frames in the Article Sentiment DF
WipArticleSentimentDF <- bind_cols(WipArticleSentimentDF %>%
dplyr::select(id, sentiment),
WipArticleSentimentDF$confidenceScores)
WipSentenceSentimentDF <- bind_cols(WipSentenceSentimentDF %>%
dplyr::select(id, sentiment, offset,
length, text),
WipSentenceSentimentDF$confidenceScores)
# Add the article sentiments to the main article data frame
WipDF <- WipDF %>%
filter(is.na(reported)) %>%
left_join(WipArticleSentimentDF,
by = c("item_link" = "id"))
}
ArticleDF <- ArticleDF %>%
left_join(OldArticleDF %>%
dplyr::select(-reported),
by = c("item_link"))
# Add new articles if present, replacing their un-scored copies so they are not duplicated
if (nrow(WipDF) > 0) {
ArticleDF <- ArticleDF %>%
anti_join(WipDF %>% dplyr::select(item_link), by = "item_link") %>%
bind_rows(WipDF)
}
ArticleDF <- ArticleDF %>%
mutate(sentiment = factor(sentiment,
levels = c("positive", "mixed",
"neutral", "negative", "")))
Is each side reporting certain topics in a more positive or more negative tone than the other side?
There are many packages and methods for getting word sentiment. However, scoring words in isolation is too simplistic and can lead to the wrong conclusions.
The best sentiment engines use word sequence and context to generate more reliable results. Since we do not have the time or the will to develop that independently, we are fortunate that there are commercial options from Amazon, Google, IBM, Microsoft and others. For our purposes, we are going to use Microsoft's Cognitive Services Text Analytics API to get sentiment. There is a cost for heavy usage, but for this project I think we can stay within the 5K-per-month free tier.
The API delivers results at the sentence level and then aggregates that to the document (article) level. Here is a sampling at the document level.
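When at least one new article has been scored in the current run, the sentence-level detail behind a document score can be inspected as well. This is a small sketch using the WipSentenceSentimentDF built above (it will not exist on runs with no new articles):
# Sketch: sentence-level sentiment behind one document-level score
WipSentenceSentimentDF %>%
filter(id == first(id)) %>%
dplyr::select(sentiment, positive, neutral, negative, text) %>%
head(5)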
ArticleDF[1:3,] %>%
dplyr::select(item_title, sentiment, positive, neutral, negative) %>%
rename(Title = item_title, Sentiment = sentiment, Positive = positive,
Neutral = neutral, Negative = negative) %>%
kable("html", escape = FALSE,
caption = '<p style="color:black; font-size:18px">Sentiment Results Example</p>') %>%
kable_styling("striped",
bootstrap_options = c("hover", "condensed")) %>%
row_spec(0, color = "white", background = "black")
| Title | Sentiment | Positive | Neutral | Negative |
|---|---|---|---|---|
| Dave Chappelle show cancelled over transgender jokes controversy | NA | NA | NA | NA |
| Dave Chappelle show cancelled over transgender jokes controversy | negative | 0.01 | 0.01 | 0.98 |
| Dave Chappelle show cancelled over transgender jokes controversy | NA | NA | NA | NA |
Now let's explore that sentiment by topic and source type. It's interesting, though not surprising, how little of the overall news sentiment is positive. It reinforces the axiom that "Bad News Sells".
ArticleDF %>%
filter(!is.na(sentiment)) %>%
group_by(feed_type, topic_name, sentiment) %>%
summarise(sentiment_count = n(), .groups = "drop") %>%
ggplot(aes(sentiment_count, topic_name, fill = sentiment)) +
geom_col() +
facet_wrap(~ feed_type, ncol = 3) +
scale_fill_manual(breaks = c("positive", "neutral", "mixed", "negative"),
values = c("light green", "light blue", "light grey", "black")) +
theme_classic() +
labs(title = "Article Sentiment Count by Topic and Source Type",
x = "Topic Count", y = "Topic", fill = "Sentiment")
A breakdown by source reveals who is the most negative overall.
ArticleDF %>%
mutate(feed_title = factor(feed_title, levels = c("HUF", "CBS",
"BBC", "USA",
"FOX", "WTM"))) %>%
group_by(feed_title, topic_name, sentiment) %>%
summarise(sentiment_count = n(), .groups = "drop") %>%
ggplot(aes(sentiment_count, topic_name, fill = sentiment)) +
geom_col() +
facet_wrap(~ feed_title, ncol = 6) +
scale_fill_manual(breaks = c("positive", "neutral", "mixed", "negative"),
values = c("light green", "light blue", "light grey", "black")) +
theme_classic() +
labs(title = "Article Sentiment Count by Topic and Source Name",
x = "Topic Count", y = "Topic", fill = "Sentiment")
Are articles that mention certain keywords reported more, less or not at all by each side, and what is the sentiment of the articles containing those keywords?
We are starting this list small and will eventually grow it based on the results from the LDA analytics. For now the keywords are broken down as follows:
Left Word Group : Last names of key political figures and terms associated with the left.
Right Word Group : Last names of key political figures and terms associated with the right.
Topic Group : Independent topics that could align with either group.
The word groups will be evaluated as a whole while the topics will be evaluated individually.
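One wrinkle to keep in mind: the article tokens were stemmed earlier, so the keywords have to be stemmed at match time to line up with those tokens (see the keyword-matching chunk below). A quick illustrative check of what a few of the keywords reduce to:
# Illustration: several keywords after Porter stemming (to match the stemmed tokens)
SnowballC::wordStem(c("conservative", "progressive", "vaccine", "immigration"))
# Should return: "conserv" "progress" "vaccin" "immigr"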
Keywords <- tibble(
word = c("trump", "pence", "mcconnell", "mccarthy", "republican",
"conservative"),
type = rep("Right Word Group", 6)) %>%
bind_rows(tibble(
word = c("biden", "harris", "pelosi", "schumer", "democrat",
"liberal", "progressive"),
type = rep("Left Word Group", 7))) %>%
bind_rows(tibble(
word = c("christian", "muslim", "immigration", "riot", "defund",
"protest", "lgbtq", "vaccine", "covid"),
type = rep("Topic Group", 9)))
Keywords %>%
rename(Word = word, Type = type) %>%
kable("html", escape = FALSE,
caption = '<p style="color:black; font-size:18px">Keyword Topics and Groupings</p>') %>%
kable_styling("striped",
bootstrap_options = c("hover", "condensed")) %>%
row_spec(0, color = "white", background = "black") %>%
scroll_box(height = "500px")
| Word | Type |
|---|---|
| trump | Right Word Group |
| pence | Right Word Group |
| mcconnell | Right Word Group |
| mccarthy | Right Word Group |
| republican | Right Word Group |
| conservative | Right Word Group |
| biden | Left Word Group |
| harris | Left Word Group |
| pelosi | Left Word Group |
| schumer | Left Word Group |
| democrat | Left Word Group |
| liberal | Left Word Group |
| progressive | Left Word Group |
| christian | Topic Group |
| muslim | Topic Group |
| immigration | Topic Group |
| riot | Topic Group |
| defund | Topic Group |
| protest | Topic Group |
| lgbtq | Topic Group |
| vaccine | Topic Group |
| covid | Topic Group |
If an article has a keyword in it 3 or more times then it is included in the keyword analysis. First, a view by source type.
KeywordSentimentDF <- ArticleTokenDF %>%
# The article tokens were stemmed above, so stem the keywords before joining
left_join(Keywords %>%
mutate(word = SnowballC::wordStem(word)),
by = c("word")) %>%
filter(count >= 3 & !is.na(type)) %>%
mutate(KeyWord = case_when(
type != "Topic Group" ~ type,
TRUE ~ word)) %>%
group_by(item_link, KeyWord) %>%
summarise(n = n(), .groups = c("drop")) %>%
left_join(ArticleDF %>%
dplyr::select(item_link, feed_type, feed_title, sentiment),
by = c("item_link"))
KeywordSentimentDF %>%
ggplot(aes(n, KeyWord, fill = sentiment)) +
geom_col() +
facet_wrap(~ feed_type, ncol = 3) +
scale_fill_manual(breaks = c("positive", "neutral", "mixed", "negative"),
values = c("light green", "light blue", "light grey", "black")) +
theme_classic() +
labs(title = "Article Sentiment Count by Select Keyword Groups and Source Type",
x = "Topic Count", y = "Topic", fill = "Sentiment")
Now a deeper look by source name.
XMax <- KeywordSentimentDF %>%
group_by(feed_title) %>%
summarise(n = n(), .groups = "drop")
XMax <- max(XMax$n)
XScale <- round(XMax / 2, 0)
KeywordSentimentDF %>%
mutate(feed_title = factor(feed_title, levels = c("HUF", "CBS",
"BBC", "USA",
"FOX", "WTM"))) %>%
ggplot(aes(n, KeyWord, fill = sentiment)) +
geom_col() +
facet_wrap(~ feed_title, ncol = 6) +
scale_x_continuous(breaks = seq(0, XMax, by = XScale)) +
scale_fill_manual(breaks = c("positive", "neutral", "mixed", "negative"),
values = c("light green", "light blue", "light grey", "black")) +
theme_classic() +
labs(title = "Article Sentiment Count by Select Keyword Groups and Source Name",
x = "Topic Count", y = "Topic", fill = "Sentiment")
Now that you have seen some objective measures of potential bias, you can consume the news by...
Being savvy
ArticleDF %>%
mutate(UnreportedBy = case_when(
is.na(TopicRptRight) & !is.na(TopicRptLeft) ~ "Right",
is.na(TopicRptLeft) & !is.na(TopicRptRight) ~ "Left",
is.na(TopicRptCenter) & (!is.na(TopicRptRight) | !is.na(TopicRptLeft)) ~ "Center",
TRUE ~ ""
)) %>%
rename(Feed = feed_title,
Bias = feed_type,
Link = item_display_link,
Topic = topic_name,
Sentiment =sentiment) %>%
dplyr::select(Topic, Bias, Feed, Link, Sentiment, UnreportedBy) %>%
arrange(Topic, Bias, Feed, Link, Sentiment) %>%
kable("html", escape = FALSE,
caption = paste0('<p style="color:black; font-size:18px">',
'Appendix - All Articles',
'</p>')) %>%
kable_styling("striped",
bootstrap_options = c("hover", "condensed")) %>%
row_spec(0, color = "white", background = "black") %>%
scroll_box(height = "500px")
Note : I have plans to evaluate the news over time now that I have this process running and collecting data nearly every day. If you have some ideas for exploring it further, please leave a comment.
TestDataset01 <- dbConnect(
bigrquery::bigquery(),
project = "tokyo-portal-227113",
dataset = "TestDataset01",
billing = "tokyo-portal-227113")
# savvy-analytics-42.demonstration
Demonstration <- dbConnect(
bigrquery::bigquery(),
project = "savvy-analytics-42",
dataset = "demonstration",
billing = "savvy-analytics-42")
# If there are no new articles then no need to output
if (nrow(WipDF) > 0) {
TodaysArticleDF = ArticleDF %>%
filter(is.na(reported)) %>%
dplyr::select(-reported)
PriorArticleDF = readRDS("NewsMachine_ArticleDF.RDS")
# Append the current article DF (exclude previously reported)
# to the historical article DF
ALLArticleDF <- bind_rows(TodaysArticleDF, PriorArticleDF)
# Save the historical articles
saveRDS(ALLArticleDF, file = "NewsMachine_ArticleDF.RDS")
# Save the historical articles to BigQuery
bq_auth(path = "BQAPIKey.json")
dbWriteTable(TestDataset01, "NewsMachine_Article", TodaysArticleDF, append = TRUE)
bq_auth(path = "BQSavvyDemonstration.json")
dbWriteTable(Demonstration, "NewsMachine_Article", TodaysArticleDF, append = TRUE)
# Append the current sentence sentiment to the RDS file
ALLSentenceSentimentDF <- bind_rows(WipSentenceSentimentDF,
readRDS("NewsMachine_SentenceSentimentDF.RDS"))
# Save the historical sentence sentiments
saveRDS(ALLSentenceSentimentDF, file = "NewsMachine_SentenceSentimentDF.RDS")
}