Rationale

First-level agenda-setting theory suggests that the amount of coverage a topic receives can influence how important the audience perceives that topic to be. In other words, the more coverage a topic gets, the more important audiences tend to consider it.

Based on this theory, the purpose of this analysis is to examine how the media allocates attention to competing topics, such as the Boston Red Sox and the New York Yankees.

By counting the APNews articles about each team, we can see which team gets more attention and, by the theory's logic, which team audiences may come to perceive as more important.

Hypothesis

Weekly APNews coverage volume of the Boston Red Sox and the New York Yankees differed during the first nine months of 2025.

Variables & Method

The dependent variable in the analysis was the weekly number of APNews.com stories mentioning each of the two baseball teams. The independent variable was the team, either “Red Sox” or “Yankees.” Keywords and phrases used to identify stories about the Red Sox were “Red Sox,” “Boston Red Sox,” and “BoSox.” Keywords and phrases used to identify stories about the Yankees were “Yankees,” “New York Yankees,” and “Bronx Bombers.” Matching was case-insensitive and limited to whole words or phrases.

Eventually, a paired-samples t-test will be used to assess whether the difference in weekly coverage volume between the two teams is statistically significant.
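
As a rough sketch of how such a test might be run in R, the example below applies base R's t.test() with paired = TRUE to a small data frame. The data frame weekly_example and every count in it are hypothetical placeholders made up for illustration; they are not APNews results, and the eventual analysis may be set up differently.

```r
# Hypothetical weekly story counts, for illustration only (NOT the APNews data).
# Each row is one week, so each week supplies one Red Sox / Yankees pair.
weekly_example <- data.frame(
  Week    = 1:6,
  RedSox  = c(4, 6, 5, 9, 7, 8),
  Yankees = c(7, 8, 6, 12, 9, 11)
)

# Paired-samples t-test on the week-by-week differences (Yankees minus Red Sox)
result <- t.test(weekly_example$Yankees, weekly_example$RedSox, paired = TRUE)

# Show the t statistic, degrees of freedom, p-value, and confidence interval
print(result)
```

The test is paired because both counts in a row come from the same week, so week-to-week swings in overall baseball news affect both teams' counts together. With the real data, the two columns would hold the weekly counts for each team, with zero-count weeks included.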

Results & Discussion

The figure below summarizes each team’s weekly coverage across the first 40 weeks of the year. It appears that the coverage of the New York Yankees exceeded the coverage of the Boston Red Sox.

The most notable difference in APNews.com’s coverage appeared in week 26, which ran from June 23 to June 29. The Yankees had a story count of 45, compared to only 18 stories about the Red Sox. A quick search showed that Aaron Judge reached 30 home runs on June 29, which made him the first player in MLB history to reach 30 home runs and 110 hits before the month of July. During that same time, the Red Sox dropped five of six games to the LA Dodgers and the Toronto Blue Jays, both teams that eventually advanced to the World Series.

The peak for Boston Red Sox stories came in week 25, with 25 published stories mentioning the team. A closer look at the schedule shows that the Red Sox traveled to San Francisco to face the Giants just one week after trading them their best hitter, Rafael Devers. The first game against his former team likely contributed to the spike in media attention.

Overall, the results suggest differing coverage of the two teams and possible nonrandom patterns in media attention, where spikes for one team may correspond to dips in coverage of the other.

Code

Here is the code that produced this figure.

# ============================================
# APNews text analysis (First-level agenda-setting theory version)
# ============================================

# ============================================
# --- Load required libraries ---
# ============================================

if (!require("tidyverse")) install.packages("tidyverse")
if (!require("tidytext")) install.packages("tidytext")

library(tidyverse)
library(tidytext)

# ============================================
# --- Load the APNews data ---
# ============================================

# Read the data from the web
FetchedData <- readRDS(url("https://github.com/drkblake/Data/raw/refs/heads/main/APNews.rds"))
# Save the data on your computer
saveRDS(FetchedData, file = "APNews.rds")
# Remove the downloaded data from the environment
rm(FetchedData)

APNews <- readRDS("APNews.rds")

# ============================================
# --- Flag Topic1-related stories ---
# ============================================

# --- Define Topic1 phrases ---
phrases <- c(
  "Red Sox",
  "Boston Red Sox",
  "BoSox"
)

# --- Escape regex special characters ---
escaped_phrases <- str_replace_all(
  phrases,
  "([\\^$.|?*+()\\[\\]{}\\\\])",
  "\\\\\\1"
)

# --- Build whole-word/phrase regex pattern ---
pattern <- paste0("\\b", escaped_phrases, "\\b", collapse = "|")

# --- Apply matching to flag Topic1 stories ---
APNews <- APNews %>%
  mutate(
    Full.Text.clean = str_squish(Full.Text),  # normalize whitespace
    Topic1 = if_else(
      str_detect(Full.Text.clean, regex(pattern, ignore_case = TRUE)),
      "Yes",
      "No"
    )
  )

# ============================================
# --- Flag Topic2-related stories ---
# ============================================

# --- Define Topic2 phrases ---
phrases <- c(
  "Yankees",
  "New York Yankees",
  "Bronx Bombers"
)

# --- Escape regex special characters ---
escaped_phrases <- str_replace_all(
  phrases,
  "([\\^$.|?*+()\\[\\]{}\\\\])",
  "\\\\\\1"
)

# --- Build whole-word/phrase regex pattern ---
pattern <- paste0("\\b", escaped_phrases, "\\b", collapse = "|")

# --- Apply matching to flag Topic2 stories ---
APNews <- APNews %>%
  mutate(
    Full.Text.clean = str_squish(Full.Text),
    Topic2 = if_else(
      str_detect(Full.Text.clean, regex(pattern, ignore_case = TRUE)),
      "Yes",
      "No"
    )
  )

# ============================================
# --- Visualize weekly counts of Topic1- and Topic2-related stories ---
# ============================================

# --- Load plotly if needed ---
if (!require("plotly")) install.packages("plotly")
library(plotly)

# --- Summarize weekly counts for Topic1 = "Yes" ---
Topic1_weekly <- APNews %>%
  filter(Topic1 == "Yes") %>%
  group_by(Week) %>%
  summarize(Count = n(), .groups = "drop") %>%
  mutate(Topic = "Red Sox") # Note custom Topic1 label

# --- Summarize weekly counts for Topic2 = "Yes" ---
Topic2_weekly <- APNews %>%
  filter(Topic2 == "Yes") %>%
  group_by(Week) %>%
  summarize(Count = n(), .groups = "drop") %>%
  mutate(Topic = "Yankees") # Note custom Topic2 label

# --- Combine both summaries into one data frame ---
Weekly_counts <- bind_rows(Topic2_weekly, Topic1_weekly)

# --- Fill in missing combinations with zero counts ---
Weekly_counts <- Weekly_counts %>%
  tidyr::complete(
    Topic,
    Week = full_seq(range(Week), 1),  # generate all week numbers
    fill = list(Count = 0)
  ) %>%
  arrange(Topic, Week)

# --- Create interactive plotly line chart ---
AS1 <- plot_ly(
  data = Weekly_counts,
  x = ~Week,
  y = ~Count,
  color = ~Topic,
  colors = c("firebrick", "steelblue"),
  type = "scatter",
  mode = "lines+markers",
  line = list(width = 2),
  marker = list(size = 6)
) %>%
  layout(
    title = "Weekly Coverage: Red Sox vs. Yankees",
    xaxis = list(
      title = "Week Number (starting with Week 1 of 2025)",
      dtick = 1
    ),
    yaxis = list(title = "Number of Articles"),
    legend = list(title = list(text = "Topic")),
    hovermode = "x unified"
  )

# ============================================
# --- Show the chart ---
# ============================================

AS1