First-level agenda-setting theory explains how the media influence which issues or topics the public perceives as most important. Rather than determining how people think about an issue, this level of agenda setting concerns what people think about by highlighting certain subjects more frequently or prominently. Increased media attention on specific topics can elevate their perceived importance among audiences.
In this context, the analysis will focus on two Major League Baseball (MLB) teams—the Los Angeles Dodgers and the Toronto Blue Jays. By quantifying and visualizing the frequency of media coverage related to both teams over time, the study seeks to identify patterns that indicate which team receives greater media attention and, consequently, higher public salience.
This emphasis on the visibility and prominence of coverage exemplifies first-level agenda-setting theory. It demonstrates how media outlets influence the public’s awareness and prioritization of particular sports teams, ultimately shaping which organizations occupy the center of public discussion and interest.
Hypothesis: The weekly volume of APNews.com stories about the two sports teams will differ during the first nine months of 2025.
Weekly APNews.com coverage volume, measured continuously as the number of stories per week, served as the dependent variable. The independent variable was “story topic,” a categorical measure with two levels: the Los Angeles Dodgers and the Toronto Blue Jays. Stories were assigned to a topic by keyword. The keywords for the Los Angeles Dodgers were “Dodgers” and “Ohtani” (for Shohei Ohtani). The keyword for the Toronto Blue Jays was “Blue Jays.”
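As a rough illustration of the keyword approach, the sketch below flags a few made-up headlines using whole-word, case-insensitive matching. The headlines and the `flag_topic` helper are hypothetical, and base R `grepl` stands in for the `stringr` calls in the analysis script, so this is an approximation of the method rather than the script itself.

```r
# Toy headlines (hypothetical, not drawn from the APNews data)
stories <- c(
  "Ohtani homers twice as Dodgers rout Padres",
  "Blue Jays rally past Yankees in extra innings",
  "Artful dodgers of tax law face new scrutiny",
  "Bluejays spotted at backyard feeders"
)

# Hypothetical helper: "Yes" if any phrase appears as a whole word or phrase.
# Assumes the phrases contain no regex metacharacters.
flag_topic <- function(text, phrases) {
  pattern <- paste0("\\b", phrases, "\\b", collapse = "|")
  ifelse(grepl(pattern, text, ignore.case = TRUE, perl = TRUE), "Yes", "No")
}

flags <- data.frame(
  Topic1 = flag_topic(stories, c("Dodgers", "Ohtani")),
  Topic2 = flag_topic(stories, "Blue Jays")
)
print(cbind(stories, flags))
```

The third headline is flagged as Dodgers-related even though it has nothing to do with baseball, which suggests that case-insensitive keyword matching can overcount a topic. The fourth is not flagged because "Bluejays" lacks the space in "Blue Jays".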
A paired-samples t-test was used to assess whether the difference in weekly coverage volume between the two sports teams was statistically significant.
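The logic of the paired design can be sketched with a few invented numbers. Because each week contributes one count per team, the test reduces to a one-sample t-test on the week-by-week differences. The counts below are hypothetical and serve only to show the equivalence.

```r
# Hypothetical weekly story counts for six weeks (illustrative only)
dodgers   <- c(20, 15, 30, 18, 25, 22)
blue_jays <- c(10,  9, 12, 11,  8, 13)

# Paired-samples t-test on the two series
paired <- t.test(dodgers, blue_jays, paired = TRUE)

# Equivalent one-sample t-test on the pair differences
d <- dodgers - blue_jays
one_sample <- t.test(d, mu = 0)

# The same statistic computed by hand: mean difference over its standard error
t_by_hand <- mean(d) / (sd(d) / sqrt(length(d)))

print(c(paired = unname(paired$statistic),
        one_sample = unname(one_sample$statistic),
        by_hand = t_by_hand))
```

All three values agree, and the degrees of freedom equal the number of pairs minus one.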
The figure below illustrates the weekly coverage of each story topic throughout the analyzed period. The Blue Jays received steady coverage, whereas the Los Angeles Dodgers displayed more pronounced spikes in response to significant events.
The most notable difference in volume occurred in Week 25 (June 16 to June 22). The weekly count of Dodgers-related stories reached a period high of 56, while the Blue Jays count for the same week was only 16. Most of the Dodgers-related stories that week centered on Shohei Ohtani.
The Blue Jays' only peak came in Week 39 (September 22 to September 28), when the weekly count of Blue Jays-related stories reached a period high of 22. Even then, the Dodgers drew more coverage, with 27 stories that week.
Overall, the figure supports the hypothesis that the volume of stories about the two sports teams would differ over the first nine months of 2025.
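The two weeks singled out above can be compared directly. This minimal sketch uses only the four counts reported in the text; the data frame and its column names are illustrative and not part of the analysis script.

```r
# Counts for the two weeks discussed above, as reported in the text
peaks <- data.frame(
  Week      = c(25, 39),
  Dodgers   = c(56, 27),
  Blue_Jays = c(16, 22)
)

# Weekly gap in coverage: Dodgers minus Blue Jays
peaks$Gap <- peaks$Dodgers - peaks$Blue_Jays

# Week with the largest gap
print(peaks[which.max(peaks$Gap), ])
```

The Week 25 gap works out to 40 stories, which appears consistent with the maximum pair difference reported in the descriptive statistics. In Week 39 the gap narrows to 5.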
Descriptive Statistics: Pair Differences (Dodgers minus Blue Jays)

| count | mean | sd | min | max |
|---|---|---|---|---|
| 40 | 11.850 | 9.209 | 1.000 | 40.000 |

Normality Test (Shapiro-Wilk)

| statistic | p.value | method |
|---|---|---|
| 0.8994 | 0.0018 | Shapiro-Wilk normality test |

Note: If the p.value is 0.05 or less, the number of pairs is fewer than 40, and the distribution of pair differences shows obvious non-normality or outliers, consider using the Wilcoxon Signed Rank Test results instead of the Paired-Samples t-Test results.

Paired-Samples t-Test

| statistic | parameter | p.value | conf.low | conf.high | method |
|---|---|---|---|---|---|
| 8.1387 | 39 | < 0.0001 | 8.9050 | 14.7950 | Paired t-test |

Group Means and SDs (t-Test)

| V1_Mean (Blue Jays) | V2_Mean (Dodgers) | V1_SD (Blue Jays) | V2_SD (Dodgers) |
|---|---|---|---|
| 9.300 | 21.150 | 6.410 | 13.092 |
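As a consistency check, the t statistic and confidence interval can be approximately reproduced from the descriptive statistics alone. The sketch below takes the count, mean, and standard deviation of the pair differences from the tables above. Because those inputs are rounded to three decimals, small discrepancies in the last digit are expected.

```r
# Inputs copied from the descriptive statistics table (rounded values)
n      <- 40      # number of weekly pairs
mean_d <- 11.850  # mean pair difference (Dodgers minus Blue Jays)
sd_d   <- 9.209   # standard deviation of the pair differences

se     <- sd_d / sqrt(n)        # standard error of the mean difference
df     <- n - 1                 # degrees of freedom
t_stat <- mean_d / se           # paired-samples t statistic
t_crit <- qt(0.975, df)         # two-sided 95% critical value
ci     <- mean_d + c(-1, 1) * t_crit * se
p_val  <- 2 * pt(-abs(t_stat), df)

print(round(c(t = t_stat, conf.low = ci[1], conf.high = ci[2]), 3))
print(p_val)
```

The mean difference also equals the gap between the two group means (21.150 minus 9.300), as it should for paired data.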
```r
# ============================================
# APNews text analysis (First-level agenda-setting theory version)
# ============================================

# ============================================
# --- Load required libraries ---
# ============================================
if (!require("tidyverse")) install.packages("tidyverse")
if (!require("tidytext")) install.packages("tidytext")
library(tidyverse)
library(tidytext)

# ============================================
# --- Load the APNews data ---
# ============================================
# Read the data from the web
FetchedData <- readRDS(url("https://github.com/drkblake/Data/raw/refs/heads/main/APNews.rds"))
# Save the data on your computer
saveRDS(FetchedData, file = "APNews.rds")
# Remove the downloaded data from the environment
rm(FetchedData)
# Reload the data from the local copy
APNews <- readRDS("APNews.rds")
```
```r
# ============================================
# --- Flag Topic1-related stories ---
# ============================================
# --- Define Topic1 phrases ---
phrases <- c(
  "Dodgers", "Ohtani"
)
# --- Escape regex special characters ---
escaped_phrases <- str_replace_all(
  phrases,
  "([\\^$.|?*+()\\[\\]{}\\\\])",
  "\\\\\\1"
)
# --- Build whole-word/phrase regex pattern ---
pattern <- paste0("\\b", escaped_phrases, "\\b", collapse = "|")
# --- Apply matching to flag Topic1 stories ---
APNews <- APNews %>%
  mutate(
    Full.Text.clean = str_squish(Full.Text), # normalize whitespace
    Topic1 = if_else(
      str_detect(Full.Text.clean, regex(pattern, ignore_case = TRUE)),
      "Yes",
      "No"
    )
  )
```
```r
# ============================================
# --- Flag Topic2-related stories ---
# ============================================
# --- Define Topic2 phrases ---
phrases <- c(
  "Blue Jays"
)
# --- Escape regex special characters ---
escaped_phrases <- str_replace_all(
  phrases,
  "([\\^$.|?*+()\\[\\]{}\\\\])",
  "\\\\\\1"
)
# --- Build whole-word/phrase regex pattern ---
pattern <- paste0("\\b", escaped_phrases, "\\b", collapse = "|")
# --- Apply matching to flag Topic2 stories ---
APNews <- APNews %>%
  mutate(
    Full.Text.clean = str_squish(Full.Text), # normalize whitespace
    Topic2 = if_else(
      str_detect(Full.Text.clean, regex(pattern, ignore_case = TRUE)),
      "Yes",
      "No"
    )
  )
```
```r
# ============================================
# --- Visualize weekly counts of Topic1- and Topic2-related stories ---
# ============================================
# --- Load plotly if needed ---
if (!require("plotly")) install.packages("plotly")
library(plotly)
# --- Summarize weekly counts for Topic1 = "Yes" ---
Topic1_weekly <- APNews %>%
  filter(Topic1 == "Yes") %>%
  group_by(Week) %>%
  summarize(Count = n(), .groups = "drop") %>%
  mutate(Topic = "Dodgers") # Note custom Topic1 label
# --- Summarize weekly counts for Topic2 = "Yes" ---
Topic2_weekly <- APNews %>%
  filter(Topic2 == "Yes") %>%
  group_by(Week) %>%
  summarize(Count = n(), .groups = "drop") %>%
  mutate(Topic = "Blue Jays") # Note custom Topic2 label
# --- Combine both summaries into one data frame ---
Weekly_counts <- bind_rows(Topic2_weekly, Topic1_weekly)
# --- Fill in missing combinations with zero counts ---
Weekly_counts <- Weekly_counts %>%
  tidyr::complete(
    Topic,
    Week = full_seq(range(Week), 1), # generate all week numbers
    fill = list(Count = 0)
  ) %>%
  arrange(Topic, Week)
# --- Create interactive plotly line chart ---
AS1 <- plot_ly(
  data = Weekly_counts,
  x = ~Week,
  y = ~Count,
  color = ~Topic,
  colors = c("steelblue", "firebrick"),
  type = "scatter",
  mode = "lines+markers",
  line = list(width = 2),
  marker = list(size = 6)
) %>%
  layout(
    title = "Weekly Counts of Dodgers- and Blue Jays-Related AP News Articles",
    xaxis = list(
      title = "Week Number (starting with Week 1 of 2025)",
      dtick = 1
    ),
    yaxis = list(title = "Number of Articles"),
    legend = list(title = list(text = "Topic")),
    hovermode = "x unified"
  )
# ============================================
# --- Show the chart ---
# ============================================
AS1
```
```r
# ============================================================
# Setup: Install and Load Required Packages
# ============================================================
if (!require("tidyverse")) install.packages("tidyverse")
if (!require("plotly")) install.packages("plotly")
if (!require("gt")) install.packages("gt")
if (!require("gtExtras")) install.packages("gtExtras")
if (!require("broom")) install.packages("broom")
library(tidyverse)
library(plotly)
library(gt)
library(gtExtras)
library(broom)
options(scipen = 999) # suppress scientific notation in printed output
# ============================================================
# Data Preparation
# ============================================================
# Reshape to wide form: one row per week, one column per topic
mydata <- Weekly_counts %>%
  pivot_wider(names_from = Topic, values_from = Count)
names(mydata) <- make.names(names(mydata)) # "Blue Jays" becomes "Blue.Jays"
# Specify the two variables involved
mydata$V1 <- mydata$Blue.Jays # <== Customize this
mydata$V2 <- mydata$Dodgers # <== Customize this
# ============================================================
# Compute Pair Differences
# ============================================================
mydata$PairDifferences <- mydata$V2 - mydata$V1 # Dodgers minus Blue Jays
```
```r
# ============================================================
# Interactive Histogram of Pair Differences
# ============================================================
hist_plot <- plot_ly(
  data = mydata,
  x = ~PairDifferences,
  type = "histogram",
  marker = list(color = "#1f78b4", line = list(color = "black", width = 1))
) %>%
  layout(
    title = "Distribution of Pair Differences",
    xaxis = list(title = "Pair Differences"),
    yaxis = list(title = "Count"),
    shapes = list(
      list(
        type = "line",
        # Dashed vertical line at the mean pair difference
        x0 = mean(mydata$PairDifferences, na.rm = TRUE),
        x1 = mean(mydata$PairDifferences, na.rm = TRUE),
        # Span the full plot height. The earlier max(table(...)) counted
        # exact values rather than histogram bins, so the line could fall
        # short of the tallest bar.
        yref = "paper",
        y0 = 0,
        y1 = 1,
        line = list(color = "red", dash = "dash")
      )
    )
  )
```
```r
# ============================================================
# Descriptive Statistics
# ============================================================
desc_stats <- mydata %>%
  summarise(
    count = n(),
    mean = mean(PairDifferences, na.rm = TRUE),
    sd = sd(PairDifferences, na.rm = TRUE),
    min = min(PairDifferences, na.rm = TRUE),
    max = max(PairDifferences, na.rm = TRUE)
  )
desc_table <- desc_stats %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Descriptive Statistics: Pair Differences") %>%
  fmt_number(columns = where(is.numeric), decimals = 3)
# ============================================================
# Normality Test (Shapiro-Wilk)
# ============================================================
shapiro_res <- shapiro.test(mydata$PairDifferences)
shapiro_table <- tidy(shapiro_res) %>%
  select(statistic, p.value, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Normality Test (Shapiro-Wilk)") %>%
  fmt_number(columns = c(statistic, p.value), decimals = 4) %>%
  tab_source_note(
    source_note = "If the P.VALUE is 0.05 or less, the number of pairs is fewer than 40, and the distribution of pair differences shows obvious non-normality or outliers, consider using the Wilcoxon Signed Rank Test results instead of the Paired-Samples t-Test results."
  )
```
```r
# ============================================================
# Reshape Data for Repeated-Measures Plot
# ============================================================
df_long <- mydata %>%
  pivot_longer(
    cols = c(V1, V2),
    names_to = "Measure",
    values_to = "Value"
  )
# ============================================================
# Repeated-Measures Boxplot (Interactive, with Means)
# ============================================================
group_means <- df_long %>%
  group_by(Measure) %>%
  summarise(mean_value = mean(Value), .groups = "drop")
boxplot_measures <- plot_ly() %>%
  add_trace(
    data = df_long,
    x = ~Measure, y = ~Value,
    type = "box",
    boxpoints = "outliers",
    marker = list(color = "red", size = 4),
    line = list(color = "black"),
    fillcolor = "royalblue",
    name = ""
  ) %>%
  add_trace(
    data = group_means,
    x = ~Measure, y = ~mean_value,
    type = "scatter", mode = "markers",
    marker = list(
      symbol = "diamond", size = 9,
      color = "black", line = list(color = "white", width = 1)
    ),
    text = ~paste0("Mean = ", round(mean_value, 2)),
    hoverinfo = "text",
    name = "Group Mean"
  ) %>%
  layout(
    title = "Boxplot of Repeated Measures (V1 vs V2) with Means",
    xaxis = list(title = "Measure"),
    yaxis = list(title = "Value"),
    showlegend = FALSE
  )
```
```r
# ============================================================
# Parametric Test (Paired-Samples t-Test)
# ============================================================
t_res <- t.test(mydata$V2, mydata$V1, paired = TRUE)
t_table <- tidy(t_res) %>%
  select(statistic, parameter, p.value, conf.low, conf.high, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Paired-Samples t-Test") %>%
  fmt_number(columns = c(statistic, p.value, conf.low, conf.high), decimals = 4)
t_summary <- mydata %>%
  select(V1, V2) %>%
  summarise_all(list(Mean = mean, SD = sd)) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Group Means and SDs (t-Test)") %>%
  fmt_number(columns = everything(), decimals = 3)
# ============================================================
# Nonparametric Test (Wilcoxon Signed Rank)
# ============================================================
wilcox_res <- wilcox.test(mydata$V1, mydata$V2,
  paired = TRUE,
  exact = FALSE
)
wilcox_table <- tidy(wilcox_res) %>%
  select(statistic, p.value, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Wilcoxon Signed Rank Test") %>%
  fmt_number(columns = c(statistic, p.value), decimals = 4)
wilcox_summary <- mydata %>%
  select(V1, V2) %>%
  summarise_all(list(Mean = mean, SD = sd)) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Group Means and SDs (Wilcoxon)") %>%
  fmt_number(columns = everything(), decimals = 3)
```
```r
# ============================================================
# Results Summary (in specified order)
# ============================================================
hist_plot
desc_table
shapiro_table
boxplot_measures
t_table
t_summary
```
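The script computes Wilcoxon results but displays only the t-test output. One plausible reading of the note under the Shapiro-Wilk table is that all three of its conditions must hold before switching tests. The sketch below encodes that reading. The `prefer_wilcoxon` helper and its `looks_non_normal` argument are hypothetical, and the third condition is a visual judgment that is simply assumed to be `TRUE` here.

```r
# Hypothetical helper encoding the three-part rule from the table note.
# looks_non_normal stands for a visual judgment of the histogram and
# boxplot; it is an assumption, not something the script computes.
prefer_wilcoxon <- function(shapiro_p, n_pairs, looks_non_normal) {
  shapiro_p <= 0.05 && n_pairs < 40 && looks_non_normal
}

# Values reported in the tables: Shapiro-Wilk p = 0.0018, 40 pairs
decision <- prefer_wilcoxon(
  shapiro_p = 0.0018,
  n_pairs = 40,
  looks_non_normal = TRUE
)
print(decision)
```

Under this reading the rule does not trigger, because 40 pairs is not fewer than 40, which would explain why the t-test results are the ones reported despite the significant Shapiro-Wilk result. With one fewer pair, the same inputs would point to the Wilcoxon test.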