Rationale

First-level agenda-setting theory examines how specific topics compete for prominence on the media agenda. When news organizations judge one issue to be more important than another, the less important issue may receive less coverage. Depending on how serious the issues are, however, both may be highly prominent at the same time.

The United States is still adjusting to a new presidential administration and its policies. My project compares the media agenda prominence of two topics: tariffs and immigration. President Donald Trump discussed both topics frequently while running for office, and both have dominated news headlines in 2025. They are also topics that many Americans care about.

The analysis compares the number of stories per week that APNews.com published about each issue between Jan. 1 and Sept. 30, 2025. The results should add to theoretical understanding of how similar but distinct issues, each affecting both the American people and how America is perceived, compete for attention on the media agenda.

Hypothesis

Weekly APNews.com coverage volume of stories related to Tariffs differed from that of stories related to Immigration during the first nine months of 2025.

Variables & method

Weekly APNews.com coverage volume of the two topics served as the analysis’s dependent variable. It was measured continuously as the number of stories published per week. The independent variable was “story topic,” measured categorically as either “Tariffs” or “Immigration” and operationalized as whether a story’s full text contained key words or phrases likely to be unique to stories about one topic or the other. Matching was case-insensitive and limited to whole words or phrases. Key words and phrases used to identify stories about Tariffs were: “Tariffs”, “Tariff”, “Trade war”, “Import tax”, “Trade deal”, and “Sanctions”. Key words and phrases used to identify stories about Immigration were: “Immigration”, “Border”, “Asylum”, “Deportation”, and “ICE”. Each topic was flagged separately, so a story containing key words from both lists counted toward both topics.
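The matching rule described above can be sketched on a few made-up sentences. This is a minimal illustration in base R, not the analysis code itself: the example texts are hypothetical, build_pattern() is a helper name introduced here, and grepl() stands in for the stringr functions used in the full script.

```r
# Hypothetical example texts; these are NOT drawn from the APNews dataset.
texts <- c(
  "The new tariffs take effect Monday.",
  "ICE agents conducted raids in the city.",
  "Ice storms closed roads across the state.",
  "Officials discussed the trade deal."
)

tariff_terms <- c("Tariffs", "Tariff", "Trade war", "Import tax",
                  "Trade deal", "Sanctions")
immigration_terms <- c("Immigration", "Border", "Asylum", "Deportation", "ICE")

# Wrap each term in word boundaries and join the terms with "|"
# (build_pattern is an illustrative helper, not part of any package)
build_pattern <- function(terms) {
  paste0("\\b", terms, "\\b", collapse = "|")
}

# Case-insensitive, whole-word matching for each topic
is_tariffs <- grepl(build_pattern(tariff_terms), texts,
                    ignore.case = TRUE, perl = TRUE)
is_immigration <- grepl(build_pattern(immigration_terms), texts,
                        ignore.case = TRUE, perl = TRUE)

data.frame(text = texts, Tariffs = is_tariffs, Immigration = is_immigration)
```

In this sketch the third sentence is flagged as Immigration because case-insensitive matching treats “Ice” the same as “ICE”. If matches of that kind occur in the real data, they would inflate the Immigration counts, so spot-checking a sample of flagged stories may be worthwhile.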

A paired-samples t-test assessed the statistical significance of the difference in weekly coverage volume between the two story topics.
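For readers unfamiliar with the test, a paired-samples t-test works on the week-by-week differences between the two counts: the t statistic is the mean difference divided by its standard error. The sketch below checks that relationship on six hypothetical weeks of counts. The numbers are invented for illustration and are not the APNews data.

```r
# Hypothetical weekly counts for six weeks; NOT the APNews data.
tariffs     <- c(20, 35, 60, 41, 18, 30)
immigration <- c(45, 40, 38, 50, 36, 33)

# Week-by-week pair differences (Immigration minus Tariffs)
d <- immigration - tariffs
n <- length(d)

# t statistic computed by hand: mean difference / standard error
t_manual <- mean(d) / (sd(d) / sqrt(n))

# The same test using R's built-in function
res <- t.test(immigration, tariffs, paired = TRUE)

# The two approaches should agree, with n - 1 degrees of freedom
all.equal(unname(res$statistic), t_manual)
unname(res$parameter) == n - 1
```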

Results & discussion

The figure below summarizes each story topic’s weekly coverage volume across the period analyzed. APNews.com covered both Immigration and Tariffs heavily throughout the period, and each topic had weeks of peak coverage.

The first notable difference in volume appeared during Week 4, when the weekly Immigration-related story count reached a period high of 83, compared with only 21 for Tariffs. Week 4 was Mr. Trump’s first week in office. Articles from that week reported that immigrants were concerned about what the future held, that the administration said the border was becoming more secure, and that the administration was fulfilling his pledge to crack down on immigration.

Tariff-related stories peaked during Weeks 10, 14, and 15. Coverage in those weeks focused on tariffs being imposed on countries such as Mexico, Canada, and China, as well as on the reciprocal tariffs.

In Week 24, Immigration-related stories again outnumbered Tariffs-related stories as immigration enforcement raids and protests picked up around the country.

Overall, the weekly counts suggest only slight differences in APNews.com coverage volume between the two topics. They also hint that the coverage levels of the two topics may be related in nonrandom ways, although this analysis did not test that possibility.

Tariffs coverage volume averaged about 44 articles per week across the nine months. Immigration coverage volume was only slightly higher, averaging 46 articles per week, for a mean difference of about 2 articles. The pair differences failed a Shapiro-Wilk normality test (p < 0.05), but because the dataset contained 40 weekly pairs, a paired-samples t-test was nonetheless suitable for assessing the statistical significance of the average paired difference.
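The reasoning above follows the rule of thumb stated in the source note of the analysis code’s Shapiro-Wilk table. A minimal sketch of that rule appears below. It is an assumption-laden simplification: choose_paired_test() is a hypothetical helper introduced here, it encodes only the p-value and pair-count conditions, and it leaves out the visual check for obvious non-normality or outliers that the note also calls for.

```r
# Hypothetical helper encoding two parts of the rule of thumb:
# prefer the Wilcoxon test only when the Shapiro-Wilk p-value is
# at or below alpha AND there are fewer than min_pairs pairs.
# (The visual check of the histogram is not captured here.)
choose_paired_test <- function(d, alpha = 0.05, min_pairs = 40) {
  shapiro_p <- shapiro.test(d)$p.value
  if (shapiro_p <= alpha && length(d) < min_pairs) {
    "wilcoxon"
  } else {
    "t-test"
  }
}

# Made-up pair differences with one extreme outlier; NOT the APNews data.
small_sample <- c(1:10, 200)    # 11 pairs, clearly non-normal
large_sample <- c(1:39, 2000)   # 40 pairs, clearly non-normal

choose_paired_test(small_sample)
choose_paired_test(large_sample)
```

With 40 pairs, as in this project, the sketch selects the t-test regardless of the Shapiro-Wilk result, which mirrors the choice made in the analysis.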

Below is a box plot of weekly article volumes for Tariffs (V1) and Immigration (V2), followed by the results of the paired-samples t-test.

Paired-Samples t-Test
statistic	parameter	p.value	conf.low	conf.high	method
0.2789	39	0.7818	−12.0352	15.8852	Paired t-test

Group Means and SDs (t-Test)
V1_Mean	V2_Mean	V1_SD	V2_SD
44.075	46.000	36.137	17.967

The t-test result did not support the hypothesis that the weekly APNews.com coverage volume of topics related to Tariffs and Immigration differed during the first nine months of 2025. The mean difference of about 2 articles per week was not statistically significant, t(39) = 0.28, p = .78.
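As a consistency check, the reported confidence interval and p-value can be approximately reconstructed from the group means, the t statistic, and the degrees of freedom shown in the tables. The sketch below uses only those rounded, reported values, so its results should land close to the table’s figures but will not match them to the last decimal place.

```r
# Values copied from the reported tables (rounded as published)
mean_v1 <- 44.075   # Tariffs (V1) weekly mean
mean_v2 <- 46.000   # Immigration (V2) weekly mean
t_stat  <- 0.2789   # reported t statistic
df      <- 39       # reported degrees of freedom (40 pairs - 1)

# Mean paired difference (Immigration minus Tariffs)
mean_diff <- mean_v2 - mean_v1

# Standard error implied by the reported t statistic
se <- mean_diff / t_stat

# 95% confidence interval for the mean difference
margin <- qt(0.975, df) * se
ci <- c(lower = mean_diff - margin, upper = mean_diff + margin)

# Two-tailed p-value for the reported t statistic
p_value <- 2 * pt(-abs(t_stat), df)

round(ci, 2)
round(p_value, 2)
```

Because the interval spans zero, it leads to the same conclusion as the p-value: the data do not rule out a true mean difference of zero.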

Code

Here is the code that produced the figure, the box plot, and the test results.

# ============================================
# APNews text analysis (First-level agenda-setting theory version)
# ============================================

# ============================================
# --- Load required libraries ---
# ============================================

if (!require("tidyverse")) install.packages("tidyverse")
if (!require("tidytext")) install.packages("tidytext")

library(tidyverse)
library(tidytext)

# ============================================
# --- Load the APNews data ---
# ============================================

# Read the data from the web
FetchedData <- readRDS(url("https://github.com/drkblake/Data/raw/refs/heads/main/APNews.rds"))
# Save the data on your computer
saveRDS(FetchedData, file = "APNews.rds")
# remove the downloaded data from the environment
rm(FetchedData)

APNews <- readRDS("APNews.rds")

# ============================================
# --- Flag Topic1-related stories ---
# ============================================

# --- Define Topic1 phrases ---
phrases <- c(
  "Tariff",
  "Trade war",
  "Import tax",
  "Trade deal",
  "Sanctions",
  "Tariffs"
)

# --- Escape regex special characters ---
escaped_phrases <- str_replace_all(
  phrases,
  "([\\^$.|?*+()\\[\\]{}\\\\])",
  "\\\\\\1"
)

# --- Build whole-word/phrase regex pattern ---
pattern <- paste0("\\b", escaped_phrases, "\\b", collapse = "|")

# --- Apply matching to flag Topic1 stories ---
APNews <- APNews %>%
  mutate(
    Full.Text.clean = str_squish(Full.Text),  # normalize whitespace
    Topic1 = if_else(
      str_detect(Full.Text.clean, regex(pattern, ignore_case = TRUE)),
      "Yes",
      "No"
    )
  )

# ============================================
# --- Flag Topic2-related stories ---
# ============================================

# --- Define Topic2 phrases ---
phrases <- c(
  "Immigration",
  "Border",
  "Asylum",
  "Deportation",
  "ICE"
)

# --- Escape regex special characters ---
escaped_phrases <- str_replace_all(
  phrases,
  "([\\^$.|?*+()\\[\\]{}\\\\])",
  "\\\\\\1"
)

# --- Build whole-word/phrase regex pattern ---
pattern <- paste0("\\b", escaped_phrases, "\\b", collapse = "|")

# --- Apply matching to flag Topic2 stories ---
APNews <- APNews %>%
  mutate(
    Full.Text.clean = str_squish(Full.Text),
    Topic2 = if_else(
      str_detect(Full.Text.clean, regex(pattern, ignore_case = TRUE)),
      "Yes",
      "No"
    )
  )

# ============================================
# --- Visualize weekly counts of Topic1- and Topic2-related stories ---
# ============================================

# --- Load plotly if needed ---
if (!require("plotly")) install.packages("plotly")
library(plotly)

# --- Summarize weekly counts for Topic1 = "Yes" ---
Topic1_weekly <- APNews %>%
  filter(Topic1 == "Yes") %>%
  group_by(Week) %>%
  summarize(Count = n(), .groups = "drop") %>%
  mutate(Topic = "Tariffs") # Note custom Topic1 label

# --- Summarize weekly counts for Topic2 = "Yes" ---
Topic2_weekly <- APNews %>%
  filter(Topic2 == "Yes") %>%
  group_by(Week) %>%
  summarize(Count = n(), .groups = "drop") %>%
  mutate(Topic = "Immigration") # Note custom Topic2 label

# --- Combine both summaries into one data frame ---
Weekly_counts <- bind_rows(Topic2_weekly, Topic1_weekly)

# --- Fill in missing combinations with zero counts ---
Weekly_counts <- Weekly_counts %>%
  tidyr::complete(
    Topic,
    Week = full_seq(range(Week), 1),  # generate all week numbers
    fill = list(Count = 0)
  ) %>%
  arrange(Topic, Week)

# --- Create interactive plotly line chart ---
AS1 <- plot_ly(
  data = Weekly_counts,
  x = ~Week,
  y = ~Count,
  color = ~Topic,
  colors = c("steelblue", "firebrick"),
  type = "scatter",
  mode = "lines+markers",
  line = list(width = 2),
  marker = list(size = 6)
) %>%
  layout(
    title = "Weekly Counts of Tariffs- and Immigration-Related AP News Articles",
    xaxis = list(
      title = "Week Number (starting with Week 1 of 2025)",
      dtick = 1
    ),
    yaxis = list(title = "Number of Articles"),
    legend = list(title = list(text = "Topic")),
    hovermode = "x unified"
  )

# ============================================
# --- Show the chart ---
# ============================================

AS1

# ============================================================
#  Setup: Install and Load Required Packages
# ============================================================
if (!require("tidyverse")) install.packages("tidyverse")
if (!require("plotly")) install.packages("plotly")
if (!require("gt")) install.packages("gt")
if (!require("gtExtras")) install.packages("gtExtras")
if (!require("broom")) install.packages("broom")

library(tidyverse)
library(plotly)
library(gt)
library(gtExtras)
library(broom)

options(scipen = 999)

# ============================================================
#  Reshape Weekly Counts for the Paired Tests
# ============================================================
# Reshape to wide form: one row per week, one column per topic

mydata <- Weekly_counts %>%
  pivot_wider(names_from = Topic, values_from = Count)

# Specify the two variables involved
mydata$V1 <- mydata$Tariffs # <== Customize this
mydata$V2 <- mydata$Immigration # <== Customize this

# ============================================================
#  Compute Pair Differences
# ============================================================
mydata$PairDifferences <- mydata$V2 - mydata$V1

# ============================================================
#  Interactive Histogram of Pair Differences
# ============================================================
hist_plot <- plot_ly(
  data = mydata,
  x = ~PairDifferences,
  type = "histogram",
  marker = list(color = "#1f78b4", line = list(color = "black", width = 1))
) %>%
  layout(
    title = "Distribution of Pair Differences",
    xaxis = list(title = "Pair Differences"),
    yaxis = list(title = "Count"),
    shapes = list(
      list(
        type = "line",
        x0 = mean(mydata$PairDifferences, na.rm = TRUE),
        x1 = mean(mydata$PairDifferences, na.rm = TRUE),
        y0 = 0,
        y1 = max(table(mydata$PairDifferences)),
        line = list(color = "red", dash = "dash")
      )
    )
  )

# ============================================================
#  Descriptive Statistics
# ============================================================
desc_stats <- mydata %>%
  summarise(
    count = n(),
    mean = mean(PairDifferences, na.rm = TRUE),
    sd = sd(PairDifferences, na.rm = TRUE),
    min = min(PairDifferences, na.rm = TRUE),
    max = max(PairDifferences, na.rm = TRUE)
  )

desc_table <- desc_stats %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Descriptive Statistics: Pair Differences") %>%
  fmt_number(columns = where(is.numeric), decimals = 3)

# ============================================================
#  Normality Test (Shapiro-Wilk)
# ============================================================
shapiro_res <- shapiro.test(mydata$PairDifferences)
shapiro_table <- tidy(shapiro_res) %>%
  select(statistic, p.value, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Normality Test (Shapiro-Wilk)") %>%
  fmt_number(columns = c(statistic, p.value), decimals = 4) %>%
  tab_source_note(
    source_note = "If the P.VALUE is 0.05 or less, the number of pairs is fewer than 40, and the distribution of pair differences shows obvious non-normality or outliers, consider using the Wilcoxon Signed Rank Test results instead of the Paired-Samples t-Test results."
  )

# ============================================================
#  Reshape Data for Repeated-Measures Plot
# ============================================================
df_long <- mydata %>%
  pivot_longer(cols = c(V1, V2),
               names_to = "Measure",
               values_to = "Value")

# ============================================================
#  Repeated-Measures Boxplot (Interactive, with Means)
# ============================================================
group_means <- df_long %>%
  group_by(Measure) %>%
  summarise(mean_value = mean(Value), .groups = "drop")

boxplot_measures <- plot_ly() %>%
  add_trace(
    data = df_long,
    x = ~Measure, y = ~Value,
    type = "box",
    boxpoints = "outliers",   
    marker = list(color = "red", size = 4),
    line = list(color = "black"),
    fillcolor = "royalblue",
    name = ""
  ) %>%
  add_trace(
    data = group_means,
    x = ~Measure, y = ~mean_value,
    type = "scatter", mode = "markers",
    marker = list(
      symbol = "diamond", size = 9,
      color = "black", line = list(color = "white", width = 1)
    ),
    text = ~paste0("Mean = ", round(mean_value, 2)),
    hoverinfo = "text",
    name = "Group Mean"
  ) %>%
  layout(
    title = "Boxplot of Repeated Measures (V1 vs V2) with Means",
    xaxis = list(title = "Measure"),
    yaxis = list(title = "Value"),
    showlegend = FALSE
  )

# ============================================================
#  Parametric Test (Paired-Samples t-Test)
# ============================================================
t_res <- t.test(mydata$V2, mydata$V1, paired = TRUE)
t_table <- tidy(t_res) %>%
  select(statistic, parameter, p.value, conf.low, conf.high, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Paired-Samples t-Test") %>%
  fmt_number(columns = c(statistic, p.value, conf.low, conf.high), decimals = 4)

t_summary <- mydata %>%
  select(V1, V2) %>%
  summarise_all(list(Mean = mean, SD = sd)) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Group Means and SDs (t-Test)") %>%
  fmt_number(columns = everything(), decimals = 3)

# ============================================================
#  Nonparametric Test (Wilcoxon Signed Rank)
# ============================================================
# Use the same order as the pair differences and t-test (V2 minus V1)
wilcox_res <- wilcox.test(mydata$V2, mydata$V1, paired = TRUE,
                          exact = FALSE)
wilcox_table <- tidy(wilcox_res) %>%
  select(statistic, p.value, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Wilcoxon Signed Rank Test") %>%
  fmt_number(columns = c(statistic, p.value), decimals = 4)

wilcox_summary <- mydata %>%
  select(V1, V2) %>%
  summarise_all(list(Mean = mean, SD = sd)) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Group Means and SDs (Wilcoxon)") %>%
  fmt_number(columns = everything(), decimals = 3)

# ============================================================
#  Results Summary (in specified order)
# ============================================================
hist_plot
desc_table
shapiro_table
boxplot_measures
t_table
t_summary
wilcox_table
wilcox_summary

Complete APNews Analysis

Aaron Cantrell