Paired-Samples t-Test

Rationale

Framing theory holds that how consumers view and respond to a media message, whether through action or judgment, is affected by cues built into the message itself. These cues present the message within a particular frame and so push consumers toward a particular response. Media outlets typically frame stories toward their audience’s own views or agendas, which tends to reinforce the beliefs of their regular consumers.

Drawing on this theory, my project compares how the media agenda frames illegal immigration: as an invasion through the southern border, or as a peaceful necessity of offering safe harbor to those entering the U.S. Specifically, my analysis contrasts the number of stories per week that APNews.com published under each frame between Jan. 1 and Sept. 30, 2025.

Hypothesis

Weekly APNews.com coverage volume of immigration at the southern border differed between the “invasion” frame and the “peaceful” frame.

Variables and method

Weekly APNews.com coverage volume served as the analysis’s dependent variable. It was measured continuously as the number of stories published weekly. The independent variable was the story’s frame, either “invasion” or “peaceful,” operationalized using key words or phrases likely to be unique to stories framed each way and matched as whole words regardless of capitalization. Key words used to suggest an “invasion” viewpoint were “illegal”, “invasion”, “deportation”, “police”, “ICE”, and “alien.” Key words used to suggest that immigration was “peaceful” were “safe harbor”, “protests”, “both sides”, “assimilation”, “immigrants”, “fighting”, “peaceful”, and “immigration.”

A paired-samples t-test assessed the significance of coverage volume variation between the two agenda frames.
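As a rough illustration of how keyword-based frame flags of this kind behave, the sketch below applies a whole-word, case-insensitive pattern to three made-up sentences. The sentences and the helper name `flag_frame` are hypothetical and are not drawn from the APNews data, and base R's `grepl` stands in here for the stringr calls used in the full script. It also shows a caveat worth keeping in mind: with capitalization ignored, the keyword "ICE" also matches the ordinary word "ice."

```r
# Hypothetical example sentences (NOT from the APNews dataset)
stories <- c(
  "ICE agents carried out a deportation order on Monday.",
  "Advocates said immigrants deserve safe harbor.",
  "Thin ice closed the skating rink early."
)

# A subset of the "invasion" keywords described above
invasion_words <- c("illegal", "deportation", "police", "ICE", "alien")

# Hypothetical helper: TRUE when any keyword appears as a whole word,
# ignoring capitalization
flag_frame <- function(text, words) {
  pattern <- paste0("\\b(", paste(words, collapse = "|"), ")\\b")
  grepl(pattern, text, ignore.case = TRUE, perl = TRUE)
}

invasion_flag <- flag_frame(stories, invasion_words)
print(invasion_flag)
```

The third sentence is flagged even though it has nothing to do with immigration, which is one reason the full analysis first narrows the data to immigration-related stories before applying the frame keywords.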
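For reference, the standard paired-samples t statistic can be written as follows. The notation is an assumption introduced here for illustration: d_i stands for the difference between the two frames' story counts in week i, and n for the number of weekly pairs.

```latex
\[
t = \frac{\bar{d}}{s_d / \sqrt{n}}, \qquad
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad
s_d = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} \left(d_i - \bar{d}\right)^2}, \qquad
\mathrm{df} = n - 1
\]
```

The test therefore treats each week as its own matched pair and asks whether the average weekly difference is distinguishable from zero.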

Results and discussion

The figure below summarizes weekly coverage volume for each frame across the period analyzed. It appears that, in most weeks, APNews.com pushed the viewpoint of an “invasion” over the viewpoint that the border crossings were “peaceful.”

The biggest jump for the “peaceful” viewpoint occurred during Week 24, Jun. 9-15, when the weekly “peaceful” framing hit a high of 97 stories. Of note, President Trump’s presidential proclamation significantly expanding travel restrictions on nationals from 19 countries went into effect that week.

Notably, more stories per week were framed as an “invasion” than as “peaceful.” The highest volume of “invasion” framing occurred during Week 33, Aug. 11-17, when 88 published stories framed immigration as an “invasion.” Of note, the Trump administration began a sharp escalation in immigration enforcement and deportation, including an effort to recruit and enlist ICE agents.

Overall, the results suggest that APNews.com framed immigration as an “invasion” far more often than as “peaceful.”

Stories framed as an “invasion” averaged about 53 per week across the analysis period, while stories framed as “peaceful” averaged about 25 per week, a mean difference of about 28 articles per week. The pair differences failed a Shapiro-Wilk normality test (p < 0.05), but the dataset’s case count of 40 weekly pairs was large enough for a paired-samples t-test to remain suitable for assessing the statistical significance of the average paired difference.
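The date ranges given for Weeks 24 and 33 are consistent with ISO 8601 week numbering (weeks run Monday to Sunday, and Week 1 is the week containing Jan. 4). The dataset's own week definition is not stated, so that numbering is an assumption here, and the helper name `iso_week_range` is hypothetical. Under that assumption, a week number can be mapped back to its dates like this:

```r
# Hypothetical helper: Monday and Sunday of a given ISO 8601 week
iso_week_range <- function(year, week) {
  jan4 <- as.Date(sprintf("%d-01-04", year))  # Jan. 4 always falls in ISO week 1
  wday <- as.integer(format(jan4, "%u"))      # 1 = Monday ... 7 = Sunday
  week1_monday <- jan4 - (wday - 1)           # Monday that starts ISO week 1
  start <- week1_monday + 7 * (week - 1)
  c(start, start + 6)
}

week24 <- iso_week_range(2025, 24)
week33 <- iso_week_range(2025, 33)
print(format(week24))
print(format(week33))
```

A lookup like this makes it easier to line up peaks in the weekly chart with news events from the same dates.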
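The workflow described above (check the pair differences for normality, then run the paired test) can be sketched on simulated data. Everything in this block is illustrative: the counts are random Poisson draws, not APNews results, and the rates of 25 and 53 were picked only to loosely echo the reported means. The sketch also shows that a paired-samples t-test is the same calculation as a one-sample t-test on the pair differences.

```r
set.seed(1)  # arbitrary seed, used only so the sketch is reproducible

# Simulated weekly counts for 40 weeks (NOT the APNews data)
n_weeks <- 40
peaceful <- rpois(n_weeks, lambda = 25)
invasion <- rpois(n_weeks, lambda = 53)
pair_diff <- invasion - peaceful

# Normality check on the pair differences
shapiro_p <- shapiro.test(pair_diff)$p.value

# Parametric and nonparametric tests on the same pairs
t_res <- t.test(invasion, peaceful, paired = TRUE)
w_res <- wilcox.test(invasion, peaceful, paired = TRUE, exact = FALSE)

# A paired t-test is a one-sample t-test on the differences
one_sample <- t.test(pair_diff, mu = 0)

print(c(shapiro_p = shapiro_p,
        t_p = t_res$p.value,
        wilcox_p = w_res$p.value))
```

Running both tests side by side is a simple way to see whether the conclusion depends on the normality assumption.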

Below is a box plot of weekly article volumes for Peaceful Immigration (V1) and Invasion (V2), followed by the results of the paired-samples t-test.

Paired-Samples t-Test
statistic	parameter	p.value	conf.low	conf.high	method
10.4602	39	0.0000	22.4848	33.2652	Paired t-test

Group Means and SDs (t-Test)
V1_Mean	V2_Mean	V1_SD	V2_SD
25.200	53.075	16.215	12.206

The significant t-test result (t = 10.46, df = 39, p < 0.001) supported the hypothesis that weekly APNews.com coverage volume of immigration differed between the two frames during the first nine months of 2025.
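As a quick consistency check, the reported confidence interval can be approximately rebuilt from the other reported values alone. This is a sketch that uses only the rounded numbers printed in the tables, so small rounding differences are expected. The variable names are illustrative.

```r
# Values copied from the reported tables
mean_peaceful <- 25.200   # V1_Mean
mean_invasion <- 53.075   # V2_Mean
t_stat <- 10.4602         # statistic
df <- 39                  # parameter (degrees of freedom)

# Mean of the pair differences (invasion minus peaceful)
mean_diff <- mean_invasion - mean_peaceful

# Standard error implied by t = mean_diff / se_diff
se_diff <- mean_diff / t_stat

# 95% confidence interval for the mean difference
t_crit <- qt(0.975, df)
ci <- mean_diff + c(-1, 1) * t_crit * se_diff

# Two-sided p-value for the reported t statistic
p_value <- 2 * pt(-abs(t_stat), df)

print(round(mean_diff, 3))
print(round(ci, 2))
print(p_value < 0.001)
```

The interval computed this way should land very close to the reported conf.low and conf.high, which suggests the table values are internally consistent.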

Code:

Here is the code that produced the figures and paired-samples t-test results.

# ============================================
# APNews text analysis (Framing Theory version)
# ============================================

# ============================================
# --- Load required libraries ---
# ============================================

if (!require("tidyverse")) install.packages("tidyverse")
if (!require("tidytext")) install.packages("tidytext")

library(tidyverse)
library(tidytext)

# ============================================
# --- Load the APNews data ---
# ============================================

# Read the data from the web
FetchedData <- readRDS(url("https://github.com/drkblake/Data/raw/refs/heads/main/APNews.rds"))
# Save the data on your computer
saveRDS(FetchedData, file = "APNews.rds")
# Remove the downloaded data from the environment
rm(FetchedData)

APNews <- readRDS("APNews.rds")

# ============================================
# --- Define and apply FilterTopic ---
# ============================================

# --- Define FilterTopic phrases ---
FilterTopic_phrases <- c(
  "ICE",
  "Raids",
  "Border",
  "Immigration",
  "Police",
  "Mexico"
)

# --- Escape regex special characters ---
escaped_FilterTopic <- str_replace_all(
  FilterTopic_phrases,
  "([\\^$.|?*+()\\[\\]{}\\\\])",
  "\\\\\\1"
)

# --- Build whole-word/phrase regex pattern ---
FilterTopic_pattern <- paste0("\\b", escaped_FilterTopic, "\\b", collapse = "|")

# --- Flag stories matching the FilterTopic ---
APNews <- APNews %>%
  mutate(
    Full.Text.clean = str_squish(Full.Text),
    FilterTopic = if_else(
      str_detect(Full.Text.clean, regex(FilterTopic_pattern, ignore_case = TRUE)),
      "Yes",
      "No"
    )
  )

# --- Create a TopicNews data frame consisting only of FilterTopic stories ---
TopicNews <- APNews %>%
  filter(FilterTopic == "Yes")

# ============================================
# --- Flag Topic1-related stories (within TopicNews) ---
# ============================================

# --- Define Topic1 phrases ---
phrases <- c(
  "Illegal",
  "Invasion",
  "police",
  "Deportation",
  "Alien",
  "ICE"
)

# --- Escape regex special characters ---
escaped_phrases <- str_replace_all(
  phrases,
  "([\\^$.|?*+()\\[\\]{}\\\\])",
  "\\\\\\1"
)

# --- Build pattern and apply matching ---
pattern <- paste0("\\b", escaped_phrases, "\\b", collapse = "|")

TopicNews <- TopicNews %>%
  mutate(
    Topic1 = if_else(
      str_detect(Full.Text.clean, regex(pattern, ignore_case = TRUE)),
      "Yes",
      "No"
    )
  )

# ============================================
# --- Flag Topic2-related stories (within TopicNews) ---
# ============================================

# --- Define Topic2 phrases ---
phrases <- c(
  "Safe Harbor",
  "protests",
  "fighting",
  "both sides",
  "assimilation",
  "immigrants",
  "peaceful",
  "immigration"
)

# --- Escape regex special characters ---
escaped_phrases <- str_replace_all(
  phrases,
  "([\\^$.|?*+()\\[\\]{}\\\\])",
  "\\\\\\1"
)

# --- Build pattern and apply matching ---
pattern <- paste0("\\b", escaped_phrases, "\\b", collapse = "|")

TopicNews <- TopicNews %>%
  mutate(
    Topic2 = if_else(
      str_detect(Full.Text.clean, regex(pattern, ignore_case = TRUE)),
      "Yes",
      "No"
    )
  )

# ============================================
# --- Visualize weekly counts of Topic1- and Topic2-related stories ---
# ============================================

if (!require("plotly")) install.packages("plotly")
library(plotly)

# --- Summarize weekly counts for Topic1 = "Yes" ---
Topic1_weekly <- TopicNews %>%
  filter(Topic1 == "Yes") %>%
  group_by(Week) %>%
  summarize(Count = n(), .groups = "drop") %>%
  mutate(Topic = "Invasion") # Note custom Topic1 label

# --- Summarize weekly counts for Topic2 = "Yes" ---
Topic2_weekly <- TopicNews %>%
  filter(Topic2 == "Yes") %>%
  group_by(Week) %>%
  summarize(Count = n(), .groups = "drop") %>%
  mutate(Topic = "Peaceful Immigration") # Note custom Topic2 label

# --- Combine both summaries into one data frame ---
Weekly_counts <- bind_rows(Topic2_weekly, Topic1_weekly)

# --- Fill in missing combinations with zero counts ---
Weekly_counts <- Weekly_counts %>%
  tidyr::complete(
    Topic,
    Week = full_seq(range(Week), 1),  # generate all week numbers
    fill = list(Count = 0)
  ) %>%
  arrange(Topic, Week)

# --- Create interactive plotly line chart ---
FR1 <- plot_ly(
  data = Weekly_counts,
  x = ~Week,
  y = ~Count,
  color = ~Topic,
  colors = c("steelblue", "firebrick"),
  type = "scatter",
  mode = "lines+markers",
  line = list(width = 2),
  marker = list(size = 6)
) %>%
  layout(
    title = "Weekly Counts of Invasion- and Peaceful-Related Stories within the Immigration Dataset",
    xaxis = list(
      title = "Week Number (starting with Week 1 of 2025)",
      dtick = 1
    ),
    yaxis = list(title = "Number of Articles"),
    legend = list(title = list(text = "Topic")),
    hovermode = "x unified"
  )

# ============================================
# --- Show the chart ---
# ============================================

FR1

# ============================================================
#  Setup: Install and Load Required Packages
# ============================================================
if (!require("tidyverse")) install.packages("tidyverse")
if (!require("plotly")) install.packages("plotly")
if (!require("gt")) install.packages("gt")
if (!require("gtExtras")) install.packages("gtExtras")
if (!require("broom")) install.packages("broom")

library(tidyverse)
library(plotly)
library(gt)
library(gtExtras)
library(broom)

options(scipen = 999)

# ============================================================
#  Data Preparation
# ============================================================
# Reshape the weekly counts to wide form (one row per week, one column per topic)

mydata <- Weekly_counts %>%
  pivot_wider(names_from = Topic, values_from = Count)
names(mydata) <- make.names(names(mydata))

# Specify the two variables involved
mydata$V1 <- mydata$Peaceful.Immigration # <== Customize this
mydata$V2 <- mydata$Invasion # <== Customize this

# ============================================================
#  Compute Pair Differences
# ============================================================
mydata$PairDifferences <- mydata$V2 - mydata$V1

# ============================================================
#  Interactive Histogram of Pair Differences
# ============================================================
hist_plot <- plot_ly(
  data = mydata,
  x = ~PairDifferences,
  type = "histogram",
  marker = list(color = "#1f78b4", line = list(color = "black", width = 1))
) %>%
  layout(
    title = "Distribution of Pair Differences",
    xaxis = list(title = "Pair Differences"),
    yaxis = list(title = "Count"),
    shapes = list(
      list(
        type = "line",
        # Vertical dashed line at the mean pair difference
        x0 = mean(mydata$PairDifferences, na.rm = TRUE),
        x1 = mean(mydata$PairDifferences, na.rm = TRUE),
        # Span the full plot height regardless of how the bins are drawn
        yref = "paper",
        y0 = 0,
        y1 = 1,
        line = list(color = "red", dash = "dash")
      )
    )
  )

# ============================================================
#  Descriptive Statistics
# ============================================================
desc_stats <- mydata %>%
  summarise(
    count = n(),
    mean = mean(PairDifferences, na.rm = TRUE),
    sd = sd(PairDifferences, na.rm = TRUE),
    min = min(PairDifferences, na.rm = TRUE),
    max = max(PairDifferences, na.rm = TRUE)
  )

desc_table <- desc_stats %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Descriptive Statistics: Pair Differences") %>%
  fmt_number(columns = where(is.numeric), decimals = 3)

# ============================================================
#  Normality Test (Shapiro-Wilk)
# ============================================================
shapiro_res <- shapiro.test(mydata$PairDifferences)
shapiro_table <- tidy(shapiro_res) %>%
  select(statistic, p.value, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Normality Test (Shapiro-Wilk)") %>%
  fmt_number(columns = c(statistic, p.value), decimals = 4) %>%
  tab_source_note(
    source_note = "If the P.VALUE is 0.05 or less, the number of pairs is fewer than 40, and the distribution of pair differences shows obvious non-normality or outliers, consider using the Wilcoxon Signed Rank Test results instead of the Paired-Samples t-Test results."
  )

# ============================================================
#  Reshape Data for Repeated-Measures Plot
# ============================================================
df_long <- mydata %>%
  pivot_longer(cols = c(V1, V2),
               names_to = "Measure",
               values_to = "Value")

# ============================================================
#  Repeated-Measures Boxplot (Interactive, with Means)
# ============================================================
group_means <- df_long %>%
  group_by(Measure) %>%
  summarise(mean_value = mean(Value), .groups = "drop")

boxplot_measures <- plot_ly() %>%
  add_trace(
    data = df_long,
    x = ~Measure, y = ~Value,
    type = "box",
    boxpoints = "outliers",   
    marker = list(color = "red", size = 4),
    line = list(color = "black"),
    fillcolor = "royalblue",
    name = ""
  ) %>%
  add_trace(
    data = group_means,
    x = ~Measure, y = ~mean_value,
    type = "scatter", mode = "markers",
    marker = list(
      symbol = "diamond", size = 9,
      color = "black", line = list(color = "white", width = 1)
    ),
    text = ~paste0("Mean = ", round(mean_value, 2)),
    hoverinfo = "text",
    name = "Group Mean"
  ) %>%
  layout(
    title = "Boxplot of Repeated Measures (V1 vs V2) with Means",
    xaxis = list(title = "Measure"),
    yaxis = list(title = "Value"),
    showlegend = FALSE
  )

# ============================================================
#  Parametric Test (Paired-Samples t-Test)
# ============================================================
t_res <- t.test(mydata$V2, mydata$V1, paired = TRUE)
t_table <- tidy(t_res) %>%
  select(statistic, parameter, p.value, conf.low, conf.high, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Paired-Samples t-Test") %>%
  fmt_number(columns = c(statistic, p.value, conf.low, conf.high), decimals = 4)

t_summary <- mydata %>%
  select(V1, V2) %>%
  summarise_all(list(Mean = mean, SD = sd)) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Group Means and SDs (t-Test)") %>%
  fmt_number(columns = everything(), decimals = 3)

# ============================================================
#  Nonparametric Test (Wilcoxon Signed Rank)
# ============================================================
# Use the same order as the t-test (V2 minus V1) so both tests describe the same differences
wilcox_res <- wilcox.test(mydata$V2, mydata$V1, paired = TRUE,
                          exact = FALSE)
wilcox_table <- tidy(wilcox_res) %>%
  select(statistic, p.value, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Wilcoxon Signed Rank Test") %>%
  fmt_number(columns = c(statistic, p.value), decimals = 4)

wilcox_summary <- mydata %>%
  select(V1, V2) %>%
  summarise_all(list(Mean = mean, SD = sd)) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Group Means and SDs (Wilcoxon)") %>%
  fmt_number(columns = everything(), decimals = 3)

# ============================================================
#  Results Summary (in specified order)
# ============================================================
hist_plot
desc_table
shapiro_table
boxplot_measures
t_table
t_summary
wilcox_table
wilcox_summary