Rationale

While first-level agenda setting predicts what issues society considers important — based on the quantity of media coverage of a given topic — second-level agenda setting impacts what attributes or subcategories we think are important within those issues.

Public education is a popular topic of media coverage because it affects nearly everyone at some point in their lives, either directly or indirectly. Within the overarching topic of education, multiple subcategories receive media attention. For my analysis, I explore second-level agenda setting within two of those subcategories, funding and school violence, to determine how the coverage volume of those attributes may affect public discussion.

The results will enhance theoretical understanding of how much media coverage two different aspects of education receive, and thereby which topics within education are likely to be prominent in the public’s eye.

Hypothesis

Weekly APNews.com coverage volume of two education subtopics — funding and school violence — affects the prominence of those topics within society during the first nine months of 2025.

Variables & method

The two weekly APNews.com education subtopics, funding and school violence, are categorical and serve as the analysis’s independent variables. The dependent variable is continuous: the number of stories published each week over the nine-month period between January and September 2025.

For this analysis, I used a paired-samples t-test to measure whether there is a statistically significant difference between the weekly number of education stories focused on violence and the weekly number focused on school funding. A statistically significant difference in coverage of the two subcategories would suggest that the public perceives one of those issues as more important than the other.
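
As a minimal sketch of what the paired-samples t-test does, the snippet below uses made-up weekly counts (not the APNews data) to show that the paired test is the same as a one-sample t-test on the week-by-week differences. The vectors `violence` and `funding` are hypothetical and exist only for illustration.

```r
# Hypothetical weekly story counts for eight weeks (illustration only,
# not the APNews data used in this analysis)
violence <- c(3, 5, 2, 8, 4, 6, 1, 7)
funding  <- c(2, 3, 4, 3, 2, 4, 2, 3)

# Paired-samples t-test: each week contributes one pair of counts
paired_res <- t.test(violence, funding, paired = TRUE)

# Equivalent one-sample t-test on the pair differences
diff_res <- t.test(violence - funding, mu = 0)

# The two t statistics should be identical
paired_res$statistic
diff_res$statistic
```

Because each week supplies one count for each subtopic, the test works on the 40 weekly differences rather than on two independent groups of stories.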

Results & discussion

The pair differences are not normally distributed, according to the histogram and the Shapiro-Wilk normality test, which shows a p-value of less than 0.05. The paired-samples t-test is still used here because the analysis includes 40 weekly pairs, which meets the 40-pair threshold stated in the note under the Shapiro-Wilk table.

The box plots show the average weekly count for each story attribute, marked with a diamond. The average number of education stories focused on school violence each week was 4.1, while the average number focused on school funding each week was 2.8.

The Paired-Samples t-Test table shows a p-value of 0.0811, which is greater than 0.05. The difference in coverage volume between the two attributes is therefore not statistically significant, which is consistent with a confidence interval for the mean difference (−0.1686 to 2.7686) that includes zero.
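
As a rough check, the t-test figures can be approximately reproduced from the values in the Descriptive Statistics: Pair Differences table. The sketch below assumes a two-sided test at the default 95% confidence level, and because the table values are rounded to three decimals, the results may differ slightly from the reported ones in the last digit. The variable names are my own.

```r
# Values copied from the "Descriptive Statistics: Pair Differences" table
n         <- 40      # number of weekly pairs
mean_diff <- 1.300   # mean of (violence - funding)
sd_diff   <- 4.592   # standard deviation of the differences

se     <- sd_diff / sqrt(n)          # standard error of the mean difference
t_stat <- mean_diff / se             # t statistic
df     <- n - 1                      # degrees of freedom
p_val  <- 2 * pt(-abs(t_stat), df)   # two-sided p-value (assumed)
ci     <- mean_diff + c(-1, 1) * qt(0.975, df) * se  # 95% CI (assumed level)

round(c(t = t_stat, df = df, p = p_val, low = ci[1], high = ci[2]), 4)
```

The output should land close to the reported statistic of 1.7905 with 39 degrees of freedom and a p-value of 0.0811.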

Like the box plots, the Group Means and SDs table shows the average number of stories focusing on each attribute: 4.1 education stories per week on school violence and 2.8 per week on school funding.

Although it is not needed, because the 40 weekly pairs allowed use of the paired-samples t-test, the Wilcoxon Signed Rank Test leads to the same conclusion. Its p-value is 0.1323, which is greater than 0.05, meaning the difference in coverage volume between the two attributes is not statistically significant.

The results therefore do not support the hypothesis. The number of stories focused on each attribute per week is not significantly different, so coverage volume would not be expected to make one topic appear significantly more important to the public than the other.

Descriptive Statistics: Pair Differences
count    mean    sd      min      max
40.000   1.300   4.592   −4.000   21.000

Normality Test (Shapiro-Wilk)
statistic   p.value   method
0.7214      0.0000    Shapiro-Wilk normality test
If the P.VALUE is 0.05 or less, the number of pairs is fewer than 40, and the distribution of pair differences shows obvious non-normality or outliers, consider using the Wilcoxon Signed Rank Test results instead of the Paired-Samples t-Test results.

Paired-Samples t-Test
statistic   parameter   p.value   conf.low   conf.high   method
1.7905      39          0.0811    −0.1686    2.7686      Paired t-test

Group Means and SDs (t-Test)
V1_Mean   V2_Mean   V1_SD   V2_SD
2.800     4.100     1.924   4.174

Wilcoxon Signed Rank Test
statistic   p.value   method
210.0000    0.1323    Wilcoxon signed rank test with continuity correction

Group Means and SDs (Wilcoxon)
V1_Mean   V2_Mean   V1_SD   V2_SD
2.800     4.100     1.924   4.174

Code:

# ============================================
# APNews text analysis (Second-level agenda-setting theory version)
# ============================================

# ============================================
# --- Load required libraries ---
# ============================================

if (!require("tidyverse")) install.packages("tidyverse")
if (!require("tidytext")) install.packages("tidytext")
if (!require("plotly")) install.packages("plotly")

library(tidyverse)
library(tidytext)
library(plotly)

# ============================================
# --- Define Custom Topic Labels ---
# ============================================

# You can change these labels to anything you want.
# They will appear in the chart legend and title.
topic_labels <- list(
  Topic1 = "All education",
  Topic2 = "Violence",
  Topic3 = "Money"
)

# ============================================
# --- Load the APNews data ---
# ============================================

FetchedData <- readRDS(url("https://github.com/drkblake/Data/raw/refs/heads/main/APNews.rds"))
saveRDS(FetchedData, file = "APNews.rds")
rm(FetchedData)

APNews <- readRDS("APNews.rds")

# ============================================
# --- Define and apply FilterTopic ---
# ============================================

FilterTopic_phrases <- c(
  "education", "educational", "educator", "educators",
  "school", "schools", "classroom", "classrooms",
  "teacher", "teachers"
)

escaped_FilterTopic <- str_replace_all(
  FilterTopic_phrases,
  "([\\^$.|?*+()\\[\\]{}\\\\])", "\\\\\\1"
)

FilterTopic_pattern <- paste0("\\b", escaped_FilterTopic, "\\b", collapse = "|")

APNews <- APNews %>%
  mutate(
    Full.Text.clean = str_squish(Full.Text),
    FilterTopic = if_else(
      str_detect(Full.Text.clean, regex(FilterTopic_pattern, ignore_case = TRUE)),
      "Yes", "No"
    )
  )

TopicNews <- APNews %>% filter(FilterTopic == "Yes")

# ============================================
# --- Define Topic1 (Education) ---
# ============================================

Topic1_phrases <- c(
  "education", "educational", "educator", "educators",
  "school", "schools", "classroom", "classrooms",
  "teacher", "teachers"
)

Topic1_pattern <- paste0("\\b", str_replace_all(Topic1_phrases,
                                                "([\\^$.|?*+()\\[\\]{}\\\\])", "\\\\\\1"
), "\\b", collapse = "|")

TopicNews <- TopicNews %>%
  mutate(
    Topic1 = if_else(
      str_detect(Full.Text.clean, regex(Topic1_pattern, ignore_case = TRUE)),
      "Yes", "No"
    )
  )

# ============================================
# --- Define Topic2 (Violence) ---
# ============================================

Topic2_phrases <- c(
  "shoot", "shooting", "shooter", "shot",
  "gun", "rifle", "weapon", "kill", "killed"
)

Topic2_pattern <- paste0("\\b", str_replace_all(Topic2_phrases,
                                                "([\\^$.|?*+()\\[\\]{}\\\\])", "\\\\\\1"
), "\\b", collapse = "|")

TopicNews <- TopicNews %>%
  mutate(
    Topic2 = if_else(
      str_detect(Full.Text.clean, regex(Topic2_pattern, ignore_case = TRUE)),
      "Yes", "No"
    )
  )

# ============================================
# --- Define Topic3 (Money) ---
# ============================================

Topic3_phrases <- c(
  "fund", "funds", "funding",
  "grant", "grants", "pay",
  "salary", "salaries", "service",
  "services", "lunch", "lunches",
  "special education"
)

Topic3_pattern <- paste0("\\b", str_replace_all(Topic3_phrases,
                                                "([\\^$.|?*+()\\[\\]{}\\\\])", "\\\\\\1"
), "\\b", collapse = "|")

TopicNews <- TopicNews %>%
  mutate(
    Topic3 = if_else(
      str_detect(Full.Text.clean, regex(Topic3_pattern, ignore_case = TRUE)),
      "Yes", "No"
    )
  )

# ============================================
# --- Summarize weekly counts for all topics ---
# ============================================

Topic1_weekly <- TopicNews %>%
  filter(Topic1 == "Yes") %>%
  group_by(Week) %>%
  summarize(Count = n(), .groups = "drop") %>%
  mutate(Topic = topic_labels$Topic1)

Topic2_weekly <- TopicNews %>%
  filter(Topic2 == "Yes") %>%
  group_by(Week) %>%
  summarize(Count = n(), .groups = "drop") %>%
  mutate(Topic = topic_labels$Topic2)

Topic3_weekly <- TopicNews %>%
  filter(Topic3 == "Yes") %>%
  group_by(Week) %>%
  summarize(Count = n(), .groups = "drop") %>%
  mutate(Topic = topic_labels$Topic3)

Weekly_counts <- bind_rows(Topic1_weekly, Topic2_weekly, Topic3_weekly) %>%
  tidyr::complete(
    Topic,
    Week = full_seq(range(Week), 1),
    fill = list(Count = 0)
  ) %>%
  arrange(Topic, Week)

# ============================================
# --- Visualize the results ---
# ============================================

AS2 <- plot_ly(
  data = Weekly_counts,
  x = ~Week,
  y = ~Count,
  color = ~Topic,
  colors = c("steelblue", "seagreen", "firebrick"),
  type = "scatter",
  mode = "lines+markers",
  line = list(width = 2),
  marker = list(size = 6)
) %>%
  layout(
    title = paste(
      "Weekly Counts of",
      paste(unlist(topic_labels), collapse = ", "),
      "Stories (Filtered Dataset)"
    ),
    xaxis = list(title = "Week Number (starting with Week 1 of 2025)", dtick = 1),
    yaxis = list(title = "Number of Articles"),
    legend = list(title = list(text = "Topic")),
    hovermode = "x unified"
  )

# ============================================
# --- Display the chart ---
# ============================================

AS2

# ============================================================
#  Setup: Install and Load Required Packages
# ============================================================
if (!require("tidyverse")) install.packages("tidyverse")
if (!require("plotly")) install.packages("plotly")
if (!require("gt")) install.packages("gt")
if (!require("gtExtras")) install.packages("gtExtras")
if (!require("broom")) install.packages("broom")

library(tidyverse)
library(plotly)
library(gt)
library(gtExtras)
library(broom)

options(scipen = 999)

# ============================================================
#  Data Import
# ============================================================
# Reshape to wide form

mydata <- Weekly_counts %>%
  pivot_wider(names_from = Topic, values_from = Count)
names(mydata) <- make.names(names(mydata))

# Specify the two variables involved
mydata$V1 <- mydata$Money # <== Customize this
mydata$V2 <- mydata$Violence # <== Customize this

# ============================================================
#  Compute Pair Differences
# ============================================================
mydata$PairDifferences <- mydata$V2 - mydata$V1

# ============================================================
#  Interactive Histogram of Pair Differences
# ============================================================
hist_plot <- plot_ly(
  data = mydata,
  x = ~PairDifferences,
  type = "histogram",
  marker = list(color = "#1f78b4", line = list(color = "black", width = 1))
) %>%
  layout(
    title = "Distribution of Pair Differences",
    xaxis = list(title = "Pair Differences"),
    yaxis = list(title = "Count"),
    shapes = list(
      list(
        type = "line",
        x0 = mean(mydata$PairDifferences, na.rm = TRUE),
        x1 = mean(mydata$PairDifferences, na.rm = TRUE),
        y0 = 0,
        y1 = max(table(mydata$PairDifferences)),
        line = list(color = "red", dash = "dash")
      )
    )
  )

# ============================================================
#  Descriptive Statistics
# ============================================================
desc_stats <- mydata %>%
  summarise(
    count = n(),
    mean = mean(PairDifferences, na.rm = TRUE),
    sd = sd(PairDifferences, na.rm = TRUE),
    min = min(PairDifferences, na.rm = TRUE),
    max = max(PairDifferences, na.rm = TRUE)
  )

desc_table <- desc_stats %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Descriptive Statistics: Pair Differences") %>%
  fmt_number(columns = where(is.numeric), decimals = 3)

# ============================================================
#  Normality Test (Shapiro-Wilk)
# ============================================================
shapiro_res <- shapiro.test(mydata$PairDifferences)
shapiro_table <- tidy(shapiro_res) %>%
  select(statistic, p.value, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Normality Test (Shapiro-Wilk)") %>%
  fmt_number(columns = c(statistic, p.value), decimals = 4) %>%
  tab_source_note(
    source_note = "If the P.VALUE is 0.05 or less, the number of pairs is fewer than 40, and the distribution of pair differences shows obvious non-normality or outliers, consider using the Wilcoxon Signed Rank Test results instead of the Paired-Samples t-Test results."
  )

# ============================================================
#  Reshape Data for Repeated-Measures Plot
# ============================================================
df_long <- mydata %>%
  pivot_longer(cols = c(V1, V2),
               names_to = "Measure",
               values_to = "Value")

# ============================================================
#  Repeated-Measures Boxplot (Interactive, with Means)
# ============================================================
group_means <- df_long %>%
  group_by(Measure) %>%
  summarise(mean_value = mean(Value), .groups = "drop")

boxplot_measures <- plot_ly() %>%
  add_trace(
    data = df_long,
    x = ~Measure, y = ~Value,
    type = "box",
    boxpoints = "outliers",
    marker = list(color = "red", size = 4),
    line = list(color = "black"),
    fillcolor = "royalblue",
    name = ""
  ) %>%
  add_trace(
    data = group_means,
    x = ~Measure, y = ~mean_value,
    type = "scatter", mode = "markers",
    marker = list(
      symbol = "diamond", size = 9,
      color = "black", line = list(color = "white", width = 1)
    ),
    text = ~paste0("Mean = ", round(mean_value, 2)),
    hoverinfo = "text",
    name = "Group Mean"
  ) %>%
  layout(
    title = "Boxplot of Repeated Measures (V1 vs V2) with Means",
    xaxis = list(title = "Measure"),
    yaxis = list(title = "Value"),
    showlegend = FALSE
  )

# ============================================================
#  Parametric Test (Paired-Samples t-Test)
# ============================================================
t_res <- t.test(mydata$V2, mydata$V1, paired = TRUE)
t_table <- tidy(t_res) %>%
  select(statistic, parameter, p.value, conf.low, conf.high, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Paired-Samples t-Test") %>%
  fmt_number(columns = c(statistic, p.value, conf.low, conf.high), decimals = 4)

t_summary <- mydata %>%
  select(V1, V2) %>%
  summarise_all(list(Mean = mean, SD = sd)) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Group Means and SDs (t-Test)") %>%
  fmt_number(columns = everything(), decimals = 3)

# ============================================================
#  Nonparametric Test (Wilcoxon Signed Rank)
# ============================================================
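# Note: the arguments here are in V1, V2 order, the reverse of the t-test above.
# The two-sided p-value is unaffected, but the reported statistic (V) depends
# on this order.
wilcox_res <- wilcox.test(mydata$V1, mydata$V2, paired = TRUE,
                          exact = FALSE)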
wilcox_table <- tidy(wilcox_res) %>%
  select(statistic, p.value, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Wilcoxon Signed Rank Test") %>%
  fmt_number(columns = c(statistic, p.value), decimals = 4)

wilcox_summary <- mydata %>%
  select(V1, V2) %>%
  summarise_all(list(Mean = mean, SD = sd)) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Group Means and SDs (Wilcoxon)") %>%
  fmt_number(columns = everything(), decimals = 3)

# ============================================================
#  Results Summary (in specified order)
# ============================================================
hist_plot
desc_table
shapiro_table
boxplot_measures
t_table
t_summary
wilcox_table
wilcox_summary
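
One property of the keyword patterns above is worth illustrating: because each phrase is wrapped in `\\b` word boundaries, only the exact word forms listed are matched. The sketch below rebuilds a small pattern the same way `Topic2_pattern` is built, using two of the original keywords, and applies it to made-up sentences (not taken from the APNews data). The names `pattern` and `examples` are hypothetical.

```r
library(stringr)

# Same construction as Topic2_pattern, with two of the original keywords
pattern <- paste0("\\b", c("gun", "shooting"), "\\b", collapse = "|")

# Made-up example sentences (not from the APNews data)
examples <- c(
  "A gun was found at the school.",
  "Police recovered two guns.",
  "The shooting happened Monday.",
  "Several shootings were reported."
)

# TRUE where a listed keyword appears as a whole word
str_detect(examples, regex(pattern, ignore_case = TRUE))
# → TRUE FALSE TRUE FALSE
```

Plural forms such as "guns" and "shootings" are not matched unless they are added to the phrase list, which may be worth keeping in mind when reading the weekly counts.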