Rationale

Framing theory suggests that the way information is presented can significantly influence how a person or audience understands it. By emphasizing certain aspects of an issue and downplaying others, communicators such as the media or political figures can shape public perception of a topic and affect people's attitudes, responses and choices.

In our post-pandemic world, news organizations like AP News continue to report on COVID-19, COVID-19 boosters and vaccines, new health guidelines, misinformation and political controversy about vaccinations.

In this assignment, I analyzed the difference in frequency between hesitancy-related and health-related APNews.com coverage from January 2025 through October 2025. The results show which frame received more coverage: science-based (health) information or hesitancy (non-science-based information such as misinformation and political controversies). According to framing theory, that emphasis can influence readers' attitudes and thoughts toward the latest COVID-19 boosters.

Hypothesis

The volume of weekly APNews.com coverage focused on COVID-19 booster health guidelines will differ significantly from the volume of coverage focused on vaccine hesitancy, misinformation and political controversies between January and October 2025.

Variables & method

Variable 1 is hesitancy (non-science-based) and Variable 2 is health (science-based). This assignment compares the weekly number of stories APNews.com published on each of the two topics.

Because the number of paired observations was small (n = 38) and the normality assumption was not met, a Wilcoxon Signed Rank Test was used, since it does not require normally distributed pair differences and is suited to smaller samples.
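
To illustrate how the Wilcoxon statistic is built, here is a minimal sketch using made-up weekly counts (not the APNews.com data). The vectors hesitancy and health and the name V_manual are my own, chosen only for illustration. The sketch computes the statistic by hand and compares it with what base R's wilcox.test() reports.

```r
# Hypothetical weekly story counts (illustrative only, not the AP News data)
hesitancy <- c(1, 0, 2, 1, 3, 0, 1, 2)
health    <- c(3, 2, 2, 4, 1, 5, 2, 6)

# Pair differences, in the same direction the test call uses (x - y)
d <- hesitancy - health

# Zero differences say nothing about direction, so they are dropped
d <- d[d != 0]

# Rank the absolute differences; tied values share the average rank
r <- rank(abs(d))

# V is the sum of the ranks that belong to positive differences
V_manual <- sum(r[d > 0])

# wilcox.test() reports the same statistic
res <- wilcox.test(hesitancy, health, paired = TRUE, exact = FALSE)
V_manual
unname(res$statistic)
```

A small V means that only a few low-ranked weeks had more hesitancy stories than health stories, which is the same pattern the real data show.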

Descriptive statistics, a histogram and a boxplot were also used in this analysis.

Results & discussion

As noted above, because I had fewer than 40 paired observations (38 in total) and the pair differences were not normally distributed (Shapiro-Wilk p = 0.0001), the assumptions of a paired-samples t-test were not met and a Wilcoxon Signed Rank test was used instead.
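
The choice between the two tests can be written as a short rule. This is only a sketch: the helper choose_paired_test and its argument names are hypothetical, and the cutoffs of 0.05 and 40 pairs come from the note under the Shapiro-Wilk table in this report. The visual check for non-normality or outliers is a judgment call and is not encoded here.

```r
# Hypothetical helper that encodes the rule of thumb quoted in this report.
# It covers only the two numeric conditions (Shapiro-Wilk p-value and number
# of pairs); inspecting the histogram for outliers is left to the analyst.
choose_paired_test <- function(shapiro_p, n_pairs,
                               alpha = 0.05, min_pairs = 40) {
  if (shapiro_p <= alpha && n_pairs < min_pairs) {
    "Wilcoxon signed rank test"
  } else {
    "Paired-samples t-test"
  }
}

# Values reported in this analysis: Shapiro-Wilk p = 0.0001, n = 38 pairs
chosen <- choose_paired_test(shapiro_p = 0.0001, n_pairs = 38)
chosen
```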

The Wilcoxon Signed Rank test indicated a significant difference between health and hesitancy coverage between January 2025 and October 2025 (V = 35, p = 0.0001). On average during that period, APNews.com posted 1.26 hesitancy-related stories per week (V1) compared to 2.79 health-related stories per week (V2). Those results indicate APNews.com devoted more coverage to science-based informational stories than to vaccine misinformation or political controversies involving the vaccines.

It is important to note that these results could look very different if sources beyond APNews.com were included, since the number of stories available was limited.

Descriptive Statistics: Pair Differences

count    mean    sd      min      max
38.000   1.526   2.215   −2.000   10.000

Normality Test (Shapiro-Wilk)

statistic   p.value   method
0.8404      0.0001    Shapiro-Wilk normality test

Note: If the P.VALUE is 0.05 or less, the number of pairs is fewer than 40, and the distribution of pair differences shows obvious non-normality or outliers, consider using the Wilcoxon Signed Rank Test results instead of the Paired-Samples t-Test results.

Wilcoxon Signed Rank Test

statistic   p.value   method
35.0000     0.0001    Wilcoxon signed rank test with continuity correction

Group Means and SDs (Wilcoxon)

V1_Mean   V2_Mean   V1_SD   V2_SD
1.263     2.789     1.309   2.559
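
As a quick consistency check on the tables above, the mean of the pair differences should equal the difference between the two group means. Here is a minimal sketch that uses only the rounded values reported above; the variable names are my own.

```r
# Rounded values copied from the tables in this report
v1_mean   <- 1.263  # hesitancy (V1)
v2_mean   <- 2.789  # health (V2)
diff_mean <- 1.526  # mean of PairDifferences (V2 - V1)

# The mean of the differences equals the difference of the means
check <- isTRUE(all.equal(v2_mean - v1_mean, diff_mean))
check
```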

Code

Here is the code that produced the figures, the descriptive statistics, and the paired-samples t-test and Wilcoxon Signed Rank test results.

# ============================================================
#  Setup: Install and Load Required Packages
# ============================================================
if (!require("tidyverse")) install.packages("tidyverse")
if (!require("plotly")) install.packages("plotly")
if (!require("gt")) install.packages("gt")
if (!require("gtExtras")) install.packages("gtExtras")
if (!require("broom")) install.packages("broom")

library(tidyverse)
library(plotly)
library(gt)
library(gtExtras)
library(broom)

options(scipen = 999)

# ============================================================
#  Data Import
# ============================================================

Weekly_counts <- read.csv("Weekly_counts.csv")

# Reshape to wide form

mydata <- Weekly_counts %>%
  pivot_wider(names_from = Topic, values_from = Count)
names(mydata) <- make.names(names(mydata))

write.csv(mydata, "mydata.csv", row.names = FALSE)

# Specify the two variables involved
mydata$V1 <- mydata$Hesitancy # <== Customize this
mydata$V2 <- mydata$Health # <== Customize this

# ============================================================
#  Compute Pair Differences
# ============================================================
mydata$PairDifferences <- mydata$V2 - mydata$V1

# ============================================================
#  Interactive Histogram of Pair Differences
# ============================================================
hist_plot <- plot_ly(
  data = mydata,
  x = ~PairDifferences,
  type = "histogram",
  marker = list(color = "#1f78b4", line = list(color = "black", width = 1))
) %>%
  layout(
    title = "Distribution of Pair Differences",
    xaxis = list(title = "Pair Differences"),
    yaxis = list(title = "Count"),
    shapes = list(
      list(
        type = "line",
        x0 = mean(mydata$PairDifferences, na.rm = TRUE),
        x1 = mean(mydata$PairDifferences, na.rm = TRUE),
        y0 = 0,
        y1 = max(table(mydata$PairDifferences)),
        line = list(color = "red", dash = "dash")
      )
    )
  )

# ============================================================
#  Descriptive Statistics
# ============================================================
desc_stats <- mydata %>%
  summarise(
    count = n(),
    mean = mean(PairDifferences, na.rm = TRUE),
    sd = sd(PairDifferences, na.rm = TRUE),
    min = min(PairDifferences, na.rm = TRUE),
    max = max(PairDifferences, na.rm = TRUE)
  )

desc_table <- desc_stats %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Descriptive Statistics: Pair Differences") %>%
  fmt_number(columns = where(is.numeric), decimals = 3)

# ============================================================
#  Normality Test (Shapiro-Wilk)
# ============================================================
shapiro_res <- shapiro.test(mydata$PairDifferences)
shapiro_table <- tidy(shapiro_res) %>%
  select(statistic, p.value, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Normality Test (Shapiro-Wilk)") %>%
  fmt_number(columns = c(statistic, p.value), decimals = 4) %>%
  tab_source_note(
    source_note = "If the P.VALUE is 0.05 or less, the number of pairs is fewer than 40, and the distribution of pair differences shows obvious non-normality or outliers, consider using the Wilcoxon Signed Rank Test results instead of the Paired-Samples t-Test results."
  )

# ============================================================
#  Reshape Data for Repeated-Measures Plot
# ============================================================
df_long <- mydata %>%
  pivot_longer(cols = c(V1, V2),
               names_to = "Measure",
               values_to = "Value")

# ============================================================
#  Repeated-Measures Boxplot (Interactive, with Means)
# ============================================================
group_means <- df_long %>%
  group_by(Measure) %>%
  summarise(mean_value = mean(Value, na.rm = TRUE), .groups = "drop")

boxplot_measures <- plot_ly() %>%
  add_trace(
    data = df_long,
    x = ~Measure, y = ~Value,
    type = "box",
    boxpoints = "outliers",   
    marker = list(color = "red", size = 4),
    line = list(color = "black"),
    fillcolor = "royalblue",
    name = ""
  ) %>%
  add_trace(
    data = group_means,
    x = ~Measure, y = ~mean_value,
    type = "scatter", mode = "markers",
    marker = list(
      symbol = "diamond", size = 9,
      color = "black", line = list(color = "white", width = 1)
    ),
    text = ~paste0("Mean = ", round(mean_value, 2)),
    hoverinfo = "text",
    name = "Group Mean"
  ) %>%
  layout(
    title = "Boxplot of Repeated Measures (V1 vs V2) with Means",
    xaxis = list(title = "Measure"),
    yaxis = list(title = "Value"),
    showlegend = FALSE
  )

# ============================================================
#  Parametric Test (Paired-Samples t-Test)
# ============================================================
t_res <- t.test(mydata$V2, mydata$V1, paired = TRUE)
t_table <- tidy(t_res) %>%
  select(statistic, parameter, p.value, conf.low, conf.high, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Paired-Samples t-Test") %>%
  fmt_number(columns = c(statistic, p.value, conf.low, conf.high), decimals = 4)

t_summary <- mydata %>%
  select(V1, V2) %>%
  summarise_all(list(Mean = mean, SD = sd)) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Group Means and SDs (t-Test)") %>%
  fmt_number(columns = everything(), decimals = 3)

# ============================================================
#  Nonparametric Test (Wilcoxon Signed Rank)
# ============================================================
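# Argument order here is (V1, V2), the reverse of the t-test above, so the
# statistic V sums the ranks of weeks where hesitancy coverage exceeded health
# coverage. The two-sided p-value is the same for either order.
wilcox_res <- wilcox.test(mydata$V1, mydata$V2, paired = TRUE,
                          exact = FALSE)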
wilcox_table <- tidy(wilcox_res) %>%
  select(statistic, p.value, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Wilcoxon Signed Rank Test") %>%
  fmt_number(columns = c(statistic, p.value), decimals = 4)

wilcox_summary <- mydata %>%
  select(V1, V2) %>%
  summarise_all(list(Mean = mean, SD = sd)) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Group Means and SDs (Wilcoxon)") %>%
  fmt_number(columns = everything(), decimals = 3)

# ============================================================
#  Results Summary (in specified order)
# ============================================================
hist_plot
desc_table
shapiro_table
boxplot_measures
t_table
t_summary
wilcox_table
wilcox_summary