Rationale

Framing theory suggests that the way information is presented can significantly influence how a person or audience understands it. By emphasizing certain aspects of an issue and downplaying others, communicators such as the media or political figures can shape public perception of a topic and affect people's attitudes, responses and choices.

In our post-pandemic world, news organizations like AP News continue to report on COVID-19, COVID-19 boosters and vaccines, new health guidelines, misinformation and political controversy about vaccinations.

In this assignment, I analyzed the difference in frequency between hesitancy-related and health-related APNews.com coverage from January 2025 through October 2025. Under framing theory, the balance between science-based (health) coverage and hesitancy coverage (non-science-based information such as misinformation and political controversies) can influence readers' attitudes and thoughts toward the latest COVID-19 boosters.

Hypothesis

The volume of weekly APNews.com coverage focused on COVID-19 booster health guidelines will differ significantly from the volume of coverage focused on vaccine hesitancy, misinformation and political controversies between January and October 2025.

Variables & method

Variable 1 is hesitancy (non-science-based) and Variable 2 is health (science-based). This assignment compares the weekly number of APNews.com stories published on each of the two topics.

Because the number of paired observations (n = 38) was small and the normality assumption was not met, a Wilcoxon Signed Rank Test was used, since it can accommodate a smaller sample size and does not require normally distributed pair differences.

Descriptive statistics, a histogram and a boxplot were also used in this analysis.
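
The choice between the two tests can be sketched in a few lines of base R. This is only a minimal illustration under stated assumptions: the counts below are hypothetical (they are not the APNews.com data), and the names `hesitancy`, `health` and `use_wilcoxon` are made up for the example. It encodes only the two numeric conditions (fewer than 40 pairs, Shapiro-Wilk p-value of 0.05 or less); the visual check of the histogram for outliers is not automated.

```r
# Hypothetical weekly story counts (illustrative only; NOT the APNews.com data)
hesitancy <- c(1, 0, 2, 1, 3, 0, 1, 2, 0, 1)
health    <- c(3, 2, 2, 4, 6, 1, 2, 5, 1, 12)

# Pair differences, computed as health minus hesitancy
pair_diff <- health - hesitancy
n_pairs   <- length(pair_diff)

# Shapiro-Wilk normality test on the pair differences
shapiro_p <- shapiro.test(pair_diff)$p.value

# Assumed decision rule: small sample AND evidence of non-normality
use_wilcoxon <- n_pairs < 40 && shapiro_p <= 0.05

result <- if (use_wilcoxon) {
  wilcox.test(health, hesitancy, paired = TRUE, exact = FALSE)
} else {
  t.test(health, hesitancy, paired = TRUE)
}

# Show which test was selected and its p-value
result$method
result$p.value
```

Either branch returns the same kind of result object, so the reporting step does not need to know which test was chosen.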

Results & discussion

As noted above, because I had 38 paired observations rather than 40 or more, and the Shapiro-Wilk test indicated non-normal pair differences (p = 0.0001), the conditions for a paired-samples t-test were not met and a Wilcoxon Signed Rank test was used instead.

The Wilcoxon Signed Rank test indicated a significant difference between health and hesitancy coverage from January 2025 to October 2025 (V = 35, p = 0.0001). On average during that period, APNews.com posted 1.26 hesitancy-related stories (V1) per week compared to 2.79 health-related stories (V2). These results indicate that APNews.com devoted more coverage to science-based informational stories than to vaccine misinformation or political controversies involving the vaccines.
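
As a quick consistency check on the reported figures, the difference between the two group means should equal the mean of the pair differences. The short sketch below uses only the rounded values from the output tables; the variable names are introduced here for illustration.

```r
# Rounded group means taken from the report's output tables
v1_mean <- 1.263  # hesitancy (V1)
v2_mean <- 2.789  # health (V2)

# Difference of means; should match the mean pair difference (1.526)
mean_diff <- v2_mean - v1_mean
mean_diff

# Ratio of health to hesitancy coverage, for a sense of scale
coverage_ratio <- v2_mean / v1_mean
round(coverage_ratio, 1)
```

The difference comes out to 1.526, which agrees with the descriptive statistics table, and the ratio works out to roughly 2.2 health-related stories for every hesitancy-related story.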

It is important to note that these results could look very different if sources beyond APNews.com were included, since only a limited number of stories was available.

Descriptive Statistics: Pair Differences
count	mean	sd	min	max
38	1.526	2.215	−2.000	10.000

Normality Test (Shapiro-Wilk)
statistic	p.value	method
0.8404	0.0001	Shapiro-Wilk normality test
Note: If the P.VALUE is 0.05 or less, the number of pairs is fewer than 40, and the distribution of pair differences shows obvious non-normality or outliers, consider using the Wilcoxon Signed Rank Test results instead of the Paired-Samples t-Test results.

Wilcoxon Signed Rank Test
statistic	p.value	method
35.0000	0.0001	Wilcoxon signed rank test with continuity correction
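
The statistic in the Wilcoxon table is the value R labels V: the sum of the ranks of the positive pair differences, after zero differences are dropped. Because the report's code calls wilcox.test(mydata$V1, mydata$V2, paired = TRUE, ...), the differences are hesitancy minus health, so a small V reflects few weeks in which hesitancy coverage exceeded health coverage. The sketch below reproduces V by hand on hypothetical counts (not the APNews.com data) and compares it with what wilcox.test returns; `x`, `y`, `v_manual` and `v_from_r` are names introduced for the example.

```r
# Hypothetical paired counts (illustrative only; NOT the APNews.com data)
x <- c(1, 0, 2, 1, 3, 2, 1, 2)  # stands in for hesitancy (V1)
y <- c(3, 2, 2, 4, 6, 1, 2, 5)  # stands in for health (V2)

# Pair differences in the same direction as wilcox.test(x, y, paired = TRUE)
d <- x - y

# Zero differences carry no sign information and are dropped
d <- d[d != 0]

# Rank the absolute differences (ties receive average ranks)
r <- rank(abs(d))

# V is the sum of ranks belonging to positive differences
v_manual <- sum(r[d > 0])

# The same statistic as computed by R
v_from_r <- unname(
  wilcox.test(x, y, paired = TRUE, exact = FALSE)$statistic
)

v_manual
v_from_r
```

In this toy example only one week has a positive difference, so V is small relative to the total of all ranks, the same pattern the report's V = 35 suggests for 38 weekly pairs.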

Group Means and SDs (Wilcoxon)
V1_Mean	V2_Mean	V1_SD	V2_SD
1.263	2.789	1.309	2.559

Code

Here is the code that produced the figures, the paired-samples t-test results and the Wilcoxon Signed Rank test results.

# ============================================================
#  Setup: Install and Load Required Packages
# ============================================================
if (!require("tidyverse")) install.packages("tidyverse")
if (!require("plotly")) install.packages("plotly")
if (!require("gt")) install.packages("gt")
if (!require("gtExtras")) install.packages("gtExtras")
if (!require("broom")) install.packages("broom")

library(tidyverse)
library(plotly)
library(gt)
library(gtExtras)
library(broom)

options(scipen = 999)

# ============================================================
#  Data Import
# ============================================================

Weekly_counts <- read.csv("Weekly_counts.csv")

# Reshape to wide form

mydata <- Weekly_counts %>%
  pivot_wider(names_from = Topic, values_from = Count)
names(mydata) <- make.names(names(mydata))

write.csv(mydata, "mydata.csv", row.names = FALSE)

# Specify the two variables involved
mydata$V1 <- mydata$Hesitancy # <== Customize this
mydata$V2 <- mydata$Health # <== Customize this

# ============================================================
#  Compute Pair Differences
# ============================================================
mydata$PairDifferences <- mydata$V2 - mydata$V1

# ============================================================
#  Interactive Histogram of Pair Differences
# ============================================================
hist_plot <- plot_ly(
  data = mydata,
  x = ~PairDifferences,
  type = "histogram",
  marker = list(color = "#1f78b4", line = list(color = "black", width = 1))
) %>%
  layout(
    title = "Distribution of Pair Differences",
    xaxis = list(title = "Pair Differences"),
    yaxis = list(title = "Count"),
    shapes = list(
      list(
        type = "line",
        x0 = mean(mydata$PairDifferences, na.rm = TRUE),
        x1 = mean(mydata$PairDifferences, na.rm = TRUE),
        y0 = 0,
        y1 = max(table(mydata$PairDifferences)),
        line = list(color = "red", dash = "dash")
      )
    )
  )

# ============================================================
#  Descriptive Statistics
# ============================================================
desc_stats <- mydata %>%
  summarise(
    count = n(),
    mean = mean(PairDifferences, na.rm = TRUE),
    sd = sd(PairDifferences, na.rm = TRUE),
    min = min(PairDifferences, na.rm = TRUE),
    max = max(PairDifferences, na.rm = TRUE)
  )

desc_table <- desc_stats %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Descriptive Statistics: Pair Differences") %>%
  fmt_number(columns = where(is.numeric), decimals = 3)

# ============================================================
#  Normality Test (Shapiro-Wilk)
# ============================================================
shapiro_res <- shapiro.test(mydata$PairDifferences)
shapiro_table <- tidy(shapiro_res) %>%
  select(statistic, p.value, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Normality Test (Shapiro-Wilk)") %>%
  fmt_number(columns = c(statistic, p.value), decimals = 4) %>%
  tab_source_note(
    source_note = "If the P.VALUE is 0.05 or less, the number of pairs is fewer than 40, and the distribution of pair differences shows obvious non-normality or outliers, consider using the Wilcoxon Signed Rank Test results instead of the Paired-Samples t-Test results."
  )

# ============================================================
#  Reshape Data for Repeated-Measures Plot
# ============================================================
df_long <- mydata %>%
  pivot_longer(cols = c(V1, V2),
               names_to = "Measure",
               values_to = "Value")

# ============================================================
#  Repeated-Measures Boxplot (Interactive, with Means)
# ============================================================
group_means <- df_long %>%
  group_by(Measure) %>%
  summarise(mean_value = mean(Value), .groups = "drop")

boxplot_measures <- plot_ly() %>%
  add_trace(
    data = df_long,
    x = ~Measure, y = ~Value,
    type = "box",
    boxpoints = "outliers",   
    marker = list(color = "red", size = 4),
    line = list(color = "black"),
    fillcolor = "royalblue",
    name = ""
  ) %>%
  add_trace(
    data = group_means,
    x = ~Measure, y = ~mean_value,
    type = "scatter", mode = "markers",
    marker = list(
      symbol = "diamond", size = 9,
      color = "black", line = list(color = "white", width = 1)
    ),
    text = ~paste0("Mean = ", round(mean_value, 2)),
    hoverinfo = "text",
    name = "Group Mean"
  ) %>%
  layout(
    title = "Boxplot of Repeated Measures (V1 vs V2) with Means",
    xaxis = list(title = "Measure"),
    yaxis = list(title = "Value"),
    showlegend = FALSE
  )

# ============================================================
#  Parametric Test (Paired-Samples t-Test)
# ============================================================
t_res <- t.test(mydata$V2, mydata$V1, paired = TRUE)
t_table <- tidy(t_res) %>%
  select(statistic, parameter, p.value, conf.low, conf.high, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Paired-Samples t-Test") %>%
  fmt_number(columns = c(statistic, p.value, conf.low, conf.high), decimals = 4)

t_summary <- mydata %>%
  select(V1, V2) %>%
  summarise_all(list(Mean = mean, SD = sd)) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Group Means and SDs (t-Test)") %>%
  fmt_number(columns = everything(), decimals = 3)

# ============================================================
#  Nonparametric Test (Wilcoxon Signed Rank)
# ============================================================
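# Note: differences here are V1 - V2 (the t-test above uses V2 - V1).
# The direction changes the V statistic but not the two-sided p-value.
wilcox_res <- wilcox.test(mydata$V1, mydata$V2, paired = TRUE,
                          exact = FALSE)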
wilcox_table <- tidy(wilcox_res) %>%
  select(statistic, p.value, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Wilcoxon Signed Rank Test") %>%
  fmt_number(columns = c(statistic, p.value), decimals = 4)

wilcox_summary <- mydata %>%
  select(V1, V2) %>%
  summarise_all(list(Mean = mean, SD = sd)) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Group Means and SDs (Wilcoxon)") %>%
  fmt_number(columns = everything(), decimals = 3)

# ============================================================
#  Results Summary (in specified order)
# ============================================================
hist_plot
desc_table
shapiro_table
boxplot_measures
t_table
t_summary
wilcox_table
wilcox_summary

APNews data paired-samples t-test analysis - Week 10

DeAnn Hays

2025-11-05
