Rationale

Agenda-setting theory suggests that the media can shape the public’s perceptions of which issues are considered most important. First-level agenda-setting theory predicts that issues or topics featured prominently in media content tend to become prominent in the minds of media audiences. Because media organizations have limited time, resources, and space, they must decide which topics to prioritize for greater coverage and which to allocate less attention to. As a result, issues — or in this case, sports leagues — compete with one another for prominence within the media agenda.

Sports coverage, like other types of news, reflects editorial judgments about what is considered most “newsworthy.” The amount of coverage devoted to men’s and women’s sports may shape public perceptions of their relative importance, popularity, or legitimacy. For example, if one league consistently receives more coverage, audiences may come to view it as more significant or important than the other.

Drawing on first-level agenda-setting theory, my project will compare the media prominence of two professional basketball leagues: the National Basketball Association (NBA) and the Women’s National Basketball Association (WNBA). Specifically, the analysis will examine the number of weekly stories published by APNews.com about each league between January 1st and September 30th, 2025.

The results will help clarify how coverage of men’s and women’s sports competes for attention within the sports media agenda. Although both leagues represent the highest level of professional basketball in the world, they often differ in media visibility and perceived newsworthiness. This comparison will shed light on how gender dynamics may influence which sports receive greater attention and, in turn, how that attention affects public awareness and perception of athletic achievement.

Hypothesis

During the NBA’s off-season, weekly AP News coverage of the NBA will be substantially greater than coverage of the WNBA during its regular season.

Variables & method

Weekly APNews.com coverage volume for each of the two basketball leagues (NBA and WNBA) served as the dependent variable for the analysis. It was measured as a count: the number of stories published per week that referenced each league.

The independent variable was “story topic,” measured categorically as either “NBA” or “WNBA.” The story topic was identified using keywords and phrases uniquely associated with each league. Keywords used to define NBA-related stories included: “NBA,” “NBA’s,” “Adam Silver,” “LeBron,” “Luka,” “men’s basketball,” and “NBA Finals.” Keywords used to define WNBA-related stories included: “WNBA,” “WNBA’s,” “Cathy Engelbert,” “Caitlin Clark,” and “A’ja Wilson.”
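
The keyword matching described above can be sketched in a few lines of base R. This is a minimal illustration rather than the project's actual pipeline: the `headlines` vector and the `build_pattern` helper are invented for the example, and only a subset of the keywords is shown. Wrapping each keyword in word boundaries (`\\b`) is what keeps "WNBA" from also counting as an "NBA" match.

```r
# Hypothetical example text; these are not real AP News headlines.
headlines <- c(
  "NBA playoffs open this weekend",
  "WNBA draft class arrives at training camp",
  "Caitlin Clark returns from injury",
  "City council approves new arena budget"
)

# Subset of the keywords listed above, for illustration only.
nba_keywords  <- c("NBA", "Adam Silver", "LeBron")
wnba_keywords <- c("WNBA", "Cathy Engelbert", "Caitlin Clark")

# Build one whole-word pattern per league: \\bkeyword\\b joined by "|".
# (Hypothetical helper; assumes keywords contain no regex metacharacters.)
build_pattern <- function(keywords) {
  paste0("\\b", keywords, "\\b", collapse = "|")
}

# Case-insensitive match; TRUE means the text mentions the league.
is_nba  <- grepl(build_pattern(nba_keywords),  headlines,
                 ignore.case = TRUE, perl = TRUE)
is_wnba <- grepl(build_pattern(wnba_keywords), headlines,
                 ignore.case = TRUE, perl = TRUE)

data.frame(headlines, is_nba, is_wnba)
```

In this toy data the second headline is flagged only as WNBA, because there is no word boundary between the "W" and "NBA".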

Weekly counts of articles containing these keywords were compiled for each league between January 1st and September 30th, 2025, to evaluate how frequently each league appeared in AP News articles.
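
As a rough sketch of the counting step, the base R example below tallies stories per week and per league, and makes sure that weeks with no stories show up as zero rather than disappearing. The `stories` data frame is hypothetical, and it assumes each row belongs to a single league, which is a simplification: in the real data a story could mention both.

```r
# Hypothetical flagged stories: one row per story, with the week it ran.
stories <- data.frame(
  Week   = c(1, 1, 1, 2, 4, 4),
  League = c("NBA", "NBA", "WNBA", "NBA", "WNBA", "WNBA")
)

# Treat Week as a factor covering every week in the range, so that
# weeks with no stories still appear in the table with a count of zero.
all_weeks <- seq(min(stories$Week), max(stories$Week))
stories$Week <- factor(stories$Week, levels = all_weeks)

# Cross-tabulate: rows are weeks, columns are leagues.
weekly_counts <- table(stories$Week, stories$League)
weekly_counts
```

Filling the empty weeks matters for the paired test, since every week needs one value for each league.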

A paired-samples t-test was then used to assess whether the difference in weekly coverage volume between the NBA and WNBA was statistically significant.
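
To illustrate why the test is "paired," the sketch below uses made-up weekly counts (not the study data) and shows that a paired-samples t-test is the same as a one-sample t-test on the week-by-week differences. The variable names are chosen for the example only.

```r
# Hypothetical weekly story counts for five weeks (not the study data).
nba_counts  <- c(20, 31, 18, 27, 24)
wnba_counts <- c(9, 14, 12, 10, 15)

# Paired test: compares the two leagues week by week.
paired_result <- t.test(nba_counts, wnba_counts, paired = TRUE)

# Equivalent one-sample test on the weekly differences.
differences <- nba_counts - wnba_counts
one_sample_result <- t.test(differences, mu = 0)

# Both calls give the same t statistic and p-value.
c(paired     = unname(paired_result$statistic),
  one_sample = unname(one_sample_result$statistic))
```

Pairing by week is appropriate here because both leagues are measured in the same weeks, so each week contributes one difference score.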

Results & Discussion

The figure below summarizes the weekly coverage volume for each league from January 1st to September 30th, 2025. For the majority of weeks, APNews.com coverage of the NBA (blue line) exceeded coverage of the WNBA (red line).

The most notable difference in volume appeared during Week 17 (April 20–26), when NBA-related stories reached a period high of 51 articles, compared to only 6 for the WNBA. A review of NBA stories published during that week revealed extensive coverage of playoff matchups and game results. The prominence of postseason narratives and high-profile player performances likely contributed to the surge in coverage. Notably, WNBA coverage remained low despite key preseason activities such as the draft, training camp, media campaigns leading up to the season, and the announcement of a new team.

The peak for WNBA-related stories occurred during Week 29 (July 14–20), with 41 articles published, compared to 11 for the NBA. This coincided with WNBA All-Star Weekend, collective bargaining agreement (CBA) discussions, and media coverage of new player signature shoe releases. However, NBA coverage during the same week remained elevated despite the league being early into its off-season. This suggests that even when inactive, the NBA continues to command substantial media attention.

Overall, the results indicate not only differing levels of APNews.com coverage between the two leagues but also a possibly skewed pattern of media visibility. The NBA receives sustained coverage across both active and inactive periods, while WNBA coverage is more episodic and event-driven. These patterns are consistent with a first-level agenda-setting effect, in which the NBA is positioned as a year-round media priority and the WNBA is treated as a seasonal or secondary topic.

From January 1 to September 30, 2025, NBA coverage averaged about 25 articles per week, while WNBA coverage averaged about 12 per week, a mean difference of roughly 13 articles. The Shapiro–Wilk test on the pair differences was not significant (p = 0.4809), so there was no evidence that the differences departed from normality. The graph above indicates that the NBA receives considerably more media coverage than the WNBA, and the paired-samples t-test confirmed that this difference was statistically significant, t(39) = 4.556, p = 0.0001. These results support the hypothesis that the NBA receives substantially greater weekly media coverage than the WNBA.
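
The normality check can be sketched as follows. The differences below are invented for illustration, and the simple 0.05 cutoff is a simplification of the fuller decision rule noted in the code at the end of this report, which also considers the number of pairs and the shape of the distribution.

```r
# Hypothetical pair differences (NBA minus WNBA) for ten weeks.
differences <- c(12, 7, 15, 9, 20, 3, 11, 14, 8, 17)

# Shapiro-Wilk tests the null hypothesis that the values are
# drawn from a normal distribution.
normality <- shapiro.test(differences)

if (normality$p.value > 0.05) {
  # No evidence against normality: use the t-test on the differences.
  result <- t.test(differences, mu = 0)
} else {
  # Otherwise fall back to the rank-based Wilcoxon signed rank test.
  result <- wilcox.test(differences, mu = 0, exact = FALSE)
}

normality$p.value
result$method
```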

Below is a box plot of weekly article volumes for the WNBA (V1) and the NBA (V2), followed by the results of the paired-samples t-test.

Paired-Samples t-Test
statistic	parameter	p.value	conf.low	conf.high	method
4.5560	39	0.0001	7.3815	19.1685	Paired t-test

Group Means and SDs (t-Test)
V1_Mean	V2_Mean	V1_SD	V2_SD
11.950	25.225	8.140	14.545

The t-test result (p = 0.0001) supports the hypothesis that weekly AP News coverage volume of the NBA and WNBA differed during the first nine months of 2025.
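
As a worked check on the arithmetic, the sketch below rebuilds the 95% confidence interval from the summary values reported in the tables above. It is a back-of-the-envelope reconstruction, not the code that produced the results; the variable names are chosen for the example.

```r
# Values copied from the tables above.
mean_wnba <- 11.950
mean_nba  <- 25.225
t_stat    <- 4.5560
df        <- 39      # degrees of freedom, so n = 40 weekly pairs

# Mean weekly difference (NBA minus WNBA).
mean_diff <- mean_nba - mean_wnba

# The t statistic is mean_diff divided by the standard error,
# so the standard error can be recovered by division.
std_error <- mean_diff / t_stat

# 95% confidence interval: mean_diff +/- critical t * standard error.
t_crit <- qt(0.975, df)
ci <- mean_diff + c(-1, 1) * t_crit * std_error

round(mean_diff, 3)
round(ci, 2)
```

The reconstructed interval should land very close to the reported bounds of 7.3815 and 19.1685, with any small gap coming from rounding in the reported t statistic. Because the interval excludes zero, it agrees with the significant t-test result.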

Code:

Here is the code that produced the figures and paired-samples t-test results.

# ============================================
# APNews text analysis (First-level agenda-setting theory version)
# ============================================

# ============================================
# --- Load required libraries ---
# ============================================

if (!require("tidyverse")) install.packages("tidyverse")
if (!require("tidytext")) install.packages("tidytext")

library(tidyverse)
library(tidytext)

# ============================================
# --- Load the APNews data ---
# ============================================

# Read the data from the web
FetchedData <- readRDS(url("https://github.com/drkblake/Data/raw/refs/heads/main/APNews.rds"))
# Save the data on your computer
saveRDS(FetchedData, file = "APNews.rds")
# Remove the downloaded data from the environment
rm(FetchedData)

APNews <- readRDS("APNews.rds")

# ============================================
# --- Flag Topic1-related stories ---
# ============================================

# --- Define Topic1 phrases ---
phrases <- c(
  "NBA",
  "nba’s",
  "nba's",
  "Adam Silver",
  "Lebron",
  "Luka",
  "men’s basketball",
  "NBA Finals"
)

# --- Escape regex special characters ---
escaped_phrases <- str_replace_all(
  phrases,
  "([\\^$.|?*+()\\[\\]{}\\\\])",
  "\\\\\\1"
)

# --- Build whole-word/phrase regex pattern ---
pattern <- paste0("\\b", escaped_phrases, "\\b", collapse = "|")

# --- Apply matching to flag Topic1 stories ---
APNews <- APNews %>%
  mutate(
    Full.Text.clean = str_squish(Full.Text),  # normalize whitespace
    Topic1 = if_else(
      str_detect(Full.Text.clean, regex(pattern, ignore_case = TRUE)),
      "Yes",
      "No"
    )
  )

# ============================================
# --- Flag Topic2-related stories ---
# ============================================

# --- Define Topic2 phrases ---
phrases <- c(
  "WNBA",
  "wnba’s",
  "wnba's",
  "Cathy Engelbert",
  "Caitlin clark",
  "a'ja wilson"
)

# --- Escape regex special characters ---
escaped_phrases <- str_replace_all(
  phrases,
  "([\\^$.|?*+()\\[\\]{}\\\\])",
  "\\\\\\1"
)

# --- Build whole-word/phrase regex pattern ---
pattern <- paste0("\\b", escaped_phrases, "\\b", collapse = "|")

# --- Apply matching to flag Topic2 stories ---
APNews <- APNews %>%
  mutate(
    Full.Text.clean = str_squish(Full.Text),
    Topic2 = if_else(
      str_detect(Full.Text.clean, regex(pattern, ignore_case = TRUE)),
      "Yes",
      "No"
    )
  )

# ============================================
# --- Visualize weekly counts of Topic1- and Topic2-related stories ---
# ============================================

# --- Load plotly if needed ---
if (!require("plotly")) install.packages("plotly")
library(plotly)

# --- Summarize weekly counts for Topic1 = "Yes" ---
Topic1_weekly <- APNews %>%
  filter(Topic1 == "Yes") %>%
  group_by(Week) %>%
  summarize(Count = n(), .groups = "drop") %>%
  mutate(Topic = "NBA") # Note custom Topic1 label

# --- Summarize weekly counts for Topic2 = "Yes" ---
Topic2_weekly <- APNews %>%
  filter(Topic2 == "Yes") %>%
  group_by(Week) %>%
  summarize(Count = n(), .groups = "drop") %>%
  mutate(Topic = "WNBA") # Note custom Topic2 label

# --- Combine both summaries into one data frame ---
Weekly_counts <- bind_rows(Topic2_weekly, Topic1_weekly)

# --- Fill in missing combinations with zero counts ---
Weekly_counts <- Weekly_counts %>%
  tidyr::complete(
    Topic,
    Week = full_seq(range(Week), 1),  # generate all week numbers
    fill = list(Count = 0)
  ) %>%
  arrange(Topic, Week)

# --- Create interactive plotly line chart ---
AS1 <- plot_ly(
  data = Weekly_counts,
  x = ~Week,
  y = ~Count,
  color = ~Topic,
  colors = c("steelblue", "firebrick"),
  type = "scatter",
  mode = "lines+markers",
  line = list(width = 2),
  marker = list(size = 6)
) %>%
  layout(
    title = "Weekly Counts of NBA- and WNBA-Related AP News Articles",
    xaxis = list(
      title = "Week Number (starting with Week 1 of 2025)",
      dtick = 1
    ),
    yaxis = list(title = "Number of Articles"),
    legend = list(title = list(text = "Topic")),
    hovermode = "x unified"
  )

# ============================================
# --- Show the chart ---
# ============================================

AS1

# ============================================================
#  Setup: Install and Load Required Packages
# ============================================================
if (!require("tidyverse")) install.packages("tidyverse")
if (!require("plotly")) install.packages("plotly")
if (!require("gt")) install.packages("gt")
if (!require("gtExtras")) install.packages("gtExtras")
if (!require("broom")) install.packages("broom")

library(tidyverse)
library(plotly)
library(gt)
library(gtExtras)
library(broom)

options(scipen = 999)

# ============================================================
#  Data Preparation
# ============================================================
# Reshape to wide form

mydata <- Weekly_counts %>%
  pivot_wider(names_from = Topic, values_from = Count)

# Specify the two variables involved
mydata$V1 <- mydata$WNBA # <== Customize this
mydata$V2 <- mydata$NBA # <== Customize this

# ============================================================
#  Compute Pair Differences
# ============================================================
mydata$PairDifferences <- mydata$V2 - mydata$V1

# ============================================================
#  Interactive Histogram of Pair Differences
# ============================================================
hist_plot <- plot_ly(
  data = mydata,
  x = ~PairDifferences,
  type = "histogram",
  marker = list(color = "#1f78b4", line = list(color = "black", width = 1))
) %>%
  layout(
    title = "Distribution of Pair Differences",
    xaxis = list(title = "Pair Differences"),
    yaxis = list(title = "Count"),
    shapes = list(
      list(
        type = "line",
        x0 = mean(mydata$PairDifferences, na.rm = TRUE),
        x1 = mean(mydata$PairDifferences, na.rm = TRUE),
        yref = "paper",  # span the full plot height, independent of bin counts
        y0 = 0,
        y1 = 1,
        line = list(color = "red", dash = "dash")
      )
    )
  )

# ============================================================
#  Descriptive Statistics
# ============================================================
desc_stats <- mydata %>%
  summarise(
    count = n(),
    mean = mean(PairDifferences, na.rm = TRUE),
    sd = sd(PairDifferences, na.rm = TRUE),
    min = min(PairDifferences, na.rm = TRUE),
    max = max(PairDifferences, na.rm = TRUE)
  )

desc_table <- desc_stats %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Descriptive Statistics: Pair Differences") %>%
  fmt_number(columns = where(is.numeric), decimals = 3)

# ============================================================
#  Normality Test (Shapiro-Wilk)
# ============================================================
shapiro_res <- shapiro.test(mydata$PairDifferences)
shapiro_table <- tidy(shapiro_res) %>%
  select(statistic, p.value, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Normality Test (Shapiro-Wilk)") %>%
  fmt_number(columns = c(statistic, p.value), decimals = 4) %>%
  tab_source_note(
    source_note = "If the P.VALUE is 0.05 or less, the number of pairs is fewer than 40, and the distribution of pair differences shows obvious non-normality or outliers, consider using the Wilcoxon Signed Rank Test results instead of the Paired-Samples t-Test results."
  )

# ============================================================
#  Reshape Data for Repeated-Measures Plot
# ============================================================
df_long <- mydata %>%
  pivot_longer(cols = c(V1, V2),
               names_to = "Measure",
               values_to = "Value")

# ============================================================
#  Repeated-Measures Boxplot (Interactive, with Means)
# ============================================================
group_means <- df_long %>%
  group_by(Measure) %>%
  summarise(mean_value = mean(Value), .groups = "drop")

boxplot_measures <- plot_ly() %>%
  add_trace(
    data = df_long,
    x = ~Measure, y = ~Value,
    type = "box",
    boxpoints = "outliers",   
    marker = list(color = "red", size = 4),
    line = list(color = "black"),
    fillcolor = "royalblue",
    name = ""
  ) %>%
  add_trace(
    data = group_means,
    x = ~Measure, y = ~mean_value,
    type = "scatter", mode = "markers",
    marker = list(
      symbol = "diamond", size = 9,
      color = "black", line = list(color = "white", width = 1)
    ),
    text = ~paste0("Mean = ", round(mean_value, 2)),
    hoverinfo = "text",
    name = "Group Mean"
  ) %>%
  layout(
    title = "Boxplot of Repeated Measures (V1 vs V2) with Means",
    xaxis = list(title = "Measure"),
    yaxis = list(title = "Value"),
    showlegend = FALSE
  )

# ============================================================
#  Parametric Test (Paired-Samples t-Test)
# ============================================================
t_res <- t.test(mydata$V2, mydata$V1, paired = TRUE)
t_table <- tidy(t_res) %>%
  select(statistic, parameter, p.value, conf.low, conf.high, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Paired-Samples t-Test") %>%
  fmt_number(columns = c(statistic, p.value, conf.low, conf.high), decimals = 4)

t_summary <- mydata %>%
  select(V1, V2) %>%
  summarise_all(list(Mean = mean, SD = sd)) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Group Means and SDs (t-Test)") %>%
  fmt_number(columns = everything(), decimals = 3)

# ============================================================
#  Nonparametric Test (Wilcoxon Signed Rank)
# ============================================================
wilcox_res <- wilcox.test(mydata$V1, mydata$V2, paired = TRUE,
                          exact = FALSE)
wilcox_table <- tidy(wilcox_res) %>%
  select(statistic, p.value, method) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Wilcoxon Signed Rank Test") %>%
  fmt_number(columns = c(statistic, p.value), decimals = 4)

wilcox_summary <- mydata %>%
  select(V1, V2) %>%
  summarise_all(list(Mean = mean, SD = sd)) %>%
  gt() %>%
  gt_theme_538() %>%
  tab_header(title = "Group Means and SDs (Wilcoxon)") %>%
  fmt_number(columns = everything(), decimals = 3)

# ============================================================
#  Results Summary (in specified order)
# ============================================================
hist_plot
desc_table
shapiro_table
boxplot_measures
t_table
t_summary
wilcox_table
wilcox_summary

AP News | NBA V. WNBA Paired Samples T-Test

Deyonna Lansden

2025-11-05
