First‑level agenda‑setting theory argues that the topics news outlets cover most often tend to become the topics audiences consider most important. Because newsrooms have limited space and time, editors must decide which issues get attention and which are set aside. Over time, topics compete for space on the media agenda: some issues dominate coverage while others fade into the background.
For this initial analysis I’m looking at two broad issues: U.S. economic conditions and climate change. These topics compete for news space and are often connected to broader public‑policy debates. Agenda‑setting theory suggests that the topic receiving more coverage will be more salient to audiences.
Weekly APNews.com coverage of the U.S. economy will be greater than coverage of climate change between Jan. 1 and Sept. 30, 2025.
My analysis relies on APNews.com stories published between Jan. 1 and Sept. 30, 2025. I treat story topic (economy vs. climate change) as the independent variable and the number of stories per week mentioning each topic as the dependent variable.
To measure the dependent variable, I counted the number of stories that contained key words or phrases associated with each topic. For economic stories, I searched each story’s full text for these words/phrases: inflation, unemployment, interest rate, economic growth, recession and GDP. For climate stories I searched for climate change, global warming, carbon emissions, greenhouse gas and sea‑level rise. Each term was wrapped in a case‑insensitive regular expression and applied to the Full.Text column of the APNews dataset. A story was flagged as an economy story if it contained at least one economy term and as a climate story if it contained at least one climate term; a story containing terms from both lists was flagged for both topics.
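To make the flagging step concrete, here is a minimal sketch using made‑up headlines (not actual AP text) showing how a case‑insensitive search over a term list produces one TRUE/FALSE flag per story:
## Minimal sketch of the flagging logic on made-up text
library(stringr)
demo_text <- c("Inflation cooled last month, easing recession fears.",
               "New rules target carbon emissions from power plants.",
               "Rising unemployment and climate change made headlines.")
demo_terms <- c("inflation", "unemployment", "recession")
hits <- sapply(demo_terms, function(term) {
  str_detect(demo_text, regex(term, ignore_case = TRUE))
})
rowSums(hits) > 0  # TRUE FALSE TRUE: stories 1 and 3 mention the economy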
After flagging, I converted the stories’ DateTime values to calendar weeks (weeks starting on Monday) and counted the number of stories per week for each topic. This produced a weekly data set of story counts covering the 40 calendar weeks touched by the Jan. 1 – Sept. 30 window (the first and last weeks are partial, because Jan. 1 fell on a Wednesday and Sept. 30 on a Tuesday).
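Because floor_date() maps each date to the Monday that starts its week, the boundaries of the window are easy to verify with a quick sketch:
## Checking the week boundaries with lubridate
library(lubridate)
floor_date(as.Date("2025-01-01"), unit = "week", week_start = 1)  # "2024-12-30"
floor_date(as.Date("2025-09-30"), unit = "week", week_start = 1)  # "2025-09-29"
## Mondays from Dec. 30, 2024, through Sept. 29, 2025 = 40 distinct weeks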
To compare the volumes, I ran an independent‑samples t‑test (Welch’s version, which does not assume equal variances) on the weekly counts. I report the mean number of stories per week, the t‑statistic, the p‑value and Cohen’s d as an effect size.
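As a minimal sketch of the planned test, using made‑up weekly counts (the full script below runs the same steps on the real data):
## Toy illustration of the test and effect size (made-up counts)
x <- c(12, 9, 11, 8, 10)  # hypothetical weekly economy counts
y <- c(5, 7, 4, 6, 5)     # hypothetical weekly climate counts
t.test(x, y)              # t.test() defaults to Welch's unequal-variances test
sp <- sqrt(((length(x) - 1) * var(x) + (length(y) - 1) * var(y)) /
             (length(x) + length(y) - 2))  # pooled standard deviation
(mean(x) - mean(y)) / sp  # Cohen's d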
Descriptive statistics show that APNews averaged about 10.1 economy‑flagged stories per week and about 5.1 climate‑change stories per week (pooled SD ≈ 4.3) during the first nine months of 2025. The interactive line plot below displays weekly story counts for each topic; hover over any point to see its exact value.
Two main patterns stand out: economic coverage ran higher than climate coverage in most weeks, and climate coverage occasionally spiked toward economic levels around major environmental events.
The independent‑samples t‑test shows that the difference between economic and climate coverage is statistically significant (t ≈ 5.18, df ≈ 62.8, p < .001). The effect size (Cohen’s d ≈ 1.16) indicates a large difference in weekly coverage volumes. These findings support the hypothesis that APNews devoted more attention to the economy than to climate change during the period studied. Still, the trend lines show that climate coverage can spike to rival economic coverage when major environmental events occur.
The R code below loads the APNews data, flags stories for each topic, calculates weekly counts and produces the interactive plot and statistical tests. You can run this script in RStudio to reproduce the results.
## Load required libraries (installing any that are missing)
pkgs <- c("tidyverse", "lubridate", "stringr", "plotly")
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
library(tidyverse)
library(lubridate)
library(stringr)
library(plotly)
## Read the data
## If APNews.rds is not present in your working directory, download it
if (!file.exists("APNews.rds")) {
  download.file(
    "https://github.com/drkblake/Data/raw/refs/heads/main/APNews.rds",
    destfile = "APNews.rds",
    mode = "wb"
  )
}
AP <- readRDS("APNews.rds")
## Identify the date column and filter dates to Jan. 1 – Sept. 30, 2025
# Some versions of the APNews dataset use different names for the
# publication timestamp (e.g., "DateTime", "Date", "ScrapeDate").
# Find the first matching column and convert it to POSIXct.
date_candidates <- c("DateTime", "Date", "dateTime", "ScrapeDate",
                     "Scrape.Time", "Timestamp", "PubDate", "PubDateUTC")
date_col <- intersect(date_candidates, names(AP))[1]
if (is.na(date_col)) stop("No date column found in the dataset")
AP <- AP %>%
  mutate(DateVar = as.POSIXct(.data[[date_col]])) %>%
  # Use a strict upper bound of Oct. 1 so stories published at any time
  # on Sept. 30 are kept (<= "2025-09-30" would cut them off at midnight)
  filter(DateVar >= as.POSIXct("2025-01-01") &
           DateVar < as.POSIXct("2025-10-01"))
## Define search terms for each topic
economy_terms <- c("inflation", "unemployment", "interest rate",
                   "economic growth", "recession", "gdp")
climate_terms <- c("climate change", "global warming",
                   "carbon emissions", "greenhouse gas",
                   "sea-level rise")
## Flag stories for each topic
# Search for each term individually and set the flag if at least one
# term is found anywhere in the story text. Case is ignored.
AP <- AP %>%
  mutate(
    Economy_flag = rowSums(sapply(economy_terms, function(term) {
      stringr::str_detect(Full.Text, stringr::regex(term, ignore_case = TRUE))
    })) > 0,
    Climate_flag = rowSums(sapply(climate_terms, function(term) {
      stringr::str_detect(Full.Text, stringr::regex(term, ignore_case = TRUE))
    })) > 0,
    # Assign each story to the Monday-start calendar week it falls in
    Week = floor_date(DateVar, unit = "week", week_start = 1)
  )
## Count stories per week
weekly_counts <- AP %>%
  group_by(Week) %>%
  summarise(Economy = sum(Economy_flag),
            ClimateChange = sum(Climate_flag),
            .groups = "drop")
## Reshape to long format for plotting
weekly_long <- weekly_counts %>%
  pivot_longer(cols = c(Economy, ClimateChange),
               names_to = "Topic", values_to = "Count")
## Create interactive line chart
plotly_fig <- weekly_long %>%
  plot_ly(x = ~Week, y = ~Count, color = ~Topic,
          type = "scatter", mode = "lines+markers") %>%
  layout(title = "Weekly APNews Coverage: Economy vs Climate Change",
         xaxis = list(title = "Week"),
         yaxis = list(title = "Number of Stories"))
## Display the plot
plotly_fig
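## Optional: save the interactive chart as a standalone HTML file.
## saveWidget() comes from the htmlwidgets package, which plotly depends
## on; the file name below is just an example. Uncomment to use.
# htmlwidgets::saveWidget(plotly_fig, "weekly_coverage.html")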
## Independent-samples t-test (t.test() defaults to Welch's
## unequal-variances version)
t_test_result <- t.test(weekly_counts$Economy, weekly_counts$ClimateChange, paired = FALSE)
t_test_result
##
## Welch Two Sample t-test
##
## data: weekly_counts$Economy and weekly_counts$ClimateChange
## t = 5.1753, df = 62.79, p-value = 2.541e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 3.038517 6.861483
## sample estimates:
## mean of x mean of y
## 10.075 5.125
## Effect size (Cohen's d), using the pooled standard deviation
pooled_sd <- sqrt(((length(weekly_counts$Economy) - 1) * var(weekly_counts$Economy) +
                     (length(weekly_counts$ClimateChange) - 1) * var(weekly_counts$ClimateChange)) /
                    (length(weekly_counts$Economy) + length(weekly_counts$ClimateChange) - 2))
d_value <- (mean(weekly_counts$Economy) - mean(weekly_counts$ClimateChange)) / pooled_sd
d_value
## [1] 1.157225
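If you’d like a cross‑check on the manual effect‑size calculation, the effectsize package (not part of the original script, so install it first if needed) computes Cohen’s d directly; its default pooled‑SD estimate should match the value above.
## Optional cross-check with the effectsize package
# if (!require("effectsize")) install.packages("effectsize")
# effectsize::cohens_d(weekly_counts$Economy, weekly_counts$ClimateChange)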