library(tidyverse)
library(httr2)
NY Times API
Approach
I will use the New York Times Article Search API to analyze how frequently different newspaper sections publish articles on a specific topic. The goal is to answer two questions: which New York Times sections published the most articles about artificial intelligence in 2025, and how does that distribution compare with 2015, ten years earlier?
To answer this question, I will first obtain access to the API by creating a New York Times developer account and storing my API key as an environment variable, which the code reads with Sys.getenv("NYT_API_KEY"). I will then construct requests to the Article Search API endpoint using a keyword query (“AI” and “artificial intelligence”) and two date ranges: January 1 to December 31, 2025, and January 1 to December 31, 2015. I will include the section_name facet parameter to summarize article counts by section for each time period.
Once the requests are made, I will parse the returned JSON responses using R, specifically with the jsonlite package, and inspect their structure to identify the relevant fields. Because the API response is nested, I will extract the section-level counts from the facets portion of each response and transform them into tidy data frames. Each dataset will contain section names and their corresponding article counts.
The datasets for both years will then be combined into a single structure that includes a year variable, allowing for direct comparison. Finally, I will analyze the data by comparing article counts across sections between 2015 and 2025 to identify shifts in coverage over time. The final Quarto document will include a description of the selected API and endpoint, the parameters used in the requests, the R code for authentication and data retrieval, the resulting tidy data frames, and a brief explanation of any data-cleaning decisions made during the process.
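The request construction described above can be sketched as a small helper, so that both years share one query string and differ only in their date range. The function name `build_nyt_request()`, its arguments, the default query text, and the `"demo-key"` placeholder are hypothetical and introduced only for illustration; the endpoint and parameter names are the ones used in this document.

```r
library(httr2)

# Hypothetical helper: build (but do not send) an Article Search request
# for one calendar year. By default the key is read from the environment.
build_nyt_request <- function(year,
                              query = "artificial intelligence OR AI",
                              api_key = Sys.getenv("NYT_API_KEY")) {
  request("https://api.nytimes.com/svc/search/v2/articlesearch.json") |>
    req_url_query(
      q = query,
      begin_date = paste0(year, "0101"), # January 1 of the given year
      end_date = paste0(year, "1231"),   # December 31 of the given year
      page = 0,
      `api-key` = api_key
    )
}

# Placeholder key so the sketch runs without a real account.
demo_req_2015 <- build_nyt_request(2015, api_key = "demo-key")
demo_req_2025 <- build_nyt_request(2025, api_key = "demo-key")
```

Because the query text lives in one place, the two requests cannot drift apart, which keeps the year-to-year comparison like for like.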
Code Base
library(jsonlite)
API Key Authentication
The API key is stored securely as an environment variable and accessed with Sys.getenv().
To set it once on your computer, add a line like this to your .Renviron file:
NYT_API_KEY=your_actual_key_here
Then restart R.
readRenviron("~/.Renviron")
nyt_key <- Sys.getenv("NYT_API_KEY")
if (nyt_key == "") {
  stop("NYT_API_KEY not found. Please store your API key in an environment variable.")
}
req <- request("https://api.nytimes.com/svc/search/v2/articlesearch.json") |>
  req_url_query(
    q = "artificial intelligence OR AI OR machine learning OR language model",
    begin_date = "20250101",
    end_date = "20251231",
    page = 0,
    `api-key` = nyt_key
  )
resp <- req_perform(req)
resp
<httr2_response>
GET https://api.nytimes.com/svc/search/v2/articlesearch.json?q=artificial%20intelligence%20OR%20AI%20OR%20machine%20learning%20OR%20language%20model&begin_date=20250101&end_date=20251231&page=0&api-key=<redacted>
Status: 200 OK
Content-Type: application/json
Body: In memory (20441 bytes)
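Printing a response object echoes the full request URL, including the `api-key` query parameter, so the key can end up in a rendered report. A minimal sketch of one way to avoid this is to mask the key before displaying any URL. `redact_api_key()` is a hypothetical helper written for this example, not part of httr2, and the key in the sample URL is a made-up placeholder.

```r
# Hypothetical helper: mask the value of the api-key query parameter
# in a URL string before it is printed or logged.
redact_api_key <- function(url) {
  sub("api-key=[^&]*", "api-key=<redacted>", url)
}

# Example with a placeholder key.
redact_api_key(
  "https://api.nytimes.com/svc/search/v2/articlesearch.json?page=0&api-key=abc123"
)
```

In the report itself, showing `resp_status(resp)` rather than printing `resp` is another way to confirm the request succeeded without exposing the key.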
# convert JSON response into R list
nyt_data <- resp_body_json(resp, simplifyVector = TRUE)
# inspect structure
names(nyt_data)
[1] "status" "copyright" "response"
names(nyt_data$response)
[1] "docs" "metadata"
# create a tidy dataframe from the JSON response
articles_df <- as_tibble(nyt_data$response$docs) |>
  transmute(
    headline = headline$main, # extract main headline text
    section_name = if_else(is.na(section_name) | section_name == "", "Unknown", section_name), # fill missing sections
    pub_date = as.Date(substr(pub_date, 1, 10)), # convert publication date
    web_url, # article link
    news_desk, # newsroom category
    document_type # type of article
  )
# view the cleaned dataframe
articles_df
# A tibble: 10 × 6
headline section_name pub_date web_url news_desk document_type
<chr> <chr> <date> <chr> <chr> <chr>
1 Who Pays When A.I. I… Business 2025-11-12 https:… Business article
2 Recruiters Use A.I. … Business 2025-10-07 https:… Business article
3 Yann LeCun, a Pionee… Technology 2025-11-19 https:… Business article
4 A.I. Bots or Us: Who… Books 2025-08-27 https:… BookRevi… article
5 Israel’s A.I. Experi… Technology 2025-04-25 https:… Business article
6 Jeff Bezos Creates A… Technology 2025-11-17 https:… Business article
7 How Chile Embodies A… Technology 2025-10-20 https:… Business article
8 Why You’re Better Th… Gameplay 2025-12-09 https:… Games article
9 The Fever Dream of I… Opinion 2025-09-03 https:… OpEd article
10 The Professors Are U… Technology 2025-05-14 https:… Business article
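The flattening step above relies on `headline` being a nested data frame inside `docs`, which is how `simplifyVector = TRUE` represents a nested JSON object. The sketch below reproduces that shape with mock data so the cleaning rules can be checked offline; the rows, headlines, and timestamp format are invented for illustration and are not API output.

```r
library(dplyr)
library(tibble)

# Mock of the shape of nyt_data$response$docs: an ordinary data frame
# whose `headline` column is itself a data frame (values are invented).
mock_docs <- data.frame(
  section_name = c("Technology", "", NA),
  pub_date = c("2025-11-19T10:00:00Z", "2025-10-07T09:30:00Z", "2025-09-03T05:00:00Z"),
  stringsAsFactors = FALSE
)
mock_docs$headline <- data.frame(main = c("Example A", "Example B", "Example C"))

mock_tidy <- as_tibble(mock_docs) |>
  transmute(
    headline = headline$main, # pull the text out of the nested column
    section_name = if_else(is.na(section_name) | section_name == "", "Unknown", section_name),
    pub_date = as.Date(substr(pub_date, 1, 10)) # keep only the YYYY-MM-DD part
  )

mock_tidy
```

Both the empty string and the missing value are mapped to "Unknown", so articles without a section still show up in the counts instead of being silently dropped.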
# count articles by section
section_counts <- articles_df |>
  count(section_name, sort = TRUE)
section_counts
# A tibble: 5 × 2
section_name n
<chr> <int>
1 Technology 5
2 Business 2
3 Books 1
4 Gameplay 1
5 Opinion 1
# plot top sections
section_counts |>
  slice_max(order_by = n, n = 10) |>
  ggplot(aes(x = reorder(section_name, n), y = n)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Top NYT Sections for AI Articles in 2025",
    x = "Section",
    y = "Article Count"
  )
# request 2015 data (1 page to avoid rate limits)
req_2015 <- request("https://api.nytimes.com/svc/search/v2/articlesearch.json") |>
  req_url_query(
    q = "artificial intelligence OR AI OR machine learning",
    begin_date = "20150101",
    end_date = "20151231",
    page = 0,
    `api-key` = nyt_key
  )
resp_2015 <- req_perform(req_2015)
# parse JSON
nyt_data_2015 <- resp_body_json(resp_2015, simplifyVector = TRUE)
# create dataframe for 2015
articles_2015 <- as_tibble(nyt_data_2015$response$docs) |>
  transmute(
    headline = headline$main,
    section_name = if_else(is.na(section_name) | section_name == "", "Unknown", section_name),
    pub_date = as.Date(substr(pub_date, 1, 10)),
    web_url,
    news_desk,
    document_type
  )
articles_2015
# A tibble: 10 × 6
headline section_name pub_date web_url news_desk document_type
<chr> <chr> <date> <chr> <chr> <chr>
1 Don’t Fear the Robots Sunday Revi… 2015-10-24 https:… OpEd article
2 Artificial-Intellige… Science 2015-12-11 https:… Business article
3 ‘Machines of Loving … Books 2015-08-21 https:… BookRevi… article
4 Software Is Smart En… Technology 2015-09-20 https:… Business article
5 The Real Threat Pose… Technology 2015-07-11 https:… Business article
6 The End of Work? Opinion 2015-12-10 https:… OpEd article
7 Toyota to Finance $5… Science 2015-09-04 https:… Business article
8 Daily Report: Machin… Technology 2015-12-11 https:… Business article
9 Firms Pit Artificial… Technology 2015-10-14 https:… Business article
10 Outing A.I.: Beyond … Opinion 2015-02-23 https:… OpEd article
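Each request above retrieves only page 0, which yields 10 articles per year. If more pages were wanted, one possible pattern is to loop over page numbers with a pause between calls. The sketch below is an assumption-laden illustration: `collect_pages()` and `fake_fetch()` are hypothetical names, the 12-second default pause is a guess that should be checked against the current NYT rate limits, and a stand-in function replaces the real request so the example runs offline.

```r
library(dplyr)

# Hypothetical helper: call `fetch_page(page)` for pages 0..(n_pages - 1),
# pausing between calls, and stack the results into one data frame.
collect_pages <- function(fetch_page, n_pages, pause_seconds = 12) {
  pages <- vector("list", n_pages)
  for (i in seq_len(n_pages)) {
    pages[[i]] <- fetch_page(i - 1)            # page numbers start at 0
    if (i < n_pages) Sys.sleep(pause_seconds)  # wait before the next call
  }
  bind_rows(pages)
}

# Stand-in for a function that would perform one request and tidy one page.
fake_fetch <- function(page) {
  tibble(page = page, section_name = c("Technology", "Business"))
}

demo_pages <- collect_pages(fake_fetch, n_pages = 3, pause_seconds = 0)
demo_pages
```

Passing the fetching function in as an argument keeps the looping logic testable without network access; a real version would wrap the request, parse, and `transmute()` steps shown earlier.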
# add year labels
nyt_2025 <- articles_df |> mutate(year = 2025)
nyt_2015 <- articles_2015 |> mutate(year = 2015)
# combine both datasets
nyt_all <- bind_rows(nyt_2015, nyt_2025)
nyt_all
# A tibble: 20 × 7
headline section_name pub_date web_url news_desk document_type year
<chr> <chr> <date> <chr> <chr> <chr> <dbl>
1 Don’t Fear the… Sunday Revi… 2015-10-24 https:… OpEd article 2015
2 Artificial-Int… Science 2015-12-11 https:… Business article 2015
3 ‘Machines of L… Books 2015-08-21 https:… BookRevi… article 2015
4 Software Is Sm… Technology 2015-09-20 https:… Business article 2015
5 The Real Threa… Technology 2015-07-11 https:… Business article 2015
6 The End of Wor… Opinion 2015-12-10 https:… OpEd article 2015
7 Toyota to Fina… Science 2015-09-04 https:… Business article 2015
8 Daily Report: … Technology 2015-12-11 https:… Business article 2015
9 Firms Pit Arti… Technology 2015-10-14 https:… Business article 2015
10 Outing A.I.: B… Opinion 2015-02-23 https:… OpEd article 2015
11 Who Pays When … Business 2025-11-12 https:… Business article 2025
12 Recruiters Use… Business 2025-10-07 https:… Business article 2025
13 Yann LeCun, a … Technology 2025-11-19 https:… Business article 2025
14 A.I. Bots or U… Books 2025-08-27 https:… BookRevi… article 2025
15 Israel’s A.I. … Technology 2025-04-25 https:… Business article 2025
16 Jeff Bezos Cre… Technology 2025-11-17 https:… Business article 2025
17 How Chile Embo… Technology 2025-10-20 https:… Business article 2025
18 Why You’re Bet… Gameplay 2025-12-09 https:… Games article 2025
19 The Fever Drea… Opinion 2025-09-03 https:… OpEd article 2025
20 The Professors… Technology 2025-05-14 https:… Business article 2025
# count by year and section
section_compare <- nyt_all |>
  count(year, section_name, sort = TRUE)
section_compare
# A tibble: 10 × 3
year section_name n
<dbl> <chr> <int>
1 2025 Technology 5
2 2015 Technology 4
3 2015 Opinion 2
4 2015 Science 2
5 2025 Business 2
6 2015 Books 1
7 2015 Sunday Review 1
8 2025 Books 1
9 2025 Gameplay 1
10 2025 Opinion 1
ggplot(section_compare, aes(x = n, y = reorder(section_name, n))) +
  geom_col() +
  facet_wrap(~ year, scales = "free_y") +
  labs(
    title = "NYT Sections for AI Articles: 2015 vs 2025",
    x = "Article Count",
    y = "Section"
  )
section_totals <- section_compare |>
  group_by(year) |>
  summarize(total = sum(n))
section_totals
# A tibble: 2 × 2
year total
<dbl> <int>
1 2015 10
2 2025 10
section_pct <- section_compare |>
  group_by(year) |>
  mutate(
    total = sum(n),
    pct = (n / total) * 100
  ) |>
  ungroup()
section_pct
# A tibble: 10 × 5
year section_name n total pct
<dbl> <chr> <int> <int> <dbl>
1 2025 Technology 5 10 50
2 2015 Technology 4 10 40
3 2015 Opinion 2 10 20
4 2015 Science 2 10 20
5 2025 Business 2 10 20
6 2015 Books 1 10 10
7 2015 Sunday Review 1 10 10
8 2025 Books 1 10 10
9 2025 Gameplay 1 10 10
10 2025 Opinion 1 10 10
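The long table above lists only the sections that appear in each year, so a section missing from one year has no row rather than a 0% row. A minimal sketch of one way to line the two years up side by side is to pivot to wide format and fill the gaps with zero. The counts are copied from the `section_compare` output above; the object names `counts` and `pct_change` are introduced here for illustration.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Section counts copied from the section_compare output above.
counts <- tribble(
  ~year, ~section_name,   ~n,
  2025,  "Technology",    5,
  2015,  "Technology",    4,
  2015,  "Opinion",       2,
  2015,  "Science",       2,
  2025,  "Business",      2,
  2015,  "Books",         1,
  2015,  "Sunday Review", 1,
  2025,  "Books",         1,
  2025,  "Gameplay",      1,
  2025,  "Opinion",       1
)

pct_change <- counts |>
  group_by(year) |>
  mutate(pct = 100 * n / sum(n)) |>       # share of that year's sample
  ungroup() |>
  select(section_name, year, pct) |>
  pivot_wider(
    names_from = year,
    values_from = pct,
    names_prefix = "pct_",
    values_fill = 0                        # absent section counts as 0%
  ) |>
  mutate(change = pct_2025 - pct_2015) |>  # percentage-point change
  arrange(desc(change))

pct_change
```

With the zeros filled in, increases and decreases can be read directly from the `change` column instead of being inferred from missing rows.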
Conclusion
The comparison between 2015 and 2025 suggests a shift in how artificial intelligence is covered across New York Times sections. In the 2015 sample, Technology led with 4 of 10 articles, followed by Opinion and Science with 2 each and Books and Sunday Review with 1 each. The presence of Science points to a somewhat more research-oriented focus.
In the 2025 sample, articles again span five sections (Technology, Business, Books, Gameplay, and Opinion), but the mix is different: Business and Gameplay appear, while Science and Sunday Review drop out. This is consistent with AI moving from a specialized topic into wider industry and public discourse.
A percentage-based comparison highlights this shift more clearly. The Business section shows the largest increase, growing from 0% in 2015 to 20% in 2025, which may reflect the growing economic relevance of AI. In contrast, Science declines from 20% to 0%, suggesting reduced emphasis on AI as a purely research-driven subject, and Opinion falls from 20% to 10%. Technology remains the dominant section in both years, increasing from 40% to 50%.
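Because this analysis is based on only the first page of API results (10 articles) for each year, and because the 2025 query included the term "language model" while the 2015 query did not, the findings should be interpreted as a small sample rather than a complete count of all published articles.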