NY Times API

Author

Madina Kudanova

Approach

I will use the New York Times Article Search API to analyze how frequently different newspaper sections publish articles on a specific topic. The ultimate goal is to answer the question: which New York Times sections published the most articles about artificial intelligence in 2025, and how does this distribution compare with 2015, 10 years earlier?

To answer this question, I will first obtain access to the API by creating a New York Times developer account and securely storing my API key as an environment variable, read in R with Sys.getenv("NYT_API_KEY"). I will then construct requests to the Article Search API endpoint using a keyword query ("AI" and "artificial intelligence") and two defined date ranges: January 1 to December 31, 2025, and January 1 to December 31, 2015. I will include the section_name facet parameter to summarize article counts by section for each time period.

Once the requests are made, I will parse the returned JSON responses using R, specifically with the jsonlite package, and inspect their structure to identify the relevant fields. Because the API response is nested, I will extract the section-level counts from the facets portion of each response and transform them into tidy data frames. Each dataset will contain section names and their corresponding article counts.
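As a minimal sketch of the parsing step described above, the snippet below runs jsonlite on a small hand-written payload. The payload is hypothetical: it only mimics the docs portion of a response, its values are invented, and it is not real API output. It illustrates how simplifyVector = TRUE turns an array of article objects into a data frame in which a nested object such as headline becomes a nested data frame column.

```r
library(jsonlite)

# Hypothetical payload that mimics the shape of response$docs (values invented)
mock_json <- '{
  "status": "OK",
  "response": {
    "docs": [
      {"headline": {"main": "Example headline one"},
       "section_name": "Technology",
       "pub_date": "2025-01-15T10:00:00Z"},
      {"headline": {"main": "Example headline two"},
       "section_name": "",
       "pub_date": "2025-02-20T08:30:00Z"}
    ]
  }
}'

parsed <- fromJSON(mock_json, simplifyVector = TRUE)
docs <- parsed$response$docs

class(docs)           # a data frame with one row per article
class(docs$headline)  # a nested data frame, which is why headline$main works
docs$headline$main    # the headline text as a character vector

# pub_date arrives as text; the first 10 characters hold the calendar date
as.Date(substr(docs$pub_date, 1, 10))
```

The empty section_name in the second record is the kind of value that a cleaning step would need to relabel, for example as "Unknown".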

The datasets for both years will then be combined into a single structure that includes a year variable, allowing for direct comparison. Finally, I will analyze the data by comparing article counts across sections between 2015 and 2025 to identify shifts in coverage over time. The final Quarto document will include a description of the selected API and endpoint, the parameters used in the requests, the R code for authentication and data retrieval, the resulting tidy data frames, and a brief explanation of any data-cleaning decisions made during the process.

Code Base

library(tidyverse)
library(httr2)

library(jsonlite)

API Key Authentication

The API key is stored securely as an environment variable and accessed with Sys.getenv().

To set it once on your computer, add a line like this to your .Renviron file:

NYT_API_KEY=your_actual_key_here

Then restart R.

readRenviron("~/.Renviron")
nyt_key <- Sys.getenv("NYT_API_KEY")

if (nyt_key == "") {
  stop("NYT_API_KEY not found. Please store your API key in an environment variable.")
}

req <- request("https://api.nytimes.com/svc/search/v2/articlesearch.json") |>
  req_url_query(
    q = "artificial intelligence OR AI OR machine learning OR language model",
    begin_date = "20250101",
    end_date = "20251231",
    page = 0,
    `api-key` = nyt_key
  )

resp <- req_perform(req)
resp

<httr2_response>
GET https://api.nytimes.com/svc/search/v2/articlesearch.json?q=artificial%20intelligence%20OR%20AI%20OR%20machine%20learning%20OR%20language%20model&begin_date=20250101&end_date=20251231&page=0&api-key=<REDACTED>
Status: 200 OK
Content-Type: application/json
Body: In memory (20441 bytes)

# convert JSON response into R list
nyt_data <- resp_body_json(resp, simplifyVector = TRUE)

# inspect structure
names(nyt_data)

[1] "status"    "copyright" "response"

names(nyt_data$response)

[1] "docs"     "metadata"

# create a tidy dataframe from the JSON response
articles_df <- as_tibble(nyt_data$response$docs) |>
  transmute(
    headline = headline$main,  # extract main headline text
    section_name = if_else(is.na(section_name) | section_name == "", "Unknown", section_name),  # fill missing sections
    pub_date = as.Date(substr(pub_date, 1, 10)),  # convert publication date
    web_url,          # article link
    news_desk,        # newsroom category
    document_type     # type of article
  )

# view the cleaned dataframe
articles_df

# A tibble: 10 × 6
   headline              section_name pub_date   web_url news_desk document_type
   <chr>                 <chr>        <date>     <chr>   <chr>     <chr>        
 1 Who Pays When A.I. I… Business     2025-11-12 https:… Business  article      
 2 Recruiters Use A.I. … Business     2025-10-07 https:… Business  article      
 3 Yann LeCun, a Pionee… Technology   2025-11-19 https:… Business  article      
 4 A.I. Bots or Us: Who… Books        2025-08-27 https:… BookRevi… article      
 5 Israel’s A.I. Experi… Technology   2025-04-25 https:… Business  article      
 6 Jeff Bezos Creates A… Technology   2025-11-17 https:… Business  article      
 7 How Chile Embodies A… Technology   2025-10-20 https:… Business  article      
 8 Why You’re Better Th… Gameplay     2025-12-09 https:… Games     article      
 9 The Fever Dream of I… Opinion      2025-09-03 https:… OpEd      article      
10 The Professors Are U… Technology   2025-05-14 https:… Business  article

# count articles by section
section_counts <- articles_df |>
  count(section_name, sort = TRUE)

section_counts

# A tibble: 5 × 2
  section_name     n
  <chr>        <int>
1 Technology       5
2 Business         2
3 Books            1
4 Gameplay         1
5 Opinion          1

# plot top sections
section_counts |>
  slice_max(order_by = n, n = 10) |>
  ggplot(aes(x = reorder(section_name, n), y = n)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Top NYT Sections for AI Articles in 2025",
    x = "Section",
    y = "Article Count"
  )

# request 2015 data (1 page to avoid rate limits)
req_2015 <- request("https://api.nytimes.com/svc/search/v2/articlesearch.json") |>
  req_url_query(
    q = "artificial intelligence OR AI OR machine learning",
    begin_date = "20150101",
    end_date = "20151231",
    page = 0,
    `api-key` = nyt_key
  )

resp_2015 <- req_perform(req_2015)

# parse JSON
nyt_data_2015 <- resp_body_json(resp_2015, simplifyVector = TRUE)

# create dataframe for 2015
articles_2015 <- as_tibble(nyt_data_2015$response$docs) |>
  transmute(
    headline = headline$main,
    section_name = if_else(is.na(section_name) | section_name == "", "Unknown", section_name),
    pub_date = as.Date(substr(pub_date, 1, 10)),
    web_url,
    news_desk,
    document_type
  )

articles_2015

# A tibble: 10 × 6
   headline              section_name pub_date   web_url news_desk document_type
   <chr>                 <chr>        <date>     <chr>   <chr>     <chr>        
 1 Don’t Fear the Robots Sunday Revi… 2015-10-24 https:… OpEd      article      
 2 Artificial-Intellige… Science      2015-12-11 https:… Business  article      
 3 ‘Machines of Loving … Books        2015-08-21 https:… BookRevi… article      
 4 Software Is Smart En… Technology   2015-09-20 https:… Business  article      
 5 The Real Threat Pose… Technology   2015-07-11 https:… Business  article      
 6 The End of Work?      Opinion      2015-12-10 https:… OpEd      article      
 7 Toyota to Finance $5… Science      2015-09-04 https:… Business  article      
 8 Daily Report: Machin… Technology   2015-12-11 https:… Business  article      
 9 Firms Pit Artificial… Technology   2015-10-14 https:… Business  article      
10 Outing A.I.: Beyond … Opinion      2015-02-23 https:… OpEd      article
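The 2025 and 2015 requests are built by hand and differ in their query strings as well as their date ranges. One way to keep them aligned is a small request builder, sketched below. The names build_nyt_request and ai_query are hypothetical, "DEMO_KEY" is a placeholder rather than a working key, and the throttle rate is an assumption (roughly 5 requests per minute) that should be checked against the current NYT developer terms. The sketch only builds requests; nothing is sent.

```r
library(httr2)

# Hypothetical helper: same endpoint and query for every year, one page per request
build_nyt_request <- function(query, year, api_key, page = 0) {
  request("https://api.nytimes.com/svc/search/v2/articlesearch.json") |>
    req_url_query(
      q = query,
      begin_date = paste0(year, "0101"),  # January 1 of the given year
      end_date = paste0(year, "1231"),    # December 31 of the given year
      page = page,
      `api-key` = api_key
    ) |>
    req_throttle(rate = 5 / 60) |>        # assumed limit: about 5 requests per minute
    req_retry(max_tries = 3)              # retry transient failures up to 3 tries
}

# One shared query string so both years are searched with identical terms
ai_query <- "artificial intelligence OR AI OR machine learning"

demo_2015 <- build_nyt_request(ai_query, 2015, "DEMO_KEY")
demo_2025 <- build_nyt_request(ai_query, 2025, "DEMO_KEY")

# Inspect the URL without performing the request
demo_2015$url
```

Because the key travels in the query string, printing a request or response object displays it, so it is worth redacting before sharing rendered output. Passing page = 1, 2, and so on to the same helper would be one way to collect more than the first 10 results per year.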

# add year labels
nyt_2025 <- articles_df |> mutate(year = 2025)
nyt_2015 <- articles_2015 |> mutate(year = 2015)

# combine both datasets
nyt_all <- bind_rows(nyt_2015, nyt_2025)

nyt_all

# A tibble: 20 × 7
   headline        section_name pub_date   web_url news_desk document_type  year
   <chr>           <chr>        <date>     <chr>   <chr>     <chr>         <dbl>
 1 Don’t Fear the… Sunday Revi… 2015-10-24 https:… OpEd      article        2015
 2 Artificial-Int… Science      2015-12-11 https:… Business  article        2015
 3 ‘Machines of L… Books        2015-08-21 https:… BookRevi… article        2015
 4 Software Is Sm… Technology   2015-09-20 https:… Business  article        2015
 5 The Real Threa… Technology   2015-07-11 https:… Business  article        2015
 6 The End of Wor… Opinion      2015-12-10 https:… OpEd      article        2015
 7 Toyota to Fina… Science      2015-09-04 https:… Business  article        2015
 8 Daily Report: … Technology   2015-12-11 https:… Business  article        2015
 9 Firms Pit Arti… Technology   2015-10-14 https:… Business  article        2015
10 Outing A.I.: B… Opinion      2015-02-23 https:… OpEd      article        2015
11 Who Pays When … Business     2025-11-12 https:… Business  article        2025
12 Recruiters Use… Business     2025-10-07 https:… Business  article        2025
13 Yann LeCun, a … Technology   2025-11-19 https:… Business  article        2025
14 A.I. Bots or U… Books        2025-08-27 https:… BookRevi… article        2025
15 Israel’s A.I. … Technology   2025-04-25 https:… Business  article        2025
16 Jeff Bezos Cre… Technology   2025-11-17 https:… Business  article        2025
17 How Chile Embo… Technology   2025-10-20 https:… Business  article        2025
18 Why You’re Bet… Gameplay     2025-12-09 https:… Games     article        2025
19 The Fever Drea… Opinion      2025-09-03 https:… OpEd      article        2025
20 The Professors… Technology   2025-05-14 https:… Business  article        2025

# count by year and section
section_compare <- nyt_all |>
  count(year, section_name, sort = TRUE)

section_compare

# A tibble: 10 × 3
    year section_name      n
   <dbl> <chr>         <int>
 1  2025 Technology        5
 2  2015 Technology        4
 3  2015 Opinion           2
 4  2015 Science           2
 5  2025 Business          2
 6  2015 Books             1
 7  2015 Sunday Review     1
 8  2025 Books             1
 9  2025 Gameplay          1
10  2025 Opinion           1

ggplot(section_compare, aes(x = n, y = reorder(section_name, n))) +
  geom_col() +
  facet_wrap(~ year, scales = "free_y") +
  labs(
    title = "NYT Sections for AI Articles: 2015 vs 2025",
    x = "Article Count",
    y = "Section"
  )

section_totals <- section_compare |>
  group_by(year) |>
  summarize(total = sum(n))

section_totals

# A tibble: 2 × 2
   year total
  <dbl> <int>
1  2015    10
2  2025    10

section_pct <- section_compare |>
  group_by(year) |>
  mutate(
    total = sum(n),
    pct = (n / total) * 100
  ) |>
  ungroup()

section_pct

# A tibble: 10 × 5
    year section_name      n total   pct
   <dbl> <chr>         <int> <int> <dbl>
 1  2025 Technology        5    10    50
 2  2015 Technology        4    10    40
 3  2015 Opinion           2    10    20
 4  2015 Science           2    10    20
 5  2025 Business          2    10    20
 6  2015 Books             1    10    10
 7  2015 Sunday Review     1    10    10
 8  2025 Books             1    10    10
 9  2025 Gameplay          1    10    10
10  2025 Opinion           1    10    10
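To read the change per section directly, the shares can be placed side by side, one column per year. The sketch below is one possible approach rather than part of the original analysis: it re-enters the counts from the section_compare output above so that it runs on its own, and section_change is a hypothetical name. Sections missing in one year are filled with 0 so that a percentage-point difference can be computed.

```r
library(dplyr)
library(tidyr)

# Counts copied from the section_compare output above
section_compare <- tribble(
  ~year, ~section_name,   ~n,
  2025,  "Technology",    5,
  2015,  "Technology",    4,
  2015,  "Opinion",       2,
  2015,  "Science",       2,
  2025,  "Business",      2,
  2015,  "Books",         1,
  2015,  "Sunday Review", 1,
  2025,  "Books",         1,
  2025,  "Gameplay",      1,
  2025,  "Opinion",       1
)

section_change <- section_compare |>
  group_by(year) |>
  mutate(pct = n / sum(n) * 100) |>   # share of that year's sample
  ungroup() |>
  select(year, section_name, pct) |>
  pivot_wider(
    names_from = year,
    values_from = pct,
    names_prefix = "pct_",
    values_fill = 0                   # a section absent in a year counts as 0%
  ) |>
  mutate(change = pct_2025 - pct_2015) |>  # percentage-point change
  arrange(desc(change))

section_change
```

With only 10 articles per year, each article moves a section's share by 10 percentage points, so small differences in this table carry little weight.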

Conclusion

The comparison between 2015 and 2025 suggests a shift in how artificial intelligence is covered across New York Times sections. In the 2015 sample, Technology led with 4 of 10 articles, followed by Opinion and Science with 2 each, pointing to a mix of technical, research-oriented, and commentary coverage.

In the 2025 sample, the articles are again spread across five sections, but the mix is different: Business and Gameplay appear for the first time, while Science and Sunday Review drop out. This is consistent with AI moving from a specialized research topic into wider industry and public discourse.

A percentage-based comparison highlights this shift more clearly. The Business section shows the largest increase, growing from 0% in 2015 to 20% in 2025, consistent with the growing economic relevance of AI. In contrast, Science declines from 20% to 0% and Opinion from 20% to 10%, suggesting reduced emphasis on AI as a purely research-driven subject. Technology remains the dominant section in both years, increasing from 40% to 50%.

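Because this analysis is based only on the first page of API results for each year (10 articles each), and the 2025 query includes the term "language model" while the 2015 query does not, the findings should be interpreted as a small sample rather than a complete count of all published articles.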