Querying the New York Times API

Author

By Tony Fraser

Published

November 5, 2023

Assignment

Your task is to choose one of the New York Times APIs, construct an interface in R to read in the JSON data, and transform it into an R DataFrame.

About this code

I chose to integrate with the Article Search API. If you send a query with a search term, it returns a JSON object with metadata about each matching article, including author, headline, date, and description.
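To make the shape of that JSON concrete, here is a minimal sketch that parses a hand-written sample. The sample is an assumption: it contains only the fields this report uses, and every value in it is a placeholder rather than real API output. It shows why the flattened data frame ends up with dotted column names such as `headline.main`.

```r
library(jsonlite)

# Made-up sample shaped like an Article Search response.
# All values are placeholders, not real API output.
sample_json <- '{
  "status": "OK",
  "response": {
    "docs": [
      {"web_url": "https://example.com/a",
       "pub_date": "2020-01-02T03:04:05+0000",
       "headline": {"main": "Placeholder headline A"},
       "byline": {"original": "By Placeholder Author"}},
      {"web_url": "https://example.com/b",
       "pub_date": "2019-06-07T08:09:10+0000",
       "headline": {"main": "Placeholder headline B"},
       "byline": {"original": null}}
    ]
  }
}'

# fromJSON() turns the array of documents into a data frame whose
# "headline" and "byline" columns are themselves data frames.
parsed <- fromJSON(sample_json)

# jsonlite::flatten() expands nested data frames into dotted columns.
docs <- jsonlite::flatten(parsed$response$docs)
print(names(docs))
```

A JSON `null` byline becomes `NA` in the data frame, which is where NA values in a byline column come from.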

What this code does and does not do:

  • It does not analyze the returned results. The assignment does not ask for that.
  • It handles API errors: any response other than HTTP 200 (a server 500, for example) stops with a message that includes the status code.
  • It loads the API key from a text file that is not checked into GitHub.
  • It formats a link to each article, but reading the article requires an NYT subscription.
  • It does not paginate. There may be 1,000 results, but this code shows only the first 10 returned.
  • It can persist the raw results as a JSON file, which is very useful for further integration.
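Pagination is outside the scope of this report, but a rough sketch of one way to add it follows. It assumes the Article Search API accepts a zero-based `page` query parameter and returns 10 documents per page. The helpers `fetch_nyt_page` and `fetch_nyt_pages` are hypothetical names, and the 12-second pause is a guess at a polite request rate, so check the limits attached to your own key.

```r
library(httr)
library(jsonlite)
library(dplyr)

# Hypothetical helper: fetch one page of results as a flat data frame.
# Assumes `page` is zero-based and each page holds 10 documents.
fetch_nyt_page <- function(query, api_key, page) {
  response <- GET(
    url = "https://api.nytimes.com/svc/search/v2/articlesearch.json",
    query = list(q = query, page = page, `api-key` = api_key)
  )
  stop_for_status(response)  # raise an R error on any 4xx/5xx status
  parsed <- fromJSON(content(response, as = "text", encoding = "UTF-8"))
  docs <- parsed$response$docs
  # A page with no documents does not parse to a data frame.
  if (!is.data.frame(docs)) return(data.frame())
  jsonlite::flatten(docs)
}

# Hypothetical helper: collect pages 0 .. (n_pages - 1) into one data frame.
# `fetch_page` can be swapped for a stub, which keeps the loop testable
# without network access.
fetch_nyt_pages <- function(query, api_key, n_pages,
                            fetch_page = fetch_nyt_page,
                            pause_seconds = 12) {
  pages <- vector("list", n_pages)
  for (i in seq_len(n_pages)) {
    pages[[i]] <- fetch_page(query, api_key, page = i - 1)
    # Pause between requests, but not after the last one.
    if (i < n_pages) Sys.sleep(pause_seconds)
  }
  bind_rows(pages)
}

# Example call (needs a valid key and network access):
# first_30 <- fetch_nyt_pages("swing dancing", api_key, n_pages = 3)
```

Passing the page fetcher in as an argument is a design choice made for this sketch. It lets the loop be exercised with fake data before spending any of the daily request quota.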

Query the API

###############################
# Set variables
search_term <- "swing dancing"
headline_length <- 70
###############################

packages <- c("httr", "jsonlite", "dplyr", "gt", "lubridate", "stringr")
# invisible() suppresses the list that lapply() would otherwise print.
invisible(lapply(packages, library, character.only = TRUE))

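# Read the API key from the first line of a local file that is kept out of git.
get_api_key <- function(api_key_path) {
  if (!file.exists(api_key_path)) {
    stop("API key file does not exist: ", api_key_path)
  }
  # trimws() guards against stray whitespace around the key.
  return(trimws(readLines(api_key_path, warn = FALSE, n = 1)))
}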

# Query the Article Search API and return the parsed JSON as an R list.
search_nyt <- function(query, api_key, persist_results = FALSE) {
  base_url <- "https://api.nytimes.com/svc/search/v2/articlesearch.json"
  params <- list('q' = query, 'api-key' = api_key)
  response <- GET(url = base_url, query = params)
  # Anything other than HTTP 200 is treated as a failure.
  if (status_code(response) != 200) {
    stop(
      paste("Request failed with HTTP StatusCode:", status_code(response)))
  }
  # Keep the body as raw text so it can be saved before it is parsed.
  body_text <- content(response, as = "text", encoding = "UTF-8")
  if (persist_results) {
    json_file_path <- paste0("nogit_", query, ".json")
    writeLines(body_text, json_file_path)
    print(paste("JSON content saved to", json_file_path))
  }
  return(fromJSON(body_text))
}
api_key <- get_api_key("nogit_nytimes_api_key.txt")

results_df <- search_nyt(query = search_term, 
                         api_key = api_key, 
                         persist_results = FALSE) %>%
  # Expand nested columns such as headline$main into headline.main.
  { jsonlite::flatten(.$response$docs) } %>%
  as.data.frame() %>%
  mutate(pub_date = ymd_hms(pub_date),
         pub_date_str = format(pub_date, "%Y-%m-%d")) %>%
  arrange(desc(pub_date)) %>%
  # sprintf() and if_else() are vectorized, so rowwise() is not needed.
  mutate(link = sprintf('<a href="%s" target="_blank">view</a>', web_url)) %>%
  # Shorten long headlines to headline_length characters, ending in "...".
  mutate(`headline.main` = 
      if_else(str_length(`headline.main`) > headline_length,
              str_sub(`headline.main`, 1, headline_length - 3) %>% str_c("..."),
              `headline.main`)) %>%
  select(c(`headline.main`, `byline.original`, pub_date_str, link))
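
The search function stops with an error on any non-200 status. A caller that wants to survive a transient failure could wrap the call in a small retry helper. The sketch below is an assumption about how that might look: `with_retry` is a hypothetical name, retrying is not something the code in this report does, and a stub that fails twice stands in for the real request.

```r
# Hypothetical helper: call fn() up to max_attempts times, pausing between
# failed attempts, and re-raise the last error if every attempt fails.
with_retry <- function(fn, max_attempts = 3, wait_seconds = 0) {
  last_error <- NULL
  for (attempt in seq_len(max_attempts)) {
    result <- tryCatch(
      list(ok = TRUE, value = fn()),
      error = function(e) list(ok = FALSE, error = e)
    )
    if (result$ok) return(result$value)
    last_error <- result$error
    message("Attempt ", attempt, " failed: ", conditionMessage(last_error))
    if (attempt < max_attempts) Sys.sleep(wait_seconds)
  }
  stop(last_error)
}

# Stub standing in for the API call: fails twice, then succeeds.
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("Request failed with HTTP StatusCode: 500")
  "ok"
}

value <- with_retry(flaky, max_attempts = 3)

# With the real function (needs a valid key and network access):
# raw <- with_retry(function() search_nyt(search_term, api_key),
#                   wait_seconds = 12)
```

Retrying only makes sense for failures that may clear on their own, such as a server 500. A rejected API key will fail the same way on every attempt.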

Display the results

results_df %>% 
  gt() %>%
  # Render the HTML anchor in the link column instead of showing it as text.
  fmt_markdown(
    columns = c(link)
  ) %>%
  cols_width(
    headline.main ~ px(400), 
    byline.original ~ px(250)
  ) %>%
  cols_label(
    headline.main = "Headline",
    byline.original = "Byline",
    pub_date_str = "Date",
    link = "Link"
  )

| Headline | Byline | Date | Link |
|---|---|---|---|
| Chefs Join the Swing Dancers at Damrosch Park | By Florence Fabricant | 2017-06-26 | |
| Too Much in Love to Say Good Night | By John Leland | 2015-09-03 | |
| Swing Dancing in a Wheelchair | By Alan Robbins | 2014-08-01 | |
| Swing Dancing in Queens, Free Dancing at Ailey | By Rachel Lee Harris | 2010-11-25 | |
| If It's Sunday, It's the Cat Club: Swing Dancing Takes Over Again | NA | 1990-12-30 | |
| The Pageant in Full Swing; DANCE AND SKYLARK. By John Moore. 215 pp... | ISABELLE MALLET | 1952-08-03 | |
| Ice Carnival Thrills Spectators in Inaugural at Garden; SKATING STA... | By Lincoln A. Werden | 1938-11-30 | |
| Denounce Swing Dances | NA | 1938-11-24 | |
| REICH ADDS TO LOAN TO MEET DEMAND; 1,000,000,000-Mark Issue Is Rais... | By Otto D. Tolischuswireless To the New York Times | 1938-05-06 | |
| NEW SWING DANCE THRILLS TEACHERS; 'The Fleet,' in Latest Rhythm, Ex... | NA | 1936-02-24 | |