library(tidyverse)
library(knitr)
library(httr)
library(jsonlite)

Introduction / description

For this assignment, the task was to choose one of the New York Times APIs, construct an interface in R to read the JSON data it returns, and transform that data into an R data frame.

Overview of approach

I chose the “Most Popular” API, which provides information on the most popular NY Times articles. I retrieved the most viewed articles for the last day; a longer period can be requested by changing the number of days in the API call (the 1 in viewed/1.json). Using the API required first setting up a NY Times developer account and then enabling the chosen API for an app, at which point an API key was assigned.

Link to more information: https://developer.nytimes.com/docs/most-popular-product/1/overview
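As a minimal sketch of how the period in the API call could be varied, the helper below builds the request URL from a period and a key. The function name build_most_popular_url is hypothetical, "YOUR_API_KEY" is a placeholder, and the allowed periods (1, 7, or 30 days) reflect my reading of the documentation linked above rather than anything tested in this report.

```r
# Hypothetical helper (not part of the original analysis): build the request
# URL for the "viewed" endpoint of the Most Popular API.
build_most_popular_url <- function(period = 1, api_key) {
  # Assumption: the API accepts periods of 1, 7, or 30 days
  stopifnot(period %in% c(1, 7, 30))
  sprintf(
    "https://api.nytimes.com/svc/mostpopular/v2/viewed/%d.json?api-key=%s",
    as.integer(period), api_key
  )
}

# "YOUR_API_KEY" is a placeholder, not a real key
build_most_popular_url(7, "YOUR_API_KEY")
```

The resulting string can be passed to GET in place of a hand-written URL.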

Retrieve the data

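The data is retrieved with the GET function from the httr package, which returns a response object whose body holds the JSON as raw bytes. The body is converted to text with rawToChar and then parsed with the fromJSON function from the jsonlite package.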

# Send GET request
res <- GET("https://api.nytimes.com/svc/mostpopular/v2/viewed/1.json?api-key=9OzGBXzGGQI4Ql8IEMXVM2CiMTPp63T2")

# Confirm it was successful - code 200 is desired
cat("The status returned is:", res$status_code)
## The status returned is: 200
# Convert JSON data
data <- fromJSON(rawToChar(res$content))

# Check to see what came back
names(data)
## [1] "status"      "copyright"   "num_results" "results"
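The request above embeds the API key in the URL and only prints the status code. A possible variation, sketched below, reads the key from an environment variable and stops with an error if the request fails. The helper names get_api_key and fetch_most_viewed and the variable name NYT_API_KEY are assumptions made for illustration; the httr and jsonlite functions are called with their package prefix so the block does not depend on the library calls at the top of the report.

```r
# Hypothetical helper: read the key from an environment variable.
# The variable name NYT_API_KEY is an assumption, not something the API requires.
get_api_key <- function(var = "NYT_API_KEY") {
  key <- Sys.getenv(var)
  if (!nzchar(key)) stop("Environment variable ", var, " is not set")
  key
}

# Hypothetical wrapper around the same request as above
fetch_most_viewed <- function(api_key = get_api_key()) {
  res <- httr::GET(
    "https://api.nytimes.com/svc/mostpopular/v2/viewed/1.json",
    query = list(`api-key` = api_key)  # httr appends this as ?api-key=...
  )
  httr::stop_for_status(res)  # raise an R error on a 4xx or 5xx response
  jsonlite::fromJSON(httr::content(res, as = "text", encoding = "UTF-8"))
}
```

With the key stored via Sys.setenv(NYT_API_KEY = "...") or in a .Renviron file, calling fetch_most_viewed() should return the same list as data above, without the key appearing in the published report.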

Structure the data

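The article records are in the results element. Because fromJSON by default simplifies a JSON array of records into a data frame, results already holds one row per article, and it can be placed in its own data frame as follows.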

# Create the data frame
df <- data.frame(data$results, stringsAsFactors = FALSE)
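To illustrate why this step is so short, the sketch below parses a small, made-up JSON string that mimics the shape of the API response (the field values are invented for illustration and are not real API output). It shows that fromJSON, with its default settings, turns the array of records under results into a data frame.

```r
library(jsonlite)

# Made-up payload that mimics the shape of the API response (illustration only)
sample_json <- '{
  "status": "OK",
  "num_results": 2,
  "results": [
    {"title": "First headline",  "section": "World", "published_date": "2020-10-19"},
    {"title": "Second headline", "section": "Arts",  "published_date": "2020-10-20"}
  ]
}'

parsed <- fromJSON(sample_json)

# The array of objects under "results" has been simplified to a data frame
is.data.frame(parsed$results)
names(parsed$results)
```

In the real response some fields are nested, so a few columns of the resulting data frame may be lists rather than plain vectors; the flat columns used in the summary (title, section, published_date) are not affected.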

View the data

# View summary of top articles
df %>% 
  select(title, section, published_date) %>% 
  rename("Title" = title, "Section" = section, "Published Date" = published_date) %>% 
  kable
|Title |Section |Published Date |
|:-----|:-------|:--------------|
|Trump Records Shed New Light on Chinese Business Pursuits |U.S. |2020-10-20 |
|New Yorker Suspends Jeffrey Toobin After Zoom Incident |Business |2020-10-19 |
|Trump Is Giving Up |Opinion |2020-10-20 |
|Trump cuts a ‘60 Minutes’ interview short, and then taunts Lesley Stahl on Twitter. |U.S. |2020-10-20 |
|Trump Taunts Lesley Stahl of ‘60 Minutes’ After Cutting Off Interview |U.S. |2020-10-20 |
|U.S. Accuses Google of Illegally Protecting Monopoly |Technology |2020-10-20 |
|U.S. Diplomats and Spies Battle Trump Administration Over Suspected Attacks |U.S. |2020-10-19 |
|2,000-Year-Old Cat Etching Found at Nazca Lines Site in Peru |World |2020-10-19 |
|As McConnell advises White House against pre-election stimulus deal, Pelosi and Mnuchin make headway in talks. |Business |2020-10-20 |
|Justice Dept. Says Trump’s Denial of Rape Accusation Was an Official Act |New York |2020-10-19 |
|Voters Prefer Biden Over Trump on Almost All Major Issues, Poll Shows |U.S. |2020-10-20 |
|Trump Calls Fauci ‘a Disaster’ and Shrugs Off Virus as Infections Soar |U.S. |2020-10-19 |
|5 Ways Families Can Prepare as Coronavirus Cases Surge |Parenting |2020-10-19 |
|Michigan Woman Found Alive at Funeral Home Dies 8 Weeks Later |U.S. |2020-10-19 |
|Democrats Gain in Georgia Senate Races as Presidential Race Remains Tied |The Upshot |2020-10-20 |
|Doctors May Have Found Secretive New Organs in the Center of Your Head |Health |2020-10-19 |
|A Gated Community in N.Y.C. Where Trump Flags Fly |New York |2020-10-20 |
|After Teacher’s Decapitation, France Unleashes a Broad Crackdown on ‘the Enemy Within’ |World |2020-10-19 |
|The Real Divide in America Is Between Political Junkies and Everyone Else |Opinion |2020-10-20 |
|Eminem Earns His 10th Straight No. 1 Album |Arts |2020-01-27 |

# Confirm that the data is in a data frame (objective of the assignment)
truefalse <- is.data.frame(df)
cat("Is this a dataframe?", truefalse)
## Is this a dataframe? TRUE

Conclusion

Using a combination of the httr and jsonlite packages, we can retrieve JSON data from a web API, parse it, and work with it as a data frame in R, in this case with data from the NY Times Most Popular API.