library(tidyverse)
library(knitr)
library(httr)
library(jsonlite)
For this assignment, our task was to choose one of the New York Times APIs, construct an interface in R to read the JSON data, and transform it into an R data frame.
I chose the “Most Popular” API, which provides information on the most popular NY Times articles. I retrieved the most-viewed articles for the last day, though the time period can be extended by modifying the API call. Using the API required first setting up a NY Times developer account and then selecting the API I wanted to use, for which an API key was assigned.
Link to more information: https://developer.nytimes.com/docs/most-popular-product/1/overview
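As a rough sketch of how the time period could be changed, the helper below builds the request URL for a given number of days. The function name `most_popular_url` is hypothetical, and the allowed periods (1, 7, or 30 days) are assumed from the API documentation linked above, so they should be checked there before relying on them. The API key still has to be added as the `api-key` query parameter.

```r
# Hypothetical helper (not part of the original analysis):
# builds the "most viewed" endpoint URL for a given period in days.
most_popular_url <- function(period = 1) {
  # Assumption: the API accepts only these periods
  allowed <- c(1, 7, 30)
  if (!period %in% allowed) {
    stop("period must be one of 1, 7, or 30", call. = FALSE)
  }
  sprintf(
    "https://api.nytimes.com/svc/mostpopular/v2/viewed/%d.json",
    as.integer(period)
  )
}

# URL for the last 7 days instead of the last day
most_popular_url(7)
```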
The data is retrieved with the GET function from the httr package, which returns a response whose body is raw JSON. The body is then converted into R objects with the fromJSON function from the jsonlite package.
# Send GET request
res <- GET("https://api.nytimes.com/svc/mostpopular/v2/viewed/1.json?api-key=9OzGBXzGGQI4Ql8IEMXVM2CiMTPp63T2")
# Confirm it was successful - code 200 is desired
cat("The status returned is:", res$status_code)
## The status returned is: 200
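One possible variation on the request above keeps the API key out of the script and turns a failed request into an R error instead of only printing the status. This is a minimal sketch, not the method used in this write-up: the function name `get_most_viewed` and the environment variable name `NYT_API_KEY` are assumptions made for illustration. `GET()` accepts a `query` list, and `stop_for_status()` is httr's built-in check for HTTP error codes.

```r
library(httr)

# Hypothetical helper; NYT_API_KEY is an assumed environment variable name.
get_most_viewed <- function(api_key = Sys.getenv("NYT_API_KEY")) {
  # Fail early if no key was supplied
  if (!nzchar(api_key)) {
    stop("No API key found; set NYT_API_KEY first.", call. = FALSE)
  }
  # httr encodes the query list and appends it to the URL
  res <- GET(
    "https://api.nytimes.com/svc/mostpopular/v2/viewed/1.json",
    query = list(`api-key` = api_key)
  )
  # Convert any HTTP error status (4xx or 5xx) into an R error
  stop_for_status(res)
  res
}
```

With the key stored in the environment (for example in `.Renviron`), calling `res <- get_most_viewed()` would stand in for the hard-coded URL, and the rest of the steps would be unchanged.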
# Convert JSON data
data <- fromJSON(rawToChar(res$content))
# Check to see what came back
names(data)
## [1] "status" "copyright" "num_results" "results"
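To illustrate what fromJSON does with a response of this shape, the sketch below parses a small hand-written JSON string. The sample values are made up for illustration and are not real API output; only the four top-level field names are taken from the result above. By default, fromJSON simplifies a JSON array of objects into a data frame, which is why the `results` element is already tabular.

```r
library(jsonlite)

# Made-up sample with the same top-level fields as the real response
sample_json <- '{
  "status": "OK",
  "copyright": "placeholder",
  "num_results": 2,
  "results": [
    {"title": "Example A", "section": "U.S."},
    {"title": "Example B", "section": "World"}
  ]
}'

parsed <- fromJSON(sample_json)

# Top-level fields become named list elements
names(parsed)

# The array of article objects is simplified into a data frame
is.data.frame(parsed$results)
parsed$results$title
```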
The article records are held in the results element, which can be extracted and placed in a data frame as follows.
# Create the data frame
df <- data.frame(data$results, stringsAsFactors = FALSE)
# View summary of top articles
df %>%
  select(title, section, published_date) %>%
  rename("Title" = title, "Section" = section, "Published Date" = published_date) %>%
  kable()
| Title | Section | Published Date |
|---|---|---|
| Trump Records Shed New Light on Chinese Business Pursuits | U.S. | 2020-10-20 |
| New Yorker Suspends Jeffrey Toobin After Zoom Incident | Business | 2020-10-19 |
| Trump Is Giving Up | Opinion | 2020-10-20 |
| Trump cuts a ‘60 Minutes’ interview short, and then taunts Lesley Stahl on Twitter. | U.S. | 2020-10-20 |
| Trump Taunts Lesley Stahl of ‘60 Minutes’ After Cutting Off Interview | U.S. | 2020-10-20 |
| U.S. Accuses Google of Illegally Protecting Monopoly | Technology | 2020-10-20 |
| U.S. Diplomats and Spies Battle Trump Administration Over Suspected Attacks | U.S. | 2020-10-19 |
| 2,000-Year-Old Cat Etching Found at Nazca Lines Site in Peru | World | 2020-10-19 |
| As McConnell advises White House against pre-election stimulus deal, Pelosi and Mnuchin make headway in talks. | Business | 2020-10-20 |
| Justice Dept. Says Trump’s Denial of Rape Accusation Was an Official Act | New York | 2020-10-19 |
| Voters Prefer Biden Over Trump on Almost All Major Issues, Poll Shows | U.S. | 2020-10-20 |
| Trump Calls Fauci ‘a Disaster’ and Shrugs Off Virus as Infections Soar | U.S. | 2020-10-19 |
| 5 Ways Families Can Prepare as Coronavirus Cases Surge | Parenting | 2020-10-19 |
| Michigan Woman Found Alive at Funeral Home Dies 8 Weeks Later | U.S. | 2020-10-19 |
| Democrats Gain in Georgia Senate Races as Presidential Race Remains Tied | The Upshot | 2020-10-20 |
| Doctors May Have Found Secretive New Organs in the Center of Your Head | Health | 2020-10-19 |
| A Gated Community in N.Y.C. Where Trump Flags Fly | New York | 2020-10-20 |
| After Teacher’s Decapitation, France Unleashes a Broad Crackdown on ‘the Enemy Within’ | World | 2020-10-19 |
| The Real Divide in America Is Between Political Junkies and Everyone Else | Opinion | 2020-10-20 |
| Eminem Earns His 10th Straight No. 1 Album | Arts | 2020-01-27 |
# Confirm that the data is in a data frame (objective of the assignment)
truefalse <- is.data.frame(df)
cat("Is this a dataframe?", truefalse)
## Is this a dataframe? TRUE
Using a combination of the httr and jsonlite packages, we can retrieve, format, and use data from APIs, in this case the NY Times Most Popular API.