New York Times Web APIs

In this exercise, JSON data from New York Times APIs are read and transformed into an R dataframe.

Books API

The New York Times provides an API for its best-selling books, which is the first dataset we use.

Request

First, we request data on current NYT Best Sellers. We use the httr library to retrieve the response specified by the API url.

library(httr) # Provides GET() and content()

url <- "https://api.nytimes.com/svc/books/v3/lists/current/hardcover-fiction.json?api-key=f5RHWlYOgGenjKxrLrkIkWLfVGjbZM9D"
response <- GET(url) # Send an HTTP GET request to the URL
response # Print the server response (status, content type, size)

## Response [https://api.nytimes.com/svc/books/v3/lists/current/hardcover-fiction.json?api-key=f5RHWlYOgGenjKxrLrkIkWLfVGjbZM9D]
##   Date: 2019-03-30 23:34
##   Status: 200
##   Content-Type: application/json; charset=UTF-8
##   Size: 19.6 kB
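The request above hardcodes the API key in the URL. As a minimal sketch of one alternative, the key can be read from an environment variable and the URL assembled by a small helper. The helper name `nyt_books_url` and the variable name `NYT_API_KEY` are assumptions made for this illustration; neither is part of the NYT API or of httr.

```r
# Hypothetical helper: builds a Books API list URL.
# NYT_API_KEY is an assumed environment variable name, not an NYT convention.
nyt_books_url <- function(list_name, date = "current",
                          api_key = Sys.getenv("NYT_API_KEY")) {
  if (!nzchar(api_key)) {
    stop("No API key found; set the NYT_API_KEY environment variable.")
  }
  sprintf(
    "https://api.nytimes.com/svc/books/v3/lists/%s/%s.json?api-key=%s",
    date, list_name, utils::URLencode(api_key, reserved = TRUE)
  )
}

# Passing the key explicitly here only to keep the example self-contained
example_url <- nyt_books_url("hardcover-fiction", api_key = "abc")
```

The resulting string can be passed to GET() exactly as `url` is above.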

Content

We then use the jsonlite package to parse the JSON response body into a dataframe of books. One column, “isbns”, is nested: each row holds its own small table of ISBN-10 and ISBN-13 values. We expand that column and then select the relevant columns to preview the data.

json <- content(response, as="text")
res <- jsonlite::fromJSON(json)
books <- res[["results"]][["books"]]
colnames(books)

##  [1] "rank"                 "rank_last_week"       "weeks_on_list"       
##  [4] "asterisk"             "dagger"               "primary_isbn10"      
##  [7] "primary_isbn13"       "publisher"            "description"         
## [10] "price"                "title"                "author"              
## [13] "contributor"          "contributor_note"     "book_image"          
## [16] "book_image_width"     "book_image_height"    "amazon_product_url"  
## [19] "age_group"            "book_review_link"     "first_chapter_link"  
## [22] "sunday_review_link"   "article_chapter_link" "isbns"               
## [25] "buy_links"

library(dplyr) # select() and %>%
library(tidyr) # unnest()
library(DT)    # datatable()

# Pull the ISBN vectors out of each book's nested "isbns" data frame
books$isbn10 <- lapply(books$isbns, function(x) x[["isbn10"]])
books$isbn13 <- lapply(books$isbns, function(x) x[["isbn13"]])

books <- books %>%
  unnest(c(isbn10, isbn13)) %>% # One row per ISBN pair
  select(rank, weeks_on_list, author, title, amazon_product_url, isbn10, isbn13)
datatable(books)
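To make the expansion of the nested column easier to follow, here is a small offline sketch. It uses a made-up JSON string that only imitates the shape of the Books API response (the titles and ISBNs are invented) and flattens it with base R rather than tidyr.

```r
library(jsonlite)

# Toy JSON imitating the shape of the Books API response (values are made up)
toy <- '{"results": {"books": [
  {"rank": 1, "title": "BOOK A",
   "isbns": [{"isbn10": "1111111111", "isbn13": "9781111111111"},
             {"isbn10": "2222222222", "isbn13": "9782222222222"}]},
  {"rank": 2, "title": "BOOK B",
   "isbns": [{"isbn10": "3333333333", "isbn13": "9783333333333"}]}
]}}'

books_toy <- fromJSON(toy)[["results"]][["books"]]

# "isbns" is a list column holding one small data frame per book
str(books_toy$isbns)

# Repeat each book row once per ISBN, then bind the stacked ISBN columns
n <- vapply(books_toy$isbns, nrow, integer(1))
flat <- cbind(
  books_toy[rep(seq_len(nrow(books_toy)), n), c("rank", "title")],
  do.call(rbind, books_toy$isbns)
)
rownames(flat) <- NULL
flat
```

The result has one row per ISBN pair, so a book with two editions appears twice, which matches what unnesting the real data produces.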

Popular Articles API

Another API lists the most popular New York Times articles, ranked by how often they were emailed, shared on Facebook, or viewed. The number in each URL path below is the time period in days (1, 7, or 30).

Most Emailed

emailed <- content(GET("https://api.nytimes.com/svc/mostpopular/v2/emailed/7.json?api-key=f5RHWlYOgGenjKxrLrkIkWLfVGjbZM9D"), as="text")
emailed <- jsonlite::fromJSON(emailed)
datatable(as.data.frame(emailed))

Most Shared

shared <- content(GET("https://api.nytimes.com/svc/mostpopular/v2/shared/1/facebook.json?api-key=f5RHWlYOgGenjKxrLrkIkWLfVGjbZM9D"), as="text")
shared <- jsonlite::fromJSON(shared)
datatable(as.data.frame(shared))

Most Viewed

viewed <- content(GET("https://api.nytimes.com/svc/mostpopular/v2/viewed/1.json?api-key=f5RHWlYOgGenjKxrLrkIkWLfVGjbZM9D"), as="text")
viewed <- jsonlite::fromJSON(viewed)
datatable(as.data.frame(viewed))
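The three requests above differ only in their URL path, so they could be folded into one helper. The following is a sketch under stated assumptions: `popular_url`, `fetch_popular`, and the `NYT_API_KEY` environment variable are hypothetical names introduced for illustration, and the fetch function relies on httr and jsonlite being installed.

```r
# Hypothetical helper: builds a Most Popular API URL from its path parts.
# NYT_API_KEY is an assumed environment variable name.
popular_url <- function(type = c("emailed", "shared", "viewed"),
                        period = 7, share_type = NULL,
                        api_key = Sys.getenv("NYT_API_KEY")) {
  type <- match.arg(type)
  stopifnot(period %in% c(1, 7, 30))
  # share_type (e.g. "facebook") is appended only when supplied
  path <- paste(c(type, period, share_type), collapse = "/")
  sprintf("https://api.nytimes.com/svc/mostpopular/v2/%s.json?api-key=%s",
          path, api_key)
}

# Hypothetical wrapper: requests the URL and parses the JSON body
fetch_popular <- function(...) {
  response <- httr::GET(popular_url(...))
  httr::stop_for_status(response) # Fail early on a non-2xx status
  jsonlite::fromJSON(httr::content(response, as = "text", encoding = "UTF-8"))
}

shared_example <- popular_url("shared", 1, "facebook", api_key = "abc")
```

With a valid key set, `fetch_popular("emailed", 7)` would correspond to the Most Emailed request, and `fetch_popular("viewed", 1)` to the Most Viewed one.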