In this exercise, we read JSON data from the New York Times APIs and transform it into an R data frame.

Books API

The New York Times provides an API for its best-selling books, which is the first dataset we use.

Request

First, we request data on the current NYT Best Sellers. We use the httr package to retrieve the response from the API URL.

library(httr)

url <- "https://api.nytimes.com/svc/books/v3/lists/current/hardcover-fiction.json?api-key=f5RHWlYOgGenjKxrLrkIkWLfVGjbZM9D"
response <- GET(url) # Retrieve the resource at the URL
response # Print the server response
## Response [https://api.nytimes.com/svc/books/v3/lists/current/hardcover-fiction.json?api-key=f5RHWlYOgGenjKxrLrkIkWLfVGjbZM9D]
##   Date: 2019-03-30 23:34
##   Status: 200
##   Content-Type: application/json; charset=UTF-8
##   Size: 19.6 kB
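The request step can also be wrapped in small helpers so that the API key is not hard-coded and a failed request stops the script early. The following is only a sketch: `build_books_url`, `fetch_books`, and the environment variable name `NYT_API_KEY` are hypothetical names introduced for illustration, and the URL layout is simply copied from the request above.

```r
library(httr)

# Hypothetical helper: build the request URL for one best-seller list.
# `list_name` and `api_key` are illustrative parameter names.
build_books_url <- function(list_name, api_key) {
  modify_url(
    "https://api.nytimes.com",
    path = paste0("svc/books/v3/lists/current/", list_name, ".json"),
    query = list(`api-key` = api_key)
  )
}

# Hypothetical helper: request the list and raise an R error on a 4xx/5xx status.
# Assumes the key is stored in an environment variable named NYT_API_KEY.
fetch_books <- function(list_name, api_key = Sys.getenv("NYT_API_KEY")) {
  response <- GET(build_books_url(list_name, api_key))
  stop_for_status(response)
  response
}

# Building the URL needs no network access, so it can be inspected directly.
example_url <- build_books_url("hardcover-fiction", "YOUR_KEY")
example_url
```

With a key set in the environment, `fetch_books("hardcover-fiction")` would return the same kind of response object as the `GET(url)` call above.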

Content

We then use the jsonlite package to parse the JSON response body into a data frame. One column, “isbns”, is nested: for each book it holds a table of ISBN-10 and ISBN-13 values. We expand that column and then select the relevant columns to preview the data.

json <- content(response, as="text")
res <- jsonlite::fromJSON(json)
books <- res[["results"]][["books"]]
colnames(books)
##  [1] "rank"                 "rank_last_week"       "weeks_on_list"       
##  [4] "asterisk"             "dagger"               "primary_isbn10"      
##  [7] "primary_isbn13"       "publisher"            "description"         
## [10] "price"                "title"                "author"              
## [13] "contributor"          "contributor_note"     "book_image"          
## [16] "book_image_width"     "book_image_height"    "amazon_product_url"  
## [19] "age_group"            "book_review_link"     "first_chapter_link"  
## [22] "sunday_review_link"   "article_chapter_link" "isbns"               
## [25] "buy_links"
library(dplyr); library(tidyr); library(DT)
# Extract the ISBN-10 and ISBN-13 vectors from each book's nested "isbns" table
books$isbn10 <- lapply(books$isbns, function(d) d[["isbn10"]])
books$isbn13 <- lapply(books$isbns, function(d) d[["isbn13"]])

# Expand to one row per ISBN pair (tidyr >= 1.0.0 syntax), then keep the preview columns
books <- books %>%
  unnest(cols = c(isbn10, isbn13)) %>%
  select(rank, weeks_on_list, author, title, amazon_product_url, isbn10, isbn13)
datatable(books)
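
The handling of the nested “isbns” column can be tried out without calling the API. The sketch below uses a small hand-written JSON sample that mimics the shape of the books array; the titles and ISBN values are made up, and the expansion is done in base R rather than with tidyr, purely to show what the step does.

```r
library(jsonlite)

# Hand-written sample mimicking the shape of the books array.
# Titles and ISBN values are invented for illustration.
sample_json <- '[
  {"rank": 1, "title": "BOOK A",
   "isbns": [{"isbn10": "1111111111", "isbn13": "9781111111111"},
             {"isbn10": "2222222222", "isbn13": "9782222222222"}]},
  {"rank": 2, "title": "BOOK B",
   "isbns": [{"isbn10": "3333333333", "isbn13": "9783333333333"}]}
]'

sample_books <- fromJSON(sample_json)

# fromJSON simplifies the outer array to a data frame; the nested arrays
# become a list column holding one small data frame per book.
class(sample_books$isbns)

# Base-R expansion: repeat each book's scalar columns once per ISBN row,
# then bind the stacked ISBN tables alongside them.
n_isbns <- vapply(sample_books$isbns, nrow, integer(1))
flat <- cbind(
  sample_books[rep(seq_len(nrow(sample_books)), n_isbns), c("rank", "title")],
  do.call(rbind, sample_books$isbns)
)
rownames(flat) <- NULL
flat
```

Each book now appears once per ISBN pair, which is the same one-row-per-ISBN layout that the unnesting step produces for the real data.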