Assignment06

Introduction

The goal of this R Markdown document is to take one of the New York Times APIs, build an interface in R that reads its JSON data, and transform that data into an R data frame. The chosen API is the Books API, and the data retrieved from it is the current best sellers list for graphic books and manga.

Import Libraries

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(jsonlite)

## 
## Attaching package: 'jsonlite'
## 
## The following object is masked from 'package:purrr':
## 
##     flatten

library(httr)

Import Web API

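# API key assigned to my account on Times Developers; read from an environment
# variable so the key is not published with the report
# (the variable name NYT_API_KEY is an assumption; set it in .Renviron)
api_key <- Sys.getenv("NYT_API_KEY")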

# url from API site
books_url <- paste0("https://api.nytimes.com/svc/books/v3/lists/current/graphic-books-and-manga.json?api-key=", api_key)

# response with httr's GET
response <- GET(books_url)

# extract the response body as UTF-8 text, then parse the JSON
json_data <- content(response, "text", encoding = "UTF-8")
parsed_data <- fromJSON(json_data)

# take a look at the data to see what transformations are needed
glimpse(parsed_data)

## List of 5
##  $ status       : chr "OK"
##  $ copyright    : chr "Copyright (c) 2024 The New York Times Company.  All Rights Reserved."
##  $ num_results  : int 15
##  $ last_modified: chr "2024-03-07T00:00:10-05:00"
##  $ results      :List of 12
##   ..$ list_name                 : chr "Graphic Books and Manga"
##   ..$ list_name_encoded         : chr "graphic-books-and-manga"
##   ..$ bestsellers_date          : chr "2024-03-02"
##   ..$ published_date            : chr "2024-03-17"
##   ..$ published_date_description: chr "latest"
##   ..$ next_published_date       : chr ""
##   ..$ previous_published_date   : chr "2024-02-01"
##   ..$ display_name              : chr "Graphic Books and Manga"
##   ..$ normal_list_ends_at       : int 15
##   ..$ updated                   : chr "MONTHLY"
##   ..$ books                     :'data.frame':   15 obs. of  26 variables:
##   .. ..$ rank                : int [1:15] 1 2 3 4 5 6 7 8 9 10 ...
##   .. ..$ rank_last_week      : int [1:15] 0 0 0 0 0 0 0 0 0 0 ...
##   .. ..$ weeks_on_list       : int [1:15] 0 0 0 0 0 0 0 0 0 0 ...
##   .. ..$ asterisk            : int [1:15] 0 0 0 0 0 0 0 0 0 0 ...
##   .. ..$ dagger              : int [1:15] 0 0 0 0 0 0 0 0 0 0 ...
##   .. ..$ primary_isbn10      : chr [1:15] "0545828651" "1338730924" "1974743586" "" ...
##   .. ..$ primary_isbn13      : chr [1:15] "9780545828659" "9781338730920" "9781974743582" "9781339050058" ...
##   .. ..$ publisher           : chr [1:15] "Scholastic" "Scholastic" "VIZ Media" "Scholastic" ...
##   .. ..$ description         : chr [1:15] "The ninth book in the Amulet series. A threat of darkness tests new friends and old foes." "The seventh book in the Wings of Fire graphic novel series. Winter must face his family in the Ice Kingdom." "Asa and Denji experience extreme social awkwardness when they go on a date." "The fifth book in the Cat Kid Comic Club series. As publication seems possible, the baby frogs feel doubt." ...
##   .. ..$ price               : chr [1:15] "0.00" "0.00" "0.00" "0.00" ...
##   .. ..$ title               : chr [1:15] "WAVERIDER" "WINTER TURNING" "CHAINSAW MAN, VOL. 14" "INFLUENCERS" ...
##   .. ..$ author              : chr [1:15] "Kazu Kibuishi" "Tui T. Sutherland." "Tatsuki Fujimoto" "Dav Pilkey" ...
##   .. ..$ contributor         : chr [1:15] "by Kazu Kibuishi" "by Tui T. Sutherland. Illustrated by Mike Holmes" "by Tatsuki Fujimoto" "by Dav Pilkey" ...
##   .. ..$ contributor_note    : chr [1:15] "" "Illustrated by Mike Holmes" "" "" ...
##   .. ..$ book_image          : chr [1:15] "https://storage.googleapis.com/du-prd/books/images/9780545828659.jpg" "https://storage.googleapis.com/du-prd/books/images/9781338730920.jpg" "https://storage.googleapis.com/du-prd/books/images/9781974743582.jpg" "https://storage.googleapis.com/du-prd/books/images/9781338896398.jpg" ...
##   .. ..$ book_image_width    : int [1:15] 333 333 333 363 344 331 337 333 344 333 ...
##   .. ..$ book_image_height   : int [1:15] 500 500 500 500 500 500 500 500 500 500 ...
##   .. ..$ amazon_product_url  : chr [1:15] "https://www.amazon.com/dp/0545828651?tag=NYTBSREV-20" "https://www.amazon.com/dp/1338730924?tag=NYTBSREV-20" "https://www.amazon.com/dp/1974743586?tag=NYTBSREV-20" "https://www.amazon.com/dp/1338896393?tag=NYTBSREV-20" ...
##   .. ..$ age_group           : chr [1:15] "" "" "" "" ...
##   .. ..$ book_review_link    : chr [1:15] "" "" "" "" ...
##   .. ..$ first_chapter_link  : chr [1:15] "" "" "" "" ...
##   .. ..$ sunday_review_link  : chr [1:15] "" "" "" "" ...
##   .. ..$ article_chapter_link: chr [1:15] "" "" "" "" ...
##   .. ..$ isbns               :List of 15
##   .. ..$ buy_links           :List of 15
##   .. ..$ book_uri            : chr [1:15] "nyt://book/6ec9c94e-40ff-59dd-9394-1283c81458cc" "nyt://book/26602605-957a-504d-91a9-4f83aff9e9b6" "nyt://book/dfbe5bf8-230b-567e-8ba0-25a09d5a3dde" "nyt://book/d62db906-6011-5ed8-a667-3e2f334b995f" ...
##   ..$ corrections               : list()
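
The request above assumes the call succeeds. As a minimal sketch (not part of the original analysis), the same steps can be wrapped in a function that stops on an HTTP error before parsing. The helper names `nyt_list_url()` and `fetch_nyt_list()` are hypothetical; `GET()`, `stop_for_status()`, `content()`, and `fromJSON()` are the same httr and jsonlite functions used above.

```r
library(httr)
library(jsonlite)

# Hypothetical helper: build the endpoint URL for a current best sellers list
nyt_list_url <- function(list_name) {
  paste0("https://api.nytimes.com/svc/books/v3/lists/current/", list_name, ".json")
}

# Hypothetical helper: request a list and return the parsed JSON
fetch_nyt_list <- function(list_name, api_key) {
  # pass the key as a query parameter instead of pasting it into the URL
  response <- GET(nyt_list_url(list_name), query = list(`api-key` = api_key))

  # turn a 4xx/5xx response into an R error instead of parsing an error body
  stop_for_status(response)

  fromJSON(content(response, "text", encoding = "UTF-8"))
}

# Usage (needs a valid key and network access, so it is left commented out):
# parsed_data <- fetch_nyt_list("graphic-books-and-manga", api_key)
```

Stopping early keeps a rejected key or a rate-limit response from surfacing later as a confusing missing-column error.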

Transform JSON Format to R Data Frame

The glimpse() output above shows two list columns that need to be unnested:

.. ..$ isbns :List of 15
.. ..$ buy_links :List of 15

The results list contains a nested data frame called books. All of the data of interest lives in books, so the rest of results does not need to be unnested.
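
To illustrate why isbns and buy_links show up as lists, here is a small sketch with an invented payload that mimics the shape of the API response. The JSON values are made up for illustration. `fromJSON()` turns an array of objects into a data frame, and an array nested inside each object becomes a list column holding one small data frame per row.

```r
library(jsonlite)

# Invented payload with the same nesting as results -> books -> isbns
toy_json <- '{
  "results": {
    "books": [
      {"title": "A", "isbns": [{"isbn13": "111"}, {"isbn13": "112"}]},
      {"title": "B", "isbns": [{"isbn13": "211"}]}
    ]
  }
}'

toy <- fromJSON(toy_json)
toy_books <- toy$results$books

class(toy_books)         # a data frame with one row per book
class(toy_books$isbns)   # a list with one element per book
toy_books$isbns[[1]]     # a small data frame with the first book's ISBNs
```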

# take the subgroup books from results as that has the data we are interested in
books_df <- parsed_data$results$books

# flatten the nested structures within the 'isbns' column
books_df <- unnest(books_df, isbns)

# flatten the nested structures within the 'buy_links' column
books_df <- unnest(books_df, buy_links)

# check the data frame
glimpse(books_df)

## Rows: 144
## Columns: 28
## $ rank                 <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2…
## $ rank_last_week       <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ weeks_on_list        <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ asterisk             <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ dagger               <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ primary_isbn10       <chr> "0545828651", "0545828651", "0545828651", "054582…
## $ primary_isbn13       <chr> "9780545828659", "9780545828659", "9780545828659"…
## $ publisher            <chr> "Scholastic", "Scholastic", "Scholastic", "Schola…
## $ description          <chr> "The ninth book in the Amulet series. A threat of…
## $ price                <chr> "0.00", "0.00", "0.00", "0.00", "0.00", "0.00", "…
## $ title                <chr> "WAVERIDER", "WAVERIDER", "WAVERIDER", "WAVERIDER…
## $ author               <chr> "Kazu Kibuishi", "Kazu Kibuishi", "Kazu Kibuishi"…
## $ contributor          <chr> "by Kazu Kibuishi", "by Kazu Kibuishi", "by Kazu …
## $ contributor_note     <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
## $ book_image           <chr> "https://storage.googleapis.com/du-prd/books/imag…
## $ book_image_width     <int> 333, 333, 333, 333, 333, 333, 333, 333, 333, 333,…
## $ book_image_height    <int> 500, 500, 500, 500, 500, 500, 500, 500, 500, 500,…
## $ amazon_product_url   <chr> "https://www.amazon.com/dp/0545828651?tag=NYTBSRE…
## $ age_group            <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
## $ book_review_link     <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
## $ first_chapter_link   <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
## $ sunday_review_link   <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
## $ article_chapter_link <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
## $ isbn10               <chr> "0545828651", "0545828651", "0545828651", "054582…
## $ isbn13               <chr> "9780545828659", "9780545828659", "9780545828659"…
## $ name                 <chr> "Amazon", "Apple Books", "Barnes and Noble", "Boo…
## $ url                  <chr> "https://www.amazon.com/dp/0545828651?tag=NYTBSRE…
## $ book_uri             <chr> "nyt://book/6ec9c94e-40ff-59dd-9394-1283c81458cc"…
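
The jump from 15 books to 144 rows comes from unnesting two list columns in sequence: each book is repeated once for every combination of its ISBN entries and its buy links (rank 1 appears 12 times in the output above). The sketch below reproduces the effect on an invented two-book tibble; the titles, ISBNs, and store names are made up.

```r
library(tibble)
library(tidyr)

# Invented data: book A has 2 ISBN entries, book B has 1; both have 3 buy links
toy <- tibble(
  title = c("A", "B"),
  isbns = list(
    tibble(isbn13 = c("111", "112")),
    tibble(isbn13 = "211")
  ),
  buy_links = list(
    tibble(name = c("Store X", "Store Y", "Store Z")),
    tibble(name = c("Store X", "Store Y", "Store Z"))
  )
)

# unnest one list column, then the other
step1 <- unnest(toy, isbns)        # one row per book and ISBN
both  <- unnest(step1, buy_links)  # one row per book, ISBN, and buy link

nrow(step1)  # 2 + 1 = 3
nrow(both)   # 2 * 3 + 1 * 3 = 9
```

Note that `unnest()` drops rows whose list element is empty by default; passing `keep_empty = TRUE` keeps such rows with missing values instead. Whether any book in the real response has an empty isbns entry is not shown in the output above.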

Conclusion

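The JSON response was converted into an R data frame with 144 rows and 28 columns. The row count exceeds the 15 books on the list because unnesting isbns and buy_links repeats each book once per combination of ISBN entry and buy link. More cleaning and transformation would be needed before even a simple analysis; however, that is outside the scope of this document.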
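
As a pointer for that follow-up work, here is one possible first cleaning pass, sketched on an invented stand-in for a few of the book columns. It is not part of the analysis above, and the choices (dropping list columns to keep one row per book, treating empty strings as missing, converting price to numeric) are assumptions about what a later analysis might want.

```r
library(dplyr)
library(tibble)

# Invented stand-in for a few columns of parsed_data$results$books
toy_books <- tibble(
  rank = 1:2,
  title = c("BOOK A", "BOOK B"),
  price = c("0.00", "0.00"),
  age_group = c("", ""),
  isbns = list(tibble(isbn13 = "111"), tibble(isbn13 = "211"))
)

tidy_books <- toy_books %>%
  # keep one row per book by dropping list columns instead of unnesting them
  select(!where(is.list)) %>%
  # treat empty strings as missing values
  mutate(across(where(is.character), ~ na_if(.x, ""))) %>%
  # store price as a number rather than text
  mutate(price = as.numeric(price))

tidy_books
```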