Web APIs

Title: CUNY SPS MDS DATA607_WK9Assignmt

Author: Charles Ugiagbe

Date: 10/24/2021

Assignment Question

The New York Times web site provides a rich set of APIs, as described here: https://developer.nytimes.com/apis

You’ll need to start by signing up for an API key.

Your task is to choose one of the New York Times APIs, construct an interface in R to read in the JSON data, and transform it into an R DataFrame.

Approach to Solution

The New York Times API that interested me the most was the Books API. After requesting an API key, I was able to access it from R. I requested the current “hardcover fiction” list from the Books API, which returns the data in JSON format. I then used fromJSON() from the jsonlite package to convert the JSON data to R objects.
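To make the conversion step concrete, here is a minimal sketch that runs without an API key. The JSON string is a hypothetical miniature of a Books API response (the real payload has many more fields), and the names mock_json, parsed and books are introduced only for this illustration.

```r
library(jsonlite)

# Hypothetical, heavily trimmed stand-in for the API response.
# The titles and publishers are taken from the list shown later in this report.
mock_json <- '{
  "status": "OK",
  "results": {
    "list_name": "Hardcover Fiction",
    "books": [
      {"rank": 2, "title": "THE WISH", "publisher": "Grand Central"},
      {"rank": 3, "title": "THE LINCOLN HIGHWAY", "publisher": "Viking"}
    ]
  }
}'

# fromJSON() turns JSON objects into named lists and, by default,
# an array of similar objects into a data frame.
parsed <- fromJSON(mock_json)

# Selecting nested elements by name instead of by position.
books <- parsed$results$books
is.data.frame(books)
names(books)
```

Because the array of book objects is simplified to a data frame automatically, no manual row binding should be needed in this case.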

Load the Required Packages

library(httr)
library(jsonlite)
library(tidyverse)
library(kableExtra)

Read the API

bookAPI <- "https://api.nytimes.com/svc/books/v3/lists/current/hardcover-fiction.json?api-key=yrHCGK1wjdTA8abjGAyni4Q1A4RiWACs"
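Hard-coding the key in the script exposes it to anyone who reads the report. One possible alternative, sketched below, keeps the key in an environment variable and checks the HTTP status before parsing. The variable name NYT_API_KEY and the helpers build_books_url() and fetch_books() are assumptions made for this sketch, not part of the original assignment.

```r
library(httr)
library(jsonlite)

# Assumption: the key was stored beforehand, for example as a line
# NYT_API_KEY=... in ~/.Renviron. The variable name is hypothetical.
build_books_url <- function(list_name = "hardcover-fiction",
                            api_key = Sys.getenv("NYT_API_KEY")) {
  if (!nzchar(api_key)) {
    stop("Set the NYT_API_KEY environment variable first.", call. = FALSE)
  }
  paste0(
    "https://api.nytimes.com/svc/books/v3/lists/current/",
    list_name, ".json?api-key=", api_key
  )
}

# Hypothetical helper: request the URL, fail loudly on an HTTP error,
# and parse the JSON body into R objects.
fetch_books <- function(url) {
  resp <- GET(url)
  stop_for_status(resp)
  fromJSON(content(resp, as = "text", encoding = "UTF-8"))
}
```

With a key in place, fetch_books(build_books_url()) would return the same parsed response that fromJSON(bookAPI) does, while keeping the key out of the source file.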

Transform the JSON Data into an R Data Frame and Glimpse Its Structure

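# Select the books table by name: results is element 5 of the response, books is element 11 of results
fiction_books <- fromJSON(bookAPI)$results$books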
glimpse(fiction_books)

## Rows: 15
## Columns: 26
## $ rank                 <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
## $ rank_last_week       <int> 0, 2, 1, 3, 5, 0, 8, 0, 6, 7, 9, 4, 0, 13, 10
## $ weeks_on_list        <int> 1, 3, 2, 3, 5, 1, 24, 1, 5, 2, 11, 2, 45, 6, 4
## $ asterisk             <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
## $ dagger               <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
## $ primary_isbn10       <chr> "198217367X", "1538728621", "0735222355", "198216~
## $ primary_isbn13       <chr> "9781982173678", "9781538728628", "9780735222359"~
## $ publisher            <chr> "Simon & Schuster, St. Martin's", "Grand Central"~
## $ description          <chr> "In the wake of the previous administration’s mis~
## $ price                <chr> "0.00", "0.00", "0.00", "0.00", "0.00", "0.00", "~
## $ title                <chr> "STATE OF TERROR", "THE WISH", "THE LINCOLN HIGHW~
## $ author               <chr> "Hillary Rodham Clinton and Louise Penny", "Nicho~
## $ contributor          <chr> "by Hillary Rodham Clinton and Louise Penny", "by~
## $ contributor_note     <chr> "", "", "", "", "", "", "", "", "", "", "", "", "~
## $ book_image           <chr> "https://storage.googleapis.com/du-prd/books/imag~
## $ book_image_width     <int> 331, 330, 331, 331, 329, 331, 331, 336, 329, 328,~
## $ book_image_height    <int> 500, 500, 500, 500, 500, 500, 500, 500, 500, 500,~
## $ amazon_product_url   <chr> "https://www.amazon.com/dp/198217367X?tag=NYTBSRE~
## $ age_group            <chr> "", "", "", "", "", "", "", "", "", "", "", "", "~
## $ book_review_link     <chr> "", "", "", "", "", "", "", "", "", "", "", "", "~
## $ first_chapter_link   <chr> "", "", "", "", "", "", "", "", "", "", "", "", "~
## $ sunday_review_link   <chr> "", "", "", "", "", "", "", "", "", "", "", "", "~
## $ article_chapter_link <chr> "", "", "", "", "", "", "", "", "", "", "", "", "~
## $ isbns                <list> [<data.frame[2 x 2]>], [<data.frame[1 x 2]>], [<d~
## $ buy_links            <list> [<data.frame[6 x 2]>], [<data.frame[6 x 2]>], [<d~
## $ book_uri             <chr> "nyt://book/ee38f9b9-6787-5e00-b068-7d2f712aa3fd"~
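The glimpse output shows price stored as character and several columns (age_group and the link columns, for example) filled with empty strings. A minimal cleaning sketch, assuming a numeric price and NA for missing text are wanted, could look like this. The two-row books_raw data frame is a hypothetical stand-in for fiction_books so that the example runs on its own.

```r
library(dplyr)

# Hypothetical two-row stand-in for fiction_books, using values seen in the glimpse output
books_raw <- data.frame(
  title = c("STATE OF TERROR", "THE WISH"),
  price = c("0.00", "0.00"),
  age_group = c("", ""),
  stringsAsFactors = FALSE
)

books_clean <- books_raw %>%
  # Convert the price strings to numbers
  mutate(price = as.numeric(price)) %>%
  # Replace empty strings with NA in every remaining character column
  mutate(across(where(is.character), ~ na_if(.x, "")))

glimpse(books_clean)
```

Marking empty strings as NA makes missing values visible to functions such as is.na() and summary(), which treat "" as ordinary text.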

Create a New Data Frame from the Original One for Analysis

fiction_books2 <- fiction_books[c("rank", "publisher", "title", "author", "contributor", "primary_isbn13")]

fiction_books2 %>%
  kbl(caption = "Hardcover fiction books") %>%
  kable_material(c("striped", "hover")) %>%
  row_spec(0, color = "Red")

Hardcover fiction books
rank	publisher	title	author	contributor	primary_isbn13
1	Simon & Schuster, St. Martin’s	STATE OF TERROR	Hillary Rodham Clinton and Louise Penny	by Hillary Rodham Clinton and Louise Penny	9781982173678
2	Grand Central	THE WISH	Nicholas Sparks	by Nicholas Sparks	9781538728628
3	Viking	THE LINCOLN HIGHWAY	Amor Towles	by Amor Towles	9780735222359
4	Scribner	CLOUD CUCKOO LAND	Anthony Doerr	by Anthony Doerr	9781982168438
5	Holt	APPLES NEVER FALL	Liane Moriarty	by Liane Moriarty	9781250220257
6	Viking	SILVERVIEW	John Le Carré	by John Le Carré	9780593490594
7	Simon & Schuster	THE LAST THING HE TOLD ME	Laura Dave	by Laura Dave	9781501171345
8	Simon & Schuster	THE BOOK OF MAGIC	Alice Hoffman	by Alice Hoffman	9781982151485
9	Doubleday	HARLEM SHUFFLE	Colson Whitehead	by Colson Whitehead	9780385545136
10	Delacorte	THE BUTLER	Danielle Steel	by Danielle Steel	9781984821522
11	Scribner	BILLY SUMMERS	Stephen King	by Stephen King	9781982173616
12	Farrar, Straus & Giroux	CROSSROADS	Jonathan Franzen	by Jonathan Franzen	9780374181178
13	Viking	THE MIDNIGHT LIBRARY	Matt Haig	by Matt Haig	9780525559474
14	Farrar, Straus & Giroux	BEAUTIFUL WORLD, WHERE ARE YOU	Sally Rooney	by Sally Rooney	9780374602604
15	Little, Brown	THE JAILHOUSE LAWYER	James Patterson and Nancy Allen	by James Patterson and Nancy Allen	9780316276627

Extra Analysis

# Publisher data frame: count the ranked books per publisher

df_pub <- fiction_books2 %>%
  group_by(publisher) %>%
  summarise(books_published = n())
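The same per-publisher count can also be written with dplyr's count(), which groups, tallies and sorts in one call. The sketch below rebuilds only the publisher column from the table of ranked books so that it runs on its own; the names publishers and df_pub_alt are introduced just for this illustration.

```r
library(dplyr)

# Publisher column copied from the table of ranked books above
publishers <- data.frame(
  publisher = c(
    "Simon & Schuster, St. Martin's", "Grand Central", "Viking", "Scribner",
    "Holt", "Viking", "Simon & Schuster", "Simon & Schuster", "Doubleday",
    "Delacorte", "Scribner", "Farrar, Straus & Giroux", "Viking",
    "Farrar, Straus & Giroux", "Little, Brown"
  ),
  stringsAsFactors = FALSE
)

# count() is shorthand for group_by() + summarise(n());
# sort = TRUE puts the publishers with the most books first.
df_pub_alt <- publishers %>%
  count(publisher, name = "books_published", sort = TRUE)

df_pub_alt
```

Using sort = TRUE here also makes a separate ordering step unnecessary.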

Plot the ranked books by publisher to visualize which publisher has the most books in the ranking.

df_pub <- df_pub[order(-df_pub$books_published), ]

df_pub %>%
  ggplot(aes(reorder(publisher, books_published), books_published)) +
  geom_col(aes(fill = books_published)) +
  # Two-colour gradient: scale_fill_gradient2() diverges around a midpoint of 0,
  # so "Green" would never appear for positive counts
  scale_fill_gradient(low = "Green", high = "Blue") +
  labs(title = "Ranked hardcover fiction books by publisher", y = "books_published") +
  theme(
    legend.position = "none",
    plot.title = element_text(hjust = 0.5),
    axis.title.x = element_blank(),
    axis.text.x = element_text(angle = 45, margin = margin(t = 25, r = 20, b = 0, l = 0))
  )