DATA 607 - Week 9 Assignment

Introduction

The goal of this week’s assignment is to work with APIs.

We will work with the New York Times website's rich set of APIs, as described here: New York Times APIs. I first need to establish a secure way of working by signing up for an API key. My next tasks are as follows:

to choose one of the New York Times APIs,
construct an interface in R to read in the JSON data,
and transform it into an R data frame.
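
The steps above can be sketched without a network call. This is a minimal illustration, assuming the jsonlite package is installed; the JSON string is a made-up stand-in shaped loosely like a Books API response, not real API output.

```r
library(jsonlite)

# Hypothetical JSON payload, shaped loosely like a Books API response
sample_json <- '{
  "results": {
    "books": [
      {"rank": 1, "title": "BOOK A", "author": "Author One"},
      {"rank": 2, "title": "BOOK B", "author": "Author Two"}
    ]
  }
}'

# fromJSON() simplifies an array of uniform objects into a data frame
parsed <- fromJSON(sample_json)
books_df <- parsed$results$books

str(books_df)
```

With the default simplification, each JSON object in the array becomes one row, which is often enough when every record has the same fields.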

Code Initiation

Here I load the required libraries and ensure that all required packages are installed before running the code blocks that follow.

## [1] "All required packages are installed"
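
A check like this could be written roughly as follows. This is only a sketch: the package list is an assumption inferred from the functions called later in the report, and the actual setup chunk may differ.

```r
# Hypothetical package list; adjust to the packages the report actually uses
required_packages <- c("httr2", "jsonlite", "knitr", "kableExtra")

# requireNamespace() returns FALSE (instead of erroring) when a package is missing
is_installed <- vapply(required_packages, requireNamespace,
                       logical(1), quietly = TRUE)

if (all(is_installed)) {
  print("All required packages are installed")
} else {
  message("Missing packages: ",
          paste(required_packages[!is_installed], collapse = ", "))
}
```

Reporting the missing packages by name, rather than installing them silently, leaves the decision to install with the person running the report.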

Working with httr2

In this project, I use httr2 to request data from the API. First, I have to sign up and create an API key to be able to load data from the New York Times API.

Many examples can be found on the NYTimes website under specific APIs, such as: Books_API.

I will be using httr2 and jsonlite to retrieve and parse the data.

Method	Endpoint	Description
GET	/lists/full-overview.json	Get all books for all the Best Sellers lists for specified date.
GET	/lists/overview.json	Get top 5 books for all the Best Sellers lists for specified date.
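
As an alternative to pasting URL strings together, a request for one of the endpoints in the table can be assembled with httr2's URL helpers. The sketch below only builds the request and does not send it. The key is a placeholder, and `published_date` is assumed here to be the name of the date parameter for this endpoint.

```r
library(httr2)

# Placeholder key for illustration only; substitute your own
demo_key <- "yourkey"

# Build the request step by step instead of pasting strings together
req <- request("https://api.nytimes.com/svc/books/v3")
req <- req_url_path_append(req, "lists", "overview.json")
# published_date is assumed here to be the date parameter for this endpoint
req <- req_url_query(req, `api-key` = demo_key, published_date = "2024-03-13")

# Inspect the final URL without sending anything
print(req$url)
```

Letting `req_url_query()` assemble the query string avoids manual escaping mistakes when a parameter value contains spaces or other special characters.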

url <- "https://api.nytimes.com/svc/books/v3/"

 
Example_call_1 <- "https://api.nytimes.com/svc/books/v3/lists/current/hardcover-fiction.json?api-key=yourkey"

Example_call_2 <- "https://api.nytimes.com/svc/books/v3/reviews.json?author=Stephen+King&api-key=yourkey"



# Define your API key
# (assumption: the key is stored in an environment variable named NYT_API_KEY)
App_Key <- Sys.getenv("NYT_API_KEY")

# Construct the API call URL
nyt_inquiry <- "lists/current/hardcover-fiction.json"

# Combine the base URL, the endpoint, and the API key into the full request URL
url_key <- paste0(url, nyt_inquiry, "?api-key=", App_Key)

# Make the API request
req <- httr2::request(url_key) # build the request object
nyt_response <- httr2::req_perform(req) # use req_perform() to send it and get the response

# Inspect some additional information about the response
nyt_response %>% resp_content_type()

## [1] "application/json"

nyt_response %>% resp_status_desc()

## [1] "OK"

#nyt_response %>% resp_body_html()
#nyt_response %>%resp_body_json()
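
The status and content-type checks can be tried offline on a mock response. This is a sketch under stated assumptions: the response is built locally with `httr2::response()` and its body is invented for illustration. By default `req_perform()` raises an R error for 4xx and 5xx responses, so an explicit status check matters mostly when that behavior has been overridden with `req_error()`.

```r
library(httr2)

# Mock response built locally so the check can be tried without an API key
mock_resp <- response(
  status_code = 200,
  headers = list(`Content-Type` = "application/json"),
  body = charToRaw('{"status": "OK", "num_results": 15}')
)

# resp_is_error() is TRUE for 4xx and 5xx status codes
if (!resp_is_error(mock_resp) &&
    resp_content_type(mock_resp) == "application/json") {
  parsed <- resp_body_json(mock_resp)
  print(parsed$status)
} else {
  message("Request failed with status ", resp_status(mock_resp))
}
```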



# Check that the response contains JSON; if not, print an error
if (resp_content_type(nyt_response) == "application/json") {
  # Parse the JSON response
  data <- nyt_response %>% 
    resp_body_json(check_type = TRUE, simplifyVector = FALSE)
  
  # Extract the list of books
  books <- data$results$books
  
  # Define an empty data frame first, so book_list exists even if no books are returned
  book_list <- data.frame(
    Title = character(0),
    Rank = integer(0),
    Authors = character(0),
    Publisher = character(0),
    Book_image = character(0),
    Book_file  = character(0)
  )
  
  if (!is.null(books)) {
    book_list <- data.frame(
      Title = sapply(books, function(x) x$title),
      Rank = sapply(books, function(x) x$rank),
      Authors = sapply(books, function(x) x$author),
      Publisher = sapply(books, function(x) x$publisher),
      Book_image = sapply(books, function(x) x$book_image),
      Book_file = character(length(books))
      )

    
    book_list$Book_file <- sapply(books, function(x) {
      image_url <- x$book_image
      #print(image_url)
      if (!is.null(image_url)) {
        # Download the image with download.file(), catching any errors
        tryCatch({
          # Define the file name
          file_name <-  paste0("book_", x$title, "_", x$rank, ".jpg")
          # Check whether the file has already been downloaded
          if (file.exists(file_name)) {
            message("Image for ",  x$title ," has already been downloaded!")
            } else {
              download.file(image_url, file_name,
                            mode = "wb")
              message("Image for ", x$title, " is downloaded.")
              }
          #str_trim(x$title)
          #img_src
          return(file_name)
          }, error = function(e) {
            message("Error downloading image for: ", x$title)
            return(NA)
            })
        } else {
          NA
          }
      })

    #print(books_title)
    } else {
      print("No books found in the response.")
      }  
  } else {
    print("Error: Unable to retrieve data from the API.")
  }

## Image for THE WOMEN has already been downloaded!

## Image for FOURTH WING has already been downloaded!

## Image for IRON FLAME has already been downloaded!

## Image for NEVER TOO LATE has already been downloaded!

## Image for THE HUNTER has already been downloaded!

## Image for A FATE INKED IN BLOOD has already been downloaded!

## Image for THREE-INCH TEETH has already been downloaded!

## Image for THE HEAVEN & EARTH GROCERY STORE has already been downloaded!

## Image for HOUSE OF FLAME AND SHADOW has already been downloaded!

## Image for FIRST LIE WINS has already been downloaded!

## Image for REMARKABLY BRIGHT CREATURES has already been downloaded!

## Image for THE SUNLIT MAN has already been downloaded!

## Image for LISTEN FOR THE LIE has already been downloaded!

## Image for LESSONS IN CHEMISTRY has already been downloaded!

## Image for WANDERING STARS has already been downloaded!
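
The file names above are built directly from the book titles, which can contain spaces, ampersands, or other characters that are awkward in file names. One possible way to make them safer is sketched below. `safe_file_name()` is a hypothetical helper, not part of the original code.

```r
# Hypothetical helper: turn a title and rank into a file-system-safe name
safe_file_name <- function(title, rank) {
  # Replace every run of characters other than letters and digits with "_"
  slug <- gsub("[^A-Za-z0-9]+", "_", title)
  # Trim underscores left at either end
  slug <- gsub("^_+|_+$", "", slug)
  paste0("book_", slug, "_", rank, ".jpg")
}

safe_file_name("THE HEAVEN & EARTH GROCERY STORE", 8)
## [1] "book_THE_HEAVEN_EARTH_GROCERY_STORE_8.jpg"
```

Keeping the rank in the name, as the original code does, helps distinguish files when two titles reduce to the same slug.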

image_example:

Display the results with images

In this section, I struggled quite a bit to display the images. Although I managed to download them, it took me a considerable amount of time to figure out how to show them. This solution may not seem ideal, but it’s better than nothing.

message("This is the list of the top books in New York Times dated: ", data$last_modified)

## This is the list of the top books in New York Times dated: 2024-03-13T22:24:46-04:00

# Display the table of books using Kable
kable(book_list[1:5], format = "html") %>%
  kable_styling()

Title	Rank	Authors	Publisher	Book_image
THE WOMEN	1	Kristin Hannah	St. Martin’s	https://storage.googleapis.com/du-prd/books/images/9781250178633.jpg
FOURTH WING	2	Rebecca Yarros	Red Tower	https://storage.googleapis.com/du-prd/books/images/9781649374042.jpg
IRON FLAME	3	Rebecca Yarros	Red Tower	https://storage.googleapis.com/du-prd/books/images/9781649374172.jpg
NEVER TOO LATE	4	Danielle Steel	Delacorte	https://storage.googleapis.com/du-prd/books/images/9780593498408.jpg
THE HUNTER	5	Tana French	Viking	https://storage.googleapis.com/du-prd/books/images/9780593493434.jpg
A FATE INKED IN BLOOD	6	Danielle L. Jensen	Del Rey	https://storage.googleapis.com/du-prd/books/images/9780593599839.jpg
THREE-INCH TEETH	7	C.J. Box	Putnam	https://storage.googleapis.com/du-prd/books/images/9780593331347.jpg
THE HEAVEN & EARTH GROCERY STORE	8	James McBride	Riverhead	https://storage.googleapis.com/du-prd/books/images/9780593422946.jpg
HOUSE OF FLAME AND SHADOW	9	Sarah J. Maas	Bloomsbury	https://storage.googleapis.com/du-prd/books/images/9781635574104.jpg
FIRST LIE WINS	10	Ashley Elston	Pamela Dorman	https://storage.googleapis.com/du-prd/books/images/9780593492918.jpg
REMARKABLY BRIGHT CREATURES	11	Shel Van Pelt	Ecco	https://storage.googleapis.com/du-prd/books/images/9780063204157.jpg
THE SUNLIT MAN	12	Brandon Sanderson	Tor	https://storage.googleapis.com/du-prd/books/images/9781250899712.jpg
LISTEN FOR THE LIE	13	Amy Tintera	Celadon	https://storage.googleapis.com/du-prd/books/images/9781250880314.jpg
LESSONS IN CHEMISTRY	14	Bonnie Garmus	Doubleday	https://storage.googleapis.com/du-prd/books/images/9780385547345.jpg
WANDERING STARS	15	Tommy Orange	Knopf	https://storage.googleapis.com/du-prd/books/images/9780593318256.jpg

WD_path <- getwd()

knitr::include_graphics(book_list$Book_file)
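
Because a failed download stores NA in Book_file, passing the column straight to include_graphics() may cause an error. A small guard can filter the paths first. This is a sketch only; `existing_images()` is a hypothetical helper introduced here for illustration.

```r
# Hypothetical helper: keep only paths that are non-missing and exist on disk
existing_images <- function(paths) {
  paths[!is.na(paths) & file.exists(paths)]
}

# Small demonstration with one real temporary file and two unusable entries
tmp <- tempfile(fileext = ".jpg")
file.create(tmp)
candidates <- c(tmp, NA, "no_such_file.jpg")
existing_images(candidates)

# In the report this could be used as:
# knitr::include_graphics(existing_images(book_list$Book_file))
```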

Conclusion

This week, we learned how to work with the New York Times APIs to retrieve JSON data and extract information from it. I also used the image links in the JSON to download the book cover images into a folder and later display them in an HTML file.

In general, the process was tricky and required following exactly how the APIs are designed to work. It took me a couple of iterations to figure out how to get the information and extract data from the JSON. All in all, JSON is a friendly format for retrieving information, but it requires digging into the structure of the response to extract information meaningfully.