R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button, a document is generated that includes both the content and the output of any embedded R code chunks. You can embed an R code chunk like this:

library(httr)
library(jsonlite)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter()  masks stats::filter()
## ✖ purrr::flatten() masks jsonlite::flatten()
## ✖ dplyr::lag()     masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

To protect the API key, I stored it in a system environment variable named NYT_API, and I use Sys.getenv() to read it into the session.

# Getting the API key and assigning it to a variable
key <- Sys.getenv("NYT_API")
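
Sys.getenv() returns an empty string when the variable is not set, so a missing key would only show up later as a failed request. The sketch below fails early instead. The helper name `get_api_key` and the variable `DEMO_KEY` are hypothetical and are not part of the original analysis.

```r
# Hypothetical helper: read an environment variable and stop if it is empty
get_api_key <- function(var = "NYT_API") {
  value <- Sys.getenv(var)
  if (!nzchar(value)) {
    stop("Environment variable ", var, " is not set; ",
         "add it to ~/.Renviron and restart R.")
  }
  value
}

# Demonstration with a throwaway variable so the example runs anywhere
Sys.setenv(DEMO_KEY = "abc123")
demo_key <- get_api_key("DEMO_KEY")
```

In the analysis itself the call would simply be `key <- get_api_key()`.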

This is my API call to the NYT Best Sellers list history endpoint:

api_results <- GET(
  url = "https://api.nytimes.com/svc/books/v3/lists/best-sellers/history.json",
  query = list(
    author = "John Green",
    `api-key` = key
  )
)

api_results
## Response [https://api.nytimes.com/svc/books/v3/lists/best-sellers/history.json?author=John%20Green&api-key=<redacted>]
##   Date: 2025-03-30 22:53
##   Status: 200
##   Content-Type: application/json; charset=UTF-8
##   Size: 11.1 kB
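
Printing the response object shows the full request URL, including the `api-key` query parameter, so the key can end up in a knitted document even though it was read from the environment. One possible way around this is to mask the parameter before displaying the URL. The helper `redact_key` and the example URL below are hypothetical, a minimal sketch rather than anything httr provides.

```r
# Hypothetical helper (not part of httr): mask the api-key query parameter
redact_key <- function(url) {
  sub("api-key=[^&]+", "api-key=<redacted>", url)
}

# Made-up URL with a placeholder key, for illustration only
example_url <- paste0(
  "https://api.nytimes.com/svc/books/v3/lists/best-sellers/history.json",
  "?author=John%20Green&api-key=EXAMPLE_KEY"
)
safe_url <- redact_key(example_url)
safe_url
```

With the real response, `redact_key(api_results$url)` together with `status_code(api_results)` would report the request and its status without printing the key.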

Next, I parse the JSON returned by the API call into nested R lists:

json_results <- content(api_results, as = "parsed", simplifyVector = FALSE)
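
To illustrate what `simplifyVector = FALSE` changes, the sketch below parses a small made-up JSON string with jsonlite directly. The JSON is an assumption that only imitates the general shape of the API response; it is not real NYT data.

```r
library(jsonlite)

# Made-up JSON, illustrative only
toy_json <- '{"results": [
  {"title": "A", "ranks_history": []},
  {"title": "B", "ranks_history": [{"rank": 3, "weeks_on_list": 2}]}
]}'

# simplifyVector = FALSE keeps everything as nested lists,
# so each book is a list that can be passed to a function one at a time
as_lists <- fromJSON(toy_json, simplifyVector = FALSE)
second_rank <- as_lists$results[[2]]$ranks_history[[1]]$rank

# simplifyVector = TRUE collapses the array of objects into a data frame
as_frames <- fromJSON(toy_json, simplifyVector = TRUE)
class(as_frames$results)
```

Keeping the nested-list form is what makes it possible to map over the books individually in the steps that follow.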

I then store the results element on its own. Shortly, each entry in it will be passed through a function that maps the desired fields to columns of an R data frame.

john_green_res <- json_results$results

The function below extracts one row for each John Green book returned by the API.

For each book, total_weeks_on_list is the largest weeks_on_list value in its rank history, which shows whether the book stayed on the list for more than one week. highest_rank is the minimum rank in the history, because the NYT treats rank 1 as the best, or “highest”, position; lowest_rank is the maximum. If the API returns no rank history for a book, all three fields are NA.

# Flatten one book record from the API into a one-row tibble
extract_books <- function(extract_book) {
  history <- extract_book$ranks_history
  has_history <- length(history) > 0

  tibble(
    title = extract_book$title,
    author = extract_book$author,
    contributor = extract_book$contributor,
    publisher = extract_book$publisher,
    description = extract_book$description,
    # Largest weeks_on_list value in the history; NA if the book never ranked
    total_weeks_on_list = if (has_history) max(map_int(history, "weeks_on_list")) else NA_integer_,
    # Rank 1 is the best position, so the highest rank is the minimum
    highest_rank = if (has_history) min(map_int(history, "rank")) else NA_integer_,
    lowest_rank = if (has_history) max(map_int(history, "rank")) else NA_integer_
  )
}
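
As a worked example of the min/max logic, the sketch below applies the same `map_int()` calls to a made-up rank history. The three entries are invented for illustration and do not come from the API.

```r
library(purrr)

# Made-up rank history with three weekly entries (illustrative values)
toy_history <- list(
  list(rank = 5L, weeks_on_list = 3L),
  list(rank = 2L, weeks_on_list = 4L),
  list(rank = 9L, weeks_on_list = 5L)
)

ranks <- map_int(toy_history, "rank")           # pull "rank" from each entry
weeks <- map_int(toy_history, "weeks_on_list")  # pull "weeks_on_list" from each entry

best_rank   <- min(ranks)  # smallest number is the best position
worst_rank  <- max(ranks)  # largest number is the worst position
total_weeks <- max(weeks)  # largest weeks_on_list value seen

# An empty history has length 0, which is the case that yields NA
empty_has_history <- length(list()) > 0
```

Here the best rank is 2 even though it is numerically smaller than the others, which is why the function uses `min()` for highest_rank.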

Here is the data frame that results from running each API result through the function:

john_green_df <- map_df(john_green_res, extract_books)
john_green_df
## # A tibble: 9 × 8
##   title             author contributor publisher description total_weeks_on_list
##   <chr>             <chr>  <chr>       <chr>     <chr>                     <int>
## 1 AN ABUNDANCE OF … John … by John Gr… Speak     Colin Sing…                  NA
## 2 EVERYTHING IS TU… John … by John Gr… Crash Co… The author…                   1
## 3 LET IT SNOW       John … by John Gr… Speak     Three holi…                  NA
## 4 LOOKING FOR ALAS… John … by John Gr… Speak     A boy find…                  NA
## 5 PAPER TOWNS       John … by John Gr… Speak     After a ni…                  NA
## 6 THE ANTHROPOCENE… John … by John Gr… Dutton    A collecti…                   9
## 7 THE FAULT IN OUR… John … by John Gr… Penguin   A girl fac…                  NA
## 8 TURTLES ALL THE … John … by John Gr… Penguin   Aza and Da…                  NA
## 9 WILL GRAYSON, WI… John … by John Gr… Penguin … Two boys w…                  NA
## # ℹ 2 more variables: highest_rank <int>, lowest_rank <int>

Here we can see that, among the titles for which the API returned a rank history, “The Anthropocene Reviewed” was John Green's book with the most weeks on the NYT bestseller list.

longest_rank <- john_green_df |>
  filter(!is.na(total_weeks_on_list) & total_weeks_on_list > 0) |>
  select(title, 
         author, 
         contributor, 
         publisher, 
         total_weeks_on_list,
         highest_rank,
         lowest_rank) |>
  arrange(desc(total_weeks_on_list))
longest_rank
## # A tibble: 2 × 7
##   title            author contributor publisher total_weeks_on_list highest_rank
##   <chr>            <chr>  <chr>       <chr>                   <int>        <int>
## 1 THE ANTHROPOCEN… John … by John Gr… Dutton                      9            5
## 2 EVERYTHING IS T… John … by John Gr… Crash Co…                   1            1
## # ℹ 1 more variable: lowest_rank <int>

The book that reached the highest rank, however, was “Everything is Tuberculosis”.

highest_rank <- john_green_df |>
  filter(!is.na(highest_rank)) |>
  select(title, 
         author, 
         contributor, 
         publisher, 
         total_weeks_on_list,
         highest_rank,
         lowest_rank) |>
  arrange(highest_rank)
highest_rank
## # A tibble: 2 × 7
##   title            author contributor publisher total_weeks_on_list highest_rank
##   <chr>            <chr>  <chr>       <chr>                   <int>        <int>
## 1 EVERYTHING IS T… John … by John Gr… Crash Co…                   1            1
## 2 THE ANTHROPOCEN… John … by John Gr… Dutton                      9            5
## # ℹ 1 more variable: lowest_rank <int>
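
Both lookups above amount to picking a single row by an extreme value, so an alternative is dplyr's `slice_max()` and `slice_min()`. The sketch below uses a small hand-built tibble that copies three rows of values from the printed output; it is an illustration under that assumption, not a rerun of the API call.

```r
library(dplyr)
library(tibble)

# Hand-built copy of values shown in the output above (illustrative)
ranked <- tibble(
  title = c("THE ANTHROPOCENE REVIEWED", "EVERYTHING IS TUBERCULOSIS", "PAPER TOWNS"),
  total_weeks_on_list = c(9L, 1L, NA),
  highest_rank = c(5L, 1L, NA)
)

# Row with the most weeks on the list, dropping books with no rank history
most_weeks <- slice_max(ranked, total_weeks_on_list, n = 1, na_rm = TRUE)

# Row with the best (numerically smallest) rank
best_ranked <- slice_min(ranked, highest_rank, n = 1, na_rm = TRUE)
```

The `filter()` plus `arrange()` version keeps every ranked book in order, while this form returns only the top row, so which one fits depends on whether the full ordering is needed.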