NYT API

In this document, we’ll use the New York Times Books API. Following its documentation, we’ll call the API to retrieve the current hardcover nonfiction best-seller list.

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.0.4     ✓ dplyr   1.0.2
## ✓ tidyr   1.1.2     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(jsonlite)
## 
## Attaching package: 'jsonlite'
## The following object is masked from 'package:purrr':
## 
##     flatten
url <- 'https://api.nytimes.com/svc/books/v3/lists/current/hardcover-nonfiction.json?api-key=3ImZ9GGsnMCDHHePNZ6IXzTpHT5qh98O'

raw_results <- fromJSON(txt = url)
books_df <- raw_results$results$books
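
The request above hard-codes the API key in the URL. One possible alternative, sketched below, assembles the URL from its parts and reads the key from an environment variable. The helper `build_list_url()` and the variable name `NYT_API_KEY` are assumptions made for this sketch and are not part of the NYT API or the original analysis. The URL pattern is copied from the request above.

```r
# Build the list endpoint URL from its parts instead of hard-coding the key.
# build_list_url() is a hypothetical helper written for this sketch.
build_list_url <- function(list_name, api_key, date = "current") {
  base <- "https://api.nytimes.com/svc/books/v3/lists"
  sprintf("%s/%s/%s.json?api-key=%s",
          base, date, list_name, URLencode(api_key, reserved = TRUE))
}

# NYT_API_KEY is an assumed environment variable name, e.g. set in ~/.Renviron.
api_key <- Sys.getenv("NYT_API_KEY")
url <- build_list_url("hardcover-nonfiction", api_key)

# Only call the API when a key is available and jsonlite is installed.
if (nzchar(api_key) && requireNamespace("jsonlite", quietly = TRUE)) {
  raw_results <- jsonlite::fromJSON(txt = url)
  books_df <- raw_results$results$books
}
```

Keeping the key out of the script means the document can be shared or committed without exposing the credential.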

Analysis

Next, we can do some lightweight analysis to see how long these books have been on the list and how those durations are distributed.

We find that there is only one brand-new entrant to the list (“REMEMBER”), along with a couple of outliers that have spent more than 20 weeks on the list (“UNTAMED” and “HOW TO BE AN ANTIRACIST”).
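
Titles like these could be picked out programmatically rather than by eye. The sketch below shows one way to do it on a small stand-in data frame. The titles and week counts in `toy_books` are invented placeholders, not values from the API, and the `title` column name is an assumption about the structure of `books_df`. The 20-week cutoff mirrors the threshold used in the text.

```r
# Toy stand-in for books_df; titles and week counts are invented placeholders.
toy_books <- data.frame(
  title = c("BOOK A", "BOOK B", "BOOK C", "BOOK D"),
  weeks_on_list = c(1, 5, 9, 30),
  stringsAsFactors = FALSE
)

# New entrants are in their first week on the list.
new_entrants <- toy_books$title[toy_books$weeks_on_list == 1]

# Long-running titles: more than 20 weeks, the cutoff used in the text.
long_runners <- toy_books$title[toy_books$weeks_on_list > 20]

print(new_entrants)
print(long_runners)
```

With the real data, the same two subsetting expressions would be applied to `books_df` in place of `toy_books`.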

summary(books_df$weeks_on_list)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00    5.00    9.00   16.33   21.00   55.00
boxplot(books_df$weeks_on_list)

ggplot(books_df, aes(weeks_on_list)) + geom_histogram(binwidth = 1)