Data 607 Assignment 9

NYT API

In this document, we’ll use the New York Times Books API. Following its documentation, we’ll call the API to retrieve the current hardcover nonfiction best-seller list.

library(tidyverse)
library(jsonlite)

## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──

## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.0.4     ✓ dplyr   1.0.2
## ✓ tidyr   1.1.2     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1

## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

## 
## Attaching package: 'jsonlite'

## The following object is masked from 'package:purrr':
## 
##     flatten

url <- 'https://api.nytimes.com/svc/books/v3/lists/current/hardcover-nonfiction.json?api-key=3ImZ9GGsnMCDHHePNZ6IXzTpHT5qh98O'

# Fetch and parse the JSON response; the books on the list are nested under results$books
raw_results <- fromJSON(txt = url)
books_df <- raw_results$results$books
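
The call above embeds the API key directly in the URL, which exposes it to anyone who reads the document. A minimal sketch of one alternative follows, assuming the key is stored in an environment variable. The helper `build_nyt_url()` and the variable name `NYT_API_KEY` are hypothetical and are not part of the original code; the URL pattern is copied from the request above.

```r
# Hypothetical helper: assemble the request URL for a given list name.
# The URL pattern mirrors the hard-coded request used in this document.
build_nyt_url <- function(list_name, api_key, date = "current") {
  sprintf(
    "https://api.nytimes.com/svc/books/v3/lists/%s/%s.json?api-key=%s",
    date, list_name, utils::URLencode(api_key, reserved = TRUE)
  )
}

# NYT_API_KEY is an assumed environment variable name (e.g. set in ~/.Renviron).
api_key <- Sys.getenv("NYT_API_KEY")

# Only call the API when a key is actually available.
if (nzchar(api_key)) {
  url <- build_nyt_url("hardcover-nonfiction", api_key)
  raw_results <- jsonlite::fromJSON(txt = url)
  books_df <- raw_results$results$books
}
```

Keeping the key out of the source means the knitted report can be shared without leaking credentials, and the same helper could be pointed at another list name.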

Analysis

Next, we can do some lightweight analysis to see how long these books have been on the list and how those durations are distributed.

We find that there is only one brand-new entrant to the list (“REMEMBER”), along with a couple of outliers that have spent more than 20 weeks on it (“UNTAMED” and “HOW TO BE AN ANTIRACIST”).

summary(books_df$weeks_on_list)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00    5.00    9.00   16.33   21.00   55.00
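
The new entrants and long-running titles described above could also be picked out programmatically rather than by eye. The sketch below is one way to do that, assuming the data frame has `title` and `weeks_on_list` columns. The helper `summarise_tenure()` is hypothetical, and `toy_books` is made-up illustrative data, not the API's actual results.

```r
# Hypothetical helper: split titles into first-week entrants and long runners.
# `long_run` is the number of weeks a title must exceed to count as a long runner.
summarise_tenure <- function(df, long_run = 20) {
  list(
    new_entrants = df$title[df$weeks_on_list == 1],
    long_runners = df$title[df$weeks_on_list > long_run]
  )
}

# Toy data for illustration only (not real best-seller figures).
toy_books <- data.frame(
  title = c("BOOK A", "BOOK B", "BOOK C", "BOOK D"),
  weeks_on_list = c(1, 4, 9, 30),
  stringsAsFactors = FALSE
)

tenure <- summarise_tenure(toy_books)
print(tenure)
```

With the real data, the same call would be `summarise_tenure(books_df)`; the 20-week threshold matches the cutoff used in the text and can be changed through `long_run`.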

boxplot(books_df$weeks_on_list)

ggplot(books_df, aes(x = weeks_on_list)) + geom_histogram(binwidth = 1)

Claire Meyer

4/7/2021