Fetching Data

Let’s fetch the bestseller lists once a month for a year, starting in January 2018.

startDate <- lubridate::make_date(2018, 1, 1)
dates <- startDate + months(0:11)
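
As a quick sanity check, the same twelve month-start dates can be built with base R's `seq`. This is only an equivalent sketch, not the code used in this post, and `checkDates` is a name introduced here for illustration.

```r
# Base R counterpart of startDate + months(0:11); checkDates is a hypothetical name
checkDates <- seq(as.Date("2018-01-01"), by = "month", length.out = 12)

# Show the first and last dates in the sequence
format(checkDates[c(1, 12)])
```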

We can get the hardcover fiction list and the young adult hardcover list with our API key, but we must add a wait period between requests.

# nytApiKey must already hold your API key
hardCoverLists <- purrr::map(dates, function(x) {
  Sys.sleep(30)  # wait before each request
  jsonlite::fromJSON(paste0(
    "https://api.nytimes.com/svc/books/v3/lists/", x,
    "/hardcover-fiction.json?api-key=", nytApiKey
  ))
})
youngAdult <- purrr::map(dates, function(x) {
  Sys.sleep(30)  # wait before each request
  jsonlite::fromJSON(paste0(
    "https://api.nytimes.com/svc/books/v3/lists/", x,
    "/young-adult-hardcover.json?api-key=", nytApiKey
  ))
})
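
The two calls differ only in the list name, so the URL construction and the pause could be pulled into one helper. The sketch below assumes the same endpoint layout as the calls above; `buildListUrl` and `fetchList` are hypothetical names, and the 30 second default simply mirrors the `Sys.sleep(30)` used here.

```r
# Hypothetical helper: builds the request URL in the same shape as the calls above.
buildListUrl <- function(date, listName, apiKey) {
  paste0(
    "https://api.nytimes.com/svc/books/v3/lists/",
    format(date), "/", listName, ".json?api-key=", apiKey
  )
}

# Hypothetical helper: waits, then fetches one list for one date.
# Calling it needs the jsonlite package and a valid key, so it is not run here.
fetchList <- function(date, listName, apiKey, wait = 30) {
  Sys.sleep(wait)
  jsonlite::fromJSON(buildListUrl(date, listName, apiKey))
}

# Example URL with a placeholder key
buildListUrl(as.Date("2018-01-01"), "hardcover-fiction", "YOUR_KEY")
```

With these in place, the hardcover fetch could be written as `purrr::map(dates, fetchList, listName = "hardcover-fiction", apiKey = nytApiKey)`.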

Analysis

With this we can see that YA titles average more weeks on the list than adult hardcover fiction, and that both lists take on a lot of new entrants in the last quarter (the mean number of weeks on the list drops sharply in November and December).

weeksOnList <- data.frame(
  Month = month(dates, label = TRUE),
  ya = purrr::map_dbl(youngAdult, ~ mean(.x$results$books$weeks_on_list)),
  adult = purrr::map_dbl(hardCoverLists, ~ mean(.x$results$books$weeks_on_list))
)
weeksOnList %>% gather("BookType","weeks", -Month )
##    Month BookType     weeks
## 1    Jan       ya 14.700000
## 2    Feb       ya 10.500000
## 3    Mar       ya 20.500000
## 4    Apr       ya 15.800000
## 5    May       ya 16.400000
## 6    Jun       ya 14.900000
## 7    Jul       ya 21.000000
## 8    Aug       ya 26.500000
## 9    Sep       ya 29.800000
## 10   Oct       ya 31.700000
## 11   Nov       ya 22.400000
## 12   Dec       ya 21.100000
## 13   Jan    adult  8.266667
## 14   Feb    adult  7.533333
## 15   Mar    adult  7.866667
## 16   Apr    adult 12.066667
## 17   May    adult  7.333333
## 18   Jun    adult  8.866667
## 19   Jul    adult  5.333333
## 20   Aug    adult 11.933333
## 21   Sep    adult 12.733333
## 22   Oct    adult  7.733333
## 23   Nov    adult  2.333333
## 24   Dec    adult  3.733333
weeksOnList %>% gather("BookType","weeks", -Month ) %>% ggplot(aes(Month,weeks,group=BookType,color=BookType))+ geom_line()
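
The summary above relies on each parsed response exposing `results$books` with a `weeks_on_list` column. A toy version with made-up numbers (not real NYT data) shows the per-month averaging in base R; `toyLists`, `toyMeans`, and the values are invented for illustration.

```r
# Two fake "monthly responses" shaped like the parsed JSON used above.
# The numbers are invented for illustration only.
toyLists <- list(
  list(results = list(books = data.frame(weeks_on_list = c(1, 3, 8)))),
  list(results = list(books = data.frame(weeks_on_list = c(2, 2))))
)

# One mean per month: a base R counterpart of unlist(purrr::map(...))
toyMeans <- vapply(
  toyLists,
  function(x) mean(x$results$books$weeks_on_list),
  numeric(1)
)
toyMeans
```

A month dominated by new entrants has many small `weeks_on_list` values, which is why the mean drops when the list turns over.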

We can also see what share of the books on each list have an official NYTimes review. It looks like none of the YA titles had one during our time period.

bookReviewRatio <- data.frame(
  Month = month(dates, label = TRUE),
  ya = purrr::map_dbl(youngAdult, ~ mean(1 - stri_isempty(.x$results$books$book_review_link))),
  adult = purrr::map_dbl(hardCoverLists, ~ mean(1 - stri_isempty(.x$results$books$book_review_link)))
)

bookReviewRatio %>% gather("BookType","HaveReview", -Month )
##    Month BookType HaveReview
## 1    Jan       ya  0.0000000
## 2    Feb       ya  0.0000000
## 3    Mar       ya  0.0000000
## 4    Apr       ya  0.0000000
## 5    May       ya  0.0000000
## 6    Jun       ya  0.0000000
## 7    Jul       ya  0.0000000
## 8    Aug       ya  0.0000000
## 9    Sep       ya  0.0000000
## 10   Oct       ya  0.0000000
## 11   Nov       ya  0.0000000
## 12   Dec       ya  0.0000000
## 13   Jan    adult  0.2000000
## 14   Feb    adult  0.2666667
## 15   Mar    adult  0.1333333
## 16   Apr    adult  0.2000000
## 17   May    adult  0.2000000
## 18   Jun    adult  0.2000000
## 19   Jul    adult  0.2666667
## 20   Aug    adult  0.3333333
## 21   Sep    adult  0.2000000
## 22   Oct    adult  0.1333333
## 23   Nov    adult  0.2000000
## 24   Dec    adult  0.2000000
bookReviewRatio %>% gather("BookType","HaveReview", -Month ) %>% ggplot(aes(Month,HaveReview,group=BookType,color=BookType))+ geom_line()
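
The ratio works because averaging a 0/1 indicator gives the share of ones. The toy check below uses base R's `nzchar` in place of `1 - stri_isempty(...)`; the links are made-up placeholders, not real review URLs, and `toyLinks` and `toyRatio` are names introduced here.

```r
# Invented example: two of five books have a non-empty review link.
toyLinks <- c(
  "", "https://example.com/review-a", "", "", "https://example.com/review-b"
)

# nzchar() is TRUE for non-empty strings, so the mean is the share with a link
toyRatio <- mean(nzchar(toyLinks))
toyRatio
```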