For this assignment, I will use the NYTimes Books API to retrieve a list of the NYT Best Sellers, and then I will convert the resulting JSON into an R dataframe.
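##Loading the packages used below
library(jsonlite)  # fromJSON()
library(knitr)     # kable()
library(tidyr)     # unnest_wider(), unnest_longer()
library(dplyr)     # %>%, glimpse()
##Running the API
##This code will generate a JSON-format list containing historical data for some of the NYT best sellers from roughly the past decade (the ranking dates in the sample output run from 2014 to 2023)
nyt <- fromJSON("https://api.nytimes.com/svc/books/v3/lists/best-sellers/history.json?api-key=PCBxRC5IDGa5YL921cADZrhRF0J3B7Bm", flatten = TRUE)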
##And now I will convert the JSON list into a dataframe
nyt <- data.frame(nyt)
kable(nyt)
| status | copyright | num_results | results.title | results.description | results.contributor | results.author | results.contributor_note | results.price | results.age_group | results.publisher | results.isbns | results.ranks_history | results.reviews |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | “I GIVE YOU MY BODY …” | The author of the Outlander novels gives tips on writing sex scenes, drawing on examples from the books. | by Diana Gabaldon | Diana Gabaldon |  | 0.00 |  | Dell | 0399178570 , 9780399178573 | 0399178570 , 9780399178573 , 8 , Advice How-To and Miscellaneous, Advice, How-To & Miscellaneous , 2016-09-04 , 2016-08-20 , 1 , 0 , 0 , 0 | , , , |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | “MOST BLESSED OF THE PATRIARCHS” | A character study that attempts to make sense of Jefferson’s contradictions. | by Annette Gordon-Reed and Peter S. Onuf | Annette Gordon-Reed and Peter S Onuf |  | 0.00 |  | Liveright | 0871404427 , 9780871404428 | 0871404427 , 9780871404428 , 16 , Hardcover Nonfiction, Hardcover Nonfiction, 2016-05-01 , 2016-04-16 , 1 , 0 , 1 , 0 | , , , |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | “YOU JUST NEED TO LOSE WEIGHT” | The co-host of the podcast “Maintenance Phase” examines myths about gaining and losing weight to dismantle anti-fat bias. | by Aubrey Gordon | Aubrey Gordon |  | 0.00 |  | Beacon | 0807006475 , 0807006483 , 9780807006474, 9780807006481 | 0807006475 , 0807006475 , 9780807006474 , 9780807006474 , 2 , 6 , Paperback Nonfiction , Combined Print and E-Book Nonfiction, Paperback Nonfiction , Combined Print & E-Book Nonfiction , 2023-01-29 , 2023-01-29 , 2023-01-14 , 2023-01-14 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 | , , , |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | #ASKGARYVEE | The entrepreneur expands on subjects addressed on his Internet show, like marketing, management and social media. | by Gary Vaynerchuk | Gary Vaynerchuk |  | 0.00 |  | HarperCollins | 0062273124 , 0062273132 , 9780062273123, 9780062273130 | 0062273124 , 0062273124 , 9780062273123 , 9780062273123 , 5 , 6 , Business Books , Advice How-To and Miscellaneous, Business , Advice, How-To & Miscellaneous , 2016-04-10 , 2016-03-27 , 2016-03-26 , 2016-03-12 , 0 , 1 , 0 , 0 , 0 , 0 , 1 , 1 | , , , |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | #GIRLBOSS | An online fashion retailer traces her path to success. | by Sophia Amoruso | Sophia Amoruso |  | 0.00 |  | Portfolio/Penguin/Putnam | 039916927X , 1591847931 , 9780399169274, 9781591847939 | 1591847931 , 1591847931 , 1591847931 , 1591847931 , 039916927X , 9781591847939 , 9781591847939 , 9781591847939 , 9781591847939 , 9780399169274 , 8 , 9 , 9 , 8 , 10 , Business Books, Business Books, Business Books, Business Books, Business Books, Business , Business , Business , Business , Business , 2016-03-13 , 2016-01-17 , 2015-12-13 , 2015-11-15 , 2014-11-09 , 2016-02-27 , 2016-01-02 , 2015-11-28 , 2015-10-31 , 2014-10-25 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 | , , , |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | #IMOMSOHARD |  | by Kristin Hensley and Jen Smedley | Kristin Hensley and Jen Smedley |  | 0.00 |  | HarperOne | 006285769X , 9780062857699 | 006285769X , 9780062857699 , 10 , Advice How-To and Miscellaneous, Advice, How-To & Miscellaneous , 2019-04-21 , 2019-04-06 , 1 , 0 , 0 , 1 | , , , |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | #NEVERAGAIN | Students from Marjory Stoneman Douglas High School describe the Valentine’s Day mass shooting and outline ways to prevent similar incidents. | by David Hogg and Lauren Hogg | David Hogg and Lauren Hogg |  | 0.00 |  | Random House | 198480183X , 9781984801838 | 198480183X , 9781984801838 , 9 , Paperback Nonfiction, Paperback Nonfiction, 2018-07-08 , 2018-06-23 , 1 , 0 , 0 , 0 | , , , |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | $100 STARTUP | How to build a profitable start up for $100 or less and be your own boss. | by Chris Guillebeau | Chris Guillebeau |  | 23.00 |  | Crown Business | 0307951529 , 9780307951526 | NULL | , , , |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | $20 PER GALLON |  | by Christopher Steiner | Christopher Steiner |  | 0.00 |  | Grand Central | NULL | NULL | , , , |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | ’57, Chicago | NA | NA | Steve Monroe | NA | 0.00 | NA | NA | 0786867302 , 9780786867301 | NULL | , NA , https://www.nytimes.com/2001/07/29/books/books-in-brief-fiction-poetry-319660.html, NA |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | ‘ROCK OF AGES:’‘ROLLING STONE’’ HISTORY OF ROCK AND ROLL’ | NA | NA | GEOFFREY STOKES, KEN TUCKER’ ’ED WARD | NA | 0.00 | NA | NA | 0671630687 , 9780671630683 | NULL | , NA , https://www.nytimes.com/1986/12/28/books/three-chord-music-in-a-three-piece-suit.html, NA |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | ‘THE HIGH ROAD TO CHINA: GEORGE BOGLE, THE PANCHEN LAMA AND THE FIRST BRITISH EXPEDITION TO TIBET’ | NA | NA | KATE TELTSCHER | NA | 0.00 | NA | NA | 0374217009 , 9780374217006 | NULL | , NA , https://www.nytimes.com/2007/04/22/books/review/Stuart.t.html, NA |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | ’TIL DEATH |  | by Sharon Sala | Sharon Sala |  | 0.00 |  | Harlequin Mira | 0778314278 , 9780778314271 | NULL | , , , |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | ’TIL DEATH DO US PART | A matchmaker in Victorian England turns to a crime novelist for help when she starts receiving a series of disturbingly personalized trinkets. | by Amanda Quick | Amanda Quick |  | 0.00 |  | Berkley | 069819361X , 039917446X , 9780698193611, 9780399174469 | 069819361X , 069819361X , 9780698193611 , 9780698193611 , 15 , 9 , Combined Print and E-Book Fiction, E-Book Fiction , Combined Print & E-Book Fiction , E-Book Fiction , 2016-05-08 , 2016-05-08 , 2016-04-23 , 2016-04-23 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 | , , , |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | ’Til Faith Do Us Part: How Interfaith Marriage is Transforming America | NA | NA | Naomi Schaefer Riley | NA | 0.00 | NA | NA | 0199873747 , 9780199873746 | NULL | , NA , https://www.nytimes.com/2013/05/26/books/review/til-faith-do-us-part-by-naomi-schaefer-riley.html, NA |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | ’TIS THE SEASON | Two classic holiday stories — “Under the Christmas Tree” (2009) and “Midnight Confessions” (2010) — plus the novella “Backward Glance” (1991). | by Robyn Carr | Ron Carr |  | 0.00 |  | Harlequin Mira | 0778316645 , 9780778316640 | 0778316645 , 0778316645 , 9780778316640 , 9780778316640 , 18 , 6 , Mass Market Paperback , Mass Market Paperback , Paperback Mass-Market Fiction, Paperback Mass-Market Fiction, 2014-11-30 , 2014-11-16 , 2014-11-15 , 2014-11-01 , 0 , 1 , 0 , 0 , 0 , 1 , 0 , 0 | , , , |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | (RE)BORN IN THE USA | The soccer commentator describes how he embraced American popular culture while growing up in Liverpool. | by Roger Bennett | Roger Bennett |  | 0.00 |  | Dey Street | 0062958690 , 0062958720 , 9780062958693, 9780062958723 | 0062958690 , 0062958690 , 9780062958693 , 9780062958693 , 3 , 1 , Combined Print and E-Book Nonfiction, Hardcover Nonfiction , Combined Print & E-Book Nonfiction , Hardcover Nonfiction , 2021-07-18 , 2021-07-18 , 2021-07-03 , 2021-07-03 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 | , , , |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | ——, THAT’S DELICIOUS |  | by Action Bronson with Rachel Wharton | Action Bronson with Rachel Wharton |  | 0.00 |  | Abrams | 1419726552 , 9781419726552 | 1419726552 , 9781419726552 , 9 , Advice How-To and Miscellaneous, Advice, How-To & Miscellaneous , 2017-10-01 , 2017-09-16 , 1 , 0 , 0 , 1 | , , , |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | …and the Horse He Rode In On: The People V. Kenneth Starr | NA | NA | James Carville | NA | 0.00 | NA | NA | 0684857340 , 9780684857343 | NULL | , NA , https://www.nytimes.com/1998/10/18/books/preaching-to-the-converted.html, NA |
| OK | Copyright (c) 2023 The New York Times Company. All Rights Reserved. | 35593 | .HACK G.U. , VOL. 5 | This series, set in the future, is about an online, multiplayer game run amok. This volume concludes the tale. | by Hamazaki Tatsuya | Hamazaki Tatsuya |  | 10.99 |  | TOKYOPOP | NULL | NULL | , , , |
As we can see, this generates a rather unorganized (and incomplete) dataframe consisting of empty and nested columns (and even empty, nested columns). Of course, it is necessary to clean and tidy all of these data points.
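The table above has only 20 rows even though num_results reports 35593, which suggests that the endpoint returns one page of results per request. The sketch below assumes the history endpoint accepts an `offset` query parameter in steps of 20; that is worth checking against the NYT Books API documentation before relying on it. `build_history_url()` and the `NYT_API_KEY` environment variable are hypothetical names introduced here for illustration, not part of the original code. The sketch only builds request URLs, so it runs without network access.

```r
# Hypothetical helper: builds the request URL for one page of results.
# Assumption: the endpoint pages through results via `offset` (0, 20, 40, ...).
# Assumption: the API key is stored in an environment variable named NYT_API_KEY.
build_history_url <- function(offset = 0, api_key = Sys.getenv("NYT_API_KEY")) {
  base <- "https://api.nytimes.com/svc/books/v3/lists/best-sellers/history.json"
  paste0(base, "?api-key=", api_key, "&offset=", offset)
}

# URLs for the first three pages (offsets 0, 20 and 40)
offsets <- seq(0, 40, by = 20)
urls <- vapply(offsets, build_history_url, character(1), api_key = "YOUR_KEY")
```

Each URL could then be passed to fromJSON(..., flatten = TRUE) and the resulting `results` dataframes combined, with a pause between requests to respect whatever rate limit applies to the key.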
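##First, I want to remove the "results." prefix (the dot is escaped so that it matches a literal ".")
colnames(nyt) <- sub("^results\\.", "", colnames(nyt))
##Then I spread the three nested columns so that each nested field becomes its own column
nyt_unnest <- nyt %>% unnest_wider(c("ranks_history", "reviews", "isbns"))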
##Dropping the primary isbn variables, since they duplicate the information in the nested isbn variables (and the nested variables seem to contain more complete observations)
nyt_unnest <- subset(nyt_unnest, select = -c(primary_isbn10, primary_isbn13))
nyt_unnest <- nyt_unnest %>% unnest_longer(isbn10)
nyt_unnest <- nyt_unnest %>% unnest_longer(isbn13)
nyt_unnest <- nyt_unnest %>% unnest_longer(rank)
nyt_unnest <- nyt_unnest %>% unnest_longer(list_name)
nyt_unnest <- nyt_unnest %>% unnest_longer(display_name)
nyt_unnest <- nyt_unnest %>% unnest_longer(published_date)
nyt_unnest <- nyt_unnest %>% unnest_longer(bestsellers_date)
nyt_unnest <- nyt_unnest %>% unnest_longer(weeks_on_list)
nyt_unnest <- nyt_unnest %>% unnest_longer(rank_last_week)
nyt_unnest <- nyt_unnest %>% unnest_longer(asterisk)
nyt_unnest <- nyt_unnest %>% unnest_longer(dagger)
glimpse(nyt_unnest)
## Rows: 7,821,209
## Columns: 26
## $ status <chr> "OK", "OK", "OK", "OK", "OK", "OK", "OK", "OK", "…
## $ copyright <chr> "Copyright (c) 2023 The New York Times Company. …
## $ num_results <int> 35593, 35593, 35593, 35593, 35593, 35593, 35593, …
## $ title <chr> "\"I GIVE YOU MY BODY ...\"", "\"MOST BLESSED OF …
## $ description <chr> "The author of the Outlander novels gives tips on…
## $ contributor <chr> "by Diana Gabaldon", "by Annette Gordon-Reed and …
## $ author <chr> "Diana Gabaldon", "Annette Gordon-Reed and Peter …
## $ contributor_note <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
## $ price <chr> "0.00", "0.00", "0.00", "0.00", "0.00", "0.00", "…
## $ age_group <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
## $ publisher <chr> "Dell", "Liveright", "Beacon", "Beacon", "Beacon"…
## $ isbn10 <chr> "0399178570", "0871404427", "0807006475", "080700…
## $ isbn13 <chr> "9780399178573", "9780871404428", "9780807006474"…
## $ rank <int> 8, 16, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
## $ list_name <chr> "Advice How-To and Miscellaneous", "Hardcover Non…
## $ display_name <chr> "Advice, How-To & Miscellaneous", "Hardcover Nonf…
## $ published_date <chr> "2016-09-04", "2016-05-01", "2023-01-29", "2023-0…
## $ bestsellers_date <chr> "2016-08-20", "2016-04-16", "2023-01-14", "2023-0…
## $ weeks_on_list <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ rank_last_week <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ asterisk <int> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ dagger <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ book_review_link <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
## $ first_chapter_link <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
## $ sunday_review_link <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
## $ article_chapter_link <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
##Finally, I replace the empty strings with NA
nyt_unnest[nyt_unnest == ""] <- NA
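As a possible alternative for a dataframe of this size, the empty strings could be replaced in the character columns only, which avoids comparing every cell of the integer columns with "". This is a minimal sketch on made-up toy data (the `toy` and `cleaned` names and their values are illustrative, not taken from the API response), assuming dplyr 1.0.0 or later for across().

```r
library(dplyr)

# Toy data standing in for a few columns of nyt_unnest (values are made up)
toy <- tibble(
  title = c("A", "B"),
  age_group = c("", "Ages 9 to 12"),
  rank = c(1L, 2L)
)

# Replace "" with NA in character columns; other columns are left untouched
cleaned <- toy %>%
  mutate(across(where(is.character), ~ na_if(.x, "")))
```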
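And with that, we are left with a very, very large (albeit analyzable) unnested dataset. The size deserves a caveat: because each column is unnested separately, every unnest_longer() call multiplies a book's rows by the number of values in that column, so the 7,821,209 rows are combinations of values rather than distinct ranking records (#GIRLBOSS alone, with five ranking entries, accounts for 7,812,500 of them), and books with no ranking history are dropped.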
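Much of the row count comes from unnesting columns one at a time, which pairs every value in one list-column with every value in the next. The sketch below uses a made-up toy tibble (not the API data) to compare that approach with unnesting parallel columns in a single call, so that values at the same position stay on the same row. It assumes tidyr 1.0.0 or later and is an illustration of the idea, not a drop-in replacement for the code above.

```r
library(tidyr)
library(dplyr)

# Toy stand-in for two books after unnest_wider():
# book "A" has two ranking records, book "B" has one.
toy <- tibble(
  title = c("A", "B"),
  rank = list(c(8L, 9L), 3L),
  list_name = list(c("Business Books", "Advice"), "Fiction")
)

# One column at a time: every rank is paired with every list_name
# (2 * 2 rows for "A" plus 1 row for "B")
one_at_a_time <- toy %>%
  unnest_longer(rank) %>%
  unnest_longer(list_name)

# Parallel columns together: one row per ranking record
# (2 rows for "A" plus 1 row for "B")
together <- toy %>%
  unnest(c(rank, list_name))
```

On the real data, the same idea would mean unnesting isbn10 and isbn13 in one call and the nine ranks_history columns in another. Adding keep_empty = TRUE to those calls would keep the books that have no ranking history instead of dropping them.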