Article Search API

For this assignment, I will use the NYTimes Books API to retrieve the NYT Best Sellers history list, and then convert the resulting JSON into an R dataframe.

##Running the API

##This code will generate a JSON-format list containing historical data for some of the NYT best sellers, apparently from roughly the past decade (still to be confirmed)

##Load the packages used below
library(jsonlite)
library(knitr)
library(tidyr)
library(dplyr)

nyt <- fromJSON("https://api.nytimes.com/svc/books/v3/lists/best-sellers/history.json?api-key=PCBxRC5IDGa5YL921cADZrhRF0J3B7Bm", flatten = TRUE)
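
The request returns only 20 of the 35,593 results reported in num_results, so further pages would need separate requests. The sketch below shows one way the request URL could be built for a given page while reading the key from an environment variable instead of hardcoding it. The helper name build_history_url and the variable name NYT_API_KEY are hypothetical, and the offset parameter (assumed here to step in multiples of 20) should be checked against the NYT Books API documentation before use.

```r
# Hypothetical helper, not part of the original analysis.
# NYT_API_KEY is an assumed environment-variable name.
build_history_url <- function(api_key = Sys.getenv("NYT_API_KEY"), offset = 0) {
  # Assumption: results are paged in blocks of 20
  stopifnot(offset >= 0, offset %% 20 == 0)
  base <- "https://api.nytimes.com/svc/books/v3/lists/best-sellers/history.json"
  sprintf("%s?api-key=%s&offset=%d", base, api_key, as.integer(offset))
}

# URLs for the first three pages (offsets 0, 20, 40), using a placeholder key
page_urls <- vapply(c(0, 20, 40),
                    function(o) build_history_url("MY_KEY", o),
                    character(1))
```

Each element of page_urls could then be passed to fromJSON() in the same way as the single URL above.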

##And now I will convert the JSON list into a dataframe
nyt <- data.frame(nyt)

kable(nyt)
status copyright num_results results.title results.description results.contributor results.author results.contributor_note results.price results.age_group results.publisher results.isbns results.ranks_history results.reviews
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 “I GIVE YOU MY BODY …” The author of the Outlander novels gives tips on writing sex scenes, drawing on examples from the books. by Diana Gabaldon Diana Gabaldon 0.00 Dell 0399178570 , 9780399178573 0399178570 , 9780399178573 , 8 , Advice How-To and Miscellaneous, Advice, How-To & Miscellaneous , 2016-09-04 , 2016-08-20 , 1 , 0 , 0 , 0 , , ,
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 “MOST BLESSED OF THE PATRIARCHS” A character study that attempts to make sense of Jefferson’s contradictions. by Annette Gordon-Reed and Peter S. Onuf Annette Gordon-Reed and Peter S Onuf 0.00 Liveright 0871404427 , 9780871404428 0871404427 , 9780871404428 , 16 , Hardcover Nonfiction, Hardcover Nonfiction, 2016-05-01 , 2016-04-16 , 1 , 0 , 1 , 0 , , ,
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 “YOU JUST NEED TO LOSE WEIGHT” The co-host of the podcast “Maintenance Phase” examines myths about gaining and losing weight to dismantle anti-fat bias. by Aubrey Gordon Aubrey Gordon 0.00 Beacon 0807006475 , 0807006483 , 9780807006474, 9780807006481 0807006475 , 0807006475 , 9780807006474 , 9780807006474 , 2 , 6 , Paperback Nonfiction , Combined Print and E-Book Nonfiction, Paperback Nonfiction , Combined Print & E-Book Nonfiction , 2023-01-29 , 2023-01-29 , 2023-01-14 , 2023-01-14 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , , ,
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 #ASKGARYVEE The entrepreneur expands on subjects addressed on his Internet show, like marketing, management and social media. by Gary Vaynerchuk Gary Vaynerchuk 0.00 HarperCollins 0062273124 , 0062273132 , 9780062273123, 9780062273130 0062273124 , 0062273124 , 9780062273123 , 9780062273123 , 5 , 6 , Business Books , Advice How-To and Miscellaneous, Business , Advice, How-To & Miscellaneous , 2016-04-10 , 2016-03-27 , 2016-03-26 , 2016-03-12 , 0 , 1 , 0 , 0 , 0 , 0 , 1 , 1 , , ,
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 #GIRLBOSS An online fashion retailer traces her path to success. by Sophia Amoruso Sophia Amoruso 0.00 Portfolio/Penguin/Putnam 039916927X , 1591847931 , 9780399169274, 9781591847939 1591847931 , 1591847931 , 1591847931 , 1591847931 , 039916927X , 9781591847939 , 9781591847939 , 9781591847939 , 9781591847939 , 9780399169274 , 8 , 9 , 9 , 8 , 10 , Business Books, Business Books, Business Books, Business Books, Business Books, Business , Business , Business , Business , Business , 2016-03-13 , 2016-01-17 , 2015-12-13 , 2015-11-15 , 2014-11-09 , 2016-02-27 , 2016-01-02 , 2015-11-28 , 2015-10-31 , 2014-10-25 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , , ,
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 #IMOMSOHARD by Kristin Hensley and Jen Smedley Kristin Hensley and Jen Smedley 0.00 HarperOne 006285769X , 9780062857699 006285769X , 9780062857699 , 10 , Advice How-To and Miscellaneous, Advice, How-To & Miscellaneous , 2019-04-21 , 2019-04-06 , 1 , 0 , 0 , 1 , , ,
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 #NEVERAGAIN Students from Marjory Stoneman Douglas High School describe the Valentine’s Day mass shooting and outline ways to prevent similar incidents. by David Hogg and Lauren Hogg David Hogg and Lauren Hogg 0.00 Random House 198480183X , 9781984801838 198480183X , 9781984801838 , 9 , Paperback Nonfiction, Paperback Nonfiction, 2018-07-08 , 2018-06-23 , 1 , 0 , 0 , 0 , , ,
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 $100 STARTUP How to build a profitable start up for $100 or less and be your own boss. by Chris Guillebeau Chris Guillebeau 23.00 Crown Business 0307951529 , 9780307951526 NULL , , ,
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 $20 PER GALLON by Christopher Steiner Christopher Steiner 0.00 Grand Central NULL NULL , , ,
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 ’57, Chicago NA NA Steve Monroe NA 0.00 NA NA 0786867302 , 9780786867301 NULL , NA , https://www.nytimes.com/2001/07/29/books/books-in-brief-fiction-poetry-319660.html, NA
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 ‘ROCK OF AGES:’‘ROLLING STONE’’ HISTORY OF ROCK AND ROLL’ NA NA GEOFFREY STOKES, KEN TUCKER’ ’ED WARD NA 0.00 NA NA 0671630687 , 9780671630683 NULL , NA , https://www.nytimes.com/1986/12/28/books/three-chord-music-in-a-three-piece-suit.html, NA
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 ‘THE HIGH ROAD TO CHINA: GEORGE BOGLE, THE PANCHEN LAMA AND THE FIRST BRITISH EXPEDITION TO TIBET’ NA NA KATE TELTSCHER NA 0.00 NA NA 0374217009 , 9780374217006 NULL , NA , https://www.nytimes.com/2007/04/22/books/review/Stuart.t.html, NA
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 ’TIL DEATH by Sharon Sala Sharon Sala 0.00 Harlequin Mira 0778314278 , 9780778314271 NULL , , ,
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 ’TIL DEATH DO US PART A matchmaker in Victorian England turns to a crime novelist for help when she starts receiving a series of disturbingly personalized trinkets. by Amanda Quick Amanda Quick 0.00 Berkley 069819361X , 039917446X , 9780698193611, 9780399174469 069819361X , 069819361X , 9780698193611 , 9780698193611 , 15 , 9 , Combined Print and E-Book Fiction, E-Book Fiction , Combined Print & E-Book Fiction , E-Book Fiction , 2016-05-08 , 2016-05-08 , 2016-04-23 , 2016-04-23 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , , ,
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 ’Til Faith Do Us Part: How Interfaith Marriage is Transforming America NA NA Naomi Schaefer Riley NA 0.00 NA NA 0199873747 , 9780199873746 NULL , NA , https://www.nytimes.com/2013/05/26/books/review/til-faith-do-us-part-by-naomi-schaefer-riley.html, NA
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 ’TIS THE SEASON Two classic holiday stories — “Under the Christmas Tree” (2009) and “Midnight Confessions” (2010) — plus the novella “Backward Glance” (1991). by Robyn Carr Ron Carr 0.00 Harlequin Mira 0778316645 , 9780778316640 0778316645 , 0778316645 , 9780778316640 , 9780778316640 , 18 , 6 , Mass Market Paperback , Mass Market Paperback , Paperback Mass-Market Fiction, Paperback Mass-Market Fiction, 2014-11-30 , 2014-11-16 , 2014-11-15 , 2014-11-01 , 0 , 1 , 0 , 0 , 0 , 1 , 0 , 0 , , ,
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 (RE)BORN IN THE USA The soccer commentator describes how he embraced American popular culture while growing up in Liverpool. by Roger Bennett Roger Bennett 0.00 Dey Street 0062958690 , 0062958720 , 9780062958693, 9780062958723 0062958690 , 0062958690 , 9780062958693 , 9780062958693 , 3 , 1 , Combined Print and E-Book Nonfiction, Hardcover Nonfiction , Combined Print & E-Book Nonfiction , Hardcover Nonfiction , 2021-07-18 , 2021-07-18 , 2021-07-03 , 2021-07-03 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , , ,
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 ——, THAT’S DELICIOUS by Action Bronson with Rachel Wharton Action Bronson with Rachel Wharton 0.00 Abrams 1419726552 , 9781419726552 1419726552 , 9781419726552 , 9 , Advice How-To and Miscellaneous, Advice, How-To & Miscellaneous , 2017-10-01 , 2017-09-16 , 1 , 0 , 0 , 1 , , ,
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 …and the Horse He Rode In On: The People V. Kenneth Starr NA NA James Carville NA 0.00 NA NA 0684857340 , 9780684857343 NULL , NA , https://www.nytimes.com/1998/10/18/books/preaching-to-the-converted.html, NA
OK Copyright (c) 2023 The New York Times Company. All Rights Reserved. 35593 .HACK G.U. , VOL. 5 This series, set in the future, is about an online, multiplayer game run amok. This volume concludes the tale. by Hamazaki Tatsuya Hamazaki Tatsuya 10.99 TOKYOPOP NULL NULL , , ,

As we can see, this generates a rather unorganized (and incomplete) dataframe containing empty columns, nested columns, and even empty nested columns. It is therefore necessary to clean and tidy these data points.
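
One way to check which columns are nested, and how many of their entries are empty, is to test each column with is.list(). The sketch below uses a small made-up data frame (toy, with invented titles) as a stand-in for the real nyt object, so it only illustrates the idea.

```r
# Toy stand-in for the API result: one regular column and one nested column
toy <- data.frame(title = c("A", "B", "C"))
toy$isbns <- list(
  data.frame(isbn10 = "0399178570", isbn13 = "9780399178573"),
  data.frame(isbn10 = character(0), isbn13 = character(0)),  # empty nested entry
  NULL                                                       # missing nested entry
)

# TRUE for every column that is a list (i.e. nested)
is_list_col <- vapply(toy, is.list, logical(1))
nested_cols <- names(toy)[is_list_col]

# TRUE for every nested entry that holds no rows at all
is_empty <- vapply(toy$isbns, function(x) NROW(x) == 0, logical(1))
```

Here nested_cols holds the names of the list columns and sum(is_empty) counts the nested entries that carry no data.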

##First, I want to remove the "results." prefix

colnames(nyt) <- sub("^results\\.", "", colnames(nyt))
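
When stripping a prefix with sub(), keep in mind that "." is a regular-expression metacharacter that matches any single character unless it is escaped. The short comparison below uses made-up column names (results_total is hypothetical) to show the difference.

```r
cols <- c("results.title", "results_total", "status")

# Unescaped dot: also strips "results_" because "." matches any character
loose <- sub("^results.", "", cols)

# Escaped dot: strips only the literal "results." prefix
strict <- sub("^results\\.", "", cols)
```

For the column names returned here both patterns give the same result, but the escaped version is the safer habit.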

##Next, I spread the three nested columns into one column per nested field
nyt_unnest <- nyt %>% unnest_wider(c("ranks_history", "reviews", "isbns"))

#Dropping the primary isbn variables, since the nested isbn variable holds the same information (and seems to contain more complete observations)

nyt_unnest <- subset(nyt_unnest, select = -c(primary_isbn10, primary_isbn13))


##Unnest each remaining list column into rows, one column at a time
nyt_unnest <- nyt_unnest %>%
  unnest_longer(isbn10) %>%
  unnest_longer(isbn13) %>%
  unnest_longer(rank) %>%
  unnest_longer(list_name) %>%
  unnest_longer(display_name) %>%
  unnest_longer(published_date) %>%
  unnest_longer(bestsellers_date) %>%
  unnest_longer(weeks_on_list) %>%
  unnest_longer(rank_last_week) %>%
  unnest_longer(asterisk) %>%
  unnest_longer(dagger)

glimpse(nyt_unnest)
## Rows: 7,821,209
## Columns: 26
## $ status               <chr> "OK", "OK", "OK", "OK", "OK", "OK", "OK", "OK", "…
## $ copyright            <chr> "Copyright (c) 2023 The New York Times Company.  …
## $ num_results          <int> 35593, 35593, 35593, 35593, 35593, 35593, 35593, …
## $ title                <chr> "\"I GIVE YOU MY BODY ...\"", "\"MOST BLESSED OF …
## $ description          <chr> "The author of the Outlander novels gives tips on…
## $ contributor          <chr> "by Diana Gabaldon", "by Annette Gordon-Reed and …
## $ author               <chr> "Diana Gabaldon", "Annette Gordon-Reed and Peter …
## $ contributor_note     <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
## $ price                <chr> "0.00", "0.00", "0.00", "0.00", "0.00", "0.00", "…
## $ age_group            <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
## $ publisher            <chr> "Dell", "Liveright", "Beacon", "Beacon", "Beacon"…
## $ isbn10               <chr> "0399178570", "0871404427", "0807006475", "080700…
## $ isbn13               <chr> "9780399178573", "9780871404428", "9780807006474"…
## $ rank                 <int> 8, 16, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
## $ list_name            <chr> "Advice How-To and Miscellaneous", "Hardcover Non…
## $ display_name         <chr> "Advice, How-To & Miscellaneous", "Hardcover Nonf…
## $ published_date       <chr> "2016-09-04", "2016-05-01", "2023-01-29", "2023-0…
## $ bestsellers_date     <chr> "2016-08-20", "2016-04-16", "2023-01-14", "2023-0…
## $ weeks_on_list        <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ rank_last_week       <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ asterisk             <int> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ dagger               <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ book_review_link     <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
## $ first_chapter_link   <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
## $ sunday_review_link   <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
## $ article_chapter_link <chr> "", "", "", "", "", "", "", "", "", "", "", "", "…
##Finally, I replace empty strings with NA
nyt_unnest[nyt_unnest == ""] <- NA

And with that, we are left with a very, very large (albeit analyzable) unnested dataset of 7,821,209 rows and 26 columns.
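
The row count reported by glimpse() is far larger than 20 books would suggest. This appears to follow from unnesting each list column in its own step: every step multiplies the rows of a book by the number of entries in that column. For #GIRLBOSS, with 2 ISBN entries and 5 rank-history entries, unnesting 2 ISBN columns and 9 rank columns separately gives 2^2 x 5^9 = 7,812,500 rows for that one title. Books with an empty rank history are also dropped along the way. The sketch below reproduces the effect on a made-up tibble (toy, with invented titles) and shows an alternative, assuming tidyr 1.2.0 or later, in which related columns are unnested together and empty entries are kept.

```r
library(tidyr)   # unnest_longer(); also re-exports %>%
library(tibble)  # tibble()

# Toy data: A has three rank-history entries, B has one, C has none
toy <- tibble(
  title         = c("A", "B", "C"),
  rank          = list(c(8L, 9L, 10L), 5L, NULL),
  weeks_on_list = list(c(1L, 2L, 3L), 1L, NULL)
)

# One column at a time: every rank is paired with every weeks_on_list value,
# and C is dropped because its list entries are empty
sequential <- toy %>%
  unnest_longer(rank) %>%
  unnest_longer(weeks_on_list)

# Both columns in one call: entries are matched by position,
# and keep_empty = TRUE keeps C as a single row of NA values
parallel <- toy %>%
  unnest_longer(c(rank, weeks_on_list), keep_empty = TRUE)

nrow(sequential)  # 3 * 3 + 1 = 10
nrow(parallel)    # 3 + 1 + 1 = 5
```

Applied to the real data, the same idea would mean passing the nine rank-history columns to a single unnest_longer() call, and the two ISBN columns to another, which would change the row count shown above.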