For this I chose to look up all articles related to smoking from January 1, 2001 to August 30, 2001. I have hidden the R code containing the key in the HTML file and included a copy with a blank key on GitHub. The key is stored in the list qlist, along with the search query. I loaded the results into the smoke variable.
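Because the chunk that builds qlist is hidden, here is a minimal sketch of what such a list might look like. The search term, the exact date strings, and the environment variable name NYT_API_KEY are assumptions for illustration, not the values actually used; the parameter names q, begin_date, end_date and api-key are those of the Article Search API.

```r
# Hypothetical reconstruction of the query list; the actual values are not shown.
# NYT_API_KEY is an assumed environment variable name that keeps the key out of the script.
qlist <- list(
  q          = "smoking",     # search term (assumed)
  begin_date = "20010101",    # Article Search expects dates as YYYYMMDD
  end_date   = "20010830",
  `api-key`  = Sys.getenv("NYT_API_KEY")
)
names(qlist)
```

Reading the key from the environment is one way to share the script without a blank placeholder; it is a design choice, not a requirement of the API.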
smoke<-GET("https://api.nytimes.com/svc/search/v2/articlesearch.json", query=qlist)
smoke$status_code #200 means it is there
## [1] 200
head(smoke$content) #verifying the raw bits
## [1] 7b 22 73 74 61 74
smoke$cookies #unlike the website, the API does not serve cookies to inquiries.
## [1] domain flag path secure expiration name
## [7] value
## <0 rows> (or 0-length row.names)
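The raw bytes shown by head(smoke$content) can be decoded by hand to confirm that the body starts like JSON. A small sketch, with the six bytes typed in from the output above rather than taken from a live response:

```r
# The six bytes printed by head(smoke$content), entered manually
first_bytes <- as.raw(c(0x7b, 0x22, 0x73, 0x74, 0x61, 0x74))

# rawToChar() turns a raw vector into a character string
decoded <- rawToChar(first_bytes)
cat(decoded, "\n")   # prints {"stat
```

These are the opening characters of the `{"status": ...` object that begins the response.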
To look at smoke, we can either load it as a list to examine or extract it as text and run it through jsonlite. First as a list, looking at only one of the entries.
smokel<-content(smoke)
smokel[[3]][[1]][1]
## [[1]]
## [[1]]$web_url
## [1] "https://www.nytimes.com/2001/08/26/nyregion/county-lines-a-day-of-joy-clouded-by-smoke.html"
##
## [[1]]$snippet
## [1] "IT was too happy a day for such a sight. Early in the morning three weeks ago, my wife gave birth to our son. Hours later, I returned home to get my daughter. We collected a pile of presents for the baby and a separate but equal pile for the br..."
##
## [[1]]$abstract
## [1] "Marek Fuchs County Lines column on feeling joyful about seeing his newborn son but troubled by seeing ailing patients outside hospital smoking cigarettes; remembers childhood filled with smoke from his mother's two-pack-a-day habit; drawing (M)"
##
## [[1]]$print_page
## [1] "1"
##
## [[1]]$blog
## named list()
##
## [[1]]$source
## [1] "The New York Times"
##
## [[1]]$multimedia
## list()
##
## [[1]]$headline
## [[1]]$headline$main
## [1] "A Day of Joy, Clouded by Smoke"
##
## [[1]]$headline$kicker
## [1] "COUNTY LINES"
##
## [[1]]$headline$content_kicker
## NULL
##
## [[1]]$headline$print_headline
## NULL
##
## [[1]]$headline$name
## NULL
##
## [[1]]$headline$seo
## NULL
##
## [[1]]$headline$sub
## NULL
##
##
## [[1]]$keywords
## [[1]]$keywords[[1]]
## [[1]]$keywords[[1]]$name
## [1] "subject"
##
## [[1]]$keywords[[1]]$value
## [1] "SMOKING AND TOBACCO"
##
## [[1]]$keywords[[1]]$rank
## [1] 0
##
## [[1]]$keywords[[1]]$major
## NULL
##
##
##
## [[1]]$pub_date
## [1] "2001-08-26T00:00:00Z"
##
## [[1]]$document_type
## [1] "article"
##
## [[1]]$news_desk
## [1] "Westchester Weekly Desk"
##
## [[1]]$byline
## [[1]]$byline$original
## [1] "By MAREK FUCHS"
##
## [[1]]$byline$person
## [[1]]$byline$person[[1]]
## [[1]]$byline$person[[1]]$firstname
## [1] "Marek"
##
## [[1]]$byline$person[[1]]$middlename
## NULL
##
## [[1]]$byline$person[[1]]$lastname
## [1] "FUCHS"
##
## [[1]]$byline$person[[1]]$qualifier
## NULL
##
## [[1]]$byline$person[[1]]$title
## NULL
##
## [[1]]$byline$person[[1]]$role
## [1] "reported"
##
## [[1]]$byline$person[[1]]$organization
## [1] ""
##
## [[1]]$byline$person[[1]]$rank
## [1] 1
##
##
##
## [[1]]$byline$organization
## NULL
##
##
## [[1]]$type_of_material
## [1] "News"
##
## [[1]]$`_id`
## [1] "4fd20d558eb7c8105d77d5fc"
##
## [[1]]$word_count
## [1] 775
##
## [[1]]$score
## [1] 0.000180406
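To illustrate working with this nested list directly, the sketch below collects a few fields per article with lapply and stacks them into a data.frame. The object docs is a hand-built stand-in for smokel[[3]][[1]], trimmed to three fields with values copied from the output in this post; the choice of fields is an assumption made for the example.

```r
# Stand-in for smokel[[3]][[1]]: two trimmed entries with values from the output
docs <- list(
  list(headline = list(main = "A Day of Joy, Clouded by Smoke"),
       pub_date = "2001-08-26T00:00:00Z", word_count = 775L),
  list(headline = list(main = "BEACH SMOKING"),
       pub_date = "2001-07-08T00:00:00Z", word_count = 77L)
)

# Build one single-row data.frame per article, then stack the rows
rows <- lapply(docs, function(d) {
  data.frame(headline   = d$headline$main,
             pub_date   = as.Date(substr(d$pub_date, 1, 10)),
             word_count = d$word_count,
             stringsAsFactors = FALSE)
})
docs_df <- do.call(rbind, rows)
docs_df
```

On the real response, fields that come back as NULL (the kicker, for example) would need to be replaced with NA before calling data.frame, which is part of why this route is more work.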
While this could be turned into a data.frame via an lapply call or a loop, there is an easier way, as the content of smoke is JSON.
smoke$headers$`content-type`
## [1] "application/json;charset=UTF-8"
So, we use the content function with the “text” argument to get the body as a single character string of JSON. I first look at the first 200 characters as an example, then run the text through the fromJSON function. The fromJSON function returns a nested list, but I am only interested in the docs portion, which it simplifies to a data.frame.
smoket<-content(smoke,"text")
substr(smoket,1,200)
## [1] "{\"status\":\"OK\",\"copyright\":\"Copyright (c) 2018 The New York Times Company. All Rights Reserved.\",\"response\":{\"docs\":[{\"web_url\":\"https://www.nytimes.com/2001/08/26/nyregion/county-lines-a-day-of-joy-c"
smokedf<-fromJSON(smoket,simplifyDataFrame = TRUE)$response$docs
head(smokedf)
## web_url
## 1 https://www.nytimes.com/2001/08/26/nyregion/county-lines-a-day-of-joy-clouded-by-smoke.html
## 2 https://www.nytimes.com/2001/07/08/nyregion/briefing-law-enforcement-beach-smoking.html
## 3 https://www.nytimes.com/2001/06/07/opinion/l-no-smoking-zones-677442.html
## 4 https://www.nytimes.com/2001/07/08/nyregion/smokes-fan-the-flames.html
## 5 https://www.nytimes.com/2001/05/08/science/l-anti-smoking-additives-213888.html
## 6 https://www.nytimes.com/2001/04/01/magazine/social-studies-up-in-smoke.html
## snippet
## 1 IT was too happy a day for such a sight. Early in the morning three weeks ago, my wife gave birth to our son. Hours later, I returned home to get my daughter. We collected a pile of presents for the baby and a separate but equal pile for the br...
## 2 Belmar became the only New Jersey beach community and one of the few in the nation to limit smoking on the beach and boardwalk. Yet the police are simply issuing warnings rather than summonses this summer; violators could receive fines of up to $1...
## 3 To the Editor: The newly designated no-smoking section of Bryant Park (news article, June 2) sends an important message that extends beyond the park's boundaries: There is no fresh air when one is near a smoker in an open space. Smokers would u...
## 4 IT was midafternoon on a recent Sunday and more than 100 members of the Unkechaug Indian nation gathered on the pow-wow grounds at their tiny reservation in a corner of Mastic to celebrate the abundance of nature. ''It's our feast of the flowe...
## 5 To the Editor: Yes, jacking up the price of cigarettes may drive away a fraction of teenage smokers (''When Smoking Is a Matter of Money,'' May 1). But here's a foolproof way to end teenage smoking once and for all: Continue to allow the tobacc...
## 6 Guess who I saw online last week in the first-class smoking room on the A-deck of the Virtual Titanic? Leonardo DiCaprio, no less. He was standing with his back to me, contemplating a blazing hearth, a brandy in one hand, an unlighted cigarette in...
## abstract
## 1 Marek Fuchs County Lines column on feeling joyful about seeing his newborn son but troubled by seeing ailing patients outside hospital smoking cigarettes; remembers childhood filled with smoke from his mother's two-pack-a-day habit; drawing (M)
## 2 Belmar becomes only New Jersey beach community and one of few in US to limit smoking on beach and boardwalk; photo (S)
## 3 Anne-Marie Marion letter comments on June 2 article on newly designated no-smoking area in Manhattan's Bryant Park
## 4 Article on suspicious fires and threats to some tribal members of Unkechaug tribe in Mastic, Long Island, in effort to gain control of tax-free tobacco shops on Poospatuck Indian Reservation; Chief Harry Wallace, who opened first shop on reservation, says tribe's Internet domain is being used as cover for illegal drug trafficking; photos (M)
## 5 Sarah Crichton letter on disincentives to teen-age smoking (May 1 article)
## 6 Comment by Georgia Dullea, recovering nicotine addict, looks back on days of special lounges for smokers, who are now reduced to lurking in front of office buildings with their cigarettes; photos (special sectioin, Home Design) (part 2 of 2-part section) (M)
## print_page source multimedia headline.main
## 1 1 The New York Times NULL A Day of Joy, Clouded by Smoke
## 2 4 The New York Times NULL BEACH SMOKING
## 3 32 The New York Times NULL No-Smoking Zones
## 4 1 The New York Times NULL Smokes Fan the Flames
## 5 3 The New York Times NULL Anti-Smoking Additives
## 6 60 The New York Times NULL Up In Smoke
## headline.kicker headline.content_kicker
## 1 COUNTY LINES NA
## 2 BRIEFING: LAW ENFORCEMENT NA
## 3 <NA> NA
## 4 <NA> NA
## 5 <NA> NA
## 6 SOCIAL STUDIES NA
## headline.print_headline headline.name headline.seo headline.sub
## 1 NA NA NA NA
## 2 NA NA NA NA
## 3 NA NA NA NA
## 4 NA NA NA NA
## 5 NA NA NA NA
## 6 NA NA NA NA
## keywords
## 1 subject, SMOKING AND TOBACCO, 0, NA
## 2 glocations, subject, subject, subject, BELMAR (NJ), BOARDWALKS, BEACHES, SMOKING AND TOBACCO, 0, 0, 0, 0, NA, NA, NA, NA
## 3 persons, glocations, subject, subject, MARION, ANNE-MARIE, BRYANT PARK (NYC), PARKS AND OTHER RECREATION AREAS, SMOKING AND TOBACCO, 0, 0, 0, 0, NA, NA, NA, NA
## 4 persons, glocations, subject, subject, subject, subject, subject, subject, subject, subject, subject, WALLACE, HARRY, MASTIC (NY), INDIANS, AMERICAN, FIRES AND FIREMEN, TAXATION, THREATS AND THREATENING MESSAGES, ARSON, SMOKING AND TOBACCO, DRUG ABUSE AND TRAFFIC, UNKECHAUG TRIBE, COMPUTERS AND THE INTERNET, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
## 5 persons, subject, subject, CRICHTON, SARAH, CHILDREN AND YOUTH, SMOKING AND TOBACCO, 0, 0, 0, NA, NA, NA
## 6 subject, subject, SPECIAL SECTIONS, SMOKING AND TOBACCO, 0, 0, NA, NA
## pub_date document_type news_desk
## 1 2001-08-26T00:00:00Z article Westchester Weekly Desk
## 2 2001-07-08T00:00:00Z article New Jersey Weekly Desk
## 3 2001-06-07T00:00:00Z article Editorial Desk
## 4 2001-07-08T00:00:00Z article Long Island Weekly Desk
## 5 2001-05-08T00:00:00Z article Science Desk
## 6 2001-04-01T00:00:00Z article Home Design Magazine
## byline.original byline.person
## 1 By MAREK FUCHS Marek, NA, FUCHS, NA, NA, reported, , 1
## 2 By Karen DeMasters Karen, NA, DeMasters, NA, NA, reported, , 1
## 3 <NA> NULL
## 4 By MARY REINHOLZ Mary, NA, REINHOLZ, NA, NA, reported, , 1
## 5 <NA> NULL
## 6 By Georgia Dullea Georgia, NA, Dullea, NA, NA, reported, , 1
## byline.organization type_of_material _id word_count
## 1 NA News 4fd20d558eb7c8105d77d5fc 775
## 2 NA News 4fd24c528eb7c8105d7ec050 77
## 3 NA Letter 4fd20d548eb7c8105d77d593 63
## 4 NA News 4fd227658eb7c8105d7b03b9 1826
## 5 NA Letter 4fd2587c8eb7c8105d8043cf 98
## 6 NA News 4fd244a28eb7c8105d7df220 1069
## score
## 1 0.0001804060
## 2 0.0001666624
## 3 0.0001572481
## 4 0.0001493014
## 5 0.0001435403
## 6 0.0001408655
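The dotted column names (headline.main) and the comma-separated keyword cells in this printout come from how fromJSON simplifies nested JSON. The sketch below reproduces that on a small hand-written string; json_txt is a trimmed stand-in for the response text, not the real payload.

```r
library(jsonlite)

# Trimmed, hand-written stand-in for the API response text (not the real payload)
json_txt <- '{"status":"OK","response":{"docs":[
  {"headline":{"main":"A Day of Joy, Clouded by Smoke"},
   "keywords":[{"name":"subject","value":"SMOKING AND TOBACCO"}],
   "word_count":775},
  {"headline":{"main":"BEACH SMOKING"},
   "keywords":[{"name":"glocations","value":"BELMAR (NJ)"},
               {"name":"subject","value":"BOARDWALKS"}],
   "word_count":77}
]}}'

docs <- fromJSON(json_txt, simplifyDataFrame = TRUE)$response$docs
class(docs)            # one row per article
class(docs$headline)   # a nested JSON object becomes a data.frame column
class(docs$keywords)   # arrays of varying length become a list column
str(docs, max.level = 1)
```

The headline column is itself a data.frame, which print shows as headline.main, while keywords holds one small data.frame per article.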
Interestingly enough, smokedf, while a data.frame, has lists as elements, which is what jsonlite produces when a field holds more than one entry per article. Thus, we receive an error from the summary function, even though smokedf is a data.frame. The offending columns include keywords and multimedia.
class(smokedf)
## [1] "data.frame"
summary(smokedf)
## Error in max(wid): invalid 'type' (list) of argument
class(smokedf$keywords)
## [1] "list"
class(smokedf$multimedia)
## [1] "list"
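One way to find and work around such columns is sketched below on a small stand-in data.frame; df and the new column name keywords_flat are made up for the example, and the keyword values are copied from the output earlier in this post.

```r
# Small stand-in for smokedf: one atomic column and one list column
df <- data.frame(word_count = c(775L, 77L))
df$keywords <- list(
  c("SMOKING AND TOBACCO"),
  c("BELMAR (NJ)", "BOARDWALKS", "BEACHES", "SMOKING AND TOBACCO")
)

# Flag every column that is stored as a list
is_list_col <- vapply(df, is.list, logical(1))
names(df)[is_list_col]

# Summarise only the atomic columns
summary(df[, !is_list_col, drop = FALSE])

# Collapse the list column into one string per row so it behaves like text
df$keywords_flat <- vapply(df$keywords, paste, character(1), collapse = "; ")
df$keywords_flat
```

On the real smokedf the same is.list check would also flag nested data.frame columns such as headline, since a data.frame is a list underneath; jsonlite's flatten function is one option for those.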