Assignment – Web APIs
The New York Times web site provides a rich set of APIs, as described here: https://developer.nytimes.com/apis
You’ll need to start by signing up for an API key.
Your task is to choose one of the New York Times APIs, construct an interface in R to read in the JSON data, and transform it into an R DataFrame.
Data is retrieved using the URL for the Article Search API, my query (articles mentioning the word Woodside), and my API key. The three pieces are concatenated with paste() and stored.
article_api_url <- "https://api.nytimes.com/svc/search/v2/articlesearch.json?"
# The Article Search API takes the search term in the q parameter
api_query <- "q=Woodside&"
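# yourkey is assumed to hold the API key string obtained at sign-up;
# paste0() avoids the space that paste() inserts by default
api_key <- paste0("api-key=", yourkey)
nyt_api_url <- paste(article_api_url, api_query, api_key, sep = '')

The JSON response is read with fromJSON() from the jsonlite package and converted into a data frame with the columns listed below.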
| Column |
|---|
| status |
| copyright |
| response.docs.abstract |
| response.docs.web_url |
| response.docs.snippet |
| response.docs.lead_paragraph |
| response.docs.print_section |
| response.docs.print_page |
| response.docs.source |
| response.docs.multimedia |
| response.docs.keywords |
| response.docs.pub_date |
| response.docs.document_type |
| response.docs.news_desk |
| response.docs.section_name |
| response.docs.type_of_material |
| response.docs._id |
| response.docs.word_count |
| response.docs.uri |
| response.docs.slideshow_credits |
| response.docs.subsection_name |
| response.docs.headline.main |
| response.docs.headline.kicker |
| response.docs.headline.content_kicker |
| response.docs.headline.print_headline |
| response.docs.headline.name |
| response.docs.headline.seo |
| response.docs.headline.sub |
| response.docs.byline.original |
| response.docs.byline.person |
| response.docs.byline.organization |
| response.meta.hits |
| response.meta.offset |
| response.meta.time |
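
The parsing step described above is not shown in code, so here is a minimal sketch of how it might look. It assumes the response was read with `fromJSON(..., flatten = TRUE)` and wrapped in `as.data.frame()`, which would produce dotted column names like those in the table. The inline `sample_json` is a made-up, trimmed-down stand-in for a real response, so the block runs without an API key.

```r
library(jsonlite)

# Made-up, trimmed-down stand-in for an Article Search response
# (a real response carries many more fields per document).
sample_json <- '{
  "status": "OK",
  "copyright": "Copyright notice",
  "response": {
    "docs": [
      {"web_url": "https://example.com/a", "news_desk": "Metro",
       "word_count": 1200, "headline": {"main": "First headline"}},
      {"web_url": "https://example.com/b", "news_desk": "Sports",
       "word_count": 800, "headline": {"main": "Second headline"}}
    ],
    "meta": {"hits": 2, "offset": 0, "time": 5}
  }
}'

# flatten = TRUE turns the nested headline object into a headline.main column;
# as.data.frame() then prefixes each column with its path in the parsed list.
api_results <- as.data.frame(fromJSON(sample_json, flatten = TRUE))

names(api_results)
```

Against the live API, the same call would presumably take `nyt_api_url` in place of `sample_json`.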
I only need four columns for my inquiry, which I also rename for clearer context.
library(dplyr)  # provides %>%, select(), rename(), and arrange()

api_results %>%
  # keep only the four columns of interest
  select(response.docs.headline.main,
         response.docs.news_desk,
         response.docs.web_url,
         response.docs.word_count) %>%
  rename("Headline" = `response.docs.headline.main`,
         "Category" = `response.docs.news_desk`,
         "URL" = `response.docs.web_url`,
         "Word Count" = `response.docs.word_count`) %>%
  # longest articles first
  arrange(desc(`Word Count`)) %>%
  kableExtra::kable()

My conclusion is straightforward: I wanted to find the New York Times article with the highest word count and be able to link to it. The article in question is titled "The Virus Drove Churchgoers Away. Will Easter Bring Them Back?" and is honestly a good read.
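
To pull out just the longest article rather than reading it off the top of the sorted table, something like the following base R sketch could work. The `articles` data frame and its rows are made up for illustration; only the four column names come from the renaming step above.

```r
# Made-up rows standing in for the four selected, renamed columns.
articles <- data.frame(
  Headline = c("First headline", "Second headline", "Third headline"),
  Category = c("Metro", "Sports", "National"),
  URL = c("https://example.com/a",
          "https://example.com/b",
          "https://example.com/c"),
  `Word Count` = c(1200, 800, 2500),
  check.names = FALSE  # keep the space in "Word Count"
)

# which.max() returns the index of the first largest value.
longest <- articles[which.max(articles[["Word Count"]]), ]

longest$Headline
longest$URL
```

This returns a single row, so the headline and link can be used directly without rendering the whole table.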