Overview

Assignment – Web APIs

The New York Times web site provides a rich set of APIs, as described here: https://developer.nytimes.com/apis

You’ll need to start by signing up for an API key.

Your task is to choose one of the New York Times APIs, construct an interface in R to read in the JSON data, and transform it into an R DataFrame.

Import Data

Data is retrieved using the URL of the Article Search API, my query, which looks for articles mentioning the word Woodside, and my API key. The three parts are concatenated with paste() and stored.

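# Endpoint of the Article Search API
article_api_url <- "https://api.nytimes.com/svc/search/v2/articlesearch.json?"
api_query       <- "query=Woodside&"
# yourkey holds the personal API key; sep = "" avoids the space paste() inserts by default
api_key         <- paste("api-key=", yourkey, sep = "")
nyt_api_url     <- paste(article_api_url, api_query, api_key, sep = "")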
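The concatenation above works for a single-word query. As a minimal sketch, assuming the search term may contain spaces or reserved characters, the same URL could be built by a small helper that encodes the term first. The function name build_nyt_url and the environment variable NYT_API_KEY are hypothetical and chosen only for illustration; the query parameter name is copied from the code above.

```r
# Hypothetical helper: builds an Article Search URL from a search term and a key.
build_nyt_url <- function(term, key) {
  base_url <- "https://api.nytimes.com/svc/search/v2/articlesearch.json?"
  # reserved = TRUE also escapes characters such as "&" and "=" inside the term
  encoded_term <- utils::URLencode(term, reserved = TRUE)
  paste0(base_url, "query=", encoded_term, "&api-key=", key)
}

# Assumption: the key was saved beforehand in an environment variable named NYT_API_KEY
yourkey <- Sys.getenv("NYT_API_KEY")
nyt_api_url_encoded <- build_nyt_url("Woodside Queens", yourkey)
```

Reading the key from an environment variable keeps it out of the published report, which matters when the document is shared.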

The response is read with fromJSON() from the jsonlite package and converted into a data frame with the columns listed below.

library(jsonlite)  # fromJSON()
library(dplyr)     # %>%, select(), rename(), arrange()
api_results <-
  fromJSON(nyt_api_url, flatten = TRUE) %>%
  as.data.frame()
Columns of api_results:
status
copyright
response.docs.abstract
response.docs.web_url
response.docs.snippet
response.docs.lead_paragraph
response.docs.print_section
response.docs.print_page
response.docs.source
response.docs.multimedia
response.docs.keywords
response.docs.pub_date
response.docs.document_type
response.docs.news_desk
response.docs.section_name
response.docs.type_of_material
response.docs._id
response.docs.word_count
response.docs.uri
response.docs.slideshow_credits
response.docs.subsection_name
response.docs.headline.main
response.docs.headline.kicker
response.docs.headline.content_kicker
response.docs.headline.print_headline
response.docs.headline.name
response.docs.headline.seo
response.docs.headline.sub
response.docs.byline.original
response.docs.byline.person
response.docs.byline.organization
response.meta.hits
response.meta.offset
response.meta.time
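The dotted column names listed above come from flattening the nested JSON. A minimal sketch of that behaviour, using a small made-up payload (not real API output) that only mimics the shape of the response:

```r
library(jsonlite)

# A tiny, made-up payload shaped like the API response (not real data)
sample_json <- '{
  "status": "OK",
  "response": {
    "docs": [
      {"web_url": "https://example.com/a", "word_count": 10, "headline": {"main": "First"}},
      {"web_url": "https://example.com/b", "word_count": 20, "headline": {"main": "Second"}}
    ],
    "meta": {"hits": 2, "offset": 0}
  }
}'

# flatten = TRUE turns the nested "headline" object into "headline.main";
# as.data.frame() then prefixes each column with the names of its parent lists
sample_df <- as.data.frame(fromJSON(sample_json, flatten = TRUE))
names(sample_df)
```

Each article in docs becomes one row, while single values such as status and the meta fields are repeated on every row.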

I need only four columns for my inquiry, which I also rename for clarity. The rows are then sorted by word count in descending order.

api_results %>%
  # keep only the four columns of interest
  select(response.docs.headline.main,
         response.docs.news_desk,
         response.docs.web_url,
         response.docs.word_count) %>%
  # give the columns readable names
  rename("Headline"   = `response.docs.headline.main`,
         "Category"   = `response.docs.news_desk`,
         "URL"        = `response.docs.web_url`,
         "Word Count" = `response.docs.word_count`) %>%
  # longest articles first
  arrange(desc(`Word Count`)) %>%
  kableExtra::kable()

Headline | Category | URL | Word Count
The Virus Drove Churchgoers Away. Will Easter Bring Them Back? | Metro | https://www.nytimes.com/2021/04/03/nyregion/new-york-covid-church-easter.html | 1423
How the Virus Swept Through a Corner of Queens | Metro | https://www.nytimes.com/2020/12/07/nyregion/coronavirus-queens-epicenter.html | 944
How an Artisanal Doughnut Maker Spends Her Sundays | Metropolitan | https://www.nytimes.com/2021/02/26/nyregion/kora-doughnuts-nyc.html | 869
Filipino Comfort Food in Woodside | Dining In, Dining Out/Style Desk | https://www.nytimes.com/2005/01/05/dining/filipino-comfort-food-in-woodside.html | 706
Cheryl Dellasega, Stephen Woodside | Society | https://www.nytimes.com/2012/11/25/fashion/weddings/cheryl-dellasega-stephen-woodside-weddings.html | 341
Recent Commercial Real Estate Transactions | Business | https://www.nytimes.com/2021/01/19/business/new-york-commercial-real-estate.html | 241
Homes for Sale in Brooklyn, Manhattan and Queens | RealEstate | https://www.nytimes.com/2020/10/29/realestate/housing-market-nyc.html | 141
Woodside Cafe | Food | https://www.nytimes.com/slideshow/2016/09/14/dining/woodside-cafe-review.html | 0
Living in Woodside, Queens | Real Estate | https://www.nytimes.com/slideshow/2008/03/16/realestate/0316-LIVINGIN_index.html | 0
On the Market in New York City | Real Estate | https://www.nytimes.com/slideshow/2020/10/29/realestate/on-the-market-in-new-york-city.html | 0
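
The selecting, renaming and sorting steps can also be condensed. As a minimal sketch on made-up rows (mock_results is hypothetical and only mimics the shape of api_results), select() can rename columns while selecting them, and slice_max() returns the row with the largest word count directly:

```r
library(dplyr)

# Made-up rows that only mimic the shape of api_results (not real API output)
mock_results <- data.frame(
  response.docs.headline.main = c("Short piece", "Long piece", "Slideshow"),
  response.docs.news_desk     = c("Metro", "Food", "Real Estate"),
  response.docs.web_url       = c("https://example.com/1",
                                  "https://example.com/2",
                                  "https://example.com/3"),
  response.docs.word_count    = c(241, 1423, 0)
)

top_article <- mock_results %>%
  # select() can rename while selecting: new_name = old_name
  select(Headline     = response.docs.headline.main,
         Category     = response.docs.news_desk,
         URL          = response.docs.web_url,
         `Word Count` = response.docs.word_count) %>%
  # keep only the row with the largest word count
  slice_max(`Word Count`, n = 1)

top_article$Headline
```

This variant assumes a dplyr version that provides slice_max() (1.0.0 or later).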

Conclusion

My conclusion is straightforward: I wanted to find the article with the highest word count among the NYTimes results returned for my query and be able to link to it. That article is titled "The Virus Drove Churchgoers Away. Will Easter Bring Them Back?" and it is honestly a good read.