Overview

Assignment – Web APIs

The New York Times web site provides a rich set of APIs, as described here: https://developer.nytimes.com/apis

You’ll need to start by signing up for an API key.

Your task is to choose one of the New York Times APIs, construct an interface in R to read in the JSON data, and transform it into an R DataFrame.

Import Data

Data is retrieved from the Article Search API using three pieces: the API URL, my query (articles that mention the word Woodside), and my API key. They are concatenated with paste0() and stored.

# yourkey must already hold your NYT API key as a string
article_api_url <- "https://api.nytimes.com/svc/search/v2/articlesearch.json?"
api_query       <- "query=Woodside&"
# paste0() avoids the space that paste() would insert after "api-key="
api_key         <- paste0("api-key=", yourkey)
nyt_api_url     <- paste0(article_api_url, api_query, api_key)
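The URL assembly above can also be wrapped in a small helper. This is only a sketch under stated assumptions: build_nyt_url() is a hypothetical name, the NYT_API_KEY environment variable is an assumed place to keep the key out of the script, and the query parameter name is copied from the code above rather than checked against the API documentation.

```r
# Hypothetical helper: assembles the Article Search URL from a search term and a key.
build_nyt_url <- function(term, key) {
  base_url <- "https://api.nytimes.com/svc/search/v2/articlesearch.json?"
  # URLencode() escapes spaces and reserved characters in the search term
  encoded_term <- utils::URLencode(term, reserved = TRUE)
  paste0(base_url, "query=", encoded_term, "&api-key=", key)
}

# Assumption: the key is stored in an environment variable named NYT_API_KEY
yourkey <- Sys.getenv("NYT_API_KEY")
nyt_api_url <- build_nyt_url("Woodside", yourkey)

# Why paste0() matters: paste() separates its arguments with a space by default
spaced_key <- paste("api-key=", "abc")   # contains an unwanted space
clean_key  <- paste0("api-key=", "abc")  # no separator
```

Encoding the term means a multi-word query such as "Woodside Queens" still produces a valid URL.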

The response is parsed with fromJSON() from the jsonlite package and converted into a data frame with the columns listed below.

library(jsonlite)  # fromJSON()
library(dplyr)     # %>% and the verbs used later

api_results <-
  fromJSON(nyt_api_url, flatten = TRUE) %>%
  as.data.frame()

x
status
copyright
response.docs.abstract
response.docs.web_url
response.docs.snippet
response.docs.lead_paragraph
response.docs.print_section
response.docs.print_page
response.docs.source
response.docs.multimedia
response.docs.keywords
response.docs.pub_date
response.docs.document_type
response.docs.news_desk
response.docs.section_name
response.docs.type_of_material
response.docs._id
response.docs.word_count
response.docs.uri
response.docs.slideshow_credits
response.docs.subsection_name
response.docs.headline.main
response.docs.headline.kicker
response.docs.headline.content_kicker
response.docs.headline.print_headline
response.docs.headline.name
response.docs.headline.seo
response.docs.headline.sub
response.docs.byline.original
response.docs.byline.person
response.docs.byline.organization
response.meta.hits
response.meta.offset
response.meta.time
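The dotted column names come from flattening nested JSON objects. The sketch below shows the effect on a small hand-written JSON string, so it runs without an API key. The sample JSON is made up for illustration and only mimics the shape of the real response. It also shows an alternative to converting the whole response: pulling out the docs element directly, which gives shorter column names.

```r
library(jsonlite)

# Made-up JSON that imitates the nesting of the Article Search response
sample_json <- '{
  "status": "OK",
  "response": {
    "docs": [
      {"web_url": "https://example.com/a", "word_count": 120,
       "headline": {"main": "First", "kicker": "A"}},
      {"web_url": "https://example.com/b", "word_count": 45,
       "headline": {"main": "Second", "kicker": "B"}}
    ],
    "meta": {"hits": 2, "offset": 0}
  }
}'

parsed <- fromJSON(sample_json, flatten = TRUE)

# docs is already a data frame; the nested headline object has been
# flattened into columns such as headline.main
docs <- parsed$response$docs
names(docs)
```

Calling as.data.frame() on the whole parsed list, as in the code above, additionally prefixes each column with its parent names.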

I need only four columns for my inquiry, which I also rename for readability. The rows are then sorted by word count in descending order.

api_results %>%
  select(response.docs.headline.main,
         response.docs.news_desk,
         response.docs.web_url,
         response.docs.word_count) %>%
  rename("Headline"   = `response.docs.headline.main`,
         "Category"   = `response.docs.news_desk`,
         "URL"        = `response.docs.web_url`,
         "Word Count" = `response.docs.word_count`) %>%
  arrange(desc(`Word Count`)) %>%
  kableExtra::kable()

Headline	Category	URL	Word Count
The Virus Drove Churchgoers Away. Will Easter Bring Them Back?	Metro	https://www.nytimes.com/2021/04/03/nyregion/new-york-covid-church-easter.html	1423
How the Virus Swept Through a Corner of Queens	Metro	https://www.nytimes.com/2020/12/07/nyregion/coronavirus-queens-epicenter.html	944
How an Artisanal Doughnut Maker Spends Her Sundays	Metropolitan	https://www.nytimes.com/2021/02/26/nyregion/kora-doughnuts-nyc.html	869
Filipino Comfort Food in Woodside	Dining In, Dining Out/Style Desk	https://www.nytimes.com/2005/01/05/dining/filipino-comfort-food-in-woodside.html	706
Cheryl Dellasega, Stephen Woodside	Society	https://www.nytimes.com/2012/11/25/fashion/weddings/cheryl-dellasega-stephen-woodside-weddings.html	341
Recent Commercial Real Estate Transactions	Business	https://www.nytimes.com/2021/01/19/business/new-york-commercial-real-estate.html	241
Homes for Sale in Brooklyn, Manhattan and Queens	RealEstate	https://www.nytimes.com/2020/10/29/realestate/housing-market-nyc.html	141
Woodside Cafe	Food	https://www.nytimes.com/slideshow/2016/09/14/dining/woodside-cafe-review.html	0
Living in Woodside, Queens	Real Estate	https://www.nytimes.com/slideshow/2008/03/16/realestate/0316-LIVINGIN_index.html	0
On the Market in New York City	Real Estate	https://www.nytimes.com/slideshow/2020/10/29/realestate/on-the-market-in-new-york-city.html	0
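The select and rename steps can be combined, because dplyr's select() accepts new_name = old_name pairs. The sketch below applies this to a small made-up data frame (toy_results is hypothetical, with values invented for illustration). The filter() step is an optional addition, not part of the analysis above: it drops slideshow-style entries whose word count is 0.

```r
library(dplyr)

# Made-up stand-in for api_results, limited to the four columns of interest
toy_results <- data.frame(
  response.docs.headline.main = c("Short piece", "Long piece", "Slideshow"),
  response.docs.news_desk     = c("Metro", "Food", "Real Estate"),
  response.docs.web_url       = c("https://example.com/a",
                                  "https://example.com/b",
                                  "https://example.com/c"),
  response.docs.word_count    = c(241, 1423, 0),
  stringsAsFactors = FALSE
)

top_articles <- toy_results %>%
  # select() keeps and renames the columns in a single step
  select(Headline     = response.docs.headline.main,
         Category     = response.docs.news_desk,
         URL          = response.docs.web_url,
         `Word Count` = response.docs.word_count) %>%
  filter(`Word Count` > 0) %>%          # optional: drop zero-word entries
  arrange(desc(`Word Count`))           # longest article first

top_articles
```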

Conclusion

My conclusion is straightforward: I wanted to find the article with the highest word count among the results returned by the NYTimes API and be able to link to it. The article in question is titled "The Virus Drove Churchgoers Away. Will Easter Bring Them Back?" and it is honestly a good read.

DATA607 Week 9 Assignment

Gabriel Campos

April 11 2021
