Assignment on RPubs
Rmd on GitHub

Introduction

This assignment uses the New York Times’ set of APIs to retrieve JSON data and transform it into an R data frame. We will use the New York Times Top Stories API and select the world section. An example call looks like this: https://api.nytimes.com/svc/topstories/v2/world.json?api-key=yourkey, where yourkey is replaced with your own API key.

Import

Let’s import the data from the New York Times.

library(jsonlite)


# Request the JSON from the NY Times by constructing the call from the base URL and the API key
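# Read the key from an environment variable rather than hard-coding it
# (NYT_API_KEY is an assumed variable name; set it to your own key)
apikey <- Sys.getenv("NYT_API_KEY")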
baseurl <- "https://api.nytimes.com/svc/topstories/v2/world.json?api-key="

url <- URLencode(paste0(baseurl, apikey))
request <- fromJSON(url)
stories <- request$results


This is how the top world stories look after being read with the fromJSON function from the jsonlite package.

stories
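
To see what fromJSON does without calling the API, here is a minimal sketch that parses a small made-up JSON string. The string is only loosely shaped like a Top Stories response; its fields and values are illustrative assumptions, not real API data.

```r
library(jsonlite)

# Made-up JSON, loosely shaped like a Top Stories response (not real API data)
sample_json <- '{
  "status": "OK",
  "num_results": 2,
  "results": [
    {"section": "world", "title": "First example headline", "url": "https://example.com/a"},
    {"section": "world", "title": "Second example headline", "url": "https://example.com/b"}
  ]
}'

parsed <- fromJSON(sample_json)

# The top-level JSON object becomes a named list
class(parsed)

# The array of objects under "results" is simplified into a data frame,
# with one row per object and one column per field
class(parsed$results)
dim(parsed$results)
```

This is why request$results above can be used as a data frame directly: jsonlite simplifies an array of similarly shaped objects into rows and columns by default.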

Let’s look at particular aspects of the data frame.

# These are the column names
colnames(stories)
##  [1] "section"             "subsection"          "title"              
##  [4] "abstract"            "url"                 "uri"                
##  [7] "byline"              "item_type"           "updated_date"       
## [10] "created_date"        "published_date"      "material_type_facet"
## [13] "kicker"              "des_facet"           "org_facet"          
## [16] "per_facet"           "geo_facet"           "multimedia"         
## [19] "short_url"
# This is the data type of the object
class(stories)
## [1] "data.frame"
# Let's look at the title, abstract, and url (columns 3 to 5) of the first 10 stories
stories[1:10, 3:5]
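
Selecting columns by position depends on the column order the API happens to return. As a hedged alternative, the same selection can be written with column names. The sketch below uses a small made-up data frame (toy_stories is a hypothetical stand-in for stories) so it runs without an API key.

```r
# Hypothetical stand-in for the stories data frame (made-up rows)
toy_stories <- data.frame(
  section    = c("world", "world", "world"),
  subsection = c("europe", "asia", "africa"),
  title      = c("Headline A", "Headline B", "Headline C"),
  abstract   = c("Summary A", "Summary B", "Summary C"),
  url        = c("https://example.com/a", "https://example.com/b", "https://example.com/c"),
  stringsAsFactors = FALSE
)

# Select the first two rows and three columns by name instead of by position
by_name <- toy_stories[1:2, c("title", "abstract", "url")]

colnames(by_name)
nrow(by_name)
```

Selecting by name keeps working if the column order changes, and it makes the intent of the selection visible in the code.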


Let’s look at how many of the top stories have Coronavirus in their title.

library(stringr)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
coronavirus <- stories %>% 
  filter(str_detect(title, "Coronavirus"))

cNumber <- nrow(coronavirus)

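There are 11 stories among the top stories with "Coronavirus" in their title. Note that str_detect is case-sensitive by default, so a title that only contains the lowercase "coronavirus" would not be counted.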
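
The count depends on how the pattern is matched. The sketch below compares a case-sensitive match with a case-insensitive one using stringr's regex() helper. The titles are made up for illustration and are not real headlines.

```r
library(stringr)

# Made-up titles for illustration (not real headlines)
toy_titles <- c(
  "Coronavirus Live Updates",
  "How the coronavirus spread",
  "Markets rally on trade news",
  "CORONAVIRUS cases rise"
)

# Case-sensitive: only the exact capitalization "Coronavirus" matches
exact_count <- sum(str_detect(toy_titles, "Coronavirus"))
exact_count
# → 1

# Case-insensitive: any capitalization of "coronavirus" matches
any_case_count <- sum(str_detect(toy_titles, regex("coronavirus", ignore_case = TRUE)))
any_case_count
# → 3
```

Whether the case-insensitive count is the right one depends on the question being asked; for headlines, where capitalization styles vary, it is usually the safer choice.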



Conclusion

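jsonlite makes it straightforward to work with APIs that return JSON. A single fromJSON call retrieved the Top Stories response and converted its results into a data frame that could be filtered with dplyr and stringr.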