Assignment on RPubs
Rmd on GitHub
This assignment uses the New York Times APIs to retrieve JSON data and transform it into an R data frame. We will use the Top Stories API and select the world section. An example call is https://api.nytimes.com/svc/topstories/v2/world.json?api-key=yourkey, where yourkey is replaced with your own API key.
Let’s import the data from the New York Times.
library(jsonlite)
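# Build the request URL for the NY Times Top Stories API, then fetch and parse the JSON
# Read the API key from an environment variable rather than hard-coding it
# (the variable name NYT_API_KEY is an assumption; set it to your own key)
apikey <- Sys.getenv("NYT_API_KEY")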
baseurl <- "https://api.nytimes.com/svc/topstories/v2/world.json?api-key="
url <- URLencode(paste0(baseurl, apikey))
request <- fromJSON(url)
stories <- request$results
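To see what fromJSON does with a response of this shape without calling the API, here is a minimal sketch. The JSON string is a hand-written stand-in, not real New York Times data, and the names sample_json, parsed, and sample_stories are hypothetical. It shows that an array of objects under results becomes a data frame, and that a field holding arrays of differing lengths (des_facet here) comes back as a list column.

```r
library(jsonlite)

# Hand-written stand-in for an API response (assumed shape, not real NYT data)
sample_json <- '{
  "status": "OK",
  "num_results": 2,
  "results": [
    {"section": "world", "title": "First headline",
     "des_facet": ["Politics", "Elections"]},
    {"section": "world", "title": "Second headline",
     "des_facet": ["Health"]}
  ]
}'

# fromJSON parses the text and simplifies the array of objects into a data frame
parsed <- fromJSON(sample_json)
sample_stories <- parsed$results

class(sample_stories)          # "data.frame"
nrow(sample_stories)           # one row per object in the results array
sapply(sample_stories, class)  # des_facet is a list column, the others are character
```

List columns such as des_facet hold one vector per row, so they usually need to be unnested or collapsed before they can be written to a flat file.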
This is how the top world stories look after being parsed with the fromJSON function from the jsonlite package.
stories
Let’s look at particular aspects of the data frame.
# These are the column names
colnames(stories)
## [1] "section" "subsection" "title"
## [4] "abstract" "url" "uri"
## [7] "byline" "item_type" "updated_date"
## [10] "created_date" "published_date" "material_type_facet"
## [13] "kicker" "des_facet" "org_facet"
## [16] "per_facet" "geo_facet" "multimedia"
## [19] "short_url"
# This is the data type of the object
class(stories)
## [1] "data.frame"
# Let's look at the title, abstract, and URL of the first 10 stories
stories[1:10, 3:5]
Let’s look at how many of the top stories have Coronavirus in their title.
library(stringr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
coronavirus <- stories %>%
  filter(str_detect(title, "Coronavirus"))
cNumber <- nrow(coronavirus)
There are 11 stories in the top stories with Coronavirus in their title.
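Note that str_detect with a plain string pattern is case-sensitive, so the count only includes titles that spell "Coronavirus" with a capital C. As a hedged sketch of a broader match, the example below uses a small made-up data frame (sample_titles, exact, and broad are hypothetical names, and the titles are invented) and stringr's regex() helper with ignore_case = TRUE.

```r
library(stringr)
library(dplyr)

# Made-up titles for illustration only (not taken from the API)
sample_titles <- data.frame(
  title = c("Coronavirus Live Updates",
            "How the coronavirus spread",
            "Election results",
            "Covid-19 cases rise"),
  stringsAsFactors = FALSE
)

# Case-sensitive match, as in the code above
exact <- sample_titles %>%
  filter(str_detect(title, "Coronavirus"))

# Case-insensitive match that also accepts "covid"
broad <- sample_titles %>%
  filter(str_detect(title, regex("coronavirus|covid", ignore_case = TRUE)))

nrow(exact)  # 1
nrow(broad)  # 3
```

Whether the broader pattern is appropriate depends on the question being asked; it would count stories that the original filter deliberately or accidentally leaves out.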
jsonlite is a convenient package for working with APIs that return their data as JSON.