Assignment – Web APIs

The New York Times website provides a rich set of APIs, described at https://developer.nytimes.com/apis (you'll need to start by signing up for an API key). Your task is to choose one of the New York Times APIs, construct an interface in R to read in the JSON data, and transform it into an R data frame.

Introduction

I will call the Top Stories API to read the JSON data and transform it into an R data frame. Specifically, I will work with the sports section endpoint: https://api.nytimes.com/svc/topstories/v2/sports.json?api-key=WSkYmmMA8AoszM0tsfyPRX33G3GNSSzJ

My API key is WSkYmmMA8AoszM0tsfyPRX33G3GNSSzJ
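
A key written into the source ends up in every shared copy of the report. As a minimal sketch, the request URL could instead be assembled from a key held in an environment variable. The variable name `NYT_API_KEY` and the `YOUR_API_KEY` placeholder are assumptions for illustration, not something the NYT API prescribes.

```r
# Assumption: the key was stored beforehand in an environment variable
# named NYT_API_KEY (a hypothetical name), e.g. through a line in ~/.Renviron.
# If the variable is not set, a visible placeholder is used instead.
api_key <- Sys.getenv("NYT_API_KEY", unset = "YOUR_API_KEY")

# Build the same sports endpoint URL used in this report.
api_url <- paste0(
  "https://api.nytimes.com/svc/topstories/v2/sports.json?api-key=",
  api_key
)
```

With this in place, the rest of the workflow can use `api_url` unchanged.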

Loading the required packages

library(httr)
library(jsonlite)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(kableExtra)
## Warning: package 'kableExtra' was built under R version 4.0.4
## 
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
## 
##     group_rows

Reading and Transforming Data

api_url <- "https://api.nytimes.com/svc/topstories/v2/sports.json?api-key=WSkYmmMA8AoszM0tsfyPRX33G3GNSSzJ"

# Read JSON data from the API
json_object <- fromJSON(api_url)
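
Passing the URL straight to `fromJSON()` works, but a failed request (for example an invalid key) then only surfaces as a parsing or connection error. A possible alternative, sketched below, goes through `httr` so the HTTP status can be checked first. The helper names `top_stories_url()` and `fetch_top_stories()` and the `NYT_API_KEY` environment variable are hypothetical and not part of any package.

```r
library(httr)
library(jsonlite)

# Hypothetical helper: build the Top Stories URL for a given section.
top_stories_url <- function(section) {
  sprintf("https://api.nytimes.com/svc/topstories/v2/%s.json", section)
}

# Hypothetical helper: request one section and return its `results` element.
# Assumption: the key is available in the NYT_API_KEY environment variable.
fetch_top_stories <- function(section, api_key = Sys.getenv("NYT_API_KEY")) {
  if (!nzchar(api_key)) {
    stop("No API key found; set NYT_API_KEY or pass api_key explicitly.")
  }
  response <- GET(top_stories_url(section), query = list(`api-key` = api_key))
  stop_for_status(response)  # raise an R error on any 4xx/5xx response
  body <- content(response, as = "text", encoding = "UTF-8")
  fromJSON(body)$results
}

# Usage (performs a network call, so it is left commented out):
# sports_stories <- fetch_top_stories("sports")
```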

# Convert JSON to a data frame
travel_df <- json_object$results
class(travel_df) # Confirm it is a data frame
## [1] "data.frame"
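
The `results` element is already a data frame because `fromJSON()` by default simplifies a JSON array of similar objects into one. The sketch below illustrates this offline with a small, made-up JSON string. The field values are invented for illustration and are not real API output.

```r
library(jsonlite)

# Made-up sample shaped loosely like a Top Stories response.
sample_json <- '{
  "status": "OK",
  "results": [
    {"section": "sports", "title": "Story A", "byline": "By Reporter One"},
    {"section": "sports", "title": "Story B", "byline": "By Reporter Two"}
  ]
}'

parsed <- fromJSON(sample_json)

class(parsed)          # the top-level JSON object becomes a named list
class(parsed$results)  # the array of records becomes a data frame
```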
# Column names
colnames(travel_df)
##  [1] "section"             "subsection"          "title"              
##  [4] "abstract"            "url"                 "uri"                
##  [7] "byline"              "item_type"           "updated_date"       
## [10] "created_date"        "published_date"      "material_type_facet"
## [13] "kicker"              "des_facet"           "org_facet"          
## [16] "per_facet"           "geo_facet"           "multimedia"         
## [19] "short_url"
# Select relevant columns
travel_df <- subset(travel_df, select = c(section, title, abstract, url, byline))

# Show up to the first 50 records (fewer if the API returns fewer)
head(travel_df, 50)
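
The column selection above keeps only plain character columns. Array-valued fields such as `des_facet` would typically arrive as list columns, which do not print or export cleanly. If one of them were needed, a possible approach is to collapse each element into a single string, as sketched here on made-up sample data (the values are invented and the `"; "` separator is an arbitrary choice).

```r
library(jsonlite)

# Made-up sample: two records whose des_facet arrays differ in length,
# so fromJSON() keeps the field as a list column.
sample_json <- '[
  {"title": "Story A", "des_facet": ["Baseball", "Playoff Games"]},
  {"title": "Story B", "des_facet": ["Tennis"]}
]'
sample_df <- fromJSON(sample_json)

is.list(sample_df$des_facet)  # check that this is a list column

# Collapse each character vector into one "; "-separated string.
sample_df$des_facet <- vapply(
  sample_df$des_facet, paste, character(1), collapse = "; "
)

sample_df
```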