The New York Times web site provides a rich set of APIs, as described here: http://developer.nytimes.com/docs
You’ll need to start by signing up for an API key.
NYT email:
Your task is to choose one of the New York Times APIs, construct an interface in R to read in the JSON data, and transform it to an R dataframe.
To complete this assignment, I decided to find an answer to the following question:
The analysis covers 2018.
library(httr)
library(dplyr)
library(stringr)
library(ggplot2)
library(kableExtra)
First, we retrieve articles about Halliburton using the New York Times API and place them in a data frame.
url<-"http://api.nytimes.com/svc/search/v2/articlesearch.json?q=halliburton&fl=web_url,headline,pub_date&begin_date=20180101&api-key=2537936837bc4979adef9cad2b72d0c4"
r<-GET(url)
rtable<-content(r)$response$docs
# Collect one row per article: URL, main headline, and publication date (YYYY-MM-DD).
df<-data.frame()
# seq_along() avoids iterating over c(1,0) when the API returns no articles.
for (i in seq_along(rtable)) {
doc<-rtable[[i]]
df<-rbind(df,data.frame(URL=as.character(doc$web_url),Headline=as.character(doc$headline$main),Date=str_extract(doc$pub_date,"[:digit:]+-[:digit:]+-[:digit:]+")))
}
names(df)<-c("URL","Headline","Date")
# tbl_df() is deprecated in current dplyr; as_tibble() is its replacement.
df<-as_tibble(df)
df %>% kable() %>% kable_styling() %>% scroll_box(width = "910px",height="400px")
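The conversion from the parsed JSON to a data frame can also be written as a small function that looks fields up by name instead of by position. The following is a minimal sketch in base R, not the method used above: `docs_to_df` and `sample_docs` are hypothetical names, the sample records are made up to mirror the shape of `content(r)$response$docs`, and the field names `web_url`, `headline$main`, and `pub_date` are taken from the `fl` parameter of the request.

```r
# Convert a list of article records into a data frame with one row per article.
docs_to_df <- function(docs) {
  rows <- lapply(docs, function(doc) {
    data.frame(
      URL = doc$web_url,
      Headline = doc$headline$main,
      # Keep only the leading YYYY-MM-DD part of the timestamp.
      Date = substr(doc$pub_date, 1, 10),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Hypothetical sample records shaped like the API response; not real articles.
sample_docs <- list(
  list(web_url = "https://example.com/a",
       headline = list(main = "First headline"),
       pub_date = "2018-03-01T10:00:00+0000"),
  list(web_url = "https://example.com/b",
       headline = list(main = "Second headline"),
       pub_date = "2018-03-02T12:30:00+0000")
)

sample_df <- docs_to_df(sample_docs)
print(sample_df)
```

Looking fields up by name keeps the code working if the API returns the fields in a different order, and `stringsAsFactors = FALSE` keeps the dates as plain character strings.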
Second, we get the Halliburton stock price from a different API (Alpha Vantage) and build a data frame of the daily difference between the closing and opening prices.
url<-"https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&symbol=HAL&outputsize=full&apikey=E577HED0QHU9QQA2"
r<-GET(url)
rtable<-content(r)
# The second element of the response holds the daily series, keyed by date.
series<-rtable[[2]]
dfPrice<-data.frame()
for (i in seq_along(series)) {
# Daily change = close (4th field) minus open (1st field).
dfPrice<-rbind(dfPrice,data.frame(Date=names(series)[i],Price_Delta=as.numeric(series[[i]][[4]])-as.numeric(series[[i]][[1]])))
}
names(dfPrice)<-c("Date","Price_Delta")
# tbl_df() is deprecated in current dplyr; as_tibble() is its replacement.
dfPrice<-as_tibble(dfPrice)
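The price delta can be computed without a loop. The sketch below is one possible base R version, under the assumption that each daily entry is keyed `1. open` through `5. volume`, which is what positions 1 and 4 in the loop above appear to correspond to. `series_to_delta` and `sample_series` are hypothetical names, and the prices are invented for illustration.

```r
# Turn a date-keyed list of daily quotes into a data frame of close minus open.
series_to_delta <- function(series) {
  delta <- vapply(
    series,
    function(day) as.numeric(day[["4. close"]]) - as.numeric(day[["1. open"]]),
    numeric(1),
    USE.NAMES = FALSE  # keep the dates out of the row names
  )
  data.frame(Date = names(series), Price_Delta = delta, stringsAsFactors = FALSE)
}

# Hypothetical sample shaped like the daily series; not real quotes.
sample_series <- list(
  "2018-03-02" = list("1. open" = "46.00", "2. high" = "47.00", "3. low" = "45.50",
                      "4. close" = "46.50", "5. volume" = "1000"),
  "2018-03-01" = list("1. open" = "47.00", "2. high" = "47.25", "3. low" = "46.00",
                      "4. close" = "46.25", "5. volume" = "1200")
)

sample_prices <- series_to_delta(sample_series)
print(sample_prices)
```

Building the whole column at once avoids calling `rbind` once per trading day, which matters here because `outputsize=full` returns many years of daily data.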
Finally, we join both data frames by date; the final data frame shows each article alongside that day's change in stock price.
dfFinal<-left_join(df,dfPrice)
## Joining, by = "Date"
## Warning: Column `Date` joining factors with different levels, coercing to
## character vector
dfFinal %>% kable() %>% kable_styling() %>% scroll_box(width = "910px",height="400px")
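A left join keeps every article even when there is no matching trading day, so an article published on a weekend or market holiday ends up with a missing price delta. The sketch below illustrates that behavior with base R's `merge(..., all.x = TRUE)`, used here as a stand-in for `left_join` so that the example needs no packages. The data frames `articles` and `prices` and their contents are hypothetical.

```r
# Hypothetical articles: one on a Friday, one on a Saturday.
articles <- data.frame(
  Headline = c("Friday story", "Saturday story"),
  Date = c("2018-03-02", "2018-03-03"),
  stringsAsFactors = FALSE
)

# Hypothetical price deltas: trading days only.
prices <- data.frame(
  Date = c("2018-03-01", "2018-03-02"),
  Price_Delta = c(-0.75, 0.5),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every row of `articles`, like a left join on Date.
joined <- merge(articles, prices, by = "Date", all.x = TRUE)
print(joined)
```

Storing both `Date` columns as character strings, rather than factors, also avoids the coercion warning shown in the output above.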