The New York Times web site provides a rich set of APIs, as described here: http://developer.nytimes.com/docs. You’ll need to start by signing up for an API key. Your task is to choose one of the New York Times APIs, construct an interface in R to read in the JSON data, and transform it into an R data frame.
library(jsonlite)
library(httr)
library(knitr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
I will use the Article Search API (articlesearch) and query it with the search term election. The GET function from httr sends the request to the web site. I needed to create an account in order to get an API key.
nyt_elec<- GET('https://api.nytimes.com/svc/search/v2/articlesearch.json?q=election&api-key=2mLZpTsYEuzfKwJ6NOXZlZ4ZbYW73OgE')
nyt_elec
## Response [https://api.nytimes.com/svc/search/v2/articlesearch.json?q=election&api-key=2mLZpTsYEuzfKwJ6NOXZlZ4ZbYW73OgE]
## Date: 2019-10-24 00:03
## Status: 200
## Content-Type: application/json;charset=UTF-8
## Size: 219 kB
The response status is 200, which means the request succeeded.
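As a hedged variant of the request above, the sketch below assembles the request URL from its parts and reads the key from an environment variable instead of writing it into the script. The helper name build_search_url and the environment variable name NYT_API_KEY are assumptions made for illustration; neither appears in the original code.

```r
# Hypothetical helper: assemble an Article Search URL from its parts.
build_search_url <- function(query, api_key, page = NULL) {
  base <- "https://api.nytimes.com/svc/search/v2/articlesearch.json"
  # Percent-encode the search term so spaces and symbols are safe in a URL
  params <- c(q = utils::URLencode(query, reserved = TRUE))
  if (!is.null(page)) params <- c(params, page = as.character(page))
  params <- c(params, `api-key` = api_key)
  paste0(base, "?", paste0(names(params), "=", params, collapse = "&"))
}

# NYT_API_KEY is an assumed variable name; set it outside the script.
# "YOUR_KEY" is only a placeholder used when the variable is not set.
api_key <- Sys.getenv("NYT_API_KEY", unset = "YOUR_KEY")
url <- build_search_url("election", api_key)

# With httr loaded, the request and a status check would then look like:
# resp <- GET(url)
# stop_for_status(resp)   # raises an error unless the request succeeded
```

Keeping the key out of the source also keeps it out of the printed response line, which echoes the full request URL.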
Next, I will transform the JSON data into an R data frame.
nyt_elec<- rawToChar(nyt_elec$content)
nyt_elec <- fromJSON(nyt_elec, simplifyVector = TRUE)
data <- data.frame(nyt_elec)
data <- select(data, "response.docs.web_url","response.docs.snippet","response.docs.lead_paragraph","response.docs.pub_date","response.docs.type_of_material")
data
## response.docs.web_url
## 1 https://www.nytimes.com/2019/08/29/opinion/fec-trump.html
## 2 https://www.nytimes.com/2019/09/29/us/fec-chairwoman-twitter-memo.html
## 3 https://www.nytimes.com/2019/08/26/us/politics/federal-election-commission.html
## 4 https://www.nytimes.com/2019/10/11/us/politics/when-are-the-2020-presidential-debates.html
## 5 https://www.nytimes.com/2019/09/16/opinion/canada-election-prime-minister.html
## 6 https://www.nytimes.com/2019/10/22/world/canada/trudeau-re-elected.html
## 7 https://www.nytimes.com/2019/10/18/technology/amazon-seattle-council-election.html
## 8 https://www.nytimes.com/2019/09/30/upshot/democrats-2020-losing-independents.html
## 9 https://www.nytimes.com/2019/09/20/world/canada/justin-pierre-trudeau-reelection.html
## 10 https://www.nytimes.com/2019/10/15/world/canada/canadian-voters-election.html
## response.docs.snippet
## 1 With three of six seats vacant — and the 2020 elections looming — the Federal Election Commission cannot do its job.
## 2 The election commission’s Democratic chairwoman said a Republican commissioner had blocked official publication of the memo.
## 3 The Federal Election Commission is supposed to monitor how candidates raise and spend money, but it will no longer have enough commissioners to legally meet.
## 4 The presidential debates will be held in Indiana, Michigan and Tennessee, and a vice-presidential debate will be held in Utah.
## 5 The campaign for prime minister just started. And it’s almost over.
## 6 An urban vs. rural split, along with increasing regionalism, has taken hold in a country celebrated for social cohesion.
## 7 The hometown tech giant has contributed more than $1.4 million this year to influence City Council races.
## 8 A survey experiment shows that some independents are already being turned off.
## 9 Justin Trudeau experienced a dramatic campaign setback with the revelation of blackface photos from his past, he faces many re-election challenges Pierre Trudeau faced in 1972
## 10 After a contentious, scandal-filled campaign for prime minister, many voters say they feel disillusioned.
## response.docs.lead_paragraph
## 1 The United States is headed into what promises to be among the most contentious and expensive campaign cycles in modern history — with foreign and domestic actors eager to make mischief — without the chief elections cop on the beat.
## 2 The Federal Election Commission chairwoman, Ellen L. Weintraub, on Friday took the dramatic step of using Twitter to release the entire draft of a memo addressing foreign election interference after its disputed publication in the agency’s weekly digest.
## 3 The Federal Election Commission, the beleaguered independent agency that is supposed to serve as the watchdog over how money is raised and spent in American elections, has long been criticized as dysfunctional, if not toothless.
## 4 The nonpartisan commission that governs presidential debates announced on Friday the dates and sites for the three presidential debates and one vice-presidential debate during the 2020 general election.
## 5 MONTREAL — Tired of the presidential election? Ready to scream if you hear another word from Marianne Williamson? So beaten down that the sight of 10 candidates on another debate stage might send you to counseling?
## 6 MONTREAL — Canada is often viewed as a model of harmony. Monday’s election suggests a more divided country.
## 7 Each week, we review the week’s news, offering analysis about the most important developments in the tech industry.
## 8 One defining feature of the Democratic primary so far has been the party’s leftward turn. In recent debates, candidates have supported policies like offering health insurance to undocumented immigrants, and commenters have warned about the potential electoral penalty of repelling persuadable voters.
## 9 If nothing else, it was a week of Canada being at the center of worldwide attention, if not for desirable reasons.
## 10 Canadians will head to the polls on Monday to elect their next government ending a contentious campaign that has been heavy on personalities and opposition attacks, and light on issues and policy.
## response.docs.pub_date response.docs.type_of_material
## 1 2019-08-29T21:01:04+0000 Editorial
## 2 2019-09-29T15:08:22+0000 News
## 3 2019-08-26T23:35:29+0000 News
## 4 2019-10-11T17:25:25+0000 News
## 5 2019-09-16T19:09:04+0000 Op-Ed
## 6 2019-10-22T18:00:48+0000 News
## 7 2019-10-18T13:00:07+0000 News
## 8 2019-09-30T09:00:10+0000 News
## 9 2019-09-21T02:04:30+0000 News
## 10 2019-10-15T21:46:34+0000 News
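The conversion step can also be tried offline. The sketch below parses a small hand-written JSON string that only mimics the nesting of the response used above (a response object holding a docs array); the URLs and text in it are placeholders, not real API output. It shows that fromJSON already returns docs as a data frame, so the columns can be picked without the response.docs. prefix.

```r
library(jsonlite)

# Hypothetical JSON with the same nesting as the response above;
# every value here is a placeholder, not real data.
sample_json <- '{
  "status": "OK",
  "response": {
    "docs": [
      {"web_url": "https://example.com/a", "snippet": "first",
       "pub_date": "2019-01-01T00:00:00+0000"},
      {"web_url": "https://example.com/b", "snippet": "second",
       "pub_date": "2019-01-02T00:00:00+0000"}
    ]
  }
}'

# An array of JSON objects is simplified to a data frame;
# flatten = TRUE spreads any nested objects into ordinary columns.
parsed <- fromJSON(sample_json, flatten = TRUE)
docs <- parsed$response$docs

# Keep only the columns of interest, one row per article
articles <- docs[, c("web_url", "snippet", "pub_date")]
str(articles)
```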
Using a for loop, I will collect more data. Since a single page has 10 observations and I loop over 50 pages, a total of 500 articles are collected. Before the loop, I create an empty data frame with columns for the web URL, snippet, lead paragraph, publication date, and document type.
data <- data.frame(web_url= character(),snippet= character(), lead_paragraph = character(),pub_date = character(),type = character())
page <- 50
for (i in 1:page){
  # Request page i of the search results
  url <- paste0("https://api.nytimes.com/svc/search/v2/articlesearch.json?q=election", "&page=", i, "&api-key=2mLZpTsYEuzfKwJ6NOXZlZ4ZbYW73OgE")
  nyt_election <- GET(url)
  # Parse the JSON body and convert it to a data frame
  nyt_election <- rawToChar(nyt_election$content)
  nyt_election <- fromJSON(nyt_election, simplifyVector = TRUE)
  d <- data.frame(nyt_election)
  # Keep the columns of interest
  web_url <- d$response.docs.web_url
  snippet <- d$response.docs.snippet
  lead_paragraph <- d$response.docs.lead_paragraph
  pub_date <- d$response.docs.pub_date
  type <- d$response.docs.document_type
  # Append this page's rows to the accumulated data frame
  data <- rbind(data, as.data.frame(cbind(web_url,snippet,lead_paragraph,pub_date,type)))
}
This is the data after it has been transformed into a data frame.
head(data)
## web_url
## 1 https://www.nytimes.com/2019/10/21/world/canada/canada-election.html
## 2 https://www.nytimes.com/reuters/2019/10/21/world/europe/21reuters-canada-election-factbox.html
## 3 https://www.nytimes.com/2019/10/16/world/canada/obama-trudeau-election.html
## 4 https://www.nytimes.com/aponline/2019/10/17/world/ap-cn-ap-explains-canada-election.html
## 5 https://www.nytimes.com/2019/09/17/world/middleeast/israel-photos-communities-elections.html
## 6 https://www.nytimes.com/2019/09/27/world/canada/election-debates-2019.html
## snippet
## 1 Prime Minister Justin Trudeau faces the political battle of his life in a tight race.
## 2 Prime Minister Justin Trudeau formally launched the race for the Oct. 21 election on Wednesday. Here is a breakdown of how Canadian elections work and the possible outcomes.
## 3 Former President Barack Obama said “the world needs his progressive leadership now” of Prime Minister Justin Trudeau of Canada, who faces a federal election this month.
## 4 Prime Minister Justin Trudeau is facing a tough re-election battle against his Conservative Party rival, Andrew Scheer, in Canadian elections on Monday. Here's a guide to the election:
## 5 Israeli politics can be tribal, with loyalties to ethnic groups, religious factions and ideologies as strong a factor in voting as views on particular issues. Here’s a guide in words and pictures.
## 6 Compared to the United States’, televised leaders’ debates came late to Canada, and we’re still figuring them out.
## lead_paragraph
## 1 Prime Minister Justin Trudeau has canoed to a campaign event, taken countless selfies and braved town halls with sometimes angry voters. One opponent — Jagmeet Singh — hopped on an orange tractor in northern Ontario at a plowing competition; his chief rival — Andrew Scheer — smiled with babies at a ribfest in Prince Edward Island.
## 2 (Reuters) - Prime Minister Justin Trudeau formally launched the race for the Oct. 21 election on Wednesday. Here is a breakdown of how Canadian elections work and the possible outcomes.
## 3 MONTREAL — Former President Barack Obama endorsed Justin Trudeau’s re-election as prime minister on Wednesday, potentially giving him a lift just days before Canadians go to the polls in closely fought elections.
## 4 TORONTO — Prime Minister Justin Trudeau is facing a tough re-election battle against his Conservative Party rival, Andrew Scheer, in Canadian elections on Monday. Here's a guide to the election:
## 5 JERUSALEM — Tuesday’s do-over election in Israel may not, by itself, decide who will be the next prime minister. That could take weeks of arduous coalition negotiations.
## 6 This week the world talked climate change at the United Nations and received a blunt dressing down of its elites by Greta Thunberg, the teenage climate activist from Sweden. And on Friday, Prime Minister Justin Trudeau and Elizabeth May, the Green Party leader, joined Ms. Thunberg and thousands of others in the climate march in Montreal.
## pub_date type
## 1 2019-10-21T09:00:14+0000 article
## 2 2019-10-21T13:05:42+0000 article
## 3 2019-10-16T18:43:05+0000 article
## 4 2019-10-17T14:27:09+0000 article
## 5 2019-09-17T07:00:32+0000 article
## 6 2019-09-27T22:28:12+0000 article
class(data)
## [1] "data.frame"
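The paging loop above could also be written as a small function that gathers each page in a list and binds the pages once at the end. The sketch below is one way to do that under stated assumptions: collect_pages, fetch_page and fake_fetch are hypothetical names, the stand-in fetcher returns placeholder rows so the code runs without a network connection, and the pause argument is a placeholder for whatever delay the API's rate limit requires. As far as I understand the Article Search documentation, the page parameter starts at 0, so a range such as 0:4 would include the first page; that should be checked against the API reference.

```r
# Sketch: collect several result pages into one data frame.
# `fetch_page` is a hypothetical function(page) that returns the docs
# data frame for one page; in practice it would wrap GET() and fromJSON().
collect_pages <- function(fetch_page, pages, pause = 0) {
  results <- vector("list", length(pages))
  for (k in seq_along(pages)) {
    docs <- fetch_page(pages[k])
    results[[k]] <- docs[, c("web_url", "snippet", "pub_date"), drop = FALSE]
    Sys.sleep(pause)  # wait between requests to respect the rate limit
  }
  do.call(rbind, results)  # bind once instead of growing inside the loop
}

# Stand-in fetcher with placeholder values so the sketch runs offline.
fake_fetch <- function(page) {
  data.frame(
    web_url  = paste0("https://example.com/p", page, "/", 1:10),
    snippet  = paste("placeholder snippet", 1:10),
    pub_date = "2019-01-01T00:00:00+0000",
    stringsAsFactors = FALSE
  )
}

articles <- collect_pages(fake_fetch, pages = 0:4)
dim(articles)
```

Passing the fetcher in as an argument keeps the paging logic separate from the network call, so the loop can be tested before any requests are spent.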