Choose one of the New York Times APIs, construct an interface in R to read in the JSON data, and transform it into an R data frame.
The New York Times website provides a rich set of APIs: http://developer.nytimes.com/docs
Base URL:
http://api.nytimes.com/svc/semantic/v2/geocodes
Scope:
The New York Times controlled vocabulary (over 2,000 places used to classify New York Times article metadata) and New York Times articles from 1981 to today (excluding wire services such as the Associated Press).
The general form for a Geographic API request by concept type and specific concept:
http://api.nytimes.com/svc/semantic/v2/geocodes/query.json?(query parameters)&api-key=your-API-key
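A request URL of the form above can be assembled from a named list of query parameters. The following is a minimal sketch in base R, not part of the original script: the helper name `build_geo_url` is hypothetical, and it assumes that parameter values only need percent-encoding before being joined with `&`.

```r
# Hypothetical helper: assemble a Geographic API request URL from a named
# list of query parameters. Values are percent-encoded; names are used as-is.
build_geo_url <- function(params, api_key,
                          base = "http://api.nytimes.com/svc/semantic/v2/geocodes/query.json") {
  stopifnot(is.list(params), length(params) > 0, !is.null(names(params)))
  encoded <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  query <- paste0(names(params), "=", encoded, collapse = "&")
  paste0(base, "?", query, "&api-key=", api_key)
}

# Same query as the one used in this report (country_code=US)
build_geo_url(list(country_code = "US"), api_key = "your-API-key")
```

Keeping the parameters in a list makes it easy to add or remove filters without editing a hand-written query string.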
library(httr)
## Warning: package 'httr' was built under R version 3.4.4
library(jsonlite)
## Warning: package 'jsonlite' was built under R version 3.4.4
library(RCurl)
## Loading required package: bitops
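The request that follows needs an API key. As a hedged alternative to writing the key into the script, it could be read from an environment variable. In this sketch the function name `get_api_key` and the variable name `NYT_API_KEY` are assumptions chosen for illustration.

```r
# Read the API key from an environment variable instead of embedding it in
# the script. NYT_API_KEY is an assumed variable name, not an NYT convention.
get_api_key <- function(var = "NYT_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  key
}
```

The variable can be set for one session with `Sys.setenv(NYT_API_KEY = "your-API-key")` or stored in a `.Renviron` file, which keeps the key out of shared or published code.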
API_key <- '562c28bceeaf4a43a251d7a6dddce779' # API key required to access the NYT API
query_parameters <- 'country_code=US'
API_geo_url <- paste0('http://api.nytimes.com/svc/semantic/v2/geocodes/query.json?', query_parameters, '&api-key=', API_key) # build the request URL for the JSON response
# Fetch the JSON response and flatten its results into a single data frame
geo.data <- flatten(fromJSON(API_geo_url)$result, recursive = TRUE)
head(geo.data)
##   concept_id                    concept_name geocode_id geoname_id
## 1      24012            Charlottesville (Va)       2840    4752031
## 2      28132               Philadelphia (Pa)        436    4560349
## 3      28848 San Juan National Forest (Colo)       7240    5437675
## 4      27356                Nantucket (Mass)       1312    4944903
## 5      71052                   Yamhill (Ore)       8680    5761959
## 6      27744                      Ohio River       3916    4401696
##                       name latitude  longitude elevation population
## 1          Charlottesville 38.02931  -78.47668       142      34703
## 2             Philadelphia 39.95233  -75.16379        12    1517550
## 3 San Juan National Forest 37.69166 -107.80895      3472         NA
## 4                Nantucket 41.28346  -70.09946        13      14775
## 5                  Yamhill 45.34150 -123.18733        60       1024
## 6               Ohio River 36.98672  -89.13062        87         NA
##   country_code  country_name admin_code1 admin_code2 admin_code3
## 1           US United States          VA         540          NA
## 2           US United States          PA         101          NA
## 3           US United States          CO         111          NA
## 4           US United States          MA         019          NA
## 5           US United States          OR         071          NA
## 6           US United States          MO         133          NA
##   admin_code4   admin_name1             admin_name2 admin_name3
## 1          NA      Virginia City of Charlottesville          NA
## 2          NA  Pennsylvania     Philadelphia County          NA
## 3          NA      Colorado         San Juan County          NA
## 4          NA Massachusetts        Nantucket County          NA
## 5          NA        Oregon          Yamhill County          NA
## 6          NA      Missouri      Mississippi County          NA
##   admin_name4 feature_class feature_code feature_code_name
## 1          NA             P          PPL   populated place
## 2          NA             P          PPL   populated place
## 3          NA             V         FRST         forest(s)
## 4          NA             P          PPL   populated place
## 5          NA             P          PPL   populated place
## 6          NA             H          STM            stream
##           time_zone_id dst_offset gmt_offset          geocodes_created
## 1     America/New_York         -4         -5 2013-02-25 15:10:12-05:00
## 2     America/New_York         -4         -5 2013-02-25 15:10:12-05:00
## 3     America/Shiprock         -6         -7 2013-02-25 15:10:12-05:00
## 4     America/New_York         -4         -5 2013-02-25 15:10:12-05:00
## 5  America/Los_Angeles         -7         -8 2013-02-25 15:10:12-05:00
## 6 America/Indiana/Knox         -5         -6 2013-02-25 15:10:12-05:00
##            geocodes_updated
## 1 2013-02-25 15:10:12-05:00
## 2 2013-02-25 15:10:12-05:00
## 3 2013-02-25 15:10:12-05:00
## 4 2013-02-25 15:10:12-05:00
## 5 2013-02-25 15:10:12-05:00
## 6 2013-02-25 15:10:12-05:00
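The one-line `fromJSON(url)` call above gives no clear signal when the request fails. As a hedged sketch of a more defensive variant, the request can go through `httr`, with the HTTP status checked before parsing. The function names `parse_geocodes` and `fetch_geocodes` are hypothetical, and the code assumes the records sit under a top-level element named `results`, which is passed as an argument in case the actual response differs.

```r
library(httr)
library(jsonlite)

# Parse the JSON text of a response into a flat data frame.
# Assumption: the records are stored under the top-level element `element`.
parse_geocodes <- function(json_text, element = "results") {
  parsed <- jsonlite::fromJSON(json_text)
  records <- parsed[[element]]
  if (is.null(records) || !is.data.frame(records)) {
    stop("No tabular '", element, "' element found in the response", call. = FALSE)
  }
  # Expand nested data frame columns into ordinary columns
  jsonlite::flatten(records, recursive = TRUE)
}

# Request a URL, stop on an HTTP error status, then parse the body.
fetch_geocodes <- function(url, element = "results") {
  response <- httr::GET(url)
  httr::stop_for_status(response)
  body <- httr::content(response, as = "text", encoding = "UTF-8")
  parse_geocodes(body, element = element)
}

# Usage with the URL built earlier in this report (requires network access):
# geo.data <- fetch_geocodes(API_geo_url)
```

Separating the parsing step from the request also allows the parser to be exercised on a saved JSON string without calling the API.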