An Application Programming Interface (API) is a set of procedures that lets a computer program read a website’s data.
The main protocol of the web is the Hypertext Transfer Protocol (HTTP).
Many APIs require an API key, which can be sent in the Authorization header or in the URL.
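A minimal sketch of the two key placements, assuming the httr package is installed. The endpoint `https://api.example.com/v1/data`, the key value, the `Bearer` scheme, and the parameter name `api_key` are all hypothetical; each API documents its own conventions. Nothing is sent over the network here; the code only builds the pieces of a request.

```r
library(httr)

# Hypothetical endpoint and key, for illustration only
base_url <- "https://api.example.com/v1/data"
api_key  <- "my-secret-key"

# Option 1: key in the Authorization header
# (the "Bearer" scheme is an assumption; check the API's documentation)
auth_header <- add_headers(Authorization = paste("Bearer", api_key))

# Option 2: key as a URL query parameter
# (the parameter name "api_key" is an assumption)
url_with_key <- modify_url(base_url, query = list(api_key = api_key))
print(url_with_key)

# Either form would then be passed to GET(), for example:
# resp <- GET(base_url, auth_header)
# resp <- GET(url_with_key)
```

The header form is usually preferable when the API supports it, because URLs tend to end up in logs and browser histories.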
JavaScript Object Notation (JSON) is a human-readable data format widely used on the internet.
Example: IPEDS data in JSON format
{"universities":[
{"unitid": "127060", "instnm": "University of Denver"},
{"unitid": "126678", "instnm": "Colorado College"},
{"unitid": "126614", "instnm": "University of Colorado Boulder"}
]}
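As a rough illustration of how such a document maps onto R, the sketch below parses the same three records with the jsonlite package (introduced later in this section). The JSON is held in a string here instead of being fetched from an API.

```r
library(jsonlite)

# The IPEDS example as a JSON string
json_txt <- '{"universities":[
  {"unitid": "127060", "instnm": "University of Denver"},
  {"unitid": "126678", "instnm": "Colorado College"},
  {"unitid": "126614", "instnm": "University of Colorado Boulder"}
]}'

# Stop early if the text is not well-formed JSON
stopifnot(validate(json_txt))

# An array of objects with the same keys is simplified to a data frame
universities <- fromJSON(json_txt)$universities
print(universities)
```

Note that `unitid` comes back as a character column, because the values are quoted in the JSON.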
The Extensible Markup Language (XML) is a common internet data format that has similarities to both HTML and JSON.
Example: IPEDS data in XML format
<universities>
<university>
<unitid>127060</unitid><instname>University of Denver</instname>
</university>
<university>
<unitid>126678</unitid><instname>Colorado College</instname>
</university>
<university>
<unitid>126614</unitid><instname>University of Colorado Boulder</instname>
</university>
</universities>
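One possible way to read this XML into a data frame is sketched below. It assumes the xml2 package, which is not otherwise used in this material; other XML parsers would work as well.

```r
library(xml2)  # assumption: xml2 is installed; it is not part of the original material

# The IPEDS example as an XML string
xml_txt <- '<universities>
  <university><unitid>127060</unitid><instname>University of Denver</instname></university>
  <university><unitid>126678</unitid><instname>Colorado College</instname></university>
  <university><unitid>126614</unitid><instname>University of Colorado Boulder</instname></university>
</universities>'

# Parse the document and select every <university> element
doc   <- read_xml(xml_txt)
nodes <- xml_find_all(doc, "//university")

# Pull the text of each child element into a column
universities <- data.frame(
  unitid   = xml_text(xml_find_first(nodes, "./unitid")),
  instname = xml_text(xml_find_first(nodes, "./instname")),
  stringsAsFactors = FALSE
)
print(universities)
```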
curl(url)
: Curl connection interface
curl_download(url, destfile, ...)
: Download a file to disk
# Create a connection to DU IRA webpage
suppressWarnings(library(curl))
# DU IRA URL
du_ira <- "https://www.du.edu/ir/"
# Create and Open a Connection
con <- curl(du_ira)
open(con)
# Read the first 10 lines of the connection
out <- readLines(con, n = 10)
# Print the output
cat(out, sep = "\n")
# Close the connection when done
close(con)
<!DOCTYPE HTML><html xmlns="http://www.w3.org/1999/xhtml" lang="en" dir="ltr" id="du-edu" class="no-js"><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Institutional Research & Analysis | University of Denver</title>
<meta name="Description" content="The Office of Institutional Research & Analysis is the central source for information about the University of Denver. We serve the University's vision, values, mission, and goals by analyzing and reporting institutional data to inform University and unit-level planning and development." />
<meta name="Keywords" content="IR, IRA, institutional research, analysis, strategic planning, decision support" />
<meta name="author" content="University of Denver" />
suppressWarnings(library(curl))
library(readxl)
# Create a temporary file
tmp <- tempfile()
# CCIHE 2018 Data file URL
ccihe_2018 <- 'http://carnegieclassifications.iu.edu/downloads/CCIHE2018-PublicData.xlsx'
# Download the file to the temporary file
curl_download(ccihe_2018, tmp)
# Read the Spreadsheet
ccihe_2018_data <- suppressWarnings(read_excel(tmp, sheet = 'Data'))
# Show data
print(ccihe_2018_data)
# A tibble: 4,324 x 97
UNITID NAME CITY STABBR CC2000 BASIC2005 BASIC2010 BASIC2015 BASIC2018
<dbl> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 177834 A T ~ Kirk~ MO 52 25 25 25 25
2 180203 Aani~ Harl~ MT 60 33 33 33 33
3 222178 Abil~ Abil~ TX 21 19 18 18 18
4 138558 Abra~ Tift~ GA 40 2 12 23 23
5 488031 Abra~ Los ~ CA -3 -3 -3 -2 31
6 172866 Acad~ Bloo~ MN 40 14 23 23 28
7 451079 Acad~ Gain~ FL -3 -3 26 26 26
8 457271 Acad~ Los ~ CA -3 -3 -3 24 24
9 412173 Acad~ West~ FL -3 -3 -3 10 10
10 108232 Acad~ San ~ CA 56 30 30 30 18
# ... with 4,314 more rows, and 88 more variables: IPUG2018 <dbl>,
# IPGRAD2018 <dbl>, ENRPROFILE2018 <dbl>, UGPROFILE2018 <dbl>,
# SIZESET2018 <dbl>, CCE2015 <dbl>, OBEREG <dbl>, SECTOR <dbl>,
# ICLEVEL <dbl>, CONTROL <dbl>, LOCALE <dbl>, LANDGRNT <dbl>,
# MEDICAL <dbl>, HBCU <dbl>, TRIBAL <dbl>, HSI <dbl>, MSI <dbl>,
# WOMENS <dbl>, COPLAC <dbl>, CUSU <dbl>, CUMU <dbl>, ASSOCDEG <dbl>,
# BACCDEG <dbl>, MASTDEG <dbl>, DOCRSDEG <dbl>, DOCPPDEG <dbl>,
# DOCOTHDEG <dbl>, TOTDEG <dbl>, `S&ER&D` <dbl>, `NONS&ER&D` <dbl>,
# PDNFRSTAFF <dbl>, FACNUM <dbl>, HUM_RSD <dbl>, SOCSC_RSD <dbl>,
# STEM_RSD <dbl>, OTHER_RSD <dbl>, `DRSA&S` <dbl>, DRSPROF <dbl>,
# `OGRDA&S` <dbl>, OGRDPROF <dbl>, `A&SBADEG` <dbl>, PROFBADEG <dbl>,
# ASC1C2TRNS <dbl>, ASC1C2CRTC <dbl>, FALLENR16 <dbl>, ANENR1617 <dbl>,
# FALLENR17 <dbl>, FALLFTE17 <dbl>, UGTENR17 <dbl>, GRTENR17 <dbl>,
# UGDSFTF17 <dbl>, UGDSPTF17 <dbl>, UGNDFT17 <dbl>, UGNDPT17 <dbl>,
# GRFTF17 <dbl>, GRPTF17 <dbl>, UGN1STTMFT17 <dbl>, UGN1STTMPT17 <dbl>,
# UGNTRFT17 <dbl>, UGNTRPT17 <dbl>, FAITHFLAG <dbl>, OTHSFFLAG <dbl>,
# NUMCIP2 <dbl>, LRGSTCIP2 <dbl>, PCTLRGST <dbl>, UGCIP4PR <dbl>,
# GRCIP4PR <dbl>, COEXPR <dbl>, PCTCOEX <dbl>, DOCRESFLAG <dbl>,
# MAXGPEDUC <dbl>, MAXGPBUS <dbl>, MAXGPOTH <dbl>, NGCIP2PXDR <dbl>,
# NGCIP2DR <dbl>, ROOMS <dbl>, ACTCAT <dbl>, NSAT <dbl>, NACT <dbl>,
# NSATACT <dbl>, SATV25 <dbl>, SATM25 <dbl>, SATCMB25 <dbl>,
# SATACTEQ25 <dbl>, ACTCMP25 <dbl>, ACTFINAL <dbl>, ...96 <lgl>,
# ...97 <dbl>
GET(), HEAD(), PUT(), POST(), DELETE()
: HTTP verbs
headers(resp)
: Extract the response headers
content(resp)
: Extract the response content
oauth_endpoints()
: Popular OAuth endpoints
oauth_app()
: Create an OAuth app
config()
: Set CURL options such as authentication
oauth1.0_token(), oauth2.0_token()
: Generate an OAuth 1.0 or OAuth 2.0 token
suppressWarnings(library(httr))
Attaching package: 'httr'
The following object is masked from 'package:curl':
handle_reset
url_str <-
paste0("https://ed-data-portal.urban.org/api/v1/college-university/",
"ipeds/admissions-enrollment/2016/?year=2016&unitid=127060")
resp <- GET(url_str); print(resp)
Response [https://ed-data-portal.urban.org/api/v1/college-university/ipeds/admissions-enrollment/2016/?year=2016&unitid=127060]
Date: 2019-08-19 17:06
Status: 200
Content-Type: application/json
Size: 564 B
headers(resp); content(resp)$results
$server
[1] "nginx/1.15.12"
$date
[1] "Mon, 19 Aug 2019 17:06:46 GMT"
$`content-type`
[1] "application/json"
$`content-length`
[1] "564"
$connection
[1] "keep-alive"
$vary
[1] "Origin, Cookie"
$`x-frame-options`
[1] "SAMEORIGIN"
$`cache-control`
[1] "max-age=36288000"
$allow
[1] "GET, HEAD, OPTIONS"
$expires
[1] "Fri, 09 Oct 2020 16:40:46 GMT"
attr(,"class")
[1] "insensitive" "list"
[[1]]
[[1]]$year
[1] 2016
[[1]]$fips
[1] 8
[[1]]$unitid
[1] 127060
[[1]]$sex
[1] 1
[[1]]$number_applied
[1] 8939
[[1]]$number_admitted
[1] 4656
[[1]]$number_enrolled_ft
[1] 613
[[1]]$number_enrolled_pt
[1] 12
[[1]]$number_enrolled_total
[1] 625
[[2]]
[[2]]$year
[1] 2016
[[2]]$fips
[1] 8
[[2]]$unitid
[1] 127060
[[2]]$sex
[1] 2
[[2]]$number_applied
[1] 11383
[[2]]$number_admitted
[1] 6211
[[2]]$number_enrolled_ft
[1] 764
[[2]]$number_enrolled_pt
[1] 10
[[2]]$number_enrolled_total
[1] 774
[[3]]
[[3]]$year
[1] 2016
[[3]]$fips
[1] 8
[[3]]$unitid
[1] 127060
[[3]]$sex
[1] 99
[[3]]$number_applied
[1] 20322
[[3]]$number_admitted
[1] 10867
[[3]]$number_enrolled_ft
[1] 1377
[[3]]$number_enrolled_pt
[1] 22
[[3]]$number_enrolled_total
[1] 1399
jsonlite is an R package for parsing and generating JSON.
fromJSON() and toJSON()
: Convert R objects from/to JSON
rbind_pages(pages)
: Combine a list of data frames into a single data frame
prettify(json)
: Make a JSON string readable
suppressWarnings(library(tibble))
suppressWarnings(library(httr))
suppressWarnings(library(jsonlite))
url_str <-
paste0("https://ed-data-portal.urban.org/api/v1/college-university/",
"ipeds/admissions-enrollment/2016/?year=2016&unitid=127060")
resp <- GET(url_str)
du_admissions <- as_tibble(fromJSON(content(resp, "text"))$results)
No encoding supplied: defaulting to UTF-8.
print(du_admissions)
# A tibble: 3 x 9
year fips unitid sex number_applied number_admitted number_enrolled~
<int> <int> <int> <int> <int> <int> <int>
1 2016 8 127060 1 8939 4656 613
2 2016 8 127060 2 11383 6211 764
3 2016 8 127060 99 20322 10867 1377
# ... with 2 more variables: number_enrolled_pt <int>,
# number_enrolled_total <int>
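The remaining jsonlite helpers listed above can be sketched without a network call. The two "pages" below are made-up stand-ins for what a paginated API might return, reusing the IPEDS identifiers from the earlier examples.

```r
library(jsonlite)

# Two small result pages (illustrative data, not fetched from an API)
page1 <- data.frame(
  unitid = c(127060, 126678),
  instnm = c("University of Denver", "Colorado College"),
  stringsAsFactors = FALSE
)
page2 <- data.frame(
  unitid = 126614,
  instnm = "University of Colorado Boulder",
  stringsAsFactors = FALSE
)

# Combine the pages into a single data frame
universities <- rbind_pages(list(page1, page2))

# Convert the data frame to JSON and print it in a readable layout
json_txt <- toJSON(universities)
cat(prettify(json_txt))

# Parsing the JSON again recovers the same rows
roundtrip <- fromJSON(json_txt)
```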