Assignment 6 - Learning a New API
Introduction to APIs and the Art Institute of Chicago
An Application Programming Interface (API) is a structured way for software systems to communicate. APIs let one application access data or functionality from another, and they power much of the technology we use today, from smartphone apps to digital services.
The Art Institute of Chicago’s API is a public, REST-style service that provides access to the museum’s collection in a structured JSON format. REST (Representational State Transfer) organizes the API around “resources” (e.g., artworks, artists, exhibitions), each accessible through a specific URL. Developers can send a GET request to retrieve information and use it in apps, websites, or data projects.
In this example, we’ll demonstrate how to:
- Define an API endpoint.
- Create a reusable function to generate a search URL.
- Use GET() to retrieve data about artworks that match the search term “cats”.
- Extract and compile results into a clean data frame.
First, load the necessary R packages.
library(tidyverse)
library(jsonlite)
library(httr)
library(magrittr)
1. Define the API endpoint
We’ll begin by defining the base URL (endpoint) for the Art Institute’s artworks resource. The search endpoint used in the next step is this URL followed by /search.
sc_endpoint <- "https://api.artic.edu/api/v1/artworks"
2. Create a function to build an API URL
This function builds a search URL from a search term (passed as art_title), a page number, and a limit on the number of results per page.
# Create a URL-building function
create_art_url <- function(art_title, page, limit) {
  # Search endpoint and query parameters
  art_api_endpoint <- "https://api.artic.edu/api/v1/artworks/search"
  art_title <- paste0("?q=", art_title)
  parameters <- "&fields=id,title,artist_title,is_on_view,style_title"
  page <- paste0("&page=", page)
  limit <- paste0("&limit=", limit)
  # Assemble the final API URL
  aic_api_url <- paste0(art_api_endpoint, art_title, parameters, page, limit)
  return(aic_api_url)
}
3. Request Data from the API
With the final URL now available, you can use it to bring the JSON document returned by the API into R.
Send the request with the GET() function from the httr package, which implements the HTTP verb GET. The request goes to the API, and you receive a response object.
The function below then converts the API’s JSON response into a structured format that you can filter, visualize, or analyze with tidyverse tools or base R functions.
# Function to request one page of search results and parse the response
request_art_data <- function(art_title = "", page = 1, limit = 10) {
  # Build the full URL
  url <- create_art_url(art_title = art_title, page = page, limit = limit)
  # Make the GET request
  response <- GET(url)
  # Parse the response body as text, then as JSON
  result <- content(response, as = "text", encoding = "UTF-8")
  result_json <- fromJSON(result, flatten = TRUE)
  # Extract the 'data' part of the JSON
  art_data <- result_json$data
  return(art_data)
}
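To see what the parsing step produces without calling the live API, the sketch below runs fromJSON() on a small hand-written JSON string. The string is an assumption: it only imitates the top-level data array that the function above extracts, using two records from the results shown at the end of this document. A real response contains more fields.

```r
library(jsonlite)

# Hand-written stand-in for a response body (an assumption, not a captured
# API response); only the top-level "data" array is imitated here.
sample_json <- '{
  "data": [
    {"id": 656, "title": "Lion (One of a Pair, South Pedestal)", "is_on_view": true},
    {"id": 117241, "title": "Girl with Cat", "is_on_view": true}
  ]
}'

# fromJSON() simplifies an array of same-shaped objects into a data frame
parsed <- fromJSON(sample_json, flatten = TRUE)
art_data <- parsed$data

# Inspect the column names and types of the parsed result
str(art_data)
```

With a live response, calling stop_for_status(response) from httr before content() is one way to turn an HTTP error into an R error instead of passing an error body to the parser.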
# View the API URL for searching "cats":
create_art_url(art_title = "cats", page = 0, limit = 100)
[1] "https://api.artic.edu/api/v1/artworks/search?q=cats&fields=id,title,artist_title,is_on_view,style_title&page=0&limit=100"
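The URL above works as-is because “cats” contains no spaces or reserved characters. For multi-word search terms, one option is to percent-encode the term before pasting it into the URL. The sketch below assumes base R's URLencode() is acceptable for this purpose; create_art_url_encoded is a hypothetical name and is not part of the original code.

```r
# Hypothetical variant of create_art_url() that percent-encodes the search term
create_art_url_encoded <- function(art_title, page, limit) {
  art_api_endpoint <- "https://api.artic.edu/api/v1/artworks/search"
  # reserved = TRUE also encodes characters such as "&" and "=" so they
  # cannot be mistaken for query-string separators
  query <- utils::URLencode(art_title, reserved = TRUE)
  paste0(
    art_api_endpoint,
    "?q=", query,
    "&fields=id,title,artist_title,is_on_view,style_title",
    "&page=", page,
    "&limit=", limit
  )
}

# A two-word search term: the space is encoded as %20
encoded_url <- create_art_url_encoded("black cat", page = 1, limit = 5)
print(encoded_url)
```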
4. Turn API Data into a Data Frame
Now let’s wrap everything into a function that returns a tidy data frame.
# Retrieve one page of results as a data frame
request_aic_api_df <- function(art_title, page, limit) {
  aic_api_url <- create_art_url(art_title, page, limit)
  # Use the URL to retrieve the JSON data and create the data frame
  museum_data <- aic_api_url %>%
    GET() %>%
    content(as = "text", encoding = "UTF-8") %>%
    fromJSON() %>%
    use_series(data) %>%
    bind_rows()
  return(museum_data)
}
Request and store a sample data frame:
cats_art_df <- request_aic_api_df(art_title = "cats", page = 0, limit = 15)
5. Loop over pages and bind the results into one data frame
This loop collects data from multiple pages and appends the results into a single data frame.
# Start with an empty data frame
artworks_df <- data.frame()
# Harvest the first 3 pages of search results
for (i in 1:3) {
  artworks_df <- request_aic_api_df(art_title = "cats", page = i - 1, limit = 5) %>%
    bind_rows(artworks_df)
  print(paste("Page", i, "of", 3, "results collected"))
  # Pause between requests to avoid overloading the API
  if (i < 3) {
    Sys.sleep(1)
  }
}
[1] "Page 1 of 3 results collected"
[1] "Page 2 of 3 results collected"
[1] "Page 3 of 3 results collected"
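Growing a data frame inside a loop is fine for three pages. For larger harvests, a common alternative is to collect each page in a list and bind once at the end. The sketch below illustrates that pattern offline: fake_fetch_page is a hypothetical stand-in for request_aic_api_df() that returns made-up placeholder rows, so the example runs without network access.

```r
library(dplyr)

# Hypothetical stand-in for request_aic_api_df(); returns placeholder rows
# (made-up ids and titles) so the pattern can be run offline.
fake_fetch_page <- function(page, limit) {
  ids <- seq_len(limit) + (page - 1) * limit
  data.frame(
    id = ids,
    title = paste("Placeholder artwork", ids),
    stringsAsFactors = FALSE
  )
}

# Collect one data frame per page in a list ...
pages <- lapply(1:3, function(p) fake_fetch_page(page = p, limit = 5))

# ... then bind once; distinct() is a cheap safeguard in case the same
# record appears on more than one page
all_pages_df <- pages %>%
  bind_rows() %>%
  distinct(id, .keep_all = TRUE)

nrow(all_pages_df)
```

In the real loop, the anonymous function would call request_aic_api_df() and keep the Sys.sleep(1) pause between requests.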
6. View results
head(cats_art_df)
     _score     id                                title is_on_view
1 136.05936    656 Lion (One of a Pair, South Pedestal)       TRUE
2 119.95461 117241                        Girl with Cat       TRUE
3 116.41442  45259                       Nude with Cats      FALSE
4 103.93291  16227                        Cat Making Up      FALSE
5  98.46439  22482                         Homesickness      FALSE
6  93.83458  51719             Winter: Cat on a Cushion      FALSE
                         artist_title                 style_title
1                       Edward Kemeys                        <NA>
2                             Balthus                        <NA>
3                       Pablo Picasso                      Cubism
4                       Inagaki Tomoo Japanese (culture or style)
5                       René Magritte                        <NA>
6 Théophile-Alexandre Pierre Steinlen                        <NA>
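Once the results are in a data frame, ordinary tidyverse verbs apply. As a small illustration, the sketch below retypes three columns of the six rows printed above by hand (so it runs without an API call) and keeps only the works flagged as on view. In practice you would filter cats_art_df directly; cats_sample and on_view are names introduced here for the example.

```r
library(dplyr)

# The six rows printed above, retyped by hand for an offline example
cats_sample <- data.frame(
  id = c(656, 117241, 45259, 16227, 22482, 51719),
  title = c(
    "Lion (One of a Pair, South Pedestal)", "Girl with Cat", "Nude with Cats",
    "Cat Making Up", "Homesickness", "Winter: Cat on a Cushion"
  ),
  is_on_view = c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Keep only artworks flagged as on view
on_view <- cats_sample %>%
  filter(is_on_view) %>%
  select(id, title)

print(on_view)
```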