Public art in the city of Baltimore, Maryland is administered through the Baltimore City Public Art Commission (PAC) within the Baltimore Office of Promotion & the Arts (BOPA). The commission is responsible for administering the city’s Percent for Public Art Program - the second such program in the country - in which 1% of the city’s annual budget is set aside for the creation of public artworks to beautify the city (Kelly 2011, 6). The commission also evaluates all public art proposed as a permanent gift to the city of Baltimore (Baltimore Office of Promotion & the Arts 2024). Such artworks include monuments and memorials along with works of a more aesthetic orientation.
Baltimore was, in the 19th and early 20th centuries, popularly referred to as the “Monumental City.” Monuments were constructed in the city of Baltimore both earlier and on a larger scale than in the other major cities of the United States. Its Washington Monument both began construction and was completed before the one in Washington, D.C. While New York City, Philadelphia, and Washington began major monument construction in earnest only in the 1850s, Baltimore by that time had already completed half a dozen monuments (Kelly 2011, 1 - 2).
It is because of my lifelong interest in art history and local history that I decided to do my analysis on the dataset “Public Art Inventory” for the City of Baltimore, last updated in 2023. I’m fascinated by the ways that art intersects with historical and cultural memory, particularly in a city that, both today and in the past, has been a significant center for the arts and that played an important role in our nation’s historical identity.
This dataset was retrieved through Open Baltimore, the open data website for the City of Baltimore. The data was collected by the City of Baltimore. No information is available on how this data was collected. That the City of Baltimore is the ultimate source, however, is reassuring with regard to its accuracy.
Before importing the .csv file into RStudio, I performed some forms of cleaning that I wouldn’t be able to perform in R. Values in the “zipcodes” column that didn’t match the standard 5-digit format were revised to fit it. Values in “dateofartwork” that didn’t have a four-digit year were altered to conform to that format. If a date was an interval (“1921-1923”) or a decade (“1830s”), I went with the year at the midpoint of that interval or decade. If I was given two years (“1939/1940”), I went with the lower year out of caution.
I originally set out with the intention of combining this dataset with another available through Open Baltimore, titled “Percent of Residents - White/Caucasian (Non-Hispanic)” and compiled by the Baltimore Neighborhood Indicators Alliance. I sought to combine these datasets to see if there was any correlation between the percentage of Caucasian residents in an area and the number of public artworks there. I believed that I would be able to join the datasets through geographical variables, but I couldn’t find a way to accurately connect those available in the population dataset with those in the public art inventory. If I could try this project over again, I would keep looking for another way to combine these datasets, but for now I decided to remain focused on the art inventory.
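For readers who want to reproduce the date rules above in code, the following is a minimal sketch of how they might be expressed in base R. It is not the procedure actually used for this analysis (the cleaning was done by hand before import); the function name `clean_art_year` and the example strings are hypothetical, and any format not listed above is simply returned as NA.

```r
# Hypothetical helper (not part of the original analysis): converts raw date
# strings to a single four-digit year using the rules described in the text.
clean_art_year <- function(x) {
  x <- trimws(x)
  out <- rep(NA_integer_, length(x))

  # Plain four-digit year, e.g. "1967": keep as-is
  plain <- grepl("^[0-9]{4}$", x)
  out[plain] <- as.integer(x[plain])

  # Interval, e.g. "1921-1923": take the midpoint (rounded down)
  interval <- grepl("^[0-9]{4} *- *[0-9]{4}$", x)
  first <- as.integer(substr(x[interval], 1, 4))
  last <- as.integer(sub("^.*([0-9]{4})$", "\\1", x[interval]))
  out[interval] <- (first + last) %/% 2L

  # Decade, e.g. "1830s": take the midpoint of the decade
  decade <- grepl("^[0-9]{3}0s$", x)
  out[decade] <- as.integer(substr(x[decade], 1, 4)) + 5L

  # Two alternative years, e.g. "1939/1940": take the lower year
  either <- grepl("^[0-9]{4} */ *[0-9]{4}$", x)
  a <- as.integer(substr(x[either], 1, 4))
  b <- as.integer(sub("^.*([0-9]{4})$", "\\1", x[either]))
  out[either] <- pmin(a, b)

  out
}

# Example strings are illustrative only
clean_art_year(c("1967", "1921-1923", "1830s", "1939/1940", NA))
```

Writing the rules as a function would also make the choices (midpoint for ranges, lower year for alternatives) easy to revisit later.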
artistlastname (nominal categorical) - Last name of the artwork’s artist.
artistfirstname (nominal categorical) - First name of the artwork’s artist.
titleofartwork (nominal categorical) - Artwork’s title.
dateofartwork (discrete numerical) - Date of the artwork’s creation.
medium (nominal categorical) - The media from which an artwork is made, i.e. bronze, marble, acrylic, etc.
zipcodes (nominal categorical) - The zip code in which the artwork is located.
locationofartwork (nominal categorical) - Specific location of an artwork, such as the Fallon Federal Building, National Aquarium, etc.
addressofartwork (nominal categorical) - Address where the artwork is located.
publicschoolnumber (nominal categorical) - Baltimore City Public School number, if in a public school.
siteofartwork (nominal categorical) - Specific spot within a location, such as inside fountain, side of building, etc.
descriptionofartwork (nominal categorical) - Physical and/or formal description of the artwork.
visibility (nominal categorical) - Degree of the artwork’s visibility to the general public.
indooroutdooraccessible (nominal categorical) - Whether or not an artwork is accessible indoors or outdoors, if known.
monument (binary nominal categorical, though briefly changed to discrete numerical at one point for analytical purposes) - Whether or not an artwork is a monument.
Chunk 1 below:
Loads libraries used in the analysis.
Sets working directory.
Imports and renames dataset.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(dplyr)
library(treemap)
library(RColorBrewer)
library(tidymodels)
## ── Attaching packages ────────────────────────────────────── tidymodels 1.2.0 ──
## ✔ broom 1.0.6 ✔ rsample 1.2.1
## ✔ dials 1.2.1 ✔ tune 1.2.1
## ✔ infer 1.0.7 ✔ workflows 1.1.4
## ✔ modeldata 1.3.0 ✔ workflowsets 1.1.0
## ✔ parsnip 1.2.1 ✔ yardstick 1.3.1
## ✔ recipes 1.0.10
## Warning: package 'broom' was built under R version 4.3.3
## ── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
## ✖ scales::discard() masks purrr::discard()
## ✖ dplyr::filter() masks stats::filter()
## ✖ recipes::fixed() masks stringr::fixed()
## ✖ dplyr::lag() masks stats::lag()
## ✖ yardstick::spec() masks readr::spec()
## ✖ recipes::step() masks stats::step()
## • Learn how to get started at https://www.tidymodels.org/start/
setwd("/Users/Owner/Documents/DATA 110/Final?")
public_art_inventory <- read_csv("Public_Art_Inventory_perm.csv")
## Rows: 690 Columns: 21
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (15): artistLastName, artistFirstName, image, titleOfArtwork, medium, lo...
## dbl (6): OBJECTID, dateOfArtwork, zipcodes, F2010_Census_Neighborhoods, F20...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Chunk 3 below:
Removes columns for variables that aren’t relevant to my analysis.
Returns the first six rows at the head of the dataset as a preview.
public_art_inventory_revised <- public_art_inventory |>
select(-OBJECTID,-image,-location,-F2010_Census_Neighborhoods,-F2010_Census_Wards_Precincts,-Zip_Codes,-GlobalID)
head(public_art_inventory_revised)
## # A tibble: 6 × 14
## artistLastName artistFirstName titleOfArtwork dateOfArtwork medium zipcodes
## <chr> <chr> <chr> <dbl> <chr> <dbl>
## 1 Mears Mary Ann Streamings 1987 alumi… 21117
## 2 Alexander Patricia Unknown NA ceram… 21201
## 3 Betti Richard James Cardinal G… 1967 bronze 21201
## 4 Johnson, Jr. J. Seward The Right Light 1985 bronze 21201
## 5 Ono Setsuko Joy 2005 Stain… 21201
## 6 Hamel, Inc. J.S. Jacob France Mem… NA <NA> 21201
## # ℹ 8 more variables: locationOfArtwork <chr>, addressOfArtwork <chr>,
## # publicSchoolNumber <chr>, siteOfArtwork <chr>, descriptionOfArtwork <chr>,
## # visibility <chr>, indoorOutdoorAccessible <chr>, monument <chr>
Chunk 4 below:
# Convert all column names to lowercase for easier typing
names(public_art_inventory_revised) <- tolower(names(public_art_inventory_revised))
Chunk 5 below:
Removes observations in zip codes 21117 and 21797. Neither of these zip codes is within the city limits of Baltimore. It is uncertain why they were included in the dataset.
public_art_inventory_clean <- public_art_inventory_revised |>
filter(zipcodes != "21117") |>
filter(zipcodes != "21797")
Chunk 6 below:
# Drop rows with a missing zip code or date; bare column names refer to the piped data
pub_art_inv_cln <- public_art_inventory_clean |>
filter(!is.na(zipcodes), !is.na(dateofartwork))
Chunk 7 below:
# Treat zip codes as categories rather than numbers
pub_art_inv_cln$zipcodes <- as.factor(pub_art_inv_cln$zipcodes)
To start, I decided to get an idea of which zip codes have the greatest concentrations of public artworks in the city. There were 25 different zip codes included among observations in the dataset, so the x-axis labels for each zip code overlapped each other, making it difficult to read or distinguish each. I added code to turn these labels at a 45-degree angle, making them more legible.
Chunk 8 below:
Creates a bar plot of the number of public artworks in each recorded zip code.
Sets the variables along the x-axis at a 45-degree angle.
Applies the RColorBrewer palette “Set3” to fill the bars.
Revises x and y-axis labels and adds a title and caption.
ggplot(pub_art_inv_cln, aes(x = fct_infreq(zipcodes), fill = zipcodes)) +
geom_bar() +
theme(axis.text.x = element_text(angle = 45)) +
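# brewer.pal() returns the 12 "Set3" colors; rep() recycles them to cover every zip code
scale_fill_manual(values = rep(brewer.pal(12, "Set3"), length.out = nlevels(pub_art_inv_cln$zipcodes))) +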
labs(title = "Public Artworks in Baltimore by Zip Code", x = "Zip Code", y = "Count", caption = "Source: City of Baltimore via Open Baltimore")
With 65+ artworks, the zip code with the highest concentration of public artworks is 21202, which lies between the Inner Harbor to the south, Greenmount Cemetery to the north, North Central Avenue to the east and North Charles Street to the west. It includes most of the Inner Harbor, City Hall, downtown, the National Aquarium, and the George Peabody Library. This is the core of the modern city, so it makes sense that the largest share of public artworks would be concentrated in an area of high foot and automobile traffic, and thus with a larger audience.
21202 is followed closely by 21217, a heavily residential area that contains both Druid Hill Park and the Baltimore Zoo. It may seem odd that a largely residential, lower-income area would have almost as much public art as the city’s commercial and governmental heart, and if I had more time to explore the topic I would love to discover why, and what forms of artworks are more common in each zip code. The zip code with the third-largest number of public artworks is 21218, which encompasses much of the campuses of the Johns Hopkins University and Morgan State University. The zip codes that follow exhibit a significant drop in the number of public artworks within them.
The inclusion of a variable for artworks that are monuments caught my attention. In the past decade public monuments, particularly those commemorating figures and events related to the U.S. Civil War, have been the subject of controversies so acute that they have even resulted in violence. I decided to turn my focus to monuments in Baltimore.
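The question of which forms of artwork are most common in each zip code could be approached with a grouped count. The sketch below uses a small made-up table (`toy_art`) rather than the inventory itself, and the object names are hypothetical. In the real data the “medium” column is free text, so it would likely need to be standardized before a count like this is meaningful.

```r
library(dplyr)

# Toy rows for illustration only (not taken from the inventory)
toy_art <- tibble(
  zipcodes = c("21202", "21202", "21202", "21217", "21217", "21217"),
  medium = c("bronze", "bronze", "mural", "mural", "mural", "marble")
)

# Count artworks for each combination of zip code and medium
medium_counts <- toy_art |>
  count(zipcodes, medium, name = "n_artworks")

# Keep the most frequent medium within each zip code
top_medium <- medium_counts |>
  group_by(zipcodes) |>
  slice_max(n_artworks, n = 1) |>
  ungroup()

top_medium
```

Note that `slice_max()` keeps ties by default, so a zip code with two equally common media would return two rows.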
Chunk 9 below:
Creates a new dataset, pub_art_inv_cln_mon.
Filters observations down to those artworks that are monuments.
pub_art_inv_cln_mon <- pub_art_inv_cln |>
filter(monument == "Y")
Chunk 10 below:
Creates a density plot of the creation/establishment of monuments in Baltimore over time.
Sets the transparency level for the plot.
Sets the date of artwork for the x-axis and specifies the status or lack thereof as a monument for both color and fill of the plot.
Applies the “Set2” RColorBrewer palette for the fill and color of the curve.
Revises the x and y-axis labels and adds a title and caption.
ggplot(pub_art_inv_cln_mon, aes(x = dateofartwork, color = monument, fill = monument)) +
geom_density(alpha = 0.5) +
# Apply "Set2" to both the fill and the outline so they match
scale_fill_brewer(palette = "Set2") +
scale_color_brewer(palette = "Set2") +
# The data here contain monuments only, so the title refers to monuments alone
labs(x ="Date of Monuments", y = "Distribution of Monuments", title = "Monuments in Baltimore Over Time, 1765 - 2023", caption = "Source: City of Baltimore via Open Baltimore")
I first pivoted to monuments with a density plot showing the increase in monuments constructed over time.
A few key aspects of this density plot catch my eye:
Monuments have been built in the city of Baltimore for quite a long time - as far back as the mid-1700s.
There is an explosion of monument construction during the late 19th and early 20th centuries. This covers a unique period of time in American history, spanning from the early post-Civil War era to the aftermath of the Second World War. The number of wars and other military actions during this interval likely inspired a good number of the monuments constructed at the time, among other factors.
Although the number of monuments constructed annually drops sharply after the 1950s, it never again dips as low as pre-Civil War levels. Our wars and military actions may not be as encompassing or as deadly as the Civil War or World Wars, but as each year goes by there is more and more history to engage with via public memory.
Next, I sought to check the strength of the relationship between the date of a public artwork and its status or lack thereof as a monument. Over time, did the probability that a public artwork was a monument increase or decrease? Since “monument” was a binary categorical variable, I transformed it into a numerical variable in order to carry out logistic regression analysis.
Chunk 11 below:
pub_art_inv_cln2 <- pub_art_inv_cln |>
mutate(mon_num = case_when(monument %in% ("Y") ~ "1"))
Chunk 12 below:
pub_art_inv_cln2 <- pub_art_inv_cln2 |>
mutate(mon_num = replace_na(mon_num, "0"))
Chunk 13 below:
pub_art_inv_cln2$mon_num<-as.numeric(as.character(pub_art_inv_cln2$mon_num))
head(pub_art_inv_cln2)
## # A tibble: 6 × 15
## artistlastname artistfirstname titleofartwork dateofartwork medium zipcodes
## <chr> <chr> <chr> <dbl> <chr> <fct>
## 1 Betti Richard James Cardina… 1967 bronze 21201
## 2 Johnson, Jr. J. Seward The Right Lig… 1985 bronze 21201
## 3 Ono Setsuko Joy 2005 Stain… 21201
## 4 Luery Susan Babe's Dream 1996 bronze 21201
## 5 Konkle and DeRan Kathleen and Br… <NA> 1996 mural 21201
## 6 Borofsky Jonathan Male/ Female 2004 sculp… 21201
## # ℹ 9 more variables: locationofartwork <chr>, addressofartwork <chr>,
## # publicschoolnumber <chr>, siteofartwork <chr>, descriptionofartwork <chr>,
## # visibility <chr>, indooroutdooraccessible <chr>, monument <chr>,
## # mon_num <dbl>
Chunks 14 and 15 below:
Creates a scatterplot for the relationship between date of artwork and the artwork’s status or lack thereof as a monument.
Sets the date of artwork for the x-axis.
“Jitters” the data points so they are more spread out and distinguishable than they would be otherwise.
Creates a sigmoid curve for the logistic model of the relationship between date of artwork and the artwork’s status or lack thereof as a monument.
plot_log <- ggplot(data = pub_art_inv_cln2, aes(y = mon_num, x = dateofartwork, colour = mon_num)) + geom_jitter(width = 0, height = 0.05, alpha = 0.5) +
geom_smooth(method = "glm", se = FALSE, method.args = list(family = "binomial")) +
labs(title = "Monuments and Non-Commemorative Public Artworks in Baltimore Over Time, 1765 - 2011", x = "Date of Artwork", y = "Monument (Binary: 1 = yes, 0 = No)", caption = "Source: City of Baltimore via Open Baltimore")
plot(plot_log)
## `geom_smooth()` using formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation:
## colour.
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
This scatterplot shows the relationship between the date of an artwork and its status or lack thereof as a monument. The status is expressed as a binary in the data, with “1” meaning the artwork is a monument and “0” indicating that the artwork is a non-commemorative public artwork. All of the data points are therefore concentrated around 0 and 1 on the y-axis. A sigmoid curve for the relationship was added to visualize the average change in probability over time.
Although monuments appear to be the most common type of artwork among the oldest public artworks in Baltimore, the numbers of monuments and non-commemorative public artworks begin to even out between the last quarter of the 19th century and the first quarter of the 20th century - roughly where the plotted sigmoid curve begins a steep downward slope. This also, interestingly, corresponds to the period (as seen in the prior density graph) when the construction of monuments skyrocketed. For both plots to make sense, the creation of non-commemorative public artworks must have exploded in Baltimore at the same time, and at an increasingly faster rate than that of monuments.
Next, I sought to test the accuracy of the logistic regression model of this relationship.
Chunk 16 below:
pub_art_inv_cln2$mon_num_2 <- as.factor(pub_art_inv_cln2$mon_num)
set.seed(123)
Chunks 17 and 18 below:
# Stratified split on the outcome; initial_split() defaults to 75% training, 25% testing
mon_splits <- initial_split(pub_art_inv_cln2, strata = mon_num_2)
mon_train <- training(mon_splits)
mon_test <- testing(mon_splits)
Chunk 19 below:
mon_train |>
group_by(mon_num_2) |>
tally() |>
mutate(prop = n/sum(n))
## # A tibble: 2 × 3
## mon_num_2 n prop
## <fct> <int> <dbl>
## 1 0 344 0.829
## 2 1 71 0.171
From the training set, the estimated probability that a public artwork is a monument is 17.11% (71 of 415 artworks).
Chunk 20 below:
logit_fit <-
logistic_reg(mode = "classification") |>
set_engine(engine = "glm") |>
fit(mon_num_2 ~ dateofartwork, data = mon_train)
logit_fit
## parsnip model object
##
##
## Call: stats::glm(formula = mon_num_2 ~ dateofartwork, family = stats::binomial,
## data = data)
##
## Coefficients:
## (Intercept) dateofartwork
## 62.55117 -0.03273
##
## Degrees of Freedom: 414 Total (i.e. Null); 413 Residual
## Null Deviance: 379.8
## Residual Deviance: 289.3 AIC: 293.3
The formula for the logistic regression model of the relationship is:
$$\log\left(\frac{p_{\text{monument}}}{1 - p_{\text{monument}}}\right) = 62.551 - 0.033 \times (\text{date of artwork})$$
OR
$$p_{\text{monument}} = \frac{e^{62.551 - 0.033 \times (\text{date of artwork})}}{e^{62.551 - 0.033 \times (\text{date of artwork})} + 1}$$
This means that, in the best-fitting logistic regression model of the relationship, the log-odds that an artwork is a monument (“1”) can be estimated by subtracting the product of 0.033 and the date of the artwork from 62.551. To find the probability, raise e to the power of that result, then divide it by the same quantity plus 1.
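As a worked illustration of the formula, the sketch below plugs a few years into the fitted equation using the coefficients as printed in the model output. The helper name `p_monument` and the chosen years are hypothetical, and because the printed coefficients are rounded, the results are approximate.

```r
# Coefficients as printed in the model output (rounded by R's print method)
intercept <- 62.55117
slope <- -0.03273

# Hypothetical helper: estimated probability that an artwork dated `year` is a monument
p_monument <- function(year) {
  log_odds <- intercept + slope * year
  plogis(log_odds)  # identical to exp(log_odds) / (exp(log_odds) + 1)
}

# Estimated probabilities for a few illustrative years
p_monument(c(1850, 1900, 1950, 2000))

# Multiplicative change in the odds of being a monument per year and per decade
exp(slope)
exp(10 * slope)
```

With these rounded coefficients, the estimated probability is roughly 0.59 for an artwork dated 1900 and roughly 0.05 for one dated 2000, and the odds of being a monument shrink by a factor of about 0.72 with each passing decade.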
The AIC (Akaike Information Criterion), a measure of relative fit, is 293.3 for this model. AIC has no absolute scale, so this value is mainly useful for comparison with other models fit to the same data. A more direct sign that the date is informative is the drop from a null deviance of 379.8 to a residual deviance of 289.3 with a single predictor.
Next, I generated predictions for the training and testing datasets to test the strength of the modeled relationship.
Chunk 21 below:
Generates predicted values for the training dataset.
Generates predicted values for the testing dataset.
Generates probabilities for the testing dataset.
pred_logit_train <- predict(logit_fit, new_data = mon_train)
pred_logit_test <- predict(logit_fit, new_data = mon_test)
prob_logit_test <- predict(logit_fit, new_data = mon_test, type="prob")
Chunk 22 below:
library(yardstick)
metrics(bind_cols(mon_test, pred_logit_test), truth = mon_num_2, estimate = .pred_class)
## # A tibble: 2 × 3
## .metric .estimator .estimate
## <chr> <chr> <dbl>
## 1 accuracy binary 0.813
## 2 kap binary 0.149
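This model has an accuracy of 81.29% on the testing data, which is slightly below the 82.89% share of non-commemorative artworks in the training data (1 - p-hat). In other words, always predicting “not a monument” would be about as accurate, and the kappa of 0.149 indicates only slight agreement beyond chance. The date of an artwork clearly carries information about its status, but on its own it is a weak classifier.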
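One way to put the accuracy figure in context is to compare it with the accuracy of a rule that always predicts the majority class, and to look at a confusion table rather than a single number. The sketch below computes that baseline from the training counts reported earlier and builds a confusion table from made-up vectors. The helper `confusion_table` and the toy vectors are hypothetical and are not results from this dataset.

```r
# Class counts in the training set, as reported in the tally earlier
n_non_monument <- 344
n_monument <- 71

# Accuracy of a rule that always predicts the majority class ("0")
baseline_accuracy <- n_non_monument / (n_non_monument + n_monument)
baseline_accuracy

# Hypothetical helper: cross-tabulates true and predicted classes
confusion_table <- function(truth, estimate) {
  table(truth = truth, predicted = estimate)
}

# Toy vectors for illustration only (not taken from the dataset)
toy_truth <- factor(c(0, 0, 0, 1, 1, 0), levels = c(0, 1))
toy_pred <- factor(c(0, 0, 0, 0, 1, 0), levels = c(0, 1))
confusion_table(toy_truth, toy_pred)
```

For the actual model, yardstick’s `conf_mat()` could be applied to `bind_cols(mon_test, pred_logit_test)` with `truth = mon_num_2` and `estimate = .pred_class` to see how many monuments the model identifies correctly.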
I will next move on to my final two visualizations in which the numbers and dates of monuments are related back to their zip codes: a packed bubbles graph and a Gantt plot.
Chunk 23 below:
write.csv(pub_art_inv_cln2, file = "/Users/Owner/Documents/DATA 110/Final?/r_export.csv")
Link to interactive Tableau graph
Through Tableau I created an interactive packed bubbles graph visualizing the change in the number of monuments in the city of Baltimore over time by zip code. When we look at the most up-to-date numbers of monuments per zip code (pictured in the screenshot above), we see that our top 3 zip codes are, as was the case with the numbers of all public artworks, 21218, 21217, and 21202, though in the reverse order:
21218 - 18 monuments
21217 - 16 monuments
21202 - 14 monuments
The number of monuments in a given zip code is expressed through both the size of the bubble and the color gradient. Using the filters for zip codes and years, and the color gradient key for the number of monuments, you can see how the number and geographical distribution of monuments change over time. The first monument, built in the middle of the 18th century, was found within the zip code 21231, which encompasses the Fells Point neighborhood - the nucleus of the future city of Baltimore. As the city center gradually moved to the area north of what we now call the Inner Harbor, the number of monuments found in this area’s zip codes - including 21218, 21217, and 21202 - grew to eventually overtake those in zip code 21231 by 1890.
The graph may give a good sense of the relative numbers of monuments in each of the zip codes over time, but if I had been able to spend more time on this project, I would have loved to find a way to organize the packed bubbles in a manner mirroring their relative geographic locations. It is unclear just how the packed bubbles are arranged, but if it were possible to visualize their locations it would help those viewing the graph get a greater sense of where monuments were erected over time within the city of Baltimore. Additionally, the monuments in a zip code are seen as a monolithic mass instead of a collection of individual artworks, letting the viewer lose sight of the meaning of these totals.
Link to interactive Tableau plot
My second and final visualization is a Gantt plot, also via Tableau. Each monument is individually plotted by its date and zip code. The data is, of course, the same as that in the packed bubbles graph with the same top zip codes, but the format of the Gantt plot gives a greater idea of both the sheer number of individual monuments created over the centuries and in which years monument construction was more frequent. By clicking on each data point in the plot, you can see both the zip code and date for each monument, offering a more detailed picture of the monuments built over time in Baltimore. As in the packed bubbles graph, there is no indication of how the monuments are related to each other geographically, but in the case of the Gantt plot we are precluded from showing this by the format itself.
I was surprised by the high concentration of monuments constructed in multiple zip codes from the mid-1970s on. The greatest increase in monuments appears to have occurred in the late 19th and early 20th centuries, but there appear to be more data points concentrated towards the right end of the x-axis than near its middle. This may indicate that for Baltimore overall, the greatest increase in monuments occurred during the late 19th and early 20th centuries, but in many individual zip codes the greatest increases have occurred within the last half-century.
Although the data and its visualizations revealed fascinating insights, my analysis hasn’t completely answered my main questions: why were the monuments in Baltimore concentrated where they were, and why were they constructed when they were constructed? If given the chance, I would love to analyze historical economic data for, and the social history of, these areas within the context of Baltimore as a whole. The analysis of the number of monuments by zip code over time has given some vital clues related to the change in the city’s geography and contemporary historical events, but it doesn’t fully explain how the greatest number of monuments can be found in 3 zip codes containing, respectively, a commercial/financial/government district, a lower-income residential area, and a university district.
Kelly, Cindy. 2011. Outdoor Sculpture in Baltimore: A Historical Guide to Public Art in the Monumental City. Baltimore: The Johns Hopkins University Press.
Baltimore Office of Promotion & the Arts. 2024. “Public Art Commission.” Accessed July 5, 2024. https://www.promotionandarts.org/about-us/public-arts-commission/.
Humphries, Lance. 2015. “What’s In A Name? Baltimore - ‘The Monumental City.’” Maryland Historical Magazine 110 (2): 253 - 270.