This document walks through how to build the Promise Zone National Housing Preservation Database dashboard in ArcGIS. It does two things: 1) explains how the dashboard was made, and 2) actually provides the code necessary for preparing the data and the graphics. These elements will still have to be uploaded to ArcGIS to actually create the dashboard, but this markdown document should make clear how to create and use them.
It lays out five steps:
Note: this walkthrough is not perfect. There are almost definitely some steps that I forgot to include, and you will need to fiddle with some things to get them to look exactly how you want. There are simply some limits to what it’s possible to build in ArcGIS. My hope is that this walkthrough will make it much easier for you to manually update the dashboard, though, and will serve as a foundation on which you can improve.
#If you are at a City office or on your VPN, you will need to load the ‘curl’ #package and run these three lines of code to get around the proxy server and #be able to use the census API
#function to get around proxy server for tidycensus
#library(curl)
#companyproxy <- curl::ie_proxy_info("proxy.phila.gov:8080")$Proxy
#Sys.setenv(http_proxy=companyproxy)
#Sys.setenv(https_proxy=companyproxy)
#install.packages("devtools")
#devtools::install_github("CityOfPhiladelphia/rphl")
#install.packages("janitor")
library(tidyverse)
library(acs) #to import american community survey data
library(tidycensus) #to import american community survey data
library(readxl) #to read the excel spreadsheet of nhpd data
library(sf) #for spatial data
library(rphl)
library(forcats)
library(janitor) #to clean dirty excel files
require("knitr")
opts_knit$set(root.dir = "C:/Users/Nissim.Lebovits/OneDrive - City of Philadelphia/Desktop/Transition Documents/Data/R Scripts and Datasets/Housing")
NHPD data must be manually downloaded as an Excel file here. You will need to register for an account, but this is free for government agencies. Select the dataset titled “Inconclusive and Active Properties PA”, then:
#import nhpd data for pennsylvania
all_nhpd = read_excel("./Inconclusive and Active Properties PA.xlsx") |>
clean_names()
#filter for in philadelphia; convert to sf
phl_nhpd_sf = all_nhpd |>
filter(city == "Philadelphia") |>
st_as_sf(coords = c("longitude", "latitude"),
crs = "EPSG:4326")
#fix typo in TargetTenantType column
phl_nhpd_sf$target_tenant_type[phl_nhpd_sf$target_tenant_type == "Eldery or Disabled"] = "Elderly or disabled"
#create column for earliestendyear
phl_nhpd_sf$earliest_end_year = lubridate::year(phl_nhpd_sf$earliest_end_date)
#import the pz; use to filter phl data
pz = read_sf("C:/Users/Nissim.Lebovits/OneDrive - City of Philadelphia/Desktop/Transition Documents/Data/R Scripts and Datasets/General Boundaries/Shapefiles/PZ_Shapefile",
"PZ_Boundaries", stringsAsFactors = FALSE) |>
st_transform(crs = st_crs("EPSG:4326"))
#Create a new column for a categorical variable indicating whether or not each site is in the Promise Zone.
pz_nhpd_sf = phl_nhpd_sf[pz, ]
pz_nhpd_sf$in_pz = "true"
non_pz_nhpd_sf = phl_nhpd_sf[pz, , op = st_disjoint]
non_pz_nhpd_sf$in_pz = "false"
phl_nhpd_sf_full = rbind(pz_nhpd_sf, non_pz_nhpd_sf)
#select the columns that you want
phl_nhpd_full_forwrite = phl_nhpd_sf_full |>
dplyr::select(property_name,
property_address,
total_units,
earliest_end_year,
owner,
owner_type,
manager_name,
manager_type,
target_tenant_type)
#esri shapefiles do not accept null values (NAs),
#so we'll split the data into an expiring layer and a non-expiring layer
#and then upload them to the same map
phl_nhpd_expiring = phl_nhpd_full_forwrite |>
filter(!is.na(earliest_end_year))
st_write(phl_nhpd_expiring, "./phl_nhpd_expiring.shp")
phl_nhpd_non_expiring = phl_nhpd_full_forwrite |>
filter(is.na(earliest_end_year))
st_write(phl_nhpd_non_expiring, "./phl_nhpd_non_expiring.shp")
In this step, we’re importing American Community Survey data to compare the Promise Zone to Philadelphia as a whole. We use area-weighted spatial interpolation to estimate numbers for the Promise Zone, and then perform some final simple calculations.
#import data by tract for interpolation
phl_tract_acs_data = get_acs(geography = "tract",
year = 2020,
variables = c(
"B11001_001E", #Total number of households
"B25070_007E", #Rent 30.0 to 34.9 percent
"B25070_008E", #Rent 35.0 to 39.9 percent
"B25070_009E", #Rent 40.0 to 49.9 percent
"B25070_010E"), #Rent 50.0 percent or more
geometry = T,
state = "PA",
county = "Philadelphia",
output = "wide") |>
rename(tot_hh = B11001_001E) |>
mutate(tot_rentburd_hh = B25070_007E +
B25070_008E +
B25070_009E +
B25070_010E)|>
st_transform(crs = st_crs("EPSG:4326"))
#import full city data
phl_county_acs_data = get_acs(geography = "county",
year = 2020,
variables = c(
"B11001_001E", #Total number of households
"B25070_007E", #Rent 30.0 to 34.9 percent
"B25070_008E", #Rent 35.0 to 39.9 percent
"B25070_009E", #Rent 40.0 to 49.9 percent
"B25070_010E"), #Rent 50.0 percent or more
geometry = T,
state = "PA",
county = "Philadelphia",
output = "wide") |>
rename(tot_hh = B11001_001E) |>
mutate(tot_rentburd_hh = B25070_007E +
B25070_008E +
B25070_009E +
B25070_010E)|>
st_transform(crs = st_crs("EPSG:4326"))
#interpolate pz data
pz_acs_data = st_interpolate_aw(phl_tract_acs_data[3:14], pz, ext = T)
pz_acs_data = pz_acs_data |>
dplyr::select(tot_hh, tot_rentburd_hh)
pz_acs_data$geography = "Promise Zone"
phl_county_acs_data = phl_county_acs_data |>
dplyr::select(tot_hh, tot_rentburd_hh)
phl_county_acs_data$geography = "Philadelphia"
acs_data_for_charts = rbind(pz_acs_data, phl_county_acs_data)
acs_data_for_charts = acs_data_for_charts |>
mutate(pct_rentburd = tot_rentburd_hh/tot_hh*100)
The code below will make the appropriate charts to embed in the ArcGIS storymap. All you have to do is right-click them, “save as”, and make sure to add “.png” to the file name that you give them. You can then upload the resulting .png files to ArcGIS and embed them in the NHPD storymap.
phl_nhpd_sf_full |>
group_by(in_pz) |>
summarise(total_units = sum(total_units)) |>
ggplot(aes(x = "", y = total_units, fill = in_pz)) +
geom_col(color = NA, alpha = 0.5) +
geom_text(aes(label = total_units),
position = position_stack(vjust = 0.5),
alpha = 1) +
coord_polar(theta = "y") +
scale_fill_manual(values = c("#5b6983", "#7fb0ac"),
labels = c("Outside Promise Zone", "In Promise Zone")) +
labs(title = "Assisted Housing Units in Philadelphia") +
theme_void() +
theme(plot.title = element_text(hjust = 0.5),
legend.title = element_blank())
ggplot(acs_data_for_charts) +
geom_col(aes(x = geography, y = pct_rentburd),
fill = "#7fb0ac",
alpha = 0.7) +
geom_text(aes(x = geography, y = pct_rentburd, label = round(pct_rentburd, digits = 1)), vjust = 1.5) +
labs(title = "Rent Burden",
y = "Rent Burden (%)") +
theme_phl(base_size = 14) +
theme(aspect.ratio = 1,
axis.title.x = element_blank())
target_tenant_hist = phl_nhpd_sf_full |>
filter(in_pz == "true") |>
group_by(target_tenant_type) |>
summarise(total_site_units = sum(total_units))
ggplot(target_tenant_hist) +
geom_col(aes(x = reorder(target_tenant_type, -total_site_units), y = total_site_units),
fill = "#7fb0ac",
alpha = 0.7) +
geom_text(aes(x = target_tenant_type, y = total_site_units, label = total_site_units), vjust = -.25) +
labs(title = "Promise Zone Assisted Housing by Tenant Type",
x = "Target Tenant Type",
y = "Total Units") +
theme_phl(base_size = 14)
ggplot(phl_nhpd_sf_full[phl_nhpd_sf_full$in_pz == "true", ], aes(x=earliest_end_year, y = total_units)) +
geom_col(aes(x=earliest_end_year, y = total_units),
fill = "#7fb0ac",
alpha = 0.7) +
labs(title = "Promise Zone Assisted Housing by Expiration Year",
x = "Expiration Year",
y = "Total Units") +
theme_phl(base_size = 14) +
theme(aspect.ratio = 0.75)
today = lubridate::today()
phl_assisted_housing = as.integer(sum(phl_nhpd_sf_full$total_units))
pz_assisted_housing = as.integer(sum(phl_nhpd_sf_full$total_units[phl_nhpd_sf_full$in_pz == "true"]))
pct_pz_assisted_housing = round(sum(phl_nhpd_sf_full$total_units[phl_nhpd_sf_full$in_pz == "true"]) / sum(phl_nhpd_sf_full$total_units)*100, digits = 1)
pct_pz_total_hh = round(sum(acs_data_for_charts$tot_hh[acs_data_for_charts$geography == "Promise Zone"]) / sum(acs_data_for_charts$tot_hh[acs_data_for_charts$geography == "Philadelphia"])*100, digits = 1)
pz_rent_burden = round(sum(acs_data_for_charts$pct_rentburd[acs_data_for_charts$geography == "Promise Zone"]), digits = 1)
phl_rent_burden = round(sum(acs_data_for_charts$pct_rentburd[acs_data_for_charts$geography == "Philadelphia"]), digits = 1)
pz_assisted_units = sum(phl_nhpd_sf_full$total_units[phl_nhpd_sf_full$in_pz == "true"])
pz_family_units = round((sum(na.omit(pz_nhpd_sf$total_units[pz_nhpd_sf$target_tenant_type == "Family"]))) /(sum(phl_nhpd_sf_full$total_units[phl_nhpd_sf_full$in_pz == "true"]))*100, digits = 1)
pz_elderly_units = round((sum(na.omit(pz_nhpd_sf$total_units[pz_nhpd_sf$target_tenant_type == "Elderly"]))) /(sum(phl_nhpd_sf_full$total_units[phl_nhpd_sf_full$in_pz == "true"]))*100, digits = 1)
disabled_or_combo = round(((sum(phl_nhpd_sf_full$total_units[phl_nhpd_sf_full$in_pz == "true"])) -
(sum(pz_nhpd_sf$total_units[is.na(pz_nhpd_sf$target_tenant_type)]) +
sum(na.omit(pz_nhpd_sf$total_units[pz_nhpd_sf$target_tenant_type == "Family"])) +
sum(na.omit(pz_nhpd_sf$total_units[pz_nhpd_sf$target_tenant_type == "Elderly"])))) /
(sum(phl_nhpd_sf_full$total_units[phl_nhpd_sf_full$in_pz == "true"])) * 100, digits = 1)
no_target = round(sum(pz_nhpd_sf$total_units[is.na(pz_nhpd_sf$target_tenant_type)]) /
sum(phl_nhpd_sf_full$total_units[phl_nhpd_sf_full$in_pz == "true"]) * 100, digits = 1)
less_than_five_exp = pz_nhpd_sf |>
filter(as.integer(earliest_end_year) <= (as.integer(lubridate::year(lubridate::today()))+ 5)) |>
summarise(sum = sum(total_units)) |>
as.data.frame() |>
dplyr::select(-geometry)
five_to_ten_exp = pz_nhpd_sf |>
filter(as.integer(earliest_end_year) >= (as.integer(lubridate::year(lubridate::today()))+ 6) &
as.integer(earliest_end_year) <= (as.integer(lubridate::year(lubridate::today())))+ 10) |>
summarise(sum = sum(total_units)) |>
as.data.frame() |>
dplyr::select(-geometry)
ten_to_twenty_exp = pz_nhpd_sf |>
filter(as.integer(earliest_end_year) >= (as.integer(lubridate::year(lubridate::today()))+ 11) &
as.integer(earliest_end_year) <= (as.integer(lubridate::year(lubridate::today())))+ 20) |>
summarise(sum = sum(total_units)) |>
as.data.frame() |>
dplyr::select(-geometry)
twenty_one_plus = pz_nhpd_sf |>
filter(as.integer(earliest_end_year) >= (as.integer(lubridate::year(lubridate::today()))+ 21)) |>
summarise(sum = sum(total_units)) |>
as.data.frame() |>
dplyr::select(-geometry)
no_exp = sum(pz_nhpd_sf$total_units[is.na(pz_nhpd_sf$earliest_end_year)])
Below is the text for the dashboard. I’ve coded it so that it automatically incorporates the relevant statistical calculations. The Markdown document will spit out just the text, but you can unhide code to see the calculations themselves.
This dashboard displays Philadelphia data on federally-assisted multi-family housing taken from the National Housing Preservation Database (NHPD).
The map tool can be used to identify individual housing developments: • points are sized according to the total units at a development • points are colored according to how soon they will expire
As of 2022-07-14, the NHPD counted 33281 total units of federally-assisted multi-family housing in Philadelphia. Of these, 3395 (10.2%) are in the Promise Zone.
The Promise Zone faces disproportionately high rates of rent burden: 40.4% in the Promise Zone versus 22.7% citywide. assisted housing in the Promise Zone targets different types of tenants. Out of 3395 total assisted units: • 26.8% are for families • 24% are for older adults • 9.1% are for disabled people • 40.1% of units have no specific target
Expiration timeline: • Next 5 years: 653 units • 6 to 10 years: 140 units • 11 to 20 years: 1232 units • 21 years or later: 152 units • No expiration date: 1218 units
Once you have all these pieces ready to go, you can upload the relevant pieces to ArcGIS and then assemble everything in the StoryMap.
This will require you to create five separate components:
Once you have created these, you can upload them as appropriate to the ArcGIS storymap and format them as you see fit. This may change a bit each time you make the map, and you may have to do some fiddling with the visualizations; what I’ve created here can certainly be improved.
The NHPD map is the most complicated part of the dsahboard. You’ll have to take the shapefile created earlier in this script and upload it to ArcGIS Pro. You can then push the resulting layer to an ArcGIS Online webmap. That webmap can be turned into a dashboard which, when properly formatted, can then be embedded in the final storymap.
Uploading the shapefile
Push the layer to ArcGIS Online
Turning the webmap into a dashboard
Creating the construction permit map can be done entirely in ArcGIS Online:
Creating the construction permit map can be done entirely in ArcGIS Online:
To take graphics from this markdown document to the storymap, right click on them in the markdown and select “save as”. Name them, making sure that the filename ends in “.png”. You can then upload the resulting .png files directly to the storymap and place them where you want to. If you want to edit the graphics, you can change the underlying R script.
Finally, you can take the text from above in this walkthrough (section 5) and paste it into the storymap as appropriate. If you decide to edit the text for future versions of the webmap, make sure that you also edit this markdown document accordingly so that your edits are reproducible the next time you update the storymap.
Finally, you’ll want to publish the storymap, which you can do by clicking “publish” in the top righthand corner of the storymap editor. When you do this, make sure that the dashboard itself, as well as all the underlying parts, are set to public sharing. If the dashboard is public but the makes that make it up are not set to public sharing, then viewers will not be able to see all elements of the map.
The issues below remain to be fixed. Some of them can be solved with some persistence; others are simply not worth the effort. You’ll have to gauge the combined abilities of the dashboards team and your own comfort with tools like Adobe Illustrator and Photoshop in order to decide what you want to work on.
Issues that can be fixed now:
Issues that can’t yet be fixed: