1 Housing and Affordable Housing in Pittsburgh

The information in this report will feature housing assessment and sales data for the City of Pittsburgh, as well as for neighborhoods where the Pittsburgh Housing Development Corporation have been active in the housing development market, or are planning to initiate larger scale efforts.

The purpose of this report is to better understand the current trends and dynamics in housing, specifically in affordable housing, for Pittsburgh moving forward. Aggregate data for the City as well as neighborhood level analysis will hopefully bring into focus the nuances, both in terms of neighborhood benefits and challenges, for housing development projects and programs.

3 Conclusions

Pittsburgh’s housing market has faced serious challenges in recent decades. A mix of declining population and aging housing stock has contributed to low property values, vacant, blighted, and condemned buildings, as well as occupied structures in need of major repairs.

However, a positive trend in sale prices has been shown in recent years. Public investment in areas with historically low home values have improved the climate for homeownerhsip and encouraged private investment in neighborhoods that were previously neglected by the private market. Now, a balance must be struck between the maintained appreciation of homes and land within a neighborhood and the affordability of the neighborhood for current LMI residents. The URA and PHDC help to achieve this through homeownerhsip initiatives and focus on creating generational wealth via pathways to long-term housing affordability, though a speculative private market may challenge affordability as neighborhood continue to improve.

4 Appendix

4.1 Limitations

There are a few limitations to taking conclusions ofrom this small study. The main one being the data, which is incomplete at times due to infrequence of property assessments in Allegheny County. More regular property assessments would give a better understanding of Fair Market Values. The data on Sale Price., though, is a useful metric.

Ideally, some form of hedonic price modelling can be undertaken with this data. this would allow for better prediction of future trends and evaluation of property value at a neighborhood or even street level (if some spatial element was included). Some properties include information on housing characteristics, such as bedrooms, square footage, garage, etc… However this data provided by the County may not be complete enough to equip an analyst with enough data to create a useful model.

4.2 Code

This project was taken on with the hope of creating a template for using the Allegheny County Assessments Data periodically to check the trends in sales and market values. Analysis can be adjusted for type of sales, commercial structures, as well as different geographies. This dataset may be especially useful to those with strong GIS skills.

rm(list = ls()) 
  gc()            
  cat("\f")  
packages <- c("readr", #open csv
              "psych", # quick summary stats for data exploration,
              "stargazer", #summary stats for sharing,
              "tidyverse", # data manipulation like selecting variables,
              "corrplot", # correlation plots
              "ggplot2", # graphing
              "ggcorrplot", # correlation plot
              "gridExtra", #overlay plots
              "data.table", # reshape for graphing 
              "car", #vif
              "prettydoc", # html output
              "visdat", # visualize missing variables
              "glmnet", # lasso/ridge
              "caret", # confusion matrix
              "MASS", #step AIC
              "plm", # fixed effects demeaned regression
              "lmtest" # test regression coefficients
)

for (i in 1:length(packages)) {
  if (!packages[i] %in% rownames(installed.packages())) {
    install.packages(packages[i]
                     , repos = "http://cran.rstudio.com/"
                     , dependencies = TRUE
    )
  }
  library(packages[i], character.only = TRUE)
}

rm(packages)
setwd("/Users/matthewcolantonio/Desktop/housing_pgh/")
a <- read_csv("allegheny_assess_may.csv")
a1 <- a[,c(1, 5, 8, 18:19, 25, 27:28, 34:35, 37, 48:57)] 
# removing hyphens from dates to easily subset before and after periods 
a1 <- a1 %>% mutate_if(is.character, function(c) str_replace_all(c, c("-"="")))
# limiting saledate to the last four digits (year)
a1$SALEDATE <- substr(a1$SALEDATE, start = nchar(a1$SALEDATE) - 3, stop = nchar(a1$SALEDATE))
# date as numerical not chracter
a1$SALEDATE <- as.numeric(a1$SALEDATE)
# dates after 2000
a2 <- a1[a1$SALEDATE > 1999,]
# character to numeric
a2$COUNTYBUILDING <- as.numeric(a2$COUNTYBUILDING)

a2$COUNTYLAND <- as.numeric(a2$COUNTYLAND)

a2$LOCALTOTAL <- as.numeric(a2$LOCALTOTAL)

a2$FAIRMARKETBUILDING <- as.numeric(a2$FAIRMARKETBUILDING)

a2$FAIRMARKETTOTAL <- as.numeric(a2$FAIRMARKETTOTAL)
# subsetting to include residential properties only
a3 <-a2[a2$CLASSDESC == "RESIDENTIAL",]
a3 %>%
  summarise(across(everything(),~sum(is.na(.)))) %>%
  glimpse()

a3 <- na.omit(a3)
a4 <- a3[a3$SALEDESC == "VALID SALE",]
describe(a4, fast = T)
# City of Pittsburgh 
pgh <- a4[a4$PROPERTYCITY == "PITTSBURGH",]
#Sale price aggregate, City
agg.pgh <- aggregate(cbind(SALEPRICE) ~ SALEDATE + PROPERTYCITY, 
          data = pgh, 
          FUN = median)
#Fair Market Total ggregate, City
agg.pgh2 <- aggregate(cbind(FAIRMARKETTOTAL) ~ SALEDATE, 
          data = pgh, 
          FUN = median)
ggplot(agg.pgh, aes(x = SALEDATE)) +
  geom_line(aes(y = SALEPRICE, color = "Median Sale Price")) +
  scale_color_manual(values = c("blue", "red")) +
  labs(title = "Median Sale Prices in Pittsburgh", x = "Year", y = "Sales Price (thousands)", color = "Variable") +
  scale_y_continuous(labels = scales::dollar_format(scale = 0.001)) +
  theme_minimal() + 
  guides(color = F) # remove color legend


ggplot(agg.pgh2, aes(x = SALEDATE)) +
  geom_line(aes(y = FAIRMARKETTOTAL, color = "Fair Market Value")) +
  scale_color_manual(values = c("blue", "red")) +
  labs(title = "Fair Market Value of Homes in Pittsburgh", x = "Year", y = "Median Value (thousands)") +
  scale_y_continuous(labels = scales::dollar_format(scale = 0.001)) +  # Format y-axis as dollars +  #adding axis labels
  theme_minimal() + 
  guides(color = F) # remove color legend
# subset neighborhoods fo interest
elliott.ch <- pgh[pgh$PROPERTYZIP == "15220",]
sheraden <- pgh[pgh$PROPERTYZIP == "15204",]
allentown.beltz<- pgh[pgh$PROPERTYZIP == "15210",]
elliot <- pgh[pgh$PROPERTYZIP == "15220",]
hill <- pgh[pgh$PROPERTYZIP == "15219",]
larimer <- pgh[pgh$PROPERTYZIP == "15206",]
#Sale price aggregate, elliot/crafton heights
agg.ech <- aggregate(cbind(SALEPRICE) ~ SALEDATE + PROPERTYCITY, 
          data = elliott.ch, 
          FUN = median)
#Fair Market Total ggregate, elliot, crafton heights
agg.ech2 <- aggregate(cbind(FAIRMARKETTOTAL) ~ SALEDATE, 
          data = elliott.ch, 
          FUN = median)

#Sale price aggregate, sheraden
agg.s <- aggregate(cbind(SALEPRICE) ~ SALEDATE + PROPERTYCITY, 
          data = sheraden, 
          FUN = median)
#Fair Market Total ggregate, sheraden
agg.s2 <- aggregate(cbind(FAIRMARKETTOTAL) ~ SALEDATE, 
          data = sheraden, 
          FUN = median)
# merge Elliott and CH to display FMV and Sales on same chart
ech_merged <- merge(agg.ech, agg.ech2, by = "SALEDATE", all = T)
ggplot(ech_merged, aes(x = SALEDATE)) +
  geom_line(aes(y = SALEPRICE, color = 'Sales Price')) +
  geom_line(aes(y = FAIRMARKETTOTAL, color = 'Fair Market Value')) +
  labs(title = "Elliott and Crafton Heights", y = "Thousands of $", x = "Year") +
  scale_y_continuous(labels = scales::dollar_format(scale = 0.001)) +  # Format y-axis as dollars +  #adding axis labels
  theme_minimal() +
  geom_vline(xintercept = 2013, linetype = "dashed") +
  theme(legend.title = element_blank())  # don't want a lergend title

s_merged <- merge(agg.s, agg.s2, by = "SALEDATE", all = T)
# Sheraden
ggplot(s_merged, aes(x = SALEDATE)) +
  geom_line(aes(y = SALEPRICE, color = 'Sales Price')) +
  geom_line(aes(y = FAIRMARKETTOTAL, color = 'Fair Market Value')) +
  labs(title = "Sheraden", y = "Thousands of $", x = "Year") +
  scale_y_continuous(labels = scales::dollar_format(scale = 0.001)) +  # Format y-axis as dollars +  #adding axis labels
  theme_minimal() +
  geom_vline(xintercept = 2013, linetype = "dashed") +
  theme(legend.title = element_blank())  # don't want a lergend title

#Sale price aggregate, allentown/beltzhover
agg.ab <- aggregate(cbind(SALEPRICE) ~ SALEDATE + PROPERTYCITY, 
          data = allentown.beltz, 
          FUN = median)
#Fair Market Total ggregate, allentown/beltzhoover
agg.ab2 <- aggregate(cbind(FAIRMARKETTOTAL) ~ SALEDATE, 
          data = allentown.beltz, 
          FUN = median)
ab_merged <- merge(agg.ab, agg.ab2, by = "SALEDATE", all = T)
# allentown/beltzhoover 15210
ggplot(ab_merged, aes(x = SALEDATE)) +
  geom_line(aes(y = SALEPRICE, color = 'Sales Price')) +
  geom_line(aes(y = FAIRMARKETTOTAL, color = 'Fair Market Value')) +
  labs(title = "Allentown and Beltzhoover", y = "Thousands of $", x = "Year") +
  scale_y_continuous(labels = scales::dollar_format(scale = 0.001)) +  # Format y-axis as dollars +  #adding axis labels
  theme_minimal() +
  geom_vline(xintercept = 2013, linetype = "dashed") +
  theme(legend.title = element_blank())  # don't want a lergend title
#Sale price aggregate, the hill
agg.h <- aggregate(cbind(SALEPRICE) ~ SALEDATE + PROPERTYCITY, 
          data = hill, 
          FUN = median)
#Fair Market Total ggregate, the hill
agg.h2 <- aggregate(cbind(FAIRMARKETTOTAL) ~ SALEDATE, 
          data = hill, 
          FUN = median)
h_merged <- merge(agg.h, agg.h2, by = "SALEDATE", all = T)
ggplot(h_merged, aes(x = SALEDATE)) +
  geom_line(aes(y = SALEPRICE, color = 'Sales Price')) +
  geom_line(aes(y = FAIRMARKETTOTAL, color = 'Fair Market Value')) +
  labs(title = "The Hill District", y = "Thousands of $", x = "Year") +
  scale_y_continuous(labels = scales::dollar_format(scale = 0.001)) +  # Format y-axis as dollars
  theme_minimal() +
  geom_vline(xintercept = 2013, linetype = "dashed") +
  annotate("text", x = 2013, y = Inf, label = "Last County Assessment", vjust = -1, hjust = 0) +
  theme(legend.title = element_blank())
#Sale price aggregate, sheraden
agg.l <- aggregate(cbind(SALEPRICE) ~ SALEDATE + PROPERTYCITY, 
          data = larimer, 
          FUN = median)
#Fair Market Total ggregate, sheraden
agg.l2 <- aggregate(cbind(FAIRMARKETTOTAL) ~ SALEDATE, 
          data = larimer, 
          FUN = median)
l_merged <- merge(agg.l, agg.l2, by = "SALEDATE", all = T)
#Larimer
ggplot(l_merged, aes(x = SALEDATE)) +
  geom_line(aes(y = SALEPRICE, color = 'Sales Price')) +
  geom_line(aes(y = FAIRMARKETTOTAL, color = 'Fair Market Value')) +
  labs(title = "Larimer", y = "Thousands of $", x = "Year") +  #adding axis labels
  scale_y_continuous(labels = scales::dollar_format(scale = 0.001)) +
  theme_minimal() +
  geom_vline(xintercept = 2013, linetype = "dashed") +
  theme(legend.title = element_blank())  # don't want a lergend title

# merging neighborhoods with City to compare trends
merged <- merge(agg.pgh, agg.ech, by = "SALEDATE", all = TRUE)
merged <- merge(merged, agg.s, by = "SALEDATE", all = TRUE)
merged <- merge(merged, agg.ab, by = "SALEDATE", all = T)
merged <- merge(merged, agg.h, by = "SALEDATE", all = T)
merged <- merge(merged, agg.l, by = "SALEDATE", all = T)
# Rename columns
colnames(merged)[3:13] <- c("pgh","city2","ech","city3", "sheraden", "city4", "atown.b", "city5", "hill", "city6", "larimer")
ggplot(merged, aes(x = SALEDATE)) +
  geom_line(aes(y = pgh, color = "Pittsburgh")) +
  geom_line(aes(y = ech, color = "Elliott/Crafton Heights")) +
  geom_line(aes(y = sheraden, color = "Sheraden")) +
  geom_line(aes(y = atown.b, color = "Allentown/Beltzhoover")) +
  geom_line(aes(y = hill, color = "Hill District")) +
  geom_line(aes(y = larimer, color = "Larimer")) +
  labs(x = "Year", y = "Sale Price (thousands)", color = "Neighborhood") +
  scale_y_continuous(labels = scales::dollar_format(scale = 0.001)) +
  theme_minimal()
merged2 <- merge(agg.pgh2, agg.ech2, by = "SALEDATE", all = TRUE)
merged2 <- merge(merged2, agg.s2, by = "SALEDATE", all = TRUE)
merged2 <- merge(merged2, agg.ab2, by = "SALEDATE", all = T)
merged2 <- merge(merged2, agg.h2, by = "SALEDATE", all = T)
merged2 <- merge(merged2, agg.l2, by = "SALEDATE", all = T)
# Rename columns
colnames(merged2)[2:7] <- c("pgh","ech","sheraden", "atown.b", "hill", "larimer")
ggplot(merged2, aes(x = SALEDATE)) +
  geom_line(aes(y = pgh, color = "Pittsburgh")) +
  geom_line(aes(y = ech, color = "Elliott/Crafton Heights")) +
  geom_line(aes(y = sheraden, color = "Sheraden")) +
  geom_line(aes(y = atown.b, color = "Allentown/Beltzhoover")) +
  geom_line(aes(y = hill, color = "Hill District")) +
  geom_line(aes(y = larimer, color = "Larimer")) +
  labs(x = "Year", y = "Fair Market Value (thousands)", color = "Neighborhood") +
  scale_y_continuous(labels = scales::dollar_format(scale = 0.001)) +
  theme_minimal()