2024-12-16

Project Overview

Goal of the Project

Analyze whether market-rate apartment rents are consistent with federally defined Fair Market Rent (FMR) levels in three distinct metropolitan areas.


What is FMR?

  • FMR is the 40th percentile of gross rents for typical rental units, based on rents paid by tenants who moved within the last 20 months.
  • The FMR values used here are rent estimates for the federal fiscal year running October 2024 through September 2025.
  • FMR is used to establish benefits such as the Housing Choice Vouchers used to help unhoused individuals find permanent housing.
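
To make the "40th percentile" idea concrete, here is a minimal sketch in base R. The rent values are hypothetical and are not HUD data; the point is only to show what the percentile cut-off means.

```r
# Hypothetical gross rents for recent movers in one area (not HUD data)
gross_rents <- c(1000, 1100, 1200, 1300, 1400, 1500,
                 1600, 1700, 1800, 1900, 2000)

# 40th percentile: roughly 40% of these rents fall at or below this value
fmr_like <- unname(quantile(gross_rents, probs = 0.4))
print(fmr_like)
```

A 40th-percentile benchmark sits below the median, so by construction most units in the sample rent for more than this value.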

Project Overview (cont.)

Potential Problem

FMR is an annually released estimate, so it does not capture the real-time economic conditions that drive local housing markets.


Implications of Problem

Some individuals who are approved for housing assistance cannot find a local apartment they can afford with their voucher, because the voucher amount is based on their area's FMR. In other words, market rates tend to outpace FMR estimates when market conditions quickly drive rental prices up in certain areas.

About the Data

Data sources

  • FMR data was pulled from HUD using its publicly available API.
  • Market-rate data was scraped from the rental listing site Trulia.


Narrowing our Data

  • Atlanta (GA), Buffalo (NY) and San Diego (CA) Metropolitan Areas
  • Apartments with fewer than 3 bedrooms (studio, 1-bedroom, and 2-bedroom)

Working with the HUD API

https://www.huduser.gov/portal/dataset/fmr-api.html

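# Custom function to call the HUD FMR & IL API
library(httr2)     # request(), req_*(), resp_body_string()
library(jsonlite)  # fromJSON() (jsonlite assumed)

# Assumption: the HUD API token is stored in an environment variable.
# The variable name HUD_API_TOKEN is illustrative, not from the original.
token <- Sys.getenv("HUD_API_TOKEN")

call_hud <- function(endpoint) {

  # Build the request
  req <- request("https://www.huduser.gov") |>   # domain
    req_headers("Accept" = "application/json") |>
    req_auth_bearer_token(token) |>
    req_url_path(paste0("hudapi/public/fmr/", endpoint)) |>   # path
    req_user_agent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36 Edge/18.17763")

  # Perform the request and parse the JSON body
  json_resp <- req_perform(req) |>
    resp_body_string() |>
    fromJSON()

  # Return the parsed response
  return(json_resp)
}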

API Responses

Getting the IDs for the Metropolitan Areas

metro_areas <- call_hud("listMetroAreas")
## Rows: 3
## Columns: 3
## $ cbsa_code <chr> "METRO12060M12060", "METRO15380M15380", "METRO41740M41740"
## $ area_name <chr> "Atlanta-Sandy Springs-Roswell", "Buffalo-Cheektowaga-Niagar…
## $ state     <chr> "GA", "NY", "CA"
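
The output above shows only the three target areas, so the full list of metropolitan areas was presumably filtered at some point. A minimal base R sketch of such a filter follows. The data frame is a stand-in for the parsed API response, its last row is hypothetical filler, and the variable names `all_metros`, `target_codes`, and `target_metros` are illustrative.

```r
# Stand-in for the parsed API response; the last row is hypothetical filler
all_metros <- data.frame(
  cbsa_code = c("METRO12060M12060", "METRO15380M15380",
                "METRO41740M41740", "METRO00000M00000"),
  area_name = c("Atlanta-Sandy Springs-Roswell",
                "Buffalo-Cheektowaga-Niagara Falls",
                "San Diego-Carlsbad",
                "Example Metro (hypothetical)"),
  state = c("GA", "NY", "CA", "ZZ"),
  stringsAsFactors = FALSE
)

# Codes of the three metropolitan areas studied in this project
target_codes <- c("METRO12060M12060", "METRO15380M15380", "METRO41740M41740")

# Keep only the target areas
target_metros <- all_metros[all_metros$cbsa_code %in% target_codes, ]
print(target_metros)
```

Filtering on the code rather than the area name avoids mismatches when names are truncated or formatted differently.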

Getting the FMR Data by Zip Code for our Target Metropolitan Areas

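# Accumulate zip-level FMRs for each target metro area.
# get_metro_data() is a project helper that is not shown on this slide.
metro_fmrs <- NULL
for (i in seq_len(nrow(metro_areas))) {
  metro_fmrs <- rbind(metro_fmrs, get_metro_data(metro_areas[i, ]))
}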
## # A tibble: 6 × 5
##   area_name                     zip_code studio one_bedroom two_bedroom
##   <chr>                         <chr>     <int>       <int>       <int>
## 1 Atlanta-Sandy Springs-Roswell 30002      1100        1150        1270
## 2 Atlanta-Sandy Springs-Roswell 30003      1730        1800        1990
## 3 Atlanta-Sandy Springs-Roswell 30004      1830        1910        2110
## 4 Atlanta-Sandy Springs-Roswell 30005      2030        2110        2340
## 5 Atlanta-Sandy Springs-Roswell 30006      1660        1730        1910
## 6 Atlanta-Sandy Springs-Roswell 30007      1660        1730        1910

Tidying the FMR Data

# Reshape to one row per zip code and bedroom count
metro_fmr_by_zip_and_num_bds <- metro_fmrs |>
  pivot_longer(
    cols = studio:two_bedroom,
    names_to = "bedrooms",
    values_to = "fmr"
  ) |>
  mutate(
    bedrooms = case_when(
      bedrooms == "studio" ~ 0L,
      bedrooms == "one_bedroom" ~ 1L,
      bedrooms == "two_bedroom" ~ 2L
    )
  )
## # A tibble: 6 × 4
##   area_name                     zip_code bedrooms   fmr
##   <chr>                         <chr>       <int> <int>
## 1 Atlanta-Sandy Springs-Roswell 30002           0  1100
## 2 Atlanta-Sandy Springs-Roswell 30002           1  1150
## 3 Atlanta-Sandy Springs-Roswell 30002           2  1270
## 4 Atlanta-Sandy Springs-Roswell 30003           0  1730
## 5 Atlanta-Sandy Springs-Roswell 30003           1  1800
## 6 Atlanta-Sandy Springs-Roswell 30003           2  1990
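
The recoding step maps column names to bedroom counts, and `case_when()` returns `NA` for any label with no matching condition. The base R sketch below shows the same mapping as a named lookup vector, which makes that behavior easy to see. The `three_bedroom` label is hypothetical and is included only to show an unmatched case.

```r
# Lookup table from column label to bedroom count
bedroom_lookup <- c(studio = 0L, one_bedroom = 1L, two_bedroom = 2L)

# "three_bedroom" is a hypothetical label with no entry in the lookup
labels <- c("studio", "one_bedroom", "two_bedroom", "three_bedroom")

# Unknown labels come back as NA rather than raising an error
bedrooms <- unname(bedroom_lookup[labels])
print(bedrooms)
```

Checking for `NA` after recoding is a cheap way to catch unexpected labels if more bedroom sizes are added later.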

Scraping Trulia

# Read the Trulia results page for one zip code and bedroom count (read_html() comes from rvest/xml2)
page_response <- read_html(paste0("https://www.trulia.com/for_rent/", 14201, "_zip/", 2, "_beds/"))
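
Since the same URL pattern has to be built for every zip code and bedroom count, a small helper can keep it in one place. This is a sketch only: the function name `trulia_rent_url` is hypothetical, and it covers just the 1- and 2-bedroom pattern shown above (the URL form for studios is not shown in this deck).

```r
# Build a Trulia rental search URL from a zip code and bedroom count.
# Zip codes are passed as strings so leading zeros are preserved.
trulia_rent_url <- function(zip_code, bedrooms) {
  sprintf("https://www.trulia.com/for_rent/%s_zip/%d_beds/",
          zip_code, as.integer(bedrooms))
}

# sprintf() is vectorized, so several searches can be built at once
urls <- trulia_rent_url(c("14201", "30002"), c(2, 1))
print(urls)
```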

Scraping Trulia (Cont.)

## # A tibble: 10 × 3
##    zip_code bedrooms market_rate
##    <chr>       <dbl>       <int>
##  1 30002           2        2250
##  2 30002           2        1900
##  3 30002           2        1950
##  4 30002           2        1200
##  5 30002           2        1200
##  6 30002           2        1286
##  7 30002           2        1570
##  8 30002           2        1450
##  9 30002           2        1923
## 10 30002           2        2420
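
Scraped prices typically arrive as text and have to be converted to integers like those in the `market_rate` column. The sketch below assumes price strings of the form "$2,250/mo", which is an assumption about the page format rather than something shown in this deck. The helper name `parse_rent` is hypothetical.

```r
# Convert a price string to an integer rent; returns NA when no digits are present.
# Assumes a single price per string (a range such as "$1,200 - $1,500" would not parse correctly).
parse_rent <- function(price_text) {
  digits <- gsub("[^0-9]", "", price_text)
  digits[digits == ""] <- NA_character_
  as.integer(digits)
}

parsed <- parse_rent(c("$2,250/mo", "$1,900/mo", "Contact for price"))
print(parsed)
```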

Calculating the 40th Percentile for Market Rates

# group by zip codes and # bedrooms
market_rate_by_zip_and_num_bds <- market_rates |>
  group_by(across(all_of(c("zip_code", "bedrooms")))) |>
  summarise(
    market_rate = round(quantile(market_rate,probs=0.4))
  )
## `summarise()` has grouped output by 'zip_code'. You can override using the
## `.groups` argument.
head(market_rate_by_zip_and_num_bds)
## # A tibble: 6 × 3
## # Groups:   zip_code [3]
##   zip_code bedrooms market_rate
##   <chr>       <int>       <dbl>
## 1 14001           1        1040
## 2 14001           2        1145
## 3 14004           1        1748
## 4 14004           2        1750
## 5 14006           1        1159
## 6 14006           2        1315
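
Because each percentile is computed from however many listings a zip code happened to have, it can help to report the listing count next to it. The base R sketch below uses the ten scraped 2-bedroom listings for zip code 30002 shown earlier. The `n_listings` column is an addition for illustration and is not part of the original analysis.

```r
# The ten scraped 2-bedroom listings for zip code 30002 shown earlier
market_rates <- data.frame(
  zip_code = rep("30002", 10),
  bedrooms = rep(2L, 10),
  market_rate = c(2250, 1900, 1950, 1200, 1200, 1286, 1570, 1450, 1923, 2420),
  stringsAsFactors = FALSE
)

# One group per zip code / bedroom combination
group_key <- interaction(market_rates$zip_code, market_rates$bedrooms, drop = TRUE)

# 40th percentile and number of listings per group
p40 <- tapply(market_rates$market_rate, group_key,
              function(x) round(quantile(x, 0.4, names = FALSE)))
n_listings <- tapply(market_rates$market_rate, group_key, length)

summary_df <- data.frame(
  group = names(p40),
  market_rate = as.vector(p40),
  n_listings = as.vector(n_listings)
)
print(summary_df)
```

A percentile based on only a handful of listings is noisy, so the count gives readers a way to judge how much weight each estimate deserves.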

Joining Dataframes

# Combine FMR and market-rate dataframes by zip code and bedroom count
combined_df_by_zip <- metro_fmr_by_zip_and_num_bds |>
  right_join(market_rate_by_zip_and_num_bds, by = c("zip_code", "bedrooms")) |>
  mutate(bedrooms = as.factor(bedrooms))

# Pivot longer by estimate type
combined_df_by_zip_and_type <- combined_df_by_zip |>
  pivot_longer(
    cols = c(fmr, market_rate),
    names_to = "type",
    values_to = "rent"
  )
## # A tibble: 6 × 5
##   area_name                     zip_code bedrooms type         rent
##   <chr>                         <chr>    <fct>    <chr>       <dbl>
## 1 Atlanta-Sandy Springs-Roswell 30002    0        fmr          1100
## 2 Atlanta-Sandy Springs-Roswell 30002    0        market rate  1398
## 3 Atlanta-Sandy Springs-Roswell 30002    1        fmr          1150
## 4 Atlanta-Sandy Springs-Roswell 30002    1        market rate  1360
## 5 Atlanta-Sandy Springs-Roswell 30002    2        fmr          1270
## 6 Atlanta-Sandy Springs-Roswell 30002    2        market rate  1675
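
A right join keeps every market-rate row, so any zip code and bedroom combination without an FMR match ends up with `NA` in the `fmr` column. The base R sketch below reproduces that behavior with `merge(..., all.y = TRUE)`. The "99999" row is hypothetical and exists only to show an unmatched listing group.

```r
# FMR values for zip code 30002 taken from the tables above
fmr <- data.frame(
  zip_code = c("30002", "30002"),
  bedrooms = c(1L, 2L),
  fmr = c(1150L, 1270L),
  stringsAsFactors = FALSE
)

# Market rates; the "99999" row is hypothetical and has no FMR counterpart
market <- data.frame(
  zip_code = c("30002", "30002", "99999"),
  bedrooms = c(1L, 2L, 1L),
  market_rate = c(1360, 1675, 1000),
  stringsAsFactors = FALSE
)

# all.y = TRUE keeps every row of `market`, like a right join
joined <- merge(fmr, market, by = c("zip_code", "bedrooms"), all.y = TRUE)

# Rows that found no FMR match
unmatched <- joined[is.na(joined$fmr), ]
print(unmatched)
```

Inspecting the unmatched rows before computing differences avoids silently carrying `NA` values into later summaries.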

Plotting the Distribution of Estimate Types

Count of Zip Codes for Each Estimate Type

# Calculate rent differences by zip code
# margin_of_error = market rate minus FMR (positive when the market rate exceeds FMR)
combined_df_by_zip_diffs <- combined_df_by_zip |>
  mutate(
    margin_of_error = market_rate - fmr,
    at_or_below_fmr = as.factor(fmr >= market_rate)
  ) |>
  select(-c(market_rate))

# Counts by Estimate Type
counts_by_estimate_type <- combined_df_by_zip_diffs |>
  group_by(area_name, bedrooms, at_or_below_fmr) |>
  summarise(
    count_type =  n(),
  ) |>
  ungroup() |>
  arrange(area_name, bedrooms, desc(at_or_below_fmr))

# calc total counts
combined_df_by_zip_diffs_total_counts <- combined_df_by_zip_diffs |>
  group_by(area_name, bedrooms) |>
  summarise(
    count_total = n(),
  ) |>
  ungroup() 

## # A tibble: 4 × 6
##   area_name              zip_code bedrooms   fmr margin_of_error at_or_below_fmr
##   <chr>                  <chr>    <fct>    <int>           <dbl> <fct>          
## 1 Atlanta-Sandy Springs… 30002    0         1100             298 FALSE          
## 2 Atlanta-Sandy Springs… 30002    1         1150             210 FALSE          
## 3 Atlanta-Sandy Springs… 30002    2         1270             405 FALSE          
## 4 Atlanta-Sandy Springs… 30003    0         1730            -654 TRUE
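
As a worked check of the difference and flag columns, the base R sketch below recomputes them from the FMR and market-rate values shown in earlier tables. The market rate for zip code 30003 is not shown directly; 1076 is implied by the FMR (1730) and the difference (-654) in the output above. Note that `margin_of_error` here is a plain dollar difference, not a statistical margin of error.

```r
# Studio, 1- and 2-bedroom values for zip 30002, plus the studio row for 30003
rents <- data.frame(
  zip_code = c("30002", "30002", "30002", "30003"),
  bedrooms = c(0L, 1L, 2L, 0L),
  fmr = c(1100, 1150, 1270, 1730),
  market_rate = c(1398, 1360, 1675, 1076),  # 1076 is implied, not shown directly
  stringsAsFactors = FALSE
)

# Positive values mean the market rate exceeds FMR
rents$margin_of_error <- rents$market_rate - rents$fmr

# TRUE when the market rate is at or below FMR
rents$at_or_below_fmr <- rents$market_rate <= rents$fmr
print(rents)
```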

# Calculate percent of type per group
counts_by_estimate_type <- counts_by_estimate_type |>
  left_join(combined_df_by_zip_diffs_total_counts, by=c("area_name", "bedrooms")) |>
  mutate(
    type_ratio = count_type / count_total
  ) 

head(counts_by_estimate_type, n=4)
## # A tibble: 4 × 6
##   area_name           bedrooms at_or_below_fmr count_type count_total type_ratio
##   <chr>               <fct>    <fct>                <int>       <int>      <dbl>
## 1 Atlanta-Sandy Spri… 0        TRUE                   159         222      0.716
## 2 Atlanta-Sandy Spri… 0        FALSE                   63         222      0.284
## 3 Atlanta-Sandy Spri… 1        TRUE                   225         270      0.833
## 4 Atlanta-Sandy Spri… 1        FALSE                   45         270      0.167
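
The `type_ratio` column is each count divided by its group total. As a sketch, the same shares can be reproduced in base R from the counts shown above, using `ave()` for the group totals instead of a separate join.

```r
# Counts for the Atlanta area taken from the output above
counts <- data.frame(
  bedrooms = c(0L, 0L, 1L, 1L),
  at_or_below_fmr = c(TRUE, FALSE, TRUE, FALSE),
  count_type = c(159L, 63L, 225L, 45L)
)

# Total number of zip codes per bedroom group
counts$count_total <- ave(counts$count_type, counts$bedrooms, FUN = sum)

# Share of zip codes in each category, rounded to three decimals
counts$type_ratio <- round(counts$count_type / counts$count_total, 3)
print(counts)
```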

Absolute Dollar Difference

What is the Median Rent Difference Above FMR?

Groups of Interest where Market Rate Exceeded FMR

Area                               Bedrooms  FMR     Median $ Diff Above FMR
Buffalo-Cheektowaga-Niagara Falls  1         $990    $220
Buffalo-Cheektowaga-Niagara Falls  2         $1180   $198



Groups of Interest where FMR Exceeded Market Rates

Area                Bedrooms  FMR     Median $ Diff Below FMR  Median $ Diff Above FMR
San Diego-Carlsbad  2         $2880   $314                     $372
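
The median differences in these tables split zip codes by whether the market rate is above FMR or at or below it. The base R sketch below shows that split on the four example differences from the earlier output. Those four values come from different bedroom groups, so this only illustrates the calculation and does not reproduce the table values.

```r
# Example differences (market rate minus FMR) from the earlier output
diffs <- c(298, 210, 405, -654)

# Median dollar amount by which market rates exceed FMR
median_above <- median(diffs[diffs > 0])

# Median dollar amount by which market rates fall at or below FMR
median_below <- median(abs(diffs[diffs <= 0]))

print(c(above = median_above, below = median_below))
```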

Conclusion & Limitations

  • The analysis is based on a "snapshot" of data scraped on December 12th, showing only a sample of the listings available that day.
  • Some zip codes had no rental listings available on that particular day.
  • AJAX pagination on the listing site caused scraping issues.
  • Rental listing data should be collected over time to better represent real market conditions. Alternatively, use time-series data such as the Zillow Observed Rent Index (ZORI).
  • Future work: expand to include 3- and 4-bedroom apartments and other metropolitan areas.