Overview

Use this R script to extract county-level median rent estimates and their error margins from the U.S. Census Bureau API, format the data, and filter it for user-specified counties. The script then calculates the minimum monthly and hourly income, and the number of minimum-wage jobs, needed to afford each median rent (assuming that “affordable” means no more than 30 percent of one’s income), and produces an interactive graphic of each median rent and its error margin. It also saves the final dataset as a local .csv file.

Loading required packages

The “tidyverse” and “tidycensus” packages are required. This code checks to see whether they are installed already, installs them if they aren’t, and loads them into memory.

if (!require("tidyverse")) install.packages("tidyverse")
## Loading required package: tidyverse
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6     ✔ purrr   0.3.4
## ✔ tibble  3.1.8     ✔ dplyr   1.0.9
## ✔ tidyr   1.2.0     ✔ stringr 1.4.0
## ✔ readr   2.1.2     ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
if (!require("tidycensus")) install.packages("tidycensus")
## Loading required package: tidycensus
library(tidyverse)
library(tidycensus)

Census API authentication

The U.S. Census Bureau’s API requires an API key. You can get one for free at https://api.census.gov/data/key_signup.html. Once you have yours, replace PasteYourAPIKeyBetweenTheseQuoteMarks with your API key, and delete the # from the front of the census_api_key line so that the code runs.

#census_api_key("PasteYourAPIKeyBetweenTheseQuoteMarks")
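As an alternative to pasting the key into the script, the key can be kept in an environment variable. The sketch below assumes the key is stored under the name CENSUS_API_KEY, which is the variable tidycensus reads; the helper census_key_available is a hypothetical name used only for this illustration.

```r
# Hypothetical helper (not part of the original script): reports whether
# a Census API key is present in the CENSUS_API_KEY environment variable.
census_key_available <- function() {
  nzchar(Sys.getenv("CENSUS_API_KEY"))
}

if (census_key_available()) {
  message("Census API key found in CENSUS_API_KEY.")
} else {
  message("No key found. Run census_api_key() with your key first.")
}
```

Calling census_api_key with install = TRUE saves the key to your .Renviron file, so it is available in later R sessions without appearing in the script.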

Getting the data

The script is now ready to fetch the estimated median rent data from the Census API and store it in a data frame called MedianRent. The TN and 2021 parameters tell the script to request data for counties in Tennessee from the 2021 American Community Survey. You may edit these to retrieve data for other states and/or years, provided data are available. Importantly, the acs1 parameter tells the script to request data from the single-year American Community Survey. Single-year ACS datasets contain county-level data only for counties with 65,000 residents or more. Getting data for smaller counties would require changing acs1 to acs5, which requests the five-year ACS, assuming the requested data are available.

The B25031_001 parameter is the U.S. Census Bureau’s name for the Median Gross Rent estimate. You can retrieve it and other variable names from the American Community Survey 1-Year Data Codebook.

MedianRent <- get_acs(geography = "county",
                  state = "TN",
                  variables = c(Median = "B25031_001"),
                  year = 2021,
                  survey = "acs1",
                  output = "wide",
                  geometry = FALSE)
## Getting data from the 2021 1-year ACS
## The 1-year ACS provides data for geographies with populations of 65,000 and greater.
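Because the state, year and survey are the values most likely to change, one option is to wrap the request in a small function. This is only a sketch: fetch_median_rent is a hypothetical name, and its argument check accepts only the two survey values discussed here.

```r
# Hypothetical wrapper (not part of the original script) around the same
# get_acs() request, with the state, year and survey exposed as arguments.
fetch_median_rent <- function(state = "TN", year = 2021, survey = "acs1") {
  # Assumption: only the two surveys discussed in this document are allowed
  survey <- match.arg(survey, c("acs1", "acs5"))
  tidycensus::get_acs(geography = "county",
                      state = state,
                      variables = c(Median = "B25031_001"),
                      year = year,
                      survey = survey,
                      output = "wide",
                      geometry = FALSE)
}

# Example (requires an API key and an internet connection):
# MedianRent <- fetch_median_rent(state = "TN", year = 2021, survey = "acs5")
```

Requesting acs5 returns five-year estimates, which cover smaller counties, provided the data are available for the requested year.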

Formatting and filtering

By default, the ACS adds “ County, Tennessee” to every county name retrieved. This script searches for, and removes, the “ County, Tennessee” text from each county name. It also deletes data for all counties except Davidson County and the five 65,000-plus counties contiguous to Davidson County. The list of county names can be modified as needed, or the filtering step can be skipped altogether.

# Shortening county names
MedianRent$NAME <- str_remove(MedianRent$NAME,
                              " County, Tennessee")

# Filtering for Davidson and contiguous counties
MedianRent <- MedianRent %>%
  filter(NAME %in% c("Davidson",
                     "Robertson",
                     "Rutherford",
                     "Sumner",
                     "Williamson",
                     "Wilson"))
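If you request a state other than Tennessee, the fixed “ County, Tennessee” text will no longer match. A minimal base-R sketch of a more general pattern follows, assuming the names take the form “X County, State”. The county names and the keep list are illustrative only.

```r
# Illustrative names only, in the "X County, State" form
county_names <- c("Davidson County, Tennessee",
                  "Shelby County, Tennessee",
                  "Wilson County, Tennessee")

# Remove " County, " and everything after it, whatever the state name
short_names <- sub(" County, .*$", "", county_names)

# Keep only the counties of interest (hypothetical list)
keep <- c("Davidson", "Wilson")
filtered <- short_names[short_names %in% keep]
print(filtered)
```

Names that do not contain “ County, ” are left unchanged by this pattern.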

Calculations

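This code calculates the lower (From) and upper (To) ends of the error margin for each rent estimate. Because the data were requested in wide format with the variable named Median, tidycensus stores the estimate in a column called MedianE and the error margin width provided by the Census Bureau in a column called MedianM.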

It also calculates the minimum Monthly and Hourly income one would have to earn in order to pay the median rent with no more than 30 percent of one’s total income. Budgeting experts widely recommend spending no more than 30 percent of one’s household budget on housing. The script also calculates the number of minimum-wage Jobs one would have to work to afford each median rent, assuming $7.25 an hour, 40 hours of work per week, and rent equal to no more than 30 percent of one’s total income.

Finally, the script rounds the calculated figures to two decimal places, sorts the data in descending order by the Median estimate, and saves the data to a .csv file in the current working directory.

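# Calculating error margin ranges and rent contextual statistics
MedianRent$County <- MedianRent$NAME
MedianRent$Median <- MedianRent$MedianE
# Lower and upper ends of the error margin
MedianRent$From <- MedianRent$MedianE - MedianRent$MedianM
MedianRent$To <- MedianRent$MedianE + MedianRent$MedianM
# Monthly income at which the median rent equals 30 percent of income
MedianRent$Monthly <- round(MedianRent$MedianE / 0.30, digits = 2)
# Hourly pay, treating a month as 160 work hours (four 40-hour weeks)
MedianRent$Hourly <- round(MedianRent$Monthly / 160, digits = 2)
# Number of $7.25-an-hour jobs needed to earn that hourly pay
MedianRent$Jobs <- round(MedianRent$Hourly / 7.25, digits = 2)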

# Sorting data by median gross rent, in descending order
MedianRent <- MedianRent[order(-MedianRent$MedianE),]

# Saving data to a .csv
write.csv(MedianRent,"CountyMedianRents.csv", row.names = FALSE)
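To see how the calculations above play out, here is a worked example for a single made-up rent. The $1,200 rent and $50 error margin are hypothetical values, not Census data, and rent_affordability is a hypothetical helper that repeats the same arithmetic as the script.

```r
# Hypothetical helper repeating the script's arithmetic for one rent value
rent_affordability <- function(rent, moe) {
  monthly <- round(rent / 0.30, digits = 2)    # rent is 30 percent of income
  hourly  <- round(monthly / 160, digits = 2)  # 160 work hours per month
  jobs    <- round(hourly / 7.25, digits = 2)  # $7.25-an-hour jobs needed
  data.frame(From = rent - moe, To = rent + moe,
             Monthly = monthly, Hourly = hourly, Jobs = jobs)
}

# Made-up inputs: a $1,200 median rent with a $50 error margin
example <- rent_affordability(rent = 1200, moe = 50)
print(example)
```

With these inputs, the error margin runs from $1,150 to $1,250, the required monthly income is $4,000, the required hourly pay is $25, and it would take about 3.45 minimum-wage jobs to afford the rent.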

Graphing the median rents

Here is a script for producing an interactive bar chart showing each median rent, the county it is associated with, and the median rent’s error margin, all sorted in descending order, from left to right. The plotly package is required. The code checks to see whether it is installed, installs it if it isn’t, and loads it into memory. Note that the code assumes the MedianRent data frame has been produced and is available in memory.

if (!require("plotly")) install.packages("plotly")
## Loading required package: plotly
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
library(plotly)

plot_ly(MedianRent,
        x = ~County,
        y = ~Median,
        type = 'bar',
        name = 'Median Rent',
        error_y = ~list(array = MedianM,
                        color = '#000000')) %>% 
  layout(xaxis = list(categoryorder = "total descending"))