Use this R script to extract county-level median rent estimates and their error margins from the U.S. Census Bureau API, format the data, and filter it for user-specified counties. The script then calculates the minimum monthly and hourly pay, and the number of minimum-wage jobs, needed to afford each median rent (assuming that “affordable” means spending no more than 30 percent of one’s income on rent), and produces an interactive graphic of each median rent and its error margin. The script also saves the final dataset as a local .csv file.
The “tidyverse” and “tidycensus” packages are required. This code checks to see whether they are installed already, installs them if they aren’t, and loads them into memory.
if (!require("tidyverse")) install.packages("tidyverse")
## Loading required package: tidyverse
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ dplyr 1.0.9
## ✔ tidyr 1.2.0 ✔ stringr 1.4.0
## ✔ readr 2.1.2 ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
if (!require("tidycensus")) install.packages("tidycensus")
## Loading required package: tidycensus
library(tidyverse)
library(tidycensus)
The U.S. Census Bureau’s API requires an API key. You can get one for free at https://api.census.gov/data/key_signup.html. Once you get yours, replace PasteYourAPIKeyBetweenTheseQuoteMarks with your API key. Also, delete the # from the front of the #census_api_key code.
#census_api_key("PasteYourAPIKeyBetweenTheseQuoteMarks")
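If you would rather not paste the key into the script itself, tidycensus can also pick it up from the CENSUS_API_KEY environment variable, which census_api_key() fills in for the current session (and can save to your .Renviron file when called with install = TRUE). The following is a minimal sketch, not part of the original script, that only checks whether a key is already available. The census_key variable name is an assumption made for illustration.

```r
# Sketch: check whether a Census API key is already set for this session.
# "census_key" is a hypothetical variable name used only in this example.
census_key <- Sys.getenv("CENSUS_API_KEY")

if (nzchar(census_key)) {
  message("Found a Census API key in the environment.")
} else {
  message("No key found. Run census_api_key() with your key before calling get_acs().")
}
```

Keeping the key out of the script is mainly useful if you plan to share the code with others.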
You are now all set to fetch the estimated median rent data from the Census API and store it in a data frame called MedianRent. The TN and 2021 parameters tell the script to request data for counties in Tennessee from the 2021 American Community Survey. You may edit these to retrieve data for other states and/or years, provided data are available. Importantly, the acs1 parameter tells the script to request data from the single-year American Community Survey. Single-year ACS datasets contain county-level data only for counties with 65,000 residents or more. Getting data for smaller counties would require changing acs1 to acs5, assuming the requested data are available.
The B25031_001 parameter is the U.S. Census Bureau’s name for the median gross rent estimate. You can retrieve it and other variable names from the American Community Survey 1-Year Data Codebook.
MedianRent <- get_acs(geography = "county",
state = "TN",
variables = c(Median = "B25031_001"),
year = 2021,
survey = "acs1",
output = "wide",
geometry = FALSE)
## Getting data from the 2021 1-year ACS
## The 1-year ACS provides data for geographies with populations of 65,000 and greater.
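Because the request uses output = "wide" and names the variable Median, the returned data frame holds one row per county, with the estimate in a column ending in E and the margin of error in a column ending in M. The sketch below imitates that layout with a hand-built data frame. The county names and dollar values are made-up placeholders, not real ACS results, and example_wide is a hypothetical name.

```r
# Sketch of the wide layout returned by get_acs() for a variable named "Median".
# All values below are made-up placeholders, NOT real ACS estimates.
example_wide <- data.frame(
  GEOID   = c("00001", "00002"),
  NAME    = c("Example County, Tennessee", "Sample County, Tennessee"),
  MedianE = c(1200, 1000),  # estimate (E suffix)
  MedianM = c(50, 80)       # margin of error (M suffix)
)

# List the column names the later steps rely on
names(example_wide)
```

The later steps of the script refer to the MedianE and MedianM columns by exactly these names, so renaming the variable in the get_acs() call would change them too.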
By default, the ACS adds " County, Tennessee" to every county name retrieved. This script searches for and removes the " County, Tennessee" text from each county name. It also drops all counties except Davidson County and the five contiguous counties with 65,000 or more residents. The list of county names can be modified as needed, or the filtering step can be skipped altogether.
# Shortening county names
MedianRent$NAME <- str_remove(MedianRent$NAME,
" County, Tennessee")
# Filtering for Davidson and contiguous counties
MedianRent <- MedianRent %>%
filter(NAME %in% c("Davidson",
"Robertson",
"Rutherford",
"Sumner",
"Williamson",
"Wilson"))
This code calculates the lower (From) and upper (To) ends of the error margin for each rent estimate, using the MedianM error margin width variable provided by the Census Bureau.
It also calculates the minimum Monthly and Hourly income one would have to earn in order to pay the median rent with no more than 30 percent of one’s total income. Budgeting experts widely recommend spending no more than 30 percent of one’s household budget on housing. The Hourly figure assumes 160 hours of work per month (40 hours a week for four weeks). The script also calculates the number of minimum-wage Jobs one would have to work to afford each median rent, assuming $7.25 an hour, 40 hours of work per week, and rent equal to no more than 30 percent of one’s total income.
Finally, the script rounds the Monthly, Hourly and Jobs figures to two decimal places, sorts the data in descending order by the Median estimate, and saves the data to a .csv file in R’s current working directory (typically the same directory as the script).
# Calculating error margin ranges and rent contextual statistics
MedianRent$County <- MedianRent$NAME
MedianRent$Median <- MedianRent$MedianE
MedianRent$From <- MedianRent$MedianE - MedianRent$MedianM
MedianRent$To <- MedianRent$MedianE + MedianRent$MedianM
MedianRent$Monthly <- round(MedianRent$MedianE / .30, digits = 2)
MedianRent$Hourly <- round(MedianRent$Monthly / 160, digits = 2)
MedianRent$Jobs <- round(MedianRent$Hourly / 7.25, digits = 2)
# Sorting data by median gross rent, in descending order
MedianRent <- MedianRent[order(-MedianRent$MedianE), ]
# Saving data to a .csv
write.csv(MedianRent, "CountyMedianRents.csv", row.names = FALSE)
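To make the arithmetic concrete, here is a worked example that applies the same formulas to a single rent figure. The $1,500 rent and $100 error margin are hypothetical inputs chosen for easy math, not real ACS estimates, and all variable names are invented for this illustration.

```r
# Worked example of the affordability formulas, using hypothetical inputs.
rent_estimate <- 1500  # hypothetical median gross rent, dollars per month
rent_margin   <- 100   # hypothetical margin of error, dollars

# Error margin range around the estimate
from <- rent_estimate - rent_margin  # 1400
to   <- rent_estimate + rent_margin  # 1600

# Income needed so that rent is no more than 30 percent of income
monthly <- round(rent_estimate / 0.30, digits = 2)  # 5000
hourly  <- round(monthly / 160, digits = 2)         # 31.25
jobs    <- round(hourly / 7.25, digits = 2)         # 4.31

# Alternative assumption: 40 hours x 52 weeks / 12 months (about 173.33 hours)
hourly_alt <- round(monthly / (40 * 52 / 12), digits = 2)  # 28.85
```

The last line shows how sensitive the hourly figure is to the hours assumption. Dividing by 160 treats every month as exactly four weeks, which produces a somewhat higher hourly requirement than spreading a 52-week year evenly over 12 months.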
Here is a script for producing an interactive bar chart showing each median rent, the county it is associated with, and the median rent’s error margin, with bars sorted in descending order from left to right. The plotly package is required. The code checks to see whether it is installed, installs it if it isn’t, and loads it into memory. Note that the code assumes the MedianRent data frame has been produced and is available in memory.
if (!require("plotly")) install.packages("plotly")
## Loading required package: plotly
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
library(plotly)
plot_ly(MedianRent,
x = ~County,
y = ~Median,
type = 'bar',
name = 'Median Rent',
error_y = ~list(array = MedianM,
color = '#000000')) %>%
layout(xaxis = list(categoryorder = "total descending"))
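If you need a static image instead of an interactive chart (for a printed report, say), a similar bar chart with error bars can be drawn with ggplot2, which is loaded as part of the tidyverse. This is a sketch of one possible alternative, not part of the original script. It uses a small made-up data frame with the same County, Median, From and To column names so that it runs on its own. The example_rent and static_chart names, and the county names and values, are hypothetical.

```r
library(ggplot2)

# Made-up placeholder data with the same column names the script produces.
# These are NOT real ACS estimates.
example_rent <- data.frame(
  County = c("Alpha", "Beta", "Gamma"),
  Median = c(1500, 1300, 1100),
  From   = c(1400, 1250, 1000),
  To     = c(1600, 1350, 1200)
)

# Bars sorted in descending order by Median, with error bars from From to To
static_chart <- ggplot(example_rent,
                       aes(x = reorder(County, -Median), y = Median)) +
  geom_col() +
  geom_errorbar(aes(ymin = From, ymax = To), width = 0.2) +
  labs(x = "County", y = "Median gross rent")

static_chart
```

To chart the real data, swap example_rent for the MedianRent data frame once the earlier steps have run.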