README

Overview

This file contains instructions to reproduce data used for the survey design and analysis in Elmendorf, Nall, and Oklobdzija (2025b). The file structure of this replication package (Elmendorf, Nall, and Oklobdzija 2025a) is organized as follows:

JEP_replication/

└── README.md

└── references.bib

└── License.txt

└── code/

└── data/

└── figures/

└── survey_instruments/

└── tables/

The file /code/folk_econ_rep_code_JEP.rmd generates all figures and tables for the paper and online appendix, saving them to the /figures and /tables folders, respectively. The replicator should expect the code to run for about 6 minutes.

The file /code/zip_replicator/rents_home_price_api_data_generator.R creates the data table used to power the API that provided survey respondents with home price and rent data for their zip code. Those data are also provided in the data folder as price_by_zip.csv.

Data Availability and Provenance Statements


The figures in our paper use data from four original surveys with registered preanalysis plans. (One figure is borrowed from a previously published paper, as noted in the text.) The surveys were conducted using the Qualtrics platform. The sampling frame consists of residents of U.S. urban and suburban zip codes. We excluded zip codes with a population density of less than 500 persons per square mile. Code to generate the

The respondents are from an online panel provided by Forthright, a leading vendor. We directed the survey vendor to maintain equal proportions of homeowners and renters in the sample and to balance on age, race, and gender using the vendor’s nationally representative population quotas. See Online Appendix Section 8 for a table benchmarking the demographics of our samples against the U.S. population.

The data included with this replication package were downloaded from Qualtrics shortly after each survey closed (download dates are noted in folk_econ_rep_code_JEP.Rmd). These .csv files were renamed for convenience but not altered in any way, with one exception: we excluded two columns from JPIPE paper data.csv because some respondents provided personally identifiable information in these fields, and the fields are not used or referenced in the present paper.

Statement about Rights

I certify that the author(s) of the manuscript have legitimate access to and permission to use the data used in this manuscript.
I certify that the author(s) of the manuscript have documented permission to redistribute/publish the data contained within this replication package. Appropriate permissions are documented in the LICENSE.txt file.

License for Data

The data are licensed under a Creative Commons/CC-BY-NC license. See LICENSE.txt for details.

Summary of Availability

All data are publicly available.

Details on each Data Source

Data.Name	Data.Files	Location	Provided	Citation
“Survey 1”	`Survey 1.csv`	data/	TRUE	(Elmendorf, Nall, and Oklobdzija 2025a)
“Survey 2”	`Survey 2.csv`	data/	TRUE	(Elmendorf, Nall, and Oklobdzija 2025a)
“Survey 3”	`Survey 3.csv`	data/	TRUE	(Elmendorf, Nall, and Oklobdzija 2025a)
“ZCTA, NHGIS”	`nhgis0009_ds244_20195_zcta.csv`	data/	TRUE	(Manson et al. 2024)
“Block Group, NHGIS”	`nhgis0010_ds244_20195_blck_grp.csv`	data/	TRUE	(Manson et al. 2024)
“ZCTA GIS, NHGIS”	`US_zcta_2020.shp`	data/	TRUE	(Manson et al. 2024)
“Blk Grp GIS, NHGIS”	`US_blck_grp_2019.shp`	data/	TRUE	(Manson et al. 2024)
“Zillow HVI by Zip”	`zhvi.csv`	data/zip_replication	TRUE	(Zillow Group n.d.)
“Zillow HVI by County”	`zhvi_county.csv`	data/zip_replication	TRUE	(Zillow Group n.d.)
“Zillow ORI by Zip”	`zori.csv`	data/zip_replication	TRUE	(Zillow Group n.d.)
“US Zips”	`uszips.csv`	data/zip_replication	TRUE	(Pareto Software, LLC 2022)
“Zip-County Crosswalk”	`zip_county_cw.csv`	data/zip_replication	TRUE	(Missouri Census Data Center 2025)
“City/Co. Med Rents”	`alist_2021_11.csv`	data/zip_replication	TRUE	(Apartment List Research Team 2022)

Each GIS dataset contains multiple, related GIS files, usually with the same filename stem. For brevity, we refer only to the .shp file.

Public use data collected by the authors

The survey data used to support the findings of this study have been deposited in the AEA ICPSR repository ([@CN add DOI here]). The data were collected by the authors and are available under a Creative Commons Non-commercial license.

Datafiles: data/Survey 1.csv, data/Survey 2.csv, data/Survey 3.csv, data/JPIPE paper data.csv.

Public use data sourced from elsewhere and provided

In addition to screening zip codes by respondent, we assembled ZCTA-level data on local demographics and prices.
These data were used to feed locally specific housing price information to respondents.

Below, we list each data file, its origin, and a website with data description or codebook.

Datafiles:

data/zip_replication/zhvi.csv: Zip-code-level measures of the Zillow Home Value Index (ZHVI) All Homes (SFR, Condo/Co-op) Time Series, Smoothed, Seasonally Adjusted ($), obtained from Zillow at https://www.zillow.com/research/data/.

data/zip_replication/zhvi_county.csv: County-level measures of the Zillow Home Value Index (ZHVI) All Homes (SFR, Condo/Co-op) Time Series, Smoothed, Seasonally Adjusted ($), obtained from Zillow at https://www.zillow.com/research/data/.

data/zip_replication/zori.csv: Zip-code-level measures of the Zillow Observed Rent Index (ZORI) All Homes Plus Multifamily Time Series ($), obtained from Zillow at https://www.zillow.com/research/data/.

data/zip_replication/uszips.csv: A list of all ZCTAs in the 50 American states as well as the District of Columbia was obtained from https://simplemaps.com/data/us-zips with a basic subscription. A codebook is available at the listed website.

data/zip_replication/zip_county_cw.csv: A crosswalk linking ZCTAs to counties obtained from the University of Missouri’s Census Data Center at https://mcdc.missouri.edu/applications/geocorr2022.html. The crosswalk was obtained by selecting all 50 U.S. states and the District of Columbia, ZIP/ZCTA in the “Source Geography” and County in the “Target Geography.” Population was selected for the weighting variable.

data/zip_replication/alist_2021_11.csv: Data on city and county median rents were obtained from Apartment List at https://www.apartmentlist.com/research/category/data-rent-estimates. The Current Month Summary report option was selected in November 2021 from the Download Report dropdown. A description of the data is also featured on that page.

Example for public use data with required registration and provided extract

We use IPUMS NHGIS data to identify the zip codes to be included in our survey sampling frame. We calculate the population density of each block group. Then, using R spatial packages, we spatially join block-group centroids to zip code tabulation areas (ZCTAs) and calculate the average density of block groups within each ZCTA, weighting by block-group population. IPUMS does not allow redistribution, except for inclusion of files in replication archives.

Datafiles:

data/nhgis0009_ds244_20195_zcta.csv: Zip code tabulation area (ZCTA) data for the 2019 5-year ACS estimates with population attributes, used to identify rural and non-rural zip codes for purposes of survey sampling. A codebook appears in the data archive at data/nhgis0009_ds244_20195_zcta_codebook.txt.

data/nhgis0010_ds244_20195_blck_grp.csv: ACS 2019 5-year population estimates from NHGIS. These populations are used in an overlay of block-group centroids over ZCTAs to calculate population-weighted density within each ZCTA. A codebook appears in the data archive at data/nhgis0010_ds244_20195_blck_grp_codebook.txt.

data/US_zcta_2020.shp: The ZCTA shapefile used to overlay block-groups to calculate population-weighted population density. All files with the “US_zcta_2020” stem are loaded by GIS software or GIS R packages.

data/US_blck_grp_2019.shp: Block-group shapefile, which is subsequently converted to centroids and spatially joined to the ZCTA file to calculate block-group-weighted population density. All files with the “US_blck_grp_2019” stem are loaded by GIS software or GIS R packages.
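
The population-weighted density calculation described in this section can be sketched in a few lines of base R. This is a minimal illustration, not the replication code itself: it assumes the spatial join has already assigned each block group to a ZCTA, and the data frame, column names (zcta, pop, area_sqmi), and values are hypothetical.

```r
# Toy block-group table: each row is one block group already assigned to a
# ZCTA (in the replication code this assignment comes from a spatial join).
# All column names and values here are hypothetical.
block_groups <- data.frame(
  zcta      = c("00001", "00001", "00002", "00002"),
  pop       = c(1000, 3000, 200, 800),   # block-group population
  area_sqmi = c(0.5, 1.0, 2.0, 4.0),     # block-group land area
  stringsAsFactors = FALSE
)

# Density of each block group, in persons per square mile.
block_groups$density <- block_groups$pop / block_groups$area_sqmi

# Population-weighted mean of block-group density within each ZCTA.
pop_weighted_density <- function(bg) {
  pieces <- split(bg, bg$zcta)
  out <- vapply(
    pieces,
    function(d) weighted.mean(d$density, w = d$pop),
    numeric(1)
  )
  data.frame(zcta = names(out), wtd_density = unname(out),
             stringsAsFactors = FALSE)
}

zcta_density <- pop_weighted_density(block_groups)

# Flag ZCTAs below the 500 persons-per-square-mile threshold as rural.
zcta_density$rural <- zcta_density$wtd_density < 500
print(zcta_density)
```

Weighting by population means that a ZCTA containing one dense, populous block group and a large, nearly empty one is classified by where its residents actually live rather than by its total land area.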

Example for free use data with required registration, extract not provided

Several of our code scripts call the R tidycensus package to import Census data for analysis in R. Instructions for obtaining the Census API key needed to use this package appear in the code-execution instructions.

To access the plain-English labels for the various Census variables, use the function load_variables() in the tidycensus package. For example, to load the full “codebook” for the 2015-2019 5-year ACS estimates, execute the following line:

acs5_vars <- load_variables(year = 2019, dataset = "acs5")

Then search the resulting table for the variable(s) downloaded with the tidycensus functions.
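
One way to search the variable table is sketched below. The helper search_acs_vars() is hypothetical (it is not part of tidycensus or of the replication code); it assumes only that the table has the name, label, and concept columns that load_variables() returns. The toy table stands in for the real one so that the example runs without internet access.

```r
# Hypothetical helper: return the rows of a variable table whose label or
# concept matches `pattern` (case-insensitive).
search_acs_vars <- function(vars, pattern) {
  hit <- grepl(pattern, vars$label, ignore.case = TRUE) |
    grepl(pattern, vars$concept, ignore.case = TRUE)
  vars[hit, c("name", "label", "concept"), drop = FALSE]
}

# Toy stand-in for the table returned by load_variables(); the rows are
# illustrative only.
toy_vars <- data.frame(
  name    = c("X0001_001", "X0002_001", "X0003_001"),
  label   = c("Estimate!!Total", "Estimate!!Median gross rent",
              "Estimate!!Median value (dollars)"),
  concept = c("TOTAL POPULATION", "MEDIAN GROSS RENT", "MEDIAN VALUE"),
  stringsAsFactors = FALSE
)

rent_vars <- search_acs_vars(toy_vars, "gross rent")
print(rent_vars)

# With the real table (requires the tidycensus package and internet access):
# acs5_vars <- tidycensus::load_variables(year = 2019, dataset = "acs5")
# search_acs_vars(acs5_vars, "gross rent")
```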

Dataset list

Data File	Source	Notes	Provided
`data/Survey 1.csv`	authors	raw data	Yes
`data/Survey 2.csv`	authors	raw data	Yes
`data/Survey 3.csv`	authors	raw data	Yes
`data/JPIPE paper data.csv`	authors	raw data	Yes
`data/nhgis0009_ds244_20195_zcta.csv`	(Manson et al. 2024)	raw data	Yes
`data/nhgis0010_ds244_20195_blck_grp.csv`	(Manson et al. 2024)	raw data	Yes
`data/US_zcta_2020.shp`	(Manson et al. 2024)	raw data	Yes
`data/US_blck_grp_2019.shp`	(Manson et al. 2024)	raw data	Yes
`data/zip_replication/zhvi.csv`	(Zillow Group n.d.)	raw data	Yes
`data/zip_replication/zhvi_county.csv`	(Zillow Group n.d.)	raw data	Yes
`data/zip_replication/zori.csv`	(Zillow Group n.d.)	raw data	Yes
`data/zip_replication/uszips.csv`	(Pareto Software, LLC 2022)	raw data	Yes
`data/zip_replication/zip_county_cw.csv`	(Missouri Census Data Center 2025)	raw data	Yes
`data/zip_replication/alist_2021_11.csv`	(Apartment List Research Team 2022)	raw data	Yes
`data/rural_zip_codes_for_rep.csv`	authors	derived	Yes

All survey data are provided in raw form as downloaded from Qualtrics. The main survey questions are encoded and labeled in the opening chunks of the analysis code, code/folk_econ_rep_code_JEP.rmd. A comprehensive guide to the encoded data, with text of the associated survey questions, is provided as data/jep_codebook.md.

For each itemized data source, we provide a reference to public codebooks, or refer to a codebook stored in the data folder.

Computational requirements

Our code will run on a typical personal computer.

Software Requirements

The replication archive is found at https://www.openicpsr.org/openicpsr/project/233932.

Code for this project was written in R Markdown, using R Studio version 2025.5.0.496. All packages required to run the replication code are named and loaded with library() in the opening code chunks. If you run the code in R Studio, you will be prompted to install any missing packages. Two packages are not presently on CRAN and must be installed manually, using the commands given below.

The replication package contains one or more programs to install all dependencies and set up the necessary directory structure.
R version 4.5.0 (2025-04-11)
- The code chunk {r setup, include=FALSE}, beginning on line 25 of code/folk_econ_rep_code_JEP.rmd, loads all packages on which the data cleaning and analysis code relies. It will prompt you to install any packages that you do not already have installed, except for two packages not available on CRAN, which must be installed separately, as follows:

  install.packages('fwildclusterboot', repos = 'https://s3alfisc.r-universe.dev')
  install.packages('wildrwolf', repos = 'https://s3alfisc.r-universe.dev')

To compile the “demographics” chunk of the replication code, you will need an API key from the U.S. Census Bureau. You can sign up for API access and obtain a key at https://api.census.gov/data/key_signup.html. Once you have obtained an API key, assign it to census.api.key on line 93 of folk_econ_rep_code_JEP.rmd, per the comment “YOUR KEY GOES HERE”.
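
Before running the main script, it can help to check whether the two non-CRAN packages are already installed. The sketch below is a convenience only, not part of the replication code; the helper missing_pkgs() is hypothetical, and the install call is left commented out so that nothing is installed without your say-so.

```r
# Hypothetical helper: return the subset of `pkgs` that cannot be loaded.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# The two packages this README says are not on CRAN.
r_universe_pkgs <- c("fwildclusterboot", "wildrwolf")
to_install <- missing_pkgs(r_universe_pkgs)

if (length(to_install) > 0) {
  message("Not yet installed: ", paste(to_install, collapse = ", "))
  # Uncomment to install from the repository named in this README:
  # install.packages(to_install, repos = "https://s3alfisc.r-universe.dev")
} else {
  message("Both non-CRAN packages are already installed.")
}
```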

Controlled Randomness

Random seed is set at lines 90-91 of script code/folk_econ_rep_code_JEP.rmd.

Description of programs/code

The script code/folk_econ_rep_code_JEP.rmd recodes the raw data and generates all tables and figures for the paper and Online Appendix. It saves tables to /tables and figures to /figures.
The script code/popweighted_for_replication_original.R is our original code used to create the list of rural zip codes (data/weighted_zips_pop_original.csv), i.e., zip codes with population-weighted densities under 500 persons per square mile. It does not run because several of the packages it relies on were discontinued in 2022 or 2023. The full list of zip codes generated by this script was used to determine which zip codes to sample in all four surveys on Qualtrics. It is provided here as a reference.
The script code/popweighted_for_replication_final.R contains updated code to produce the same file for replication purposes, using contemporary GIS packages available through the CRAN repository. It creates a table called data/zcta_merge_comparison.csv that compares the population densities of the zip codes that we used in our study (columns labeled _orig) to those of the zip codes generated using the new GIS code. This script produces results that deviate very slightly from the dataset actually used in our sample, for several possible reasons. In our original code, we joined rows by index, not by the GISJOIN field as we do in the new code. It is also possible that the new code used for spatial joins yielded slight differences in the block groups included in each zip code, as centroids might have been calculated slightly differently. The provided code shows that the weighted population densities of zip codes calculated using the two code scripts are correlated at r=0.96 (with log transformation, r=0.87). On net, the updated code identified 2.8 million more residents as rural. In addition, ZCTA population density (unweighted) is correlated at r=0.90 with the weighted density measure used in our data, and at r=0.93 with the weighted density measure generated using our new code. The zip codes actually targeted in our sample had an average unweighted ZCTA population density of 2,891 persons per square mile. If we had used the new code, the average population density of sampled zip codes would have been 3,127 persons per square mile. Though we have been unable to identify the coding difference that caused these minor changes, the full list of zip codes used to exclude rural zip codes in our Qualtrics sampling has been included in the replication archive. The script code/popweighted_for_replication_final.R also generates QQ plots and scatterplots illustrating the similarity of the two datasets.
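
The kind of comparison reported above (correlations between two density measures, on the raw and log scales) can be sketched as follows. This is an illustration on simulated data, not the code in code/popweighted_for_replication_final.R: the variable names are hypothetical, and the simulated values are not intended to reproduce the correlations reported in this README.

```r
# Simulated stand-in for a comparison table with one row per ZCTA: a density
# from the original code and a slightly perturbed density from the new code.
# Names and values are hypothetical.
set.seed(1)
n <- 200
density_orig <- exp(rnorm(n, mean = 7, sd = 1))
density_new  <- density_orig * exp(rnorm(n, mean = 0, sd = 0.2))

# Pearson correlation on the raw scale and after a log transformation,
# using only rows where both densities are finite and positive.
compare_densities <- function(orig, new) {
  ok <- is.finite(orig) & is.finite(new) & orig > 0 & new > 0
  c(
    r_raw = cor(orig[ok], new[ok]),
    r_log = cor(log(orig[ok]), log(new[ok]))
  )
}

res <- compare_densities(density_orig, density_new)
print(round(res, 2))
```

Reporting both scales is informative because population density is strongly right-skewed: the raw correlation is dominated by the densest zip codes, while the log correlation gives more weight to agreement among low-density zip codes near the rural threshold.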

License for Code

The code is licensed under a CC BY 4.0 license. See /LICENSE.txt for details.

Instructions to Replicators


To replicate the figures and tables in the paper and Online Appendix:

Open code/folk_econ_rep_code_JEP.rmd in R Studio
Install any packages you are notified to install
Enter these commands in the console:
- install.packages('fwildclusterboot', repos ='https://s3alfisc.r-universe.dev')
- install.packages('wildrwolf', repos ='https://s3alfisc.r-universe.dev')
Request a Census API key, if you don’t have one already, and assign it, quoted, to census.api.key on line 93 (bottom of the setup chunk), replacing the words “YOUR KEY GOES HERE.”
Click “Run All” or “Knit”
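
As an alternative to clicking “Knit”, the R Markdown file can be rendered from the R console with rmarkdown::render(). The wrapper below is a hypothetical convenience, not part of the replication package; it assumes the working directory is the root of the replication folder and that the package installation and API key steps above have already been completed.

```r
# Hypothetical wrapper: knit the main script from the console.
# Returns the path of the rendered output, or NULL (invisibly) if the file
# or the rmarkdown package is not available.
knit_replication <- function(rmd_path = "code/folk_econ_rep_code_JEP.rmd") {
  if (!file.exists(rmd_path)) {
    message("File not found: ", rmd_path)
    return(invisible(NULL))
  }
  if (!requireNamespace("rmarkdown", quietly = TRUE)) {
    message("Package 'rmarkdown' is not installed.")
    return(invisible(NULL))
  }
  rmarkdown::render(rmd_path)
}

# Usage, from the root of the replication folder:
# knit_replication()
```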

To replicate our population-weighted zip code screen, which we provided to the survey vendor to exclude potential respondents from rural areas:

Open code/popweighted_for_replication_final.R in R Studio
Install any packages you are notified to install
Click “Source” or “Run All”

To replicate our method for defining counterfactual (no-supply-shock) home prices and rents in survey questions, i.e., the prices and rents in price_by_zip.csv:

Open code/rents_home_price_api_data_generator.R in R Studio.
Install any packages you are notified to install
Click “Source” or “Run All”

List of tables and programs

The provided code reproduces:

All numbers provided in text in the paper
All tables and figures in the paper

Code chunks are labeled with the number of the figure or table they produce. The prefix SI indicates that a figure or table is produced as supplemental information for the Online Appendix.

Elmendorf, Christopher S., Clayton Nall, and Stan Oklobdzija. “The Folk Economics of Housing.” Journal of Economic Perspectives (2025).

Elmendorf, Christopher S., Clayton Nall, and Stan Oklobdzija. “The Folk Economics of Housing: Replication Data.” Journal of Economic Perspectives (2025).

Acknowledgements

Content on this page was adapted from the AEA Data Editor’s replication template.

References

Apartment List Research Team. 2022. “Data & Rent Estimates.” https://www.apartmentlist.com/research/category/data-rent-estimates.

Elmendorf, Christopher S., Clayton Nall, and Stan Oklobdzija. 2025a. “Data and Code for ‘The Folk Economics of Housing’.” ADD AFTER UPLOADING TO .

———. 2025b. “The Folk Economics of Housing.” Journal of Economic Perspectives 39 (3): 1–20.

Manson, Steven, Jonathan Schroeder, David Van Riper, Katherine Knowles, Tracy Kugler, Finn Roberts, and Steven Ruggles. 2024. “IPUMS National Historical Geographic Information System: Version 19.0.” Minneapolis, MN: IPUMS. http://doi.org/10.18128/D050.V19.0.

Missouri Census Data Center. 2025. “Geocorr 2022: ZIP Code to County Correspondence File.” https://mcdc.missouri.edu/applications/geocorr2022.html.

Pareto Software, LLC. 2022. “US Zip Codes Database.” https://simplemaps.com/data/us-zips.

Zillow Group. n.d. “Zillow Research: Data.” https://www.zillow.com/research/data/.