# load data
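A minimal sketch of the loading step, assuming the redlining data is available as a CSV with a `holc_grade` column holding the letters A through D. The helper name `load_redlining` and the file name in the commented-out call are assumptions for illustration, not confirmed details of the source repository.

```r
# Hypothetical helper: read a redlining CSV from a local path or URL
# and store the HOLC grade as an ordered factor (A = best, D = "hazardous").
load_redlining <- function(path) {
  dat <- read.csv(path, stringsAsFactors = FALSE)
  # The column name `holc_grade` is an assumption about the file layout.
  stopifnot("holc_grade" %in% names(dat))
  dat$holc_grade <- factor(dat$holc_grade,
                           levels = c("A", "B", "C", "D"),
                           ordered = TRUE)
  dat
}

# Example call (assumed file name inside the FiveThirtyEight repository):
# metro_grades <- load_redlining(
#   "https://raw.githubusercontent.com/fivethirtyeight/data/master/redlining/metro-grades.csv"
# )
```

Storing the grade as an ordered factor keeps A through D in their natural order in later tables and plots.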
You should phrase your research question in a way that matches up with the scope of inference your dataset allows for.
My research question is: how does the legacy of redlining influence current demographic and socioeconomic conditions, particularly with regard to housing, wealth inequality, and access to resources?
What are the cases, and how many are there?
Each case is a metropolitan area in the United States. The exact number of cases will depend on which metropolitan areas are included in the analysis.
Describe the method of data collection.
The data for this study will be collected from publicly available sources: the Mapping Inequality project, which provides historical data on redlining practices in the United States; the Federal Reserve Economic Data (FRED) website, which offers economic and demographic data; and, possibly, FiveThirtyEight’s repository on redlining.
What type of study is this (observational/experiment)?
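This is an observational study: it analyzes existing data on redlining practices and their long-term effects on demographic and socioeconomic conditions, and no treatment is assigned. As a result, it can show associations between historical redlining and present-day conditions but cannot by itself establish causation.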
If you collected the data, state self-collected. If not, provide a citation/link.
The data sources include:
FiveThirtyEight’s redlining data repository (built on the Mapping Inequality project's HOLC maps): https://github.com/fivethirtyeight/data/tree/master/redlining
FiveThirtyEight’s interactive redlining article: https://projects.fivethirtyeight.com/redlining/
Federal Reserve Economic Data (FRED): https://fred.stlouisfed.org/series/MSPUS
What is the response variable? Is it quantitative or qualitative?
The outcome of interest is multifaceted. It covers several aspects of demographic and socioeconomic conditions, such as housing prices, wealth distribution, and access to resources, and these can be quantitative or qualitative depending on the measure used. The main response variable I will focus on is quantitative: the rate of return (appreciation) on real estate in each metropolitan area.
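One way to make the appreciation measure concrete is the percentage change in a price between two dates. The sketch below assumes that definition. The function names and the prices in the example call are made up for illustration and are not taken from the data sources.

```r
# Total appreciation: proportional change from the starting price.
appreciation_rate <- function(price_start, price_end) {
  (price_end - price_start) / price_start
}

# Annualized appreciation: constant yearly growth rate that would
# turn price_start into price_end over the given number of years.
annualized_rate <- function(price_start, price_end, years) {
  (price_end / price_start)^(1 / years) - 1
}

# Made-up example prices, not real data.
appreciation_rate(200000, 250000)  # 0.25, i.e. 25% total appreciation
annualized_rate(200000, 250000, years = 5)
```

The annualized version is useful when the metropolitan areas are observed over different time spans, since it puts every case on a per-year scale.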
The independent variables include the historical redlining grades (A through D, an ordinal categorical variable) assigned by the Home Owners' Loan Corporation (HOLC) between 1935 and 1940, as well as contemporary demographic and economic indicators.
Provide summary statistics for each of the variables. Also include appropriate visualizations related to your research question (e.g. scatter plots, boxplots, etc.). This step requires the use of R, hence a code chunk is provided below. Insert more code chunks as needed.
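A minimal sketch of the summary step in base R, assuming one row per metropolitan area and HOLC grade with a numeric `appreciation` column. The data frame `toy` and every value in it are made-up placeholders that only show the workflow. They should be replaced with the real merged dataset.

```r
# Placeholder data: all values are invented for illustration only.
toy <- data.frame(
  metro_area   = rep(c("Metro 1", "Metro 2", "Metro 3"), each = 4),
  holc_grade   = factor(rep(c("A", "B", "C", "D"), times = 3),
                        levels = c("A", "B", "C", "D")),
  appreciation = c(0.30, 0.26, 0.21, 0.15,
                   0.28, 0.25, 0.18, 0.12,
                   0.33, 0.27, 0.20, 0.16)
)

# Overall five-number summary and mean of the response variable.
summary(toy$appreciation)

# Counts, means, and standard deviations by HOLC grade.
grade_counts <- table(toy$holc_grade)
grade_means  <- tapply(toy$appreciation, toy$holc_grade, mean)
grade_sds    <- tapply(toy$appreciation, toy$holc_grade, sd)
grade_counts
round(cbind(mean = grade_means, sd = grade_sds), 3)

# Side-by-side boxplots: quantitative response against categorical grade.
boxplot(appreciation ~ holc_grade, data = toy,
        xlab = "HOLC grade", ylab = "Appreciation rate",
        main = "Appreciation by HOLC grade (placeholder data)")
```

Side-by-side boxplots suit this question because the response is quantitative and the main explanatory variable is categorical. A scatter plot would be the natural choice for comparing appreciation against a numeric indicator such as median income.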