Exploring Data Project: Income Level vs. Real Estate

Brionna Best

Introduction- Why

I chose these data sets because a family’s wealth is closely correlated with its quality of life. Because home ownership is the #1 wealth-building vehicle in the US, I wanted to know how low- to middle-income families are performing in these areas. If these families are able to buy a house, it would help them start building wealth and improve their quality of life.

Introduction- What

The household income data set is narrowed to the population of the Southeast (SE) region of the country, with income levels ranging from under $10,000 to $200,000+. The house sales data set is narrowed to single-family house sales in each SE state’s metro cities. Both data sets cover 2018-2022.

Introduction- Where

  1. I found the Household income data set on Datausa.com using MICA’s find data website
  2. I found the single family real estate sales prices on Zillow’s housing data research page

Data Cleaning- House Sales Data Cleaning in the SE

  1. I reshaped the data to isolate sale dates for 2022 alone and for 2018-2022
  2. I calculated the overall average house sale price for 2022 to find the income bracket that could afford the average sale price in the SE
  3. I calculated the average house sale price for each state to compare who could afford a home in each state in 2022
  4. I plotted the yearly average house sale price in each state over the last 5 years to find the market trend and ran a regression test on the plot
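
The reshaping and averaging steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the sales file is in wide format (one row per metro, one column per month-end date), and the column names, metro names, and prices are made-up placeholders. It also assumes the overall average is taken across all metro-month records; averaging the state averages instead would give a slightly different number.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical wide-format rows: one row per metro, one column per month.
# All names and prices are placeholders, not values from the Zillow file.
wide_rows = [
    {"metro": "Metro A", "state": "MS",
     "2021-12-31": 200000.0, "2022-01-31": 210000.0, "2022-02-28": 230000.0},
    {"metro": "Metro B", "state": "MS",
     "2021-12-31": 240000.0, "2022-01-31": 250000.0, "2022-02-28": 270000.0},
    {"metro": "Metro C", "state": "GA",
     "2021-12-31": 300000.0, "2022-01-31": 310000.0, "2022-02-28": 330000.0},
]
ID_COLUMNS = {"metro", "state"}


def to_long(rows):
    """Turn every date column into its own (state, year, price) record."""
    records = []
    for row in rows:
        for column, price in row.items():
            if column in ID_COLUMNS or price is None:
                continue  # skip identifier columns and missing prices
            records.append((row["state"], int(column[:4]), price))
    return records


def yearly_average_by_state(records):
    """Average the sale price for each (state, year) pair."""
    groups = defaultdict(list)
    for state, year, price in records:
        groups[(state, year)].append(price)
    return {key: mean(prices) for key, prices in groups.items()}


long_records = to_long(wide_rows)
yearly_avg = yearly_average_by_state(long_records)

# 2022 average per state, and the overall 2022 average across all records.
avg_2022_by_state = {s: v for (s, y), v in yearly_avg.items() if y == 2022}
overall_2022 = mean(price for _, year, price in long_records if year == 2022)

print(avg_2022_by_state)
print(round(overall_2022, 2))
```

The `yearly_avg` dictionary holds one value per state per year, which is the shape needed for the 5-year trend plot.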

Data Cleaning- SE Region Population by Household Income Bucket Level

  1. I reshaped the data to isolate income level populations for 2022 alone and for 2018-2022
  2. I combined the SE region population levels to get a high-level view of the population breakdown by income level
  3. I plotted the yearly low-income population numbers in each state over the last 5 years to find the trends in this population and ran a regression test
  4. I calculated the average yearly change in the low-income population over the 5 years tested to quantify the biggest trend the plot showed
  5. I calculated the percentage of the population in income levels below $80,000
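
The percentage in step 5 can be sketched as a simple share of the combined bucket counts. The bucket labels and counts here are hypothetical placeholders chosen for easy arithmetic; the real income file may use finer buckets, which would just be listed out in `BELOW_80K`.

```python
# Hypothetical bucket counts for the combined SE region.
# Placeholder values only, not figures from the income data set.
bucket_counts = {
    "<$10,000": 100,
    "$10,000-$29,999": 200,
    "$30,000-$79,999": 300,
    "$80,000-$199,999": 300,
    "$200,000+": 100,
}

# Buckets that fall entirely below $80,000.
BELOW_80K = ["<$10,000", "$10,000-$29,999", "$30,000-$79,999"]


def share_percent(counts, buckets):
    """Percentage of the total population that falls in the given buckets."""
    total = sum(counts.values())
    return 100 * sum(counts[b] for b in buckets) / total


below_80k_pct = share_percent(bucket_counts, BELOW_80K)
low_income_pct = share_percent(bucket_counts, ["<$10,000", "$10,000-$29,999"])

print(f"Below $80,000: {below_80k_pct:.1f}%")
print(f"Low income (<$10,000-$29,999): {low_income_pct:.1f}%")
```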

Exploratory Summary- Household Income Bucket Level

Take away:

Zillow says that to qualify for an FHA mortgage, no more than 31% of pre-tax income should go toward the mortgage.

For families in the SE region that make $29,000 a year (22%), that’s equivalent to a mortgage payment of about $749 a month, which would qualify them for a house that costs about $128,000. That is about $210,706 less than the price of the average single-family house sold in the SE.
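
The arithmetic behind these figures can be checked with a few lines. Every number below comes from this summary; the $128,000 affordable price is taken as given rather than derived, since it depends on loan terms (interest rate, loan length, down payment) that are not stated here.

```python
ANNUAL_INCOME = 29_000      # top of the low-income range used above
MAX_INCOME_SHARE = 0.31     # 31% of pre-tax income, per the Zillow guideline
AFFORDABLE_PRICE = 128_000  # house price this payment qualifies for (as stated)
AVG_SE_PRICE = 338_706      # average SE single-family sale price

# Largest monthly mortgage payment under the 31% guideline.
monthly_payment = ANNUAL_INCOME * MAX_INCOME_SHARE / 12

# How far the affordable price falls short of the average sale price.
price_gap = AVG_SE_PRICE - AFFORDABLE_PRICE

print(f"Max monthly payment: ${monthly_payment:,.0f}")
print(f"Gap to average SE price: ${price_gap:,}")
```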

Exploratory Summary- Single Family House Sales in the SE Region

Take away:

  1. The state with the lowest average house sale price in the SE region is Mississippi at $254,108, which is well above the $128,000 price low-income families could afford. The data is starting to show that owning a home isn’t possible for low-income families with traditional home buying methods

  2. But it’s not just low-income families: with the average home sale price across all the states at $338,706, about 56% of the region cannot afford to buy a house in the metro cities in their state

Exploratory Summary- Single Family House Sales in the SE Region

Take away:

  1. All states show a statistically significant increase in average sale price, with all p-values well below 0.05. This tells us that time is not on our side when it comes to buying a house and building wealth, especially if your household makes under $80,000 a year
  2. All states also had a high R squared value, ranging from 0.93 to 0.98. These numbers show that a steady upward trend explains most of the change in prices, so if it continues, the gap for about 56% of the SE region population to obtain a home and start building wealth will widen over time
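
For readers who want to see what the regression test measures, here is a minimal sketch of a straight-line fit of yearly average price against year, with its R squared. The prices are made-up placeholders, and the hand-written `fit_line` helper is only an illustration, not the tool used in this project. It does not compute a p-value, which needs a t-distribution; a statistics library such as SciPy (`scipy.stats.linregress`) reports the slope, R value, and p-value together.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (x, y); returns slope, intercept, R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared = 1 - (unexplained variation / total variation).
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


years = [2018, 2019, 2020, 2021, 2022]
# Placeholder yearly average sale prices for one state (not real data).
avg_price = [200_000, 212_000, 219_000, 232_000, 240_000]

slope, intercept, r_squared = fit_line(years, avg_price)
print(f"Average price change per year: ${slope:,.0f}")
print(f"R squared: {r_squared:.3f}")
```

A positive slope means prices rise year over year, and an R squared near 1 means the yearly averages sit close to the fitted line.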

Population Levels of Combined Low Income Population (<$10,000-$29,999)

Take away:

  1. But the good news is that the low-income population has steadily decreased over the last 5 years in the SE region, with an average annual decrease of 39,634.72
  2. All states show a statistically significant decreasing trend in this population over the years, which is unlikely to be due to random chance
  3. High R squared values suggest that these numbers may continue to decrease in the future, although this only affects about 22% of the 56% population
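
One way to compute an average annual change like the one in point 1 is to average the year-over-year differences. This is a sketch under that assumption; the figure reported above may have been calculated differently (for example, as a regression slope), and the population counts below are placeholders.

```python
def average_annual_change(values):
    """Mean of the year-over-year differences in a yearly series."""
    changes = [later - earlier for earlier, later in zip(values, values[1:])]
    return sum(changes) / len(changes)


# Placeholder low-income population counts for 2018-2022 (not real data).
low_income_by_year = [1_000_000, 960_000, 930_000, 880_000, 840_000]

avg_change = average_annual_change(low_income_by_year)
print(f"Average annual change: {avg_change:,.2f}")
```

A negative result means the population is shrinking; five yearly values give four year-over-year changes to average.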

Further Questions

  1. Why are low-income populations steadily decreasing over the years, and how can we create this same effect in the lower-middle-income groups as well?
  2. Who makes up the low-income populations (demographics/psychographics)?
  3. What are this population’s life outcomes?
  4. What are their desires around home ownership?
  5. What real estate alternatives are there to traditional home buying?