Analysis of Covid-19 and the Impact on Housing

Author

Nathaly Munnicha

Introduction

The Covid-19 pandemic affected not only people but also many economic policies, which in turn have impacted the prices of residential properties in the Cincinnati neighborhoods adjacent to Xavier’s campus. Citizens have been expressing concern about housing affordability in the city. In this analysis, I investigate residential property sales to better understand these concerns and help address them.

# A tibble: 6,885 × 19
   parcel_id  purchaser cps   norwood_schools street_address unit_id street_name
   <chr>      <chr>     <lgl> <lgl>                    <dbl> <chr>   <chr>      
 1 068-0002-… LAURENT … TRUE  FALSE                      713 #1      E MCMILLAN…
 2 041-0002-… BRABES I… TRUE  FALSE                     3443 #1      SHAW AVE   
 3 053-0001-… MITCHELL… TRUE  FALSE                     2324 #1809   MADISON RD 
 4 053-0001-… OKADA KE… TRUE  FALSE                     2324 #1810   MADISON RD 
 5 055-0004-… HUBER MA… TRUE  FALSE                     1720 #2      DEXTER AVE 
 6 086-0001-… TDPDX LLC TRUE  FALSE                      536 #2      LIBERTY HI…
 7 041-0002-… 3443 SHA… TRUE  FALSE                     3443 #2      SHAW AVE   
 8 068-0003-… WYATT DA… TRUE  FALSE                      719 #3      E MCMILLAN…
 9 086-0001-… RUDY AND… TRUE  FALSE                      534 #3      LIBERTY HI…
10 086-0001-… SCHINDEW… TRUE  FALSE                      534 #4      LIBERTY HI…
# ℹ 6,875 more rows
# ℹ 12 more variables: use <dbl>, yr_blt <dbl>, day <dbl>, month <dbl>,
#   year <dbl>, value <dbl>, neighborhood <chr>, total_rooms <dbl>,
#   bedrooms <dbl>, full_bath <dbl>, half_bath <dbl>, finished_sqft <dbl>
Data summary
Name                      property_sales
Number of rows            6885
Number of columns         19
_______________________
Column type frequency:
character                 5
logical                   2
numeric                   12
________________________
Group variables           None

Variable type: character

skim_variable  n_missing  complete_rate  min  max  empty  n_unique  whitespace
parcel_id              0           1.00   16   16      0      6885           0
purchaser              0           1.00    3   87      0      6357           0
unit_id             6236           0.09    1    5      0       242           0
street_name            0           1.00    6   23      0       638           0
neighborhood           0           1.00    7   12      0         9           0

Variable type: logical

skim_variable    n_missing  complete_rate  mean  count
cps                      0              1   0.7  TRU: 4840, FAL: 2045
norwood_schools          0              1   0.3  FAL: 4840, TRU: 2045

Variable type: numeric

skim_variable   n_missing  complete_rate       mean         sd    p0       p25     p50     p75     p100  hist
street_address          0           1.00    2492.31    1406.78     1    1443.0    2380    3611     5730  ▇▇▇▇▂
use                     0           1.00     516.98      17.15   401     510.0     510     520      555  ▁▁▁▇▂
yr_blt                  0           1.00    1922.80      31.50  1805    1905.0    1916    1928     2020  ▁▁▇▂▁
day                     0           1.00      15.66       8.80     1       8.0      16      23       31  ▇▇▆▇▆
month                   0           1.00       6.72       3.35     1       4.0       7      10       12  ▇▅▆▆▇
year                    0           1.00    2019.76       1.12  2018    2019.0    2020    2021     2022  ▅▅▆▇▁
value                1995           0.71  279864.77  264251.50     0  127103.8  214900  340000  3650000  ▇▁▁▁▁
total_rooms             0           1.00       7.51       2.60     0       6.0       7       9       24  ▁▇▂▁▁
bedrooms                0           1.00       3.27       1.27     0       2.0       3       4       14  ▃▇▁▁▁
full_bath               0           1.00       1.89       0.88     0       1.0       2       2        9  ▅▇▁▁▁
half_bath               0           1.00       0.41       0.59     0       0.0       0       1        5  ▇▁▁▁▁
finished_sqft           0           1.00    2029.39    1004.70     0    1332.0    1800    2470    11567  ▇▃▁▁▁
# A tibble: 1 × 19
  parcel_id   purchaser cps   norwood_schools street_address unit_id street_name
  <chr>       <chr>     <lgl> <lgl>                    <dbl> <chr>   <chr>      
1 218-0060-0… CRAIG RI… TRUE  FALSE                      430 <NA>    WEST CLIFF…
# ℹ 12 more variables: use <dbl>, yr_blt <dbl>, day <dbl>, month <dbl>,
#   year <dbl>, value <dbl>, neighborhood <chr>, total_rooms <dbl>,
#   bedrooms <dbl>, full_bath <dbl>, half_bath <dbl>, finished_sqft <dbl>

Data Error

While examining the data, I noticed that 1,995 of the 6,885 records are missing an entry in the value column (a complete rate of 0.71). This may indicate that the sale price was not publicly disclosed for many of these transactions. Furthermore, some entries look suspicious. One house has over 10,000 sq ft of finished space but fewer than 15 rooms, which does not seem realistic. Another house was recorded with 0 sq ft, so I changed that value to NA.
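The checks described above could be sketched roughly as follows. This is only an illustration in base R, not the exact code behind the report: the `sales` data frame is a made-up stand-in for `property_sales`, its four rows are invented, and the 10,000 sq ft and 15-room cutoffs are simply taken from the description above.

```r
# Hypothetical stand-in for property_sales; these rows are invented for illustration.
sales <- data.frame(
  parcel_id     = c("A", "B", "C", "D"),
  value         = c(214900, NA, 340000, 127000),
  total_rooms   = c(7, 6, 12, 5),
  finished_sqft = c(1800, 0, 10500, 1332),
  stringsAsFactors = FALSE
)

# Count how many sales have no recorded transaction value.
n_missing_value <- sum(is.na(sales$value))

# Treat a recorded 0 sq ft as missing rather than as a real measurement.
sales$finished_sqft[which(sales$finished_sqft == 0)] <- NA

# Flag very large houses with comparatively few rooms for manual review.
# subset() drops rows where the condition evaluates to NA.
suspicious <- subset(sales, finished_sqft > 10000 & total_rooms < 15)

print(n_missing_value)
print(suspicious$parcel_id)
```

Flagged rows are only candidates for review. A large house with few rooms may still be a valid record, so the sketch lists them instead of dropping them.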

Directed Analysis

To decide which neighborhood I would want to live in, I grouped the sales by neighborhood and found the median transaction value for each. Based on this, Mount Adams is a great location, and so is Hyde Park. Furthermore, judging by finished square footage and number of bedrooms, many houses in Cincinnati are fairly large: the median house sold has 3 bedrooms and 1,800 sq ft. Many of the houses sold are also over 100 years old (the median year built is 1916). It would be great to buy either a house that is around 100 years old or one that is just a couple of years old. The best time of year to sell would be July, which has the highest transaction value for houses.
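The grouping step could be sketched in base R as shown below. The data frame is a small invented example, not the real `property_sales` data, and "Other" is a placeholder neighborhood name. The choice of the median as the monthly summary is also an assumption, since the text above does not say which statistic was used for the month comparison.

```r
# Hypothetical miniature version of property_sales; values are invented.
sales <- data.frame(
  neighborhood = c("Hyde Park", "Hyde Park", "Mount Adams",
                   "Mount Adams", "Other", "Other"),
  month        = c(7, 3, 7, 11, 3, 7),
  value        = c(400000, 350000, 500000, NA, 150000, 170000),
  stringsAsFactors = FALSE
)

# Median transaction value per neighborhood.
# The formula interface of aggregate() omits rows with a missing value by default.
by_neighborhood <- aggregate(value ~ neighborhood, data = sales, FUN = median)
by_neighborhood <- by_neighborhood[order(-by_neighborhood$value), ]

# Median transaction value per month of sale, then the month with the highest median.
by_month   <- aggregate(value ~ month, data = sales, FUN = median)
best_month <- by_month$month[which.max(by_month$value)]

print(by_neighborhood)
print(best_month)
```

Using the median rather than the mean keeps a few very expensive sales (the maximum value in the data is 3,650,000) from dominating the comparison.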