Data 608 - Final Project Proposal

Background

The pandemic has pushed corporate America, especially in the SF Bay Area and New York, to allow remote work, which has in turn driven up housing demand for single-family homes in rural areas and suburbs outside the cities. At the same time, it has driven down demand for one-bedroom apartments, one-bedroom condos, and condos in general, and prices for one-bedroom apartments have fallen along with that demand. As we head toward herd immunity, I’m eager to see how demand for one-bedroom condos and apartments recovers and whether it returns to pre-pandemic levels.

Objective

I want to build an app that showcases the month-over-month (MoM) and year-over-year (YoY) changes in listed price, sale-to-list price ratio, and rent estimate for 1-bedroom condos at the zip code and/or county level. If there is enough time, I also want to add an aggregated 13-month comparison with a before-and-after heat map, perhaps controlled by a single toggle. The ultimate goal is three-fold:

to confirm that

    a) there is indeed a decrease in demand,
    b) there is some trend of recovery, and

to allow for

    c) estimating at what percentile a given listed price (the price set by the seller) falls within a certain zip code, from a historical point of view.

Data Source

I’ll pick one of the real estate APIs listed on the following two web pages.

Data Analysis

I plan to use the Socrata Query Language (SoQL), as we did in Module 4, to pre-select the data based on the input from the dropdown menus. Since I’m likely to look at weekly aggregates, monthly aggregates, and a 13-month aggregate, the ability to filter with where and aggregate with group by should prove extremely useful when retrieving from a large dataset.
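As a minimal sketch of the where / group by idea, the snippet below builds SoQL query parameters for a monthly average listed price in one zip code. The endpoint URL and the column names (list_date, list_price, zip_code, property_type, bedrooms) are hypothetical, since the actual API has not been chosen yet; the SoQL pieces used ($select, $where, $group, $order, date_trunc_ym, avg, count) are standard.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the real data source has not been selected yet.
BASE_URL = "https://example.com/resource/abcd-1234.json"


def build_soql_params(zip_code, start, end):
    """Build SoQL parameters for monthly average listed price in one zip code.

    All column names here are assumptions made for illustration.
    """
    # Basic validation so the dropdown value cannot alter the query text.
    if not (zip_code.isdigit() and len(zip_code) == 5):
        raise ValueError("zip_code must be a 5-digit string")
    where = (
        f"zip_code = '{zip_code}' "
        "AND property_type = 'Condo' AND bedrooms = 1 "
        f"AND list_date between '{start}' and '{end}'"
    )
    return {
        "$select": (
            "date_trunc_ym(list_date) AS month, "
            "avg(list_price) AS avg_list_price, "
            "count(*) AS n_listings"
        ),
        "$where": where,
        "$group": "month",
        "$order": "month",
    }


params = build_soql_params("94103", "2020-03-01T00:00:00", "2021-03-31T23:59:59")
query_url = BASE_URL + "?" + urlencode(params)
print(query_url)
```

Pushing the filter and the aggregation into the query means only the already-summarized rows travel over the network, which is the main reason to prefer this over downloading everything and filtering locally.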

Dealing with missing values. Treatment: Exclude them from this analysis

Summarizing the data using groupby on a pandas DataFrame.
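The two steps above (excluding missing values, then summarizing with groupby) might look roughly like the sketch below, which also derives the MoM and YoY changes from the monthly summary. The data is made up (prices in thousands of dollars), the column names are assumptions, and using the median as the monthly summary statistic is my own choice for illustration.

```python
import pandas as pd

# Made-up listing-level data: two listings per month for one zip code.
months = pd.date_range("2020-01-01", periods=14, freq="MS")
prices = [700, 710, 690, 680, 670, 660, 650, 655, 660, 665, 670, 675, 665, 639]
rows = []
for month, price in zip(months, prices):
    rows.append({"zip_code": "94103", "month": month, "list_price": float(price - 10)})
    rows.append({"zip_code": "94103", "month": month, "list_price": float(price + 10)})
listings = pd.DataFrame(rows)
listings.loc[5, "list_price"] = float("nan")  # simulate one missing value

# Treatment for missing values: exclude them from the analysis.
clean = listings.dropna(subset=["list_price"])

# Summarize to one median listed price per zip code and month.
monthly = (
    clean.groupby(["zip_code", "month"], as_index=False)["list_price"]
    .median()
    .rename(columns={"list_price": "median_price"})
    .sort_values(["zip_code", "month"])
    .reset_index(drop=True)
)

# MoM compares with the previous row, YoY with the row 12 months earlier.
# This assumes every zip code has exactly one row per consecutive month.
by_zip = monthly.groupby("zip_code")["median_price"]
monthly["mom_change"] = monthly["median_price"] / by_zip.shift(1) - 1
monthly["yoy_change"] = monthly["median_price"] / by_zip.shift(12) - 1

print(monthly.tail(3))
```

If a zip code has a month with no listings at all, the shift-based comparison would silently line up the wrong months, so the summary would need to be reindexed to a complete monthly range first.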

Visualization

Dimension:

Time (weekly, monthly, 13-month)

Type: Condo (Fixed)

Size: 1-bedroom (Fixed)

Start and End Time: Text Field or slider

Search by: Zip Code or County

Enter your desired selling price $: [Sd]

There are going to be four sections:

  1. Listed price changes at the zip code or county level, using the datashader approach (used in Module 2) to show a heat-map gradient of the percentage decrease in each zip code / county

  2. Rent estimate changes at the zip code or county level

  3. Office vacancy rate changes at the zip code or county level (optional, depending on whether I can secure the data from my brother, who works as a quant modeling commercial real estate loans for a bank in NY)

    Below is some relevant material I found from the National Association of Realtors Research Group.

  4. Finally, I want a visual showing a cumulative historical graph of listed prices that indicates where the input Sd falls if someone is going to sell his/her one-bedroom condo in the near future, i.e., whether it would be at, say, the 89th or the 95th percentile of historical prices.
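The percentile lookup in section 4 can be sketched as follows, assuming the historical listed prices for the chosen zip code are already in a list. The function name, the made-up price history, and the "at or below" definition of percentile rank are assumptions for illustration; other percentile conventions would give slightly different numbers.

```python
from bisect import bisect_right


def percentile_rank(historical_prices, desired_price):
    """Return the percentage of historical listed prices at or below desired_price."""
    if not historical_prices:
        raise ValueError("need at least one historical price")
    ordered = sorted(historical_prices)
    # bisect_right counts how many sorted prices are <= desired_price.
    return 100.0 * bisect_right(ordered, desired_price) / len(ordered)


# Made-up history: 20 listed prices from $400,000 to $590,000.
history = [400_000 + 10_000 * i for i in range(20)]
sd = 570_000  # the user's desired selling price (Sd)
rank = percentile_rank(history, sd)
print(f"Sd = ${sd:,} is at the {rank:.0f}th percentile of historical listed prices")
```

The same sorted list could feed the cumulative graph directly, with Sd drawn as a vertical marker at its rank.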