August 18, 2017

Introduction

This analysis uses home value data from www.zillow.com to build a scatterplot comparing each county's home price index with the 10-year growth of that index.

We start by reading the data from the Zillow website. We then define the text that appears when hovering over markers, remove N/A values, and flag whether each county is in the state of Washington (since that is where I live!).

Data Prep Code

suppressPackageStartupMessages(library(plotly))
suppressPackageStartupMessages(library(magrittr))
        
## read data from www.zillow.com
values <- read.csv('http://files.zillowstatic.com/research/public/County/County_Zhvi_Summary_AllHomes.csv')
values <- values[, c(3, 4, 7, 12)]

## Format Variable Names for Labeling
names(values) <- c("RegionName", "State", 
                   "Home_Price_Index", "Ten_Year_Growth")

Data Prep Code (cont.)

## Define hover text
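## paste0() avoids the stray spaces that paste()'s default separator adds
## (e.g. "County , WA"); trim = TRUE drops the padding format() applies to
## vectors; the percent sign goes after the number
values$hover <- with(values, 
                     paste0(
    RegionName, " County, ", State, "<br>", 
    "Zillow Index Price: $", 
    format(Home_Price_Index, big.mark = ",", trim = TRUE), 
    "<br>", "10 Year Growth: ", 
    round(Ten_Year_Growth * 100, 2), "%"
    )
    )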

## remove na values and create factor variable for WA/other state
values <- values[!is.na(values$Ten_Year_Growth),]
values$WA <- as.factor(ifelse(values$State == "WA", "WA", "Other States"))
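
As a quick check on the two steps above, here is a minimal sketch on made-up data. The `toy` data frame, its county names, and its numbers are hypothetical stand-ins for the Zillow columns. It builds the hover label with `paste0()` and `format(trim = TRUE)`, which is one way to control the spacing, and then shows the level order of the WA factor.

```r
## Toy stand-in for the Zillow data (names and values are invented)
toy <- data.frame(
    RegionName = c("Alpha", "Beta", "Gamma"),
    State = c("WA", "MA", "OR"),
    Home_Price_Index = c(580000, 95000, 210000),
    Ten_Year_Growth = c(0.0294, -0.0112, NA),
    stringsAsFactors = FALSE
)

## Build the hover label; <br> is the line break plotly uses in hover text
toy$hover <- with(toy, paste0(
    RegionName, " County, ", State, "<br>",
    "Zillow Index Price: $",
    format(Home_Price_Index, big.mark = ",", trim = TRUE), "<br>",
    "10 Year Growth: ", round(Ten_Year_Growth * 100, 2), "%"
))

## Drop rows with no growth figure, then flag Washington counties
toy <- toy[!is.na(toy$Ten_Year_Growth), ]
toy$WA <- as.factor(ifelse(toy$State == "WA", "WA", "Other States"))

print(toy$hover)
print(levels(toy$WA))
```

Factor levels are sorted alphabetically by default, so "Other States" comes before "WA". That ordering matters later if a color vector is matched to the levels in order.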

Plotting Method

We will use plotly to create a scatterplot and fit a line showing the expected 10-year growth given a county's home value index. Counties above the line grew faster than expected and counties below it grew more slowly. We also color counties in Washington state red to compare them against the rest of the country.

Scatterplot Code

## Build Plot
## width/height are set in plot_ly(); newer plotly releases deprecate
## passing them to layout()
p <- plot_ly(values, x = ~Home_Price_Index, y = ~Ten_Year_Growth, 
             width = 800, 
             height = 500) %>%
    add_markers(marker = list(opacity = 0.5, 
                              size = 15), 
                text = ~hover, 
                color = ~WA, 
                colors = c("blue", "red")) %>%
    ## loess fit of growth on index value gives the "expected" growth curve
    add_lines(y = ~fitted(loess(Ten_Year_Growth ~ Home_Price_Index)), 
              name = "Expected Growth") %>%
    layout(title = "Home Value Index vs. 10 Year Growth")
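
The "above or below expected" comparison can also be done numerically instead of by eye. The sketch below is illustrative only: it uses simulated data (the `toy` data frame and the `expected` and `vs_expected` columns are hypothetical names), fits the same kind of loess curve, and labels each row by whether its growth is above or below the fitted value.

```r
## Simulated stand-in data; the real analysis would use the `values` data frame
set.seed(1)
toy <- data.frame(Home_Price_Index = runif(60, 50000, 700000))
toy$Ten_Year_Growth <- 5e-8 * toy$Home_Price_Index + rnorm(60, sd = 0.01)

## Fit the loess curve and keep the fitted ("expected") growth per row
fit <- loess(Ten_Year_Growth ~ Home_Price_Index, data = toy)
toy$expected <- fitted(fit)

## A row is "Above" when observed growth exceeds the fitted value
toy$vs_expected <- ifelse(toy$Ten_Year_Growth > toy$expected, "Above", "Below")

print(table(toy$vs_expected))
```

Applied to the county data, a column like this could be tabulated by state to count how many Washington counties sit above the line.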

Conclusion

The plot shows that, although growth dips for counties with a home value index between $130,000 and $230,000, counties with higher home value indices tend to have grown faster over the past 10 years. Of the counties with a home value index above $625,000, only one, Nantucket County, had negative growth. We also see that a majority of the counties in Washington had positive growth and grew faster than expected given their home value index. The county I live in, King County, had a home value index of $580,000 (the highest in the state…) and growth of 2.94% (4th highest).