Maryland Median Home Values

Erik White
12/17/2017

Overview

The Maryland Median Home Values application lets users explore how median home values changed between April 1996 and October 2017 across Maryland's 24 county-level jurisdictions (23 counties plus Baltimore City).

Data was extracted from two main sources:

Data Pre-Processing

The raw datasets were reshaped into two new datasets that feed the Map, Line Graph, and Data Table graphics available in the application.

  • graphData - contains a 'melted' version of the Zillow median home value dataset where each record displays the median home value for that county as of the record's date.
head(readRDS("data/graphData.rds"),3)
    RegionName       Date Value
1     Allegany 1996-04-01    43
2 Anne Arundel 1996-04-01    95
3    Baltimore 1996-04-01    81
  • mapData - contains the Zillow median home value dataset with corresponding latitudinal and longitudinal coordinates for each county (note that certain fields were omitted from the below sample):
library(dplyr)
head(select(readRDS("data/mapData.rds"), RegionName, INTPTLAT, INTPTLONG, X1996.04, X1996.05, X1996.06),3)
    RegionName INTPTLAT INTPTLONG X1996.04 X1996.05 X1996.06
1     Allegany 39.61231 -78.70310       43       44       43
2 Anne Arundel 38.99358 -76.56048       95       95       95
3    Baltimore 39.44317 -76.61657       81       81       82
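
The 'melting' step that relates the two datasets can be sketched in base R as follows. This is an illustration only, not necessarily how graphData was actually built: the toy table reuses the three sample rows shown above, and the names wide, month_cols, stacked, and long are hypothetical.

```r
# Toy wide table built from the three sample rows shown above (assumption:
# the real mapData has one such column per month through X2017.10)
wide <- data.frame(
  RegionName = c("Allegany", "Anne Arundel", "Baltimore"),
  X1996.04 = c(43, 95, 81),
  X1996.05 = c(44, 95, 81),
  X1996.06 = c(43, 95, 82),
  stringsAsFactors = FALSE
)

# Every column except the county name holds one month of values
month_cols <- setdiff(names(wide), "RegionName")

# stack() collapses the month columns into 'values' and 'ind' (the column name),
# going column by column, so county names repeat once per month
stacked <- stack(wide[month_cols])

long <- data.frame(
  RegionName = rep(wide$RegionName, times = length(month_cols)),
  # "X1996.04" -> "1996.04.01" -> Date
  Date = as.Date(paste0(sub("^X", "", as.character(stacked$ind)), ".01"),
                 format = "%Y.%m.%d"),
  Value = stacked$values
)

head(long, 3)
```

The first three rows of long should line up with the graphData sample above: one row per county for 1996-04-01.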

Map and Data Table

The “Map” tab in the application displays a leaflet graphic with a circle marker at each county's centroid.

  • The size of each marker has a direct relationship with the median home value per square foot as of the “Ending Date” that the user has specified.
  • The color of each marker is based on the percentage increase or decrease in median home value between the “Beginning Date” and the “Ending Date” that the user has specified. For example, the default dates when opening the application yield the percentage changes below. Higher values are shaded green and lower values are shaded red:
mapData <- readRDS("data/mapData.rds")
mapData <- mutate(mapData, Percentage_Increase_Decrease = (X2017.10 / X1996.04 - 1) * 100)
head(select(mapData, RegionName, X2017.10, X1996.04, Percentage_Increase_Decrease))
      RegionName X2017.10 X1996.04 Percentage_Increase_Decrease
1       Allegany       64       43                     48.83721
2   Anne Arundel      217       95                    128.42105
3      Baltimore      164       81                    102.46914
4 Baltimore City      109       48                    127.08333
5        Calvert      161       80                    101.25000
6       Caroline      112       62                     80.64516
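
The size and color rules described above could be wired into leaflet roughly as follows. This is a minimal sketch under stated assumptions, not the application's actual code: the "RdYlGn" palette and the division by 10 for the radius are hypothetical choices, and the toy rows reuse values from the samples shown earlier.

```r
library(leaflet)

# Toy rows taken from the samples shown earlier in this document
toy <- data.frame(
  RegionName = c("Allegany", "Anne Arundel", "Baltimore"),
  INTPTLAT   = c(39.61231, 38.99358, 39.44317),
  INTPTLONG  = c(-78.70310, -76.56048, -76.61657),
  X1996.04   = c(43, 95, 81),
  X2017.10   = c(64, 217, 164)
)
toy$pct_change <- (toy$X2017.10 / toy$X1996.04 - 1) * 100

# Assumed palette: "RdYlGn" runs from red (low values) to green (high values)
pal <- colorNumeric(palette = "RdYlGn", domain = toy$pct_change)

m <- leaflet(toy) %>%
  addTiles() %>%
  addCircleMarkers(
    lng = ~INTPTLONG, lat = ~INTPTLAT,
    radius = ~X2017.10 / 10,   # hypothetical scale factor for marker size
    color = ~pal(pct_change),  # color follows the percentage change
    label = ~RegionName
  )
```

Printing m in an interactive session renders the map; larger markers correspond to higher ending values.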

The output of the above calculations is also provided in the “Data Table” tab of the application for the user's convenience, and it updates dynamically as the user recalculates with different time intervals.
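
One way to picture the recalculation for arbitrary dates is a small helper that maps the two chosen dates to month columns and recomputes the percentage change. The function pct_change_between and its argument names are hypothetical, and the toy table reuses the sample values shown earlier; the application's own logic may differ.

```r
# Hypothetical helper: percentage change between two user-selected months
pct_change_between <- function(data, begin, end) {
  # Dates such as "1996-04-01" map to column names such as "X1996.04"
  begin_col <- format(as.Date(begin), "X%Y.%m")
  end_col   <- format(as.Date(end), "X%Y.%m")
  stopifnot(begin_col %in% names(data), end_col %in% names(data))
  data.frame(
    RegionName = data$RegionName,
    Begin = data[[begin_col]],
    End = data[[end_col]],
    Percentage_Increase_Decrease = (data[[end_col]] / data[[begin_col]] - 1) * 100
  )
}

# Toy wide table using the sample values shown earlier
toy <- data.frame(
  RegionName = c("Allegany", "Anne Arundel", "Baltimore"),
  X1996.04 = c(43, 95, 81),
  X1996.05 = c(44, 95, 81),
  X1996.06 = c(43, 95, 82)
)

result <- pct_change_between(toy, "1996-04-01", "1996-06-01")
result
```

Calling the helper again with a different pair of dates returns a fresh table, which mirrors what the Data Table tab shows after each recalculation.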

Line Graph

The “Line Graph” tab of the application displays a ggplot2 graphic that shows a time series of median home values for the time period specified. Changing the “Beginning Date” and “Ending Date” values regenerates the line graph so that its x-axis is rescaled to the interval that the user is exploring. The default dates for the application will result in the below graph.

library(ggplot2)
graphData <- readRDS("data/graphData.rds")
# One line per county; ISO dates avoid locale-dependent month abbreviations
ggplot(graphData, aes(x = as.Date(Date), y = Value, col = RegionName)) +
  geom_line() +
  xlim(as.Date("1996-04-01"), as.Date("2017-10-01")) +
  xlab("Date") +
  ylab("Median Home Value per Sq Ft (USD)")

[Figure: line graph of median home value per square foot (USD) by county, April 1996 to October 2017]
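
As an alternative to xlim(), the rescaling for a user-selected interval could also be done by subsetting the data before plotting, which avoids the warnings ggplot2 emits when axis limits drop rows. The sketch below is an assumption about one possible approach, not the application's code; plot_interval and the toy data frame (built from the sample values shown earlier) are hypothetical.

```r
library(ggplot2)

# Toy long-format data using the sample values shown earlier
toy <- data.frame(
  RegionName = rep(c("Allegany", "Anne Arundel", "Baltimore"), times = 3),
  Date = as.Date(rep(c("1996-04-01", "1996-05-01", "1996-06-01"), each = 3)),
  Value = c(43, 95, 81, 44, 95, 81, 43, 95, 82)
)

# Hypothetical helper: keep only rows inside [begin, end], then plot them
plot_interval <- function(data, begin, end) {
  begin <- as.Date(begin)
  end <- as.Date(end)
  in_range <- data[data$Date >= begin & data$Date <= end, ]
  ggplot(in_range, aes(x = Date, y = Value, colour = RegionName)) +
    geom_line() +
    labs(x = "Date", y = "Median Home Value per Sq Ft (USD)")
}

p <- plot_interval(toy, "1996-05-01", "1996-06-01")
```

Printing p draws the graph; because the data is filtered first, the x-axis spans exactly the selected interval.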