Forecasting US Housing Prices

# global image options
knitr::opts_chunk$set(fig.width = 10, fig.height = 6)

# set working directory if needed
# knitr::opts_knit$set(root.dir = normalizePath('C:/Users/<insert directory here>'))

Introduction

This analysis is intended to serve as an intermediate level, end-to-end demonstration of two key topics:

Acquiring, parsing, and wrangling various publicly available time-series housing data sources
Using Facebook’s Prophet forecasting package, specifically in the context of forecasting the US Housing Price Index.

The goal is to begin an exploration of both historical and current US housing prices, and to consider the contrarian hypothesis that a peak has been reached. The assumption is that US housing prices are likely to depreciate in the near term, given current macroeconomic conditions. Secondarily, this demonstration explores three external data sources as potential forecasting regressors:

The effective Fed Funds Rate
Homebuilder Sentiment (HMI)
ITB ETF Monthly Share Price (Homebuilder ETF)

The HPI will be forecasted for 24 months as of the effective date of December 2018.

First, prepare the environment, install packages if needed, and load packages.

# clear environment and confirm working directory
rm(list = ls())
getwd()

#install.packages('prophet')
#install.packages('dplyr')
#install.packages('tidyr')
#install.packages('ggplot2')
#install.packages('data.table')
#install.packages('quantmod')
#install.packages('plotly')

library(prophet)
library(ggplot2)
library(dplyr)
library(tidyr)
library(data.table)
library(quantmod)
library(plotly)

Generating The First Prophet Forecast

The first step is to load in the HPI (Housing Price Index) data. The master data was extracted from https://www.fhfa.gov/DataTools/Downloads/Pages/House-Price-Index-Datasets.aspx (direct file: https://www.fhfa.gov/DataTools/Downloads/Documents/HPI/HPI_master.csv).

Below, there are various options to read in the data from either a local directory, from the R Project, or from the web URL.

# read in locally
# hpi <- read.csv(choose.files())

# read in from R project
#hpi <- read.csv('HPI_master.csv', stringsAsFactors = FALSE)

# read in from the URL directly
hpi <- fread('https://www.fhfa.gov/DataTools/Downloads/Documents/HPI/HPI_master.csv')

# examine the structure of the data
str(hpi)

## Classes 'data.table' and 'data.frame':   106948 obs. of  10 variables:
##  $ hpi_type  : chr  "traditional" "traditional" "traditional" "traditional" ...
##  $ hpi_flavor: chr  "purchase-only" "purchase-only" "purchase-only" "purchase-only" ...
##  $ frequency : chr  "monthly" "monthly" "monthly" "monthly" ...
##  $ level     : chr  "USA or Census Division" "USA or Census Division" "USA or Census Division" "USA or Census Division" ...
##  $ place_name: chr  "East North Central Division" "East North Central Division" "East North Central Division" "East North Central Division" ...
##  $ place_id  : chr  "DV_ENC" "DV_ENC" "DV_ENC" "DV_ENC" ...
##  $ yr        : int  1991 1991 1991 1991 1991 1991 1991 1991 1991 1991 ...
##  $ period    : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ index_nsa : num  100 101 101 102 102 ...
##  $ index_sa  : num  100 101 101 101 101 ...
##  - attr(*, ".internal.selfref")=<externalptr>

This forecast will focus on the non-seasonally adjusted HPI: the forecast variable is ‘index_nsa’. Below, a copy of the data is retained before any transformations are made. The data is then subset down to ‘monthly’, ‘purchase-only’, ‘USA or Census Division’, ‘United States’.

# keep an untouched copy of the raw data (copy() because fread() returns a data.table)
hpi_orig <- data.table::copy(hpi)

# subset data to the monthly, purchase-only, national series
hpi <- hpi %>% 
  dplyr::filter(frequency == 'monthly' &
                hpi_flavor == 'purchase-only' & 
                level == 'USA or Census Division' &
                place_name == 'United States') %>%
  dplyr::select(period, yr, index_nsa)

## Warning: package 'bindrcpp' was built under R version 3.4.4

# view data
tail(hpi)

As the data stands now, there is no proper date variable present, only ‘period’ which represents the month, and the year. Below, a ‘date’ field is created after padding the ‘period’ variable with a 0 if the month number is less than 10. A correctly formatted date-type field is required for running time-series forecasting.

# create date field
hpi$period <- ifelse(hpi$period < 10, paste('0', hpi$period, sep = ''),hpi$period)

hpi$date <- paste(hpi$period, '01', hpi$yr, sep = '.')

# date conversion
hpi$date <- as.Date(as.character(hpi$date), '%m.%d.%Y')

# date field is now year-month-day
head(hpi)
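
As an aside, the padding and date conversion can be collapsed into a single step. The following is a minimal sketch on a toy data frame (the ‘toy’ object and its values are made up for illustration; only the ‘yr’ and ‘period’ column names come from the HPI data), using sprintf() to zero-pad the month:

```r
# toy stand-in for the 'yr' and 'period' columns of the HPI data (made-up rows)
toy <- data.frame(yr = c(1991L, 1991L, 2018L), period = c(1L, 12L, 10L))

# '%02d' zero-pads the month, so no separate ifelse() padding step is needed
toy$date <- as.Date(sprintf('%d-%02d-01', toy$yr, toy$period))

toy$date
```

This variant also leaves ‘period’ as an integer rather than converting it to character.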

It is key to ensure there is no missing data before forecasting. The filter below returns any rows where the index is missing or non-positive. Had any values been missing, those cells would need to be set to NA. Since the filter returns 0 rows, this is not necessary.

# ensure no missing data in the HPI variable
hpi %>% filter(index_nsa <= 0 | is.na(index_nsa))

Absence of missing data can also be confirmed via a summary of the HPI data:

summary(hpi)

##     period                yr         index_nsa          date           
##  Length:334         Min.   :1991   Min.   :100.0   Min.   :1991-01-01  
##  Class :character   1st Qu.:1997   1st Qu.:121.0   1st Qu.:1997-12-08  
##  Mode  :character   Median :2004   Median :181.4   Median :2004-11-16  
##                     Mean   :2004   Mean   :172.6   Mean   :2004-11-15  
##                     3rd Qu.:2011   3rd Qu.:212.4   3rd Qu.:2011-10-24  
##                     Max.   :2018   Max.   :271.0   Max.   :2018-10-01
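
A monthly series can also be incomplete because entire rows are absent, which neither the NA filter nor summary() would reveal. The sketch below shows one way to check for that on a toy vector of dates (the dates are illustrative and March 1991 is deliberately left out); the same comparison could be applied to the HPI ‘date’ column:

```r
# toy monthly dates with March 1991 deliberately missing (illustrative values)
dates <- as.Date(c('1991-01-01', '1991-02-01', '1991-04-01', '1991-05-01'))

# the full calendar of month starts between the first and last observation
expected <- seq(min(dates), max(dates), by = 'month')

# months that appear in the calendar but not in the data
missing_months <- expected[!expected %in% dates]
missing_months
```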

As a preliminary step, create a quick plot of the forecast variable: monthly HPI from 1991 to 2018. The large decline during the financial crisis and the subsequent recovery are clearly evident. The HPI has been on an upward trajectory since ~2011.

#visualize time series            
qplot(date, index_nsa, data = hpi, main = 'HPI: 1991 - 2018')

Running a Prophet model requires a specific dataframe with specifically named fields: ‘ds’ for the date-type variable, and ‘y’ for the forecast variable. These are created below and passed into a new ‘df’ dataframe.

Fitting the first forecast model is as easy as passing the new ‘df’ dataframe into the prophet() function from the prophet package. Most economic time-series feature a multiplicative seasonality, so a broad assumption is made here to specify this in the model. Delving into this topic is beyond the current scope of this analysis.

# prep new df for prophet input/fitting. need 'ds' and 'y' columns
df <- data.frame(ds = hpi$date, y = hpi$index_nsa)

# fit prophet model
mod.1 <- prophet::prophet(df, seasonality.mode = 'multiplicative')

## Disabling weekly seasonality. Run prophet with weekly.seasonality=TRUE to override this.

## Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.

## Initial log joint probability = -3.14135
## Optimization terminated normally: 
##   Convergence detected: relative gradient magnitude is below tolerance
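
To make the multiplicative assumption more concrete: in multiplicative mode the seasonal effect scales with the level of the trend, whereas in additive mode it has a fixed size. The sketch below illustrates the difference with a made-up trend and a made-up 5% yearly cycle; it is a toy illustration of the two functional forms, not a reproduction of Prophet's fitting procedure.

```r
# toy trend that doubles over four years, plus a yearly cycle (all values made up)
t     <- 0:47
trend <- 100 + (100 / 47) * t
cycle <- 0.05 * sin(2 * pi * t / 12)

# additive form: y = trend + s, the seasonal swing is fixed in index points
seasonal_add  <- 100 * cycle
# multiplicative form: y = trend * (1 + s), the swing is a share of the trend
seasonal_mult <- trend * cycle

# peak-to-trough seasonal swing within a given set of months
swing <- function(s, idx) diff(range(s[idx]))

swings <- c(add_first  = swing(seasonal_add, 1:12),
            add_last   = swing(seasonal_add, 37:48),
            mult_first = swing(seasonal_mult, 1:12),
            mult_last  = swing(seasonal_mult, 37:48))
round(swings, 2)
```

The additive swing is the same in the first and last year, while the multiplicative swing grows as the trend rises.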

After the model is fit, the next step is to create a ‘future data frame’. In this case, it is extended out 24 periods (months). As of this initial analysis, the latest period in the HPI data was October 2018, meaning the forecast will extend to October 2020.

# creating a future data frame. Note the monthly frequency, and prediction period up to Oct 2020:
future <- make_future_dataframe(mod.1, periods = 24, freq = 'month')

tail(future)
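
The date arithmetic behind the 24-month horizon can be checked independently of Prophet. This is a small base R sketch of that arithmetic only (the ‘last_obs’ and ‘horizon’ names are hypothetical), assuming the last observed month is October 2018 as stated above:

```r
# last observed month in the HPI series, per the summary above
last_obs <- as.Date('2018-10-01')

# 24 month starts following the last observation (drop the first element,
# which is the last observed month itself)
horizon <- seq(last_obs, by = 'month', length.out = 25)[-1]

range(horizon)
```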

Now, we can put the pieces together by predicting (forecasting) out over the future data frame created in the previous step. Below, the forecasted data is displayed as ‘yhat’, with upper and lower uncertainty bounds.

# predictions
forecast_df <- tbl_df(predict(mod.1, future))

#create a forecast data frame, including upper and lower limits
forecast_df <- data.frame(forecast_df[c('ds','yhat', 'yhat_lower', 'yhat_upper')])

tail(forecast_df)

The plots below visualize the forecast and its components: the overall long-term trend and the yearly seasonality. The resulting forecast predicts that the HPI will continue to rise unchecked into the future. The reason is likely related to the strong upward trend in both the long and short term.

# plot forecast
plot(mod.1, forecast_df, main = 'HPI Forecast', 
     xlab = 'Monthly Time Series',
     ylab = 'HPI - Nonseasonally Adj.')

# plot trend components
prophet_plot_components(mod.1, predict(mod.1, future))

Adding A Regressor

At this point, the analysis is going to introduce an additional regressor to the forecasting models: the effective Fed Funds rate: https://fred.stlouisfed.org/series/FEDFUNDS

Although the two variables have stark scaling differences, both time series are visualized together below. In the short term, as of mid-2016, the US Federal Reserve has shifted its monetary policy and begun tightening by raising interest rates after a long period of holding them near zero. The theory here is that as rates rise and money becomes more ‘expensive’, mortgages will become more costly and harder to obtain, which will inevitably affect housing prices. This new regressor may not be the strongest of leading indicators, but it is worth exploring.

As of the date of this analysis, December 2018, the Federal Reserve has recently raised its target range to between 2.25% and 2.50%. Because the effective rate for December 2018 is not present in the data file available at the above link, the midpoint of the range (approximately 2.38) was used for that month.

# effective fed funds rate by month
fed <- read.csv('FEDFUNDS.csv', stringsAsFactors = TRUE)
fed$DATE <- as.Date(as.character(fed$DATE), '%m/%d/%Y')


# join to the HPI time series
m <- dplyr::full_join(fed, df, by = c('DATE'='ds')) 
m <- m %>% dplyr::select(DATE, y, FEDFUNDS) %>%
     rename(ds = DATE, fedfunds = FEDFUNDS)

# plot fed funds and HPI on same visual
ggplot(m, aes(ds)) + 
  geom_line(aes(y = y, colour = 'y')) + 
  geom_line(aes(y = fedfunds, colour = 'fedfunds'))

Below, an alternative visualization is generated, with the two time series presented on separate scales. The sharp drop in interest rates between 2005 and 2010 coincides with the beginning of the Fed’s QE (quantitative easing) program.

# plot them on two facets (data.table's melt() expects a data.table, so convert first)
m2 <- melt(as.data.table(m), id.vars = 'ds')

m2 %>%
  ggplot(aes(x = ds, y = value)) +
  geom_point() + geom_line() + 
  geom_jitter(alpha = 0.5) +
  geom_smooth() +
  facet_wrap( ~ variable, scales = 'free_y')

Since the effective Fed Funds rate is being considered as an additional regressor, it must first be forecasted out to October 2020 before being incorporated into the primary forecast. Since the Fed Funds dataset extends to December 2018, it is forecasted out 22 periods: the Fed Funds data is two full months ahead of the HPI data.

The forecasted Fed Funds rate was given a floor of 0.5 and a ceiling of 5.0, which seemed to be a feasible min/max for interest rates in the near term. Note that Prophet only applies ‘cap’ and ‘floor’ when growth = 'logistic'; with growth = 'linear', as specified below, these columns do not constrain the forecast. All of the previous steps are repeated:

Train a Prophet model
Create a future dataframe
Generate a forecast (predictions)
Visualize the results

# create the df (the 2009-forward filter is left commented out). Add a ceiling and a floor to the model
f <- fed %>% 
        #filter(year(DATE) >= 2009) %>%
        rename(ds = DATE, y = FEDFUNDS)
f$cap <- 5.0
f$floor <- 0.5

# fit model
fedMod.1 <- prophet::prophet(f, growth = 'linear', 
                             seasonality.mode = 'multiplicative',
                             changepoint.prior.scale = 0.98, 
                             changepoint.range = 0.98)

## Initial log joint probability = -22.9751
## Optimization terminated normally: 
##   Convergence detected: relative gradient magnitude is below tolerance

# make future data frame
future_f <- make_future_dataframe(fedMod.1, periods = 22, freq = 'month')
future_f$cap <- 5
future_f$floor <- 0.5

# generate forecast
forecast_f <- tbl_df(predict(fedMod.1, future_f))

#create a forecast data frame, including upper and lower limits
forecast_f <- data.frame(forecast_f[c('ds','yhat', 'yhat_lower', 'yhat_upper')])

# plot forecast
plot(fedMod.1, forecast_f, main = 'Fed Funds Forecast', 
     xlab = 'Monthly Time Series',
     ylab = 'US Fed Funds Rate (Effective)')
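
The difference between a saturating (logistic) trend and a linear one is what determines whether a floor and ceiling can bind. The sketch below contrasts the two shapes in base R. It is a simplified single-segment illustration, not Prophet's piecewise implementation, and the growth rate ‘k’, offset ‘m’, and both function names are hypothetical; only the 0.5 floor and 5.0 cap come from the text above.

```r
# saturating trend: stays strictly between 'floor' and 'cap'
logistic_trend <- function(t, cap, floor, k, m) {
  floor + (cap - floor) / (1 + exp(-k * (t - m)))
}

# linear trend: no bounds, keeps growing at rate k
linear_trend <- function(t, k, m) {
  k * t + m
}

t  <- 0:120   # months
lg <- logistic_trend(t, cap = 5.0, floor = 0.5, k = 0.08, m = 40)
ln <- linear_trend(t, k = 0.08, m = 0.5)

# the logistic curve respects the bounds, the linear one runs past the cap
range(lg)
range(ln)
```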

Now that the effective Fed Funds Rate has been forecasted, this can now be used in the second HPI model as an additional regressor.

# join the HPI data with the fed data by date
df <- dplyr::left_join(df, fed, by = c('ds'='DATE')) %>% 
      dplyr::rename(fedfunds = FEDFUNDS)

# fit model, add regressor
mod.2 <- prophet()
mod.2 <- add_regressor(mod.2, 'fedfunds')
mod.2 <- fit.prophet(mod.2, df)

## Disabling weekly seasonality. Run prophet with weekly.seasonality=TRUE to override this.

## Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.

## Initial log joint probability = -3.14135
## Optimization terminated normally: 
##   Convergence detected: relative gradient magnitude is below tolerance

# make future data frame up to October 2020 (24 months past the last HPI observation)
future.2 <- make_future_dataframe(mod.2, periods = 24, freq = 'month')

# add the forecasted fedfunds
future.2 <- dplyr::left_join(future.2, forecast_f %>% 
                              dplyr::select(ds, yhat), by = c('ds' = 'ds')) %>%
                             dplyr::rename(fedfunds = yhat)
# set NA to 0
future.2$fedfunds[is.na(future.2$fedfunds)] <- 0

# generate predictions with additional fed funds regressor
forecast_addReg_f <- predict(mod.2, future.2)

# plot forecast
plot(mod.2, forecast_addReg_f[c('ds','yhat', 'yhat_lower', 'yhat_upper')], 
     xlab = 'Monthly Time Series',
     ylab = 'HPI non-seasonally adj.')

# plot trend components
prophet_plot_components(mod.2, forecast_addReg_f)
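
An extra regressor must have a value for every date in the future data frame, which is why missing values are zero-filled above. Before filling, it can be useful to see which dates are actually missing. The following is a minimal base R sketch on toy data (the object names and the rate values are made up for illustration):

```r
# toy future frame: four consecutive months
future_toy <- data.frame(ds = seq(as.Date('2018-09-01'), by = 'month', length.out = 4))

# toy regressor forecast that stops one month short (made-up values)
reg_toy <- data.frame(ds = seq(as.Date('2018-09-01'), by = 'month', length.out = 3),
                      fedfunds = c(1.95, 2.19, 2.20))

# left join: keep every future date, attach the regressor where available
joined <- merge(future_toy, reg_toy, by = 'ds', all.x = TRUE)

# dates for which the regressor would otherwise be silently zero-filled
gaps <- joined$ds[is.na(joined$fedfunds)]
gaps
```

If ‘gaps’ is empty, the zero-fill step has no effect; otherwise it shows exactly where zeros would be introduced.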

Adding Multiple Regressors

The inclusion of the additional regressor has done little to change the upward trend of the HPI. One reason for this is that interest rates were reduced in response to the financial crisis, so the Fed Funds rate is likely not a leading indicator but rather a lagging indicator. The following section demonstrates how to add a few more regressors simultaneously:

The Housing Market Index (HMI) which measures the 6 month forward looking homebuilder sentiment for single family home sales. https://www.nahb.org/en/research/housing-economics/housing-indexes/housing-market-index.aspx
Monthly share price of ITB, the iShares U.S. Home Construction ETF (NYSEARCA: ITB), which is the largest ETF that tracks homebuilders. See: https://www.investopedia.com/articles/etfs-mutual-funds/071016/top-3-homebuilders-etfs-itb-xhb.asp

ITB data can be pulled from Yahoo Finance: https://finance.yahoo.com/quote/ITB/history?period1=1146801600&period2=1545368400&interval=1mo&filter=history&frequency=1mo

First, read in the ITB ETF monthly pricing data directly from Yahoo Finance using the getSymbols() function from the ‘quantmod’ package. (See: https://cran.r-project.org/web/packages/quantmod/quantmod.pdf)

# pull ITB ETF data from Yahoo Finance
getSymbols('ITB', src='yahoo', from = '1991-01-01', periodicity = 'monthly')

## [1] "ITB"

ITB_df <- as.data.frame(ITB)

# create dataframe with closing price and date variable
ITB_df <- data.frame(row.names(ITB_df), ITB_df$ITB.Close)
ITB_df <- ITB_df %>% 
          dplyr::rename(date = row.names.ITB_df.,
                        itb_close = ITB_df.ITB.Close)
# date conversion
ITB_df$date <- as.Date(ITB_df$date)

# optional: add floor / capacity
# ITB_df$floor <- 1.0
# ITB_df$cap <- 100.0

head(ITB_df)

##         date itb_close
## 1 2006-05-01     42.74
## 2 2006-06-01     39.37
## 3 2006-07-01     35.04
## 4 2006-08-01     35.80
## 5 2006-09-01     37.00
## 6 2006-10-01     38.89

# plot the data
plotly::plot_ly(ITB_df, x = ~date, y = ~itb_close, type = 'scatter', mode = 'lines') %>% plotly::layout(title = 'ITB ETF Monthly Share Price')

Here the HMI data is read in from a local .CSV file and reshaped as a time series. This data must be downloaded manually first, since a direct URL pull is prevented by hashing of the download link.

# read in data
HMI <- read.csv('table2-nahb-wells-fargo-national-hmi-history.csv')

# recreate column names, filter out any rows with blank cells
hmi_cols <- c('year','01','02','03', '04', '05', '06', '07', '08', '09', '10','11','12')
colnames(HMI) <- hmi_cols
HMI <- as.data.frame(HMI) %>%
          dplyr::filter_all(all_vars(. != ''))

# view data
head(HMI)

Further transformation of the HMI data includes reshaping it from a wide to a narrow data frame using the gather() function from the tidyr package, adding a date variable, sorting by date, and converting sentiment to a numeric data type.

# convert from wide to narrow format
HMI <- HMI %>% 
        tidyr::gather(key = month, value = sentiment, -year) %>%
        dplyr::mutate(date = as.Date(paste(year, month, '01', sep = '-'))) %>%
        dplyr::arrange(date) %>%
        dplyr::select(date, sentiment)

# numeric conversion (via character, in case the column was read in as a factor)
HMI$sentiment <- as.numeric(as.character(HMI$sentiment))

# optional: add a floor / capacity
#HMI$floor <- 1.0
#HMI$cap <- 100.0

# plot the data
plotly::plot_ly(HMI, x = ~date, y = ~sentiment, type = 'scatter', mode = 'lines') %>% plotly::layout(title = 'US Homebuilder Sentiment (HMI)')
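
In more recent versions of tidyr, gather() has been superseded by pivot_longer(). The sketch below shows the same wide-to-narrow reshape on a toy table shaped like the HMI file (the two years, two months, and all sentiment values are made up for illustration):

```r
library(tidyr)

# toy wide table: one row per year, one column per month (made-up values)
wide <- data.frame(year = c(2017, 2018),
                   `01` = c(67, 72),
                   `02` = c(65, 71),
                   check.names = FALSE)

# reshape to one row per year-month
long <- pivot_longer(wide, cols = -year, names_to = 'month', values_to = 'sentiment')

# build a proper date, sort, and keep only the columns needed for forecasting
long$date <- as.Date(paste(long$year, long$month, '01', sep = '-'))
long <- long[order(long$date), c('date', 'sentiment')]
long
```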

Plotting both homebuilder sentiment and the ITB Homebuilder ETF monthly share price, we see that both are following a similar long term and short term trend, with the short term showing recent decline.

# visualize
plotly::plot_ly(dplyr::left_join(ITB_df, HMI, by = 'date'), 
                x = ~date, 
                y = ~sentiment, 
                name = 'HMI',
                type = 'scatter', 
                mode = 'lines') %>%
plotly::add_trace(y = ~itb_close, 
                name = 'ITB ETF', 
                type = 'scatter', 
                mode = 'lines', 
                line = list(color = 'rgba(67,67,67,1)', width = 2)) %>%
plotly::layout(title = 'Homebuilder Sentiment (HMI) & ITB ETF Share Price',
                yaxis= list(title=''))

Below, the next steps are to forecast the new regressors before including them in the HPI forecast model. Note that these forecasts were tuned using a high changepoint prior scale and a high changepoint range. By default, Prophet places potential trend changepoints only within the first 80% of the input data; this was increased to 0.98, meaning that changepoints can be inferred across the first 98% of the input data. This adjustment allows for a better near-term fit.

Additionally, the erratic nature of these time-series resulted in a poor fit when using the default changepoint prior scale of 0.05. Increasing this makes the trend more flexible and provides a better fit. 0.98 was used for this parameter. The resulting combination of this tuning allowed for a better fit and also a more feasible short-term directional forecast. It seems rational that ITB share price would continue to decline in the short term, Fed Funds rate would continue to be increased, and the HMI sentiment index would also decline.
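
To get a feel for what the changepoint range means in calendar terms, the sketch below computes the latest date at which a trend changepoint could be placed for a hypothetical 334-month history starting in January 1991 (the length of the HPI series used earlier). It assumes that candidate changepoints are confined to the first ‘changepoint.range’ share of the observations; the helper name ‘last_candidate’ is hypothetical and this is an approximation of the idea, not Prophet's own code.

```r
# hypothetical monthly history: 334 months starting January 1991
ds <- seq(as.Date('1991-01-01'), by = 'month', length.out = 334)

# latest observation that still falls inside the changepoint range
last_candidate <- function(ds, changepoint_range) {
  ds[floor(length(ds) * changepoint_range)]
}

cp_default <- last_candidate(ds, 0.80)  # default range
cp_tuned   <- last_candidate(ds, 0.98)  # range used in this analysis

cp_default
cp_tuned
```

Under these assumptions, the default would ignore trend changes in roughly the last five and a half years of data, while the tuned value leaves only the final few months out.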

# create a list of time-series dataframes
ts <- list(hmi = HMI %>% 
                  #dplyr::filter(date >= '2006-05-01') %>%
                  dplyr::rename(ds = date,
                                y = sentiment),
           itb = ITB_df %>%
                  dplyr::filter(date <= '2018-12-01') %>%
                  dplyr::rename(ds = date,
                                y = itb_close),
           fed = fed %>%
                  dplyr::rename(ds = DATE,
                                y = FEDFUNDS))

# optional: add floor and cap to fed
# ts$fed$floor <- 0.0
# ts$fed$cap <- 5.0

# model each time series in the list
m_prophet <- purrr::map(ts, prophet,
                        #growth = 'logistic',
                        seasonality.mode = 'multiplicative', 
                        changepoint.prior.scale = 0.98,
                        changepoint.range = 0.98)

## Disabling weekly seasonality. Run prophet with weekly.seasonality=TRUE to override this.

## Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.

## Initial log joint probability = -12.4285
## Optimization terminated normally: 
##   Convergence detected: relative gradient magnitude is below tolerance

## Disabling weekly seasonality. Run prophet with weekly.seasonality=TRUE to override this.
## Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.

## Initial log joint probability = -14.0547
## Optimization terminated normally: 
##   Convergence detected: relative gradient magnitude is below tolerance

## Disabling weekly seasonality. Run prophet with weekly.seasonality=TRUE to override this.
## Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.

## Initial log joint probability = -22.9751
## Optimization terminated normally: 
##   Convergence detected: relative gradient magnitude is below tolerance

# make a future data frame for each forecast
future <- purrr::map(m_prophet, make_future_dataframe, periods = 24, freq = 'month')

# optional: adding floor / capacity to the regressors
# future$itb$floor <- 1.0
# future$itb$cap <- 100.0
# 
# future$hmi$floor <- 1.0
# future$hmi$cap <- 100.0
# 
# future$fed$floor <- 0.0
# future$fed$cap <- 5.0

# forecast each data frame
forecast_all <- purrr::map2(m_prophet, future, predict)

# set ITB negative forecast values to 0
forecast_all$itb$yhat[forecast_all$itb$yhat<0] <- 0.0
forecast_all$itb$yhat_lower[forecast_all$itb$yhat_lower<0] <- 0.0
forecast_all$itb$yhat_upper[forecast_all$itb$yhat_upper<0] <- 0.0

# view some columns from one forecast
tail(forecast_all$itb %>% dplyr::select(ds, yhat_lower, yhat, yhat_upper))

# plot forecasts
plot(m_prophet$hmi, forecast_all$hmi[c('ds','yhat', 'yhat_lower', 'yhat_upper')], 
     xlab = 'Monthly Time Series',
     ylab = 'HMI Seasonally Adj.')

plot(m_prophet$itb, forecast_all$itb[c('ds','yhat', 'yhat_lower', 'yhat_upper')], 
     xlab = 'Monthly Time Series',
     ylab = 'ITB ETF Monthly Share Price')

plot(m_prophet$fed, forecast_all$fed[c('ds','yhat', 'yhat_lower', 'yhat_upper')], 
     xlab = 'Monthly Time Series',
     ylab = 'Effective Fed Funds Rate')

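Below, all three regressors are joined to the primary HPI time series using the dplyr::left_join() function. In the case of non-existent data (NA), 0’s are substituted to ensure a fully numeric matrix. Note that the ITB data only begins in May 2006, so this regressor is zero-filled for all earlier months.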

# prep df for prophet input/fitting. need 'ds' and 'y' columns
df <- data.frame(ds = hpi$date, y = hpi$index_nsa)

df <- dplyr::left_join(df, 
                 data.frame(ds = as.Date(forecast_all$fed$ds), 
                            fedfunds = forecast_all$fed$yhat), 
                 by = c('ds'='ds')) %>%
      dplyr::left_join(., 
                       data.frame(ds = as.Date(forecast_all$hmi$ds),
                                  sentiment = forecast_all$hmi$yhat),
                       by = c('ds' = 'ds')) %>%
      dplyr::left_join(., 
                       data.frame(ds = as.Date(forecast_all$itb$ds),
                                  itb = forecast_all$itb$yhat),
                       by = c('ds' = 'ds'))
# replace NA with 0
df[is.na(df)] <- 0

# view data
head(df)
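
Zero-filling keeps the full HPI history but introduces an artificial jump where a regressor first becomes available. One alternative, sketched below on toy data, is to restrict the training frame to rows where every regressor is observed. This is offered as an option to consider, not as the approach taken in this analysis; the ‘y’ values are made up, while the two ITB prices are the first two closing prices shown earlier.

```r
# toy training frame spanning the ITB inception date (y values are made up)
toy <- data.frame(ds  = as.Date(c('2006-03-01', '2006-04-01', '2006-05-01', '2006-06-01')),
                  y   = c(219.1, 221.3, 222.9, 223.6),
                  itb = c(NA, NA, 42.74, 39.37))

# option 1: zero-fill, as done above (creates a step from 0 to ~42)
zero_filled <- toy
zero_filled$itb[is.na(zero_filled$itb)] <- 0

# option 2: keep only rows where all columns are observed
trimmed <- toy[complete.cases(toy), ]

zero_filled
trimmed
```

The trade-off is a shorter training history in exchange for a regressor that contains only observed values.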

The following steps will fit a prophet model which includes the three additional regressors: the effective Fed Funds rate, the Homebuilders Sentiment (HMI) Index, and the ITB ETF Monthly Share Price.

# fit model, add regressors
mod.3 <- prophet(seasonality.mode = 'multiplicative')
mod.3 <- add_regressor(mod.3, 'fedfunds')
mod.3 <- add_regressor(mod.3, 'sentiment')
mod.3 <- add_regressor(mod.3, 'itb')

mod.3 <- fit.prophet(mod.3, df)

## Disabling weekly seasonality. Run prophet with weekly.seasonality=TRUE to override this.

## Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.

## Initial log joint probability = -3.14135
## Optimization terminated normally: 
##   Convergence detected: relative gradient magnitude is below tolerance

# make future data frame up to October 2020 (24 months past the last HPI observation)
future.3 <- make_future_dataframe(mod.3, periods = 24, freq = 'month')

# add the forecasted regressors
future.3 <- dplyr::left_join(future.3,
                             data.frame(ds = forecast_all$fed$ds, 
                             fedfunds = forecast_all$fed$yhat), 
                             by = c('ds'='ds')) %>%
            dplyr::left_join(., 
                             data.frame(ds = forecast_all$hmi$ds,
                                        sentiment = forecast_all$hmi$yhat),
                             by = c('ds' = 'ds')) %>%
            dplyr::left_join(., 
                             data.frame(ds = forecast_all$itb$ds,
                                        itb = forecast_all$itb$yhat),
                             by = c('ds' = 'ds'))
                             
# set NA to 0
future.3[is.na(future.3)] <- 0

# generate predictions using additional regressors
forecast_addReg <- predict(mod.3, future.3)

# plot forecast
plot(mod.3, forecast_addReg[c('ds','yhat', 'yhat_lower', 'yhat_upper')], 
     xlab = 'Monthly Time Series',
     ylab = 'HPI non-seasonally adj.')

# plot trend components
prophet_plot_components(mod.3, forecast_addReg)

Conclusion

The inclusion of the three regressors did nothing to suggest that the upward trend of the HPI will hit a major inflection point in the near term. The three additional regressors actually pushed the HPI forecast up slightly rather than down. The inclusion of just the Fed Funds rate also increased the trend, but to a lesser degree. Other, more leading indicators should be considered. The three forecasts are compared below:

# plot the yhat from the original forecast against the yhat forecast influenced by additional regressors.
plotly::plot_ly(tail(forecast_addReg,30), 
                x = ~ds, 
                y = ~yhat, 
                name = 'With Regressors',
                type = 'scatter', 
                mode = 'lines+markers') %>%
plotly::add_trace(y = ~tail(forecast_df$yhat,30), 
                name = 'Without Regressors', 
                type = 'scatter', 
                mode = 'lines+markers', 
                line = list(color = 'rgba(67,67,67,1)', width = 2)) %>%
plotly::add_trace(y = ~tail(forecast_addReg_f$yhat,30), 
                name = 'With Fed Funds Regressor', 
                type = 'scatter', 
                mode = 'lines+markers', 
                line = list(color = 'rgba(5,157,67,1)', width = 2))

Next Steps

It is worth considering the fact that homebuilder sentiment, profits, and the Fed Funds rate are all lagging indicators. Additional analysis is warranted, such as experimentation with other, more leading regressors, assessing the correlation of these variables to the HPI, and/or fine-tuning the Prophet model parameters.

Potential additional regressors: Consumer spending, saving, debt, labor statistics, and employment data.
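
One way to begin assessing whether a candidate regressor leads or lags the HPI is a lagged cross-correlation, for example with ccf() from base R's stats package. The sketch below uses simulated data in which the indicator ‘x’ is constructed to lead the target ‘y’ by three months (all series here are synthetic, and the variable names are hypothetical). With real data, both series would typically be differenced or detrended first.

```r
set.seed(42)

n    <- 120   # months of simulated data
lead <- 3     # x is built to move three months ahead of y

base <- rnorm(n + lead)
x <- base[(lead + 1):(n + lead)]       # candidate indicator
y <- base[1:n] + rnorm(n, sd = 0.1)    # target: y[t] repeats x[t - 3], plus noise

# ccf(x, y) at lag k estimates the correlation between x[t + k] and y[t]
cc <- ccf(x, y, lag.max = 12, plot = FALSE)

# lag with the strongest correlation; negative means x leads y
best_lag <- cc$lag[which.max(abs(cc$acf))]
best_lag
```

A strongest correlation at a negative lag points to a leading indicator, while a peak at a positive lag points to a lagging one.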