1 Data Set Overview

This data comes from the United States Census Bureau and contains monthly counts of new single-family homes sold from January 1963 to October 2023, in units of thousands of homes. The data is not seasonally adjusted. A single-family home is any house intended for only one family to live in. The data set only contains information on houses that are on the market for the first time.

2 Objective of Analysis

The objective of this analysis is to fit, test, and compare various exponential smoothing methods to determine which method is most accurate for future predictions of new home sales. The accuracy of the predictive models will be judged by their MSE and MAPE in predicting the last 12 months in the data set, which will be held out as testing data and not used in the model building process. The seven candidate exponential smoothing methods are the following.

  • simple exponential smoothing
  • Holt additive
  • Holt additive damped
  • Holt-Winters seasonally additive
  • Holt-Winters seasonally multiplicative
  • Holt-Winters seasonally additive damped
  • Holt-Winters seasonally multiplicative damped

The smoothing and damping parameters used in fitting these models will be estimated using the forecast package’s optimization function, with the default values used as starting points.
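To make the idea of estimating a smoothing parameter concrete, here is a minimal Python sketch for the simplest candidate, simple exponential smoothing. It is not the forecast package's routine: the grid search is a swapped-in simplification of a numerical optimiser, the level is initialised at the first observation by assumption, and the names `ses_sse`, `fit_alpha`, and the `toy` series are hypothetical.

```python
def ses_sse(y, alpha):
    """Sum of squared one-step-ahead errors for simple exponential smoothing."""
    level = y[0]  # assumption: initialise the level at the first observation
    sse = 0.0
    for obs in y[1:]:
        error = obs - level      # the one-step-ahead forecast is the current level
        sse += error ** 2
        level += alpha * error   # level_t = level_{t-1} + alpha * error_t
    return sse


def fit_alpha(y, grid_size=999):
    """Choose alpha in (0, 1) by a coarse grid search on the SSE."""
    candidates = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(candidates, key=lambda a: ses_sse(y, a))


# Made-up illustrative series, not the Census data.
toy = [30.0, 32.0, 31.0, 35.0, 34.0, 38.0, 37.0, 41.0]
alpha_hat = fit_alpha(toy)
```

The same principle extends to the trend, seasonal, and damping parameters of the other six methods: each candidate parameter set is scored by how well the one-step-ahead forecasts match the training data.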

3 Initial Time Series Analysis

Plotting the full data set reveals that new home sales dropped dramatically during 2008. This was expected, as the global financial crisis of 2008 was centered around defaults on home mortgages and the bursting of the US housing bubble. Because this global event caused such a seismic shift in the housing industry, it would be improper to use data from before the collapse to predict future sales. For the rest of the analysis we will only concern ourselves with housing data from January 2009 to October 2023.
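The window and the 12-month hold-out described in the objective can be sketched as follows. This is an illustrative Python sketch that works on month labels only, not on the Census values; `month_range` is a hypothetical helper.

```python
def month_range(start_year, start_month, end_year, end_month):
    """List (year, month) pairs from start to end, inclusive."""
    months = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        months.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


# Analysis window: January 2009 to October 2023.
window = month_range(2009, 1, 2023, 10)

# Hold out the last 12 months as testing data.
holdout = 12
train_months, test_months = window[:-holdout], window[-holdout:]
```

Under this split the window holds 178 months, of which 166 (January 2009 to October 2022) are used for training and 12 (November 2022 to October 2023) for testing.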

4 Training Accuracy

The following table gives the accuracy statistics for the seven candidate exponential smoothing methods. These accuracy measures are based solely on the training data, which ranges from January 2009 to October 2022. As such, this table will not be used to determine the best model; that will be decided by a later test of how accurately these models predict the testing data, which ranges from November 2022 to October 2023. Based on this table, the most accurate method on the training data appears to be Holt-Winters multiplicative with damping, which has the lowest RMSE, MAE, MAPE, and MASE. All the Holt-Winters methods are more accurate than their non-seasonal alternatives.

The accuracy measures of the 7 potential exponential smoothing methods, based on training data

Method                                 ME     RMSE      MAE      MPE     MAPE     MASE     ACF1
Simple                             0.1149   5.3890   4.1269  -0.3242   9.2377   0.5896  -0.0004
Holt                              -0.0712   5.4097   4.1682  -0.8562   9.4233   0.5955  -0.0037
Holt Damped                        0.0668   5.3913   4.1188  -0.4761   9.2156   0.5884   0.0014
Holt Winters’ Additive             0.0018   4.5653   3.4063  -0.5494   7.7008   0.4866   0.0145
Holt Winters’ Multiplicative       0.2249   5.0045   3.5468   0.0847   7.7123   0.5067   0.2544
HW Additive Damped Trend           0.1605   4.5708   3.4389  -0.0692   7.7834   0.4913   0.0097
HW Multiplicative Damped Trend     0.1931   4.5261   3.3407  -0.0066   7.6027   0.4772  -0.0182
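For readers unfamiliar with the column headings, the first five measures can be sketched in Python as below, assuming the usual definitions with error = actual minus fitted. MASE and ACF1 are left out to keep the sketch short. The function name `accuracy` and the toy numbers are hypothetical and are not taken from the analysis.

```python
from math import sqrt


def accuracy(actual, fitted):
    """ME, RMSE, MAE, MPE and MAPE, with error defined as actual minus fitted."""
    errors = [a - f for a, f in zip(actual, fitted)]
    pct_errors = [100.0 * e / a for e, a in zip(errors, actual)]
    n = len(errors)
    return {
        "ME": sum(errors) / n,                          # mean error (bias)
        "RMSE": sqrt(sum(e * e for e in errors) / n),   # root mean squared error
        "MAE": sum(abs(e) for e in errors) / n,         # mean absolute error
        "MPE": sum(pct_errors) / n,                     # mean percentage error
        "MAPE": sum(abs(p) for p in pct_errors) / n,    # mean absolute percentage error
    }


# Made-up values in thousands of homes, for illustration only.
toy_actual = [50.0, 40.0, 60.0, 50.0]
toy_fitted = [45.0, 44.0, 54.0, 50.0]
scores = accuracy(toy_actual, toy_fitted)
```

ME and MPE keep the sign of the errors, so values near zero indicate little systematic bias, while RMSE, MAE, and MAPE measure the size of the errors regardless of direction.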

5 Prediction Graphs

The following two graphs give a visualization of the predicted values for the 4 seasonal and 3 non-seasonal smoothing methods. Compared to the actual values for these 12 months from the testing data, all prediction methods underestimate the number of houses sold. This underestimation may be explained by a brief downward trend in housing sales during the period corresponding to the COVID-19 pandemic. The period of time covered in the testing data seems to be either a reversal of this trend or a stabilization of it. I am not confident that exponential smoothing is the optimal method for making predictions on this time series, as the latest and most heavily weighted data is somewhat anomalous.

6 Testing Accuracy Measures

The following table gives the MSE and MAPE for the 7 models’ predictions of the testing data. From these scores we can determine that the model built using the Holt-Winters additive method is the superior model, with the lowest MSE (90.32715) and the lowest MAPE (18.78776). All 4 Holt-Winters variations were much more accurate than the three non-seasonal models (simple, Holt additive, and damped Holt additive). This is unsurprising, as home sales show significant seasonality.

The accuracy measures of the 7 exponential smoothing methods based on the test data

Model                  MSE       MAPE
SES              200.37555   29.83186
Holt.Add         175.22224   27.02520
Holt.Add.Damp    206.90461   30.53098
HW.Add            90.32715   18.78776
HW.Exp           113.51545   21.12711
HW.Add.Damp      104.62960   20.70156
HW.Exp.Damp       96.58620   19.79212
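The selection step can be checked directly from the test-set table. The Python sketch below copies the reported MSE and MAPE values and picks the model that minimises each; the variable names are hypothetical.

```python
from math import sqrt

# (MSE, MAPE) on the 12-month test set, copied from the table above.
test_scores = {
    "SES": (200.37555, 29.83186),
    "Holt.Add": (175.22224, 27.02520),
    "Holt.Add.Damp": (206.90461, 30.53098),
    "HW.Add": (90.32715, 18.78776),
    "HW.Exp": (113.51545, 21.12711),
    "HW.Add.Damp": (104.62960, 20.70156),
    "HW.Exp.Damp": (96.58620, 19.79212),
}

best_by_mse = min(test_scores, key=lambda m: test_scores[m][0])
best_by_mape = min(test_scores, key=lambda m: test_scores[m][1])

# Models ordered from most to least accurate by test MSE.
ranking = sorted(test_scores, key=lambda m: test_scores[m][0])

# RMSE of the winner, in the data's own units (thousands of homes).
best_rmse = sqrt(test_scores[best_by_mse][0])
```

Both criteria agree on HW.Add, and the four seasonal models occupy the top four places in the MSE ranking. Taking the square root of the winning MSE gives a typical test error of roughly 9.5 thousand homes per month.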

7 Final Model and Its Parameters

After selecting the Holt-Winters additive method as the most accurate exponential smoothing method, we refit the model using the training and testing data together. The estimated smoothing parameters for this refitted model are given in the following table.

Estimated values of the smoothing parameters in Holt-Winters linear trend with additive seasonality

Parameter      Value
alpha      0.8333481
beta       0.0001255
gamma      0.0001056
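To show how three such parameters drive the forecasts, here is a minimal Python sketch of additive Holt-Winters written in error-correction form. Two assumptions are worth stating loudly: that the reported alpha, beta, and gamma follow this parameterisation, and that crude initial states taken from the first two seasons are acceptable. The function name and the toy series are hypothetical, and the sketch is not the fitted model from this analysis.

```python
def holt_winters_additive(y, period, alpha, beta, gamma, horizon):
    """Point forecasts from additive Holt-Winters in error-correction form."""
    # Assumption: crude initial states from the first two full seasons.
    first, second = y[:period], y[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    season = [obs - level for obs in first]  # one entry per position in the cycle

    for t, obs in enumerate(y):
        i = t % period
        error = obs - (level + trend + season[i])  # one-step-ahead error
        level = level + trend + alpha * error      # level update
        trend = trend + beta * error               # trend update
        season[i] = season[i] + gamma * error      # seasonal update

    n = len(y)
    return [level + h * trend + season[(n + h - 1) % period]
            for h in range(1, horizon + 1)]


# Made-up quarterly-style series: constant level 50 plus a repeating pattern.
pattern = [-5.0, 5.0, 10.0, -10.0]
toy = [50.0 + pattern[t % 4] for t in range(12)]
forecasts = holt_winters_additive(toy, period=4, alpha=0.8333481,
                                  beta=0.0001255, gamma=0.0001056, horizon=4)
```

Read this way, an alpha near 0.83 means the level follows the most recent observations closely, while beta and gamma near zero mean the trend and the seasonal pattern are updated very little after initialisation.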