This data comes from the United States Census Bureau and contains monthly counts of new single-family homes sold from January 1963 to October 2023, in thousands of homes. The data is not seasonally adjusted. A single-family home is any house intended for only one family to live in. The data set covers only houses that are on the market for the first time.
The objective of this analysis is to fit, test, and compare several exponential smoothing methods to determine which is most accurate for predicting future new home sales. Accuracy is judged by each model's MSE and MAPE in predicting the last 12 months of the data set, which are held out as testing data and not used in model building. The seven candidate methods are simple exponential smoothing, Holt's linear trend method with and without damping, and the Holt-Winters additive and multiplicative methods, each with and without a damped trend.
The smoothing and damping parameters for these models are estimated using the forecast package’s optimization routine, with the default values as starting points.
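To make the idea of parameter estimation concrete, here is a minimal Python sketch of simple exponential smoothing in which the smoothing parameter is chosen by minimizing the in-sample sum of squared one-step errors. It is an illustration only: the analysis itself relies on the forecast package in R, whose optimizer is not reproduced here. The grid search, the choice of the first observation as the initial level, the function names, and the toy series are all assumptions made for this example.

```python
def ses_one_step_forecasts(y, alpha):
    """One-step-ahead simple exponential smoothing forecasts for each point in y.

    Assumption: the initial level is set to the first observation.
    """
    level = y[0]
    forecasts = []
    for obs in y:
        forecasts.append(level)                     # forecast made before seeing obs
        level = alpha * obs + (1 - alpha) * level   # update level with the new obs
    return forecasts


def sse(y, alpha):
    """Sum of squared one-step forecast errors for a given alpha."""
    preds = ses_one_step_forecasts(y, alpha)
    return sum((obs - p) ** 2 for obs, p in zip(y, preds))


def fit_alpha(y, grid_size=99):
    """Pick alpha from an evenly spaced grid in (0, 1) that minimizes the SSE.

    A grid search is a simple stand-in for a numerical optimizer.
    """
    candidates = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(candidates, key=lambda a: sse(y, a))


# Toy series (made up for illustration, not the Census data)
toy_sales = [40, 42, 45, 43, 47, 50, 48, 52, 55, 53]
best_alpha = fit_alpha(toy_sales)
print(best_alpha, sse(toy_sales, best_alpha))
```

The same principle, minimizing a fit criterion over the parameters, extends to the trend, seasonal, and damping parameters of the other six methods.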
Plotting the full data set reveals a dramatic drop in new home sales during 2008. This was expected, as the global financial crisis of 2008 centered on home mortgage defaults and the bursting of the US housing bubble. Because this global event caused such a seismic shift in the housing industry, it would be improper to use data from before the collapse to predict future sales. For the rest of the analysis we concern ourselves only with housing data from January 2009 to October 2023.
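As a sanity check on the window sizes, the sketch below enumerates the months in the retained period and splits off the final 12 as the hold-out set. It is written in Python for illustration; the helper name `month_range` and the `YYYY-MM` label format are assumptions for this example, not part of the original analysis.

```python
def month_range(start_year, start_month, end_year, end_month):
    """List 'YYYY-MM' labels for every month from start to end, inclusive."""
    labels = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        labels.append(f"{year:04d}-{month:02d}")
        month += 1
        if month > 12:               # roll over into the next year
            year, month = year + 1, 1
    return labels


HOLDOUT = 12  # number of months reserved for testing

months = month_range(2009, 1, 2023, 10)
train_months = months[:-HOLDOUT]     # used for model building
test_months = months[-HOLDOUT:]      # held out for the accuracy comparison

print(len(months), len(train_months), len(test_months))
print(train_months[0], train_months[-1], test_months[0], test_months[-1])
```

This gives 178 monthly observations in total, 166 for training (January 2009 to October 2022) and 12 for testing (November 2022 to October 2023).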
The following table gives the accuracy statistics for the 7 candidate exponential smoothing methods. These accuracy measures are based solely on the training data, which ranges from January 2009 to October 2022. As such, this table will not be used to determine the best model. That will be decided by a later test of how accurately these models predict the testing data, which ranges from November 2022 to October 2023. Based on this table, the best-fitting method on the training data is Holt Winters Multiplicative with damping, which has the lowest RMSE, MAE, MAPE, and MASE. All the Holt Winters methods fit the training data more closely than their non-seasonal alternatives.
| Method | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Simple | 0.1149 | 5.3890 | 4.1269 | -0.3242 | 9.2377 | 0.5896 | -0.0004 |
| Holt | -0.0712 | 5.4097 | 4.1682 | -0.8562 | 9.4233 | 0.5955 | -0.0037 |
| Holt Damped | 0.0668 | 5.3913 | 4.1188 | -0.4761 | 9.2156 | 0.5884 | 0.0014 |
| Holt Winters’ Additive | 0.0018 | 4.5653 | 3.4063 | -0.5494 | 7.7008 | 0.4866 | 0.0145 |
| Holt Winters’ Multiplicative | 0.2249 | 5.0045 | 3.5468 | 0.0847 | 7.7123 | 0.5067 | 0.2544 |
| HW Additive Damped Trend | 0.1605 | 4.5708 | 3.4389 | -0.0692 | 7.7834 | 0.4913 | 0.0097 |
| HW Multiplicative Damped Trend | 0.1931 | 4.5261 | 3.3407 | -0.0066 | 7.6027 | 0.4772 | -0.0182 |
The following two graphs visualize the predicted values for the 4 seasonal and 3 non-seasonal smoothing methods. Compared to the actual values for these 12 months from the testing data, all prediction methods underestimate the number of houses sold. This underestimation may be explained by a brief downward trend in housing sales during the period corresponding to the COVID-19 pandemic. The period covered by the testing data seems to be either a reversal of this trend or a stabilization of it. I am not confident that exponential smoothing is the optimal method for making predictions on this time series, as the latest and most heavily weighted data is somewhat anomalous.
The following table gives the MSE and MAPE for each of the 7 models when predicting the testing data. From these scores we can determine that the model built using the Holt Winters Additive method is the superior model, with the lowest MSE (90.32715) and the lowest MAPE (18.78776). All 4 Holt Winters variations were much more accurate than the three non-seasonal models (simple, Holt, and damped Holt). This is unsurprising, as home sales do have some significant seasonality.
| Model | MSE | MAPE |
|---|---|---|
| SES | 200.37555 | 29.83186 |
| Holt.Add | 175.22224 | 27.02520 |
| Holt.Add.Damp | 206.90461 | 30.53098 |
| HW.Add | 90.32715 | 18.78776 |
| HW.Exp | 113.51545 | 21.12711 |
| HW.Add.Damp | 104.62960 | 20.70156 |
| HW.Exp.Damp | 96.58620 | 19.79212 |
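For reference, the sketch below shows how the two test-set metrics are conventionally computed and confirms the ranking from the scores in the table. It is a Python illustration under stated assumptions: the function names are hypothetical, MAPE is expressed in percent, and the `scores` dictionary simply copies the values from the table rather than recomputing them from the underlying forecasts.

```python
def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Test-set scores copied from the table: model -> (MSE, MAPE)
scores = {
    "SES": (200.37555, 29.83186),
    "Holt.Add": (175.22224, 27.02520),
    "Holt.Add.Damp": (206.90461, 30.53098),
    "HW.Add": (90.32715, 18.78776),
    "HW.Exp": (113.51545, 21.12711),
    "HW.Add.Damp": (104.62960, 20.70156),
    "HW.Exp.Damp": (96.58620, 19.79212),
}

best_by_mse = min(scores, key=lambda name: scores[name][0])
best_by_mape = min(scores, key=lambda name: scores[name][1])
print(best_by_mse, best_by_mape)
```

Both criteria select the same model, so the choice does not depend on which of the two metrics is given priority.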
Having selected the Holt Winters Additive method as the most accurate exponential smoothing method, we refit the model using the training and testing data combined. The optimal smoothing parameters estimated for this model are given in the following table.
| Parameter | Value |
|---|---|
| alpha | 0.8333481 |
| beta | 0.0001255 |
| gamma | 0.0001056 |
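To help interpret these values, the sketch below applies a single update of the textbook additive Holt-Winters recursions using the parameters rounded from the table. It is a Python illustration under assumptions: the forecast package may parameterize the trend smoothing differently from the textbook form used here, and the starting level, trend, seasonal index, and observation are hypothetical numbers chosen for the example.

```python
# Smoothing parameters rounded from the table
ALPHA, BETA, GAMMA = 0.8333, 0.0001, 0.0001


def hw_additive_update(y, level, trend, season,
                       alpha=ALPHA, beta=BETA, gamma=GAMMA):
    """One step of the textbook additive Holt-Winters recursions.

    `season` is the seasonal index estimated one year earlier for this month.
    Returns the updated (level, trend, seasonal index).
    """
    new_level = alpha * (y - season) + (1 - alpha) * (level + trend)
    new_trend = beta * (new_level - level) + (1 - beta) * trend
    new_season = gamma * (y - level - trend) + (1 - gamma) * season
    return new_level, new_trend, new_season


# Hypothetical state and observation, in thousands of homes
level, trend, season = 50.0, 0.2, 3.0
observed = 60.0                                   # one-step forecast was 53.2

new_level, new_trend, new_season = hw_additive_update(observed, level, trend, season)
print(new_level, new_trend, new_season)
```

With alpha near 0.83, most of the one-step forecast error is absorbed into the level, so the level follows recent observations closely. With beta and gamma near zero, the trend and seasonal components barely move from their initial estimates, which suggests a nearly fixed slope and a stable seasonal pattern over the fitted period.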