This project aims to forecast the electrical power consumption of three individual households, randomly selected from a database of 33. To achieve this, the study employs a combination of Multivariate Analysis of Variance (MANOVA), clustering techniques, and both Kalman and Particle filters.
The structure of the study is as follows:
Chapter 2 describes the data collection process, outlines the variables used, and assesses data quality. Chapter 3 details the strategy adopted for handling missing values and presents the interpolation method and two distinct smoothing approaches, followed by the specification of the Multivariate Regression Models. It also introduces the Akaike Information Criterion (AIC), which is used in this project to compare additive against interaction-based specifications, and a complete weekday structure against a binary weekday-weekend specification. The chapter also elaborates on the clustering methodology and the filtering techniques used for forecasting. Chapter 4 presents the empirical results. The report concludes with a summary of the main findings and recommendations for future research.
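As an illustration of how an AIC comparison between two specifications works, the following minimal Python sketch computes AIC = 2k - 2 ln(L) from regression residuals under a univariate Gaussian error assumption. The residual values, the parameter counts, and the function names are hypothetical and chosen only for the example; they are not taken from the models estimated in this study.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood

def gaussian_log_likelihood(residuals):
    """Maximised Gaussian log-likelihood implied by a set of residuals."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n  # ML estimate of the error variance
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)

# Hypothetical residuals from two competing specifications (illustrative only)
additive_resid = [0.5, -0.4, 0.3, -0.6, 0.2, -0.1]
interaction_resid = [0.45, -0.4, 0.3, -0.55, 0.2, -0.1]

# Hypothetical parameter counts: the interaction model fits slightly better
# but pays a larger penalty for its extra terms
aic_additive = aic(gaussian_log_likelihood(additive_resid), n_params=3)
aic_interaction = aic(gaussian_log_likelihood(interaction_resid), n_params=6)

preferred = "additive" if aic_additive < aic_interaction else "interaction"
print(preferred)
```

In this toy case the small gain in fit from the interaction terms does not offset the penalty for the additional parameters, so the additive specification has the lower AIC.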
The results indicate that, for the first smoothing method using the stacked multivariate model with household fixed effects, a simpler additive specification outperforms models that include interaction terms. At the individual household level, a binary weekday-weekend specification proves more effective than a full weekday-level model.
When using the second smoothing approach, the evidence suggests that interaction effects are relevant for explaining the variability of electricity consumption in the stacked specification, and that a complete weekly structure is preferable to a binary weekday-weekend specification.
K-means clustering identifies 4 types of behavior for weekdays and 2 for weekends when using the first smoothing filter, and 2 and 3, respectively, when applying the second smoothing technique. Two households form clusters of their own, with behavior very distant from the patterns of the majority of households. The randomly chosen households, with IDs 38, 19 and 23, were classified into behaviors in line with the dominant clusters.
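The way K-means can isolate an atypical household in a cluster of its own can be sketched as follows. This is a minimal pure-Python version of Lloyd's algorithm with fixed starting centroids; the four-point daily profiles are hypothetical, and the function name `kmeans` and its parameters are assumptions made for illustration rather than the implementation used in the study.

```python
def kmeans(profiles, centroids, n_iter=20):
    """Lloyd's algorithm on equal-length profiles, starting from given centroids."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(profiles)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance
        labels = [
            min(range(len(centroids)),
                key=lambda j: sum((x - c) ** 2 for x, c in zip(p, centroids[j])))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean of its members
        for j in range(len(centroids)):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical average load profiles (e.g. night, morning, afternoon, evening)
profiles = [
    [0.20, 0.30, 0.80, 0.90],  # evening-heavy household
    [0.25, 0.35, 0.85, 0.95],  # evening-heavy household
    [0.90, 0.80, 0.30, 0.20],  # morning-heavy household
    [0.85, 0.75, 0.25, 0.30],  # morning-heavy household
    [3.00, 3.10, 2.90, 3.20],  # household far from all the others
]

labels, centroids = kmeans(profiles, [profiles[0], profiles[2], profiles[4]])
print(labels)
```

With these inputs the last household remains alone in its cluster, while the other four split into two groups according to the shape of their profile.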
Regarding forecasting performance at a 15-minute frequency, the Kalman filter consistently shows a lower Root Mean Square Error (RMSE) than the Particle filter for both smoothing filters applied. The Kalman filter with the load profile also performs better than the Kalman filter without it. For 2 of the 3 households, the second smoothing method yields a lower RMSE than the first. Both the Kalman and Particle filters are more accurate than the Naïve benchmark, which considers only the immediate past. In terms of Mean Absolute Error (MAE), the Particle filter with the second smoothing method shows the best performance for 2 of the 3 households. For the remaining one, the Kalman filter without the load profile outperforms the other approaches. There is no evidence of normally distributed errors in the multivariate regressions under either the additive or the interaction specifications. Similarly, the residuals of the univariate models based on the Kalman and Particle filters are not normally distributed for either smoothed series.
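For reference, the two accuracy measures and the Naïve benchmark can be sketched in a few lines of Python. The readings below are hypothetical 15-minute values, and the function names are illustrative assumptions; the sketch shows only how the metrics are computed, not the filters themselves.

```python
import math

def rmse(actual, predicted):
    """Root Mean Square Error: penalises large errors more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def naive_forecast(series):
    """One-step-ahead naive forecast: each value is predicted by the previous one."""
    return series[:-1]

# Hypothetical 15-minute consumption readings (illustrative only)
load = [1.0, 1.5, 1.0, 2.0, 2.0]

actual = load[1:]                 # values to be predicted
predicted = naive_forecast(load)  # previous reading used as the forecast

print(rmse(actual, predicted), mae(actual, predicted))
```

Because RMSE squares the errors before averaging, a method can rank first on RMSE and not on MAE when its errors are few but large, which is consistent with the two metrics favoring different filters.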
RMSE and MAE by Household and filter specification for first smoothing
RMSE and MAE by Household and filter specification for second smoothing