To provide a reliable business-ready forecast for the 2014 residential power load, the dataset was examined for quality and structural integrity. The raw data spans 192 months (January 1998 – December 2013).
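The audit identified three issues: dates stored as character strings, a missing value in September 2008, and an anomalous reading in July 2010.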
The character-based “YYYY-MMM” strings were converted into a formal month-year index to create a structured time series (tsibble).
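The conversion step can be sketched as follows. This is a minimal Python illustration, not the code used for the analysis (which built a tsibble); the function names `parse_year_month` and `month_index` are hypothetical, and the labels are assumed to use English three-letter month abbreviations such as "1998-Jan".

```python
# Map three-letter month abbreviations to month numbers (assumed English, e.g. "Jan").
MONTHS = {m: i for i, m in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}


def parse_year_month(label):
    """Convert a 'YYYY-MMM' string such as '1998-Jan' into (year, month)."""
    year_text, month_text = label.strip().split("-")
    return int(year_text), MONTHS[month_text.title()]


def month_index(label, origin="1998-Jan"):
    """Number of months elapsed since the origin label (the origin itself is 0)."""
    year, month = parse_year_month(label)
    origin_year, origin_month = parse_year_month(origin)
    return (year - origin_year) * 12 + (month - origin_month)
```

An integer month index of this kind makes gaps easy to spot: any index missing between the first and last observation corresponds to a missing month.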
For the missing value in September 2008, a Time Series Linear Model (TSLM) was used. This method estimates the missing point from a regression fitted to the rest of the series rather than from a simple average.
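A regression-based fill of this kind can be sketched in plain Python. The sketch assumes the regressors are an intercept, a linear trend, and month-of-year dummies, which is a common TSLM specification; the source does not list the regressors actually used, and the function names here are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def design_row(t):
    """Regressors for month t: intercept, linear trend, 11 month-of-year dummies."""
    dummies = [1.0 if t % 12 == k else 0.0 for k in range(1, 12)]
    return [1.0, float(t)] + dummies


def impute_missing(values):
    """Fill None entries with fitted values from an OLS trend + season regression."""
    observed = [(design_row(t), y) for t, y in enumerate(values) if y is not None]
    p = len(observed[0][0])
    # Normal equations (X'X) beta = X'y, built from the observed rows only.
    xtx = [[sum(x[i] * x[j] for x, _ in observed) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * y for x, y in observed) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    return [y if y is not None
            else sum(b * v for b, v in zip(beta, design_row(t)))
            for t, y in enumerate(values)]
```

Unlike a simple average, the fitted value reflects both where the missing month sits in the long-run trend and the usual level for that calendar month.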
The data contained a sharp drop in July 2010 (~770k KWH), which is 90% below typical summer demand. This was identified as a recording error rather than a genuine change in demand.
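One simple way to surface a reading like this automatically is to compare each month against the median of the same calendar month in other years. The sketch below is illustrative only: the function name `flag_low_outliers` and the 0.5 threshold are assumptions, not the rule used in the analysis.

```python
from statistics import median


def flag_low_outliers(values, period=12, threshold=0.5):
    """Return indices whose value falls below threshold x the same-month median."""
    flagged = []
    for t, y in enumerate(values):
        # Values for the same calendar month in all other years.
        same_month = [v for i, v in enumerate(values)
                      if i % period == t % period and i != t and v is not None]
        if y is not None and same_month and y < threshold * median(same_month):
            flagged.append(t)
    return flagged
```

Comparing within the same calendar month keeps ordinary seasonal lows from being flagged, so only readings that are unusual for their time of year stand out.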
To normalize the series:
The plot shows consistent growth in residential consumption until roughly 2008. After 2008, consumption stabilizes, likely due to improved energy efficiency standards.
Two distinct annual surges appear: a winter peak around January and a summer peak around August.
Repeating peaks at lags 12, 24, and 36 confirm a strong 12-month annual seasonal cycle.
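The lag-based check can be reproduced with a few lines of Python. This is a minimal sketch of the standard sample autocorrelation; the function name `autocorrelation` is illustrative and the series is assumed to be a plain list of monthly values with no gaps.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation at the given lag (the estimator used in ACF plots)."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    numer = sum((values[t] - mean) * (values[t - lag] - mean) for t in range(lag, n))
    return numer / denom


def seasonal_lags(values, period=12, cycles=3):
    """Autocorrelation at multiples of the period, e.g. lags 12, 24 and 36."""
    return {period * k: autocorrelation(values, period * k) for k in range(1, cycles + 1)}
```

For a series with a 12-month cycle, the values returned by `seasonal_lags` are clearly positive and larger than those at neighbouring lags, which is the pattern described above.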
Used to handle:
Used to:
The ARIMA model was selected due to a significantly lower AICc:
A lower AICc indicates a better trade-off between goodness of fit and model complexity for this load series.
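For reference, AICc is the usual AIC plus a small-sample correction. The sketch below computes it from a model's log-likelihood; the function name and arguments are illustrative, and in practice the value is read directly from the fitted model output.

```python
def aicc(log_likelihood, num_params, num_obs):
    """Corrected AIC: AIC plus a small-sample penalty that grows with num_params."""
    if num_obs - num_params - 1 <= 0:
        raise ValueError("AICc needs num_obs > num_params + 1")
    aic = 2 * num_params - 2 * log_likelihood
    return aic + (2 * num_params * (num_params + 1)) / (num_obs - num_params - 1)
```

Only the difference between AICc values matters, and the comparison is meaningful only between models fitted to the same observations.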
To maintain forecast quality, simpler methods were excluded:
Projected Total Annual Load (2014):
94,621,199 KWH
Lower demand expected during:
These represent mild weather periods with minimal climate control usage.
p-value = 0.67
Since the p-value exceeds 0.05, the hypothesis of uncorrelated residuals is not rejected. The residuals are consistent with white noise, which suggests the model captured the main patterns in the data.
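The source does not name the residual test used. A common choice for this kind of white-noise check is the Ljung-Box test, whose statistic is shown here under that assumption, with n the number of observations, h the number of lags tested, and the hatted rho the residual autocorrelation at lag k:

```latex
Q = n\,(n+2) \sum_{k=1}^{h} \frac{\hat{\rho}_k^{\,2}}{n-k}
```

Under the null hypothesis of uncorrelated residuals, Q approximately follows a chi-squared distribution with h minus K degrees of freedom, where K is the number of estimated model parameters. A large p-value means Q is small relative to that distribution.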
p-value = 0.01
The low p-value rejects the null hypothesis of stationarity, indicating that the raw data was non-stationary and justifying the use of ARIMA differencing.
By identifying and correcting the 2010 recording anomaly and applying a Seasonal ARIMA model, a robust 2014 forecast was generated.
Seasonal swings remain the primary driver of residential demand, requiring higher capacity reserves in January and August to maintain grid stability.