library(prophet)
library(dplyr)
library(ggplot2)
library(lubridate)
library(knitr)
library(htmltools)

This project studies electricity demand during 2019 using Meta’s Prophet forecasting model in R. The aim is to understand the main features of the series, such as trend and seasonality, and then produce a short-term forecast.
Electricity demand is a useful time series because it often shows strong seasonal patterns. Demand may vary by month, by week, and sometimes by special periods such as holidays. A forecasting model can help describe these patterns and predict future values.
The year 2019 was chosen because it is a pre-COVID period and therefore avoids the unusual demand behaviour seen during the pandemic.
The Prophet package (Meta, 2023) was installed using the standard R installation procedure and loaded using library(prophet).
The data was obtained from the National Energy System Operator (NESO, 2019) and imported from a CSV file. Although the original dataset contains half-hourly settlement-period observations, the data was aggregated to daily totals before modelling. This was done to simplify the analysis and focus on broader seasonal patterns, particularly weekly and yearly effects. Aggregating to the daily level also reduces short-term noise and computational complexity, making the Prophet model more stable and easier to interpret. As a result, intraday patterns (such as hourly demand fluctuations) are not captured in this study.
# Read the 2019 NESO demand data from CSV
raw_demand_data <- read.csv("data/demanddata_2019.csv")

# Parse the settlement date and sum the ND column to daily totals
electricity_data <- raw_demand_data %>%
  mutate(ds = as.Date(SETTLEMENT_DATE, format = "%d-%b-%Y")) %>%
  group_by(ds) %>%
  summarise(y = sum(ND), .groups = "drop") %>%
  arrange(ds)

| ds | y |
|---|---|
| 2019-01-01 | 1327420 |
| 2019-01-02 | 1681986 |
| 2019-01-03 | 1790227 |
| 2019-01-04 | 1793839 |
| 2019-01-05 | 1648905 |
| 2019-01-06 | 1571556 |

| Statistic | Value |
|---|---|
| Start date | 2019-01-01 |
| End date | 2019-12-31 |
| Minimum demand | 975974 |
| Maximum demand | 1924096 |
| Mean demand | 1409167 |
The dataset contains daily electricity demand observations for the full 2019 calendar year with no missing values after cleaning.
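One way to confirm the completeness claim is to compare the observed dates against a full calendar sequence. The sketch below uses a synthetic stand-in data frame (`daily`, with randomly generated values) rather than the NESO data, so the numbers it produces are illustrative only; the variable names are assumptions introduced for this example.

```r
# Synthetic stand-in for the daily series (values are random, not NESO data)
set.seed(1)
ds <- seq(as.Date("2019-01-01"), as.Date("2019-12-31"), by = "day")
daily <- data.frame(ds = ds, y = round(runif(length(ds), 1.0e6, 1.9e6)))

# Every calendar day between the first and last date should appear exactly once
expected_days <- seq(min(daily$ds), max(daily$ds), by = "day")
missing_days <- expected_days[!expected_days %in% daily$ds]

n_missing_days <- length(missing_days)
n_duplicates <- sum(duplicated(daily$ds))
n_na <- sum(is.na(daily$y))

# Summary table in the same layout as the one reported above
summary_stats <- data.frame(
  Statistic = c("Start date", "End date", "Minimum demand",
                "Maximum demand", "Mean demand"),
  Value = c(format(min(daily$ds)), format(max(daily$ds)),
            format(min(daily$y)), format(max(daily$y)),
            format(round(mean(daily$y))))
)
```

If `n_missing_days`, `n_duplicates`, and `n_na` are all zero, the series has one valid observation per day.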
The time series shows a clear seasonal pattern in electricity demand across 2019. Demand peaks in winter (≈1.8–1.9 million) due to heating needs and declines in summer (≈1.1–1.3 million). This reflects strong annual seasonality driven by temperature. There are also regular weekly fluctuations, likely due to lower weekend demand. The consistent yearly and weekly patterns make the data well suited for models like Prophet. There is no clear long-term trend; the series is mainly seasonality-driven. However, day-to-day variability may still create short-term forecasting challenges. Overall, the data is suitable for forecasting using seasonal decomposition methods.
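The suggestion that weekend demand is lower can be checked by averaging demand by day of the week. This is a minimal sketch on synthetic data in which a weekend drop is built in by construction; `electricity_demo` and the size of the drop are assumptions for illustration, not values from the NESO dataset.

```r
library(dplyr)
library(lubridate)

# Synthetic daily series with a deliberate weekend reduction (illustrative only)
set.seed(42)
ds <- seq(as.Date("2019-01-01"), as.Date("2019-12-31"), by = "day")
is_weekend <- wday(ds, week_start = 1) >= 6   # 6 = Saturday, 7 = Sunday
y <- 1.4e6 - 1.5e5 * is_weekend + rnorm(length(ds), sd = 2e4)
electricity_demo <- data.frame(ds = ds, y = y)

# Mean demand for each day of the week, ordered Monday to Sunday
weekday_means <- electricity_demo %>%
  mutate(weekday = wday(ds, label = TRUE, week_start = 1)) %>%
  group_by(weekday) %>%
  summarise(mean_demand = mean(y), .groups = "drop")
```

Applied to the real series, the same grouping would show whether Saturday and Sunday means sit below the weekday means.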
The smoothed trend line, fitted with LOESS via geom_smooth(), provides a clearer view of the underlying movement in electricity demand by filtering out short-term fluctuations. It reveals a pronounced U-shaped pattern over the year, with demand steadily declining from winter into the summer months before increasing again towards the end of the year. This confirms that the dominant structure in the series is driven by annual seasonality rather than a persistent long-term trend.
The absence of a consistent upward or downward trajectory suggests that there is no strong global trend in the data. Instead, the apparent “trend” is largely seasonal in nature. This distinction is important for modelling, as it indicates that the variation in demand should be captured primarily through seasonal components rather than a strong trend component.
Additionally, the gap between the smoothed line and the raw data highlights the presence of substantial short-term variability, which is likely due to weekly patterns. This further supports the inclusion of weekly seasonality in the Prophet model.
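A smoothed line of this kind can be drawn with ggplot2. The sketch below is an assumption about how such a plot might be produced, using a synthetic series (`demo`) with a built-in annual cycle; it is not the report's original plotting code.

```r
library(ggplot2)

# Synthetic daily series with an annual cycle plus noise (illustrative only)
set.seed(7)
ds <- seq(as.Date("2019-01-01"), as.Date("2019-12-31"), by = "day")
day_index <- as.numeric(ds - min(ds))
y <- 1.4e6 + 3e5 * cos(2 * pi * day_index / 365) + rnorm(length(ds), sd = 5e4)
demo <- data.frame(ds = ds, y = y)

# Raw series in grey with a LOESS smooth on top; giving the formula
# explicitly avoids the informational message geom_smooth() prints
smooth_plot <- ggplot(demo, aes(x = ds, y = y)) +
  geom_line(colour = "grey60") +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE) +
  labs(x = "Date", y = "Daily demand (synthetic)")
```

Printing `smooth_plot` renders the figure; the vertical distance between the grey line and the smooth corresponds to the short-term variability discussed above.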
The distribution of daily electricity demand is approximately bell-shaped, indicating that most observations are concentrated around a central level rather than being widely dispersed. The majority of demand values lie between roughly 1.2 million and 1.6 million, suggesting a relatively stable average level of consumption throughout the year.
There is a slight right skew in the distribution, with a longer tail extending towards higher demand values. This reflects occasional periods of unusually high electricity demand, likely occurring during colder winter days when heating usage increases. In contrast, extremely low demand values are less frequent, indicating that demand does not drop as sharply during low-consumption periods.
The absence of extreme outliers suggests that the data is relatively well-behaved and suitable for modelling without the need for heavy preprocessing. However, the spread of the distribution indicates moderate variability, which is consistent with the seasonal patterns observed in the time series. Overall, the distribution supports the use of an additive modelling approach, as the variability appears relatively constant across the range of values.
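The visual impression of a slight right skew can be backed by a number. A minimal sketch, assuming a simple moment-based skewness coefficient is an acceptable measure; the helper name `sample_skewness` is hypothetical and the input vector is made up for illustration.

```r
# Moment-based skewness: third central moment divided by the
# second central moment raised to the power 1.5
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  centred <- x - mean(x)
  mean(centred^3) / mean(centred^2)^1.5
}

# Illustrative values only: a symmetric vector and one with a long right tail
symmetric_example <- c(1, 2, 3)
right_tailed_example <- c(1, 1, 1, 10)

skew_symmetric <- sample_skewness(symmetric_example)
skew_right <- sample_skewness(right_tailed_example)
```

A positive value indicates a longer right tail, so applying the function to the daily demand column would be expected to return a small positive number if the description above holds.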
| month | mean_demand |
|---|---|
| Jan | 1701050 |
| Feb | 1589620 |
| Mar | 1465819 |
| Apr | 1369430 |
| May | 1283146 |
| Jun | 1230900 |
| Jul | 1235638 |
| Aug | 1186566 |
| Sep | 1251606 |
| Oct | 1430328 |
| Nov | 1617246 |
| Dec | 1560710 |
The monthly averages provide a clearer and more structured view of the seasonal pattern by removing daily fluctuations. Unlike the daily time series, which contains substantial short-term variability, this aggregated plot highlights a smooth and consistent annual cycle. Demand decreases steadily from January to August before rising again towards the end of the year, forming a pronounced U-shaped pattern.
This plot makes the strength and stability of the seasonal pattern more evident, showing that changes in demand occur gradually across months rather than abruptly. The relatively uniform differences between months suggest that seasonal effects are consistent in magnitude, supporting the use of an additive seasonal model. By reducing noise, the monthly aggregation reinforces the conclusion that electricity demand is primarily driven by predictable seasonal factors.
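Monthly averages like those in the table can be computed by grouping on the calendar month. The sketch below assumes dplyr and lubridate, as loaded at the top of the report, and runs on a noise-free synthetic series (`demo`) rather than the NESO data.

```r
library(dplyr)
library(lubridate)

# Synthetic daily series with a smooth annual cycle (illustrative only)
ds <- seq(as.Date("2019-01-01"), as.Date("2019-12-31"), by = "day")
day_index <- as.numeric(ds - min(ds))
demo <- data.frame(
  ds = ds,
  y = 1.4e6 + 3e5 * cos(2 * pi * day_index / 365)
)

# Average demand per calendar month, ordered January to December
monthly_means <- demo %>%
  mutate(month = month(ds, label = TRUE)) %>%
  group_by(month) %>%
  summarise(mean_demand = mean(y), .groups = "drop")
```

Because the synthetic cycle bottoms out in mid-year, the lowest monthly mean falls in summer, mirroring the U-shape described above.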
Prophet is a decomposable time series model that represents a series as the sum of trend, seasonality, and random error. It is appropriate for this dataset because the exploratory analysis showed clear weekly and yearly seasonal patterns.
The model was fitted to the training data, with the final 30 days reserved as a test set for forecast evaluation. Weekly and yearly seasonalities were included, while daily seasonality was excluded because the data had already been aggregated to daily totals.
Prophet models the time series as:
\[ y(t) = g(t) + s(t) + \varepsilon_t \]
where \(g(t)\) represents the trend component, \(s(t)\) captures seasonal patterns (such as weekly and yearly effects), and \(\varepsilon_t\) is the random error term.
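Prophet's documentation describes each seasonal component \(s(t)\) as a Fourier series in \(t\). The sketch below builds such sine and cosine features in base R to show the idea. The helper name `fourier_terms` is hypothetical, and the orders used (3 for weekly, 10 for yearly) are Prophet's documented defaults rather than values stated in this report; this is an illustration, not Prophet's internal code.

```r
# Build cosine and sine features for a seasonal period.
# t: time index in days, period: cycle length in days, order: number of harmonics
fourier_terms <- function(t, period, order) {
  k <- seq_len(order)
  angles <- 2 * pi * outer(t, k) / period
  features <- cbind(cos(angles), sin(angles))
  colnames(features) <- c(paste0("cos_", k), paste0("sin_", k))
  features
}

t <- 0:364
weekly_features <- fourier_terms(t, period = 7, order = 3)
yearly_features <- fourier_terms(t, period = 365.25, order = 10)
```

The seasonal term is then a weighted sum of these columns, with the weights estimated during fitting. The weekly features repeat exactly every seven rows, which is what allows the model to reproduce a fixed weekday pattern.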
# Hold out the final 30 days as a test set
test_horizon <- 30

training_data <- electricity_data %>%
  slice(1:(n() - test_horizon))

test_data <- electricity_data %>%
  slice((n() - test_horizon + 1):n())

# Fit Prophet with weekly and yearly seasonality; daily seasonality is
# disabled because the series is already aggregated to daily totals
prophet_model <- prophet(
  training_data,
  daily.seasonality = FALSE,
  weekly.seasonality = TRUE,
  yearly.seasonality = TRUE
)

# Forecast over the training period plus the 30-day test horizon
future_dates <- make_future_dataframe(prophet_model, periods = test_horizon)
forecast_values <- predict(prophet_model, future_dates)

The forecast plot suggests that demand is mainly driven by seasonal patterns rather than a strong long-term trend. Weekly seasonality is clear, with higher demand during the working week and lower demand at weekends, while yearly seasonality shows higher demand in winter and lower demand in summer. Since the model is trained on only one year of data, the estimated trend should be interpreted with caution, as some seasonal effects may be absorbed into the trend component.
| ds | y | yhat | yhat_lower | yhat_upper | error | abs_error | squared_error |
|---|---|---|---|---|---|---|---|
| 2019-12-02 | 1782485 | 1721043 | 1641391 | 1802633 | 61441.83 | 61441.83 | 3775099060 |
| 2019-12-03 | 1749127 | 1763564 | 1682952 | 1842304 | -14436.66 | 14436.66 | 208417047 |
| 2019-12-04 | 1743089 | 1778155 | 1701982 | 1859527 | -35066.19 | 35066.19 | 1229637417 |
| 2019-12-05 | 1687929 | 1776630 | 1701593 | 1853554 | -88700.70 | 88700.70 | 7867815033 |
| 2019-12-06 | 1578355 | 1760064 | 1680160 | 1840025 | -181709.32 | 181709.32 | 33018277949 |
| 2019-12-07 | 1432582 | 1616360 | 1537268 | 1690135 | -183777.59 | 183777.59 | 33774203494 |

| MAE | RMSE |
|---|---|
| 228689.8 | 268991 |
The forecast accuracy metrics indicate that the model performs reasonably well in capturing the overall pattern of electricity demand. The Mean Absolute Error (MAE) of approximately 228,690 means that, on average, daily forecasts deviate from the observed values by around 229 thousand units, which is roughly 16% of the 2019 mean daily demand of 1,409,167. The Root Mean Squared Error (RMSE) of about 268,991 is higher than the MAE, reflecting the presence of some larger forecast errors. This indicates that while the model captures general trends and seasonal patterns, it occasionally struggles to predict sharper fluctuations in demand.
Overall, these results suggest that the Prophet model is suitable for short-term forecasting of electricity demand, although some limitations remain in capturing more extreme variations.
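The two metrics can be reproduced from the actual and forecast columns. As a worked illustration, the sketch below uses only the six rows shown in the evaluation table, with `yhat` as rounded there, so its results will not match the reported MAE and RMSE, which are based on the full 30-day test set.

```r
# First six test days, copied from the evaluation table above
evaluation <- data.frame(
  ds = as.Date(c("2019-12-02", "2019-12-03", "2019-12-04",
                 "2019-12-05", "2019-12-06", "2019-12-07")),
  y = c(1782485, 1749127, 1743089, 1687929, 1578355, 1432582),
  yhat = c(1721043, 1763564, 1778155, 1776630, 1760064, 1616360)
)

# Forecast error is observed minus predicted
evaluation$error <- evaluation$y - evaluation$yhat

# MAE averages the absolute errors; RMSE squares them first,
# so large misses carry more weight
mae <- mean(abs(evaluation$error))
rmse <- sqrt(mean(evaluation$error^2))
```

RMSE can never be smaller than MAE, and the gap between them widens when a few errors are much larger than the rest, as with the last two rows here.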
The make_future_dataframe() function extends the time series by 30 days beyond the observed data, and predict() generates forecasts with corresponding uncertainty intervals.
The resulting forecast represents expected electricity demand beyond the end of 2019, based on the patterns learned from the historical data. As in earlier results, the forecast reflects strong weekly and yearly seasonality, with lower demand on weekends and higher demand during colder periods.
A second Prophet model is fitted using the full 2019 dataset in order to generate forecasts for the first six months of 2020. This allows the model to utilise all available information when predicting future demand. The forecast extends beyond the observed data and provides insight into how seasonal patterns are expected to continue into the following year.
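A minimal sketch of this second fit, assuming the same seasonality settings as the first model and a horizon running to 30 June 2020. It uses a synthetic full-year series (`full_year_demo`) in place of the NESO data, and the object names are introduced here for illustration.

```r
library(prophet)

# Synthetic 2019 series with annual and weekend effects (illustrative only)
set.seed(2019)
ds <- seq(as.Date("2019-01-01"), as.Date("2019-12-31"), by = "day")
day_index <- as.numeric(ds - min(ds))
weekend <- format(ds, "%u") %in% c("6", "7")
y <- 1.4e6 + 3e5 * cos(2 * pi * day_index / 365) -
  1.5e5 * weekend + rnorm(length(ds), sd = 3e4)
full_year_demo <- data.frame(ds = ds, y = y)

# Fit on all of 2019 rather than on a training subset
full_model <- prophet(
  full_year_demo,
  daily.seasonality = FALSE,
  weekly.seasonality = TRUE,
  yearly.seasonality = TRUE
)

# Number of days from 1 January to 30 June 2020 (2020 is a leap year)
horizon_days <- as.integer(as.Date("2020-06-30") - as.Date("2019-12-31"))

future_2020 <- make_future_dataframe(full_model, periods = horizon_days)
forecast_2020 <- predict(full_model, future_2020)

# Keep only the rows beyond the observed data
forecast_only <- tail(
  forecast_2020[, c("ds", "yhat", "yhat_lower", "yhat_upper")],
  horizon_days
)
```

The `yhat_lower` and `yhat_upper` columns hold the uncertainty interval that appears as the shaded region in Prophet's forecast plot.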
The forecast shows a gradual decline in electricity demand from January to June, reflecting the transition from high winter demand to lower summer demand. Strong weekly seasonality is evident, with regular fluctuations showing higher demand during weekdays and lower demand on weekends. The overall downward pattern is driven by annual seasonality, with demand decreasing as the year moves from winter into warmer months. The shaded regions represent uncertainty intervals, which widen slightly over time, indicating increasing uncertainty in longer-term forecasts. Since the model is trained on only one year of data, it may incorrectly interpret seasonal patterns as a trend, leading to an overall downward bias in the forecast.
This comparison helps assess whether the forecast preserves the seasonal structure observed in the same months of 2019. The forecasted values for 2020 are generally lower than the corresponding 2019 observations. This occurs because the model interprets the decline in demand from winter to summer in 2019 as a downward trend, rather than purely seasonal variation.
Since the model is trained on only one year of data, it cannot fully distinguish between trend and seasonality, leading to lower predicted values in 2020. Despite this, the model successfully captures the overall seasonal pattern, with higher demand in winter and a gradual decline towards the summer months. Weekly fluctuations remain visible throughout the forecast horizon.
A limitation of the model is that it does not include external variables such as temperature, which may influence electricity demand.
A log transformation was applied to assess whether modelling relative (percentage) changes would improve forecast accuracy. This approach is often useful when variability increases with the level of the series.
| Model | MAE | RMSE |
|---|---|---|
| Original scale Prophet | 228689.8 | 268991.0 |
| Log scale Prophet | 534284.0 | 616331.7 |
However, the log-transformed model performed worse, with higher MAE and RMSE compared to the original-scale model. This indicates that the variance in electricity demand is relatively stable, and that modelling absolute changes is more appropriate.
Overall, the results support the use of an additive model rather than a multiplicative one.
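Errors from a log-scale model are only comparable with those of the original-scale model once the forecasts have been converted back with exp(). The report does not show how this step was carried out, so the sketch below is an assumption about one reasonable approach. The helper `back_transform` is hypothetical, and `log_forecast` is a stand-in built from three table values rather than real output from a log-scale Prophet fit.

```r
# Hypothetical helper: exponentiate the forecast columns of a model fitted to log(y)
back_transform <- function(log_forecast) {
  cols <- intersect(c("yhat", "yhat_lower", "yhat_upper"), names(log_forecast))
  log_forecast[cols] <- lapply(log_forecast[cols], exp)
  log_forecast
}

# Observed demand for the first three test days (from the evaluation table)
actual <- c(1782485, 1749127, 1743089)

# Stand-in for a log-scale forecast: logs of the original-scale predictions
log_forecast <- data.frame(yhat = log(c(1721043, 1763564, 1778155)))

# Convert back before measuring error, so MAE is in the original units
original_scale <- back_transform(log_forecast)
mae_original_units <- mean(abs(actual - original_scale$yhat))
```

Computing both models' errors in the original units keeps the MAE and RMSE columns of the comparison table directly comparable.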
To compare model structures, both additive and multiplicative versions of Prophet were used. The additive model keeps seasonal effects constant, whereas the multiplicative model allows them to increase or decrease with the level of the series.
The results indicate that the additive model performs better, suggesting that seasonal fluctuations in electricity demand are relatively stable in absolute magnitude rather than proportional to the level of the series.
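In the R interface, the two structures are selected with the `seasonality.mode` argument of `prophet()`. The sketch below compares them on a synthetic series with additive seasonality built in, so it says nothing about the NESO results; `score_mode` and the other object names are assumptions introduced for this example.

```r
library(prophet)

# Synthetic 2019 series with additive annual and weekend effects (illustrative only)
set.seed(123)
ds <- seq(as.Date("2019-01-01"), as.Date("2019-12-31"), by = "day")
day_index <- as.numeric(ds - min(ds))
weekend <- format(ds, "%u") %in% c("6", "7")
y <- 1.4e6 + 3e5 * cos(2 * pi * day_index / 365) -
  1.5e5 * weekend + rnorm(length(ds), sd = 3e4)
demo <- data.frame(ds = ds, y = y)

# Same 30-day hold-out as in the main analysis
test_horizon <- 30
train <- head(demo, nrow(demo) - test_horizon)
test <- tail(demo, test_horizon)

# Fit one model for the given seasonality mode and score it on the hold-out
score_mode <- function(mode) {
  model <- prophet(
    train,
    daily.seasonality = FALSE,
    weekly.seasonality = TRUE,
    yearly.seasonality = TRUE,
    seasonality.mode = mode
  )
  future <- make_future_dataframe(model, periods = test_horizon)
  predicted <- tail(predict(model, future)$yhat, test_horizon)
  c(MAE = mean(abs(test$y - predicted)),
    RMSE = sqrt(mean((test$y - predicted)^2)))
}

comparison <- rbind(
  additive = score_mode("additive"),
  multiplicative = score_mode("multiplicative")
)
```

Using one scoring function for both modes ensures that the split, horizon, and metrics are identical, so any difference in the resulting table comes from the seasonality mode alone.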
This study used Prophet to model daily electricity demand in 2019 and to produce short-term forecasts. The data showed clear weekly and yearly seasonal patterns, making it suitable for Prophet’s framework. The model captured the main structure of the series and produced reasonable forecasts, although the results should be interpreted with caution because only one year of data was available and no external variables, such as temperature, were included. Overall, the additive model on the original scale gave the most appropriate results for this dataset.
References: