This analysis explores electricity demand forecasting for Victoria, Australia, using the vic_elec dataset, which contains half-hourly historical electricity demand. The study first reviews existing literature that evaluates time series methods, both in general and for energy use cases. After exploring the data, I clean it and engineer features, then build a range of models: traditional models such as exponential smoothing (ETS) and autoregressive integrated moving average (ARIMA), and more complex machine learning and deep learning models. Finally, I discuss the results of these methods, the best approaches for this data, and the broader implications.
Literature Review
Existing studies cover forecasting with the time series methods discussed in class, including exponential smoothing, ARIMA, machine learning, and deep learning. Hyndman and Athanasopoulos (2021) provide a comprehensive overview of forecasting principles in Forecasting: Principles and Practice, and use the vic_elec dataset in many examples. They explain, for instance, that simple exponential smoothing is suitable for forecasting data with no clear trend or seasonal pattern (Hyndman & Athanasopoulos, 2021). For ARIMA models, they show that the seasonal part consists of terms similar to the non-seasonal components of the model, but involving backshifts of the seasonal period. For more advanced models, they note that the simplest neural networks contain no hidden layers and are equivalent to linear regressions, whereas adding an intermediate (hidden) layer makes the network non-linear.
The study “Forecasting Time Series With Complex Seasonal Patterns Using Exponential Smoothing” explores state space modeling for forecasting complex seasonal time series, including high-frequency seasonality, non-integer seasonality, and dual-calendar effects (De Livera et al., 2011).
Other studies look at forecast evaluation more generally. A common practice when selecting among time series models is to test each model on “unseen” data, typically by reserving a part from the end of each series for testing. Bergmeir and Benítez (2012) compare this last-block approach with the cross-validation techniques used in machine learning. Interestingly, they find that the theoretical flaws of applying cross-validation to time series had no practical consequences, and that cross-validation led to more robust model selection.
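The difference between a single end-of-series holdout and repeated evaluation can be sketched as index sets. The base R sketch below is an illustration only: `rolling_origin_splits` is a hypothetical helper, the series length is made up, and the rolling-origin scheme shown is one time-ordered form of repeated evaluation, not necessarily the procedures studied by Bergmeir and Benítez (2012).

```r
# Hypothetical helper: index sets for rolling-origin evaluation.
# Each split trains on observations 1..origin and tests on the next `horizon` points.
rolling_origin_splits <- function(n, initial, horizon, step = horizon) {
  origins <- seq(from = initial, to = n - horizon, by = step)
  lapply(origins, function(origin) {
    list(train = seq_len(origin), test = (origin + 1):(origin + horizon))
  })
}

n <- 20  # made-up series length

# Last-block holdout: a single split at the end of the series
holdout <- list(train = 1:15, test = 16:20)

# Rolling origin: several splits, each respecting time order
splits <- rolling_origin_splits(n, initial = 10, horizon = 5)

for (s in splits) {
  cat("train 1 to", max(s$train), "| test", min(s$test), "to", max(s$test), "\n")
}
```

Averaging an error metric over several such splits gives a less noisy basis for model selection than a single holdout, at the cost of refitting the model once per split.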
Hua et al. (2023), in “An ensemble framework for short-term load forecasting based on parallel CNN and GRU with improved ResNet,” apply machine learning to short-term load forecasting, particularly from an energy efficiency standpoint. To be confident that the power system will operate in a stable manner, companies need to forecast electricity loads accurately. This is challenging because load data is usually non-linear and non-stationary. The paper proposes a novel approach based on a parallel convolutional neural network and gated recurrent unit. The authors find that these methods can often outperform baselines when feature engineering and exogenous inputs are handled carefully (Hua et al., 2023).
Methodology
Data Preparation
For this analysis, I’ll use the vic_elec dataset from the fpp3 package. This dataset comes from the Monash time series forecasting repository. An initiative of Monash University in Melbourne, Australia, the repository is a comprehensive collection of time series data made available in a convenient form to encourage empirical forecast evaluations. The repository includes the data from many forecasting competitions as well as many other data sets from diverse applications (Hyndman, 2022).
The vic_elec dataset contains total half-hourly electricity demand from 1 January 2012 to 31 December 2014 for Victoria, Australia. The dataset also records temperature and whether each day was a holiday.
First I’ll load the necessary packages to complete this analysis, then load the dataset.
      Time                         Demand       Temperature
 Min.   :2012-01-01 00:00:00   Min.   :2858   Min.   : 1.50
 1st Qu.:2012-09-30 22:52:30   1st Qu.:3969   1st Qu.:12.30
 Median :2013-07-01 22:45:00   Median :4635   Median :15.40
 Mean   :2013-07-01 22:45:00   Mean   :4665   Mean   :16.27
 3rd Qu.:2014-04-01 23:37:30   3rd Qu.:5244   3rd Qu.:19.40
 Max.   :2014-12-31 23:30:00   Max.   :9345   Max.   :43.20

      Date                 Holiday
 Min.   :2012-01-01   Mode :logical
 1st Qu.:2012-09-30   FALSE:51120
 Median :2013-07-01   TRUE :1488
 Mean   :2013-07-01
 3rd Qu.:2014-04-01
 Max.   :2014-12-31
The Time and Date columns are redundant, so I can remove one; I drop Date and keep Time because it has more detail. From the initial visualization, the data is very noisy. Half-hourly data is likely too granular, so I roll it up to daily data for a more aggregate view. I focus primarily on Demand as the variable to forecast, but Temperature could be useful for dynamic regression models. I can also look at Holiday to see any interesting patterns (e.g., people might use more or less electricity on holidays). Unsurprisingly, there are far fewer holiday observations (1,488 half-hours) than non-holiday observations (51,120). There are no missing values in the dataset.
Electricity usage appears to be lower on holidays than on other days. One likely reason is that people are away on vacation, so their homes draw power only for basic household essentials, like keeping the fridge running.
When comparing electricity demand with temperature, the highest demand peaks coincide with the highest temperatures, suggesting that people are using air conditioning to combat extreme heat. These peaks occur around the end and beginning of the year, which is summer in the southern hemisphere, so this matches expectations. There are also smaller peaks in the middle of the year, when temperatures are at their coldest, around 10 degrees Celsius. This also makes sense because June to August is Australia’s winter, so people use more electricity to heat their homes. Overall, demand peaks in both winter and summer, with higher peaks in summer. Summer therefore puts more strain on demand than winter, which makes sense as Australia doesn’t get that cold but does get quite hot.
Next, I will perform additional cleaning, like rolling up the data to daily levels, making the dataset into a tsibble and removing redundant columns.
########### Data Cleaning
library(fpp3)  # attaches tsibble, dplyr, lubridate, fable, and ggplot2

# remove the redundant Date column (Time carries the same information)
vic_elec <- vic_elec |>
  select(-Date)

# make sure the data is a tsibble (assigned so the call has an effect)
vic_elec <- vic_elec |>
  as_tsibble()

# aggregate to daily values and rename columns
daily_data <- vic_elec |>
  index_by(date = as_date(Time)) |>
  summarise(
    total_demand = sum(Demand),
    avg_temperature = mean(Temperature)
  )
With this approach, I’ve rolled demand up to daily totals, which still captures demand fluctuations without an excessive amount of data. Since I summed demand over the 30-minute increments, I also needed a way to aggregate temperature to the daily level. For this exercise, I used the mean temperature across all the 30-minute increments to represent the daily temperature.
The resulting plot is now much cleaner, with fewer data points, and shows clear seasonal patterns driven by summer and winter.
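The roll-up described above (sum demand, average temperature) can be illustrated on toy data in base R. This is a minimal sketch with made-up random values and hypothetical column names; the analysis itself uses index_by() and summarise() on the tsibble.

```r
# Toy data: two days of half-hourly readings (48 per day); values are made up
set.seed(1)
half_hourly <- data.frame(
  time = seq(as.POSIXct("2012-01-01 00:00:00", tz = "UTC"),
             by = "30 min", length.out = 96),
  demand = runif(96, min = 3000, max = 6000),
  temperature = runif(96, min = 10, max = 30)
)
half_hourly$date <- as.Date(half_hourly$time, tz = "UTC")

# Sum demand and average temperature within each calendar day
daily_total <- aggregate(demand ~ date, data = half_hourly, FUN = sum)
daily_temp <- aggregate(temperature ~ date, data = half_hourly, FUN = mean)

daily <- merge(daily_total, daily_temp, by = "date")
names(daily) <- c("date", "total_demand", "avg_temperature")
print(daily)
```

Summing suits demand because it is a quantity consumed over the day, while temperature is a level, so a mean (or a daily maximum, as an alternative) is the more natural summary.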
Next, I’ll apply some feature engineering to help the models best forecast the data. For this analysis, I’ve employed a few measures:
Lagged indicators: electricity demand follows a gradual peak-and-trough pattern (seasonality) and reacts to weather, so yesterday’s or even last week’s data is often similar to today’s. Lagged values let the models use this.
Rolling averages: these help smooth out unusual peaks or troughs caused by strange weather patterns, which can make electricity demand spike or drop at unusual times. One example is the large spike seen around the start of 2014.
Seasonal and weekend indicators: electricity demand can change depending on whether it’s summer or winter (temperature driving electricity usage) and whether it’s a weekend or weekday (as on holidays, people may be away or outside and use less electricity). Here I create indicators for summer and for weekends.
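# add lagged indicators
daily_data <- daily_data |>
  mutate(
    lag1_temp = lag(avg_temperature, n = 1),
    lag7_temp = lag(avg_temperature, n = 7),
    lag1_demand = lag(total_demand, n = 1)
  )

# create a 7-day rolling average of temperature (rollmean() is from the zoo package)
# note: rollmean() centers the window by default, so each value also averages
# the following three days; align = "right" would use past values only
daily_data <- daily_data |>
  mutate(rolling_avg_7 = zoo::rollmean(avg_temperature, k = 7, fill = NA))

# create weekend and summer indicators
# (by default wday() returns 1 for Sunday and 7 for Saturday)
daily_data <- daily_data |>
  mutate(
    weekend = as.integer(wday(date) %in% c(1, 7)),
    summer = as.integer(month(date) %in% c(12, 1, 2))
  )

# drop rows with NA values introduced by the lags and the rolling average
daily_data <- daily_data |>
  filter(
    !is.na(lag1_temp),
    !is.na(lag7_temp),
    !is.na(lag1_demand),
    !is.na(rolling_avg_7)
  )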
For this feature engineering, I created three lags: yesterday’s temperature and last week’s temperature as 1-day and 7-day lags, and yesterday’s demand as a 1-day lag. I also created a 7-day rolling average of average temperature, plus summer and weekend indicators. I dropped the rows with NA values that this feature engineering created so that they would not affect the models.
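To make the lag and rolling-average features concrete, here is a small base R sketch on a made-up temperature vector. The helpers `lag_vec`, `trailing_mean`, and `centered_mean` are hypothetical names written for this illustration. The contrast between the two window alignments matters because zoo’s rollmean() centers its window by default, so a centered 7-day average includes the three days after the current one.

```r
# Shift a vector back by n positions, padding the start with NA
lag_vec <- function(x, n = 1) c(rep(NA, n), head(x, -n))

# Rolling mean over the current value and the k - 1 values before it
trailing_mean <- function(x, k) {
  out <- rep(NA_real_, length(x))
  for (i in seq_along(x)) {
    if (i >= k) out[i] <- mean(x[(i - k + 1):i])
  }
  out
}

# Rolling mean centered on the current value (k must be odd)
centered_mean <- function(x, k) {
  half <- (k - 1) / 2
  out <- rep(NA_real_, length(x))
  for (i in seq_along(x)) {
    if (i > half && i <= length(x) - half) {
      out[i] <- mean(x[(i - half):(i + half)])
    }
  }
  out
}

temp <- c(10, 12, 14, 16, 18, 20, 22)  # made-up daily temperatures

lag_vec(temp, 1)        # NA 10 12 14 16 18 20
trailing_mean(temp, 3)  # NA NA 12 14 16 18 20
centered_mean(temp, 3)  # NA 12 14 16 18 20 NA
```

The centered version produces NA at both ends and uses future observations, whereas the trailing version only looks backward, which is usually the safer choice for a forecasting feature.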
Model Development
Next, I will prepare the traditional, machine learning, and deep learning models, along with an ensemble technique that combines their forecasts.
Before fitting the models, I’ll split the data into training and testing data.
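######## Splitting Data
# training set: 2012-01-08 through 2013-12-31 (roughly two thirds of the days)
train <- daily_data |>
  filter_index("2012-01-08" ~ "2013-12-31")

# test set: 2014-01-01 onward (roughly one third of the days)
test <- daily_data |>
  filter_index("2014-01-01" ~ .)

# check
summary(train)
summary(test)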
Traditional Models
Exponential smoothing (ETS) methods produce forecasts using weighted averages of past observations, with the weights decaying exponentially as the observations get older. In other words, the more recent the observation, the higher the associated weight. This framework generates reliable forecasts quickly and for a wide range of time series, which is a great advantage (Hyndman & Athanasopoulos, 2021).
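The exponentially decaying weights can be seen in a few lines of base R. This is a minimal sketch of simple exponential smoothing only (no trend or seasonality, unlike the ETS models fitted in this analysis); the function names, the smoothing parameter of 0.5, and the data are assumptions chosen for illustration.

```r
# Weight given to the observation j steps back (j = 0 is the most recent)
ses_weights <- function(alpha, n_lags) alpha * (1 - alpha)^(0:(n_lags - 1))

# One-step-ahead forecast: update the level recursively through the series
ses_forecast <- function(y, alpha, level0 = y[1]) {
  level <- level0
  for (obs in y) {
    level <- alpha * obs + (1 - alpha) * level
  }
  level
}

w <- ses_weights(alpha = 0.5, n_lags = 4)
w  # 0.5000 0.2500 0.1250 0.0625

y <- c(100, 110, 105, 120)  # made-up observations
ses_forecast(y, alpha = 0.5)  # 112.5
```

A larger alpha concentrates the weight on the most recent observations, while a smaller alpha spreads it further into the past.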
ARIMA models are another widely used approach to time series forecasting and are complementary to ETS models. They combine autoregressive and moving average models, can include seasonal terms, and aim to describe the autocorrelations in the data.
Dynamic regression models build on these models to incorporate exogenous variables, such as Temperature in the case of Victoria electricity data (Hyndman & Athanasopoulos, 2021).
Next, I’ll fit the models. In this setup, I’ve created two ETS models, one ARIMA model, and one dynamic regression model. In the first ETS model, I predict total_demand with an additive error, no trend, and additive seasonality (ANA). In the second ETS model, I include an additive error, additive trend, and additive seasonality (AAA). In the ARIMA model, I’ve let the function automatically determine the amount of differencing needed to make the data stationary, as well as the other parameters (Noble, 2024).
Last, for dynamic regression, I’m using avg_temperature to predict total_demand.
####### Fit traditional models
traditional_fit <- train |>
  model(
    ETS_ANA = ETS(total_demand ~ error("A") + trend("N") + season("A")),
    ETS_AAA = ETS(total_demand ~ error("A") + trend("A") + season("A")),
    auto_ARIMA = ARIMA(total_demand),
    dynamic_regression = ARIMA(total_demand ~ avg_temperature)
  )

######### Create forecast of avg_temperature exogenous variable
# naive forecast: repeat the last observed training temperature
temp_forecast <- train |>
  model(NAIVE(avg_temperature)) |>
  forecast(h = nrow(test))

# replace the observed test temperatures with the forecasted ones
test_with_temp <- test |>
  mutate(avg_temperature = temp_forecast$.mean)

####### Create forecasts traditional models
traditional_forecasts <- traditional_fit |>
  forecast(new_data = test_with_temp)

##### Visualize forecasts
# plot forecasts over the full series, with test-period actuals in black
suppressMessages({
  traditional_forecasts |>
    autoplot(daily_data) +
    autolayer(test, total_demand, color = "black") +
    labs(y = "Demand", title = "Prediction of Electricity Demand, Traditional Models")
})
Since the dynamic regression requires an exogenous variable, avg_temperature, I needed forecasted values of it to create the overall demand forecasts. I couldn’t use the actual test values because these would not be known at the time of forecasting. For this sub-forecast, I used a NAIVE method, which assumes that every future temperature equals the last temperature value in the training data.
The visualization appears to show the ARIMA model performing best. This might be due to how well ARIMA handles seasonality, given the strong seasonal patterns in the electricity data.
Comparing accuracy metrics, the automatic ARIMA model also has the lowest RMSE, MAE, and MAPE, confirming that it performed best among the traditional models.
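For reference, the three accuracy metrics used throughout can be written as plain base R functions. This is a sketch of the standard definitions with made-up numbers; the function names are chosen for illustration, and MAPE is expressed here as a percentage.

```r
# Root mean squared error: penalizes large errors more heavily
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Mean absolute error: average size of the errors, in the units of the data
mae <- function(actual, predicted) mean(abs(actual - predicted))

# Mean absolute percentage error: scale-free, undefined when actual contains 0
mape <- function(actual, predicted) mean(abs((actual - predicted) / actual)) * 100

actual <- c(200, 220, 210, 250)     # made-up observed values
predicted <- c(190, 230, 210, 240)  # made-up forecasts

c(RMSE = rmse(actual, predicted),
  MAE = mae(actual, predicted),
  MAPE = mape(actual, predicted))
```

RMSE and MAE are in the units of the data (here, daily demand), so they are only comparable across models fitted to the same series, whereas MAPE can be compared across series of different scales.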
Machine Learning Models
XGBoost is widely used by data scientists to achieve effective results on machine learning challenges. XGBoost is able to scale well, driven by several key systems and algorithmic optimizations. These innovations include a novel tree-learning algorithm designed for handling sparse data, as well as a theoretically justified weighted quantile sketch procedure, which enables the effective incorporation of instance weights in approximate tree learning (Chen & Guestrin, 2016).
LightGBM is a similar gradient boosting framework that uses tree-based learning algorithms. It is designed to have faster training speed and higher efficiency, along with lower memory usage and better accuracy (LightGBM 3.3.5 documentation, 2025). Last is CatBoost, an algorithm for gradient boosting on decision trees. It was developed by Yandex researchers and engineers, and is used for search, recommendation systems, weather prediction, and many other tasks (Yandex, 2019).
Next, I will develop the machine learning model, XGBoost, to predict electricity demand:
####### Machine Learning models
# feature list used to build the X matrices
features <- c("lag1_demand", "lag1_temp", "lag7_temp", "avg_temperature",
              "rolling_avg_7", "weekend", "summer")

# build numeric X matrices and y vectors for train and test
# as_tibble() drops the tsibble index so that date is not carried into the matrix
X_train <- train |>
  as_tibble() |>
  select(all_of(features)) |>
  mutate(across(everything(), as.numeric)) |>
  as.matrix()
y_train <- train$total_demand

X_test <- test |>
  as_tibble() |>
  select(all_of(features)) |>
  mutate(across(everything(), as.numeric)) |>
  as.matrix()
y_test <- test$total_demand

# check
colnames(X_train)
model RMSE MAE MAPE
1 XGBoost 10291.76 6315.942 2.849983
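The XGBoost model has a different setup from the traditional models: it operates on matrices, so the train and test data have to be split into X (feature) matrices and y (target) vectors, and all features have to be numeric.
Looking at the visualization, the XGBoost model does a decent job of following the patterns in electricity demand and seems to forecast well.
The error metrics for XGBoost are also substantially lower than those of the traditional models. The comparison is not fully like-for-like, however: the XGBoost features include the observed same-day temperature and the previous day’s demand from the test period, whereas the traditional models forecast the entire test period from the training data, with a naive temperature forecast.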
Deep Learning Models
Deep learning models are an effective way to handle very complex data. These models use neural networks, loosely inspired by the human brain, to analyze data and learn complex patterns. They are often applied to data such as historical financial series and can make highly accurate predictions. The downside is that they use a lot of computational power, which can be expensive or infeasible for individuals or small companies. They can also take a long time to run, which can make them impractical.
I will run a multilayer perceptron (MLP) model in this analysis, using R. Before running the model, I need to scale my data, still using the matrix-based splits.
####### Create forecasts DL Model
library(nnet)  # provides nnet()

# scale the features, using the training means and standard deviations for both sets
X_train_scaled <- scale(X_train)
X_test_scaled <- scale(
  X_test,
  center = attr(X_train_scaled, "scaled:center"),
  scale = attr(X_train_scaled, "scaled:scale")
)

# fit MLP model with one hidden layer of 10 units
set.seed(42)
mlp_fit <- nnet(X_train_scaled, y_train, size = 10, linout = TRUE, trace = FALSE)

# make predictions
mlp_forecasts <- predict(mlp_fit, X_test_scaled)

##### Visualize forecasts
# create visualization data for MLP
mlp_forecast_data <- data.frame(
  time = seq_along(y_test),
  actual = y_test,
  mlp_forecast = as.vector(mlp_forecasts)
)

# plot MLP forecasts vs actual (linewidth requires ggplot2 3.4.0 or later)
ggplot(mlp_forecast_data, aes(x = time)) +
  geom_line(aes(y = actual, color = "Actual"), linewidth = 1) +
  geom_line(aes(y = mlp_forecast, color = "MLP"), linewidth = 1) +
  labs(title = "MLP Forecasts vs Actual", x = "Time", y = "Total Demand") +
  theme_minimal()
model RMSE MAE MAPE
1 MLP 24472.02 18857.75 8.750674
The MLP model doesn’t seem to capture the pattern in the electricity demand data very well. It captures some of the pattern initially (though it underestimates it), then plateaus for much of the test period before it starts to pick up the pattern again.
Based on accuracy metrics, this model performs better than the traditional models, but not as well as machine learning methods like XGBoost.
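The feature scaling applied before fitting the MLP reuses the training means and standard deviations for the test set, so that no information from the test period leaks into the preprocessing. A minimal base R sketch of that step, using small made-up matrices with hypothetical names, looks like this:

```r
# Made-up feature matrices with two columns on very different scales
toy_train <- matrix(c(1, 2, 3, 4, 10, 20, 30, 40), ncol = 2,
                    dimnames = list(NULL, c("a", "b")))
toy_test <- matrix(c(5, 6, 50, 60), ncol = 2,
                   dimnames = list(NULL, c("a", "b")))

# Standardize the training features; scale() stores the statistics as attributes
toy_train_scaled <- scale(toy_train)
train_center <- attr(toy_train_scaled, "scaled:center")
train_sd <- attr(toy_train_scaled, "scaled:scale")

# Apply the training statistics to the test features
toy_test_scaled <- scale(toy_test, center = train_center, scale = train_sd)

colMeans(toy_train_scaled)  # approximately 0 for each column
toy_test_scaled
```

The scaled test columns are not expected to have mean 0 and standard deviation 1, because they are expressed relative to the training distribution.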
Ensemble Techniques
Next, I will combine the forecasts from all the models developed thus far into an ensemble model. This ensemble takes the mean of the individual models’ forecasts at each time point and combines them into one forecast. This can be an effective way to reduce the errors of individual models.
model RMSE MAE MAPE
1 Ensemble 38089.85 33661.07 14.62878
Interestingly, the ensemble underestimates not only the level of total demand but also the seasonal pattern in the data. This could be a result of the MLP model, whose forecast was a low, flat line in the middle of the test period, dragging down the overall average.
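The effect of one flat, low forecast on a mean ensemble can be illustrated with a few made-up numbers in base R. The model names and values below are hypothetical, and the median shown alongside is an alternative combination rule for comparison, not the method used in this analysis.

```r
# Made-up forecasts from three hypothetical models over three days
forecasts <- cbind(
  model_a = c(100, 110, 120),
  model_b = c(104, 108, 126),
  model_c = c(60, 60, 60)  # a flat, low forecast
)

# Mean ensemble: average across models at each time point
ensemble_mean <- rowMeans(forecasts)

# Median ensemble: less sensitive to a single outlying model
ensemble_median <- apply(forecasts, 1, median)

ensemble_mean    # 88.00000 92.66667 102.00000
ensemble_median  # 100 108 120
```

In this toy case the mean is pulled well below the two models that track the series, while the median stays close to them, which is one reason to check individual forecasts before averaging.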
Results & Discussion
Looking at the results above, it’s clear that not all models perform equally. Each kind has advantages and disadvantages, and the best model depends on the type of data being analyzed.
The Victoria electricity data was granular with some irregularities, but it showed almost no trend, and its movements were driven largely by seasonal patterns. This reflects the factors that drive electricity usage, such as temperature (which follows the seasons) and, to an extent, holidays.
The machine learning model, XGBoost, performed best on the data, with the lowest RMSE, MAE, and MAPE of the models compared. XGBoost can handle seasonal data well, especially data that is not perfectly linear. As a tree-based ensemble model, it can identify and model intricate patterns, including non-linear relationships.
Traditional models, particularly ARIMA, performed well here too. XGBoost may have had better results because of the way it handles errors: it builds decision trees sequentially, with each new tree correcting the errors of the previous ones. ARIMA also takes previous errors (autocorrelations) into account, but in a different way. It could also be that the relationships between demand and the predictors (like temperature and lagged demand) were non-linear, which would favor XGBoost.
Conclusion
Being able to predict energy demand is a crucial task for any energy company that needs to match supply to demand and to set and defend prices before government authorities. XGBoost is an effective tool for demand forecasting, but other models, such as ARIMA and MLP models, can also help energy forecasters understand future demand.
References
Bergmeir, C., & Benítez, J. M. (2012). On the use of cross-validation for time series predictor evaluation. Information Sciences, 191, 192–213. https://doi.org/10.1016/j.ins.2011.12.028
Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16) (pp. 785–794). ACM. https://doi.org/10.1145/2939672.2939785
De Livera, A. M., Hyndman, R. J., & Snyder, R. D. (2011). Forecasting time series with complex seasonal patterns using exponential smoothing. Journal of the American Statistical Association, 106(496), 1513–1527. https://doi.org/10.1198/jasa.2011.tm09771
Hua, H., et al. (2023). An ensemble framework for short-term load forecasting based on parallel CNN and GRU with improved ResNet. Electric Power Systems Research, 222, 109462. https://www.sciencedirect.com/science/article/abs/pii/S0378779622011063
Hyndman, R. J., & Athanasopoulos, G. (2021). Forecasting: Principles and practice (3rd ed.). OTexts. https://otexts.com/fpp3/
Hyndman, R. J. (2022, February 23). Monash time series forecasting repository. https://robjhyndman.com/hyndsight/forecastingdata/
Noble, J. (2024, May 24). ARIMA models. IBM. https://www.ibm.com/think/topics/arima-model
Welcome to LightGBM’s documentation! — LightGBM 3.3.5 documentation. (n.d.). LightGBM.readthedocs.io. Retrieved 2025, from https://lightgbm.readthedocs.io/en/stable/