Part A – ATM Cash Forecast

Introduction

Automated Teller Machines (ATMs) play a critical role in modern banking by providing customers with continuous access to cash. Efficient management of ATM cash levels is essential to ensure reliable service while minimizing operational costs. The dataset used in this analysis contains daily cash withdrawal amounts from four ATM machines over a specified time period, with withdrawals measured in hundreds of dollars.

Understanding the patterns in cash demand—such as trends, variability, and potential seasonality—is important for developing accurate forecasts. Time series methods, including transformation, decomposition, and ARIMA modeling, are used to capture these patterns and generate reliable predictions for future cash requirements.

The primary objective of this analysis is to forecast daily cash withdrawals for each ATM for the month of May 2010. These forecasts will support better planning and decision-making for ATM cash management.

Business Problem

Financial institutions must ensure that ATMs are adequately stocked with cash at all times. Underestimating demand can lead to empty machines, resulting in poor customer experience and potential revenue loss. Overestimating demand, on the other hand, leads to excess cash being held in machines, increasing security risks and operational costs.

The key business problem is to determine how much cash should be allocated to each ATM on a daily basis. This requires accurate forecasting of withdrawal demand based on historical data.

By developing robust time series models, this analysis aims to:

Predict future cash withdrawal patterns Optimize cash replenishment schedules Reduce operational and transportation costs Minimize the risk associated with holding excess cash Improve overall service reliability

Ultimately, the goal is to provide a data-driven approach to ATM cash management, enabling more efficient and cost-effective operations.

Introduction

knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
library(readr)
library(dplyr)
## Warning: package 'dplyr' was built under R version 4.4.3
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union
library(tsibble)
## Warning: package 'tsibble' was built under R version 4.4.3
## Registered S3 method overwritten by 'tsibble':
##   method               from 
##   as_tibble.grouped_df dplyr
## 
## Attaching package: 'tsibble'
## The following object is masked from 'package:lubridate':
## 
##     interval
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, union
library(fpp3)
## Warning: package 'fpp3' was built under R version 4.4.3
## Registered S3 methods overwritten by 'ggtime':
##   method                        from      
##   +.gg_tsensemble               feasts    
##   autolayer.fbl_ts              fabletools
##   autolayer.tbl_ts              fabletools
##   autoplot.dcmp_ts              fabletools
##   autoplot.fbl_ts               fabletools
##   autoplot.tbl_cf               feasts    
##   autoplot.tbl_ts               fabletools
##   chooseOpsMethod.gg_tsensemble feasts    
##   fortify.fbl_ts                fabletools
##   grid.draw.gg_tsensemble       feasts    
##   print.gg_tsensemble           feasts    
##   scale_type.cf_lag             feasts
## Warning: replacing previous import 'feasts::scale_x_cf_lag' by
## 'ggtime::scale_x_cf_lag' when loading 'fpp3'
## Warning: replacing previous import 'feasts::gg_season' by 'ggtime::gg_season'
## when loading 'fpp3'
## Warning: replacing previous import 'feasts::gg_tsresiduals' by
## 'ggtime::gg_tsresiduals' when loading 'fpp3'
## Warning: replacing previous import 'feasts::gg_irf' by 'ggtime::gg_irf' when
## loading 'fpp3'
## Warning: replacing previous import 'feasts::gg_arma' by 'ggtime::gg_arma' when
## loading 'fpp3'
## Warning: replacing previous import 'feasts::gg_tsdisplay' by
## 'ggtime::gg_tsdisplay' when loading 'fpp3'
## Warning: replacing previous import 'feasts::gg_subseries' by
## 'ggtime::gg_subseries' when loading 'fpp3'
## Warning: replacing previous import 'feasts::gg_lag' by 'ggtime::gg_lag' when
## loading 'fpp3'
## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.3 ──
## ✔ tibble      3.2.1     ✔ ggtime      0.2.0
## ✔ tidyr       1.3.1     ✔ feasts      0.4.2
## ✔ ggplot2     3.5.2     ✔ fable       0.5.0
## ✔ tsibbledata 0.4.1
## Warning: package 'ggplot2' was built under R version 4.4.3
## Warning: package 'tsibbledata' was built under R version 4.4.3
## Warning: package 'ggtime' was built under R version 4.4.3
## Warning: package 'feasts' was built under R version 4.4.3
## Warning: package 'fabletools' was built under R version 4.4.3
## Warning: package 'fable' was built under R version 4.4.3
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date()        masks base::date()
## ✖ dplyr::filter()          masks stats::filter()
## ✖ feasts::gg_arma()        masks ggtime::gg_arma()
## ✖ feasts::gg_irf()         masks ggtime::gg_irf()
## ✖ feasts::gg_lag()         masks ggtime::gg_lag()
## ✖ feasts::gg_season()      masks ggtime::gg_season()
## ✖ feasts::gg_subseries()   masks ggtime::gg_subseries()
## ✖ feasts::gg_tsdisplay()   masks ggtime::gg_tsdisplay()
## ✖ feasts::gg_tsresiduals() masks ggtime::gg_tsresiduals()
## ✖ tsibble::intersect()     masks base::intersect()
## ✖ tsibble::interval()      masks lubridate::interval()
## ✖ dplyr::lag()             masks stats::lag()
## ✖ feasts::scale_x_cf_lag() masks ggtime::scale_x_cf_lag()
## ✖ tsibble::setdiff()       masks base::setdiff()
## ✖ tsibble::union()         masks base::union()
library(tidyr)
library(ggplot2)
library(zoo)
## Warning: package 'zoo' was built under R version 4.4.3
## 
## Attaching package: 'zoo'
## The following object is masked from 'package:tsibble':
## 
##     index
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
library(janitor)
## Warning: package 'janitor' was built under R version 4.4.3
## 
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test

Data Loading

ATMDATA <- read_csv("https://raw.githubusercontent.com/tanzil64/DATA624/refs/heads/main/ATM624Data.csv")

ATMDATA <- ATMDATA %>%
  mutate(DATE = as.Date(DATE, format = "%m/%d/%Y")) %>%
  filter(!is.na(DATE))

head(ATMDATA)
summary(ATMDATA)
##       DATE                ATM                 Cash        
##  Min.   :2009-05-01   Length:1474        Min.   :    0.0  
##  1st Qu.:2009-08-01   Class :character   1st Qu.:    0.5  
##  Median :2009-11-01   Mode  :character   Median :   73.0  
##  Mean   :2009-10-31                      Mean   :  155.6  
##  3rd Qu.:2010-02-01                      3rd Qu.:  114.0  
##  Max.   :2010-05-14                      Max.   :10920.0  
##                                          NA's   :19
str(ATMDATA)
## tibble [1,474 × 3] (S3: tbl_df/tbl/data.frame)
##  $ DATE: Date[1:1474], format: "2009-05-01" "2009-05-01" ...
##  $ ATM : chr [1:1474] "ATM1" "ATM2" "ATM1" "ATM2" ...
##  $ Cash: num [1:1474] 96 107 82 89 85 90 90 55 99 79 ...
head(ATMDATA)
summary(ATMDATA)
##       DATE                ATM                 Cash        
##  Min.   :2009-05-01   Length:1474        Min.   :    0.0  
##  1st Qu.:2009-08-01   Class :character   1st Qu.:    0.5  
##  Median :2009-11-01   Mode  :character   Median :   73.0  
##  Mean   :2009-10-31                      Mean   :  155.6  
##  3rd Qu.:2010-02-01                      3rd Qu.:  114.0  
##  Max.   :2010-05-14                      Max.   :10920.0  
##                                          NA's   :19

Missing Data Inspection

library(naniar)
gg_miss_var(ATMDATA)

Data Preparation

atm <- ATMDATA %>%
  mutate(Cash = as.integer(Cash))

str(atm)
## tibble [1,474 × 3] (S3: tbl_df/tbl/data.frame)
##  $ DATE: Date[1:1474], format: "2009-05-01" "2009-05-01" ...
##  $ ATM : chr [1:1474] "ATM1" "ATM2" "ATM1" "ATM2" ...
##  $ Cash: int [1:1474] 96 107 82 89 85 90 90 55 99 79 ...

Because the four ATM machines may behave differently, each series will be analyzed separately.

atm1 <- atm %>%
  filter(ATM == "ATM1") %>%
  as_tsibble(index = DATE)

atm2 <- atm %>%
  filter(ATM == "ATM2") %>%
  as_tsibble(index = DATE)

atm3 <- atm %>%
  filter(ATM == "ATM3") %>%
  as_tsibble(index = DATE)

atm4 <- atm %>%
  filter(ATM == "ATM4") %>%
  as_tsibble(index = DATE)

Initial Time Plots

autoplot(atm1) +
  labs(title = "ATM1", subtitle = "Cash withdrawals per day", y = "Hundreds USD")

autoplot(atm2) +
  labs(title = "ATM2", subtitle = "Cash withdrawals per day", y = "Hundreds USD")

autoplot(atm3) +
  labs(title = "ATM3", subtitle = "Cash withdrawals per day", y = "Hundreds USD")

autoplot(atm4) +
  labs(title = "ATM4", subtitle = "Cash withdrawals per day", y = "Hundreds USD")

From the plots, ATM1 and ATM2 appear to have relatively stable variation and possible seasonality. ATM3 contains very few observations. ATM4 contains one clear outlier. # Outlier Treatment

which(atm4$Cash == 10919)
## integer(0)

The value 10919 is an obvious outlier, so it is replaced with the mean of the ATM4 series.

atm4 <- atm4 %>%
  mutate(Cash = ifelse(Cash == 10919,
                       round(mean(Cash, na.rm = TRUE), 0),
                       Cash))

autoplot(atm4) +
  labs(title = "ATM4 After Outlier Correction",
       subtitle = "Cash withdrawals per day",
       y = "Hundreds USD")

Missing Value Treatment

sum(is.na(atm1$Cash))
## [1] 3
sum(is.na(atm2$Cash))
## [1] 2
sum(is.na(atm3$Cash))
## [1] 0
sum(is.na(atm4$Cash))
## [1] 0
which(is.na(atm1$Cash))
## [1] 44 47 53
which(is.na(atm2$Cash))
## [1] 49 55

We inspect the distributions of ATM1 and ATM2 before imputing the missing values.

hist(atm1$Cash, main = "ATM1 Cash Distribution", xlab = "Cash")

hist(atm2$Cash, main = "ATM2 Cash Distribution", xlab = "Cash")

Since the distributions do not appear heavily skewed, the missing values are imputed using the median.

atm1$Cash[44] <- round(median(atm1$Cash, na.rm = TRUE), 0)
atm1$Cash[47] <- round(median(atm1$Cash, na.rm = TRUE), 0)
atm1$Cash[53] <- round(median(atm1$Cash, na.rm = TRUE), 0)

atm2$Cash[49] <- round(median(atm2$Cash, na.rm = TRUE), 0)
atm2$Cash[55] <- round(median(atm2$Cash, na.rm = TRUE), 0)

Box-Cox Transformation

lambda1 <- atm1 %>%
  filter(!is.na(Cash)) %>%
  features(Cash + 1, guerrero) %>%
  pull(lambda_guerrero)

lambda2 <- atm2 %>%
  filter(!is.na(Cash)) %>%
  features(Cash + 1, guerrero) %>%
  pull(lambda_guerrero)

lambda3 <- atm3 %>%
  filter(!is.na(Cash)) %>%
  features(Cash + 1, guerrero) %>%
  pull(lambda_guerrero)

lambda4 <- atm4 %>%
  filter(!is.na(Cash)) %>%
  features(Cash + 1, guerrero) %>%
  pull(lambda_guerrero)
plot_trans1 <- atm1 %>%
  autoplot(box_cox(Cash + 1, lambda1)) +
  labs(title = "ATM1 Transformed", subtitle = "Cash withdrawals per day", y = "USD")

plot_trans2 <- atm2 %>%
  autoplot(box_cox(Cash + 1, lambda2)) +
  labs(title = "ATM2 Transformed", subtitle = "Cash withdrawals per day", y = "USD")

plot_trans3 <- atm3 %>%
  autoplot(box_cox(Cash + 1, lambda3)) +
  labs(title = "ATM3 Transformed", subtitle = "Cash withdrawals per day", y = "USD")

plot_trans4 <- atm4 %>%
  autoplot(box_cox(Cash + 1, lambda4)) +
  labs(title = "ATM4 Transformed", subtitle = "Cash withdrawals per day", y = "USD")

Decomposition

atm1 %>%
  model(classical_decomposition(box_cox(Cash, lambda1), type = "additive")) %>%
  components() %>%
  autoplot() +
  labs(title = "Classical Additive Decomposition of ATM1")

atm2 %>%
  model(classical_decomposition(box_cox(Cash, lambda2), type = "additive")) %>%
  components() %>%
  autoplot() +
  labs(title = "Classical Additive Decomposition of ATM2")

atm3 %>%
  model(classical_decomposition(box_cox(Cash, lambda3), type = "additive")) %>%
  components() %>%
  autoplot() +
  labs(title = "Classical Additive Decomposition of ATM3")

atm4 %>%
  model(classical_decomposition(box_cox(Cash, lambda4), type = "additive")) %>%
  components() %>%
  autoplot() +
  labs(title = "Classical Additive Decomposition of ATM4")

ACF Analysis

plot_acf1 <- atm1 %>%
  ACF(box_cox(Cash, lambda1)) %>%
  autoplot() +
  labs(title = "Autocorrelation of Cash - ATM1")

plot_acf2 <- atm2 %>%
  ACF(box_cox(Cash, lambda2)) %>%
  autoplot() +
  labs(title = "Autocorrelation of Cash - ATM2")

plot_acf3 <- atm3 %>%
  ACF(box_cox(Cash, lambda3)) %>%
  autoplot() +
  labs(title = "Autocorrelation of Cash - ATM3")

plot_acf4 <- atm4 %>%
  ACF(box_cox(Cash, lambda4)) %>%
  autoplot() +
  labs(title = "Autocorrelation of Cash - ATM4")
library(gridExtra)
grid.arrange(plot_acf1, plot_acf2, plot_acf3, plot_acf4, nrow = 2)

Differencing Checks

atm1 %>% features(box_cox(Cash, lambda1), unitroot_nsdiffs)
atm2 %>% features(box_cox(Cash, lambda2), unitroot_nsdiffs)
atm3 %>% features(box_cox(Cash, lambda3), unitroot_nsdiffs)
atm4 %>% features(box_cox(Cash, lambda4), unitroot_nsdiffs)

The results indicate that ATM1 and ATM2 require one seasonal difference, while ATM3 and ATM4 do not.

We next check whether any additional regular differencing is required.

atm1 %>% features(difference(box_cox(Cash, lambda1), 7), unitroot_ndiffs)
atm2 %>% features(difference(box_cox(Cash, lambda2), 7), unitroot_ndiffs)
atm3 %>% features(box_cox(Cash, lambda3), unitroot_ndiffs)
atm4 %>% features(box_cox(Cash, lambda4), unitroot_ndiffs)

No additional non-seasonal differencing appears necessary.

ACF After Seasonal Differencing

atm1 %>%
  ACF(difference(box_cox(Cash, lambda1), 7)) %>%
  autoplot() +
  labs(title = "ACF of Differenced ATM1")

atm2 %>%
  ACF(difference(box_cox(Cash, lambda2), 7)) %>%
  autoplot() +
  labs(title = "ACF of Differenced ATM2")

After differencing, the series look much closer to white noise. # Model Building

Because ATM1 and ATM2 need seasonal differencing, ARIMA models are appropriate. ATM3 has too little data, so a naive model is more reasonable. # ATM1 Model

atm1_fit <- atm1 %>%
  model(ARIMA(box_cox(Cash, lambda1)))

report(atm1_fit)
## Series: Cash 
## Model: ARIMA(0,0,2)(0,1,1)[7] 
## Transformation: box_cox(Cash, lambda1) 
## 
## Coefficients:
##          ma1      ma2     sma1
##       0.0980  -0.1143  -0.6463
## s.e.  0.0524   0.0527   0.0425
## 
## sigma^2 estimated as 1.483:  log likelihood=-578.9
## AIC=1165.81   AICc=1165.92   BIC=1181.33
atm1_fit %>% gg_tsresiduals()

ATM2 Model

atm2_fit <- atm2 %>%
  model(ARIMA(box_cox(Cash, lambda2)))

report(atm2_fit)
## Series: Cash 
## Model: ARIMA(2,0,2)(0,1,1)[7] 
## Transformation: box_cox(Cash, lambda2) 
## 
## Coefficients:
##           ar1      ar2     ma1     ma2     sma1
##       -0.4246  -0.9072  0.4679  0.8003  -0.7257
## s.e.   0.0587   0.0439  0.0890  0.0584   0.0409
## 
## sigma^2 estimated as 67.61:  log likelihood=-1261.88
## AIC=2535.75   AICc=2535.99   BIC=2559.03
atm2_fit %>% gg_tsresiduals()

ATM3 Model

atm3_fit <- atm3 %>%
  model(NAIVE(box_cox(Cash, lambda3)))

report(atm3_fit)
## Series: Cash 
## Model: NAIVE 
## Transformation: box_cox(Cash, lambda3) 
## 
## sigma^2: 5.0877
atm3_fit %>% gg_tsresiduals()

ATM4 Model

atm4_fit <- atm4 %>%
  model(ARIMA(box_cox(Cash, lambda4)))

report(atm4_fit)
## Series: Cash 
## Model: ARIMA(0,0,0)(2,0,0)[7] w/ mean 
## Transformation: box_cox(Cash, lambda4) 
## 
## Coefficients:
##         sar1    sar2  constant
##       0.2501  0.1951    2.4713
## s.e.  0.0521  0.0525    0.0460
## 
## sigma^2 estimated as 0.8133:  log likelihood=-479.32
## AIC=966.65   AICc=966.76   BIC=982.25
atm4_fit %>% gg_tsresiduals()

Forecasting

We now forecast 31 days ahead for May 2010.

forecast_atm1 <- atm1_fit %>% forecast(h = 31)
forecast_atm2 <- atm2_fit %>% forecast(h = 31)
forecast_atm3 <- atm3_fit %>% forecast(h = 31)
forecast_atm4 <- atm4_fit %>% forecast(h = 31)

Forecast Plots

autoplot(forecast_atm1, atm1) + labs(title = "Forecast for ATM1")

autoplot(forecast_atm2, atm2) + labs(title = "Forecast for ATM2")

autoplot(forecast_atm3, atm3) + labs(title = "Forecast for ATM3")

autoplot(forecast_atm4, atm4) + labs(title = "Forecast for ATM4")

Export Forecasts

write.csv(as.data.frame(forecast_atm1), "forecast_atm1.csv", row.names = FALSE)
write.csv(as.data.frame(forecast_atm2), "forecast_atm2.csv", row.names = FALSE)
write.csv(as.data.frame(forecast_atm3), "forecast_atm3.csv", row.names = FALSE)
write.csv(as.data.frame(forecast_atm4), "forecast_atm4.csv", row.names = FALSE)

How the Model and Forecast Solve the Business Problem

The business problem is to determine how much cash each ATM should hold in order to avoid shortages while minimizing excess cash. The time series models developed in this analysis directly address this problem by converting historical withdrawal patterns into quantitative forecasts of future demand.

First, the models capture key characteristics of the data, such as trend, variability, and seasonality. For example, ATM1 and ATM2 showed clear weekly patterns, which were incorporated through seasonal differencing in the ARIMA models. By accounting for these patterns, the models are able to produce more accurate and reliable forecasts.

Second, the forecasting step provides daily predictions of cash withdrawals for each ATM. These forecasts allow decision-makers to estimate how much cash will be needed on each day of May 2010. Instead of relying on intuition or simple averages, the bank can use these statistically grounded predictions to plan cash allocation.

As a result, the models help solve the business problem in the following ways:

Prevent stockouts: Forecasts ensure that enough cash is loaded into each ATM to meet expected demand, reducing the risk of machines running empty. Reduce excess cash: By avoiding overestimation, the bank minimizes unnecessary cash holding, lowering security risks and opportunity costs. Optimize replenishment schedules: Knowing expected demand allows for efficient planning of cash delivery routes and frequencies. Improve operational efficiency: Resources such as staff and transportation can be allocated more effectively based on predicted needs.

Conclusion

In summary, the models transform raw historical data into actionable insights, enabling the bank to make informed, data-driven decisions about ATM cash management. This directly improves service reliability while reducing operational costs.

Part B – Residential Power Forecast

Introduction

Accurate forecasting of residential electricity demand is critical for efficient energy planning and resource allocation. The dataset used in this analysis contains monthly residential power consumption from January 1998 through December 2013. The goal is to build a time series model and generate forecasts for 2014.

Business Problem

Energy providers must anticipate future electricity demand to ensure reliable service while minimizing operational costs. Underestimating demand can lead to shortages, while overestimating demand results in inefficient resource allocation. This analysis aims to develop a data-driven forecasting model to support better planning and decision-making.

Data Loading

power <- read_csv("https://raw.githubusercontent.com/tanzil64/DATA624/refs/heads/main/ResidentialCustomerForecastLoad-624.csv") %>%
  rename(Date = `YYYY-MMM`) %>%
  mutate(Date = yearmonth(Date)) %>%
  as_tsibble(index = Date)

power

Missing Value Handling

power %>% filter(is.na(KWH))

There is one missing value (September 2008). This value is replaced using the median of the series.

power <- power %>%
  mutate(KWH = ifelse(is.na(KWH),
                      median(KWH, na.rm = TRUE),
                      KWH))

Exploratory Analysis

power %>%
  autoplot(KWH) +
  labs(title = "Monthly Residential Power Consumption",
       y = "KWH")

The data shows strong seasonal patterns and variability over time.

power %>%
  gg_season(KWH) +
  labs(title = "Seasonal Plot")

power %>%
  gg_subseries(KWH) +
  labs(title = "Subseries Plot")

These plots confirm strong yearly seasonality.

Box-Cox Transformation

lambda <- power %>%
  features(KWH, guerrero) %>%
  pull(lambda_guerrero)

lambda
## [1] 0.1041666
power %>%
  autoplot(box_cox(KWH, lambda)) +
  labs(title = "Box-Cox Transformed Series")

The transformation stabilizes variance and improves model suitability.

ACF Analysis

power %>%
  ACF(box_cox(KWH, lambda)) %>%
  autoplot() +
  labs(title = "ACF Plot")

The slow decay of the ACF suggests non-stationarity.

Differencing Check

power %>%
  features(KWH, unitroot_nsdiffs)
power %>%
  features(KWH, unitroot_ndiffs)

Seasonal differencing is required, which will be handled internally by the ARIMA model.

Model Building

ARIMA Model (Correct Approach)

fit_arima <- power %>%
  model(ARIMA(box_cox(KWH, lambda)))
report(fit_arima)
## Series: KWH 
## Model: ARIMA(0,1,2)(0,0,2)[12] 
## Transformation: box_cox(KWH, lambda) 
## 
## Coefficients:
##           ma1      ma2    sma1    sma2
##       -0.7266  -0.2492  0.2269  0.2104
## s.e.   0.0717   0.0706  0.0787  0.0656
## 
## sigma^2 estimated as 1.294:  log likelihood=-295.53
## AIC=601.05   AICc=601.38   BIC=617.31
fit_arima %>%
  gg_tsresiduals()

The residual diagnostics indicate that the residuals behave like white noise, suggesting a good model fit.

ETS Model

fit_ets <- power %>%
  model(ETS(KWH))

report(fit_ets)
## Series: KWH 
## Model: ETS(M,N,M) 
##   Smoothing parameters:
##     alpha = 0.1167131 
##     gamma = 0.0001038822 
## 
##   Initial states:
##     l[0]      s[0]     s[-1]     s[-2]    s[-3]    s[-4]    s[-5]    s[-6]
##  6133758 0.9567738 0.7524052 0.8702234 1.184894 1.261653 1.185654 1.000083
##      s[-7]     s[-8]     s[-9]   s[-10]   s[-11]
##  0.7742886 0.8153702 0.9116166 1.065073 1.221965
## 
##   sigma^2:  0.0144
## 
##      AIC     AICc      BIC 
## 6224.787 6227.514 6273.649
fit_ets %>%
  gg_tsresiduals()

Model Comparison

accuracy(fit_arima)
accuracy(fit_ets)

The ARIMA model shows lower RMSE and better residual behavior, making it the preferred model.

Forecasting (2014)

fc <- fit_arima %>%
  forecast(h = "12 months")

fc %>%
  autoplot(power) +
  labs(title = "Forecast for 2014")

Back Transformation

fc_final <- fc %>%
  as_tibble() %>%   
  mutate(KWH = inv_box_cox(.mean, lambda))

fc_final

Export Results

write.csv(fc_final, "results_rcfl.csv", row.names = FALSE)

How the Model Solves the Business Problem

The ARIMA model captures both trend and seasonal patterns in electricity consumption and generates reliable monthly forecasts. These forecasts allow energy providers to anticipate demand, optimize production, and improve operational efficiency.

Conclusion

This analysis developed a forecasting model for residential electricity demand using historical monthly data. After handling missing values and applying a Box-Cox transformation, the ARIMA model was selected based on its superior performance and residual diagnostics. The forecasts for 2014 provide a reliable basis for planning and decision-making, helping energy providers manage resources efficiently and reduce operational risk.