Accurately forecasting cash demand is important for ensuring ATM availability while minimizing excess cash holdings. In this report, historical ATM withdrawal data is used to forecast daily cash withdrawals for four ATM machines for May 2010. The analysis follows a practical, business-oriented approach using standard time-series forecasting techniques. Results are presented through visualizations and summarized forecasts, and the final output is provided in an Excel-readable format to support operational planning and decision-making.
This report forecasts daily cash withdrawals for each ATM for May 1–May 31, 2010 using the provided ATM624 dataset. Because the variable Cash is recorded in hundreds of dollars, the analysis first converts Cash to dollars and then builds time-series models per ATM. The goal is to produce a practical forecast that can be used for staffing and cash-replenishment planning, along with visuals and an Excel-readable forecast file.
The working directory is set to the folder containing this Rmd so the data file can be read reliably when knitting.
These constants define the training cutoff (end of April 2010), the May 2010 forecast window, and a zoom window used in one plot.
The dataset is loaded, the date is converted to a proper Date, and Cash is converted from hundreds of dollars into dollars. The result is converted into a tsibble so each ATM becomes its own time series.
A “future” dataset is created for May 1–May 31, 2010 for each ATM. This is the date grid the models will forecast into.
Models are trained using data before May 1, 2010. ETS and ARIMA are fit per ATM, and if ARIMA fails for an ATM, the workflow falls back to ETS to ensure a forecast is produced.
The May 2010 forecasts are produced from both models and then filtered so each ATM keeps only its selected “best” model (ARIMA when available, otherwise ETS).
fc_ets <- fits %>%
select(ATM, ets) %>%
forecast(new_data = new_data) %>%
mutate(model_used = "ets")
fc_arima <- fits %>%
select(ATM, arima) %>%
forecast(new_data = new_data) %>%
mutate(model_used = "arima")
may_fc <- bind_rows(fc_ets, fc_arima) %>%
left_join(fits %>% as_tibble() %>% select(ATM, best), by = "ATM") %>%
filter(model_used == best)This plot shows the full history of daily cash withdrawals by ATM to provide context for the forecast period.
atm_ts %>%
autoplot(Cash_dollars) +labs( title = "Daily Cash Withdrawals by ATM",
y = "Cash Withdrawn ($)",x = "Date")This plot overlays the May 2010 forecast on top of the historical data and includes uncertainty bands.
may_fc %>% autoplot(atm_ts, level = 95) + labs( title = "Forecast for May 2010 (ARIMA with ETS fallback)", y = "Cash Withdrawn ($)",x = "Date")This zoomed view focuses on the most recent months leading into May, which makes the forecast region easier to read.
atm_recent <- atm_ts %>% filter(DATE >= ZOOM_START)
may_fc %>% autoplot(atm_recent, level = 95) + labs(title = "Forecast for May 2010 (Zoomed In)",y = "Cash Withdrawn ($)",x = "Date")This table shows which model was used for each ATM after applying the ARIMA/ETS fallback rule.
## # A tibble: 4 × 2
## ATM model_used
## <chr> <chr>
## 1 ATM1 arima
## 2 ATM2 arima
## 3 ATM3 arima
## 4 ATM4 arima
The forecast is summarized into a May total and an average daily value per ATM, which is useful for comparing machines and planning cash loading.
may_summary <- may_fc %>%
mutate(Forecast = as.numeric(.mean)) %>%
group_by(ATM) %>%
summarise(
May2010_Total = sum(Forecast, na.rm = TRUE),
Avg_Daily = mean(Forecast, na.rm = TRUE),
.groups = "drop"
)
may_summary## # A tsibble: 124 x 4 [1D]
## # Key: ATM [4]
## ATM DATE May2010_Total Avg_Daily
## <chr> <date> <dbl> <dbl>
## 1 ATM1 2010-05-01 8519. 8519.
## 2 ATM1 2010-05-02 9810. 9810.
## 3 ATM1 2010-05-03 7546. 7546.
## 4 ATM1 2010-05-04 353. 353.
## 5 ATM1 2010-05-05 9849. 9849.
## 6 ATM1 2010-05-06 7814. 7814.
## 7 ATM1 2010-05-07 8398. 8398.
## 8 ATM1 2010-05-08 8490. 8490.
## 9 ATM1 2010-05-09 9749. 9749.
## 10 ATM1 2010-05-10 7764. 7764.
## # ℹ 114 more rows
This bar chart compares the forecasted total withdrawals across ATMs for the month of May 2010.
may_summary %>%
ggplot(aes(x = ATM, y = May2010_Total)) +
geom_col() +
labs(
title = "Forecasted Total Cash Withdrawals – May 2010",
y = "Total ($)",
x = "ATM"
)The daily May 2010 forecast is exported as a CSV for submission. It includes the forecast mean and 95% prediction intervals.
may_export <- may_fc %>%
as_tibble() %>%
transmute(
ATM,
DATE,
Forecast = as.numeric(.mean)
) %>%
arrange(ATM, DATE)
write.csv(may_export, "ATM_May2010_Forecast.csv", row.names = FALSE)This analysis produced daily cash withdrawal forecasts for each ATM for May 2010 using time-series models built on historical transaction data. By converting the cash values into dollars and modeling each ATM separately, the forecasts reflect differences in usage patterns across machines. The combination of ARIMA and ETS models provided flexibility, with a fallback approach ensuring a forecast was generated for every ATM. The final results, supported by visualizations and an Excel-readable output, offer a practical estimate of expected cash demand that can be used for operational planning. While the models rely solely on historical patterns and do not incorporate external factors such as holidays or special events, they provide a reasonable and transparent baseline forecast for decision-making.
This section models monthly residential electricity consumption (KWH) from January 1998 through December 2013 and produces a monthly forecast for all of 2014. The goal is to generate a practical, reproducible forecast with visuals and an Excel-readable output file.
The dataset is loaded from a single Excel file and converted into a monthly time series. The KWH variable represents monthly power usage in kilowatt-hours. Converting the date column into a proper monthly index ensures models can correctly learn long-term trend and yearly seasonality.
The dataset is read from a single CSV file. A quick preview is included to confirm the column names and verify that the data loaded correctly before converting it into a time-series format.
power_raw_csv <- read.csv("ResidentialCustomerForecastLoad-624.csv", stringsAsFactors = FALSE)
# quick check
head(power_raw_csv)## CaseSequence YYYY.MMM KWH
## 1 733 1998-Jan 6862583
## 2 734 1998-Feb 5838198
## 3 735 1998-Mar 5420658
## 4 736 1998-Apr 5010364
## 5 737 1998-May 4665377
## 6 738 1998-Jun 6467147
## 'data.frame': 192 obs. of 3 variables:
## $ CaseSequence: int 733 734 735 736 737 738 739 740 741 742 ...
## $ YYYY.MMM : chr "1998-Jan" "1998-Feb" "1998-Mar" "1998-Apr" ...
## $ KWH : int 6862583 5838198 5420658 5010364 4665377 6467147 8914755 8607428 6989888 6345620 ...
The YYYY.MMM column is converted into a monthly date index and the dataset is converted into a tsibble. This ensures the forecasting models correctly interpret the data as monthly observations with yearly seasonality.
power_ts <- power_raw_csv %>%
transmute(
DATE = yearmonth(parse_date_time(paste0(.data[["YYYY.MMM"]], "-01"),
orders = c("Y-b-d", "Y-B-d"))),
KWH = as.numeric(.data[["KWH"]])
) %>%
arrange(DATE) %>%
as_tsibble(index = DATE)
power_ts## # A tsibble: 192 x 2 [1M]
## DATE KWH
## <mth> <dbl>
## 1 1998 Jan 6862583
## 2 1998 Feb 5838198
## 3 1998 Mar 5420658
## 4 1998 Apr 5010364
## 5 1998 May 4665377
## 6 1998 Jun 6467147
## 7 1998 Jul 8914755
## 8 1998 Aug 8607428
## 9 1998 Sep 6989888
## 10 1998 Oct 6345620
## # ℹ 182 more rows
Models are trained using data through December 2013 and used to generate forecasts for January–December 2014. Creating an explicit 2014 “future” index ensures the forecast output has correct monthly timestamps.
Because the series is monthly and spans many years, the models should capture trend and annual seasonality. ETS and ARIMA are used as standard forecasting models, and a seasonal naïve method is included as a simple baseline for comparison.
fits_b <- train_b %>%
model(
snaive = SNAIVE(KWH ~ season("year")),
ets = ETS(KWH),
arima = ARIMA(KWH)
)
fits_b## # A mable: 1 x 3
## snaive ets arima
## <model> <model> <model>
## 1 <NULL model> <NULL model> <ARIMA(0,0,2)(2,1,0)[12] w/ drift>
To choose a final model, the year 2013 is treated as a holdout period. Models are fit on data through 2012 and evaluated by forecasting 2013, then comparing forecast accuracy to the actual 2013 values.
train_2012 <- power_ts %>% filter(DATE < yearmonth("2013 Jan"))
test_2013 <- power_ts %>% filter(DATE >= yearmonth("2013 Jan") & DATE < yearmonth("2014 Jan"))
fits_2012 <- train_2012 %>%
model(
snaive = SNAIVE(KWH ~ season("year")),
ets = ETS(KWH),
arima = ARIMA(KWH)
)
acc_b <- fits_2012 %>%
forecast(new_data = test_2013) %>%
accuracy(test_2013)
acc_b## # A tibble: 3 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 arima Test 467241. 1500763. 971623. 4.63 11.8 NaN NaN 0.0219
## 2 ets Test NaN NaN NaN NaN NaN NaN NaN NA
## 3 snaive Test NaN NaN NaN NaN NaN NaN NaN NA
To choose a final model, the year 2013 is treated as a holdout period. Models are fit on data through 2012 and evaluated by forecasting 2013, then comparing forecast accuracy to the actual 2013 values.
## [1] "arima"
The final model is selected using the holdout accuracy results. In this report, the model with the lowest RMSE is chosen as the best-performing option for forecasting 2014.
final_fit_b <- train_b %>%
model(
snaive = SNAIVE(KWH ~ season("year")),
ets = ETS(KWH),
arima = ARIMA(KWH)
)
fc_2014 <- final_fit_b %>%
select(!!best_model) %>%
forecast(new_data = new_data_2014)
fc_2014## # A fable: 12 x 4 [1M]
## # Key: .model [1]
## .model DATE KWH .mean
## <chr> <mth> <dist> <dbl>
## 1 arima 2014 Jan N(9790458, 7.8e+11) 9790458.
## 2 arima 2014 Feb N(8624753, 8e+11) 8624753.
## 3 arima 2014 Mar N(6623643, 8.1e+11) 6623643.
## 4 arima 2014 Apr N(6e+06, 8.1e+11) 5974459.
## 5 arima 2014 May N(5935072, 8.1e+11) 5935072.
## 6 arima 2014 Jun N(8199158, 8.1e+11) 8199158.
## 7 arima 2014 Jul N(9498803, 8.1e+11) 9498803.
## 8 arima 2014 Aug N(1e+07, 8.1e+11) 9999104.
## 9 arima 2014 Sep N(8456634, 8.1e+11) 8456634.
## 10 arima 2014 Oct N(5860011, 8.1e+11) 5860011.
## 11 arima 2014 Nov N(6143003, 8.1e+11) 6143003.
## 12 arima 2014 Dec N(8245603, 8.1e+11) 8245603.
## [1] "KWH"
This plot shows the full history of monthly residential electricity consumption. It helps confirm overall trend and repeating seasonal patterns before focusing on the forecast period.
power_ts %>%
autoplot(KWH) +
labs(
title = "Residential Power Usage (Monthly KWH), 1998–2013",
x = "Month",
y = "KWH"
)This plot overlays the 2014 forecast on the historical data and includes uncertainty bands. The result provides a clear view of the expected seasonal pattern across the forecast year.
fc_2014 %>%
autoplot(power_ts, level = 95) +
labs(
title = paste("2014 Monthly Power Forecast (", best_model, ")", sep = ""),
x = "Month",
y = "KWH"
)
## Export Forecast to Excel-Readable File The final step exports the
2014 monthly forecast to a CSV file that can be opened in Excel. The
output includes the forecast mean and 95% prediction interval
bounds.
if (length(dist_col) == 0) {
stop("No distribution column found in fc_2014. Recreate fc_2014 directly from forecast() and try again.")
}
power_export_2014 <- fc_2014 %>%
mutate(`95%` = hilo(.data[[dist_col[1]]], 95)) %>%
unpack_hilo(`95%`) %>%
as_tibble() %>%
transmute(
Month = as.character(DATE),
Forecast_KWH = as.numeric(.mean),
Lo95 = as.numeric(`95%_lower`),
Hi95 = as.numeric(`95%_upper`)
)
write.csv(power_export_2014, "ResidentialPower_Forecast_2014.csv", row.names = FALSE)
power_export_2014## # A tibble: 12 × 4
## Month Forecast_KWH Lo95 Hi95
## <chr> <dbl> <dbl> <dbl>
## 1 2014 Jan 9790458. 8059904. 11521011.
## 2 2014 Feb 8624753. 6868233. 10381273.
## 3 2014 Mar 6623643. 4864954. 8382333.
## 4 2014 Apr 5974459. 4215769. 7733148.
## 5 2014 May 5935072. 4176382. 7693762.
## 6 2014 Jun 8199158. 6440469. 9957848.
## 7 2014 Jul 9498803. 7740113. 11257493.
## 8 2014 Aug 9999104. 8240414. 11757794.
## 9 2014 Sep 8456634. 6697944. 10215323.
## 10 2014 Oct 5860011. 4101322. 7618701.
## 11 2014 Nov 6143003. 4384314. 7901693.
## 12 2014 Dec 8245603. 6486913. 10004292.
## [1] "CaseSequence" "YYYY.MMM" "KWH"
Using monthly KWH data from 1998–2013, this analysis generated a monthly forecast for 2014 that reflects both long-term trend and strong annual seasonality. The final model was selected using a simple holdout accuracy check and the forecast was exported in an Excel-ready format for practical use.