This report covers:
Across all three parts, the modeling work depended on fixing data-format and structure problems before any forecasts could be trusted. The ATM file required Excel date conversion and removal of trailing non-series rows, the power file required interpolation of a missing monthly usage value, and the waterflow files required conversion from Excel serial timestamps into regular hourly series. Those cleanup steps were not cosmetic. Each one addressed a specific failure mode that would otherwise distort the time index, break the tsibble structure, or cause the forecasting functions to evaluate the wrong objects.
atm <- read_excel("ATM624Data.xlsx") |>
rename(Date = DATE) |>
mutate(Date = as.Date(Date, origin = "1899-12-30")) |>
filter(!is.na(ATM), !is.na(Date))
The ATM data looked straightforward at first, but two cleanup issues
had to be handled immediately. First, the DATE field came
from Excel as a serial date rather than a native R Date, so
using a plain as.Date() call pushed the observations into
the wrong century. Converting with the Excel origin
"1899-12-30" restored the intended 2009 to 2010 daily
calendar. Second, the worksheet included trailing rows with missing ATM
identifiers and no usable cash values. Those rows were removed because
they do not belong to any ATM time series and they caused grouped
modeling code to attempt fits on empty series. The unit interpretation
also needs one clarification: the Excel file itself labels the measure
only as Cash; the statement that cash is recorded in
hundreds of dollars comes from the assignment prompt, not from metadata
embedded in the workbook.
atm |>
ggplot(aes(Date, Cash)) +
geom_line() +
facet_wrap(~ATM, scales = "free_y", ncol = 1) +
labs(
title = "ATM cash withdrawal history",
y = "Cash (hundreds of dollars)"
)
The ATM4 panel includes one very large one-day spike that rises above
9,000 on the chart scale. The assignment specification, rather than the
workbook itself, states that Cash is recorded in hundreds
of dollars, so under the assignment’s unit convention the February 9,
2010 ATM4 value of 10,919.76 corresponds to roughly $1.09 million. The
value immediately drops back toward the usual range on the following
days, and even the 99th percentile of ATM4 is only about 1,488. In
practical terms, the chart is showing a single outlier stretching the
panel scale rather than a permanent regime change.
That outlier could reasonably be viewed as operationally implausible for an ATM, so imputing it would be a defensible modeling choice if we were willing to lean on domain knowledge about transaction limits and dispensing behavior. More specifically, ATMs in general do not dispense anything close to that amount in a single transaction, and even as a daily total this point is far outside the surrounding ATM4 pattern. In this report, however, the point was not replaced because the dataset itself does not provide enough metadata to verify whether the issue is a data-entry mistake, an aggregation problem, or some other recording error. Replacing it would therefore insert a guessed value without a confirmed correction target. The conservative treatment was to retain the observation but model ATM4 on the log scale, which reduces the leverage of that point while preserving the weekly pattern in the rest of the series. A reasonable alternative analysis could instead document an imputation on operational-plausibility grounds.
# Hold out the final 28 days for each ATM and compare candidate models.
atm_accuracy <- atm |>
group_by(ATM) |>
group_modify(~ {
dat <- .x |> as_tsibble(index = Date)
train <- head(dat, nrow(dat) - 28)
test <- tail(dat, 28)
if (nrow(train) == 0 || nrow(test) == 0 || all(is.na(train$Cash))) {
return(tibble())
}
fits <- list(
mean = train |> model(MEAN(Cash)),
naive = train |> model(NAIVE(Cash)),
snaive = train |> model(SNAIVE(Cash ~ lag("week"))),
sarima = train |> model(ARIMA(Cash)),
log_sarima = train |> mutate(log_cash = log1p(Cash)) |> model(ARIMA(log_cash))
)
bind_rows(
accuracy(forecast(fits$mean, h = 28), test) |> mutate(Model = "mean"),
accuracy(forecast(fits$naive, h = 28), test) |> mutate(Model = "naive"),
accuracy(forecast(fits$snaive, h = 28), test) |> mutate(Model = "snaive"),
accuracy(forecast(fits$sarima, h = 28), test) |> mutate(Model = "sarima"),
accuracy(forecast(fits$log_sarima, h = 28), test |> mutate(log_cash = log1p(Cash))) |> mutate(Model = "log_sarima")
)
})
atm_accuracy
## # A tibble: 20 × 12
## # Groups: ATM [4]
## ATM .model .type ME RMSE MAE MPE MAPE MASE RMSSE
## <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 ATM1 "MEAN(Cash)" Test -9.24 32.9 20.1 -3.37e2 3.48e2 NaN NaN
## 2 ATM1 "NAIVE(Cash)" Test -21.6 38.2 24.9 -4.01e2 4.04e2 NaN NaN
## 3 ATM1 "SNAIVE(Cash… Test -12.2 16.9 14.7 -7.63e1 7.92e1 NaN NaN
## 4 ATM1 "ARIMA(Cash)" Test -7.79 12.7 10.4 -4.50e1 4.80e1 NaN NaN
## 5 ATM1 "ARIMA(log_c… Test -0.167 0.325 0.204 -8.79e0 9.61e0 NaN NaN
## 6 ATM2 "MEAN(Cash)" Test -1.94 37.6 32.3 -7.11e2 7.43e2 NaN NaN
## 7 ATM2 "NAIVE(Cash)" Test 19.8 42.4 39.6 -4.30e2 4.98e2 NaN NaN
## 8 ATM2 "SNAIVE(Cash… Test 14.5 28.6 18.6 4.09e1 4.68e1 NaN NaN
## 9 ATM2 "ARIMA(Cash)" Test 11.0 20.9 15.1 3.57e0 3.77e1 NaN NaN
## 10 ATM2 "ARIMA(log_c… Test 0.425 0.579 0.463 1.72e1 1.81e1 NaN NaN
## 11 ATM3 "MEAN(Cash)" Test 9.39 28.8 9.39 1 e2 1 e2 NaN NaN
## 12 ATM3 "NAIVE(Cash)" Test 9.39 28.8 9.39 1 e2 1 e2 NaN NaN
## 13 ATM3 "SNAIVE(Cash… Test 9.39 28.8 9.39 1 e2 1 e2 NaN NaN
## 14 ATM3 "ARIMA(Cash)" Test 9.39 28.8 9.39 1 e2 1 e2 NaN NaN
## 15 ATM3 "ARIMA(log_c… Test 0.480 1.47 0.480 1 e2 1 e2 NaN NaN
## 16 ATM4 "MEAN(Cash)" Test -102. 297. 252. -1.31e3 1.33e3 NaN NaN
## 17 ATM4 "NAIVE(Cash)" Test -439. 520. 444. -2.29e3 2.29e3 NaN NaN
## 18 ATM4 "SNAIVE(Cash… Test -92.4 272. 196. -3.48e2 3.64e2 NaN NaN
## 19 ATM4 "ARIMA(Cash)" Test -102. 297. 252. -1.31e3 1.33e3 NaN NaN
## 20 ATM4 "ARIMA(log_c… Test -0.287 1.34 0.989 -2.20e1 3.29e1 NaN NaN
## # ℹ 2 more variables: ACF1 <dbl>, Model <chr>
Model comparison for the ATM section also needed one defensive fix. Because grouped forecasting code is sensitive to malformed or empty partitions, the evaluation step now skips any ATM subset that would produce an empty training window or contain no usable cash history. This keeps the holdout comparison stable and makes the report reproducible even if stray rows reappear in the source workbook. Conceptually, the ATM series also differ in behavior enough that the same model should not be forced onto all four machines: ATM1 and ATM4 show variance patterns that benefit from log transformation, ATM2 is stable enough to model on the original scale, and ATM3 is mostly inactive so a zero forecast remains the most interpretable choice.
atm_ts <- atm |> as_tsibble(key = ATM, index = Date)
atm1_fit <- atm_ts |> filter(ATM == "ATM1") |> mutate(log_cash = log1p(Cash)) |> model(arima = ARIMA(log_cash))
atm2_fit <- atm_ts |> filter(ATM == "ATM2") |> model(arima = ARIMA(Cash))
atm4_fit <- atm_ts |> filter(ATM == "ATM4") |> mutate(log_cash = log1p(Cash)) |> model(arima = ARIMA(log_cash))
atm1_fc <- forecast(atm1_fit, h = "31 days") |> as_tibble() |> transmute(Date, ATM = "ATM1", Forecast_HundredsUSD = exp(.mean) - 1)
atm2_fc <- forecast(atm2_fit, h = "31 days") |> as_tibble() |> transmute(Date, ATM = "ATM2", Forecast_HundredsUSD = .mean)
atm3_fc <- tibble(Date = seq(as.Date("2010-05-01"), as.Date("2010-05-31"), by = "day"),
ATM = "ATM3",
Forecast_HundredsUSD = 0)
atm4_fc <- forecast(atm4_fit, h = "31 days") |> as_tibble() |> transmute(Date, ATM = "ATM4", Forecast_HundredsUSD = exp(.mean) - 1)
atm_forecast <- bind_rows(atm1_fc, atm2_fc, atm3_fc, atm4_fc) |>
mutate(Forecast_USD = Forecast_HundredsUSD * 100)
atm_forecast
## # A tibble: 124 × 4
## Date ATM Forecast_HundredsUSD Forecast_USD
## <date> <chr> <dbl> <dbl>
## 1 2010-05-01 ATM1 85.2 8519.
## 2 2010-05-02 ATM1 95.6 9559.
## 3 2010-05-03 ATM1 74.9 7487.
## 4 2010-05-04 ATM1 3.42 342.
## 5 2010-05-05 ATM1 98.9 9891.
## 6 2010-05-06 ATM1 77.4 7739.
## 7 2010-05-07 ATM1 83.7 8372.
## 8 2010-05-08 ATM1 84.8 8484.
## 9 2010-05-09 ATM1 95.4 9539.
## 10 2010-05-10 ATM1 77.7 7769.
## # ℹ 114 more rows
atm_forecast |>
ggplot(aes(Date, Forecast_HundredsUSD)) +
geom_line() +
facet_wrap(~ATM, scales = "free_y", ncol = 1) +
labs(title = "May 2010 ATM forecasts", y = "Forecast (hundreds of dollars)")
power <- read_excel("ResidentialCustomerForecastLoad-624.xlsx") |>
mutate(
Date = yearmonth(`YYYY-MMM`),
KWH = as.numeric(KWH)
) |>
as_tsibble(index = Date)
power <- power |>
mutate(KWH = if_else(is.na(KWH), (lag(KWH) + lead(KWH)) / 2, KWH))
The residential power series was much cleaner structurally, but it
still required a targeted repair before modeling. The monthly date stamp
was stored as a character label such as YYYY-MMM, so it was
converted into a proper yearmonth index to preserve monthly
spacing. One monthly KWH value was missing in September
2008. Because the surrounding pattern is smooth, strongly seasonal, and
does not show abrupt local shocks at that location, that single gap was
filled using the average of the adjacent months. This is a reasonable
interpolation for a lone interior missing value and preserves the
continuity needed for ETS and ARIMA estimation.
power |>
autoplot(KWH) +
labs(title = "Residential power usage", y = "KWH")
The most visually striking anomaly in the power chart is not in January 2010 but in July 2010, where usage drops to 770,523 KWH. That value is far below the neighboring months of June 2010 at 6,696,292 and August 2010 at 7,922,701, so it behaves like an isolated low outlier rather than part of the normal annual cycle. In this report, that point was retained rather than imputed. The reason is that the file provides no external metadata indicating whether the dip reflects a billing artifact, a reporting problem, a partial-month measurement, or a true but unusual drop in usage. Because there was no verified correction target, the cleanup step was limited to the documented missing value in September 2008. The ETS model therefore absorbs the July 2010 dip as part of the observed history, with the understanding that it may slightly increase forecast uncertainty around the seasonal pattern.
power_train <- head(power, nrow(power) - 12)
power_test <- tail(power, 12)
power_fits <- power_train |>
model(
mean = MEAN(KWH),
naive = NAIVE(KWH),
snaive = SNAIVE(KWH),
ets = ETS(KWH),
arima = ARIMA(KWH)
)
power_fc <- forecast(power_fits, h = "12 months")
accuracy(power_fc, power_test)
## # A tibble: 5 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 arima Test 693868. 1579340. 1025467. 7.71 12.4 NaN NaN 0.0146
## 2 ets Test 398169. 1057623. 692844. 4.14 8.31 NaN NaN 0.0850
## 3 mean Test 1189879. 1965309. 1546469. 12.1 18.2 NaN NaN 0.165
## 4 naive Test 1008641. 1861179. 1501420 9.59 18.0 NaN NaN 0.165
## 5 snaive Test 405195. 1035538. 618606. 4.55 7.06 NaN NaN -0.0313
There was also an evaluation issue in this section: forecast accuracy has to be computed on actual forecasts, not on the fitted model objects themselves. The workflow therefore generates a 12-month holdout forecast first and only then compares those values with the test year. That change does not alter the underlying models, but it is essential for producing valid accuracy metrics and for making the model-selection table reflect true out-of-sample performance.
power_fit <- power |>
model(ets = ETS(KWH))
power_fc <- forecast(power_fit, h = "12 months")
power_fc
## # A fable: 12 x 4 [1M]
## # Key: .model [1]
## .model Date
## <chr> <mth>
## 1 ets 2014 Jan
## 2 ets 2014 Feb
## 3 ets 2014 Mar
## 4 ets 2014 Apr
## 5 ets 2014 May
## 6 ets 2014 Jun
## 7 ets 2014 Jul
## 8 ets 2014 Aug
## 9 ets 2014 Sep
## 10 ets 2014 Oct
## 11 ets 2014 Nov
## 12 ets 2014 Dec
## # ℹ 2 more variables: KWH <dist>, .mean <dbl>
autoplot(power, KWH) +
autolayer(power_fc, level = NULL) +
labs(title = "Residential power forecast for 2014", y = "KWH")
read_pipe_hourly <- function(path) {
read_excel(path) |>
rename(DateTime = 1, WaterFlow = 2) |>
mutate(
DateTime = as_datetime(DateTime * 86400, origin = "1899-12-30", tz = "UTC"),
Hour = floor_date(DateTime, "hour")
) |>
summarise(WaterFlow = mean(WaterFlow, na.rm = TRUE), .by = Hour) |>
as_tsibble(index = Hour) |>
fill_gaps() |>
mutate(WaterFlow = zoo::na.approx(WaterFlow, na.rm = FALSE)) |>
tidyr::fill(WaterFlow, .direction = "downup")
}
pipe1 <- read_pipe_hourly("Waterflow_Pipe1.xlsx")
pipe2 <- read_pipe_hourly("Waterflow_Pipe2.xlsx")
The waterflow files presented the most substantial preprocessing challenge. Their timestamps were stored as Excel serial datetimes, not as text datetimes, so parsing them with a string-based function produced missing times and caused the tsibble conversion to fail. The fix was to convert the numeric values from Excel time into valid POSIX datetimes first, then floor them to the hour and aggregate within each hour using the mean flow. That hourly aggregation directly addresses the irregular observation times described in the methods notes and creates a regular series suitable for stationarity testing and forecasting.
After aggregation, any missing hourly positions were made explicit
with fill_gaps(). Internal gaps were interpolated with
zoo::na.approx(), and any boundary gaps were filled
forward/backward so the final hourly series remained complete. This
sequence matters because the stationarity tests and fable models both
assume a valid, ordered index without missing timestamps. In practical
terms, the cleanup converts two irregular operational logs into
comparable hourly process series, which then makes it reasonable to use
a simple naïve model for Pipe 1 and an ARIMA-based model for Pipe 2.
adf.test(pipe1$WaterFlow)
##
## Augmented Dickey-Fuller Test
##
## data: pipe1$WaterFlow
## Dickey-Fuller = -6.3308, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary
kpss.test(pipe1$WaterFlow)
##
## KPSS Test for Level Stationarity
##
## data: pipe1$WaterFlow
## KPSS Level = 0.19284, Truncation lag parameter = 4, p-value = 0.1
adf.test(pipe2$WaterFlow)
##
## Augmented Dickey-Fuller Test
##
## data: pipe2$WaterFlow
## Dickey-Fuller = -9.1249, Lag order = 9, p-value = 0.01
## alternative hypothesis: stationary
kpss.test(pipe2$WaterFlow)
##
## KPSS Test for Level Stationarity
##
## data: pipe2$WaterFlow
## KPSS Level = 0.47514, Truncation lag parameter = 7, p-value = 0.04727
pipe1_fit <- pipe1 |> model(naive = NAIVE(WaterFlow))
pipe2_fit <- pipe2 |> model(arima = ARIMA(WaterFlow))
pipe1_fc <- forecast(pipe1_fit, h = "168 hours")
pipe2_fc <- forecast(pipe2_fit, h = "168 hours")
autoplot(pipe1, WaterFlow) + autolayer(pipe1_fc, level = NULL) + labs(title = "Pipe 1 one-week forecast")
autoplot(pipe2, WaterFlow) + autolayer(pipe2_fc, level = NULL) + labs(title = "Pipe 2 one-week forecast")
write_xlsx(
list(
ATM_May2010 = atm_forecast,
Power_2014 = as_tibble(power_fc),
Pipe1_WeekAhead = as_tibble(pipe1_fc),
Pipe2_WeekAhead = as_tibble(pipe2_fc)
),
"forecast_outputs.xlsx"
)