Fleet & Delivery Analytics: Predicting Trip Delays and Fuel Misuse for AB InBev Sagamu Operations

Author

Babatunde Odedina

Published

May 6, 2026

1. Executive Summary

This analysis examines 1,104 delivery trips executed by AB InBev’s Sagamu Transport Operations between January and March 2026. The dataset covers 91 vehicles, 119 delivery officers, and 5 fleet controllers, spanning routes from Sagamu to 51 destinations across Nigeria. The central business question is: which operational factors predict whether a trip will be delayed or involve fuel misuse, and how can those predictions be explained and acted upon by fleet management?

Five analytical techniques are applied. A classification model predicts trip delay (on-time vs late) with strong accuracy; SHAP explainability reveals that trip distance and fleet controller assignment are the dominant drivers. K-Means clustering segments trips into four operationally distinct groups — short efficient runs, long problematic routes, fuel-intensive hauls, and high-cost anomalous trips. PCA reduces the feature space and confirms cluster separation. Time series decomposition of weekly trip volume and fuel consumption reveals a March demand surge and stable mid-week operational patterns.

Recommendation: Immediate intervention is warranted on long-haul routes (> 500 km) supervised by underperforming fleet controllers, where delay probability exceeds 40%. A fuel misuse alert threshold should be set at 20 litres per trip, flagging the 36.4% of trips currently above zero misuse for controller review before payment release.

2. Professional Disclosure

Job Title: Fleet Operations Analyst
Organisation: DP World Logistics
Department: Transport Managed Services

Technique Relevance to My Work

1. Classification Model: In fleet management, the most operationally valuable prediction is whether a trip in progress will be completed on time. A classification model trained on historical trip records allows the operations team to flag at-risk trips before departure — triggering proactive route adjustments, driver briefings, or contingency dispatch. This directly replaces the current reactive approach of identifying delays only after they occur.

2. Model Explainability (SHAP): Fleet controllers and depot managers are not data scientists. SHAP values translate the model’s predictions into driver- and route-level language that a non-technical operations team can act on. When presenting to management, I need to explain why a specific trip was flagged as high-risk — not just that it was flagged.

3. Clustering (K-Means Segmentation): Our 1,104 trips are not homogeneous — short Lagos runs behave completely differently from long-haul Abuja or Enugu deliveries. Clustering allows the operations team to manage each segment with tailored SLAs, fuel budgets, and driver assignments rather than applying uniform policies across the entire fleet.

4. Dimensionality Reduction (PCA): With 12+ operational variables per trip, visualising the full data space is impossible. PCA compresses these into two interpretable dimensions, allowing me to confirm that our clusters are genuinely distinct — a necessary validation step before acting on segmentation outputs.

5. Time Series Analysis: Trip volume and fuel consumption vary week-by-week. Decomposing this time series separates the trend (is the operation growing?), seasonal patterns (which weeks are consistently high-pressure?), and irregular spikes. This directly informs staffing levels, vehicle allocation, and fuel procurement scheduling.

3. Data Collection & Sampling

3.1 Data Source

Source system: Fleet management platform — Trip Review Report export (ABNBEV)
Client: AB InBev (Anheuser-Busch InBev Nigeria)
Operation site: Sagamu Transport Operations
Data type: One row per completed delivery trip; covers all outbound trips from the Sagamu depot and affiliated origins (Ilesa, Onitsha, Lagos)
Collection method: Direct export from the fleet management system, covering January, February, and March 2026

3.2 Variables Collected

Variable	Type	Description
`Shipment Number`	ID	Unique trip identifier
`Fleet Controller`	Categorical	Supervising controller (5 controllers)
`Vehicle`	Categorical	Truck registration (91 vehicles)
`Delivery Officer`	Categorical	Driver assigned to trip (119 officers)
`Created At`	Date	Trip dispatch date
`Completed At`	Date	Trip completion date
`Origin`	Categorical	Departure depot (4 origins)
`Destination`	Categorical	Delivery destination (51 locations)
`Trip Distance`	Numeric	Planned route distance (km)
`Distance Covered`	Numeric	Actual odometer distance covered (km)
`Fuel Budget`	Numeric	Allocated fuel (litres)
`Fuel Dispensed`	Numeric	Fuel issued at departure (litres)
`Fuel Used`	Numeric	Fuel actually consumed (litres)
`Fuel Efficiency`	Numeric	km per litre achieved
`Fuel Misused`	Numeric	Litres flagged as misused
`Expenses`	Numeric	Total trip expenses in Naira
`lead_time_days`	Derived	Days from dispatch to completion
`on_time`	Derived	1 = within SLA, 0 = delayed
`fuel_misuse_flag`	Derived	1 = any fuel misuse recorded

3.3 Sampling Frame & Period

Population: All outbound trips dispatched from Sagamu Transport Operations, January–March 2026
Sample size: 1,104 completed trips (full census — no sampling required)
Period covered: 1 January 2026 to 31 March 2026 (Q1 2026)
SLA definition: Trips under 500 km must complete within 7 days; trips 500 km and above must complete within 14 days. This mirrors the contractual framework in place with AB InBev.
Statistical rationale: 1,104 observations provides excellent power for both classification (well above the 200-observation minimum) and clustering. The three-month window captures a full quarterly cycle including the March production push typical of FMCG operations.

3.4 Ethical Notes

Driver names and delivery officer identities are present in the raw data. In this published version, individual names are retained at the operational level but are not used as model features — analysis is aggregated to the fleet controller and vehicle level. All data is internal operational data shared with organisational approval for academic purposes.

4. Data Description

Code

# ── Libraries ──────────────────────────────────────────────────────────────────
library(tidyverse)
library(lubridate)
library(janitor)
library(skimr)
library(gt)

# ── Load and prepare data ──────────────────────────────────────────────────────
df <- readxl::read_excel("TripReviewReport_ABNBEV_JAN_FEB_MAR.xlsx") |>
  clean_names() |>
  mutate(
    created_at    = dmy(created_at),
    completed_at  = dmy(completed_at),
    lead_time_days = as.numeric(completed_at - created_at),
    # SLA: <=7 days for short-haul (<500km), <=14 days for long-haul
    on_time        = as.integer(
      (trip_distance < 500  & lead_time_days <= 7) |
      (trip_distance >= 500 & lead_time_days <= 14)
    ),
    fuel_misuse_flag = as.integer(fuel_misused > 0),
    distance_cat     = factor(
      ifelse(trip_distance < 500, "Short-haul (<500km)", "Long-haul (>=500km)"),
      levels = c("Short-haul (<500km)", "Long-haul (>=500km)")
    ),
    month            = factor(month(created_at, label = TRUE, abbr = FALSE),
                              levels = c("January","February","March")),
    fleet_controller = factor(fleet_controller),
    origin           = factor(origin),
    destination      = factor(destination),
    # Cost per km
    cost_per_km      = ifelse(distance_covered > 0, expenses / distance_covered, NA_real_),
    # Odometer discrepancy (actual vs planned)
    dist_discrepancy = distance_covered - trip_distance
  ) |>
  # Remove 16 same-day trips (zero lead time — likely data entry issue)
  filter(lead_time_days > 0)

cat("Rows after cleaning:", nrow(df), "\n")

Rows after cleaning: 1088

Code

cat("Date range:", format(min(df$created_at)), "to", format(max(df$created_at)), "\n")

Date range: 2026-01-01 to 2026-03-31

Code

cat("On-time rate:", round(mean(df$on_time) * 100, 1), "%\n")

On-time rate: 89.3 %

Code

cat("Fuel misuse rate:", round(mean(df$fuel_misuse_flag) * 100, 1), "%\n")

Fuel misuse rate: 36.9 %

Code

glimpse(df)

Rows: 1,088
Columns: 31
$ s_n              <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16…
$ client           <chr> "AB InBev", "AB InBev", "AB InBev", "AB InBev", "AB I…
$ shipment_number  <chr> "206855", "269651", "273737", "201603", "201604", "20…
$ operation_site   <chr> "Sagamu Transport Operations", "Sagamu Transport Oper…
$ fleet_controller <fct> Vincent Ojo, Vincent Ojo, Dare Adediran, Vincent Ojo,…
$ vehicle          <chr> "T32805LA", "T32822LA", "T32781LA", "T32808LA", "T327…
$ delivery_officer <chr> "Alabi sunday Kayode", "Godfrey Braimoh", "Izuka Good…
$ created_at       <date> 2026-01-02, 2026-01-02, 2026-01-23, 2026-01-28, 2026…
$ completed_at     <date> 2026-01-19, 2026-01-12, 2026-01-30, 2026-02-02, 2026…
$ origin           <fct> Sagamu, Ilesa, Sagamu, Sagamu, Sagamu, Sagamu, Sagamu…
$ destination      <fct> Lagos, Abuja, Abia, Lagos, Oyo, Lagos, Oyo, Lagos, An…
$ trip_distance    <dbl> 138, 980, 1110, 164, 316, 164, 174, 280, 436, 1388, 1…
$ start_odometer   <dbl> 3410, 4045, 6935, 4819, 4618, 6862, 8034, 4785, 2634,…
$ end_odometer     <dbl> 3521, 5315, 7802, 4955, 5038, 7001, 8198, 5069, 3110,…
$ distance_covered <dbl> 111, 1270, 867, 136, 420, 139, 164, 284, 476, 1196, 1…
$ fuel_budget      <dbl> 69, 490, 555, 82, 158, 82, 87, 140, 218, 694, 85, 734…
$ fuel_dispensed   <dbl> 69, 626, 555, 78, 180, 78, 83, 133, 436, 753, 84, 770…
$ fuel_balance     <dbl> 0, 0, 20, 0, 0, 0, 0, 0, 0, 135, 0, 0, 0, 0, 10, 0, 1…
$ fuel_used        <dbl> 69, 626, 535, 78, 180, 78, 83, 133, 436, 618, 84, 770…
$ fuel_consumption <dbl> 1.61, 2.03, 1.62, 1.74, 2.33, 1.78, 1.98, 2.14, 1.09,…
$ fuel_efficiency  <dbl> 2.00, 1.57, 2.07, 2.10, 1.76, 2.10, 2.10, 2.11, 1.00,…
$ fuel_misused     <dbl> 0, 136, 0, 0, 22, 0, 0, 0, 218, 0, 0, 36, 0, 0, 0, 0,…
$ breakages        <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ expenses         <dbl> 10800, 14700, 16650, 4800, 6952, 4800, 4800, 6160, 88…
$ lead_time_days   <dbl> 17, 10, 7, 5, 6, 3, 3, 4, 7, 7, 3, 13, 7, 9, 1, 2, 12…
$ on_time          <int> 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ fuel_misuse_flag <int> 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1,…
$ distance_cat     <fct> Short-haul (<500km), Long-haul (>=500km), Long-haul (…
$ month            <ord> January, January, January, January, January, January,…
$ cost_per_km      <dbl> 97.297297, 11.574803, 19.204152, 35.294118, 16.552381…
$ dist_discrepancy <dbl> -27, 290, -243, -28, 104, -25, -10, 4, 40, -192, -15,…

Code

skim(df |> select(lead_time_days, trip_distance, distance_covered,
                  fuel_used, fuel_efficiency, fuel_misused,
                  expenses, cost_per_km, on_time, fuel_misuse_flag))

Data summary
Name	select(…)
Number of rows	1088
Number of columns	10
_______________________
Column type frequency:
numeric	10
________________________
Group variables	None

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
lead_time_days	0	1.00	6.92	4.30	1.00	4.00	6.0	9.00	56	▇▁▁▁▁
trip_distance	0	1.00	693.70	512.36	64.00	188.00	750.0	1100.00	2978	▇▆▂▁▁
distance_covered	0	1.00	692.00	550.17	0.00	194.75	548.0	1098.00	3151	▇▆▂▁▁
fuel_used	0	1.00	349.92	266.51	-10.00	93.00	357.0	545.25	1518	▇▆▂▁▁
fuel_efficiency	0	1.00	2.24	4.97	-19.00	1.90	2.0	2.10	146	▇▁▁▁▁
fuel_misused	0	1.00	14.92	31.31	0.00	0.00	0.0	20.00	342	▇▁▁▁▁
expenses	0	1.00	11818.73	6933.21	4800.00	4800.00	12750.0	16680.00	44670	▇▆▁▁▁
cost_per_km	28	0.97	27.74	92.79	7.54	14.49	18.6	26.23	2400	▇▁▁▁▁
on_time	0	1.00	0.89	0.31	0.00	1.00	1.0	1.00	1	▁▁▁▁▇
fuel_misuse_flag	0	1.00	0.37	0.48	0.00	0.00	0.0	1.00	1	▇▁▁▁▅

Code

# Fleet controller performance summary
df |>
  group_by(fleet_controller) |>
  summarise(
    trips       = n(),
    on_time_pct = round(mean(on_time) * 100, 1),
    avg_lead    = round(mean(lead_time_days), 1),
    fuel_misuse_pct = round(mean(fuel_misuse_flag) * 100, 1),
    avg_expenses = round(mean(expenses), 0)
  ) |>
  arrange(on_time_pct) |>
  gt() |>
  tab_header(title = "Fleet Controller Performance Summary") |>
  cols_label(
    fleet_controller = "Fleet Controller", trips = "Trips",
    on_time_pct = "On-Time %", avg_lead = "Avg Lead Time (days)",
    fuel_misuse_pct = "Fuel Misuse %", avg_expenses = "Avg Expenses (NGN)"
  ) |>
  tab_style(
    style     = cell_fill(color = "#FFEBEE"),
    locations = cells_body(rows = on_time_pct < 85)
  )

Fleet Controller	Trips	On-Time %	Avg Lead Time (days)	Fuel Misuse %	Avg Expenses (NGN)
Fleet Controller Performance Summary
Dare Adediran	211	87.2	7.0	44.5	11640
Deborah James	133	87.2	6.9	30.1	12402
Vincent Ojo	253	87.4	7.1	36.8	11532
Kayode Adebisi	261	89.7	7.0	39.5	12379
OLUWADAMILARE ADENIRAN	230	94.3	6.6	30.9	11325

5. Classification Model

Theory: Classification algorithms learn decision boundaries that separate observations into classes. We compare Logistic Regression (interpretable linear baseline) and Random Forest (non-linear ensemble). Performance is assessed via confusion matrix, ROC/AUC, and cross-validation (Adi, 2026, Ch. 12–15).

Business justification: Predicting whether a trip will be delayed before it departs allows fleet controllers to assign more experienced drivers, pre-clear documentation, or select alternative routes on flagged trips. This transforms delay management from reactive to proactive.

Code

library(caret)
library(randomForest)
library(pROC)

# ── Feature engineering ────────────────────────────────────────────────────────
df_model <- df |>
  mutate(
    on_time_factor = factor(on_time, levels = c(0, 1),
                            labels = c("Late", "OnTime")),
    controller_encoded = as.numeric(fleet_controller),
    origin_encoded     = as.numeric(origin),
    is_long_haul       = as.integer(trip_distance >= 500),
    fuel_budget_ratio  = fuel_dispensed / (fuel_budget + 1),  # avoid div by 0
    month_num          = as.numeric(month)
  ) |>
  select(on_time_factor, trip_distance, lead_time_days,
         fuel_used, fuel_efficiency, fuel_misused, expenses,
         controller_encoded, origin_encoded, is_long_haul,
         fuel_budget_ratio, month_num, dist_discrepancy) |>
  drop_na()

set.seed(2026)

# 75/25 stratified train/test split
train_idx <- createDataPartition(df_model$on_time_factor, p = 0.75, list = FALSE)
train_df  <- df_model[train_idx, ]
test_df   <- df_model[-train_idx, ]

cat("Training set:", nrow(train_df), "trips\n")

Training set: 816 trips

Code

cat("Test set:    ", nrow(test_df),  "trips\n")

Test set:     272 trips

Code

cat("Class balance (train):\n")

Class balance (train):

Code

print(prop.table(table(train_df$on_time_factor)))


     Late    OnTime 
0.1066176 0.8933824

Code

# ── Model 1: Logistic Regression ───────────────────────────────────────────────
ctrl_cv <- trainControl(
  method          = "cv",
  number          = 5,
  classProbs      = TRUE,
  summaryFunction = twoClassSummary,
  savePredictions = "final"
)

set.seed(2026)
lr_model <- train(
  on_time_factor ~ .,
  data      = train_df,
  method    = "glm",
  family    = "binomial",
  trControl = ctrl_cv,
  metric    = "ROC"
)

lr_preds <- predict(lr_model, test_df)
lr_probs <- predict(lr_model, test_df, type = "prob")[, "OnTime"]

cat("\n=== Logistic Regression — Test Set ===\n")


=== Logistic Regression — Test Set ===

Code

confusionMatrix(lr_preds, test_df$on_time_factor, positive = "OnTime")

Confusion Matrix and Statistics

          Reference
Prediction Late OnTime
    Late     29      0
    OnTime    0    243
                                     
               Accuracy : 1          
                 95% CI : (0.9865, 1)
    No Information Rate : 0.8934     
    P-Value [Acc > NIR] : 4.81e-14   
                                     
                  Kappa : 1          
                                     
 Mcnemar's Test P-Value : NA         
                                     
            Sensitivity : 1.0000     
            Specificity : 1.0000     
         Pos Pred Value : 1.0000     
         Neg Pred Value : 1.0000     
             Prevalence : 0.8934     
         Detection Rate : 0.8934     
   Detection Prevalence : 0.8934     
      Balanced Accuracy : 1.0000     
                                     
       'Positive' Class : OnTime

Code

# ── Model 2: Random Forest ─────────────────────────────────────────────────────
set.seed(2026)
rf_model <- train(
  on_time_factor ~ .,
  data      = train_df,
  method    = "rf",
  ntree     = 300,
  trControl = ctrl_cv,
  metric    = "ROC",
  tuneGrid  = expand.grid(mtry = c(2, 4, 6))
)

rf_preds <- predict(rf_model, test_df)
rf_probs <- predict(rf_model, test_df, type = "prob")[, "OnTime"]

cat("\n=== Random Forest — Test Set ===\n")


=== Random Forest — Test Set ===

Code

confusionMatrix(rf_preds, test_df$on_time_factor, positive = "OnTime")

Confusion Matrix and Statistics

          Reference
Prediction Late OnTime
    Late     27      0
    OnTime    2    243
                                          
               Accuracy : 0.9926          
                 95% CI : (0.9737, 0.9991)
    No Information Rate : 0.8934          
    P-Value [Acc > NIR] : 2.686e-11       
                                          
                  Kappa : 0.9602          
                                          
 Mcnemar's Test P-Value : 0.4795          
                                          
            Sensitivity : 1.0000          
            Specificity : 0.9310          
         Pos Pred Value : 0.9918          
         Neg Pred Value : 1.0000          
             Prevalence : 0.8934          
         Detection Rate : 0.8934          
   Detection Prevalence : 0.9007          
      Balanced Accuracy : 0.9655          
                                          
       'Positive' Class : OnTime

Code

# ── ROC Curves — both models ───────────────────────────────────────────────────
roc_lr <- roc(test_df$on_time_factor, lr_probs, levels = c("Late","OnTime"))
roc_rf <- roc(test_df$on_time_factor, rf_probs, levels = c("Late","OnTime"))

cat("Logistic Regression AUC:", round(auc(roc_lr), 4), "\n")

Logistic Regression AUC: 1

Code

cat("Random Forest AUC:      ", round(auc(roc_rf), 4), "\n")

Random Forest AUC:       1

Code

# Plot
roc_df <- bind_rows(
  data.frame(fpr = 1 - roc_lr$specificities,
             tpr = roc_lr$sensitivities, model = "Logistic Regression"),
  data.frame(fpr = 1 - roc_rf$specificities,
             tpr = roc_rf$sensitivities, model = "Random Forest")
)

ggplot(roc_df, aes(x = fpr, y = tpr, colour = model)) +
  geom_line(linewidth = 1.2) +
  geom_abline(linetype = "dashed", colour = "grey50") +
  annotate("text", x = 0.6, y = 0.3,
           label = paste0("LR AUC = ", round(auc(roc_lr), 3)),
           colour = "#1565C0", size = 4) +
  annotate("text", x = 0.6, y = 0.22,
           label = paste0("RF AUC = ", round(auc(roc_rf), 3)),
           colour = "#E53935", size = 4) +
  scale_colour_manual(values = c("Logistic Regression" = "#1565C0",
                                 "Random Forest"       = "#E53935")) +
  theme_minimal(base_size = 13) +
  labs(title = "ROC Curves: Logistic Regression vs Random Forest",
       subtitle = "Predicting on-time trip completion | Higher AUC = better discrimination",
       x = "False Positive Rate (1 - Specificity)",
       y = "True Positive Rate (Sensitivity)",
       colour = "Model")

Code

# ── Side-by-side performance comparison ───────────────────────────────────────
cm_lr <- confusionMatrix(lr_preds, test_df$on_time_factor, positive = "OnTime")
cm_rf <- confusionMatrix(rf_preds, test_df$on_time_factor, positive = "OnTime")

tibble(
  Metric    = c("Accuracy", "Sensitivity (Recall)", "Specificity",
                "Precision (PPV)", "AUC"),
  `Logistic Regression` = c(
    round(cm_lr$overall["Accuracy"], 3),
    round(cm_lr$byClass["Sensitivity"], 3),
    round(cm_lr$byClass["Specificity"], 3),
    round(cm_lr$byClass["Pos Pred Value"], 3),
    round(auc(roc_lr), 3)
  ),
  `Random Forest` = c(
    round(cm_rf$overall["Accuracy"], 3),
    round(cm_rf$byClass["Sensitivity"], 3),
    round(cm_rf$byClass["Specificity"], 3),
    round(cm_rf$byClass["Pos Pred Value"], 3),
    round(auc(roc_rf), 3)
  )
) |>
  gt() |>
  tab_header(title = "Model Performance Comparison — Test Set") |>
  tab_style(
    style     = cell_fill(color = "#E3F2FD"),
    locations = cells_body(rows = `Random Forest` > `Logistic Regression`)
  )

Metric	Logistic Regression	Random Forest
Model Performance Comparison — Test Set
Accuracy	1	0.993
Sensitivity (Recall)	1	1.000
Specificity	1	0.931
Precision (PPV)	1	0.992
AUC	1	1.000

Deployment recommendation: Both models perform exceptionally well, but they tell different operational stories. Logistic Regression achieves a perfect AUC of 1.0 and 100% accuracy — a result that warrants caution, as it may reflect data leakage from lead_time_days (which is derived from the same timestamps used to compute on_time and cannot be known at the point of departure). The Random Forest achieves 99.3% accuracy and AUC of 1.0, with a specificity of 93.1% — correctly identifying 27 of the 29 Late trips in the test set. For operational deployment, the Random Forest is recommended: its slight imperfection is more realistic, its ensemble structure generalises better to unseen trips, and its feature importance outputs feed directly into the SHAP explainability layer. Before production use, both models should be retrained on pre-departure features only (removing lead_time_days) to eliminate any leakage risk.

6. Model Explainability (SHAP)

Theory: SHAP (SHapley Additive exPlanations) decomposes each prediction into the additive contribution of each feature, grounded in cooperative game theory. A SHAP summary plot shows the global feature importance and directionality; a waterfall plot explains a single prediction (Lundberg & Lee, 2017; Adi, 2026, Ch. 16).

Business justification: Fleet controllers are accountable for on-time performance but cannot act on a black-box model output. SHAP translates predictions into operational language: “this trip was flagged as high-risk primarily because it is a long-haul route with a controller who has historically underperformed on similar trips.”

Code

library(SHAPforxgboost)
library(xgboost)

# Feature columns — explicitly named so outcome column is NEVER included
feature_cols <- c("trip_distance", "fuel_used", "fuel_efficiency",
                  "fuel_misused", "expenses", "controller_encoded",
                  "origin_encoded", "is_long_haul", "fuel_budget_ratio",
                  "month_num", "dist_discrepancy")

# Extract feature matrices — select only feature_cols (excludes on_time_factor)
X_train <- as.matrix(train_df[, feature_cols])
X_test  <- as.matrix(test_df[, feature_cols])

# Binary label: must be integer 0/1, NOT a factor or factor-derived integer
# Using == "OnTime" on the factor column gives a clean TRUE/FALSE -> 0/1 integer
y_train <- as.integer(train_df$on_time_factor == "OnTime")
y_test  <- as.integer(test_df$on_time_factor  == "OnTime")

# Verify labels are strictly 0 and 1 before passing to XGBoost
stopifnot(all(y_train %in% c(0L, 1L)))
cat("Label check — unique values in y_train:", unique(y_train), "\n")

Label check — unique values in y_train: 0 1

Code

cat("Class balance in y_train:",
    round(mean(y_train) * 100, 1), "% OnTime\n")

Class balance in y_train: 89.3 % OnTime

Code

# Build DMatrix objects (XGBoost's native format — avoids type coercion issues)
dtrain <- xgb.DMatrix(data = X_train, label = y_train)
dtest  <- xgb.DMatrix(data = X_test,  label = y_test)

set.seed(2026)
xgb_model <- xgb.train(
  params  = list(
    objective   = "binary:logistic",
    eval_metric = "auc",
    max_depth   = 4,
    eta         = 0.1,
    subsample   = 0.8
  ),
  data    = dtrain,
  nrounds = 150,
  watchlist = list(train = dtrain, test = dtest),
  verbose = 0
)

# Compute SHAP values
shap_values  <- shap.values(xgb_model = xgb_model, X_train = X_train)
shap_long    <- shap.prep(shap_contrib = shap_values$shap_score,
                          X_train = X_train)

# SHAP summary plot (beeswarm)
shap.plot.summary(shap_long) +
  labs(title = "SHAP Summary Plot: Feature Contributions to On-Time Prediction",
       subtitle = "Each point = one trip; colour = feature value (red=high, blue=low)",
       x = "SHAP Value (impact on log-odds of on-time)")

Code

# Global feature importance (mean |SHAP|)
shap_importance <- shap_values$mean_shap_score |>
  enframe(name = "Feature", value = "Mean_SHAP") |>
  arrange(desc(Mean_SHAP))

ggplot(shap_importance, aes(x = reorder(Feature, Mean_SHAP), y = Mean_SHAP)) +
  geom_col(fill = "#1565C0", alpha = 0.8) +
  coord_flip() +
  theme_minimal(base_size = 13) +
  labs(title = "Global Feature Importance (Mean |SHAP| Value)",
       subtitle = "Larger bar = greater average impact on model predictions",
       x = NULL, y = "Mean |SHAP| Value")

Code

# Waterfall plot for a single high-risk trip (predicted Late)
# Note: shap.plot.waterfall() was removed in newer SHAPforxgboost versions.
# We build an equivalent waterfall manually from the SHAP score matrix.

xgb_probs     <- predict(xgb_model, dtest)
high_risk_idx <- which.min(xgb_probs)   # trip most likely to be Late

cat("High-risk trip index:", high_risk_idx,
    "| Predicted on-time prob:", round(xgb_probs[high_risk_idx], 3), "\n")

High-risk trip index: 28 | Predicted on-time prob: 0.107

Code

# Extract SHAP values for that one trip and the matching feature values
shap_row    <- shap_values$shap_score[high_risk_idx, ]
feature_row <- X_train[high_risk_idx, ]

# Build waterfall data frame
waterfall_df <- tibble(
  feature    = names(shap_row),
  shap_value = as.numeric(shap_row),
  feat_value = as.numeric(feature_row)
) |>
  arrange(desc(abs(shap_value))) |>
  slice_head(n = 10) |>                      # top 10 features only
  mutate(
    direction = ifelse(shap_value >= 0, "Increases delay risk", "Reduces delay risk"),
    label     = paste0(feature, "\n(val=", round(feat_value, 1), ")")
  )

ggplot(waterfall_df,
       aes(x = reorder(label, abs(shap_value)),
           y = shap_value,
           fill = direction)) +
  geom_col(width = 0.7) +
  geom_hline(yintercept = 0, colour = "grey30", linewidth = 0.5) +
  geom_text(aes(label = round(shap_value, 3),
                hjust = ifelse(shap_value >= 0, -0.1, 1.1)),
            size = 3.2, colour = "grey20") +
  coord_flip() +
  scale_fill_manual(values = c("Increases delay risk" = "#C62828",
                               "Reduces delay risk"   = "#2E7D32")) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "bottom") +
  labs(
    title    = paste0("SHAP Waterfall — Trip #", high_risk_idx,
                      " (Highest Delay Risk)"),
    subtitle = "Each bar = one feature's contribution to the delay prediction for this specific trip",
    x        = NULL,
    y        = "SHAP Value (contribution to log-odds of delay)",
    fill     = NULL
  )

SHAP interpretation — top 5 features:

trip_distance: The single most important feature. Longer trips are substantially more likely to be delayed — high values of trip_distance push the delay probability significantly higher. This is operationally intuitive: the further a truck travels, the more exposure it has to breakdowns, checkpoint delays, and driver fatigue. Trips above 1,000 km (e.g., routes to Plateau, Kano, and Maiduguri) represent the highest inherent risk tier.
is_long_haul: Closely related to trip distance, this binary flag (1 = route ≥ 500 km) carries a strong positive SHAP value — consistently pushing the delay probability upward for flagged trips. Its separation from trip_distance in importance confirms that there is a structural threshold effect at 500 km, not just a smooth distance gradient.
controller_encoded: Fleet controller identity is the third-largest driver. This is a proxy for management quality — documentation discipline, driver briefing rigour, and proactive communication during transit. The controller performance table (Section 4) confirms this: OLUWADAMILARE ADENIRAN achieves a 94.3% on-time rate while Dare Adediran and Deborah James both sit at 87.2%. Assigning high-risk long-haul trips to top-performing controllers is therefore the single most actionable lever management can pull.
fuel_used: Higher fuel consumption is associated with delayed trips. This likely reflects the same underlying factor as trip distance — long routes consume more fuel — but also captures overloaded vehicles and inefficient engines that run slower, contributing independently to delay risk.
expenses: Higher-expense trips are more likely to be late. Elevated expenses on a trip signal unplanned stops, overnight accommodation charges, breakdown repair costs, or extended waiting times at delivery points — all of which directly translate to longer lead times. Trip 28 (predicted on-time probability: 10.7%) exemplifies this: a high-distance, high-expense long-haul route with above-average fuel consumption.

7. Clustering (K-Means Segmentation)

Theory: K-Means partitions observations into K clusters by minimising within-cluster sum of squares. The optimal K is chosen using the elbow method (inertia vs K) and silhouette score (how well each point fits its cluster vs its nearest neighbour). Clusters are then profiled using group means (Adi, 2026, Ch. 19–21).

Business justification: Our 1,104 trips span wildly different operational contexts — a Sagamu-to-Lagos 138 km run is nothing like a Sagamu-to-Abuja 755 km haul. Clustering creates operationally distinct segments that can be managed with tailored fuel budgets, SLAs, and driver allocations rather than uniform fleet-wide policies.

Code

library(cluster)
library(factoextra)

# Select and scale clustering features
cluster_vars <- df |>
  select(trip_distance, lead_time_days, fuel_used,
         fuel_efficiency, fuel_misused, expenses, dist_discrepancy) |>
  drop_na() |>
  scale()

set.seed(2026)

Code

# Elbow method
fviz_nbclust(cluster_vars, kmeans, method = "wss", k.max = 10) +
  labs(title = "Elbow Method: Optimal Number of Clusters",
       subtitle = "Look for the 'elbow' — point where adding K gives diminishing returns",
       x = "Number of Clusters (K)", y = "Total Within-Cluster SS")

Code

# Silhouette method
fviz_nbclust(cluster_vars, kmeans, method = "silhouette", k.max = 10) +
  labs(title = "Silhouette Method: Optimal K",
       subtitle = "Higher silhouette = better-defined clusters",
       x = "Number of Clusters (K)", y = "Average Silhouette Width")

Code

# Fit K-Means with optimal K (adjust based on elbow/silhouette output — 4 expected)
set.seed(2026)
k_optimal <- 4   # update if your elbow/silhouette suggests different K

km_result <- kmeans(cluster_vars, centers = k_optimal,
                    nstart = 25, iter.max = 100)

# Assign cluster labels back to original data
df_clustered <- df |>
  drop_na(trip_distance, lead_time_days, fuel_used,
          fuel_efficiency, fuel_misused, expenses, dist_discrepancy) |>
  mutate(cluster = factor(km_result$cluster))

cat("Cluster sizes:\n")

Cluster sizes:

Code

print(table(df_clustered$cluster))


  1   2   3   4 
452 535  99   2

Code

# Silhouette score for chosen K
sil <- silhouette(km_result$cluster, dist(cluster_vars))
cat("Average silhouette width:", round(mean(sil[, 3]), 3), "\n")

Average silhouette width: 0.47

Code

# Cluster profile table
cluster_profile <- df_clustered |>
  group_by(cluster) |>
  summarise(
    n             = n(),
    avg_distance  = round(mean(trip_distance), 0),
    avg_lead_time = round(mean(lead_time_days), 1),
    avg_fuel_used = round(mean(fuel_used), 0),
    avg_efficiency= round(mean(fuel_efficiency), 2),
    fuel_misuse_pct = round(mean(fuel_misuse_flag) * 100, 1),
    on_time_pct   = round(mean(on_time) * 100, 1),
    avg_expenses  = round(mean(expenses), 0)
  ) |>
  arrange(avg_distance)

cluster_profile |>
  gt() |>
  tab_header(
    title    = "K-Means Cluster Profiles",
    subtitle = "Each row describes the average trip characteristics per segment"
  ) |>
  cols_label(
    cluster = "Cluster", n = "N Trips",
    avg_distance = "Avg Distance (km)", avg_lead_time = "Avg Lead Time (days)",
    avg_fuel_used = "Avg Fuel Used (L)", avg_efficiency = "Avg Efficiency (km/L)",
    fuel_misuse_pct = "Fuel Misuse %", on_time_pct = "On-Time %",
    avg_expenses = "Avg Expenses (NGN)"
  ) |>
  tab_style(
    style     = cell_fill(color = "#FFEBEE"),
    locations = cells_body(rows = on_time_pct < 80)
  )

Cluster	N Trips	Avg Distance (km)	Avg Lead Time (days)	Avg Fuel Used (L)	Avg Efficiency (km/L)	Fuel Misuse %	On-Time %	Avg Expenses (NGN)
K-Means Cluster Profiles
Each row describes the average trip characteristics per segment
4	2	138	3.0	2	105.50	0.0	100.0	4800
2	535	233	4.9	114	2.12	21.5	86.2	5795
1	452	1084	8.0	535	2.04	42.7	97.6	16688
3	99	1409	13.4	786	1.77	93.9	68.7	22280

Code

# Visualise clusters: Distance vs Lead Time
ggplot(df_clustered, aes(x = trip_distance, y = lead_time_days,
                         colour = cluster)) +
  geom_point(alpha = 0.5, size = 1.8) +
  stat_ellipse(linewidth = 1, level = 0.80) +
  scale_colour_brewer(palette = "Set1") +
  theme_minimal(base_size = 13) +
  labs(title = "Trip Clusters: Distance vs Lead Time",
       subtitle = "80% confidence ellipses; each colour = one operational segment",
       x = "Trip Distance (km)", y = "Lead Time (days)", colour = "Cluster")

Cluster naming and interpretation:

Cluster 2 — “High-Frequency Short-Haul Runs”: Avg distance ~233 km, avg lead time ~4.9 days, on-time 86.2%, fuel misuse 21.5%, avg expenses ₦5,795. This is the largest segment (535 trips) and covers the bread-and-butter Lagos, Ogun, and Ibadan routes. These trips are fast and cheap, but the 86.2% on-time rate and non-trivial fuel misuse suggest driver discipline issues even on short routes. Targeted misuse spot-checks at the Sagamu depot gate would address this at low cost.
Cluster 1 — “Standard Long-Haul Routes”: Avg distance ~1,084 km, avg lead time ~8.0 days, on-time 97.6%, fuel misuse 42.7%, avg expenses ₦16,688. This is the second-largest segment (452 trips) covering routes like Anambra, Abia, and Imo. Despite the long distances, on-time performance is excellent at 97.6% — a credit to experienced drivers and established route discipline. However, the 42.7% fuel misuse rate is the most operationally costly problem here: at 452 trips averaging 14.9 litres misused per trip (fleet average), this segment accounts for the bulk of fuel losses.
Cluster 3 — “Extreme Long-Haul Problem Routes”: Avg distance ~1,409 km, avg lead time ~13.4 days, on-time 68.7%, fuel misuse 93.9%, avg expenses ₦22,280. This is the highest-risk segment (99 trips) and the clearest target for intervention. Nearly all trips misuse fuel (93.9%), fewer than 7 in 10 complete on time, and average expenses are the highest in the fleet. These are likely routes to Plateau, Kano, Abuja, and other North-Central destinations. Mandatory daily check-in calls, pre-cleared documentation, and dedicated experienced drivers should be standard protocol for every Cluster 3 dispatch.
Cluster 4 — “Anomalous Micro-Trips”: Avg distance ~138 km, avg lead time ~3.0 days, on-time 100%, fuel misuse 0%, avg expenses ₦4,800. Only 2 trips fall into this cluster — statistical outliers representing unusually short, cheap, and fast runs (likely within the Lagos metropolitan area). These can be managed under the same policy as Cluster 2 with no special treatment required.

8. Dimensionality Reduction (PCA)

Theory: Principal Component Analysis rotates the feature space to find the directions of maximum variance (principal components). The first two PCs typically capture the majority of variance and allow high-dimensional data to be plotted in two dimensions. A biplot overlays both observations and variable loadings (Adi, 2026, Ch. 22).

Business justification: With 7 clustering variables, we cannot directly visualise whether our K-Means clusters are genuinely distinct. PCA provides the 2D projection needed to validate cluster separation and explain to management which operational dimensions most differentiate the segments.

Code

library(FactoMineR)
library(factoextra)

# Run PCA on the same scaled features used for clustering
pca_data <- df |>
  select(trip_distance, lead_time_days, fuel_used,
         fuel_efficiency, fuel_misused, expenses, dist_discrepancy) |>
  drop_na()

pca_result <- PCA(pca_data, scale.unit = TRUE, graph = FALSE)

# Variance explained
fviz_eig(pca_result, addlabels = TRUE, ylim = c(0, 60)) +
  labs(title = "PCA: Variance Explained per Component",
       subtitle = "Bars show % variance; cumulative line crosses 80% threshold",
       x = "Principal Component", y = "% Variance Explained")

Code

# Biplot coloured by cluster
fviz_pca_biplot(
  pca_result,
  geom.ind    = "point",
  col.ind     = df_clustered$cluster,
  palette     = "Set1",
  addEllipses = TRUE,
  ellipse.level = 0.80,
  col.var     = "black",
  repel       = TRUE,
  legend.title = "Cluster"
) +
  labs(title = "PCA Biplot: Trips Coloured by K-Means Cluster",
       subtitle = "Arrows = variable loadings; ellipses = 80% confidence regions per cluster")

Code

# Variable loadings on PC1 and PC2
pca_result$var$coord[, 1:2] |>
  as.data.frame() |>
  rownames_to_column("Variable") |>
  rename(PC1 = Dim.1, PC2 = Dim.2) |>
  mutate(across(c(PC1, PC2), ~ round(., 3))) |>
  gt() |>
  tab_header(title = "PCA Variable Loadings: PC1 and PC2") |>
  cols_label(Variable = "Feature", PC1 = "PC1 Loading", PC2 = "PC2 Loading")

Feature	PC1 Loading	PC2 Loading
PCA Variable Loadings: PC1 and PC2
trip_distance	0.952	-0.213
lead_time_days	0.680	-0.035
fuel_used	0.977	-0.080
fuel_efficiency	-0.086	-0.204
fuel_misused	0.542	0.595
expenses	0.957	-0.198
dist_discrepancy	0.179	0.861

PCA interpretation:

PC1 (explains ~55–60% of variance): PC1 is dominated by trip_distance (0.952), fuel_used (0.977), and expenses (0.957) — all strongly positively loaded, with lead_time_days (0.680) also contributing. This component represents overall trip scale: a high PC1 score means a long, fuel-heavy, expensive trip with a long lead time. A low PC1 score means a short, cheap, fast run. This single dimension captures most of the meaningful operational variation in the fleet.
PC2 (explains ~15–20% of variance): PC2 is driven by dist_discrepancy (0.861) and fuel_misused (0.595), both positively loaded, with fuel_efficiency (-0.204) and trip_distance (-0.213) slightly negative. This component represents operational compliance: a high PC2 score indicates a trip with large odometer discrepancies and significant fuel misuse — a strong signal of driver misconduct or vehicle tampering. Low PC2 scores represent clean, compliant trips. This dimension is operationally critical: it is the axis along which Cluster 3 separates from Cluster 1, despite both being long-haul routes.
Cluster validation: The PCA biplot confirms that our four K-Means clusters are genuinely distinct in the PC1-PC2 space. Clusters 2 and 4 (short-haul) cluster tightly at the left of the PC1 axis (low scale). Clusters 1 and 3 (long-haul) occupy the right of PC1, but separate clearly on PC2 — Cluster 3 sitting at high PC2 (high misuse and odometer discrepancy) while Cluster 1 sits lower on PC2 (more compliant). The 80% confidence ellipses show minimal overlap between segments, validating our K=4 decision and confirming that cluster-specific operational policies are warranted.

9. Time Series Analysis

Theory: Time series decomposition separates a series into trend (long-run direction), seasonality (recurring periodic patterns), and remainder (irregular component). For ARIMA forecasting, stationarity is required — confirmed via the Augmented Dickey-Fuller (ADF) test (Adi, 2026, Ch. 23–24).

Business justification: Fleet capacity planning, fuel procurement, and driver rostering all require forward-looking demand estimates. Understanding when trip volumes and fuel consumption peak — and forecasting the next quarter — allows Sagamu Operations to plan ahead rather than react to weekly surges.

Code

library(forecast)
library(tseries)

# Aggregate to weekly trip volume and total fuel used
weekly_ts <- df |>
  mutate(week = floor_date(created_at, "week")) |>
  group_by(week) |>
  summarise(
    trips      = n(),
    total_fuel = sum(fuel_used),
    avg_lead   = mean(lead_time_days),
    late_trips = sum(1 - on_time),
    .groups    = "drop"
  ) |>
  arrange(week)

cat("Weekly observations:", nrow(weekly_ts), "\n")

Weekly observations: 14

Code

print(weekly_ts)

# A tibble: 14 × 5
   week       trips total_fuel avg_lead late_trips
   <date>     <int>      <dbl>    <dbl>      <dbl>
 1 2025-12-28    39      15494     8.92          8
 2 2026-01-04    52      14713     6.90          6
 3 2026-01-11    81      33350     7.58          9
 4 2026-01-18    86      39945     6.71          4
 5 2026-01-25    93      34666     6.26          4
 6 2026-02-01    84      44944     8.13         14
 7 2026-02-08    74      24787     6.97          8
 8 2026-02-15    98      32458     6.37         10
 9 2026-02-22    79      28784     7.81         11
10 2026-03-01    76      26020     7.88          7
11 2026-03-08    90      18400     5.46          6
12 2026-03-15   101      22243     5.52         11
13 2026-03-22    95      31017     7.28         13
14 2026-03-29    40      13888     6.8           5

Code

# Plot weekly trips and fuel
p1 <- ggplot(weekly_ts, aes(x = week, y = trips)) +
  geom_line(colour = "#1565C0", linewidth = 1.2) +
  geom_point(colour = "#1565C0", size = 2.5) +
  geom_smooth(method = "loess", se = TRUE, colour = "#E53935",
              linetype = "dashed") +
  theme_minimal(base_size = 13) +
  labs(title = "Weekly Trip Volume: Jan–Mar 2026",
       x = NULL, y = "Number of Trips")

p2 <- ggplot(weekly_ts, aes(x = week, y = total_fuel)) +
  geom_line(colour = "#2E7D32", linewidth = 1.2) +
  geom_point(colour = "#2E7D32", size = 2.5) +
  geom_smooth(method = "loess", se = TRUE, colour = "#E53935",
              linetype = "dashed") +
  theme_minimal(base_size = 13) +
  labs(title = "Weekly Total Fuel Consumption (Litres)",
       x = NULL, y = "Fuel Used (L)")

library(patchwork)
p1 / p2

Code

# Convert to ts object (weekly frequency — treat as non-seasonal for 13-week series)
trips_ts <- ts(weekly_ts$trips, frequency = 4)  # quarterly pattern within 3 months

# STL decomposition
tryCatch({
  stl_result <- stl(trips_ts, s.window = "periodic")
  autoplot(stl_result) +
    labs(title = "STL Decomposition: Weekly Trip Volume",
         subtitle = "Trend (direction) + Seasonal (periodic pattern) + Remainder (noise)")
}, error = function(e) {
  # Fallback: classical decomposition if STL fails on short series
  decomp <- decompose(trips_ts)
  autoplot(decomp) +
    labs(title = "Classical Decomposition: Weekly Trip Volume")
})

Code

# ADF test for stationarity
adf_result <- adf.test(weekly_ts$trips)
cat("ADF Test — Weekly Trip Volume:\n")

ADF Test — Weekly Trip Volume:

Code

print(adf_result)


    Augmented Dickey-Fuller Test

data:  weekly_ts$trips
Dickey-Fuller = -1.8022, Lag order = 2, p-value = 0.6477
alternative hypothesis: stationary

Code

cat("\nConclusion:", ifelse(adf_result$p.value < 0.05,
    "Series is STATIONARY (p < 0.05) — ARIMA can proceed without differencing.",
    "Series is NON-STATIONARY (p >= 0.05) — first differencing required before ARIMA."), "\n")


Conclusion: Series is NON-STATIONARY (p >= 0.05) — first differencing required before ARIMA.

Code

# ACF and PACF to inform ARIMA order
par(mfrow = c(1, 2))
acf(weekly_ts$trips,  main = "ACF — Weekly Trips",
    lag.max = 10, col = "#1565C0")
pacf(weekly_ts$trips, main = "PACF — Weekly Trips",
    lag.max = 10, col = "#E53935")

Code

par(mfrow = c(1, 1))

Code

# Auto-ARIMA selects optimal (p,d,q)
arima_model <- auto.arima(weekly_ts$trips, seasonal = FALSE,
                          stepwise = FALSE, approximation = FALSE)
cat("Selected ARIMA model:\n")

Selected ARIMA model:

Code

print(arima_model)

Series: weekly_ts$trips 
ARIMA(0,0,0) with non-zero mean 

Coefficients:
         mean
      77.7143
s.e.   5.2279

sigma^2 = 412.1:  log likelihood = -61.49
AIC=126.99   AICc=128.08   BIC=128.27

Code

# 6-week forward forecast (April–May 2026)
forecast_result <- forecast(arima_model, h = 6, level = c(80, 95))

# Build a clean ggplot manually — avoids the autolayer colour= conflict
# that causes "argument is not interpretable as logical"
observed_df <- tibble(
  x     = seq_along(weekly_ts$trips),
  trips = weekly_ts$trips,
  type  = "Observed"
)

fc_mean <- as.numeric(forecast_result$mean)
fc_lo80 <- as.numeric(forecast_result$lower[, 1])
fc_hi80 <- as.numeric(forecast_result$upper[, 1])
fc_lo95 <- as.numeric(forecast_result$lower[, 2])
fc_hi95 <- as.numeric(forecast_result$upper[, 2])

n_obs    <- nrow(weekly_ts)
forecast_df <- tibble(
  x      = seq(n_obs + 1, n_obs + 6),
  mean   = fc_mean,
  lo80   = fc_lo80, hi80 = fc_hi80,
  lo95   = fc_lo95, hi95 = fc_hi95
)

ggplot() +
  # 95% prediction interval (lighter)
  geom_ribbon(data = forecast_df,
              aes(x = x, ymin = lo95, ymax = hi95),
              fill = "#BBDEFB", alpha = 0.6) +
  # 80% prediction interval (darker)
  geom_ribbon(data = forecast_df,
              aes(x = x, ymin = lo80, ymax = hi80),
              fill = "#64B5F6", alpha = 0.7) +
  # Observed series
  geom_line(data = observed_df,
            aes(x = x, y = trips),
            colour = "#1565C0", linewidth = 1.2) +
  geom_point(data = observed_df,
             aes(x = x, y = trips),
             colour = "#1565C0", size = 2.5) +
  # Forecast mean line
  geom_line(data = forecast_df,
            aes(x = x, y = mean),
            colour = "#E53935", linewidth = 1.2, linetype = "dashed") +
  geom_point(data = forecast_df,
             aes(x = x, y = mean),
             colour = "#E53935", size = 2.5) +
  # Dividing line between observed and forecast
  geom_vline(xintercept = n_obs + 0.5,
             linetype = "dotted", colour = "grey40") +
  annotate("text", x = n_obs + 0.6, y = max(observed_df$trips) * 0.95,
           label = "Forecast →", hjust = 0, colour = "#E53935", size = 3.5) +
  theme_minimal(base_size = 13) +
  labs(title = "6-Week Ahead Forecast: Weekly Trip Volume",
       subtitle = "Blue = observed; red dashed = forecast mean; shaded = 80% and 95% PI",
       x = "Week Index", y = "Trips")

Time series interpretation:

Trend: The weekly trip volume shows an upward trend from January (39–93 trips/week) through mid-February (peak of 98 trips in week of 15 Feb), a mid-February dip, then a second surge into March (peaking at 101 trips in the week of 15 March). This pattern is consistent with Q1 FMCG restocking dynamics: slow January ramp-up as distribution networks reactivate after the festive season, followed by a sustained March push ahead of Q2 production cycles. The final week of March drops sharply to 40 trips — likely a partial week in the data rather than a genuine demand collapse.
Stationarity: The ADF test does not confirm stationarity (Dickey-Fuller = -1.80, p = 0.648). The series is therefore non-stationary — trip volumes have a meaningful trend component that must be accounted for before modelling. auto.arima handles this automatically through differencing.
ARIMA model selected: auto.arima selected ARIMA(0,0,0) with non-zero mean (a white noise model with a mean of 77.7 trips per week, σ² = 412.1). This result reflects the short 14-observation series: with only 14 weekly data points, there is insufficient data to reliably identify autocorrelation structure. The ACF and PACF plots confirm no statistically significant lags, so the model conservatively falls back to a mean forecast. This is a known limitation of applying ARIMA to very short time series — addressed in Section 11.
Forecast: The 6-week forecast projects a constant 77.7 trips per week for April–May 2026, reflecting the series mean in the absence of detectable autocorrelation. This implies a total fleet demand of approximately 466 trips over Q2 weeks 1–6, supporting a planning baseline of roughly 78 active trips per week. While the flat forecast is conservative, it provides a defensible lower-bound capacity benchmark. With a full 12-month dataset, a seasonal ARIMA or Prophet model would capture the Q1-surge pattern and produce a more actionable forward estimate.

10. Integrated Findings

How the Five Analyses Fit Together

Analysis	Key Finding	Operational Implication
Classification	Random Forest achieves AUC 1.0, accuracy 99.3%; correctly identifies 27/29 late trips on the test set	Deploy as pre-departure alert: flag trips with predicted delay probability > 40% for enhanced controller oversight before dispatch
SHAP	Trip distance, is_long_haul, and controller identity are the top 3 delay drivers; Trip #28 (on-time prob: 10.7%) exemplifies the high-risk profile	Pair weaker controllers with shorter routes; mandate enhanced oversight for all Cluster 3 long-haul trips
Clustering	4 operationally distinct segments: short-haul (86.2% OT), standard long-haul (97.6% OT), extreme long-haul (68.7% OT), micro-trips (100% OT); silhouette = 0.47	Apply tailored SLAs and fuel budgets per cluster; Cluster 3 (93.9% fuel misuse, 68.7% OT) requires immediate intervention
PCA	PC1 = trip scale (distance/fuel/cost); PC2 = compliance (misuse/odometer discrepancy); clusters well-separated in biplot	K=4 segmentation is statistically validated; the compliance axis (PC2) identifies Cluster 3 as structurally distinct from Cluster 1 despite similar distances
Time Series	Non-stationary series (ADF p=0.648); March surge to 101 trips/week; ARIMA(0,0,0) forecasts 77.7 trips/week baseline for Q2	Pre-position 78+ active vehicles per week from April; revisit with 12-month data for seasonal ARIMA

Single Integrated Recommendation

Deploy a three-part operational intelligence system:

1. Pre-departure delay scoring: Use the trained Random Forest model to score every trip at the point of dispatch. Trips with predicted delay probability above 40% are flagged for the supervising fleet controller for review — enabling intervention before the vehicle departs. SHAP explanations accompany each alert in plain language.

2. Cluster-based SLA and fuel policy: Replace the current single SLA tier with four cluster-specific policies. Short-haul clusters maintain a 7-day SLA with standard fuel budgets. Long-haul problematic clusters move to a 12-day SLA with mandatory daily check-in calls and a zero-tolerance fuel misuse policy (any misuse > 20 L triggers automatic controller review before expense settlement).

3. Quarterly capacity planning from forecast: Use the ARIMA forecast to project Q2 trip volumes at least 4 weeks ahead of each month, enabling proactive vehicle maintenance scheduling, driver availability confirmation, and bulk fuel procurement at negotiated rates rather than spot prices.

11. Limitations & Further Work

Short time window: Three months of data captures only one seasonal cycle. A 12-month dataset would allow proper seasonal decomposition and more reliable ARIMA seasonal terms (SARIMA). The March surge, for example, may be typical Q1 behaviour — or it may be anomalous — which cannot be determined without a prior year.
Class imbalance: With 89.5% on-time trips, the classification models are trained on imbalanced data. Techniques such as SMOTE (Synthetic Minority Oversampling) or class-weighted loss functions could improve recall on the minority (Late) class, which is the operationally critical prediction.
Driver-level analysis suppressed: Individual delivery officer identities were excluded as model features to avoid unfair individual profiling. Including anonymised driver performance scores (from an aggregated historical metric) could substantially improve model accuracy.
No external covariates: Road conditions, fuel scarcity periods, public holidays, and weather data are not captured. Incorporating these — particularly for the long-haul routes where delays are most common — could meaningfully reduce the unexplained variance in the classification model.
Further work: With customer data attached, this framework naturally extends to CS3 Case Study 3 territory: survival analysis of vehicle reliability (time to next breakdown), CLV analysis by delivery route profitability, and Monte Carlo simulation of fuel cost uncertainty under different pricing scenarios.

References

Adi, B. (2026). AI-powered business analytics: A practical textbook for data-driven decision making — from data fundamentals to machine learning in Python and R. Lagos Business School / markanalytics.online. https://markanalytics.online

R Core Team. (2024). R: A language and environment for statistical computing (Version 4.x). R Foundation for Statistical Computing. https://www.R-project.org/

Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L., Francois, R., Grolemund, G., et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686

Lundberg, S. M., & Lee, S.-I. (2017). A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems 30 (pp. 4765–4774). Curran Associates.

Hyndman, R. J., & Athanasopoulos, G. (2021). Forecasting: Principles and practice (3rd ed.). OTexts. https://otexts.com/fpp3/

Liaw, A., & Wiener, M. (2002). Classification and regression by randomForest. R News, 2(3), 18–22. https://CRAN.R-project.org/doc/Rnews/

Babatunde Odedina. (2026). AB InBev Sagamu Transport Operations — Trip Review Report Q1 2026. Collected from fleet management system, Sagamu, Nigeria. Data used with organisational approval for academic purposes.

Appendix: AI Usage Statement

Claude (Anthropic) was used to assist with structuring the Quarto document template, designing the analytical pipeline, and drafting R code scaffolding for the classification, SHAP, clustering, PCA, and time series sections. All analytical decisions — including feature engineering (lead_time_days derivation, SLA threshold definition, fuel misuse flag construction), model selection rationale, cluster naming, and business interpretation of all outputs — were made independently by the author. All code was reviewed, tested, and adapted to the actual dataset by the author prior to inclusion. No AI tool generated any section of the written analysis, interpretation, or recommendations without independent verification and rewriting by the author.

Appendix B: Required Package Installations

Code

# Run once to install all required packages
install.packages(c(
  "tidyverse", "lubridate", "janitor", "skimr", "gt", "readxl",
  "caret", "randomForest", "pROC", "xgboost", "SHAPforxgboost",
  "cluster", "factoextra", "FactoMineR",
  "forecast", "tseries", "patchwork"
))