DATA 604 - Final_Simulation_Project

Bikash Bhowmik, 23 May 2026

Instructions

Develop a simulation model that represents the process of fulfilling a customer order, where a product is picked from a warehouse and delivered to the customer after being purchased through an e-commerce platform.

Introduction

This project simulates the last-mile delivery process used by modern e-commerce companies such as Amazon. A discrete-event simulation is developed in R using the simmer package to model the complete flow from order arrival through warehouse picking and packing to courier-based delivery. The goal is to measure performance (lead time and SLA compliance) and to find the number of couriers needed to achieve a 95% on-time delivery target at the lowest cost per order.

E-commerce companies rely heavily on efficient warehouse operations and delivery logistics. Once a customer places an order on a website, multiple processes occur:

  1. Order enters system
  2. Warehouse worker picks items
  3. Items are packed
  4. A courier loads the package
  5. Delivery is made to the customer

This simulation replicates these steps and evaluates different staffing levels.

This model is inspired by:

  • Amazon fulfillment center robotics
  • Optimization strategies described in last-mile delivery research
  • Warehouse process engineering references provided in class

The final objective is to:

✔ Evaluate system performance under different courier staffing levels
✔ Meet SLA target: 95% of deliveries within 120 minutes
✔ Minimize cost per delivered order

Conceptual Model

The conceptual model defines the structure of the simulation system by identifying its core components and their interactions.

Entities

  • Customer orders entering the system

Resources

  • Pickers (warehouse workers retrieving items)
  • Packers (staff responsible for packaging)
  • Couriers (drivers delivering orders)

Queues

  • Orders waiting for picking
  • Orders waiting for packing
  • Orders waiting for courier dispatch

Processes

  1. Order arrival into the system
  2. Item picking from warehouse
  3. Order packing
  4. Courier delivery to customer

Performance Metrics

  • Average lead time (order to delivery)
  • SLA compliance (delivery within 120 minutes)
  • Cost per order

System Flow
Order → Picking → Packing → Delivery

This diagram visually represents the flow of orders through the system, including processing stages and queues between resources.

Figure: Conceptual flow of the last-mile delivery system

This conceptual structure ensures that the simulation captures the essential operations of an e-commerce last-mile delivery system.

Key Assumptions

  • Orders are independent and identically distributed
  • No order cancellations or returns
  • FIFO queue discipline at all stages
  • Resources are always available (no breakdowns)
  • Travel times are independent of traffic conditions
  • No batching or routing optimization is considered

Process Logic

The simulation follows a discrete-event process flow where each order moves sequentially through the system.

  • Orders arrive based on a Poisson process, meaning interarrival times follow an exponential distribution

  • Each order enters the picking stage, where it:
      1. Seizes a picker if available
      2. Otherwise waits in a queue

  • After picking, the order moves to packing, following the same logic

  • Once packed, the order proceeds to the delivery stage, where:
      1. A courier must be available
      2. If not, the order waits in the delivery queue
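The link between a Poisson arrival process and exponential interarrival times can be checked numerically. The sketch below uses base R only and is not part of the model; the seed and variable names are illustrative assumptions, and the rate matches the 20 orders/hour used in this report.

```r
# Illustrative check (not part of the model): exponential gaps imply Poisson counts.
set.seed(604)                      # arbitrary seed, chosen only for repeatability

lambda_per_min <- 20 / 60          # 20 orders per hour, expressed per minute
horizon_min    <- 7 * 24 * 60      # one simulated week

# Draw more gaps than needed, then keep arrivals that fall inside the horizon
gaps     <- rexp(5000, rate = lambda_per_min)
arrivals <- cumsum(gaps)
arrivals <- arrivals[arrivals < horizon_min]

# Count arrivals in each one-hour bucket (hours with no arrivals count as 0)
hour_index    <- factor(floor(arrivals / 60), levels = 0:(horizon_min / 60 - 1))
hourly_counts <- as.numeric(table(hour_index))

mean(gaps)            # close to 3 minutes between orders
mean(hourly_counts)   # close to 20 orders per hour
var(hourly_counts)    # close to the mean, as expected for Poisson counts
```

The same `rexp(1, rate)` draw, called once per arrival, is what a simmer generator function would return for a Poisson arrival stream.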

Queue Discipline

  • First-In-First-Out (FIFO) is assumed for all stages

Resource Constraints

  • Limited number of pickers, packers, and couriers
  • The number of couriers is varied from 3 to 15 to test system performance

Processing Times

  • Picking and packing follow exponential distributions
  • Delivery time follows a truncated normal distribution
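A truncated normal draw can be sketched with simple rejection sampling: draw from the normal distribution and keep only values inside the bounds. The helper name `rtruncnorm_simple` is hypothetical (it is not from any package), and the mean, SD, and bounds are the travel-time values stated in this report. For comparison, the sketch also shows clamping with `pmin()`/`pmax()`, which moves out-of-range draws onto the bounds instead of redrawing them.

```r
# Hypothetical helper: sample n values from a normal truncated to [lower, upper]
rtruncnorm_simple <- function(n, mean, sd, lower, upper) {
  out <- numeric(0)
  while (length(out) < n) {
    draw <- rnorm(n, mean, sd)
    out  <- c(out, draw[draw >= lower & draw <= upper])  # keep in-range draws only
  }
  out[seq_len(n)]
}

set.seed(604)  # arbitrary seed for a repeatable illustration
truncated <- rtruncnorm_simple(10000, mean = 30, sd = 10, lower = 5, upper = 60)
clamped   <- pmax(pmin(rnorm(10000, 30, 10), 60), 5)

range(truncated)        # always inside [5, 60]
mean(truncated == 5)    # no probability mass sits exactly on the bound
mean(clamped == 5)      # small but positive: clamping piles mass at 5
```

With a mean of 30 and an SD of 10, fewer than 1% of draws fall outside 5 to 60 minutes, so the two approaches give nearly the same mean travel time; the difference matters mainly when the bounds are tight.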

Key Assumption

  • No priority orders or interruptions (non-preemptive system)

This logic replicates real-world warehouse operations where orders must pass through each stage before completion.

Process Explanation

When a customer places an order, it enters the system and begins its journey through the warehouse and delivery network.

First, the order joins the picking queue. If a picker is available, the item retrieval process starts immediately; otherwise, the order waits. After picking is completed, the order moves to the packing stage, where it is prepared for shipment.

Once packed, the order enters the delivery queue, where it must wait for an available courier. This stage is critical because courier availability directly impacts delivery time. If couriers are limited, orders accumulate in the queue, increasing delays.

After a courier is assigned, the order is delivered to the customer, completing the process. The total time from arrival to delivery is recorded as the lead time, which determines whether the order meets the Service Level Agreement (SLA).

The simulation shows that while picking and packing are important, the delivery stage is the primary bottleneck, as it has the greatest impact on overall system performance and customer satisfaction.

Model Description

Note: This simulation model is implemented using the simmer package in R. While Simio is commonly used for discrete-event simulation, simmer provides equivalent functionality including event scheduling, resource allocation, and queue-based process modeling. The same simulation logic and structure applied in Simio are replicated here programmatically.

Mapping to Simio Objects

  • add_generator() → Source (order arrivals)
  • seize() → Server (resource allocation)
  • timeout() → Processing time
  • release() → Resource release
  • Queues are automatically managed before each seize step

This mapping ensures that the same discrete-event logic used in Simio is preserved.

System Flow (Modeled After Amazon Warehouse)

  1. Order Arrival
    Orders arrive according to a Poisson process: λ = 20 orders/hour.

  2. Picking Process
    Item retrieval time is modeled as an exponential distribution (mean = 5 minutes).

  3. Packing Process
    Packaging time is exponential (mean = 3 minutes).

  4. Courier Dispatch & Delivery
    Travel time is modeled as a normal distribution with mean 30 min and SD 10 min, bounded to the range 5–60 min to avoid unrealistic extreme values. The code applies the bounds by clamping out-of-range draws to the nearest limit.

SLA Target

  • Each order must be delivered within 120 minutes

Simulation Parameters

  • Horizon: 1 week (7 days)
  • Warm-up removed: first 24 hours
  • Couriers tested: 3 to 15
  • Costs: driver wage $20/hour, vehicle cost $8/hour

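# Packages used throughout the report
library(simmer)
library(dplyr)
library(tibble)
library(ggplot2)

# Model parameters (all times in minutes unless noted)
params <- list(
  lambda_orders  = 20/60,     # arrival rate: 20 orders/hour = 1/3 per minute
  pick_mean      = 5,         # mean picking time
  pack_mean      = 3,         # mean packing time
  travel_mean    = 30,        # mean delivery travel time
  travel_sd      = 10,        # SD of delivery travel time
  sla_minutes    = 120,       # SLA threshold
  horizon_min    = 7*24*60,   # run length: 1 week
  warmup_min     = 24*60,     # warm-up period: first 24 hours
  wage_per_hr    = 20,        # driver wage ($/hour)
  vehicle_per_hr = 8,         # vehicle cost ($/hour)
  stage_mean     = 2          # unused: the staging step is commented out in the model
)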
simulate_last_mile <- function(num_couriers, params) {

  env <- simmer("warehouse_delivery")

  # Each order seizes, holds, and releases one unit of each resource in turn
  order_path <- trajectory("order") %>%
    seize("picker", 1) %>%
    timeout(function() rexp(1, 1/params$pick_mean)) %>%
    release("picker", 1) %>%
    seize("packer", 1) %>%
    timeout(function() rexp(1, 1/params$pack_mean)) %>%
    release("packer", 1) %>%
    seize("courier", 1) %>%
    # Normal travel time, clamped to [5, 60] minutes
    timeout(function() pmax(pmin(rnorm(1, params$travel_mean, params$travel_sd), 60), 5)) %>%
    release("courier", 1)

  env %>%
    add_resource("picker", 3) %>%
    add_resource("packer", 2) %>%
    ## add_resource("stager", 1) %>%
    add_resource("courier", num_couriers) %>%
    add_generator("order", order_path,
                  function() rexp(1, rate = params$lambda_orders))

  env %>% run(until = params$horizon_min)

  # Completed orders only (get_mon_arrivals() omits orders still in the system),
  # restricted to those that finish after the warm-up period
  arr <- get_mon_arrivals(env) %>%
    filter(end_time > params$warmup_min) %>%
    mutate(
      lead_time  = end_time - start_time,
      within_sla = lead_time <= params$sla_minutes
    )

  # Summary metrics
  mean_lead <- mean(arr$lead_time)
  p_sla     <- mean(arr$within_sla)
  n_orders  <- nrow(arr)

  # Courier cost covers the full horizon (warm-up included),
  # while n_orders counts only post-warm-up completions
  hours_total <- params$horizon_min / 60
  total_cost  <- num_couriers * hours_total * (params$wage_per_hr + params$vehicle_per_hr)
  cost_order  <- total_cost / n_orders

  tibble(
    num_couriers = num_couriers,
    mean_lead    = mean_lead,
    p_sla        = p_sla,
    cost_order   = cost_order,
    n_orders     = n_orders
  )
}
courier_range <- 3:15

# Run `reps` independent replications for each courier level and stack the results
run_experiment <- function(couriers, reps = 10, params) {
  out <- lapply(couriers, function(n) {
    replicate(reps, simulate_last_mile(n, params), simplify = FALSE) %>%
      bind_rows()
  })
  bind_rows(out)
}

results <- run_experiment(courier_range, 10, params)

Model Validation

The model was validated through multiple checks:

  • Logical validation: The process flow correctly follows order → picking → packing → delivery.
  • Bottleneck behavior: As expected, courier capacity becomes the limiting factor at low staffing levels.
  • Replication stability: Results were averaged over multiple replications to ensure consistency.
  • Face validity: Output metrics such as lead time and SLA performance align with realistic expectations for e-commerce delivery systems.
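The bottleneck check above can also be cross-checked by hand using offered load, the arrival rate multiplied by the mean service time, which gives the average number of busy servers a stage needs. The sketch below is a back-of-the-envelope calculation that assumes steady state and treats the mean travel time as 30 minutes (ignoring the small effect of the 5–60 minute bounds); the `utilization` helper and variable names are illustrative.

```r
# Back-of-the-envelope utilization check (assumes steady state; names are illustrative)
lambda_per_hr <- 20                              # arrival rate, orders per hour

# Offered load = arrival rate x mean service time (service times converted to hours)
offered_load <- c(picker  = lambda_per_hr * 5  / 60,
                  packer  = lambda_per_hr * 3  / 60,
                  courier = lambda_per_hr * 30 / 60)

utilization <- function(load, servers) load / servers

utilization(offered_load[["picker"]], 3)         # about 0.56
utilization(offered_load[["packer"]], 2)         # 0.50
courier_util <- utilization(offered_load[["courier"]], 3:15)

# Smallest courier count with utilization strictly below 1
min_stable_couriers <- min((3:15)[courier_util < 1])
min_stable_couriers                              # 11
```

Couriers carry an offered load of 10, so any staffing level of 10 or fewer cannot keep up with demand and queues grow for the whole run. This is consistent with the sharp change in simulated performance between 10 and 11 couriers.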

Each scenario was simulated with 10 replications to reduce stochastic variability and ensure stable performance estimates.
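One way to show how stable the replication averages are is to report a confidence interval alongside each mean. The sketch below is a minimal, assumed approach using a t-based interval; the helper name `replication_ci` is hypothetical, and the input is synthetic placeholder data, not output from the model.

```r
# Hypothetical helper: t-based confidence interval for the mean of replication outputs
replication_ci <- function(x, level = 0.95) {
  n          <- length(x)
  half_width <- qt(1 - (1 - level) / 2, df = n - 1) * sd(x) / sqrt(n)
  c(mean = mean(x), lower = mean(x) - half_width, upper = mean(x) + half_width)
}

# Placeholder input: 10 synthetic values standing in for one scenario's replications.
# These are NOT results from the model.
set.seed(604)
fake_reps <- rnorm(10, mean = 50, sd = 4)
replication_ci(fake_reps)
```

Applied to the per-replication values of `mean_lead` or `p_sla` within each courier level, a narrow interval would suggest that 10 replications are enough, while a wide one would suggest running more.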


Results

Summary Table

summary_results <- results %>%
  group_by(num_couriers) %>%
  summarise(
    avg_lead = mean(mean_lead),
    sla_rate = mean(p_sla),
    cost     = mean(cost_order)
  )

summary_results
# A tibble: 13 × 4
   num_couriers avg_lead sla_rate  cost
          <int>    <dbl>    <dbl> <dbl>
 1            3   4048.    0       16.3
 2            4   3424.    0       16.3
 3            5   2885.    0       16.3
 4            6   2350.    0       16.4
 5            7   1707.    0       16.3
 6            8   1184.    0       16.4
 7            9    605.    0.0217  16.3
 8           10    173.    0.357   16.4
 9           11     53.5   0.992   17.9
10           12     44.1   1.00    19.6
11           13     42.0   1       21.2
12           14     41.2   1       22.8
13           15     40.7   1       24.7

Lead Time Plot

ggplot(summary_results, aes(num_couriers, avg_lead)) +
  geom_line(color = "blue") +
  geom_point() +
  geom_hline(yintercept = params$sla_minutes, linetype = "dashed", color = "red") +
  labs(title = "Average Delivery Lead Time vs Couriers",
       x = "Number of Couriers", y = "Lead Time (minutes)")

Average lead time falls sharply as couriers are added, from about 4,048 minutes with 3 couriers to 53.5 minutes with 11, and then levels off at roughly 41 to 44 minutes. This indicates that delivery capacity is the primary bottleneck in the system.

SLA Compliance Plot

ggplot(summary_results, aes(num_couriers, sla_rate)) +
  geom_line(color = "darkgreen") +
  geom_point() +
  geom_hline(yintercept = 0.95, linetype = "dashed", color = "red") +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "SLA Compliance vs Couriers",
       x = "Couriers", y = "Probability of On-Time Delivery")

SLA compliance stays at or near zero through 9 couriers, reaches 35.7% with 10, and jumps to 99.2% with 11, the first staffing level that meets the 95% target.

Cost per Order Plot

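ggplot(summary_results, aes(num_couriers, cost)) +
  geom_line(color = "purple") +
  geom_point() +
  labs(title = "Cost per Order vs Couriers",
       x = "Couriers", y = "Cost per Order ($)")

Cost per order stays at about $16.3 to $16.4 for 3 to 10 couriers, because the number of completed orders grows in step with courier capacity while couriers are the bottleneck. From 11 couriers onward, throughput is capped by demand, so cost per order rises roughly linearly, from $17.9 to $24.7.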
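The cost values in the summary table can be approximated by hand from the model's cost formula. The derivation below is a rough check, not part of the simulation: it assumes each saturated courier completes about two deliveries per hour (mean travel time of 30 minutes) and that about 20 orders per hour are completed once courier capacity exceeds demand. The symbols are shorthand introduced here: c couriers, H = 168 h horizon, W = 24 h warm-up, w + v hourly courier cost, and N completed orders after warm-up.

```latex
\begin{align*}
\text{cost per order} &= \frac{c \cdot H \cdot (w + v)}{N},
  \qquad H = 168\ \text{h},\quad w + v = 20 + 8 = 28\ \$/\text{h} \\[4pt]
\text{couriers saturated } (c \le 10):\quad
  N &\approx 2c \cdot (H - W) = 288\,c
  \;\Rightarrow\; \text{cost per order} \approx \frac{168 \cdot 28}{288} \approx 16.3 \\[4pt]
\text{demand-limited } (c \ge 11):\quad
  N &\approx 20 \cdot (H - W) = 2880
  \;\Rightarrow\; \text{cost per order} \approx \frac{168 \cdot 28}{2880}\, c \approx 1.63\,c
\end{align*}
```

For c = 11 this gives about $18.0 and for c = 15 about $24.5, close to the simulated $17.9 and $24.7. The numerator uses the full 168-hour horizon while N counts only orders completed after the 24-hour warm-up, as in the model code.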

Optimization

Goal:

  • SLA ≥ 95%
  • Minimize cost per order

best <- summary_results %>%
  filter(sla_rate >= 0.95) %>%
  arrange(cost) %>%
  slice(1)
best
# A tibble: 1 × 4
  num_couriers avg_lead sla_rate  cost
         <int>    <dbl>    <dbl> <dbl>
1           11     53.5    0.992  17.9

Cost vs SLA Tradeoff Curve

ggplot(summary_results, aes(cost, sla_rate)) +
  geom_point(size = 3) +
  geom_line() +
  geom_hline(yintercept = 0.95, linetype = "dashed", color = "red") +
  labs(title = "Cost vs SLA Tradeoff Curve",
       x = "Cost per Order ($)",
       y = "SLA Compliance")

The cost vs. SLA tradeoff curve illustrates the relationship between service quality and operational expense. Up to 10 couriers, cost per order is nearly constant (about $16.3 to $16.4) while SLA compliance stays below 36%. From 11 couriers onward, compliance is at or near 100% and each additional courier only adds cost. The optimal staffing level is therefore 11 couriers, the cheapest configuration that satisfies the 95% SLA requirement.

In the base scenario, staffing 11 couriers results in an average lead time of about 54 minutes, SLA compliance of 99.2%, and an estimated cost of about $17.90 per order.

Conclusion

The simulation model represents warehouse staff and couriers and studies how courier staffing affects the overall last-mile delivery process.

The following observations can be made based on the simulation results. Courier staffing is the most effective lever tested for improving lead time and SLA compliance:

  • With more couriers, lead time is lower and SLA compliance is higher, up to the point where delivery capacity exceeds demand.
  • Cost per order is roughly flat (about $16.3) while couriers are the bottleneck and rises almost linearly from 11 couriers onward, which establishes a clear cost–service trade-off beyond that point.
  • The most cost-effective staffing level is 11 couriers, which meets the SLA target (≥ 95%) at the lowest cost per order (about $17.90).
  • With fewer couriers, courier capacity is the performance bottleneck: long delivery queues cause most orders to miss the SLA deadline.
  • Picking and packing could also become bottlenecks at lower capacities. At the fixed levels used here (3 pickers, 2 packers), their expected utilization is only about 56% and 50%, so their effect on system performance is much weaker than that of courier capacity.

These observations are relevant to Amazon and other e-commerce operations. In particular, robotics is used to increase picking rates in warehouses, packing stations can become bottlenecks, and courier availability is a major driver of on-time delivery and customer experience.

The model abstracts away several real-world complexities, such as traffic variability, dynamic routing, peak-hour demand, and driver heterogeneity. Adding these features is a natural extension of the model.

The model shows that, for a given service target, there is a least-cost operating point, which supports data-informed staffing decisions.