Develop a simulation model that represents the process of fulfilling a customer order, where a product is picked from a warehouse and delivered to the customer after being purchased through an e-commerce platform.
This project simulates the last-mile delivery process used by modern e-commerce companies such as Amazon. A discrete-event simulation is developed in R using the simmer package to model the complete flow from order arrival through warehouse picking and packing to courier-based delivery. The goal is to measure performance (lead time, SLA compliance) and find the number of couriers needed to achieve a 95% on-time delivery target while minimizing cost.
E-commerce companies rely heavily on efficient warehouse operations and delivery logistics. Once a customer places an order on a website, multiple processes occur:
Order enters system
Warehouse worker picks items
Items are packed
A courier loads the package
Delivery is made to the customer
This simulation replicates these steps and evaluates different staffing levels.
This model is inspired by:
Amazon fulfillment center robotics
Optimization strategies described in last-mile delivery research
Warehouse process engineering references provided in class
The final objective is to:
✔ Evaluate system performance under different courier staffing levels
✔ Meet SLA target: 95% of deliveries within 120 minutes
✔ Minimize cost per delivered order
The conceptual model defines the structure of the simulation system by identifying its core components and their interactions.
Entities
Resources
Queues
Processes
Performance Metrics
System Flow
Order → Picking → Packing → Delivery
This diagram visually represents the flow of orders through the system, including processing stages and queues between resources.
Figure: Conceptual flow of the last-mile delivery system
This conceptual structure ensures that the simulation captures the essential operations of an e-commerce last-mile delivery system.
Key Assumptions
The simulation follows a discrete-event process flow where each order moves sequentially through the system.
Orders arrive according to a Poisson process, meaning interarrival times follow an exponential distribution.
Each order enters the picking stage, where it:
1. Seizes a picker if one is available
2. Otherwise waits in a queue
After picking, the order moves to packing, following the same logic.
Once packed, the order proceeds to the delivery stage, where:
1. A courier must be available
2. If not, the order waits in the delivery queue
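The seize-or-wait logic described above can be sketched in base R without simmer. This is only an illustration, not the model used in this report: the helper name simulate_stage is hypothetical, the queue is assumed to be first-come, first-served, and the rates (20 orders/hour, 5-minute picking with 3 pickers, 3-minute packing with 2 packers) are taken from the parameter section.

```r
# Minimal sketch of one stage with `servers` identical workers and a FIFO queue.
# `arrivals` must be sorted; returns the time each order leaves the stage.
simulate_stage <- function(arrivals, service, servers) {
  free_at <- rep(0, servers)              # time at which each worker is next free
  finish <- numeric(length(arrivals))
  for (i in seq_along(arrivals)) {
    k <- which.min(free_at)               # worker that becomes free first
    start <- max(arrivals[i], free_at[k]) # seize immediately, or wait in the queue
    finish[i] <- start + service[i]
    free_at[k] <- finish[i]               # release the worker at completion
  }
  finish
}

set.seed(1)
n <- 200
arrivals <- cumsum(rexp(n, rate = 20 / 60))  # Poisson arrivals, times in minutes
pick_done <- simulate_stage(arrivals, rexp(n, rate = 1 / 5), servers = 3)
# Orders can leave picking out of order, so sort before the next stage.
# Sorting drops order identity, which is fine for aggregate flow only.
pack_done <- simulate_stage(sort(pick_done), rexp(n, rate = 1 / 3), servers = 2)
```

Chaining a third call with the courier count as servers would give a rough picture of the delivery queue, under the same assumptions.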
Queue Discipline
Resource Constraints
Processing Times
Key Assumption
This logic replicates real-world warehouse operations where orders must pass through each stage before completion.
When a customer places an order, it enters the system and begins its journey through the warehouse and delivery network.
First, the order joins the picking queue. If a picker is available, the item retrieval process starts immediately; otherwise, the order waits. After picking is completed, the order moves to the packing stage, where it is prepared for shipment.
Once packed, the order enters the delivery queue, where it must wait for an available courier. This stage is critical because courier availability directly impacts delivery time. If couriers are limited, orders accumulate in the queue, increasing delays.
After a courier is assigned, the order is delivered to the customer, completing the process. The total time from arrival to delivery is recorded as the lead time, which determines whether the order meets the Service Level Agreement (SLA).
The simulation shows that while picking and packing are important, the delivery stage is the primary bottleneck, as it has the greatest impact on overall system performance and customer satisfaction.
Note: This simulation model is implemented using the simmer package in R. While Simio is commonly used for discrete-event simulation, simmer provides equivalent functionality including event scheduling, resource allocation, and queue-based process modeling. The same simulation logic and structure applied in Simio are replicated here programmatically.
Mapping to Simio Objects
add_generator() → Source (order arrivals)
seize() → Server (resource allocation)
timeout() → Processing time
release() → Resource release
This mapping ensures that the same discrete-event logic used in Simio is preserved.
System Flow (Modeled After Amazon Warehouse)
Order Arrival
Orders arrive according to a Poisson process with rate λ = 20 orders/hour.
Picking Process
Item retrieval time is modeled as an exponential distribution (mean = 5 minutes).
Packing Process
Packaging time is exponential (mean = 3 minutes).
Courier Dispatch & Delivery
Travel time is modeled as a normal distribution with mean 30 min and SD 10 min, clamped to the range 5 to 60 min (draws outside the range are set to the nearest bound) to avoid unrealistic extreme values.
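The travel-time rule can be tried in isolation. The sketch below assumes the same pmax/pmin construction that the model code uses, which sets out-of-range draws to the nearest bound rather than redrawing them; the function name draw_travel_time is hypothetical.

```r
# Draw n travel times (minutes) from N(mean, sd), limited to [lo, hi]
# by replacing out-of-range values with the nearest bound.
draw_travel_time <- function(n, mean = 30, sd = 10, lo = 5, hi = 60) {
  pmax(pmin(rnorm(n, mean, sd), hi), lo)
}

set.seed(42)
travel <- draw_travel_time(10000)
range(travel)                        # stays within 5 and 60
mean(travel == 5 | travel == 60)     # share of draws that hit a bound
```

Because the bounds sit 2.5 and 3 standard deviations from the mean, only a small share of draws is affected and the mean stays close to 30 minutes.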
SLA Target
Each order must be delivered within 120 minutes.
Simulation Parameters
Horizon: 1 week (7 days)
Warm-up removed: first 24 hours
Couriers tested: 3 to 15
Costs:
Driver wage: $20/hour
Vehicle cost: $8/hour
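Before running the simulation, a back-of-the-envelope capacity check indicates which courier levels can keep up with demand at all. The sketch below computes the offered load (arrival rate times mean service time) for each stage. It assumes the picker and packer counts used in the model code (3 and 2) and uses the nominal 30-minute travel mean, ignoring the small effect of the 5 to 60 minute bounds.

```r
arrival_per_hr <- 20

stages <- data.frame(
  stage            = c("picker", "packer", "courier"),
  mean_service_min = c(5, 3, 30),   # courier: nominal mean, bounds ignored
  servers          = c(3, 2, NA)    # courier count is the decision variable
)

# Offered load = average number of busy servers needed to keep up with arrivals
stages$offered_load <- arrival_per_hr * stages$mean_service_min / 60
stages$utilization  <- stages$offered_load / stages$servers

# A stage is stable only if servers > offered load
courier_load <- stages$offered_load[stages$stage == "courier"]
min_couriers <- floor(courier_load) + 1

stages
min_couriers
```

Under these assumptions pickers and packers run at roughly half capacity, while the delivery stage needs more than 10 couriers before its queue stops growing.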
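library(simmer)   # discrete-event simulation engine
library(dplyr)    # filter(), mutate(), summarise(), bind_rows(), tibble()
library(ggplot2)  # plots

# All times are in minutes unless noted otherwise.
params <- list(
  lambda_orders  = 20/60,    # arrival rate: 20 orders/hour, in orders/minute
  pick_mean      = 5,        # mean picking time
  pack_mean      = 3,        # mean packing time
  travel_mean    = 30,       # mean courier travel time
  travel_sd      = 10,       # SD of courier travel time
  sla_minutes    = 120,      # SLA: delivered within 120 minutes
  horizon_min    = 7*24*60,  # simulated horizon: 1 week
  warmup_min     = 24*60,    # warm-up period: first 24 hours
  wage_per_hr    = 20,       # driver wage ($/hour)
  vehicle_per_hr = 8,        # vehicle cost ($/hour)
  stage_mean     = 2         # unused: the staging step is disabled below
)

simulate_last_mile <- function(num_couriers, params) {
  env <- simmer("warehouse_delivery")

  # Each order seizes a picker, then a packer, then a courier.
  order_path <- trajectory("order") %>%
    seize("picker", 1) %>%
    timeout(function() rexp(1, 1/params$pick_mean)) %>%
    release("picker", 1) %>%
    seize("packer", 1) %>%
    timeout(function() rexp(1, 1/params$pack_mean)) %>%
    release("packer", 1) %>%
    seize("courier", 1) %>%
    # Normal travel time, limited to [5, 60] minutes
    timeout(function() pmax(pmin(rnorm(1, params$travel_mean, params$travel_sd), 60), 5)) %>%
    release("courier", 1)

  env %>%
    add_resource("picker", 3) %>%
    add_resource("packer", 2) %>%
    ## add_resource("stager", 1) %>%   # staging step disabled
    add_resource("courier", num_couriers) %>%
    add_generator("order",
                  order_path,
                  function() rexp(1, rate = params$lambda_orders))

  env %>% run(until = params$horizon_min)

  # Completed orders only; drop those delivered during the warm-up period.
  arr <- get_mon_arrivals(env) %>%
    filter(end_time > params$warmup_min) %>%
    mutate(
      lead_time  = end_time - start_time,
      within_sla = lead_time <= params$sla_minutes
    )

  # Summary metrics
  mean_lead <- mean(arr$lead_time)
  p_sla <- mean(arr$within_sla)
  n_orders <- nrow(arr)

  # Note: cost covers the full horizon, while n_orders counts only
  # orders completed after the warm-up period.
  hours_total <- params$horizon_min / 60
  total_cost <- num_couriers * hours_total * (params$wage_per_hr + params$vehicle_per_hr)
  cost_order <- total_cost / n_orders

  tibble(
    num_couriers = num_couriers,
    mean_lead = mean_lead,
    p_sla = p_sla,
    cost_order = cost_order,
    n_orders = n_orders
  )
}

courier_range <- 3:15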
# Run `reps` independent replications per courier level and stack the results.
run_experiment <- function(couriers, reps = 10, params) {
  out <- lapply(couriers, function(n) {
    replicate(reps, simulate_last_mile(n, params), simplify = FALSE) %>%
      bind_rows()
  })
  bind_rows(out)
}

results <- run_experiment(courier_range, 10, params)

The model was validated through multiple checks:
Each scenario was simulated with 10 replications to reduce stochastic variability and ensure stable performance estimates.
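One way to judge whether 10 replications give a stable estimate is a confidence interval across replications. The sketch below is an illustration rather than part of the original analysis: the helper replication_ci is hypothetical, the interval assumes roughly normal replication means (Student t), and the p_sla values are made-up placeholders, not output from this model.

```r
# t-based confidence interval for the mean of replication-level results
replication_ci <- function(x, level = 0.95) {
  n <- length(x)
  half_width <- qt(1 - (1 - level) / 2, df = n - 1) * sd(x) / sqrt(n)
  c(mean = mean(x), lower = mean(x) - half_width, upper = mean(x) + half_width)
}

# Hypothetical SLA rates from 10 replications of one scenario
p_sla_reps <- c(0.991, 0.994, 0.989, 0.993, 0.992,
                0.990, 0.995, 0.992, 0.991, 0.993)
ci <- replication_ci(p_sla_reps)
ci
```

If the lower bound of the interval stays above 0.95, the scenario can be said to meet the SLA target with some confidence rather than on the point estimate alone.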
Summary Table
# Average each metric over the replications for each courier level
summary_results <- results %>%
  group_by(num_couriers) %>%
  summarise(
    avg_lead = mean(mean_lead),
    sla_rate = mean(p_sla),
    cost = mean(cost_order)
  )
summary_results

# A tibble: 13 × 4
   num_couriers avg_lead sla_rate  cost
          <int>    <dbl>    <dbl> <dbl>
 1            3   4048.    0       16.3
 2            4   3424.    0       16.3
 3            5   2885.    0       16.3
 4            6   2350.    0       16.4
 5            7   1707.    0       16.3
 6            8   1184.    0       16.4
 7            9    605.    0.0217  16.3
 8           10    173.    0.357   16.4
 9           11     53.5   0.992   17.9
10           12     44.1   1.00    19.6
11           13     42.0   1       21.2
12           14     41.2   1       22.8
13           15     40.7   1       24.7
Lead Time Plot
ggplot(summary_results, aes(num_couriers, avg_lead)) +
  geom_line(color="blue") +
  geom_point() +
  geom_hline(yintercept = params$sla_minutes, linetype="dashed", color="red") +
  labs(title="Average Delivery Lead Time vs Couriers",
       x="Number of Couriers", y="Lead Time (minutes)")

Lead time decreases sharply as the number of couriers increases, from about 4,048 minutes with 3 couriers to 53.5 minutes with 11, indicating that delivery capacity is the primary bottleneck in the system.
SLA Compliance Plot
ggplot(summary_results, aes(num_couriers, sla_rate)) +
  geom_line(color="darkgreen") +
  geom_point() +
  geom_hline(yintercept = 0.95, linetype="dashed", color="red") +
  scale_y_continuous(labels=scales::percent) +
  labs(title="SLA Compliance vs Couriers",
       x="Couriers", y="Probability of On-Time Delivery")

SLA compliance improves with additional couriers and crosses the 95% target at approximately 11 couriers.
Cost per Order Plot
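ggplot(summary_results, aes(num_couriers, cost)) +
  geom_line(color="purple") +
  geom_point() +
  labs(title="Cost per Order vs Couriers",
       x="Couriers", y="Cost per Order ($)")

Cost per order stays near $16.30 for 3 to 10 couriers, because the couriers are saturated and the number of completed orders grows in proportion to the courier count. From 11 couriers onward it rises roughly linearly, since each additional courier adds operating cost without adding completed orders.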
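The cost figures in the summary table can be approximated by hand. The sketch below is a deterministic approximation, not the simulation itself: the function name approx_cost_per_order is hypothetical, and it assumes that completed orders per hour equal the smaller of the arrival rate and the courier capacity (couriers × 60 / 30 deliveries per hour), that orders are counted over the 144 post-warm-up hours, and that cost accrues over all 168 hours at $28 per courier-hour, as in simulate_last_mile.

```r
approx_cost_per_order <- function(couriers,
                                  arrival_per_hr  = 20,
                                  travel_mean_min = 30,
                                  horizon_hr      = 168,
                                  warmup_hr       = 24,
                                  hourly_cost     = 20 + 8) {
  counted_hr <- horizon_hr - warmup_hr                 # hours in which orders are counted
  capacity_per_hr <- couriers * 60 / travel_mean_min   # deliveries couriers can complete
  completed <- pmin(arrival_per_hr, capacity_per_hr) * counted_hr
  couriers * horizon_hr * hourly_cost / completed
}

approx <- approx_cost_per_order(c(5, 10, 11, 12))
round(approx, 2)
```

In this approximation the courier count cancels out of the ratio whenever couriers are the binding constraint, which is why the cost is flat up to 10 couriers and only starts to climb once capacity exceeds demand.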
Goal:
SLA ≥ 95%
Minimize cost per order
# A tibble: 1 × 4
num_couriers avg_lead sla_rate cost
<int> <dbl> <dbl> <dbl>
1 11 53.5 0.992 17.9
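The selection rule stated in the goal (lowest cost among staffing levels with SLA ≥ 95%) can be written as a short filter. The sketch below uses base R and re-enters the values from the summary table so that it runs on its own; the original analysis may have used a different (for example dplyr-based) expression.

```r
# Values copied from the summary table above
summary_results <- data.frame(
  num_couriers = 3:15,
  avg_lead = c(4048, 3424, 2885, 2350, 1707, 1184, 605, 173,
               53.5, 44.1, 42.0, 41.2, 40.7),
  sla_rate = c(0, 0, 0, 0, 0, 0, 0.0217, 0.357, 0.992, 1, 1, 1, 1),
  cost     = c(16.3, 16.3, 16.3, 16.4, 16.3, 16.4, 16.3, 16.4,
               17.9, 19.6, 21.2, 22.8, 24.7)
)

# Keep staffing levels that meet the SLA, then take the cheapest one
feasible <- subset(summary_results, sla_rate >= 0.95)
best <- feasible[which.min(feasible$cost), ]
best
```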
Cost vs SLA Tradeoff Curve
ggplot(summary_results, aes(cost, sla_rate)) +
  geom_point(size=3) +
  geom_line() +
  geom_hline(yintercept=0.95, linetype="dashed", color="red") +
  labs(title="Cost vs SLA Tradeoff Curve",
       x="Cost per Order ($)",
       y="SLA Compliance")

The cost vs. SLA tradeoff curve illustrates the relationship between service quality and operational expense. As the number of couriers increases, SLA compliance improves but at a higher cost per order. The optimal staffing level occurs at 11 couriers, where the system first satisfies the 95% SLA requirement and has the lowest cost per order among the staffing levels that meet it.
In the base scenario, staffing 11 couriers results in an average lead time of about 54 minutes, SLA compliance of 99.2%, and an estimated cost of about $17.90 per order. This is the smallest courier level that meets the 95% SLA requirement, and therefore the lowest-cost compliant option.
The simulation model represents warehouse staff and couriers and studies the impact of courier staffing on the overall last-mile delivery process.
The following observations can be made based on the simulation results. Couriers are the most effective lever to improve the lead time and the overall SLA for the delivery service:
The observations are relevant to the actual settings of Amazon and e-commerce operations. In particular, robotics is being employed to increase the picking rates in the warehouses, the packing stations are often the bottleneck in the system, and the availability of the couriers is a major driver of customer experience in on-time delivery service.
The model abstracts away several real-world complexities, such as traffic variability, dynamic routing, peak-hour demand, and driver heterogeneity. Adding these features is a natural extension of the model.
The model demonstrates the existence of an optimal operating point for a given level of service and cost that is helpful for data-informed decision-making.