Delivery Delays Are Olist’s Clearest Customer Experience Risk

Executive Summary
Decision Question and Scope
Late Delivery Is A Focused, High-Impact Problem
Severe Delays Are The Satisfaction Break Point
The Risk Should Be Managed By Time And Geography
Recommended Board Actions
Further Questions For Management
Data Confidence Check
Caveats And Assumptions
AI + Human Audit

Executive Summary

Delivery is a clear customer-experience risk, even though most orders arrive on time. Only 8.1% of delivered orders arrived after the estimated date, but those late orders had a sharply worse customer response.
Late delivery is associated with a major satisfaction penalty. On-time or early orders averaged 4.28 stars, while late orders averaged 2.55 stars. Bad reviews were 5.8 times more common for late orders than for on-time or early orders.
The board-level implication is operational focus, not broad repositioning. The problem is concentrated enough to target: reduce severe delays first, monitor bad-review rate as the customer-impact KPI, and prioritize high-volume states where late delivery is above average.

Decision Question and Scope

The business question for this report is:

How strongly is delivery performance connected to customer satisfaction, and where should Olist prioritize operational improvements?

This report defines a late order as a delivered order where the customer delivery date is later than the estimated delivery date. Customer satisfaction is measured using review score, and a bad review means a 1- or 2-star review.

The analysis uses delivered orders with complete delivery-date fields. Canceled, unavailable, and still-in-progress orders are excluded from delivery-performance comparisons.
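
The definitions above can be sketched in a few lines. This is a minimal Python illustration, not the report's own RMarkdown code. The column names (`order_status`, `order_delivered_customer_date`, `order_estimated_delivery_date`) and the timestamp format are assumptions about the CSV layout, and the sample rows are hypothetical. It also assumes lateness is judged on full timestamps rather than calendar dates.

```python
from datetime import datetime

# Assumed timestamp layout; adjust to match the actual CSV export.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_ts(value):
    """Parse a timestamp string, returning None for empty fields."""
    return datetime.strptime(value, DATE_FORMAT) if value else None


def in_scope(order):
    """Keep delivered orders that have both delivery-date fields."""
    return (
        order["order_status"] == "delivered"
        and bool(order["order_delivered_customer_date"])
        and bool(order["order_estimated_delivery_date"])
    )


def is_late(order):
    """Late means the customer delivery date is after the estimated date."""
    delivered = parse_ts(order["order_delivered_customer_date"])
    estimated = parse_ts(order["order_estimated_delivery_date"])
    return delivered > estimated


def is_bad_review(score):
    """A bad review is a 1- or 2-star review."""
    return score <= 2


# Hypothetical rows for illustration only, not real Olist records.
orders = [
    {"order_status": "delivered",
     "order_delivered_customer_date": "2018-01-10 14:00:00",
     "order_estimated_delivery_date": "2018-01-15 00:00:00"},
    {"order_status": "delivered",
     "order_delivered_customer_date": "2018-02-20 09:30:00",
     "order_estimated_delivery_date": "2018-02-12 00:00:00"},
    {"order_status": "canceled",
     "order_delivered_customer_date": "",
     "order_estimated_delivery_date": "2018-03-01 00:00:00"},
    {"order_status": "delivered",
     "order_delivered_customer_date": "",
     "order_estimated_delivery_date": "2018-03-05 00:00:00"},
]

scoped = [o for o in orders if in_scope(o)]
late_flags = [is_late(o) for o in scoped]
print(len(scoped), late_flags)
```

Only the first two sample rows pass the scope filter: the canceled order and the delivered order with a missing delivery date are both dropped before any lateness comparison is made.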

Late Delivery Is A Focused, High-Impact Problem

Late delivery is not the typical customer experience, but it is large enough to matter at marketplace scale. The first chart shows that most delivered orders arrive on time or early, which means the issue is operationally addressable rather than universal.

The customer impact is much larger than the incidence rate suggests. Late orders averaged 2.55 stars, compared with 4.28 stars for on-time or early orders, and bad reviews were 5.8 times more common among late orders.

So what: the company does not need to redesign the whole customer journey to improve this metric. It should identify the operating conditions that create late deliveries and attack those failure points directly.
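
The comparison in this section rests on two simple statistics per group: the average review score and the bad-review rate, with the "times more common" figure being the ratio of the two bad-review rates. The Python sketch below shows that arithmetic under stated assumptions. The score lists are hypothetical and do not reproduce the report's figures.

```python
from statistics import mean


def bad_review_rate(scores):
    """Share of reviews that are 1 or 2 stars."""
    return sum(1 for s in scores if s <= 2) / len(scores)


# Hypothetical review scores, for illustration only.
on_time_scores = [5, 5, 4, 1, 5, 4, 5, 3, 5, 5]
late_scores = [1, 2, 5, 1]

on_time_mean = mean(on_time_scores)
late_mean = mean(late_scores)

on_time_bad = bad_review_rate(on_time_scores)
late_bad = bad_review_rate(late_scores)

# How many times more common bad reviews are among late orders.
rate_ratio = late_bad / on_time_bad

print(on_time_mean, late_mean, round(rate_ratio, 2))
```

A ratio of rates is sensitive to a small denominator, so it reads best alongside the two underlying rates rather than on its own.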

Severe Delays Are The Satisfaction Break Point

The most important operational target is not just “late” versus “not late.” The review penalty increases sharply as delays get longer, which makes severe-delay prevention the stronger board-level target.

The distribution of delivery days reinforces the same point. Late orders are not merely shifted a little to the right; they have a longer tail of very slow deliveries.

So what: Olist should treat severe late deliveries as a customer-trust issue. A board KPI such as “share of delivered orders more than 7 days late” would be more actionable than average delivery time alone.
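
One way the proposed KPI could be computed is sketched below in Python. The 7-day threshold comes from the text. The timestamp format, the choice to measure delay in fractional days, and the sample deliveries are assumptions made for illustration.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout
SEVERE_DAYS = 7  # threshold named in the KPI


def days_late(delivered, estimated):
    """Delay in days; negative values mean the order arrived early."""
    delta = (datetime.strptime(delivered, DATE_FORMAT)
             - datetime.strptime(estimated, DATE_FORMAT))
    return delta.total_seconds() / 86400


# Hypothetical (delivered, estimated) pairs for delivered orders.
deliveries = [
    ("2018-01-12 10:00:00", "2018-01-15 00:00:00"),  # early
    ("2018-01-15 12:00:00", "2018-01-15 00:00:00"),  # half a day late
    ("2018-01-23 00:00:00", "2018-01-15 00:00:00"),  # 8 days late
    ("2018-01-14 00:00:00", "2018-01-15 00:00:00"),  # early
    ("2018-01-25 00:00:00", "2018-01-15 00:00:00"),  # 10 days late
]

delays = [days_late(d, e) for d, e in deliveries]

# Both shares use all delivered orders as the denominator.
late_share = sum(1 for x in delays if x > 0) / len(delays)
severe_share = sum(1 for x in delays if x > SEVERE_DAYS) / len(delays)

print(late_share, severe_share)
```

Keeping all delivered orders as the denominator for both shares makes the two KPIs directly comparable: the severe share is always a subset of the late share.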

The Risk Should Be Managed By Time And Geography

The late-delivery rate changes over time, which suggests the issue may be sensitive to seasonal volume, operational capacity, carrier performance, or regional constraints. The next step is to connect peaks in late delivery to specific operational causes.

Geography gives management a clearer action path. Among the largest customer states, late-delivery rates vary meaningfully. That variation can guide where operations should investigate carrier performance, fulfillment coverage, and customer communication first.

So what: the delivery problem can be managed as a targeted operating program. Start with the highest-volume states where late-delivery risk is above average, then work backward to carriers, sellers, and fulfillment routes.
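
The prioritization rule described here (largest states first, then those with above-average late rates) can be expressed as a short filter. The Python sketch below is one possible reading of that rule. The function name `priority_states`, the `top_n` cutoff, and the sample counts are hypothetical, and "above average" is taken to mean above the overall late-delivery rate across all rows.

```python
from collections import Counter


def priority_states(rows, top_n):
    """Return the top_n states by volume whose late rate exceeds the overall rate.

    rows is a sequence of (state, is_late) pairs, one per delivered order.
    """
    rows = list(rows)
    volume = Counter(state for state, _ in rows)
    late = Counter(state for state, is_late in rows if is_late)
    overall_rate = sum(late.values()) / len(rows)
    largest = [state for state, _ in volume.most_common(top_n)]
    return [s for s in largest if late[s] / volume[s] > overall_rate]


# Hypothetical order counts per state, for illustration only.
rows = (
    [("SP", i < 1) for i in range(10)]   # 10 orders, 1 late
    + [("RJ", i < 2) for i in range(6)]  # 6 orders, 2 late
    + [("BA", i < 2) for i in range(4)]  # 4 orders, 2 late
    + [("AC", True)]                     # 1 order, 1 late
)

priority = priority_states(rows, top_n=3)
print(priority)
```

In this sample the smallest state has the highest late rate but falls outside the volume cutoff, which is the trade-off the rule makes: it favors states where an intervention would touch the most orders.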

Recommended Board Actions

Set a customer-impact delivery KPI. Track the share of delivered orders arriving late and the share arriving more than 7 days late. Pair both with bad-review rate so the KPI reflects customer impact, not just operational speed.
Launch a severe-delay reduction program. Prioritize orders that are likely to miss the estimated date by more than a week. These orders carry the highest review-risk signal and are the clearest threat to customer trust.
Prioritize by state before scaling interventions nationally. Use high-volume states with elevated late-delivery rates as the first operating focus. This should produce a clearer test of whether delivery interventions improve reviews.
Improve customer communication around delivery risk. If an order is likely to be late, proactive updates may reduce dissatisfaction even before fulfillment performance fully improves.

Further Questions For Management

Which carriers, sellers, or fulfillment routes explain the highest-delay states?
Are late deliveries concentrated in specific product categories, weights, or freight-cost bands?
Do proactive customer updates reduce bad-review rates for delayed orders?
What portion of late orders are caused by carrier delay versus seller handling delay?

Data Confidence Check

The analysis uses the local Olist CSV files in the data folder. The joins were checked at the order level before making delivery claims.

table	rows	columns
orders	99441	8
customers	99441	5
items	112650	7
payments	103886	5
reviews	100000	7

check	value
Rows in raw orders table	99441
Rows after order-level joins	99441
Delivered orders with actual and estimated delivery dates	96470
Delivered orders with delivery dates and reviews	96470
Orders with missing customer state after join	0
Delivered rows with negative delivery days	0
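
Checks of this kind can be reproduced with a few counts before and after the join. The Python sketch below is an assumption-laden illustration, not the report's verification code: the field names (`customer_id`, `customer_state`, `delivery_days`), the helper names, and the sample rows are all hypothetical.

```python
from collections import defaultdict


def join_orders_to_customers(orders, customers):
    """Left-join orders to customers on customer_id.

    Duplicate customer_id values fan out into extra rows, which is
    exactly what the row-count check is meant to catch.
    """
    by_id = defaultdict(list)
    for customer in customers:
        by_id[customer["customer_id"]].append(customer)
    joined = []
    for order in orders:
        matches = by_id.get(order["customer_id"]) or [{"customer_state": None}]
        for customer in matches:
            joined.append({**order, "customer_state": customer["customer_state"]})
    return joined


def run_checks(orders, joined):
    """Count rows before and after the join, plus basic data-quality flags."""
    return {
        "rows_raw": len(orders),
        "rows_joined": len(joined),
        "missing_state": sum(1 for r in joined if not r["customer_state"]),
        "negative_delivery_days": sum(
            1 for r in joined
            if r["delivery_days"] is not None and r["delivery_days"] < 0
        ),
    }


# Hypothetical rows for illustration only.
orders = [
    {"order_id": "o1", "customer_id": "c1", "delivery_days": 5},
    {"order_id": "o2", "customer_id": "c2", "delivery_days": 12},
    {"order_id": "o3", "customer_id": "c3", "delivery_days": None},
]
customers = [
    {"customer_id": "c1", "customer_state": "SP"},
    {"customer_id": "c2", "customer_state": "RJ"},
    {"customer_id": "c3", "customer_state": "BA"},
]

joined = join_orders_to_customers(orders, customers)
checks = run_checks(orders, joined)
print(checks)
```

A join that preserves the row count and leaves no missing states is a necessary condition for order-level claims, though it does not by itself confirm that the delivery dates were parsed correctly.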

Caveats And Assumptions

This is observational analysis. It shows a strong association between late delivery and bad reviews, but it does not prove that late delivery causes the lower scores or that delivery is the only source of dissatisfaction.
Review score is used as the customer satisfaction measure.
The analysis excludes orders that were not delivered, including canceled, unavailable, and still-in-progress orders.
If an order has multiple review rows, the report uses the average review score for that order.
The dataset does not include internal logistics root causes, so the next management step is operational diagnosis.
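
The multiple-review rule in the caveats above can be illustrated with a small grouping step. This is a Python sketch under assumptions: the function name `order_level_scores` and the sample review rows are hypothetical, and the report's own implementation is in RMarkdown.

```python
from collections import defaultdict
from statistics import mean


def order_level_scores(review_rows):
    """Collapse review rows to one average score per order.

    review_rows is a sequence of (order_id, review_score) pairs.
    """
    scores = defaultdict(list)
    for order_id, score in review_rows:
        scores[order_id].append(score)
    return {order_id: mean(values) for order_id, values in scores.items()}


# Hypothetical review rows; order "o1" has two reviews.
review_rows = [("o1", 5), ("o1", 1), ("o2", 4), ("o3", 2)]

order_scores = order_level_scores(review_rows)
print(order_scores)
```

Averaging can produce fractional scores such as 2.5, so the 1- or 2-star definition of a bad review needs an explicit rule for those orders. The source does not state which rule was applied, so this is a point for the human reviewer to confirm.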

AI + Human Audit

What Codex helped with:

Built the RMarkdown structure and data joins.
Generated the ggplot2 visualizations.
Applied the Wes Anderson color palette and heat-style bar gradients.
Added verification checks for row counts, joins, date parsing, and missing values.
Rewrote the report into a board-ready narrative.

What the human chose or should verify:

The final business question and recommendation.
Whether “late” should mean any delivery after the estimated date or only delays above a larger threshold.
Whether the board should focus nationally or on a specific state, seller group, carrier group, or product category.
Whether the visual story is clear enough for the target audience.

What remains uncertain:

The dataset does not contain internal logistics causes, so it cannot identify the exact operational owner of each delay.
The dataset does not directly measure product defects or returns.
The data is historical and should not be treated as current Olist marketplace performance.