Data Provenance: Baseline market benchmarks were synthesized from the USDA AgTransport Open Data Portal.
Variable Engineering: I engineered proprietary variables to simulate full, actualized operational data used to reduce costing.
This report identifies “System Leakage” within FFE operations by pinpointing “False Wins”—loads that look profitable but result in a net loss. By auditing internal costs by month and Rate Per Mile by destination, we isolate the specific market ceilings and high-burn months eroding our margins. The goal is to pivot toward high-density freight to ensure every trailer maximizes value.
Cleaning The Data
Identified Oxnard Routes as variance outliers. The variance for these are all higher than 10.
# Load and clean data
FFE <- read_excel("C:FFE_DATA_R.xlsx") %>%
na.omit()
# Define the Audit Status to categorize load performance
FFE_clean <- FFE %>%
filter(Variance < 10) %>%
mutate(
Audit_Status = case_when(
Operating_Ratio <= 100 ~ "Profitable/Healthy",
`City_Specific_Variance_Percentage` > 0 ~ "System Leakage (False Win)",
TRUE ~ "Known Loss (Below Target)"
)
)
FFE_clean %>%
select(Date, Origin, Destination, Distance, `Rate Per Mile`, Operating_Ratio) %>%
head(5) %>%
kable(
caption = "Snapshot of FFE Logistics Data (First 5 Rows)",
col.names = c("Date", "Origin", "Destination", "Distance (Miles)", "RPM ($)", "Operating Ratio (%)"),
align = "l"
)
| Date | Origin | Destination | Distance (Miles) | RPM ($) | Operating Ratio (%) |
|---|---|---|---|---|---|
| 2024-01-09 | IMPERIAL, COACHELLA VALLEYS CA, CENTRAL AND WESTERN AZ, MEXICO CROSSINGS THROUGH CALEXICO AND SAN LUIS | ATLANTA | 2100 | 2.571429 | 95.83333 |
| 2024-01-09 | IMPERIAL, COACHELLA VALLEYS CA, CENTRAL AND WESTERN AZ, MEXICO CROSSINGS THROUGH CALEXICO AND SAN LUIS | BALTIMORE | 2600 | 2.615385 | 92.64706 |
| 2024-01-09 | IMPERIAL, COACHELLA VALLEYS CA, CENTRAL AND WESTERN AZ, MEXICO CROSSINGS THROUGH CALEXICO AND SAN LUIS | BOSTON | 2900 | 2.534483 | 94.89796 |
| 2024-01-09 | IMPERIAL, COACHELLA VALLEYS CA, CENTRAL AND WESTERN AZ, MEXICO CROSSINGS THROUGH CALEXICO AND SAN LUIS | CHICAGO | 2030 | 2.192118 | 112.75281 |
| 2024-01-09 | IMPERIAL, COACHELLA VALLEYS CA, CENTRAL AND WESTERN AZ, MEXICO CROSSINGS THROUGH CALEXICO AND SAN LUIS | DALLAS | 1300 | 2.692308 | 96.42857 |
The following visualization isolates loads based on their pricing variance vs. their profitability. The Red Zone (Upper Right) is our primary concern: these are loads where we charged more than the city average, yet we still lost money. This suggests that high-cost factors—such are outstripping our pricing premiums. In these cases, the “win” on price is being negated by operational leakage.
# Generate the Leakage Visual
ggplot(FFE_clean, aes(x = `City_Specific_Variance_Percentage`, y = Operating_Ratio, color = Audit_Status)) +
geom_point(alpha = 0.5, size = 2.5) +
# Crosshair lines for benchmarks
geom_hline(yintercept = 100, linetype = "dashed", color = "black", size = 0.8) +
geom_vline(xintercept = 0, linetype = "dashed", color = "blue", size = 0.8) +
# The "Leakage" Zone (Upper Right Quadrant)
annotate("rect", xmin = 0, xmax = Inf, ymin = 100, ymax = Inf,
fill = "red", alpha = 0.2) +
# Audit Colors
scale_color_manual(values = c("System Leakage (False Win)" = "red",
"Known Loss (Below Target)" = "orange",
"Profitable/Healthy" = "darkgreen")) +
# Formatting X-axis as %
scale_x_continuous(labels = scales::percent) +
labs(title = "FFE Audit: City-Specific Margin Leakage",
subtitle = "Red Zone isolates loads priced above city average with an Operating Ratio > 100",
x = "City Specific Variance (%)",
y = "Operating Ratio (%)",
caption = "Dataset: FFE | Filtered for Variance < 10 | Target = City-Specific Average RPM") +
theme_minimal() +
theme(legend.position = "bottom",
plot.title = element_text(face = "bold", size = 14),
legend.title = element_blank())
System_Leakage_Results <- FFE_clean %>%
filter(Audit_Status == "System Leakage (False Win)")
System_Leakage_Results %>%
head(11) %>%
select(Date, Origin, Destination, Distance, Operating_Ratio, City_Specific_Variance_Percentage) %>%
kable(
caption = "Top 11 System Leakage Loads (Chicago Lane)",
col.names = c("Date", "Origin", "Destination", "Distance (Miles)", "Operating Ratio (%)", "Pricing Variance (%)")
)
| Date | Origin | Destination | Distance (Miles) | Operating Ratio (%) | Pricing Variance (%) |
|---|---|---|---|---|---|
| 2024-01-16 | IMPERIAL, COACHELLA VALLEYS CA, CENTRAL AND WESTERN AZ, MEXICO CROSSINGS THROUGH CALEXICO AND SAN LUIS | CHICAGO | 2030 | 102.3980 | 0.0096005 |
| 2024-01-16 | KERN DISTRICT CALIFORNIA | CHICAGO | 2030 | 100.3500 | 0.0302046 |
| 2024-01-16 | OXNARD DISTRICT CALIFORNIA | CHICAGO | 2070 | 102.1500 | 0.0102972 |
| 2024-03-12 | IMPERIAL, COACHELLA VALLEYS CA, CENTRAL AND WESTERN AZ, MEXICO CROSSINGS THROUGH CALEXICO AND SAN LUIS | CHICAGO | 2030 | 102.3980 | 0.0096005 |
| 2024-03-12 | KERN DISTRICT CALIFORNIA | CHICAGO | 2030 | 101.3636 | 0.0199025 |
| 2024-03-19 | IMPERIAL, COACHELLA VALLEYS CA, CENTRAL AND WESTERN AZ, MEXICO CROSSINGS THROUGH CALEXICO AND SAN LUIS | CHICAGO | 2030 | 102.3980 | 0.0096005 |
| 2024-03-19 | KERN DISTRICT CALIFORNIA | CHICAGO | 2030 | 101.3636 | 0.0199025 |
| 2024-08-13 | SAN JOAQUIN VALLEY CALIFORNIA | CHICAGO | 2200 | 100.0000 | 0.0266457 |
| 2024-08-20 | SAN JOAQUIN VALLEY CALIFORNIA | CHICAGO | 2200 | 100.0000 | 0.0266457 |
| 2024-08-27 | SAN JOAQUIN VALLEY CALIFORNIA | CHICAGO | 2200 | 100.0000 | 0.0266457 |
| 2024-09-03 | SAN JOAQUIN VALLEY CALIFORNIA | CHICAGO | 2200 | 100.0000 | 0.0266457 |
The data reveals a clear “Economy of Scale” for our long-haul business. For trips over 500 miles, the trend is undeniable: as the distance of a load increases, our profit margin improves.
By targeting long-haul shipments, we spread our fixed operational costs over a wider revenue base, which structurally lowers our Operating Ratio. This makes every additional mile driven beyond the 1,000-mile mark a direct contribution to our net profitability.
# Core OTR Insight: Showing how longer distances drive the Operating Ratio down.
# Filtering out the short-haul regional outliers that begin in Oxnard (Distance < 500).
ggplot(FFE %>% filter(Distance > 500),
aes(x = Distance, y = Operating_Ratio)) +
geom_point(alpha = 0.3, color = '#2c3e50') +
# Regression line showing the efficiency trend (Negative Slope)
geom_smooth(method = "lm", color = "#e74c3c", size = 1.5, se = TRUE) +
# The "Break-even" Threshold
geom_hline(yintercept = 100, linetype = "dashed", color = "black", alpha = 0.6) +
labs(title = "Distance vs. Efficiency: The Core Trend",
subtitle = "Once regional outliers are removed, OTR efficiency improves over distance.",
x = "Trip Distance (Miles)",
y = "Operating Ratio (%)",
caption = "Regression Slope is Negative | Long-haul loads are more efficient than mid-haul.") +
theme_minimal() +
theme(plot.title = element_text(face = "bold", size = 18))
This boxplot compares the Rate Per Mile (RPM) across our major long-haul destinations. It reveals a clear pricing hierarchy: Miami and Dallas consistently command the highest premiums, while Boston sits at the bottom of the OTR scale.
The wide “boxes” in cities like Dallas show we have more room to negotiate higher spot rates, whereas the tight distribution in Seattle suggests we are stuck at a fixed market ceiling.
# --- Pricing Heat Check: Boxplots of Rates by City ---
# Filter for long-haul OTR (Distance > 500)
# Sort destinations by median Rate Per Mile
city_order <- FFE %>%
filter(Distance > 500) %>%
group_by(Destination) %>%
summarise(median_rpm = median(`Rate Per Mile`)) %>%
arrange(desc(median_rpm)) %>%
pull(Destination)
ggplot(FFE %>% filter(Distance > 500),
aes(x = `Rate Per Mile`, y = factor(Destination, levels = city_order), fill = Destination)) +
geom_boxplot(alpha = 0.7, outlier.shape = NA) +
scale_fill_viridis_d(option = "mako") +
labs(title = "Pricing Heat Check: Rate Per Mile by Destination",
subtitle = "Comparing distribution and median rates across major OTR lanes.",
x = "Rate Per Mile ($)",
y = "Destination Market") +
theme_minimal() +
theme(legend.position = "none", plot.title = element_text(face = "bold", size = 16))
This chart breaks down exactly what it costs to run our trucks each month. By adding up the Linehaul (miles) and Pickup and Delivery (terminal fees), we can see exactly when we’re burning the most cash. If our contract rates are locked in, these high-spend months are the “danger zones” where we might be working hard just to break even.
# 1. Clean and Calculate
FFE_monthly <- FFE %>%
mutate(
Month_Num = as.numeric(format(Date, "%m")),
# Convert number to full name (January, February, etc.)
Month_Name = factor(month.name[Month_Num], levels = month.name),
# Linehaul + Pickup and Delivery (P&D)
Linehaul_Cost = Distance * 2.25,
PD_Cost = 250 + (Terminal_Touches * 75),
Total_Cost = Linehaul_Cost + PD_Cost
) %>%
group_by(Month_Name) %>%
summarise(Total_Operating_Cost = sum(Total_Cost, na.rm = TRUE))
# 2. Plot
ggplot(FFE_monthly, aes(x = Month_Name, y = Total_Operating_Cost)) +
geom_bar(stat = "identity", fill = "#2980b9") +
geom_text(aes(label = scales::dollar(Total_Operating_Cost)),
vjust = -0.5, size = 3, fontface = "bold") +
labs(title = "Monthly Expenditure Audit: Total Operating Cost",
subtitle = "Total spend using Linehaul costs and Pickup and Delivery costs.",
x = "Month",
y = "Total Expenditure ($)") +
scale_y_continuous(labels = scales::dollar, expand = expansion(mult = c(0, 0.15))) +
theme_minimal() +
theme(plot.title = element_text(face = "bold", size = 16),
axis.text.x = element_text(angle = 45, hjust = 1))
The data confirms that volume does not guarantee profit. While long-haul routes show structural efficiency, the “Chicago Leakage” proves that even high-rate loads can result in a net loss.
This audit underscores the importance of a high-density strategy. By prioritizing heavy freight and identifying “False Win” loads, FFE ensures that operational costs don’t outpace revenue.
Maintaining this margin-first focus is the only way to turn high-burn months into consistent bottom-line wins.