1. Executive Summary

This report provides an in-depth analysis of TxDOT bid tab data, focused on evaluating vendor competitiveness and predicting pricing. It includes exploratory data analysis, win rate comparison, predictive modeling, spatial visualizations, and strategic insights to guide pricing and bid strategy.


2. Load Libraries and Dataset

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.0     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
## Linking to GEOS 3.11.2, GDAL 3.8.2, PROJ 9.3.1; sf_use_s2() is TRUE
## 
## To enable caching of data, set `options(tigris_use_cache = TRUE)`
## in your R script or .Rprofile.
## 
## Loading required package: lattice
## 
## 
## Attaching package: 'caret'
## 
## 
## The following object is masked from 'package:purrr':
## 
##     lift
## 
## 
## corrplot 0.94 loaded
## 
## randomForest 4.7-1.2
## 
## Type rfNews() to see new features/changes/bug fixes.
## 
## 
## Attaching package: 'randomForest'
## 
## 
## The following object is masked from 'package:dplyr':
## 
##     combine
## 
## 
## The following object is masked from 'package:ggplot2':
## 
##     margin
## 
## 
## Rows: 378 Columns: 15
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (11): DISTRICT, CCSJ #, COUNTY, COORDINATES, PROJECT LET DATE, PRICE PER...
## dbl  (4): MMM MILES, VULCAN MILES, TONS, CONTRACT DAYS
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

3. Exploratory Data Analysis (EDA)

##  PRICE PER TON   BID TOTAL AMOUNT  % OVER / UNDER          TONS      
##  Min.   : 80.0   Min.   :  45448   Min.   :-20.3200   Min.   :  325  
##  1st Qu.:110.9   1st Qu.: 267128   1st Qu.: -2.7375   1st Qu.: 2025  
##  Median :127.8   Median : 315867   Median :  0.0000   Median : 2690  
##  Mean   :129.5   Mean   : 501055   Mean   :  0.2854   Mean   : 3941  
##  3rd Qu.:144.7   3rd Qu.: 614485   3rd Qu.:  1.6475   3rd Qu.: 5000  
##  Max.   :182.8   Max.   :3714805   Max.   : 37.5300   Max.   :21000  
##  CONTRACT DAYS
##  Min.   :120  
##  1st Qu.:180  
##  Median :180  
##  Mean   :214  
##  3rd Qu.:183  
##  Max.   :365

Price per ton ranges widely; moderate correlations found with tonnage and % over/under estimates


4. Additional EDA: Tonnage and Distance Distributions

Summary: Most contracts are small to mid-sized in tonnage, with haul distances clustering below 300 miles


5. Predictive Modeling: Linear Regression and Random Forest

## 
## Call:
## lm(formula = price_per_ton ~ tons + contract_days + pct_over_under, 
##     data = model_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -52.908 -14.783  -2.119  14.926  67.510 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     1.544e+02  3.147e+00  49.073  < 2e-16 ***
## tons           -5.569e-04  3.415e-04  -1.631  0.10375    
## contract_days  -1.070e-01  1.329e-02  -8.048 1.13e-14 ***
## pct_over_under  4.616e-01  1.647e-01   2.803  0.00533 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 22.22 on 374 degrees of freedom
## Multiple R-squared:  0.1802, Adjusted R-squared:  0.1736 
## F-statistic:  27.4 on 3 and 374 DF,  p-value: 4.882e-16
## 
## Call:
##  randomForest(formula = price_per_ton ~ tons + contract_days +      pct_over_under, data = model_data, importance = TRUE) 
##                Type of random forest: regression
##                      Number of trees: 500
## No. of variables tried at each split: 1
## 
##           Mean of squared residuals: 164.8794
##                     % Var explained: 72.34

## [1] 10.24695
## [1] 0.8507934

Random forest model identified tonnage and estimate deviation as key price drivers with good predictive performance (high R²). The random forest model achieved an R² of 0.72, indicating that the predictors explain 72% of the variation in price per ton.


6. Summary by Vendor

## # A tibble: 2 × 5
##   `VENDOR NAME`                      Count Avg_Price Avg_Tons Avg_Contract_Days
##   <chr>                              <int>     <dbl>    <dbl>             <dbl>
## 1 MARTIN MARIETTA MATERIALS, INC.       54      136.    3092.              196.
## 2 VULCAN CONSTRUCTION MATERIALS, LLC    72      121.    4577.              228.

Martin’s contracts tend to be larger, but Vulcan shows tighter pricing.


7. Vendor Competitiveness Analysis

Win rates vary significantly by county, with each vendor showing regional advantages.


8. Map: Vendor Win Rates by County

Vulcan dominates southern/coastal counties; Martin Marietta leads inland.


9. Geographic Pricing Patterns

Awarded prices are higher in remote or lower-density counties, consistent with increased logistics costs.


10. Weighted Average Price Per Ton by County

Tonnage-weighted prices highlight true economic load, emphasizing cost differentials in higher-volume counties.


11. Competitor Pricing Strategy by Distance and Tonnage

Martin Marietta and Vulcan tend to price lower on larger contracts and shorter distances; pricing rises with complexity.


12. Pricing Trend Over Time

Prices have shown gradual increases, suggesting inflationary or market trend effects over time.


14. Strategic Actions to Improve Martin Marietta’s Bid Success

This section outlines data-driven actions Martin Marietta can take to increase win rates and pricing competitiveness across Texas regions.

To improve bid success and pricing efficiency, Martin Marietta should consider the following actions based on observed trends:

1. Expand in Underserved Regions

  • Focus on southern and coastal counties where Vulcan consistently outperforms.
  • Prioritize high-volume, low-win-rate counties where competitive pricing may enable market entry.

2. Leverage Predictive Pricing Tools

  • Use the random forest model to simulate pricing scenarios using tonnage, haul distance, and % over/under estimates.
  • Benchmark bids against model-based pricing expectations to remain competitive without underbidding.

3. Target High-Volume Contracts

  • Emphasize larger projects, which tend to have lower price-per-ton awards due to scale efficiencies.
  • Utilize volume-based discounting to gain a competitive edge on bundled or regional bids.

4. Refine Regional Pricing Strategy

  • Adjust bid pricing in remote counties where transportation costs significantly impact competitiveness.
  • Explore local sourcing or logistical partnerships to mitigate higher freight costs and reduce delivered price.

5. Improve Bid Calibration

  • Align pricing more closely with engineer estimates, as substantial deviations reduce win likelihood.
  • Track and adapt to market tightening in % Over/Under to stay within awardable bid ranges.