This case study examines the factors affecting procurement delivery performance at Petrohawk Centrum Limited, an oil and gas servicing company. The business problem is the occurrence of late deliveries, which can delay project execution, increase costs and reduce client satisfaction. The dataset contains 100 purchase order observations, including purchase order date, product/item category, vendor/supplier source, quantity, unit price, revenue/order value, planned delivery days and delivery status.
The analysis applies the five required Case Study 1 techniques, as specified in the assessment brief: Exploratory Data Analysis (EDA), Visualisation, Hypothesis Testing, Correlation Analysis and Regression. Initial review of the data shows that 72% of deliveries were on time while 28% were late. Late deliveries were more common among international suppliers, higher-value orders, larger quantities and categories such as packaging, textile/linen and medical equipment.
The study recommends a data-driven supplier performance review system, tighter monitoring of high-risk item categories and improved lead-time planning for international and high-value procurement orders.
Professional Disclosure
Job Title: Procurement Manager | Organisation: Petrohawk Centrum Limited | Sector: Oil and Gas Services, Lagos
EDA: Exploratory Data Analysis is operationally relevant because it helps me understand the basic structure and behaviour of Petrohawk’s procurement data before making decisions. Through EDA, I can identify the number of late and on-time deliveries, average planned delivery days, high-value orders, and possible data quality issues. In procurement, this is the first step in moving from vendor/supplier issues to actual evidence-based decision making.
Visualisation: Data visualisation is useful because procurement performance needs to be communicated clearly to management, project teams and sometimes clients. Charts showing late deliveries by vendor, item category, source, or planned delivery days make it easier to identify patterns quickly. Instead of presenting raw tables, visualisation helps management see which suppliers or categories require closer monitoring and where operational risk is concentrated.
Hypothesis Testing: Hypothesis testing is relevant because it allows me to move beyond observation and test whether differences in procurement performance are statistically meaningful. For example, I can test whether international suppliers have a significantly higher late-delivery rate than local suppliers, or whether planned delivery days differ between late and on-time orders. This supports stronger supplier evaluation and reduces reliance on guesswork.
Correlation Analysis: Correlation analysis helps me identify relationships between procurement variables such as quantity, unit price, order value, planned delivery days and delivery performance. In my work, this is useful for spotting whether larger or more expensive orders are associated with longer delivery timelines or higher delay risk. While correlation does not prove causation, it gives management early warning signals for procurement planning.
Logistic Regression: Logistic regression helps estimate how different procurement factors influence delivery outcomes when considered together. This is useful in building a more disciplined follow-up system where high-risk purchase orders are flagged early, before they become operational problems.
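As a rough illustration of the correlation step described above, the sketch below computes Spearman rank correlations between numeric procurement variables and a 0/1 late indicator. The data are simulated, the column names QUANTITY, UNIT_PRICE, ORDER_VALUE and LATE are assumptions, and the choice of Spearman over Pearson is also an assumption made because prices and order values tend to be skewed.

```r
# Simulated stand-in data; these are NOT Petrohawk records.
set.seed(42)
n <- 100
po <- data.frame(
  QUANTITY     = sample(1:50, n, replace = TRUE),
  UNIT_PRICE   = round(runif(n, 1000, 500000)),
  PLANNED_DAYS = sample(c(2, 3, 5, 21), n, replace = TRUE)
)
po$ORDER_VALUE <- po$QUANTITY * po$UNIT_PRICE

# Simulated outcome: the longest lead time is given a higher chance of lateness.
po$LATE <- rbinom(n, 1, ifelse(po$PLANNED_DAYS == 21, 0.6, 0.15))

# Spearman rank correlation matrix across all columns.
corr <- cor(po, method = "spearman")

# Correlation of each variable with the late indicator.
round(corr[, "LATE"], 2)
```

A coefficient near zero suggests little monotonic association with lateness; larger absolute values mark variables worth a closer look, without implying causation.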
Data Collection
Source: Internal procurement records, Petrohawk Centrum Limited, Lagos. Sample: 100 PO line items, 2021-2026. Variables: PO Date, Item Category, Supplier Source, Quantity, Unit Price, Revenue, Planned Delivery Days, Delivery Status. Ethics: No PII. Shared with organisational approval for academic use only.
Narrative: International suppliers have much higher late rates. Medical Equipment and Packaging are most problematic. The 21-day lead time almost always results in late delivery. Late orders carry higher financial values.
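Group-level late rates of the kind summarised above could be tabulated along the following lines. This is a minimal sketch on simulated data: `raw_demo` is a hypothetical stand-in for the report's data frame, and only the column names SOURCE and DELIVERY_STATUS are taken from the plotting code in this report.

```r
# Simulated stand-in for the procurement data; values are illustrative only.
set.seed(1)
raw_demo <- data.frame(
  SOURCE = sample(c("Local", "International"), 100, replace = TRUE),
  DELIVERY_STATUS = sample(c("On-time", "Late"), 100, replace = TRUE,
                           prob = c(0.72, 0.28))
)

# Overall share of late and on-time deliveries.
overall <- prop.table(table(raw_demo$DELIVERY_STATUS))

# Delivery status shares within each supplier source (each row sums to 1).
by_source <- prop.table(table(raw_demo$SOURCE, raw_demo$DELIVERY_STATUS),
                        margin = 1)
round(by_source, 2)

# Missing values per column, as a basic data-quality check.
colSums(is.na(raw_demo))
```

The same `prop.table(table(...), margin = 1)` pattern would apply to item category or planned delivery days in place of SOURCE.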
Hypothesis Testing
H1: Supplier Source vs Delivery Status
H0: No association between source and delivery status. H1: Source is significantly associated with delivery status. Test: Chi-squared.
Interpretation: The p-value (0.003) is below 0.05, so we reject H0. Supplier source is significantly associated with delivery status, with a moderate effect size (Cramér's V = 0.300). This supports a local-first sourcing policy.
Code
library(ggplot2)
# Proportion of on-time vs late deliveries within each supplier source
ggplot(raw, aes(x = SOURCE, fill = DELIVERY_STATUS)) +
  geom_bar(position = "fill", color = "white") +
  scale_fill_manual(values = c("On-time" = "#2d2d6b", "Late" = "#d42b2b")) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "H1: Delivery Status by Supplier Source",
       subtitle = "Chi-squared p = 0.003 | Cramér's V = 0.300 | Significant at 5% level",
       x = "Supplier Source", y = "Proportion", fill = "Status") +
  theme_minimal(base_size = 13) +
  theme(plot.title = element_text(color = "#2d2d6b", face = "bold"),
        plot.subtitle = element_text(color = "#d42b2b"))
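The chi-squared statistic and Cramér's V quoted in the plot subtitle could be obtained roughly as follows. The 2x2 counts here are hypothetical and chosen only to make the arithmetic easy to follow; they are not the Petrohawk counts. Turning off the continuity correction (`correct = FALSE`) is also an assumption, since the report does not say which variant was used.

```r
# Hypothetical contingency table for illustration only.
tab <- matrix(c(40, 10,
                20, 30),
              nrow = 2, byrow = TRUE,
              dimnames = list(SOURCE = c("Local", "International"),
                              DELIVERY_STATUS = c("On-time", "Late")))

# Pearson chi-squared test of independence, without continuity correction.
test <- chisq.test(tab, correct = FALSE)

# Cramér's V = sqrt(chi-squared / (n * (min(rows, cols) - 1))).
n <- sum(tab)
cramers_v <- sqrt(unname(test$statistic) / (n * (min(dim(tab)) - 1)))

test$p.value
cramers_v
```

With the real data, `table(raw$SOURCE, raw$DELIVERY_STATUS)` would take the place of the hand-typed matrix.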
H2: Planned Days by Source
H0: Mean planned days is equal for local and international suppliers. H1: International suppliers have significantly more planned days. Test: Welch t-test and Mann-Whitney U (Wilcoxon rank-sum) test.
Code
print(t.test(PLANNED_DAYS ~ SOURCE, data=raw))
print(wilcox.test(PLANNED_DAYS ~ SOURCE, data=raw))
Welch Two Sample t-test
data: PLANNED_DAYS by SOURCE
t = -54.7, df = 60, p-value < 2.2e-16
alternative hypothesis: true difference in means between group Local and group International is not equal to 0
95 percent confidence interval:
-17.77460 -16.52048
sample estimates:
mean in group Local mean in group International
3.852459 21.000000
Wilcoxon rank sum test with continuity correction
data: PLANNED_DAYS by SOURCE
W = 19.5, p-value < 2.2e-16
alternative hypothesis: true location shift is not equal to 0
Code
library(effectsize)
cohens_d(PLANNED_DAYS ~ SOURCE, data=raw)
Cohen's d | 95% CI
---------------------------
-8.95 | [-10.24, -7.63]
- Estimated using pooled SD.
Interpretation: Both tests give p < 0.001, so we reject H0. International orders require 21 planned days versus 2-5 days locally. Cohen's d (-8.95) confirms a very large effect.
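For readers who want to see what the pooled-SD version of Cohen's d measures, a hand-rolled sketch follows. The helper `cohens_d_pooled` is hypothetical and written only for illustration; the report itself uses `cohens_d()`, and the day values below are made up.

```r
# Cohen's d with a pooled standard deviation:
# d = (mean(x) - mean(y)) / sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
cohens_d_pooled <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Small check by hand: means 2 and 4, both variances 1, so pooled SD = 1.
cohens_d_pooled(c(1, 2, 3), c(3, 4, 5))  # -2

# Made-up planned-days values for two groups.
local_days <- c(2, 3, 5, 3, 5, 2)
intl_days  <- c(21, 21, 20, 22, 21, 21)
d <- cohens_d_pooled(local_days, intl_days)
d
```

Because d divides the mean gap by the pooled spread, groups with very little internal variation produce very large absolute values of d even for a fixed gap in days.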
Code
# Boxplot of planned delivery days by supplier source, with jittered raw points
ggplot(raw, aes(x = SOURCE, y = PLANNED_DAYS, fill = SOURCE)) +
  geom_boxplot(alpha = 0.8, outlier.color = "black") +
  geom_jitter(width = 0.1, alpha = 0.3, size = 1.5) +
  scale_fill_manual(values = c("Local" = "#2d2d6b", "International" = "#d42b2b")) +
  labs(title = "H2: Planned Delivery Days by Supplier Source",
       subtitle = "T-test p < 0.001 | Cohen's d = -8.95 | Very large effect",
       x = "Supplier Source", y = "Planned Delivery Days") +
  theme_minimal(base_size = 13) +
  theme(plot.title = element_text(color = "#2d2d6b", face = "bold"),
        plot.subtitle = element_text(color = "#d42b2b"),
        legend.position = "none")
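# Histogram of predicted late-delivery probabilities, split by actual outcome.
# Assumes `pred` holds the fitted probabilities from the logistic regression model.
ggplot(raw, aes(x = pred, fill = DELIVERY_STATUS)) +
  geom_histogram(bins = 25, position = "identity", alpha = 0.7, color = "white") +
  scale_fill_manual(values = c("On-time" = "#2d2d6b", "Late" = "#d42b2b")) +
  labs(title = "Predicted Probability of Late Delivery",
       x = "Predicted Probability", y = "Count", fill = "Actual") +
  theme_minimal(base_size = 13)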
Interpretation: International sourcing dramatically increases late delivery odds. Higher prices and longer lead times also increase risk significantly.
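A logistic regression of this kind, and the `pred` column used in the histogram, could be produced roughly as sketched below. Everything here is simulated: `demo`, the LATE indicator, the UNIT_PRICE column name and the coefficients used to generate the data are assumptions, and planned delivery days is left out of the sketch for brevity.

```r
# Simulated stand-in data; coefficients below are invented for the simulation.
set.seed(7)
n <- 200
demo <- data.frame(
  SOURCE = factor(sample(c("Local", "International"), n, replace = TRUE),
                  levels = c("Local", "International")),
  UNIT_PRICE = runif(n, 1000, 500000)
)

# Simulated truth: international sourcing and higher prices raise the log-odds of lateness.
log_odds <- -2 + 1.5 * (demo$SOURCE == "International") + 2e-6 * demo$UNIT_PRICE
demo$LATE <- rbinom(n, 1, plogis(log_odds))

# Binomial GLM with a logit link (logistic regression).
fit <- glm(LATE ~ SOURCE + UNIT_PRICE, data = demo, family = binomial)

# Predicted probability of late delivery for each order.
demo$pred <- predict(fit, type = "response")

# Exponentiated coefficients read as odds ratios.
exp(coef(fit))
```

An odds ratio above 1 for `SOURCEInternational` would correspond to the statement that international sourcing raises the odds of late delivery, holding unit price fixed.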
Integrated Findings
All five techniques point to the same conclusion: delivery risk is driven by international sourcing and long lead times.
Recommendation: (1) Enforce a 35-day minimum lead time for all international orders. (2) Adopt a local-first sourcing policy unless the price differential justifies international procurement.
Limitations
Only 100 records — expanding to 500+ would allow deeper analysis.
Actual delivery dates not recorded — delay in days cannot be modelled.
No stable vendor IDs — vendor-level scoring is not possible.
Confounding factors such as customs delays not captured.
Recommendations
Based on the five analytical techniques applied in this study, the following actionable recommendations are proposed for the procurement function at Petrohawk Centrum Limited:
Adopt a Local-First Sourcing Policy: The chi-squared test and logistic regression both confirmed that international sourcing is the single strongest predictor of late delivery. Where technically feasible, procurement should prioritise local suppliers. A formal local-first sourcing policy should be implemented, requiring written justification before any order is placed internationally.
Enforce a Minimum 35-Day Lead Time for International Orders: The hypothesis testing confirmed that international orders carry a structural 21-day planned lead time. However, the high late delivery rate among these orders suggests this is insufficient. All international purchase orders should carry a minimum 35-day lead time from PO date to delivery requirement date, building a 14-day buffer to absorb customs delays, port congestion, and shipping disruptions.
Implement a High-Value Order Expediting Protocol: Correlation analysis showed that unit price is the second strongest predictor of lateness after planned delivery days. High-value orders (particularly medical equipment, ultrasound machines, and specialist drilling items) must be flagged at PO creation for active vendor follow-up. A weekly expediting call with suppliers on orders above NGN 5 million is recommended.
Develop a Procurement Risk Scoring System: The logistic regression model achieved strong predictive accuracy using just three variables: supplier source, unit price, and planned delivery days. This model should be operationalised as a risk scoring tool embedded in the ERP system. Every new PO should be automatically scored for late delivery probability at the point of creation, allowing the procurement team to intervene before delays occur.
Summary: The overarching recommendation is a dual-track procurement risk protocol: a local-first sourcing policy for routine items, and a structured lead-time buffer with active expediting for high-value international orders. These two actions directly address the root causes identified across all five analytical techniques.
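The two rule-based recommendations above (the 35-day minimum lead time for international orders and weekly expediting above NGN 5 million) could be checked at PO creation with something like the sketch below. The function `flag_po` and its argument names are hypothetical, and this is a simple rule check rather than the logistic regression score; only the 35-day and NGN 5 million thresholds come from the recommendations.

```r
# Returns the reasons a purchase order needs attention (empty if none apply).
flag_po <- function(source, lead_time_days, order_value_ngn,
                    min_intl_lead = 35, expedite_threshold = 5e6) {
  reasons <- character(0)
  # Rule 1: international orders need at least the minimum lead time.
  if (source == "International" && lead_time_days < min_intl_lead) {
    reasons <- c(reasons, "international lead time below minimum")
  }
  # Rule 2: orders above the value threshold get weekly expediting.
  if (order_value_ngn > expedite_threshold) {
    reasons <- c(reasons, "high-value order: weekly expediting")
  }
  reasons
}

# An international order with a 21-day lead time and NGN 8 million value trips both rules.
flag_po("International", 21, 8e6)

# A small local order trips neither rule.
flag_po("Local", 5, 2e5)
```

Keeping the thresholds as arguments would let the procurement team tighten or relax them without rewriting the check.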
GitHub Repository
The complete source code and dataset for this analysis are publicly available on GitHub: