What factors cause delays in cybersecurity audit projects?
Author
Bisola Oladejo
Published
May 19, 2026
Executive Summary
Project delays are the most common issue we face in our cybersecurity advisory practice, with 72% of engagements running late. This paper was designed to understand what actually causes these delays and how the advisory team can avoid them.
Data on 100 completed assessment projects (PCI DSS, VAPT, ISO 27001, SWIFT, IMS, and forensics) was gathered through a corporate survey of consultants. Five analytical techniques are used: exploratory data analysis (EDA), data visualisation, hypothesis testing, correlation analysis, and logistic regression.
The logistic regression pointed to two behavioural predictors of delay. Security maturity (OR = 0.57, p = 0.047) was significant at the 5% level, while client responsiveness (OR = 0.58, p = 0.075) was only marginally significant. Each was associated with more than 40% lower odds of delay per one-point increase. The t-test separately confirmed a significant difference in responsiveness between on-time and delayed projects (p = 0.018, d = 0.545).
The main recommendation is to assess every client's security maturity before the engagement starts and to track client responsiveness after each meeting. Both interventions are low-cost, directly address the causes of delay that the data reveal, and can be applied immediately.
Professional Disclosure
Job Title and Organisation
Senior Manager | Head Of Department (Cybersecurity & Compliance Advisory) at Digital Encode Limited
I work in the cybersecurity consulting and compliance advisory sector. I lead and support cybersecurity compliance assessment projects for clients in the banking, fintech, telecoms and enterprise sectors in Nigeria. My work includes the coordination of PCI DSS, SWIFT CSP, ISO 27001, vulnerability assessment and other related cybersecurity projects.
This study addresses a practical problem: identifying the key variables that influence project duration, project delay, and security risk profile in cybersecurity compliance assessments, so that engagements can be planned better. The results will support planning, resourcing, risk management and client engagement.
Technique Justification
Each of the five analytical techniques applied in this case study was chosen because it addresses a specific operational question I face in my daily work.
Exploratory Data Analysis (EDA): Before any engagement begins, I need to understand the landscape. EDA allows me to examine and identify data quality issues in our clients' environments, policies and procedures. In operational terms, EDA is the equivalent of scoping a new engagement: you cannot assess or fix what you have not measured.
Data Visualisation: I regularly present project performance summaries to departmental leadership and to clients. Visualisation transforms raw project data into patterns that non-technical stakeholders can immediately grasp. The plots in this analysis form a narrative I could present directly to a management committee considering project decisions.
Hypothesis Testing: In advisory work, we frequently make assumptions: “remote projects experience more delays,” or “banks perform better than fintechs.” Hypothesis testing replaces professional guesses with evidence. By formulating null and alternative hypotheses, checking assumptions, and reporting the results, I can determine which perceived patterns are statistically real and which are noise. This is directly applicable when reviewing project delivery policies, for instance when deciding whether delivery mode genuinely affects project timelines.
Correlation Analysis: Understanding which project variables move together helps me advise clients on where to focus their preparation efforts. If security maturity correlates with compliance score, then pre-engagement maturity assessments become a defensible recommendation, not just an opinion. The correlation matrix also identifies redundancy — variables that measure the same underlying construct — which prevents duplicate effort in data collection and client reporting.
Logistic Regression: The most operationally valuable question I can answer is: “If we improve client readiness by x, how much does that reduce the risk of a project running late?” Logistic regression quantifies the change in the odds of a delay for each unit improvement in a predictor, holding other factors constant. That turns a generic suggestion into a testable business case.
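To make the odds-ratio reading concrete, the short sketch below converts an odds ratio into a percentage change in the odds of delay, using the two odds ratios reported in the Executive Summary. The helper name `odds_change_pct` is illustrative and is not part of the original analysis.

```python
import math

def odds_change_pct(odds_ratio):
    """Percentage change in the odds for a one-unit increase in the predictor."""
    return (odds_ratio - 1.0) * 100.0

# Odds ratios as reported in the Executive Summary
reported = {"Client responsiveness": 0.58, "Security maturity": 0.57}

for name, or_value in reported.items():
    # A logistic regression coefficient is the natural log of the odds ratio
    coef = math.log(or_value)
    print(f"{name}: OR = {or_value:.2f}, log-odds coefficient = {coef:.3f}, "
          f"change in odds = {odds_change_pct(or_value):.0f}% per point")
```

Note that these figures describe a change in odds, not in probability: a 42% drop in the odds of delay does not mean the probability of delay falls by 42 percentage points.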
Data Collection & Sampling
Data Provenance Notes
The dataset for this analysis was collected through an internal structured survey covering the company's historical cybersecurity compliance assessment projects from 2023 to 2026. The survey was approved by the chief project manager and the associate director, and was completed by the cybersecurity consultants and clients' project team members.
It captures project-level data such as project duration, type of assessment, team size, number of systems in scope, client responsiveness, security maturity, compliance score and number of vulnerabilities.
In order to maintain confidentiality and adhere to ethical considerations, all Personally Identifiable Information (PII), client-identifying information and sensitive operational information were removed or altered before analysis. The data set is used solely for academic purposes and will be part of the Executive MBA Data Analytics Capstone assessment at Lagos Business School.
Respondents to the survey were not obliged to participate. All individual responses will be anonymised and reported on as averages/summaries only. Part of the information gathered relates to sensitive client-specific data and business operational data, and this will not be published in the final report.
Business Question
I want to understand, and accurately predict, the factors that affect project duration (and therefore delays) and security risk in cybersecurity compliance assessment projects, using historical project and vulnerability assessment data. This information guides decisions on project pricing and scheduling, resource allocation, and risk management in the execution of cybersecurity projects.
Data Description and Exploratory Data Analysis (EDA)
The dataset used in this project is a collection of 100 past security assessment projects across several assessment types, including PCI DSS, ISO 27001, SWIFT CSP, vulnerability assessments, and forensic engagements. It contains a mix of structured categorical and numerical variables.
Categorical Variables
The dataset includes the following categorical attributes:
Client industry
Approximate client organisation size (Small, Medium, Large, Enterprise)
Project delivery mode
Type of cybersecurity assessment
Client identifier
Project delay status (binary outcome: On-time or Delayed)
These variables describe the context of each project and serve as the segmentation variables for the analysis.
Numerical Variables
The dataset also contains several continuous or ordinal numerical variables:
Client responsiveness level (1–5 scale)
Security maturity of client at project start (1–5 scale)
Overall compliance score (%)
Total duration of the project (days)
Total number of vulnerabilities identified
Number of critical/high vulnerabilities
Number of systems and/or applications in scope
Number of consultants assigned
Number of client meetings held during the engagement
Code
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from scipy.stats import (
    ttest_ind, chi2_contingency, f_oneway,
    mannwhitneyu, kruskal, pearsonr, spearmanr
)
import regex as re
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.stats.outliers_influence import variance_inflation_factor
from sklearn.metrics import classification_report, roc_curve, auc, confusion_matrix
import warnings

warnings.filterwarnings('ignore')
plt.style.use('seaborn-v0_8-whitegrid')
sns.set_palette('Set2')
plt.rcParams['figure.dpi'] = 100
plt.rcParams['font.size'] = 10

# Load the dataset
df = pd.read_csv('dataset.csv')
print(f"Columns: {df.columns.tolist()}")
print(f"Dataset shape: {df.shape[0]} rows × {df.shape[1]} columns")
print()
Columns: ['Timestamp', 'Type of cybersecurity assessment', 'Client identifier', 'Project delivery mode', 'Client industry', 'Approximate client organisation size', 'Years of Assessment/Project', ' Total number of consultants on the project', 'Number of systems and/or applications in scope', 'Approximate Number of meetings held with the client during the project', 'Client responsiveness level', ' Security maturity of client at project start', 'Total duration of the project (days)', 'Did the project experience delays?', 'Overall compliance score (%)', 'Total number of vulnerabilities identified', 'Number of critical/high vulnerabilities']
Dataset shape: 100 rows × 17 columns
Cleaning the data columns
The output of the previous cell shows whitespace in some column names, sensitive columns containing client and respondent names, and an empty column. I handle these issues as follows: first, strip whitespace from the column names so they can be referenced easily wherever they are needed, then drop the selected columns.
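The cell that performs this step is not shown, so the following is only a minimal sketch of how it could be done in pandas. The toy frame and every column name in it (including `Client name` and `Unnamed: 5`) are hypothetical stand-ins, not the actual survey columns.

```python
import pandas as pd

# Toy frame standing in for the survey export; all column names are hypothetical
raw = pd.DataFrame({
    " Total number of consultants on the project": [3, 2],
    "Client responsiveness level ": [3, 4],
    "Client name": ["Example Bank", "Example Fintech"],  # sensitive column
    "Unnamed: 5": [None, None],                          # entirely empty column
})

# 1. Strip leading and trailing whitespace from every column name
raw.columns = raw.columns.str.strip()

# 2. Drop the sensitive columns, then any column that is entirely empty
sensitive_cols = ["Client name"]
cleaned = raw.drop(columns=sensitive_cols).dropna(axis=1, how="all")

print(cleaned.columns.tolist())
```

Stripping the names first means the later `drop` call can use clean labels without worrying about stray spaces.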
Original shape: (100, 17)
Columns: ['Timestamp', 'Type of cybersecurity assessment', 'Client identifier', 'Project delivery mode', 'Client industry', 'Approximate client organisation size', 'Years of Assessment/Project', 'Total number of consultants on the project', 'Number of systems and/or applications in scope', 'Approximate Number of meetings held with the client during the project', 'Client responsiveness level', 'Security maturity of client at project start', 'Total duration of the project (days)', 'Did the project experience delays?', 'Overall compliance score (%)', 'Total number of vulnerabilities identified', 'Number of critical/high vulnerabilities']
<class 'pandas.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 17 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Timestamp 100 non-null str
1 Type of cybersecurity assessment 100 non-null str
2 Client identifier 100 non-null str
3 Project delivery mode 100 non-null str
4 Client industry 100 non-null str
5 Approximate client organisation size 100 non-null str
6 Years of Assessment/Project 100 non-null int64
7 Total number of consultants on the project 100 non-null int64
8 Number of systems and/or applications in scope 100 non-null str
9 Approximate Number of meetings held with the client during the project 100 non-null int64
10 Client responsiveness level 100 non-null int64
11 Security maturity of client at project start 100 non-null int64
12 Total duration of the project (days) 100 non-null str
13 Did the project experience delays? 100 non-null str
14 Overall compliance score (%) 99 non-null str
15 Total number of vulnerabilities identified 100 non-null str
16 Number of critical/high vulnerabilities 100 non-null int64
dtypes: int64(6), str(11)
memory usage: 13.4 KB
Handling missing values, checking for duplicates and standardizing numeric fields currently stored as strings
There is one missing value in the overall compliance score column, and several columns that should be numeric are stored as strings. The missing compliance score is filled with the column median to avoid dropping the row, which would reduce the quality of an already small dataset. The same cell checks for duplicate rows, and the mixed-format columns are converted in subsequent cells.
Code
# Change compliance values to float, so the missing value can be filled with the median value
def clean_compliance(val):
    if pd.isna(val):
        return np.nan
    val = str(val).strip().replace('%', '')
    try:
        return float(val)
    except ValueError:
        return np.nan

df['Overall compliance score (%)'] = df['Overall compliance score (%)'].apply(clean_compliance)
df['Overall compliance score (%)'] = df['Overall compliance score (%)'].fillna(
    df['Overall compliance score (%)'].median()
)
print(df.isna().sum())

# Check for duplicate rows
print(f"duplicate rows: {df.duplicated().sum()}")
Timestamp 0
Type of cybersecurity assessment 0
Client identifier 0
Project delivery mode 0
Client industry 0
Approximate client organisation size 0
Years of Assessment/Project 0
Total number of consultants on the project 0
Number of systems and/or applications in scope 0
Approximate Number of meetings held with the client during the project 0
Client responsiveness level 0
Security maturity of client at project start 0
Total duration of the project (days) 0
Did the project experience delays? 0
Overall compliance score (%) 0
Total number of vulnerabilities identified 0
Number of critical/high vulnerabilities 0
dtype: int64
duplicate rows: 0
Code
# Cleaning Total number of vulnerabilities identified and Number of critical/high vulnerabilities
# Problem: Mixed formats:
#   - Ranges: "50 to 100", "0 to 50"
#   - Text: "65 - 70", "52 vulnerabilities"
#   - Comparators: ">150", ">200"
#   - Words: "Nil", "None", "Less than 20"
# Solution: Extract all numbers, take midpoint for ranges,
#           single number for exact values, preserve >N as N
def clean_vuln_range(val):
    if pd.isna(val):
        return np.nan
    val = str(val).strip().lower()
    if val in ['nil', 'none', '0', '']:
        return 0.0
    if val.startswith('>'):
        num = ''.join(c for c in val if c.isdigit())
        return float(num) if num else np.nan
    if 'less than' in val:
        # "Less than N" is approximated as half of N
        num = ''.join(c for c in val if c.isdigit())
        return float(num) * 0.5 if num else np.nan
    numbers = re.findall(r'\d+', val)
    if len(numbers) >= 2:
        return np.mean([float(n) for n in numbers[:2]])
    elif len(numbers) == 1:
        return float(numbers[0])
    else:
        return np.nan

df['Total number of vulnerabilities identified'] = df['Total number of vulnerabilities identified'].apply(clean_vuln_range)
df['Number of critical/high vulnerabilities'] = df['Number of critical/high vulnerabilities'].apply(clean_vuln_range)
print(df[['Total number of vulnerabilities identified', 'Number of critical/high vulnerabilities']].head())
Total number of vulnerabilities identified \
0 75.0
1 25.0
2 75.0
3 25.0
4 25.0
Number of critical/high vulnerabilities
0 4.0
1 5.0
2 3.0
3 20.0
4 2.0
Code
# CLEAN PROJECT DURATION
# Problem: Column contains mixed formats:
#   - Plain numbers: "90", "14"
#   - With text: "100 working days", "3 months", "15 days"
# Solution: Extract numeric value, handle units
# Total duration of the project (days)
import pandas as pd
import re

def simple_duration_fix(val):
    """
    If the string contains 'month' → extract first number and multiply by 30.
    If it contains 'day' → extract first number and use it directly.
    Otherwise, return the value unchanged.
    """
    if pd.isna(val):
        return val
    val = str(val).strip().lower()
    # Only act if 'month' or 'day' is present
    if 'month' in val:
        nums = re.findall(r'\d+', val)
        if nums:
            return min(float(nums[0]) * 30, 2000)  # cap at 2000 days
    elif 'day' in val:
        nums = re.findall(r'\d+', val)
        if nums:
            return min(float(nums[0]), 2000)
    # For everything else (plain numbers, ranges, ">200"), return as-is
    return val

df['Total duration of the project (days)'] = df['Total duration of the project (days)'].apply(simple_duration_fix)
df['Total duration of the project (days)'] = pd.to_numeric(df['Total duration of the project (days)'], errors='coerce')
Cleaned: Approximate client organisation size
Approximate client organisation size
Small 34
Medium 26
Enterprise 24
Large 16
Name: count, dtype: int64
First five rows of the dataset (columns shown as rows):
Column | Row 0 | Row 1 | Row 2 | Row 3 | Row 4
Timestamp | 05/10/2026 11:53 | 05/10/2026 11:56 | 05/10/2026 12:08 | 05/10/2026 12:25 | 05/10/2026 12:32
Type of cybersecurity assessment | PCI DSS | PCI DSS | PCI DSS | ISO 27001 | PCI DSS
Client identifier | CLIENT 001 | CLIENT 002 | CLIENT 003 | CLIENT 004 | CLIENT 005
Project delivery mode | Fully Remote | Fully Remote | Hybrid | Hybrid | Fully Remote
Client industry | Banking / Financial Services | Banking / Financial Services | Banking / Financial Services | Banking / Financial Services | Banking / Financial Services
Approximate client organisation size | Large | Large | Enterprise | Large | Small
Years of Assessment/Project | 2024 | 2025 | 2025 | 2023 | 2025
Total number of consultants on the project | 3 | 3 | 2 | 2 | 2
Number of systems and/or applications in scope | 0 to 50 | 0 to 50 | 100 to 150 | 50 to 100 | 0 to 50
Approximate Number of meetings held with the client during the project | 10 | 15 | 10 | 80 | 15
Client responsiveness level | 3 | 3 | 2 | 3 | 3
Security maturity of client at project start | 4 | 3 | 4 | 3 | 2
Total duration of the project (days) | 90.0 | 150.0 | 210.0 | 100.0 | 140.0
Did the project experience delays? | Yes | Yes | Yes | Yes | Yes
Overall compliance score (%) | 70.0 | 80.0 | 80.0 | 85.0 | 95.0
Total number of vulnerabilities identified | 75.0 | 25.0 | 75.0 | 25.0 | 25.0
Number of critical/high vulnerabilities | 4.0 | 5.0 | 3.0 | 20.0 | 2.0
Code
import re

def clean_range(val):
    """Convert range strings like '0 to 50', '> 200' to numeric midpoint."""
    if pd.isna(val):
        return np.nan
    val = str(val).strip().lower()
    if val in ['nil', 'none', '', '0']:
        return 0.0
    if val.startswith('>'):
        num = ''.join(c for c in val if c.isdigit())
        return float(num) if num else np.nan
    if 'less than' in val:
        num = ''.join(c for c in val if c.isdigit())
        return float(num) * 0.5 if num else np.nan
    # Extract all numbers
    numbers = re.findall(r'\d+', val)
    if len(numbers) >= 2:
        return np.mean([float(n) for n in numbers[:2]])
    elif len(numbers) == 1:
        return float(numbers[0])
    else:
        return np.nan

# Apply the cleaner
df['Number of systems and/or applications in scope'] = df['Number of systems and/or applications in scope'].apply(clean_range)

# Verify
print(f"After cleaning: {df['Number of systems and/or applications in scope'].dropna().shape[0]} valid rows")
print(f"Sample values: {df['Number of systems and/or applications in scope'].dropna().head(10).tolist()}")

# Standardise formats
df['Type of cybersecurity assessment'] = df['Type of cybersecurity assessment'].str.strip().str.upper()
df['Project delivery mode'] = df['Project delivery mode'].str.strip().str.title()
df['Client identifier'] = df['Client identifier'].str.strip().str.upper()

# Return Yes or No as 1 or 0
# strip whitespace, lowercase, then map
df['Did the project experience delays?'] = (
    df['Did the project experience delays?']
    .str.strip()
    .str.lower()
    .map({'yes': 1, 'no': 0})
)

# Numeric conversions
df['Client responsiveness level'] = pd.to_numeric(df['Client responsiveness level'], errors='coerce')
df['Security maturity of client at project start'] = pd.to_numeric(df['Security maturity of client at project start'], errors='coerce')
df['Years of Assessment/Project'] = pd.to_numeric(df['Years of Assessment/Project'], errors='coerce')
df.info()
<class 'pandas.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 17 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Timestamp 100 non-null str
1 Type of cybersecurity assessment 100 non-null str
2 Client identifier 100 non-null str
3 Project delivery mode 100 non-null str
4 Client industry 100 non-null str
5 Approximate client organisation size 100 non-null str
6 Years of Assessment/Project 100 non-null int64
7 Total number of consultants on the project 100 non-null int64
8 Number of systems and/or applications in scope 100 non-null float64
9 Approximate Number of meetings held with the client during the project 100 non-null int64
10 Client responsiveness level 100 non-null int64
11 Security maturity of client at project start 100 non-null int64
12 Total duration of the project (days) 100 non-null float64
13 Did the project experience delays? 100 non-null int64
14 Overall compliance score (%) 100 non-null float64
15 Total number of vulnerabilities identified 100 non-null float64
16 Number of critical/high vulnerabilities 100 non-null float64
dtypes: float64(5), int64(6), str(6)
memory usage: 13.4 KB
Code
# Detecting outliers
outlier_cols = [
    'Total duration of the project (days)',
    'Overall compliance score (%)',
    'Total number of vulnerabilities identified',
    'Number of critical/high vulnerabilities'
]

fig, axes = plt.subplots(2, 2, figsize=(8, 5))
for i, col in enumerate(outlier_cols):
    ax = axes[i // 2, i % 2]
    ax.boxplot(df[col].dropna(), vert=True, patch_artist=True,
               boxprops=dict(facecolor='steelblue', alpha=0.6))
    ax.set_title(col, fontsize=11, fontweight='bold')
    ax.set_ylabel('Value')
plt.tight_layout()
plt.show()

# Print IQR bounds and outlier counts
print("OUTLIER SUMMARY (IQR Method)")
print(f"{'Variable':<50}{'Lower':>8}{'Upper':>8}{'Outliers':>8}")
print("-" * 75)
for col in outlier_cols:
    data = df[col].dropna()
    Q1 = data.quantile(0.25)
    Q3 = data.quantile(0.75)
    IQR = Q3 - Q1
    lower = Q1 - 1.5 * IQR
    upper = Q3 + 1.5 * IQR
    n_out = len(data[(data < lower) | (data > upper)])
    print(f"{col:<50}{lower:>8.1f}{upper:>8.1f}{n_out:>8}")
OUTLIER SUMMARY (IQR Method)
Variable Lower Upper Outliers
---------------------------------------------------------------------------
Total duration of the project (days) -128.1 268.9 4
Overall compliance score (%) 22.5 130.5 0
Total number of vulnerabilities identified -50.0 150.0 0
Number of critical/high vulnerabilities -29.4 59.6 8
Code
outlier_cols = [
    'Total duration of the project (days)',
    'Overall compliance score (%)',
    'Total number of vulnerabilities identified',
    'Number of critical/high vulnerabilities'
]

# List the individual outlying projects for each variable
for col in outlier_cols:
    data = df[col].dropna()
    Q1 = data.quantile(0.25)
    Q3 = data.quantile(0.75)
    IQR = Q3 - Q1
    lower = Q1 - 1.5 * IQR
    upper = Q3 + 1.5 * IQR
    outliers = df[(df[col] < lower) | (df[col] > upper)]
    print(f"\nOutliers in '{col}' (IQR bounds: [{lower:.1f}, {upper:.1f}]):")
    if len(outliers) == 0:
        print(" None")
    else:
        for _, row in outliers.iterrows():
            print(f" Client {row['Client identifier']}: {row[col]}")
Once the outliers had been identified in the first cell, each one was examined to determine whether it represented a real business case or a data entry error. They proved to be real business cases and were kept in the data.
============================================================
DISTRIBUTION ANALYSIS — HISTOGRAMS WITH KEY STATISTICS
============================================================
# ============================================================
# EDA: INDUSTRY BREAKDOWN
# ============================================================
print("=" * 60)
print("INDUSTRY BREAKDOWN")
print("=" * 60)

industry_counts = df['Client industry'].value_counts()
industry_pct = (industry_counts / len(df) * 100).round(1)
print("\nIndustry Distribution:")
for ind in industry_counts.index:
    print(f" {ind:<30}: {industry_counts[ind]:3d} clients ({industry_pct[ind]:.1f}%)")

# Compliance score by industry
print("\nCompliance Score by Industry:")
ind_compliance = df.groupby('Client industry')['Overall compliance score (%)'].agg(['mean', 'median', 'std', 'count']).round(1)
print(ind_compliance)

# Delay rate by industry
print("\nProject Delay Rate by Industry:")
ind_delay = df.groupby('Client industry')['Did the project experience delays?'].mean().mul(100).round(1)
for ind, rate in ind_delay.sort_values(ascending=False).items():
    print(f" {ind:<30}: {rate:.1f}%")
============================================================
INDUSTRY BREAKDOWN
============================================================
Industry Distribution:
Banking / Financial Services : 62 clients (62.0%)
Other : 20 clients (20.0%)
Fintech : 15 clients (15.0%)
Telecommunications : 2 clients (2.0%)
E-commerce : 1 clients (1.0%)
Compliance Score by Industry:
mean median std count
Client industry
Banking / Financial Services 74.5 80.0 17.3 62
E-commerce 90.0 90.0 NaN 1
Fintech 85.5 88.0 11.7 15
Other 75.4 85.0 17.4 20
Telecommunications 95.0 95.0 7.1 2
Project Delay Rate by Industry:
E-commerce : 100.0%
Telecommunications : 100.0%
Other : 75.0%
Fintech : 73.3%
Banking / Financial Services : 69.4%
Code
# ============================================================
# EDA: DELIVERY MODE COMPARISON
# ============================================================
print("=" * 60)
print("DELIVERY MODE COMPARISON")
print("=" * 60)

mode_counts = df['Project delivery mode'].value_counts()
print("\nDelivery Mode Distribution:")
for mode in mode_counts.index:
    print(f" {mode:<20}: {mode_counts[mode]:3d} projects")

# Metrics by delivery mode
mode_stats = df.groupby('Project delivery mode').agg({
    'Overall compliance score (%)': ['mean', 'median'],
    'Total duration of the project (days)': ['mean', 'median'],
    'Did the project experience delays?': 'mean',
    'Client responsiveness level': 'mean'
}).round(2)
print("\nKey Metrics by Delivery Mode:")
print(mode_stats)
============================================================
DELIVERY MODE COMPARISON
============================================================
Delivery Mode Distribution:
Hybrid : 73 projects
Fully Remote : 24 projects
Fully Onsite : 3 projects
Key Metrics by Delivery Mode:
Overall compliance score (%) \
mean median
Project delivery mode
Fully Onsite 73.33 75.0
Fully Remote 73.00 75.0
Hybrid 78.32 83.0
Total duration of the project (days) \
mean median
Project delivery mode
Fully Onsite 8.00 5.0
Fully Remote 61.79 52.5
Hybrid 134.68 90.0
Did the project experience delays? \
mean
Project delivery mode
Fully Onsite 0.33
Fully Remote 0.75
Hybrid 0.73
Client responsiveness level
mean
Project delivery mode
Fully Onsite 3.33
Fully Remote 3.25
Hybrid 3.60
Delay Patterns
Bar charts give an exploratory look at how delays are distributed across assessment types and industries.
Code
fig, axes = plt.subplots(1, 2, figsize=(10, 5))

# Left: delay rate by assessment type
delay_by_type = df.groupby('Type of cybersecurity assessment')['Did the project experience delays?'].mean().mul(100)
order = delay_by_type.sort_values().index
axes[0].bar(order, delay_by_type[order], color='steelblue', edgecolor='white')
axes[0].set_title('Assessment Type')
axes[0].set_ylabel('Delay Rate (%)')
axes[0].set_ylim(0, 100)
axes[0].tick_params(axis='x', rotation=45)
for i, val in enumerate(delay_by_type[order]):
    axes[0].text(i, val + 2, f'{val:.0f}%', ha='center', fontsize=9)

# Right: delay rate by industry
delay_by_ind = df.groupby('Client industry')['Did the project experience delays?'].mean().mul(100)
order_ind = delay_by_ind.sort_values().index
axes[1].bar(order_ind, delay_by_ind[order_ind], color='coral', edgecolor='white')
axes[1].set_title('Industry')
axes[1].set_ylabel('Delay Rate (%)')
axes[1].set_ylim(0, 100)
axes[1].tick_params(axis='x', rotation=45)
for i, val in enumerate(delay_by_ind[order_ind]):
    axes[1].text(i, val + 2, f'{val:.0f}%', ha='center', fontsize=9)

plt.tight_layout()
plt.show()
PLOT INSIGHT:
The first plot indicates that delay rates are high across most assessment types, with SWIFT as the exception. The sample is small, however, so it is not sufficient to draw firm business conclusions.
The second plot shows that the banking and financial services sector has the lowest delay rate, at 69%. That is still high and needs to be addressed.
Code
# ============================================================
# PLOT: Project Delay Analysis by Multiple Factors
# Why: Identifies which project characteristics predict delays
# Story: Client maturity and delivery mode strongly influence
#        whether a project stays on schedule
# ============================================================
fig, axes = plt.subplots(1, 2, figsize=(8, 5))

# --- Subplot 1: Delay Rate by Org Size ---
delay_by_size = df.groupby('Approximate client organisation size')['Did the project experience delays?'].mean().mul(100)
size_order = ['Small', 'Medium', 'Large', 'Enterprise']
axes[0].bar(size_order, [delay_by_size.get(s, 0) for s in size_order],
            color=['seagreen', 'steelblue', 'coral', 'darkred'],
            edgecolor='white', linewidth=1.5)
axes[0].set_title('Project Delay Rate by Organisation Size', fontweight='bold', fontsize=12)
axes[0].set_ylabel('Delay Rate (%)')
axes[0].set_xlabel('Organisation Size')
axes[0].set_ylim(0, 100)
for i, s in enumerate(size_order):
    rate = delay_by_size.get(s, 0)
    axes[0].text(i, rate + 3, f'{rate:.0f}%', ha='center', fontweight='bold', fontsize=11)

# --- Subplot 2: Delay Rate by Delivery Mode ---
delay_by_mode = df.groupby('Project delivery mode')['Did the project experience delays?'].mean().mul(100)
bars = axes[1].bar(delay_by_mode.index, delay_by_mode.values,
                   color=['steelblue', 'seagreen', 'purple'],
                   edgecolor='white', linewidth=1.5)
axes[1].set_title('Project Delay Rate by Delivery Mode', fontweight='bold', fontsize=12)
axes[1].set_ylabel('Delay Rate (%)')
axes[1].set_xlabel('Delivery Mode')
axes[1].set_ylim(0, 100)
for bar, rate in zip(bars, delay_by_mode.values):
    axes[1].text(bar.get_x() + bar.get_width() / 2, bar.get_height() + 3,
                 f'{rate:.0f}%', ha='center', fontweight='bold', fontsize=11)

fig.suptitle('What Drives Project Delays?', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()
PLOT INSIGHT:
The first plot shows no clear relationship between organisation size and delay: medium and enterprise organisations, for example, have very similar delay rates.
The second plot shows that fully onsite projects have the lowest delay rate, while fully remote and hybrid projects are the most delay-prone. With only three onsite projects in the sample, this difference should be read with caution.
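One way to see how much weight the delivery-mode comparison can bear is to put a confidence interval around each delay rate. The sketch below uses the Wilson score interval, which is my choice for illustration and not a method used in the original analysis. The delayed-project counts are an assumption: they are back-calculated from the delay rates and project counts in the delivery-mode table (for example, 0.33 × 3 ≈ 1 onsite project).

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives roughly 95%)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# (delayed, total) per delivery mode; delayed counts are inferred, not reported
counts = {"Fully Onsite": (1, 3), "Fully Remote": (18, 24), "Hybrid": (53, 73)}

for mode, (delayed, total) in counts.items():
    lo, hi = wilson_interval(delayed, total)
    print(f"{mode:<13}: {delayed}/{total} delayed, "
          f"95% interval {lo * 100:.0f}% to {hi * 100:.0f}%")
```

The onsite interval is very wide because it rests on three projects, so it overlaps heavily with the other two modes.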
Code
# ============================================================
# PLOT: Duration vs Compliance: Does Longer Mean Better?
# Why: Tests whether extended projects improve outcomes
# Story: Delayed projects perform worse despite (or because of)
#        longer durations; quality, not time, drives compliance
# ============================================================
fig, ax = plt.subplots(figsize=(8, 5))

# Split by delay status
on_time = df[df['Did the project experience delays?'] == 0]
delayed = df[df['Did the project experience delays?'] == 1]

# On-time projects
ax.scatter(on_time['Total duration of the project (days)'], on_time['Overall compliance score (%)'],
           c='seagreen', label='On Time', s=100, alpha=0.7,
           edgecolors='black', linewidth=0.5, marker='o')

# Delayed projects
ax.scatter(delayed['Total duration of the project (days)'], delayed['Overall compliance score (%)'],
           c='coral', label='Delayed', s=100, alpha=0.7,
           edgecolors='black', linewidth=0.5, marker='^')

# Trend lines
from numpy.polynomial.polynomial import polyfit
for subset, color, label in [(on_time, 'darkgreen', 'On Time Trend'),
                             (delayed, 'darkred', 'Delayed Trend')]:
    if len(subset) > 2:
        x = subset['Total duration of the project (days)']
        y = subset['Overall compliance score (%)']
        b, m = polyfit(x, y, 1)  # coefficients are returned lowest degree first
        x_range = np.linspace(x.min(), x.max(), 100)
        ax.plot(x_range, b + m * x_range, color=color, linewidth=2, linestyle='--', label=label)

ax.set_title('Project Duration vs Compliance Score by Delay Status', fontsize=14, fontweight='bold')
ax.set_xlabel('Project Duration (Days)', fontsize=11)
ax.set_ylabel('Compliance Score (%)', fontsize=11)
ax.legend(fontsize=10)

# Mean comparison
on_time_mean = on_time['Overall compliance score (%)'].mean()
delayed_mean = delayed['Overall compliance score (%)'].mean()
ax.axhline(y=on_time_mean, color='darkgreen', linestyle=':', alpha=0.5, linewidth=1)
ax.axhline(y=delayed_mean, color='darkred', linestyle=':', alpha=0.5, linewidth=1)
ax.annotate(f'On-Time Mean: {on_time_mean:.1f}%', xy=(350, on_time_mean + 2),
            fontsize=9, color='darkgreen', fontweight='bold')
ax.annotate(f'Delayed Mean: {delayed_mean:.1f}%', xy=(350, delayed_mean - 5),
            fontsize=9, color='darkred', fontweight='bold')
ax.set_xlim(-10, df['Total duration of the project (days)'].max() * 1.1)

plt.tight_layout()
plt.show()
PLOT INSIGHT:
This is quite an interesting plot: the results show that delay does not necessarily affect the compliance score. A project can be delayed and still achieve a high compliance score. However, delays can lead to burnout among the resource persons, who end up with more projects running simultaneously.
Another finding is that delays extend project duration, so duration can be seen as an effect of delays rather than a cause.
Hypothesis Testing
Test 1: Is Delay Status Associated with Organisation Size?
H₀: Delay status is independent of client organisation size.
H₁: There is an association between organisation size and whether a project is delayed.
Observed frequencies (Org Size × Delay):
Did the project experience delays? 0 1
Approximate client organisation size
Enterprise 6 18
Large 6 10
Medium 6 20
Small 10 24
χ² = 1.170, df = 3, p = 0.7603
Cramér's V = 0.108
Conclusions
χ² = 1.170, df = 3, p = 0.760; Cramér's V = 0.108 (very weak association)
The result shows no statistically significant association between a project's delay status and organisation size, so we fail to reject H₀. Cramér's V (0.108) indicates a very weak association between the two variables.
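The code for this test is not shown, so the following is a minimal sketch of how the statistics could be reproduced from the observed frequency table with `scipy.stats.chi2_contingency`. The array is typed in by hand from that table, and the variable names are illustrative.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed frequencies: rows are organisation sizes,
# columns are [on-time (0), delayed (1)]
observed = np.array([
    [6, 18],   # Enterprise
    [6, 10],   # Large
    [6, 20],   # Medium
    [10, 24],  # Small
])

chi2, p_value, dof, expected = chi2_contingency(observed)

# Cramér's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))
n = observed.sum()
cramers_v = np.sqrt(chi2 / (n * (min(observed.shape) - 1)))

print(f"chi2 = {chi2:.3f}, df = {dof}, p = {p_value:.4f}")
print(f"Cramér's V = {cramers_v:.3f}")
print(f"Smallest expected count = {expected.min():.2f}")
```

The smallest expected count (Large, on-time) falls slightly below the common rule of thumb of 5, so the χ² approximation is best treated as approximate here, although the p-value is far from any conventional threshold.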
Test 2: Does Client Responsiveness Differ Between On‑Time and Delayed Projects?
H₀: Mean client responsiveness is equal for on‑time and delayed projects.
H₁: Mean responsiveness differs between the two groups.
Code
import numpy as np
from scipy import stats

on_time_resp = df[df['Did the project experience delays?'] == 0]['Client responsiveness level'].dropna()
delayed_resp = df[df['Did the project experience delays?'] == 1]['Client responsiveness level'].dropna()

print(f"On‑time: n = {len(on_time_resp)}, mean responsiveness = {on_time_resp.mean():.2f}, SD = {on_time_resp.std():.2f}")
print(f"Delayed: n = {len(delayed_resp)}, mean responsiveness = {delayed_resp.mean():.2f}, SD = {delayed_resp.std():.2f}")

# Normality check
print("\nNormality (Shapiro‑Wilk):")
for name, group in [('On‑time', on_time_resp), ('Delayed', delayed_resp)]:
    stat, p = stats.shapiro(group)
    print(f" {name}: p = {p:.4f} {'→ Normal' if p > 0.05 else '→ Non‑normal'}")

# Equal variance
stat_lev, p_lev = stats.levene(on_time_resp, delayed_resp)
print(f"\nLevene's test: p = {p_lev:.4f}")
equal_var = p_lev > 0.05

# t‑test (pooled variance if Levene's test does not reject equal variances)
t_stat, p_ttest = stats.ttest_ind(on_time_resp, delayed_resp, equal_var=equal_var)
print(f"\nt‑test: t = {t_stat:.4f}, p = {p_ttest:.4f}")

# Effect size (Cohen's d)
# Note: this averages the two group variances equally rather than weighting by group size
pooled_std = np.sqrt((on_time_resp.std()**2 + delayed_resp.std()**2) / 2)
cohens_d = (on_time_resp.mean() - delayed_resp.mean()) / pooled_std
print(f"Cohen's d = {cohens_d:.3f}")

# Mann‑Whitney U (non-parametric check, since both groups fail the normality test)
u_stat, p_mw = stats.mannwhitneyu(on_time_resp, delayed_resp, alternative='two-sided')
print(f"Mann‑Whitney U: p = {p_mw:.4f}")
On‑time: n = 28, mean responsiveness = 3.82, SD = 0.77
Delayed: n = 72, mean responsiveness = 3.39, SD = 0.81
Normality (Shapiro‑Wilk):
On‑time: p = 0.0002 → Non‑normal
Delayed: p = 0.0000 → Non‑normal
Levene's test: p = 0.3168
t‑test: t = 2.4180, p = 0.0175
Cohen's d = 0.545
Mann‑Whitney U: p = 0.0088
Conclusions
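The t-test shows that the difference is statistically significant (t = 2.418, p = 0.018), and Cohen's d = 0.545 indicates a medium effect, so the difference is also practically meaningful. Because the Shapiro-Wilk test flags both groups as non-normal, the Mann-Whitney U test serves as a robustness check, and it agrees (p = 0.009). Of all the findings, this was the most unanticipated. On-time projects averaged 3.82 on the responsiveness scale compared with 3.39 for delayed projects, an unadjusted difference of 0.43. Projects with less responsive clients therefore appear more likely to be delayed.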
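As a cross-check on the effect size, Cohen's d can be rebuilt from the printed summary statistics. The sketch below is illustrative only: it uses the rounded means and SDs from the output above, so the results differ slightly from the reported values. It compares the equal-weight pooled SD used in the code with the more common size-weighted pooled SD, which matters here because the groups are unequal (28 vs 72).

```python
import math

# Summary statistics as printed above (rounded to two decimals).
n1, mean1, sd1 = 28, 3.82, 0.77   # on-time projects
n2, mean2, sd2 = 72, 3.39, 0.81   # delayed projects

# Equal-weight version: simple average of the two variances.
sd_simple = math.sqrt((sd1**2 + sd2**2) / 2)
d_simple = (mean1 - mean2) / sd_simple

# Size-weighted pooled SD: each variance is weighted by its degrees of freedom.
sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
d_pooled = (mean1 - mean2) / sd_pooled

# Pooled-variance t statistic rebuilt from the same summaries.
t_stat = (mean1 - mean2) / (sd_pooled * math.sqrt(1 / n1 + 1 / n2))

print(f"d (equal-weight SD)   = {d_simple:.3f}")
print(f"d (size-weighted SD)  = {d_pooled:.3f}")
print(f"t (pooled variance)   = {t_stat:.3f}")
```

Both versions of d land close to 0.54, so the choice of pooling does not change the conclusion of a medium effect.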
Recommendation: We should revise our SLA to include a client responsiveness clause, which benefits both parties. The client receives an improved, faster assessment, and we reduce our risk of delay.
Correlation Analysis
Knowing which numeric project characteristics are correlated with delays helps the advisory practice focus its attention. If a variable (e.g., client responsiveness) shows a meaningful negative correlation with delays, then tracking and improving that variable becomes an evidence-based strategy. Alternatively, if a variable (e.g., project duration) is only weakly correlated, we can stop treating it as a delay risk factor.
Setup
Code
import pandas as pd
from scipy.stats import pearsonr

print(f"Dataset: {df.shape[0]} rows × {df.shape[1]} columns")

# List of numeric candidate predictors
delay_predictors = [
    'Client responsiveness level',
    'Security maturity of client at project start',
    'Total number of vulnerabilities identified',
    'Number of critical/high vulnerabilities',
    'Total duration of the project (days)',
    'Total number of consultants on the project',
    'Approximate Number of meetings held with the client during the project',
    'Number of systems and/or applications in scope',
]

# Step 1: make a safe copy with only the needed columns + delay
tmp = df[delay_predictors + ['Did the project experience delays?']].copy()

# Step 2: force all predictor columns to numeric (strings become NaN)
for col in delay_predictors:
    tmp[col] = pd.to_numeric(tmp[col], errors='coerce')

# Step 3: drop rows where any predictor or the delay status is missing
tmp.dropna(inplace=True)

# Step 4: extract the clean binary delay and predictors
delay_binary = tmp['Did the project experience delays?'].astype(int)

# Pearson's r against a 0/1 variable is the point-biserial correlation
corr_results = []
for col in delay_predictors:
    x = tmp[col]
    y = delay_binary
    r, p = pearsonr(x, y)
    corr_results.append({
        'Variable': col,
        'Correlation (r)': round(r, 3),
        'p‑value': round(p, 4),
    })

corr_df = pd.DataFrame(corr_results).sort_values('Correlation (r)', key=abs, ascending=False)
print("Point‑Biserial Correlations with Delay Status (0 = On‑time, 1 = Delayed):")
print(corr_df.to_string(index=False))
Dataset: 100 rows × 17 columns
Point‑Biserial Correlations with Delay Status (0 = On‑time, 1 = Delayed):
Variable                                                                Correlation (r)  p‑value
Security maturity of client at project start                                     -0.255   0.0105
Client responsiveness level                                                      -0.237   0.0175
Number of systems and/or applications in scope                                    0.112   0.2676
Number of critical/high vulnerabilities                                           0.042   0.6747
Approximate Number of meetings held with the client during the project            0.033   0.7408
Total duration of the project (days)                                              0.027   0.7919
Total number of consultants on the project                                        0.021   0.8366
Total number of vulnerabilities identified                                       -0.006   0.9555
Conclusions
Interpretation of Correlation Results: Only two variables are significantly correlated with delay status: security maturity and client responsiveness level. The next largest coefficient, number of systems and/or applications in scope (r = +0.112, p = 0.268), is not statistically significant. A negative correlation means that higher values are associated with fewer delays, as is the case for security maturity and client responsiveness, while a positive correlation means that higher values are associated with more delays.
Security maturity (r = −0.255, p = 0.011): Clients who start with better security practices tend to experience fewer delays. A mature client already has controls, policies, and personnel in place, which reduces the problems that cause delays.
Client responsiveness (r = −0.237, p = 0.018): This is the behavioural finding from our second hypothesis test. Responsive clients provide evidence and close findings faster, which reduces delay risk. Note that the point-biserial correlation and the pooled-variance t-test are mathematically equivalent (both give p = 0.0175), so this is the same result expressed as a correlation rather than independent confirmation.
Project duration (r = +0.027, p = 0.792): In the output above, duration has a negligible, non-significant correlation with delay status, so longer projects are not measurably more likely to be delayed in this dataset.
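The link between the responsiveness correlation and the earlier t-test can be shown with a short calculation. For a binary grouping variable, the point-biserial correlation and the pooled-variance t statistic are related by r = t / sqrt(t² + df), with df = n − 2. The sketch below uses only the reported t statistic and sample size; the function name is illustrative.

```python
import math


def point_biserial_from_t(t_stat, n_total):
    """Magnitude of the point-biserial r implied by a pooled-variance t statistic."""
    df = n_total - 2
    return t_stat / math.sqrt(t_stat**2 + df)


# t = 2.4180 and n = 100 are taken from the Test 2 output.
r_magnitude = point_biserial_from_t(2.4180, 100)

# The sign is negative because delay is coded 1 and the delayed group
# has the lower mean responsiveness.
r = -r_magnitude
print(f"implied point-biserial r = {r:.3f}")
```

The implied value matches the −0.237 in the correlation table, which is why the two analyses report the same p-value.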
Regression Analysis: What predicts project delays?
Prepare the Data
We use the two behavioural predictors that were significant in the earlier analyses: client responsiveness (significant in both the t-test and the correlation analysis) and security maturity (significant in the correlation analysis). Both are measurable early in the project. I did not include project duration because a delayed project will, almost by definition, run longer; duration is more a consequence of delay than a predictor of it.
Because the dataset is small, model evaluation relies on the likelihood ratio test, pseudo-R², classification accuracy, and the area under the ROC curve (AUC). There is no train/test split, as this model is intended for statistical analysis and exploration rather than prediction.
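To make the likelihood ratio test and pseudo-R² concrete, the sketch below shows how the two are connected. It rests on two assumptions: that the modelling dataset contains all 100 projects (28 on-time, 72 delayed), and that McFadden's R² is the 0.085 reported in the Integrated Findings section. The implied likelihood ratio statistic is therefore approximate, because 0.085 is a rounded figure.

```python
import math

# Class counts from the hypothesis-testing output (assumed unchanged in the model data).
n_on_time, n_delayed = 28, 72
n = n_on_time + n_delayed

# Log-likelihood of the intercept-only (null) model:
# every project is assigned the overall delay rate.
p_delay = n_delayed / n
ll_null = n_delayed * math.log(p_delay) + n_on_time * math.log(1 - p_delay)

# McFadden's R² = 1 - ll_model / ll_null, so the fitted log-likelihood is implied.
mcfadden_r2 = 0.085  # reported value, rounded
ll_model = ll_null * (1 - mcfadden_r2)

# Likelihood ratio statistic: chi-square with 2 degrees of freedom (two predictors).
lr_stat = 2 * (ll_model - ll_null)

# For exactly 2 degrees of freedom the chi-square upper tail is exp(-x / 2).
lr_p = math.exp(-lr_stat / 2)

print(f"null log-likelihood   = {ll_null:.3f}")
print(f"implied LR statistic  = {lr_stat:.2f}")
print(f"implied LR p-value    = {lr_p:.4f}")
```

Under these assumptions the two predictors jointly improve on the null model, even though the share of variation explained is small.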
Code
model_data = df[['Did the project experience delays?',
                 'Client responsiveness level',
                 'Security maturity of client at project start']].dropna()
model_data.columns = ['delay', 'responsiveness', 'security_maturity']
model_data['delay'] = model_data['delay'].astype(int)

print(f"Modelling dataset: {len(model_data)} projects")
print(model_data['delay'].value_counts().to_string())
# Compute odds ratios and confidence intervals.
# logit_model is the fitted logistic regression result from the model-fitting step
# (assumed to be a statsmodels Logit result fitted with an added constant).
import numpy as np

params = logit_model.params
conf = logit_model.conf_int()
conf.columns = ['2.5%', '97.5%']
odds_ratios = np.exp(params)
conf_or = np.exp(conf)

print("ODDS RATIOS (OR) — Impact on Delay Probability")
print("=" * 55)
for var in params.index:
    if var == 'const':
        continue  # the intercept has no odds-ratio interpretation here
    or_val = odds_ratios[var]
    ci_low = conf_or.loc[var, '2.5%']
    ci_high = conf_or.loc[var, '97.5%']
    p_val = logit_model.pvalues[var]
    stars = '***' if p_val < 0.001 else ('**' if p_val < 0.01 else ('*' if p_val < 0.05 else ''))
    direction = "decreases" if or_val < 1 else "increases"
    change = abs((1 - or_val) * 100)
    print(f"{var}:")
    print(f" OR = {or_val:.3f} (95% CI: {ci_low:.3f} – {ci_high:.3f}) {stars}")
    print(f" A 1‑unit increase {direction} the odds of delay by {change:.1f}%")
    print()
ODDS RATIOS (OR) — Impact on Delay Probability
=======================================================
responsiveness:
OR = 0.582 (95% CI: 0.321 – 1.055)
A 1‑unit increase decreases the odds of delay by 41.8%
security_maturity:
OR = 0.567 (95% CI: 0.324 – 0.991) *
A 1‑unit increase decreases the odds of delay by 43.3%
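The p-values behind the significance stars can be approximately recovered from the odds ratios and confidence intervals printed above. The sketch below assumes the intervals are Wald intervals on the log-odds scale, which is the usual default for logistic regression output. The helper name `wald_from_or` is illustrative, and the results are approximate because the inputs are rounded to three decimals.

```python
import math
from statistics import NormalDist


def wald_from_or(or_value, ci_low, ci_high, level=0.95):
    """Recover coefficient, standard error, z and two-sided p from an OR and its CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    beta = math.log(or_value)                        # coefficient on the log-odds scale
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return beta, se, z, p


# Values copied from the odds-ratio output above.
_, _, z_resp, p_resp = wald_from_or(0.582, 0.321, 1.055)
_, _, z_mat, p_mat = wald_from_or(0.567, 0.324, 0.991)

print(f"responsiveness:    z = {z_resp:.2f}, p = {p_resp:.3f}")
print(f"security_maturity: z = {z_mat:.2f}, p = {p_mat:.3f}")
```

The recovered values agree with the p = 0.075 and p = 0.047 quoted in the Executive Summary, and show how close the two predictors are to each other on either side of the 0.05 threshold.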
Interpretation: Security maturity is the only predictor that is statistically significant in the full model (OR = 0.567, 95% CI 0.324 to 0.991): each one-point increase reduces the odds of delay by approximately 43%. Responsiveness has a very similar estimated effect (OR = 0.582, about 42% lower odds per point), but its confidence interval includes 1 (0.321 to 1.055), so it falls short of significance at the 5% level. Taken together, clients with higher security maturity and greater responsiveness are less likely to have a delayed project.
Interpretation: The model asks whether these client-related factors (client responsiveness and security maturity) can help explain or predict project delays.
The confusion matrix indicates that the model is good at detecting delayed projects but weak at identifying on-time projects.
The class imbalance (72 delayed versus 28 on-time projects) likely contributed to the model favouring the majority class. Future work will address this, for example by using SMOTE to oversample the under-represented on-time class.
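The effect of class imbalance on accuracy can be illustrated with a small helper. In the sketch below the class sizes (72 delayed, 28 on-time) come from the data, but the split into correct and incorrect predictions is HYPOTHETICAL: it is not the actual confusion matrix, only one split that is consistent with an accuracy of 73%. The function name is illustrative.

```python
def classification_summary(tp, fn, fp, tn):
    """Summarise a 2x2 confusion matrix where 'positive' means a delayed project."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # delayed projects correctly flagged
        "specificity": tn / (tn + fp),   # on-time projects correctly identified
        # Accuracy of always predicting the larger class, with no model at all
        "baseline_accuracy": max(tp + fn, fp + tn) / total,
    }


# HYPOTHETICAL counts for illustration only (not taken from the fitted model):
# 72 delayed projects split 68/4, 28 on-time projects split 5/23.
summary = classification_summary(tp=68, fn=4, fp=23, tn=5)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With an imbalanced outcome, accuracy should be read against the majority-class baseline, and sensitivity and specificity reported separately.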
Interpretation: The AUC of approximately 0.69 indicates that the model separates delayed and on-time projects better than chance (AUC = 0.5), but only moderately: there is still substantial overlap between the two groups, so it is not highly accurate in distinguishing them.
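One way to read the AUC is as the probability that a randomly chosen delayed project receives a higher predicted risk than a randomly chosen on-time project. The sketch below computes it that way by comparing every pair; the scores are hypothetical and chosen only to keep the example small.

```python
def auc_pairwise(scores_pos, scores_neg):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted delay probabilities, for illustration only.
delayed_scores = [0.9, 0.8, 0.4]
on_time_scores = [0.7, 0.3]

auc = auc_pairwise(delayed_scores, on_time_scores)
print(f"AUC = {auc:.3f}")  # 5 of the 6 pairs are ranked correctly
```

On this reading, an AUC of 0.69 means that roughly seven in ten such pairs would be ranked in the right order.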
Interpretation: The diagnostic plots show that most observations have low leverage and low Cook's distance, meaning that individual projects have little influence on the model. The results are therefore not driven by a single outlying project.
Summary for a Non‑Technical Manager
We developed a model that estimates, early in an engagement, which cybersecurity assessment projects are most likely to be delayed. It uses two indicators that are available at the start: the maturity of the client's security posture and the client's responsiveness.
What the model indicates:
The model is useful rather than perfect. It is better than random (AUC 0.69) at identifying projects with delays, but some delays happen for reasons we have not measured.
Bottom line: delays are neither random nor a function of whether a project is delivered remotely or onsite. They are linked to client characteristics that are visible and, to an extent, manageable by us. This gives us metrics with which to justify additional investment in client preparation before the project and during the engagement.
Integrated Findings
How the Five Analyses Fit Together
This project investigated the variables associated with project delays using a combination of exploratory data analysis, data visualisation, hypothesis testing, correlation analysis, and logistic regression. The purpose was to identify which variables were statistically associated with a project finishing on time or late.
On the whole, the results indicate that client-related characteristics matter more for project delays than technical or structural project characteristics. Specifically, two client-related characteristics, client responsiveness and client security maturity, are consistently associated with delay in the correlation and regression analyses, and responsiveness also in the hypothesis test.
The correlation analysis revealed that both client responsiveness and security maturity are negatively correlated with delay, indicating that prepared and engaged clients are less likely to experience delays. Meanwhile, the technical factors, including number of vulnerabilities, scope, and team size, are not significantly correlated with delay.
This trend was also supported by the hypothesis testing results. While there was no statistically significant association between organisation size and delay, there was a statistically significant difference in client responsiveness (p = 0.018), with more responsive clients associated with on-time projects.
The logistic regression brought these results together by showing how each variable relates to the odds of delay when the other is held constant. Both responsiveness and security maturity were associated with decreased odds of delay, but only security maturity was statistically significant in the full model. The model improved on the univariate regressions, if only slightly, with a final accuracy of 73% and a McFadden's R² of 0.085. For context, always predicting a delay would already be correct for the 72% of projects that ran late, so the accuracy gain is marginal. The low R² indicates that other, unmeasured factors contribute to project delay.
In conclusion, the results indicate that client readiness and engagement are likely to be important determinants of project outcomes, over and above the technical workload of the engagement. The model is not complete, however, and further work could test other relevant factors and add more balanced data to improve both explanation and prediction. Nonetheless, this is practically useful knowledge, as it suggests one way to reduce delays in real delivery contexts: helping the client to be responsive and ready for the engagement.
The Single Recommendation
All five methods lead to the same finding: the most consistent, actionable predictors of whether a cybersecurity assessment project will run behind schedule are the client's security maturity before the engagement and its responsiveness during it. Structural elements such as delivery mode, size of organisation, and project scope proved to be inconsistent predictors of whether deliverables ended up late.
Limitations and Further Work
Limitations
Data limitations
This analysis used a small dataset (100 projects) that was also imbalanced, with more “Delayed” observations (72) than “On-time” observations (28). This imbalance may have made the model more likely to predict the majority class and therefore to misclassify on-time projects.
Model limitations
The logistic model assumes that each one-point change in client responsiveness or security maturity shifts the log-odds of delay by the same amount (linearity in the logit). This may be an oversimplification, because projects might respond non-linearly to changes in these variables. Some important non-linear effects may therefore not be captured by the model.
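What a constant odds ratio implies in probability terms can be illustrated as follows. A fixed odds ratio per point means the odds are multiplied by the same factor at every level, but the resulting change in the probability of delay depends on where a client starts. The starting probabilities below are hypothetical; only the odds ratio of 0.567 for security maturity comes from the model output.

```python
def shift_probability(p_start, odds_ratio):
    """Apply an odds ratio to a starting probability and return the new probability."""
    odds = p_start / (1 - p_start)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)


OR_MATURITY = 0.567  # odds ratio per one-point increase, from the output above

# Hypothetical starting probabilities of delay, for illustration only.
results = {}
for p_start in (0.90, 0.72, 0.50):
    p_new = shift_probability(p_start, OR_MATURITY)
    results[p_start] = p_new
    print(f"start {p_start:.2f} -> after +1 maturity point {p_new:.3f} "
          f"(change {100 * (p_new - p_start):+.1f} points)")
```

The same one-point improvement lowers the probability of delay by different amounts at different starting points, which is the structure the model imposes and the reason non-linear effects may be missed.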
Because of data scarcity, the model was fitted and evaluated on a single dataset and was not validated on separate data. We therefore cannot be sure how it would perform on new or unseen projects, and its conclusions should be treated as exploratory rather than predictive.
Further Work
For subsequent work, I intend to collect more data so that the model can be trained on a larger sample. I believe a more balanced dataset of delayed versus on-time projects would help the model learn the on-time cases better and reduce false positives (on-time projects predicted as delayed).
I also intend to include additional predictors. For example, whether the client is a first-time or returning client may affect the duration of the project.
References
Adi, B. (2026). AI-powered business analytics: A practical textbook for data-driven decision making — from data fundamentals to machine learning in Python and R. Lagos Business School / markanalytics.online. https://markanalytics.online
Van Rossum, G., & Drake, F. L. (2009). Python 3 reference manual. CreateSpace. (Version 3.14.5)
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
McKinney, W. (2010). Data structures for statistical computing in Python. In Proceedings of the 9th Python in Science Conference (pp. 56–61). https://doi.org/10.25080/Majora-92bf1922-00a
Allaire, J. J., Teague, C., Scheidegger, C., Xie, Y., & Dervieux, C. (2022). Quarto (Version 1.9.37) [Computer software]. https://doi.org/10.5281/zenodo.5960048
Hunter, J. D. (2007). Matplotlib: A 2D graphics environment. Computing in Science & Engineering, 9(3), 90–95. https://doi.org/10.1109/MCSE.2007.55 (Version 3.10.9)
Seabold, S., & Perktold, J. (2010). statsmodels: Econometric and statistical modeling with Python. In Proceedings of the 9th Python in Science Conference. http://conference.scipy.org/proceedings/scipy2010/pdfs/seabold.pdf (Version 0.14.6)
Oladejo, B. (2026). Drivers of Project Duration and Security Risk in Cybersecurity Compliance Projects in Nigeria and Africa. Administered to consultants of Digital Encode Ltd, May 2026. Ethical clearance: dataset not publicly available due to confidentiality requirements.
Appendix: AI Usage Statement
In this case study I used an AI coding assistant (ChatGPT 4.0) to aid in generating, debugging and formatting Python code, especially the data cleaning functions, statistical tests and visualisations.
I independently made all analytical decisions: which case study to analyse (Case Study 1: Exploratory & Inferential Analytics), which business questions to answer, which features to use in the correlation analysis, the hypothesis tests and the logistic regression model, which business problems to treat as key, which predictors my exploratory analysis identified as influential, how to interpret all the statistical outputs (p-values, model diagnostics), and which concrete, actionable business recommendations to make. I double-checked each line of code and each graph, checked the resulting outputs against the original file, and I stand behind the submitted work.