Monte Carlo Simulation

Author

Rick Zhang

Published

January 6, 2026

Triangular Distribution

Also referred to as the three-point estimator, the triangular distribution is often used in engineering approximations. Values near the most likely value (c) have a higher probability of occurrence.

  • rtriangle(n = 1, a = 0, b = 1, c = (a + b)/2)

    Generates n independent samples from a triangular distribution with a = minimum (lower bound), b = maximum (upper bound), and c = mode (peak, most likely value), where a ≤ c ≤ b.

Code
library(triangle)
## view the distribution
tri <- rtriangle(100000, 1, 5, 4)
hist(tri, breaks=100, main="Triangle Distribution", xlab="x")

Code
mean(tri) # (1 + 5 + 4)/3 = 10/3 ≈ 3.333
[1] 3.337401
Code
var(tri) # (1^2 + 5^2 + 4^2 - 1*5 - 1*4 - 5*4)/18 = 13/18 ≈ 0.722
[1] 0.7205638
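
As a quick sanity check, the sample moments above can be compared with their theoretical values; this is a minimal sketch using the standard triangular-distribution formulas, with ad hoc helper names tri_mean and tri_var:

Code
# Theoretical moments of a triangular(a, b, c) distribution (standard formulas);
# tri_mean() and tri_var() are ad hoc helpers defined here for illustration
tri_mean <- function(a, b, c) (a + b + c) / 3
tri_var  <- function(a, b, c) (a^2 + b^2 + c^2 - a*b - a*c - b*c) / 18

tri_mean(1, 5, 4)  # 10/3 ≈ 3.333, close to mean(tri) above
tri_var(1, 5, 4)   # 13/18 ≈ 0.722, close to var(tri) above
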
  • dtriangle(x, a, b, c): Density (PDF). Returns the value of the probability density function at point(s) x.

    The density is maximized at the mode c, where the peak height is always 2/(b-a) for any triangular distribution (the area must integrate to 1). In the example below, b-a = 10, so the peak PDF = 0.2.

Code
dtriangle(4, 0, 10, 4) # 2/(b-a) = 0.2
[1] 0.2
  • ptriangle(q, a, b, c): Cumulative distribution function (CDF). Returns the probability P(X ≤ q).

    Code
    ptriangle(5,0,10,5)
    [1] 0.5
  • qtriangle(p, a, b, c): Quantile function (inverse CDF). Given a probability p (0 ≤ p ≤ 1), returns the value x such that P(X ≤ x) = p.

Code
qtriangle(ptriangle(4, 0, 10, 5), 0, 10, 5)
[1] 4

Uniform Distribution

R provides built-in functions (no extra packages needed):

  • runif(n, min = 0, max = 1) — Generates n random samples. Example: runif(5, min = 10, max = 20) gives 5 numbers uniformly between 10 and 20.

  • dunif(x, min = 0, max = 1) — Density (PDF) at x. Example: dunif(0.5, 0, 1) returns 1 (flat height for standard uniform).

  • punif(q, min = 0, max = 1) — Cumulative probability P(X≤q). Example: punif(0.75, 0, 1) returns 0.75.

  • qunif(p, min = 0, max = 1) — Quantile (inverse CDF): value where CDF = p. Example: qunif(0.7, 0, 1) returns 0.7.

These functions follow the same naming as rnorm/dnorm/pnorm/qnorm for normal.

Code
set.seed(123)  # For reproducibility

# Generate 100,000 samples from Uniform(0,1)
samples <- runif(100000, 0, 1)

mean(samples)    # ≈ 0.5
[1] 0.4992992
Code
var(samples)     # ≈ 0.0833 (which is 1/12)
[1] 0.08317841
Code
sd(samples)      # ≈ 0.2887
[1] 0.2884067
Code
# Density
dunif(0.5, 0, 1)  # 1
[1] 1
Code
# CDF and quantile round-trip
punif(0.3, -2, 3)   # Probability up to 0.3 in Uniform(-2,3) → 0.46
[1] 0.46
Code
qunif(0.46, -2, 3)  # ≈ 0.3 (recovers the value)
[1] 0.3

Tornado Analysis Comparison

Each approach below is summarized by its basis, best use, whether it captures distributions, whether it handles interactions, and example tools:

  • Correlation (e.g., Spearman). Basis: Monte Carlo ranks. Best for: probabilistic risk (as used in the warranty example below). Captures distributions: yes. Handles interactions: partially. Example tools: mc2d, @RISK, Crystal Ball.
  • OAT Percentage Change. Basis: deterministic ±% changes. Best for: quick stakeholder communication. Captures distributions: no. Handles interactions: no. Example tools: Excel, deterministic models.
  • OAT Range/Absolute Swing. Basis: low–high extremes. Best for: identifying upside/downside risk. Captures distributions: no (but can use distribution extremes). Handles interactions: no. Example tools: most tornado tools.
  • Change in Output Statistic. Basis: binned MCS results. Best for: conditional impacts in simulation. Captures distributions: yes. Handles interactions: no. Example tools: @RISK.
  • Regression Coefficients. Basis: linear fit to MCS results. Best for: scaled importance. Captures distributions: yes. Handles interactions: limited. Example tools: @RISK, tornado package (R).

Recommendation

  • Start with correlation-based: It’s fast and gives a good global ranking (as in the warranty example below, where CW dominated with a Spearman correlation of about 0.90).

  • Follow up with OAT for the top 2–3 drivers if you need actionable “how much” impact or to communicate upside/downside risk to management.

Many professional analyses (AACE, PMI risk guidelines) use both — correlation for screening, OAT for detailed reporting.
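
For illustration, here is a minimal, hypothetical sketch of the regression-coefficients approach from the table above: fit a linear model to standardized Monte Carlo inputs and outputs and rank inputs by the absolute standardized coefficients. The inputs x1–x3 and the output model are invented for this sketch and are not taken from the examples below.

Code
# Hypothetical sketch of the "Regression Coefficients" tornado approach:
# rank inputs by absolute standardized regression coefficients from a
# linear fit to Monte Carlo samples (toy model, for illustration only)
set.seed(1)
n_reg <- 5000
x1 <- runif(n_reg, 10, 20)      # illustrative input 1
x2 <- rlnorm(n_reg, 0, 0.4)     # illustrative input 2
x3 <- runif(n_reg, 0.2, 0.5)    # illustrative input 3
y  <- x1 * x2 * (1 - x3)        # illustrative output

d <- data.frame(scale(cbind(y, x1, x2, x3)))   # standardize all columns
fit <- lm(y ~ x1 + x2 + x3, data = d)
sort(abs(coef(fit)[-1]), decreasing = TRUE)    # importance ranking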

Example 4.2 - Page 113 of Practical Reliability Engineering Textbook

This example calculates the probability of exceeding the yield strength with the Monte Carlo method. Consider a simple stress analysis problem in which a random force F is applied to a rectangular area with dimensions A x B. Based on previously recorded data and goodness-of-fit criteria, the force F can be statistically described by a 2-parameter Weibull distribution with shape = 2.5 and eta = 11300 N (mean value 10026 N).

Dimension A has a mean of 2.0 cm with a tolerance of +/-1.0 mm, and B has a mean of 3.0 cm with a tolerance of +/-1.5 mm.

The structure is expected to function properly while within the elastic strain range; therefore the probability of exceeding the yield strength of 30 MPa needs to be estimated.

Uniaxial stress can be calculated as the force divided by the area it acts on, S = F/(A*B). A can be defined by a three-point estimator (0.019 min, 0.020 most likely, 0.021 max) and B as (0.0285, 0.030, 0.0315) metres. The textbook example uses m = 1000 runs; the simulation below uses m = 10000 iterations.

Code
library(ggplot2)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
Code
library(tidyr)  # For pivot_longer
# Monte Carlo Simulation: Probability of exceeding yield strength (30 MPa)
# With added tidy sensitivity table

set.seed(123)          # For reproducibility
m <- 10000             # Number of Monte Carlo iterations

# 1. Sample applied force F from 2-parameter Weibull distribution
F <- rweibull(m, shape = 2.5, scale = 11300)

# 2. Sample dimensions using triangular distributions (in meters)
A <- rtriangle(m, a = 0.019,  b = 0.021,  c = 0.020)   
B <- rtriangle(m, a = 0.0285, b = 0.0315, c = 0.030)

# 3. Calculate stress S = F / (A * B) in Pa
S_Pa <- F / (A * B)

# 4. Yield strength limit
yield_strength_Pa <- 30e6

# 5. Estimated probability (base case)
prob_exceed <- mean(S_Pa > yield_strength_Pa)

# Results
cat("Monte Carlo Simulation Results (", m, " iterations )\n", sep = "")
Monte Carlo Simulation Results (10000 iterations )
Code
cat("Estimated probability of exceeding 30 MPa yield strength:\n")
Estimated probability of exceeding 30 MPa yield strength:
Code
cat("P(S > 30 MPa) =", round(prob_exceed, 4), "\n")
P(S > 30 MPa) = 0.0428 
Code
cat("Number of failures:", sum(S_Pa > yield_strength_Pa), "\n")
Number of failures: 428 
Code
cat("Number of safe cases:", m - sum(S_Pa > yield_strength_Pa), "\n\n")
Number of safe cases: 9572 
Code
# Summary and histogram
S_MPa <- S_Pa / 1e6
cat("Summary of stress distribution (in MPa):\n")
Summary of stress distribution (in MPa):
Code
print(summary(S_MPa))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.3858 11.5555 16.3410 16.7863 21.4556 48.1521 
Code
cat("mean of S_MPa",mean(S_MPa))
mean of S_MPa 16.78631
Code
cat("Standard deviation of S_MPa", sd(S_MPa))
Standard deviation of S_MPa 7.121481
Code
cat("The number of Monte Carlo runs,m with achieving 1% accuracy at 95% CI",round((1.645*((mean(S_MPa)/sd(S_MPa))/0.01))^2))
The number of Monte Carlo runs,m with achieving 1% accuracy at 95% CI 150349
Code
hist(S_MPa, breaks = 50, prob = TRUE,
     main = "Distribution of Uniaxial Stress (1000 simulations)",
     xlab = "Stress (MPa)", col = "steelblue", border = "white")
abline(v = 30, col = "red", lwd = 3, lty = 2)
text(30, par("usr")[4] * 0.9, "Yield Strength = 30 MPa", col = "red", adj = c(-0.1, 0))

Code
# Function to compute probability with modified parameters
compute_prob_exceed <- function(weibull_scale = 11300, 
                                weibull_shape = 2.5,
                                a_mode = 0.020, 
                                b_mode = 0.030) {
  # Adjust min/max based on mode (keeping original tolerances)
  a_min <- a_mode - 0.001
  a_max <- a_mode + 0.001
  b_min <- b_mode - 0.0015
  b_max <- b_mode + 0.0015
  
  sims <- replicate(m, {
    F <- rweibull(1, shape = weibull_shape, scale = weibull_scale)
    A <- rtriangle(1, a = a_min, b = a_max, c = a_mode)
    B <- rtriangle(1, a = b_min, b = b_max, c = b_mode)
    stress <- F / (A * B)
    stress > 30e6
  })
  mean(sims)
}


# Define parameters consistently
params_df <- data.frame(
  Parameter = c("Weibull Scale", "Weibull Shape", "A Mode", "B Mode"),
  base = c(11300, 2.5, 0.020, 0.030),
  pert_low = c(-0.10, -0.20, -0.025, -0.025),  # Relative low perturbation
  pert_high = c(0.10, 0.20, 0.025, 0.025)     # Relative high (symmetric here)
)

# Compute sensitivities in one tidy loop
sensitivity_results <- params_df %>%
  rowwise() %>%
  mutate(
    low_val = base * (1 + pert_low),
    high_val = base * (1 + pert_high),
    prob_low = compute_prob_exceed(
      weibull_scale = if(Parameter == "Weibull Scale") low_val else 11300,
      weibull_shape = if(Parameter == "Weibull Shape") low_val else 2.5,
      a_mode = if(Parameter == "A Mode") low_val else 0.020,
      b_mode = if(Parameter == "B Mode") low_val else 0.030
    ),
    prob_high = compute_prob_exceed(
      weibull_scale = if(Parameter == "Weibull Scale") high_val else 11300,
      weibull_shape = if(Parameter == "Weibull Shape") high_val else 2.5,
      a_mode = if(Parameter == "A Mode") high_val else 0.020,
      b_mode = if(Parameter == "B Mode") high_val else 0.030
    ),
    delta_low = prob_low - prob_exceed,
    delta_high = prob_high - prob_exceed
  ) %>%
  select(Parameter, delta_low, delta_high) %>%
  pivot_longer(cols = c(delta_low, delta_high), names_to = "Scenario", values_to = "Delta") %>%
  mutate(Scenario = recode(Scenario, delta_low = "Low", delta_high = "High"))

# Sort by total swing for tornado order
swing_order <- sensitivity_results %>%
  group_by(Parameter) %>%
  summarise(total_swing = sum(abs(Delta))) %>%
  arrange(desc(total_swing)) %>%
  pull(Parameter)

sensitivity_results$Parameter <- factor(sensitivity_results$Parameter, levels = swing_order)

# Simplified tornado plot
ggplot(sensitivity_results, aes(x = reorder(Parameter, abs(Delta)), y = Delta, fill = Scenario)) +
  geom_bar(stat = "identity", position = "identity") +
  coord_flip() +
  labs(title = "Tornado Plot: Sensitivity of P(S > 30 MPa)",
       x = "Parameter",
       y = "Change in Probability") +
  scale_fill_manual(values = c("Low" = "skyblue", "High" = "salmon")) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "bottom")

Estimate the top one-sided 80th percentile of the warranty claims cost of a washing machine: Warranty cost = (sales volume) x Cw x (1 - NFF) x [1 - R(3-yr)]

R(3-yr) is the reliability at 3 years, assuming a constant failure rate (uniform between 0.001 and 0.002 per year), and NFF is the percentage of “No Fault Found” claims.

Sales volume follows a uniform distribution between 800,000 and 1,000,000 units.

Cw, the cost per warranty claim, follows a lognormal distribution with meanlog mu = 5.8 and sdlog sigma = 0.5.

NFF can be modelled by a symmetrical triangular distribution between 20% and 50%.

Requirement: perform a Monte Carlo simulation and a sensitivity analysis to determine which variable has the most impact on the total warranty cost.

Code
library(triangle)  # Required for rtriangle

set.seed(123)
n <- 10000
SV <- runif(n, min=800000, max=1000000)  # Sales volume
CW <- rlnorm(n, meanlog=5.8, sdlog=0.5)  # Cost per warranty claim (lognormal)
NFF <- rtriangle(n, a=0.2, b=0.5, c=0.35)  # Symmetrical triangular
lambda <- runif(n, min=0.001, max=0.002)  # Constant failure rate per year (uniform)
R3Y <- exp(-3 * lambda)  # Reliability at 3 years
failure_prob <- 1 - R3Y

# Calculate warranty claims cost
WCC <- SV * CW * (1 - NFF) * failure_prob
WCCM <- WCC / 1e6

cat("Monte Carlo Simulation Results (", n, " iterations )\n", sep = "")
Monte Carlo Simulation Results (10000 iterations )
Code
cat("Mean of warranty claims cost:", mean(WCCM), "(M$) \n")
Mean of warranty claims cost: 0.9852537 (M$) 
Code
cat("Standard deviation of warranty claims cost:", sd(WCCM), "(M$) \n")
Standard deviation of warranty claims cost: 0.5760605 (M$) 
Code
print(summary(WCCM))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
0.09585 0.58659 0.84686 0.98525 1.23888 5.35534 
Code
cat("The number of Monte Carlo runs,m with achieving 1% accuracy at 95% CI",round((1.645*((mean(WCCM)/sd(WCCM))/0.01))^2))
The number of Monte Carlo runs,m with achieving 1% accuracy at 95% CI 79157
Code
# Top one-sided 80% upper bound (80th percentile)
upper_80 <- quantile(WCCM, 0.8)
cat("Top one-sided 80% upper bound (80th percentile):", upper_80, "\n")
Top one-sided 80% upper bound (80th percentile): 1.355681 
Code
# Sensitivity analysis (Spearman correlations)
corr_SV <- cor(rank(SV), rank(WCCM), method="spearman")
corr_CW <- cor(rank(CW), rank(WCCM), method="spearman")
corr_NFF <- cor(rank(NFF), rank(WCCM), method="spearman")
corr_failure <- cor(rank(failure_prob), rank(WCCM), method="spearman")

cat("\nSensitivity Analysis (Spearman correlations):\n")

Sensitivity Analysis (Spearman correlations):
Code
cat("Sales Volume (SV):", corr_SV, "\n")
Sales Volume (SV): 0.1044329 
Code
cat("Cost per Warranty (CW):", corr_CW, "\n")
Cost per Warranty (CW): 0.9030971 
Code
cat("NFF:", corr_NFF, "\n")
NFF: -0.1605372 
Code
cat("Failure Probability:", corr_failure, "\n")
Failure Probability: 0.3600483 
Code
# Variable with most impact (highest absolute correlation)
impacts <- abs(c(corr_SV, corr_CW, corr_NFF, corr_failure))
names(impacts) <- c("SV", "CW", "NFF", "Failure")
most_impact <- names(impacts)[which.max(impacts)]
cat("Variable with most impact:", most_impact, "\n")
Variable with most impact: CW 
Code
# Histogram with added vertical lines
hist(WCCM, breaks = 50, prob = TRUE,
     main = "Distribution of Warranty Cost (10000 simulations)",
     xlab = "Cost (M$)", col = "steelblue", border = "white")
abline(v = mean(WCCM), col = "red", lwd = 2, lty = 1)  # Mean line
abline(v = upper_80, col = "green", lwd = 2, lty = 2)  # 80% upper bound line
legend("topright", legend = c("Mean", "80% Upper Bound"),
       col = c("red", "green"), lwd = 2, lty = c(1, 2))

The Risk Driver Method (also called risk driver approach or common risk factors) is a best-practice technique in quantitative risk analysis (QRA) for handling correlations and uncertainties in Monte Carlo simulations (MCS) without requiring a full pairwise correlation matrix. It is explicitly recommended in AACE International Recommended Practices (e.g., RP 57R-09 for integrated cost-schedule risk analysis using Monte Carlo on CPM schedules) and aligns with PMI’s Practice Standard for Scheduling and PMBOK Guide emphasis on modeling risks realistically in schedule/cost simulations. Instead of estimating correlations between every pair of tasks, you:

  1. Identify 5–15 common risk drivers (shared risks/events that can affect multiple tasks).
  2. Define each driver’s probability of occurring and its impact range (e.g., a triangular distribution used as a multiplier on duration/cost: 1.0–1.5×).
  3. Map (assign) each driver to all affected tasks.
  4. In the MCS, for each iteration, sample whether the driver occurs and its impact, then apply the same sampled impact to all mapped tasks. This induces natural, realistic correlations.

This method is scalable for real projects with tens/hundreds of tasks and provides sensitivity analysis (tornado charts) showing which drivers contribute most to delay/cost variance.
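
To make the mechanics concrete before the space-vehicle example below, here is a minimal sketch of a single iteration with one shared driver mapped to two tasks; the durations and driver parameters are illustrative only, and the full simulation appears later in this section.

Code
library(triangle)

# One illustrative iteration of the risk driver method: a single driver mapped
# to tasks B and E applies the SAME sampled multiplier to both tasks, which is
# what induces correlation between them across iterations
set.seed(1)
durations <- c(B = 15, E = 18)                             # base durations (toy values)
driver <- list(prob = 0.40, min = 1.10, mode = 1.15, max = 1.30)

if (runif(1) < driver$prob) {                              # does the driver occur?
  impact <- rtriangle(1, a = driver$min, b = driver$max, c = driver$mode)
  durations <- durations * impact                          # same impact on all mapped tasks
}
durations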

Real-World Example: Space Vehicle Development Project

A classic example (from David Hulett’s Practical Schedule Risk Analysis, widely cited in PMI/AACE contexts and real aerospace/oil & gas projects) uses discrete risk drivers in a schedule risk analysis for a space vehicle development project.

Common Risk Drivers Identified (7 examples, typical in high-tech/NPD projects):

  1. Integration issues — Probability: 60%, Impact: +20–50% duration multiplier on affected integration tasks.
  2. Parts delay — Probability: 40%, Impact: +10–30% on procurement/assembly tasks.
  3. Test failure — Probability: 50%, Impact: +30–80% rework on testing tasks.
  4. Software bugs — Probability: 70%, Impact: +15–40% on software development/validation.
  5. Funding delay — Probability: 20%, Impact: +10–20% on all resource-dependent tasks.
  6. Design change — Probability: 30%, Impact: +20–60% on design and downstream tasks.
  7. Vendor quality issue — Probability: 35%, Impact: +15–45% on supplied components.

Mapping Example (simplified schedule with tasks):

  • Task A: System Design → Affected by “Design change”.
  • Task B: Prototype Build → Affected by “Parts delay”, “Vendor quality issue”.
  • Task C: Software Coding → Affected by “Software bugs”.
  • Task D: Integration & Test → Affected by “Integration issues”, “Test failure”, “Design change”.
  • Task E: Assembly → Affected by “Parts delay”, “Funding delay”.

In MCS (e.g., 10,000 iterations):

For “Parts delay”: if it occurs (40% chance), the same sampled multiplier (e.g., 1.25×) is applied to both Task B and Task E, which creates correlation without manual coefficient guessing. Typical results: a histogram of project finish dates, P80/P90 confidence levels, and a sensitivity ranking showing, for example, “Test failure” as the top driver (contributing ~35% of the variance).

This approach complies with AACE RP 57R-09 (Monte Carlo on CPM with risk drivers for correlations) and PMI’s quantitative techniques (e.g., simulation modeling in PMBOK). In practice, tools like Safran Risk, Primavera Risk Analysis, or @RISK implement this directly. For construction/mining (common AACE examples), drivers often include “Permitting delay”, “Weather impact”, “Labor shortage”, “Site conditions”. This focuses mitigation on high-impact drivers rather than individual tasks.

Code
library(triangle)
library(mc2d)  # For tornado plot
Loading required package: mvtnorm

Attaching package: 'mc2d'
The following objects are masked from 'package:base':

    pmax, pmin
Code
set.seed(123)
num_sims <- 10000

# Base durations for the 5 tasks
base_durations <- c(10, 15, 20, 12, 18)  # A, B, C, D, E
task_names <- c("A", "B", "C", "D", "E")

# Define the 3 common risk drivers
drivers <- list(
  integration_issues = list(prob = 0.60, min = 1.20, mode = 1.35, max = 1.50),
  parts_delay = list(prob = 0.40, min = 1.10, mode = 1.15, max = 1.30),
  test_failure = list(prob = 0.50, min = 1.30, mode = 1.50, max = 1.80),
  software_bugs = list(prob = 0.70, min = 1.15, mode = 1.275, max = 1.40),
  funding_delay = list(prob = 0.20, min = 1.10, mode = 1.15, max = 1.20),
  design_change = list(prob = 0.30, min = 1.20, mode = 1.40, max = 1.60),
  vendor_quality = list(prob = 0.35, min = 1.15, mode = 1.30, max = 1.45)
)

# Mapping: which tasks are affected
mapping <- list(
  A = "design_change",
  B = c("parts_delay","vendor_quality"),
  C = "software_bugs",
  D = c("integration_issues","test_failure","design_change"),
  E = c("parts_delay","funding_delay")
)

# Affected matrix
affected_matrix <- matrix(0, nrow = 5, ncol = 7)
colnames(affected_matrix) <- names(drivers)
rownames(affected_matrix) <- task_names
for (task in task_names) {
  for (drv in mapping[[task]]) {
    affected_matrix[task, drv] <- 1
  }
}

# Storage for impacts and total durations
total_durations <- numeric(num_sims)
impact_matrix <- matrix(1, nrow = num_sims, ncol = 7)
colnames(impact_matrix) <- names(drivers)

# Main Monte Carlo simulation: for each iteration, sample whether each driver
# occurs and its impact, apply the same sampled impact to all mapped tasks,
# and store the impacts for the sensitivity analysis
total_durations <- numeric(num_sims)
for (i in 1:num_sims) {
  current_durations <- base_durations
  for (j in seq_along(drivers)) {
    driver <- drivers[[j]]
    driver_name <- names(drivers)[j]
    if (runif(1) < driver$prob) {
      impact <- rtriangle(1, a = driver$min, b = driver$max, c = driver$mode)
      impact_matrix[i, j] <- impact
      affected <- affected_matrix[, driver_name] == 1
      current_durations[affected] <- current_durations[affected] * impact
    }
  }
  total_durations[i] <- sum(current_durations)
}

# === ALL ORIGINAL OUTPUTS ===
cat("Base (deterministic) duration:", sum(base_durations), "\n")
Base (deterministic) duration: 75 
Code
cat("Monte Carlo results (", num_sims, " simulations):\n", sep = "")
Monte Carlo results (10000 simulations):
Code
cat("Mean total duration: ", round(mean(total_durations), 2), "\n")
Mean total duration:  93.24 
Code
cat("Standard deviation: ", round(sd(total_durations), 2), "\n")
Standard deviation:  9.27 
Code
cat("The number of Monte Carlo runs needed for 5% accuracy at 95% confidence:", 
    round((1.645 * (mean(total_durations)/sd(total_durations)) / 0.05)^2), "\n")
The number of Monte Carlo runs needed for 5% accuracy at 95% confidence: 109471 
Code
# histogram
hist(total_durations, breaks = 50, main = "Distribution of Total Project Duration",
     xlab = "Total Duration (days/weeks)", col = "steelblue", border = "white")
abline(v = quantile(total_durations, c(0.80, 0.90)), col = c("orange", "red"), lwd = 2, lty = 2)
legend("topright", legend = c("P80", "P90"), col = c("orange", "red"), lwd = 2, lty = 2)

Code
# === TIDY PERCENTILE TABLE WITH EXCEEDANCE PROBABILITY ===
percentiles <- c(50, 80, 90, 95, 99)  # Common project risk percentiles
perc_values <- quantile(total_durations, probs = percentiles/100)
exceed_prob <- 1 - percentiles/100

perc_table <- data.frame(
  Percentile = paste0("P", percentiles),
  Duration   = round(perc_values, 2),
  `Exceedance Probability` = paste0(round(exceed_prob * 100, 1), "%")
)

cat("\nKey Percentiles and Exceedance Probabilities:\n")

Key Percentiles and Exceedance Probabilities:
Code
print(perc_table, row.names = FALSE)
 Percentile Duration Exceedance.Probability
        P50    92.22                    50%
        P80   100.53                    20%
        P90   106.11                    10%
        P95   110.70                     5%
        P99   118.22                     1%
Code
# === Using mc2d for Correlation-Based Tornado Plot ===
cat("\n=== Using mc2d for Correlation-Based Tornado Plot ===\n")

=== Using mc2d for Correlation-Based Tornado Plot ===
Code
# Reshape impacts and assign names BEFORE creating mcdata
impact_array <- array(impact_matrix, dim = c(num_sims, 1, 7))
dimnames(impact_array) <- list(
  NULL,                  # variability dimension (simulations)
  NULL,                  # uncertainty dimension (none)
  c("Integration_Issues", "Parts_Delay", "Test_Failure","software_bugs","funding_delay","design_change","vendor_quality")
)

# Reshape output
total_array <- array(total_durations, dim = c(num_sims, 1, 1))
dimnames(total_array) <- list(NULL, NULL, "Total_Duration")

# Create mcdata objects (specify dimensions explicitly)
mc_impacts <- mcdata(impact_array, type = "V", nsv = num_sims, nsu = 1, nvariates = 7)
mc_output  <- mcdata(total_array,    type = "V", nsv = num_sims, nsu = 1, nvariates = 1)

# Combine into an mc object: mc_impacts (multi-variate drivers) first, then the output
mc_combined <- mc(mc_impacts, mc_output)

# Compute tornado: output is now at position 2
tor <- mc2d::tornado(mc_combined,output = 2, method = "spearman")

# In mc(mc_impacts, mc_output), the driver node is at position 1 and the output
# node ("Total_Duration") at position 2, hence output = 2 above.
# Print correlations (mc2d labels the variates mc_impacts.1 ... mc_impacts.7,
# in the same order as the drivers list)
cat("Spearman Rank Correlations with Total Duration:\n")
Spearman Rank Correlations with Total Duration:
Code
print(tor)
Spearman's rho statistic 
Output:  mc_output 
$mc_output
     mc_impacts.1 mc_impacts.2 mc_impacts.3 mc_impacts.4 mc_impacts.5
corr    0.3323202    0.3507661    0.4862463    0.2816234     0.118885
     mc_impacts.6 mc_impacts.7
corr     0.560953    0.2748722
Code
# Plot the tornado diagram (automatically sorted by absolute correlation)
plot(tor,
     main = "Tornado Plot: Risk Driver Sensitivity\n(Spearman Rank Correlation)",
     xlab = "Spearman Correlation Coefficient",
     ylab = "Risk Drivers",
     col = "steelblue")

Key metrics and their interpretation:

  • Mean total duration: 93.24. The expected (average) project duration across 10,000 scenarios; risks add about 18 units on average to the base of 75.
  • Standard deviation: 9.27. A measure of uncertainty: about 68% of outcomes fall between roughly 84 and 103 (mean ± 1 SD) and about 95% between roughly 75 and 112 (± 2 SD). The higher SD reflects correlated risks amplifying variability.
  • 5th percentile (P05): 79.80. In the best 5% of scenarios (very lucky), the duration is only about 5 units above the base of 75, meaning few major risks materialized.
  • 50th percentile (median): 92.22. The most realistic single-point estimate: a 50% chance of finishing in about 92 or less, and a 50% chance of taking longer. Better than the mean for planning (less skewed by extremes).
  • 95th percentile (P95): 110.70. In the worst 5% of scenarios the duration reaches about 111, a realistic “worst case” for high-confidence planning.
  • P80 contingency reserve: 8.31. To have an 80% probability of completing on or before the target, add about 8.3 units to the median; a common target in many organizations (balances risk and cost).
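
These planning figures can be reproduced directly from the simulated total_durations vector computed above (values match up to rounding):

Code
# Recompute the key planning figures from the simulated durations
p05 <- unname(quantile(total_durations, 0.05))
p50 <- unname(quantile(total_durations, 0.50))
p80 <- unname(quantile(total_durations, 0.80))
p95 <- unname(quantile(total_durations, 0.95))
round(c(P05 = p05, Median = p50, P80 = p80, P95 = p95,
        P80_contingency = p80 - p50), 2)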

Practical Project Management Insights

  • The base estimate (75) is overly optimistic: it assumes no risks occur. Planning to it without contingency leads to an overrun in virtually every simulated scenario (even the P05 outcome is about 80).
  • Recommended schedule target: about 100–101 (median plus the P80 contingency, i.e., roughly the P80 of 100.5), which gives 80% confidence of meeting the date.
  • Risk exposure: the roughly 31-unit spread between P05 (~80) and P95 (~111) shows significant uncertainty driven by shared risks (correlations induced by the drivers).
  • For higher confidence (e.g., 90%), plan to about 106 (P90 ≈ 106), roughly 14 units above the median.
  • Management reserve: the gap between the mean (93.2) and the higher percentiles reflects potential escalation, which is useful for executive reporting.

Summary Recommendation

Communicate to stakeholders: “Our most likely duration is ~92 units, but with an 80% confidence level, we should plan for 100–101 units to account for integration, parts, and test risks.” This probabilistic view (enabled by the risk driver method) is far more robust than deterministic (single-point) planning and fully aligns with PMI and AACE best practices for quantitative schedule risk analysis.

One-Way Sensitivity Analysis (OAT) for Top 2 Drivers

To provide actionable “how much” insight, we perform OAT on the top two drivers from the correlation ranking (design_change and test_failure):

For each driver, we force it to occur (prob = 1) and fix its impact multiplier at the minimum (low scenario) or maximum (high scenario) of its triangular range. All other drivers follow their normal probability and distribution. Output metric: change in mean total duration relative to the simulated base mean (≈ 93).

Code
library(ggplot2)
library(dplyr)
library(tidyr)

# Top 2 drivers (hardcoded from correlation results)
top_drivers <- c("design_change", "test_failure")

# Base mean (use your simulated mean)
base_mean <- mean(total_durations)

# OAT function: fix one driver to forced occurrence + fixed impact
oat_sim <- function(fixed_driver = NULL, fixed_impact = NULL) {
  durations <- numeric(num_sims)
  for (i in 1:num_sims) {
    current_durations <- base_durations
    for (j in seq_along(drivers)) {
      driver_name <- names(drivers)[j]
      driver <- drivers[[j]]
      if (driver_name == fixed_driver) {
        impact <- fixed_impact  # fixed and forced
      } else {
        if (runif(1) < driver$prob) {
          impact <- rtriangle(1, a = driver$min, b = driver$max, c = driver$mode)
        } else {
          impact <- 1
        }
      }
      affected <- affected_matrix[, j] == 1
      current_durations[affected] <- current_durations[affected] * impact
    }
    durations[i] <- sum(current_durations)
  }
  mean(durations)
}

# Compute OAT results
oat_results <- data.frame(
  Driver = character(),
  Scenario = character(),
  Mean_Duration = numeric(),
  Delta = numeric(),
  stringsAsFactors = FALSE
)

for (drv in top_drivers) {
  min_imp <- drivers[[drv]]$min
  max_imp <- drivers[[drv]]$max
  
  low_mean <- oat_sim(fixed_driver = drv, fixed_impact = min_imp)
  high_mean <- oat_sim(fixed_driver = drv, fixed_impact = max_imp)
  
  oat_results <- rbind(oat_results,
    data.frame(Driver = drv, Scenario = "Low (min impact, forced)", Mean_Duration = low_mean, Delta = low_mean - base_mean),
    data.frame(Driver = drv, Scenario = "High (max impact, forced)", Mean_Duration = high_mean, Delta = high_mean - base_mean)
  )
}

cat("\nOne-Way Sensitivity Results (Top 2 Drivers):\n")

One-Way Sensitivity Results (Top 2 Drivers):
Code
print(oat_results %>% mutate(across(where(is.numeric), ~round(., 2))), row.names = FALSE)
        Driver                  Scenario Mean_Duration Delta
 design_change  Low (min impact, forced)         95.60  2.36
 design_change High (max impact, forced)        106.94 13.70
  test_failure  Low (min impact, forced)         93.97  0.73
  test_failure High (max impact, forced)        102.12  8.88
Code
# OAT Tornado Plot
oat_results$Driver <- factor(oat_results$Driver, levels = rev(top_drivers))  # design_change on top

ggplot(oat_results, aes(x = Driver, y = Delta, fill = Scenario)) +
  geom_bar(stat = "identity", position = "identity") +
  coord_flip() +
  labs(title = "One-Way Sensitivity Tornado Plot\n(Top 2 Risk Drivers)",
       x = "Risk Driver",
       y = "Change in Mean Total Duration (from base ~90–92)") +
  scale_fill_manual(values = c("Low (min impact, forced)" = "skyblue", "High (max impact, forced)" = "salmon"),
                    name = "Scenario") +
  theme_minimal(base_size = 12) +
  theme(legend.position = "bottom")

In its forced worst case, design_change extends the project by about 13–14 units on average (to a mean of roughly 107), confirming it as the top priority for mitigation (e.g., tighter change control processes). This is a significant high-side risk: roughly 15% longer than the simulated base mean when this driver hits its extreme.